C++ extraction aborted for compiler invocation when using std::format #18244

Open
ryftchen opened this issue Dec 8, 2024 · 15 comments
Labels
question Further information is requested

Comments

@ryftchen

ryftchen commented Dec 8, 2024

Description of the issue

I found that if the hpp file of a function that calls C++20 std::format is included in other cpp files, the result of "codeql database analyze" shows that the number of cpp files in this invocation is incomplete. Something similar to the following also appears in the SARIF file:

...
      {
        "message" : {
          "text" : "Extraction aborted for compiler invocation ...
          "id" : "cpp/diagnostics/failed-extractor-invocations",
          "index" : 1
        },
        "level" : "error",
        "descriptor" : {
          "id" : "cpp/diagnostics/failed-extractor-invocations",
          "index" : 1
        },
        "properties" : {
          "formattedMessage" : {
            "text" : "Extraction aborted for compiler invocation ...
          }
        }
      },
...

However, this issue does not occur when std::format is not used. Is std::format currently supported? Thank you.
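
A quick way to list every such entry in a SARIF file is sketched below. It assumes the diagnostics are stored as `toolExecutionNotifications` under each run's `invocations`, which is where the SARIF 2.1.0 format places tool notifications. The function name `failed_extractions` is illustrative and not part of CodeQL.

```python
import json

# Descriptor id taken from the SARIF excerpt above.
FAILED_ID = "cpp/diagnostics/failed-extractor-invocations"


def failed_extractions(sarif: dict, rule_id: str = FAILED_ID) -> list[str]:
    """Return the message text of every tool notification whose descriptor id is rule_id.

    Assumption: notifications live under runs[].invocations[].toolExecutionNotifications[].
    """
    messages = []
    for run in sarif.get("runs", []):
        for invocation in run.get("invocations", []):
            for note in invocation.get("toolExecutionNotifications", []):
                if note.get("descriptor", {}).get("id") == rule_id:
                    messages.append(note.get("message", {}).get("text", ""))
    return messages


def failed_extractions_from_file(path: str) -> list[str]:
    """Load a SARIF file from disk and apply failed_extractions to it."""
    with open(path, encoding="utf-8") as fh:
        return failed_extractions(json.load(fh))
```

Calling `failed_extractions_from_file("codeql.sarif")` (the file name here is a placeholder) would return one message per aborted compiler invocation, which makes it easier to compare runs.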

@ryftchen ryftchen added the question Further information is requested label Dec 8, 2024
@jketema
Contributor

jketema commented Dec 8, 2024

Hi @ryftchen

std::format is supported, but it seems that you ran into some problem with the C/C++ front end we're using. Unfortunately, we cannot debug this without further details. What version of CodeQL are you using, and what is the C++ compiler (including its version number) that you're using? Also, which platform are you on (Linux, Windows, macOS)?

@ryftchen
Author

ryftchen commented Dec 8, 2024

> Hi @ryftchen
>
> std::format is supported, but it seems that you ran into some problem with the C/C++ front end we're using. Unfortunately, we cannot debug this without further details. What version of CodeQL are you using, and what is the C++ compiler (including its version number) that you're using? Also, which platform are you on (Linux, Windows, macOS)?

Hi @jketema

CodeQL version: 2.18.4 (cpp-queries 1.2.2)
C++ compiler: clang++ 16.0.6
Platform: Linux (Ubuntu 24.04 x86_64)

P.S.: For affected cpp files, changes to parts unrelated to std::format sometimes make the extraction abort disappear. There seems to be no pattern to the reproduction. But with std::format removed, the extraction abort does not occur.

@jketema
Contributor

jketema commented Dec 8, 2024

Thanks. Quite a few fixes have gone in since 2.18.4. Would you be able to try with the latest version? Also, a copy of the build-tracer.log file, which will be in the database directory, would help. It should show the actual error that occurred before the abort.

@ryftchen
Author

ryftchen commented Dec 9, 2024

@jketema
Result of latest version (2.20.0): build-tracer.log.zip

@jketema
Contributor

jketema commented Dec 9, 2024

Hi, the log seems to be for 2.18.0, and not 2.19.4, which is the current latest version (2.20.0 should be released in the next few days).

In the log I see two kinds of errors. The majority are:

"/usr/include/gtest/gtest.h", line 52: catastrophic error: cannot open source file "cstddef"
  #include <cstddef>

These are coming from CMake test compilations, so they are harmless. They likely also fail to compile when compiled with clang, and seem unrelated to the std::format problem from above.

The other errors are (6 of them):

"/workspaces/foo/application/core/source/command.cpp", line 912: error: constinit variable requires dynamic initialization
      static constinit const auto resetter = []<HelperType Helper>() constexpr

These also seem unrelated to std::format.

So from this it doesn't seem there is a problem with std::format.

@ryftchen
Author

ryftchen commented Dec 9, 2024

> The other errors are (6 of them):
>
> "/workspaces/foo/application/core/source/command.cpp", line 912: error: constinit variable requires dynamic initialization
>       static constinit const auto resetter = []<HelperType Helper>() constexpr
>
> These also seem unrelated to std::format.
>
> So from this it doesn't seem there is a problem with std::format.

Hi, thank you for your support. This is a good catch, but I'm afraid that the issue doesn't seem to be with this file.

I checked again and noticed that the log file seemed to contain log content from earlier versions, so I confirmed the execution time and got the log for version 2.20.0 only: build-tracer.log.zip
I found the following log entries for a crash, and the cpp file name involved matches the location of the "Extraction aborted for compiler invocation ..." message in the SARIF file (codeql.sarif.zip):

[E 01:51:47 4250] Warning[extractor-c++]: In add_constructor_init: Unexpected dynamic init kind 7.
CodeQL C++ extractor: Something bad happened.
CodeQL C++ extractor: Received signal 11: Segmentation fault.
CodeQL C++ extractor: Signal happened at 0x64.
CodeQL C++ extractor: Backtrace:
/usr/local/share/codeql/cpp/tools/linux64/extractor(+0x3dae3e)[0x55a4075a2e3e]
/usr/local/share/codeql/cpp/tools/linux64/extractor(+0x3dd0a0)[0x55a4075a50a0]
/lib/x86_64-linux-gnu/libc.so.6(+0x45320)[0x7f7a24e28320]
/usr/local/share/codeql/cpp/tools/linux64/extractor(+0x409428)[0x55a4075d1428]
/usr/local/share/codeql/cpp/tools/linux64/extractor(+0x41460b)[0x55a4075dc60b]
/usr/local/share/codeql/cpp/tools/linux64/extractor(+0x441056)[0x55a407609056]
/usr/local/share/codeql/cpp/tools/linux64/extractor(+0x43f97a)[0x55a40760797a]
/usr/local/share/codeql/cpp/tools/linux64/extractor(+0x403d5a)[0x55a4075cbd5a]
/usr/local/share/codeql/cpp/tools/linux64/extractor(+0x3dc348)[0x55a4075a4348]
/usr/local/share/codeql/cpp/tools/linux64/extractor(+0x3c3f76)[0x55a40758bf76]
/lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x7f7a24e0d1ca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x7f7a24e0d28b]
/usr/local/share/codeql/cpp/tools/linux64/extractor(+0x3c3ea9)[0x55a40758bea9]
[E 01:51:47 4250] Finished compilation TRAP /workspaces/foo/report/sca/query/trap/cpp/compilations/48/56285707_0.trap.zst
CodeQL C++ extractor: Current location: /workspaces/foo/application/example/source/apply_design_pattern.cpp:238275,0
CodeQL C++ extractor: Current physical location: /workspaces/foo/application/example/source/apply_design_pattern.cpp:804,0 (end of source)
[T 01:51:47 4078] Extractor /usr/local/share/codeql/cpp/tools/linux64/extractor terminated with exit code 1.
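
Since the set of affected files varies between runs, it may help to pull the crash sites out of build-tracer.log automatically. The sketch below relies only on the two message shapes visible in the excerpt above ("Received signal N: ..." and "Current physical location: file:line,column"). The function name `crash_sites` is illustrative, and other extractor versions may word these lines differently.

```python
import re

# Patterns modelled on the log lines quoted above; the wording is an assumption
# for any extractor version other than the one that produced this log.
SIGNAL_RE = re.compile(r"CodeQL C\+\+ extractor: Received signal (\d+): (.+?)\.?$")
LOCATION_RE = re.compile(
    r"CodeQL C\+\+ extractor: Current physical location: (.+?):(\d+),(\d+)"
)


def crash_sites(log_text: str) -> list[dict]:
    """Pair each 'Received signal' line with the next physical location reported after it."""
    crashes = []
    current = None
    for line in log_text.splitlines():
        signal = SIGNAL_RE.search(line)
        if signal:
            current = {
                "signal": int(signal.group(1)),
                "name": signal.group(2),
                "file": None,
                "line": None,
            }
            crashes.append(current)
            continue
        location = LOCATION_RE.search(line)
        # Only fill in the first location that follows a signal line.
        if location and current is not None and current["file"] is None:
            current["file"] = location.group(1)
            current["line"] = int(location.group(2))
    return crashes
```

Running this over the logs from two environments and comparing the `file` values would show exactly which translation units crash in each.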

@jketema
Contributor

jketema commented Dec 9, 2024

This is a crash in our tooling. To fix this, I would need a test case that reproduces the problem. The logs do not contain enough information to track down the issue.

@ryftchen
Author

Is any other information needed? So far, this has only been observed when using the std::format function, not when using std::string concatenation. Only some of the cpp files are affected.

@jketema
Contributor

jketema commented Dec 10, 2024

> Is any other information needed?

Yes, as I mentioned above, you'll need to provide a test case that reproduces the problem, because the logs do not provide sufficient information, and I have not been able to reproduce the problem locally.

@ryftchen
Author

>> Is any other information needed?
>
> Yes, as I mentioned above, you'll need to provide a test case that reproduces the problem, because the logs do not provide sufficient information, and I have not been able to reproduce the problem locally.

Okay, I'll try to make a simple case, but I'm afraid it's a little hard to find the reproduction condition from the code project.

@jketema
Contributor

jketema commented Dec 10, 2024

> Okay, I'll try to make a simple case, but I'm afraid it's a little hard to find the reproduction condition from the code project.

A possible approach to this is to preprocess one of the files on which the problem occurs with clang (-E option) and store the output in a new cpp file. Then see if the problem still occurs when you build a database for the new cpp file. If so, and if there's nothing sensitive in the file, then sharing that new file should be sufficient. If there is sensitive information in the file, then your best option is to try to reduce the file with a tool like creduce to create a minimal reproducing test case.
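
The preprocess-then-extract approach could be scripted roughly as follows. This is a sketch under assumptions: the helper name `repro_commands`, the `_preprocessed` file suffix, and the `repro-db` directory are made up for illustration, and a real project will usually need its own include paths and defines added to the preprocessing step. The `codeql database create` options used (`--language`, `--command`, `--overwrite`) are standard CodeQL CLI options.

```python
import shlex
from pathlib import Path


def repro_commands(source: str, compiler: str = "clang++",
                   std: str = "c++20", db_dir: str = "repro-db"):
    """Build the two command lines: preprocess the file, then extract only the result.

    Returns (preprocess_cmd, create_db_cmd) as argument lists suitable for
    subprocess.run(cmd, check=True). Nothing is executed here.
    """
    src = Path(source)
    # Hypothetical naming scheme for the preprocessed copy.
    preprocessed = src.with_name(src.stem + "_preprocessed.cpp")
    obj = preprocessed.with_suffix(".o")

    # Step 1: expand all includes and macros into a single self-contained file.
    preprocess_cmd = [compiler, f"-std={std}", "-E", str(src), "-o", str(preprocessed)]

    # Step 2: build a CodeQL database whose only build step compiles that file.
    build_cmd = [compiler, f"-std={std}", "-c", str(preprocessed), "-o", str(obj)]
    create_db_cmd = [
        "codeql", "database", "create", db_dir,
        "--language=cpp", "--overwrite",
        f"--command={shlex.join(build_cmd)}",
    ]
    return preprocess_cmd, create_db_cmd
```

If the extractor still crashes on the preprocessed file, that single file is a candidate test case to share or to reduce further.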

@ryftchen
Author

>> Okay, I'll try to make a simple case, but I'm afraid it's a little hard to find the reproduction condition from the code project.
>
> A possible approach to this is to preprocess one of the files on which the problem occurs with clang (-E option) and store the output in a new cpp file. Then see if the problem still occurs when you build a database for the new cpp file. If so, and if there's nothing sensitive in the file, then sharing that new file should be sufficient. If there is sensitive information in the file, then your best option is to try to reduce the file with a tool like creduce to create a minimal reproducing test case.

Thanks for the suggestion. Unfortunately, using the preprocessed content as a new file doesn't seem to reproduce the crash: issue.cpp.zip

The compile command is simplified to:
/usr/lib/llvm-16/bin/clang -std=c++20 -stdlib=libstdc++ -c issue.cpp -o issue.o

Just as with the preprocessed file, there is no issue if the single call to std::format in the corresponding source file is replaced with string concatenation. I also found that even if the version is the same but the device on which it is running is different (e.g. in WSL or in a Docker container), the affected files are different.

@jketema
Contributor

jketema commented Dec 13, 2024

> but the device on which it is running is different (e.g. in a WSL or in a Docker container)

Question for my understanding, to make sure nothing is lost in translation: are you running on the same machine, just using WSL or Docker, or is it a completely different machine?

@ryftchen
Author

> Question for my understanding, to make sure nothing is lost in translation: are you running on the same machine, just using WSL or Docker, or is it a completely different machine?

It is a completely different machine, but the running environment is the same. Specifically, I ran the same version in WSL (Ubuntu 22.04 x86_64) on Windows and in a Docker container (Ubuntu 22.04 x86_64) on a Mac. Assuming a total of 5 cpp files call the function that uses std::format, the number of files hitting the exception differs between the WSL and Docker runs, although both are fewer than 5.

@jketema
Contributor

jketema commented Dec 16, 2024

Hi. Thanks for your answer. It's currently very difficult to see what is going on. We've made some changes internally that should improve the quality of the stack traces on Linux. This will be part of CodeQL 2.20.1, which will be released in early January. At this point it seems best to wait for that change.
