[CIR][CIRGen] Add support for global named registers #741

566hub · 2024-07-15T14:33:53Z

Realize RG and GET_RG op which are copied from cir.Global
op. The concept is that the same asm label means a unique abstract memory
address with u64i type, and get_rg will return ptr. We could use
cir.cast(bitcast) to transform ptr to ptr.

…lvm#501) This PR adds the LLVMIR lowering support for CIR bit operations. For `cir.bit.clz`, `cir.bit.ctz`, and `cir.bit.popcount`, they can be lowered directly to LLVM intrinsic calls to `@llvm.ctlz`, `@llvm.cttz`, and `@llvm.ctpop`, respectively. For the other three bit operations, namely `cir.bit.clrsb`, `cir.bit.ffs`, and `cir.bit.parity`, they are lowered to a sequence of LLVM IR instructions that implements their functionalities. This lowering scheme is also used by the original clang CodeGen.

CIRGenFunction::buildFromMemory can handle the `cir.bool` values. So we no longer need to emit the `NIY` error here.

This is the next step in inline assembly support and it's more like a service PR and mostly dedicated to the in/out argument types. Also, operand attributes are added and it's the last change in the `cir.asm` operation afaik. But I would wait untill the next PR, which will contain more examples and maybe will help us to get more readable format for the operation. Note, that we have to add an attribute for each operand - because the lowering of the llvm dialect to LLVM IR iterates over them in the same order. The next PR will be last one (so far) in the series of PRs dedicated to the inline assembly support. It will add storing of the results.

Three small changes, all cleanup or refactoring in nature. 1. Fix the assemblyFormat for all the vector operations in the ClangIR dialect so that vector types appear in ClangIR as `!cir.vector<type x n>` instead of as just `<type x n>`. When I first created the vector ops, I forgot to use `qualified` as necessary when writing out types. This change fixes that. There is no change in behavior, but there is a change to the text version of ClangIR, which required changing the ClangIR expected results and ClangIR inputs in the tests. 2. Create a new `cir.vec.splat` operation and use that for "vector splat", i.e. a conversion from a scalar to a vector. A "vector splat" conversion had been implemented with `cir.vec.create` before. This change results in different ClangIR and different LLVM IR, which again required updating the tests, but no noticeable change in compiler behavior. 3. Create an `IntegerVector` type constraint, which requires that the given type be a vector whose element type is an integer. It can be any integral type, and the vector can be of any size. Use the new type constraint in the definition of `cir.vec.ternary`, whose condition operand must be an `IntegerVector`. Remove the integral type check from `VecTernaryOp::verify`, since doing the check there is now redundant. The only possibly visible change is to the text of an error message when validation of `cir.vec.ternary` fails. The expected output of a validation test was updated with the new message.

This PR intends to fix some problems with packed structures support, so the llvm#473 will work. Basically, the main problem for the packed structures support is an absence of arbitrary sized integers in CIR. Well, one workaround is to use `mlir::IntegerType` for that, but it's kind of wrong way (please correct me if I'm wrong). Another way is to introduce this type in CIR. So far I suggest this way: instead of arbitrary sized integers we will create an array of bytes for bitfield storages whenever they doesn't fit into the CIR `IntType`. Well, the original codegen creates storages with alignment 8 - so it can be `i24` storage type for instance. Previously, we just created storages that could be represented as CIR `IntType`: 8, 16, 32, 64. And it was working before I came up with a necessity to support packed structures. At first glance it's not a problem - just add `determinePacked` method from the original codegen and that's it. But it turned out that this method _infers_ the fact if a structure is packed or not. It doesn't use the AST attribute for that as one could think - it works with offsets and alignments of fields. Thus, we either need to invent our own way to determine packed structures (which is error prone and maybe not doable at all) or try to use the existing one. Also, we go closer to the original lllvm's data layout in this case. 1) I had to move the lowering details from the `LoweringPrepare` to the `LowerToLLVM`, because it's not possible to do a `load` from the array of bytes to the integer type - and it's ok in llvm dialect. Thus, all the math operations can be expressed without any problems. Basically the most of the diff you see is because of the changes in the lowering. The remaining part is more or less easy to read. 2) There are minor changes in `CIRRecordLayoutBuilder` - as described above, we use may generate an array of bytes as a storage. 3) Some cosmetic changes in `CIRGenExpr` - since we don't want to infer the storage type again and just use the one stored in the `CIRGenBitFieldInfo`. 4) Helpers are introduced in the lowering - but nothing hard - just shifts and logical ops. 5) I removed `bitfield-ops` test - because now the test cases covered there are all in `bitfields.c` and `bitfields.cpp` . So ... This is still a suggestion, though I believe it's a good one. So you are welcome to discuss, suggest another ways to solve the problem and etc.

Still missing lowering support, which will come next.

This doesn't change existing functionality, for existing crashes related to atomics we just hit asserts a bit further now, but no support added just yet. Next set of commits will introduce functionlity with testcases.

Just like previous commit, add more infra pieces, still NFC since all relevant testcases hit asserts, just a bit deeper.

This PR adds a support for packed structures. Basically, now both `pragma pack(...)` and `__attribute__((aligned(...)))` should work. The only problem is that `getAlignment` is not a total one - I fix only a couple of issues I faced with - for struct types and arrays.

PR adds support for initialization of unions. The change is copy-pasted from the original CodeGen.

This PR adds support for `__builtin_prefetch`. CIRGen of this builtin emits the new 'cir.prefetch' opcode. Then `cir.prefetch` lowers to `llvm.prefetch` intrinsic. Co-authored-by: Bruno Cardoso Lopes <[email protected]>

This PR adds MemRead/MemWrite markers to the `GetBitfieldOp` and `SetBitfieldOp` (as discussed in llvm#487) Also, minor renaming in the `SetBitfieldOp` --------- Co-authored-by: Bruno Cardoso Lopes <[email protected]>

This PR adds support for `__builtin_constant_p`. Implementation introduces the new `cr.is_constant` opcode to it during the codegeneration of builtin. Codegeneration is taken from the original llvm codegen.

This patch adds CIRGen support for the C++20 three-way comparison operator `<=>`. The binary operator is directly lowered to existing CIR operations. Most of the changes are tests.

This change is taken from the original codegen.

…m#517) This PR adds handling of AttributedStmt to support fallthrough attribute.

This change is taken from the original codegen

We start our journey towards `goto` support and this is a first step on this way. There are some discussion in llvm#508 and according to the plan we do the following here: - a new pass called `cir-flatten-cfg` that is a stub now but later will be responsible for the regions inlining. The pass works only if `-emit-flat-cir` is passed in cmd line. Thus, the clang behavior is not changed here from the user's point of view. - The pass will be accomplished with `goto` solver later, so we talk about several passes that are mandatory for the lowering into `llvm` dialect. There are at least two clients of this pass that will be affected: `clang` itself and `cir-opt`, so we need a common point for them: and `populateCIRFlatteningPasses` and `populateCIRToLLVMPasses` guarantee that `CIR` will be in the correct state for all the clients, whatever new passes we will add later.

Implement `__builtin_shufflevector` and `__builtin_convertvector` in ClangIR. This change contributes to the implemention of issue llvm#284. `__builtin_convertvector` is implemented as a cast. LLVM IR uses the same instructions for arithmetic conversions of both individual scalars and entire vectors. So ClangIR does the same. The code for handling conversions, in both CodeGen and Lowering, is cleaned up to correctly handle vector types. To simplify the lowering code and avoid `if (type.isa<VectorType>())` statements everywhere, the utility function `elementTypeIfVector` was added to `LowerToLLVM.cpp`. `__builtin_shufflevector` has two forms, only one of which appears to be documented. The documented form, which takes a variable-sized list of integer constants for the indices, is implemented with the new ClangIR operation `cir.vec.shuffle.ints`. This operation is lowered to the `llvm.shufflevector` op. The undocumented form, which gets the indices from a vector operand, is implemented with the new ClangIR operation `cir.vec.shuffle.vec`. LLVM IR does not have an instruction for this, so it gets lowered to a long series of `llvm.extractelement` and `llvm.insertelement` operations.

This introduces CIRGen and LLVM lowering for the first of a bunch of these atomic operations, incremental work should generelize the current constructs.

This PR adds support for the following intrinsic functions: - `__builtin_bswap{16, 32, 64}` - `_byteswap_{ushort, ulong, uint64}` This PR adds a new `cir.bswap` operation to represent such an intrinsic call. CIRGen and LLVMIR lowering for the new operation is included in this PR.

This is the final commit for issue llvm#284. Vector types other than GNU vector types will be covered by other yet-to-be-created issues. Now that GNU vector types (the ones defined via the vector_size attribute) are implemented, do a final cleanup of the assertions and other checks related to vector types. Remove `UnimplementedFeature::cirVectorType()`. Deal with the remaining calls to that function. When the that is not yet implemented has to do with Arm SVE vectors, the assert was changed to `UnimplementedFeature::scalableVectors()` instead. The assertion was removed in cases where the code correctly handle GNU vector types. While cleaning up the assertion checks, I noticed that BinOp handling of vector types wasn't quite complete. Any special handling for integer or floating-point types wasn't happening when the operands were vector types. To fix this, split `BinOpInfo::Ty` into two fields, `FullType` and `CompType`. `FullType` is the type of the operands. `CompType` is normally the same as `FullType`, but is the element type when `FullType` is a vector type.

…rt fp

… value (llvm#719) This commit fixes GlobalOp lowering for floating without initial value. It implies to be initialized with zeros. E.g. float f[100]; double d;

…p::fold method (llvm#663)" This reverts commit 3b9f698.

ClangIR was failing on ``` __atomic_compare_exchange_n(&a, &old, 42, true, 5, 5); ``` The `true` was the problem. It would work with a literal `0` or `1`, but not with a literal `true` or `false`. The bug was in `isCstWeak` in CIRGenAtomic.cpp, which was only looking for an integral constant. It didn't recognize a boolean constant and was falling back on the non-constant path, which isn't implemented yet. Rewrite `isCstWeak` to check for both intergral and boolean constants.

This PR adds LLVM lowering for the following operations related to complex numbers: - `cir.complex.create`, - `cir.complex.real_ptr`, and - `cir.complex.imag_ptr`. The LLVM IR generated for `cir.complex.create` is a bit ugly since it includes the `insertvalue` instruction, which typically is not generated in upstream CodeGen. Later we may need further CIR canonicalization passes to try folding `cir.complex.create`.

This PR fixes the bug described as in llvm#727 (comment). It should resolve the crash reported in llvm#727 .

This PR introduces a new attribute `OpenCLKernelMetadataAttr` to model the OpenCL kernel metadata structurally in CIR, with its corresponding implementations of CodeGen, Lowering and Translation. The `"TypeAttr":$vec_type_hint` part is tricky because of the absence of the signless feature of LLVM IR, while SPIR-V requires it. According to the spec, the final LLVM IR should encode signedness with an extra `i32` boolean value. In this PR, the droping logic from CIR's `TypeConverter` is still used to avoid code duplication when lowering to LLVM dialect. However, the signedness is then restored (still capsuled by a CIR attribute) and dropped again in the translation into LLVM IR.

… BinOp (llvm#720) This commit extends the pass to support loop invariant BinOp hoisting as SCF forOp boundary. E.g. // (100 - 1) should be hoisted out of loop. // So the boundary could be input operand to generate SCF forOp. for (int i = 0; i < 100 - 1; ++i) {}

…#729) Previously, when lowering induction variable in forOp, we removed the IV load and replaced the users with SCF.IV. The CIR IV users might still CIR operations during lowering forOp. It caused the issue that CIR operation contained SCF.IV as operand which is MLIR integer type instead CIR type. This comment lower CIR load IV_ADDR with ARITH addi SCF.IV, 0 So SCF.IV can be propagated by OpAdaptor when lowering individual IV users. This simplifies the lowering and fixes the issue. The redundant arith.addi can be removed by later MLIR passes.

LLVM lowering for the following operations is introduced in llvm#616 and llvm#651: `cos`, `exp`, `exp2`, `log`, `log10`, `log2`, `sin`, `sqrt`, `fmod`, and `pow`. However, they are not lowered to their corresponding LLVM intrinsics; instead they are transformed to libc calls during lowering prepare. This does not match the upstream behavior. This PR tries to correct this mistake. It makes all CIR FP intrinsic ops lower to their corresponding LLVM intrinsics (`fmod` is a special case and it is lowered to the `frem` LLVM instruction).

For now only handle the cir.try part, cir.catch is coming next. Using flat cir for tests make this easy to incrementally build.

as title. document will be in another PR as it seems to be a different upstream branch

…lvm#733) In [this commit](llvm@e5d840b), minimal support for Darwin aarch64 triples was added. But TargetLoweringInfo was not updated correspondingly. This could lead to a failure of the test `driver.c` with CallConvLowering pass enabled (or `LowerModule` used in some other ways). This PR fixes the inconsistency and adds an extra missing feature flag for it.

Although currently LowerModule is not ready for formal usage, we need it for target-specific lowering to LLVM. This PR temporarily public the symbol `createLowerModule` to reuse the logic of preparing a `LowerModule`, making it easier for future refactor (making `TargetLoweringInfo` available for most stages in CIR Lowering).

In this PR, we 1. implement defaultVisibility as far as dsolocal is concerned, currently is either MLIR::Visibility isPublic() or isPrivate(). Now, we don't handle hiddenVisibility and protectedVisibility from AST. I put missFeature assert so that If in anyway we translate hiddenVisibility or protectedVisibility into mlir::SymbolTable::Visibility::Private (hopefully not for it'd be confusing), then we need to revise this defaultVisibility setting. 2. call setNonAliasAttributes on global op upon discovery of its initialization, thus we have globals dso_local correctly set. Still missing is lots of function should have dso_local set, but the all depend on comDat implementation, which will come from next PR within the next few days.

OG codegen does not generate any exception related content when there are not calls happening inside the try block. For now we mimic OG and do the same, until we see a concrete use case that would have used emitting this code.

bcardosolopes

Thanks for working on this, it's a nice feature addition!

We can achieve the same by re-using existing ClangIR components. The suggested approach:

Create/use a regular global.
Use regular cir.get_global to retrieve the address.
Add a new attribute #cir.asm_reg that holds a StringAttr. This should be used to initialize the global.
Add a isGlobalRegister method to cir.global's extraClassDeclaration, which checks if the global has an initializer and whether that is a #cir.asm_reg.
Add support in LowerToLLVM.cpp to materialize cir.get_globals into metadata !<whatever>. cir.{load,store} referring to these globals should be converted to the proper intrinsic calls.
Use a more complete usecase that exercise both intrinsics: https://godbolt.org/z/deGs17rox
Use clang/test/CIR/CodeGen/abstract-cond.c as an example for testing both CIR and LLVM output in the same test.

bcardosolopes · 2024-07-15T21:38:48Z

clang/test/CIR/CodeGen/global-register.c

+// cir.func no_proto @main() extra(#fn_attr) {
+// %0 = cir.const #cir.int<1> : !s32i
+// %1 = cir.cast(integral, %0 : !s32i), !s64i
+// %2 = cir.get_RG @llvm.named.register.x20 : !cir.ptr<!u64i>


Thanks for trying to improve this! We don't eagerly lower things to LLVM right away though, this is done in LowerToLLVM.cpp.

bcardosolopes and others added 30 commits June 20, 2024 14:52

[CIR][CIRGen] Emit more delete calls

dcab399

[CIR][CIRGen][NFC] Expand skeleton for constructing array types

e07bb72

[CIR][CIRGen] Enable support of bool increment (llvm#493)

db668fb

CIRGenFunction::buildFromMemory can handle the `cir.bool` values. So we no longer need to emit the `NIY` error here.

[CIR][CIRGen] Add support for ctor/dtor based array init/destroy

e50e95c

Still missing lowering support, which will come next.

[CIR][CIRGen] Lower C++ array init and destroy

69c8d94

[CIR][LoweringPrepare] Add a missing store for current array idx

6ec946e

[CIR][CIRGen][NFC] Skeleton for atomics support

dc0b5c2

This doesn't change existing functionality, for existing crashes related to atomics we just hit asserts a bit further now, but no support added just yet. Next set of commits will introduce functionlity with testcases.

[CIR][CIRGen][Atomics][NFC] Skeleton for constant order codegen

976c733

Just like previous commit, add more infra pieces, still NFC since all relevant testcases hit asserts, just a bit deeper.

[CIR][CIRGen] Support for initialization of unions. (llvm#495)

7fd38c4

PR adds support for initialization of unions. The change is copy-pasted from the original CodeGen.

[CIR][CIRGen] Support for __builtin_prefetch (llvm#504)

69fa22a

This PR adds support for `__builtin_prefetch`. CIRGen of this builtin emits the new 'cir.prefetch' opcode. Then `cir.prefetch` lowers to `llvm.prefetch` intrinsic. Co-authored-by: Bruno Cardoso Lopes <[email protected]>

[CIR] Add MemRead/MemWrite markers to bitfield ops (llvm#507)

6e6fb5a

This PR adds MemRead/MemWrite markers to the `GetBitfieldOp` and `SetBitfieldOp` (as discussed in llvm#487) Also, minor renaming in the `SetBitfieldOp` --------- Co-authored-by: Bruno Cardoso Lopes <[email protected]>

[CIR][CIRGen] Support for __builtin_constant_p (llvm#506)

22488f5

This PR adds support for `__builtin_constant_p`. Implementation introduces the new `cr.is_constant` opcode to it during the codegeneration of builtin. Codegeneration is taken from the original llvm codegen.

[CIR][CIRGen] Support for C++20 three-way comparison (llvm#485)

f2aad7c

This patch adds CIRGen support for the C++20 three-way comparison operator `<=>`. The binary operator is directly lowered to existing CIR operations. Most of the changes are tests.

[CIR][Rebase] Add CountAttributed enum case in a switch

92147b5

[CIR][CIRGen] Support for compound literal lvalue (llvm#515)

0d0801d

This change is taken from the original codegen.

[CIR][CIRGen] Support for __attribute__((fallthrough)) statement (llv…

ebb4410

…m#517) This PR adds handling of AttributedStmt to support fallthrough attribute.

[CIR][CIRGen] Add handling __extension__ keyword for lvalue (llvm#519)

369c58e

This change is taken from the original codegen

[CIR] Add MLIRCIREnumsGen as a dep to MLIRCIRInterfaces

8bc923c

[CIR][CIRGen] Add support for __atomic_add_fetch

45c885e

This introduces CIRGen and LLVM lowering for the first of a bunch of these atomic operations, incremental work should generelize the current constructs.

[CIR][CIRGen] Add support for other __atomic_{binop}_fetch

b9dd8d5

[CIR][CIRGen] Add a verifier to catch binop_fetch's that do not suppo…

6ee99cf

…rt fp

bcardosolopes and others added 18 commits July 8, 2024 12:32

[CIR][LowerToLLVM] Builtins: Lower cir.clear_cache to LLVM

06026aa

[CIR][ThroughMLIR] Fix floating GlobalOp lowering without initialized…

f7059c7

… value (llvm#719) This commit fixes GlobalOp lowering for floating without initial value. It implies to be initialized with zeros. E.g. float f[100]; double d;

Revert "[CIR][Transforms] Move RemoveRedundantBranches logic into BrO…

5792eb0

…p::fold method (llvm#663)" This reverts commit 3b9f698.

[CIR][NFC] Fix bug in MLIR lowering of cir.call (llvm#728)

7c4d03a

This PR fixes the bug described as in llvm#727 (comment). It should resolve the crash reported in llvm#727 .

[CIR][Lowering] Exceptions: Add support for flattening cir.try

965a3bc

For now only handle the cir.try part, cir.catch is coming next. Using flat cir for tests make this easy to incrementally build.

[CIR][NFCI] cir.try: improve docs and be more flexible on the region

42a4292

[CIR][IR] Fix parsing of dsolocal in cir.func (llvm#732)

96b0628

as title. document will be in another PR as it seems to be a different upstream branch

[CIR][CIRGen] cir.try: handle trivial ones and fix crash

b365964

OG codegen does not generate any exception related content when there are not calls happening inside the try block. For now we mimic OG and do the same, until we see a concrete use case that would have used emitting this code.

Realize RG and GET_RG op

b7fd5bb

566hub requested review from lanza and bcardosolopes as code owners July 15, 2024 14:33

566hub changed the title ~~Realize RG and GET_RG op~~ [CIR]Realize RG and GET_RG op Jul 15, 2024

bcardosolopes requested changes Jul 15, 2024

View reviewed changes

bcardosolopes changed the title ~~[CIR]Realize RG and GET_RG op~~ [CIR][CIRGen] Add support for global named registers Jul 15, 2024

lanza force-pushed the main branch from 67ce904 to 845c25a Compare September 19, 2024 03:42

lanza force-pushed the main branch from 605343a to 4c44bf1 Compare October 2, 2024 05:01

smeenai force-pushed the main branch from 93f58cc to 73f84e8 Compare October 9, 2024 21:57

lanza force-pushed the main branch from 5b99f59 to cb0cb34 Compare October 19, 2024 00:42

lanza force-pushed the main branch from 3ef67c1 to adf0c99 Compare November 4, 2024 00:41

smeenai force-pushed the main branch 2 times, most recently from 4aca8d4 to a04cf10 Compare November 23, 2024 06:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CIR][CIRGen] Add support for global named registers #741

[CIR][CIRGen] Add support for global named registers #741

566hub commented Jul 15, 2024

bcardosolopes left a comment

bcardosolopes Jul 15, 2024

[CIR][CIRGen] Add support for global named registers #741

Are you sure you want to change the base?

[CIR][CIRGen] Add support for global named registers #741

Conversation

566hub commented Jul 15, 2024

bcardosolopes left a comment

Choose a reason for hiding this comment

bcardosolopes Jul 15, 2024

Choose a reason for hiding this comment