[[riscv-cc]] = RISC-V Calling Conventions ifeval::["{docname}" == "riscv-cc"] include::prelude.adoc[] endif::[] == Register Convention === Integer Register Convention .Integer register convention [cols="1,1,2,2"] |=== | Name | ABI Mnemonic | Meaning | Preserved across calls? | x0 | zero | Zero | -- (Immutable) | x1 | ra | Return address | No | x2 | sp | Stack pointer | Yes | x3 | gp | Global pointer | -- (Unallocatable) | x4 | tp | Thread pointer | -- (Unallocatable) | x5 - x7 | t0 - t2 | Temporary registers | No | x8 - x9 | s0 - s1 | Callee-saved registers | Yes | x10 - x17 | a0 - a7 | Argument registers | No | x18 - x27 | s2 - s11 | Callee-saved registers | Yes | x28 - x31 | t3 - t6 | Temporary registers | No |=== In the standard ABI, procedures should not modify the integer registers tp and gp, because signal handlers may rely upon their values. The presence of a frame pointer is optional. If a frame pointer exists, it must reside in x8 (s0); the register remains callee-saved. If a platform requires use of a dedicated general-purpose register for a platform-specific purpose, it is recommended to use gp (x3). The platform ABI specification must document the use of this register. For such platforms, care must be taken to ensure all code (compiler generated or otherwise) avoids using gp in a way incompatible with the platform specific purpose, and that global pointer relaxation is disabled in the toolchain. === Frame Pointer Convention The presence of a frame pointer is optional. If a frame pointer exists, it must reside in x8 (s0); the register remains callee-saved. Code that uses a frame pointer will construct a linked list of stack frames, where each frame links to its caller using a "frame record". A frame record consists of two XLEN values on the stack; the return address and the link to the next frame record. The frame pointer register will point to the innermost frame, thereby starting the linked list. By convention, the lowest XLEN value shall point to the previous frame, while the next XLEN value shall be the return address. The end of the frame record chain is indicated by the address zero appearing as the next link in the chain. After the prologue, the frame pointer register will point to the Canonical Frame Address or CFA, which is the stack pointer value on entry to the current procedure. The previous frame pointer and return address pair will reside just prior to the current stack address held in `fp`. This puts the return address at `fp - XLEN/8`, and the previous frame pointer at `fp - 2 * XLEN/8`. It is left to the platform to determine the level of conformance with this convention. A platform may choose: - not to maintain a frame chain and use the frame pointer register as a general purpose callee-saved register. - to allow the frame pointer register be used as a general purpose callee-saved register, but provide a platform specific mechanism to reliably detect this condition. - to use a frame pointer to address a valid frame record at all times, but allow any procedure to choose to forgo creating a frame record. - to use the frame pointer to address a valid frame record at all times, except leaf functions, who may elect to forgo creating a frame record. === Floating-Point Register Convention .Floating-point register convention [cols="1,1,2,2"] |=== | Name | ABI Mnemonic | Meaning | Preserved across calls? | f0 - f7 | ft0 - ft7 | Temporary registers | No | f8 - f9 | fs0 - fs1 | Callee-saved registers | Yes* | f10 - f17 | fa0 - fa7 | Argument registers | No | f18 - f27 | fs2 - fs11 | Callee-saved registers | Yes* | f28 - f31 | ft8 - ft11 | Temporary registers | No |=== *: Floating-point values in callee-saved registers are only preserved across calls if they are no larger than the width of a floating-point register in the targeted ABI. Therefore, these registers can always be considered temporaries if targeting the base integer calling convention. The Floating-Point Control and Status Register (fcsr) must have thread storage duration in accordance with C11 section 7.6 "Floating-point environment ". === Vector Register Convention ==== Standard Calling Convention .Standard vector register calling convention [%autowidth] |=== | Name | ABI Mnemonic | Meaning | Preserved across calls? | v0-v31 | | Temporary registers | No | vl | | Vector length | No | vtype | | Vector data type register | No | vxrm | | Vector fixed-point rounding mode register | No | vxsat | | Vector fixed-point saturation flag register | No |=== NOTE: Vector registers are not used for passing arguments or return values in this calling convention. Use the calling convention variant to pass arguments and return values in vector registers. The `vxrm` and `vxsat` fields of `vcsr` are not preserved across calls and their values are unspecified upon entry. Procedures may assume that `vstart` is zero upon entry. Procedures may assume that `vstart` is zero upon return from a procedure call. NOTE: Application software should normally not write `vstart` explicitly. Any procedure that does explicitly write `vstart` to a nonzero value must zero `vstart` before either returning or calling another procedure. ==== Calling Convention Variant Functions that use vector registers to pass arguments and return values must follow this calling convention. Some programming languages can require extra functions to follow this calling convention (e.g. C/C++ functions with attribute `riscv_vector_cc`). Please refer to the <> section for more details about standard vector calling convention variant. .Variant vector register calling convention [%autowidth] |=== | Name | ABI Mnemonic | Meaning | Preserved across calls? | v0 | | Argument register | No | v1-v7 | | Callee-saved registers | Yes | v8-v23 | | Argument registers | No | v24-v31 | | Callee-saved registers | Yes | vl | | Vector length | No | vtype | | Vector data type register | No | vxrm | | Vector fixed-point rounding mode register | No | vxsat | | Vector fixed-point saturation flag register | No |=== NOTE: The `vxrm` and `vxsat` fields of `vcsr` follow the same behavior as the standard calling convention. == Procedure Calling Convention This chapter defines standard calling conventions and standard calling convention variants, and describes how to pass arguments and return values. Functions must follow the register convention defined in calling convention: the contents of any register without specifying it as an argument register in the calling convention are unspecified upon entry, and the content of any register without specifying it as a return value register or callee-saved in the calling convention are unspecified upon exit, the contents of all callee-saved registers must be restored to what was set on entry, and the contents of any fixed registers like `gp` and `tp` never change. NOTE: Calling convention for big-endian is *NOT* included in this specification yet, we intend to define that in future version of this specification. [#integer-cc] === Integer Calling Convention The base integer calling convention provides eight argument registers, a0-a7, the first two of which are also used to return values. Scalars that are at most XLEN bits wide are passed in a single argument register, or on the stack by value if none is available. When passed in registers or on the stack, integer scalars narrower than XLEN bits are widened according to the sign of their type up to 32 bits, then sign-extended to XLEN bits. When passed in registers or on the stack, floating-point types narrower than XLEN bits are widened to XLEN bits, with the upper bits undefined. When passed in registers or on the stack, `_BitInt(N)` types are handled as follows: - For N < XLEN: Types are widened to XLEN bits, with upper bits sign-extended for signed `_BitInt(N)` or zero-extended for unsigned `_BitInt(N)`. However, for N = 32 in RV64, both signed and unsigned types are sign-extended. + - For N = XLEN: Types are passed using the same rules as scalars that are XLEN bits wide. - For XLEN < N ≤ 2×XLEN: Types are passed using the same rules as scalars that are 2×XLEN bits wide. Any unused bits in the most significant chunk are sign-extended for signed types or zero-extended for unsigned types. - For N > 2×XLEN: Types are passed by reference (replaced in the argument list with the address). When stored in memory, any unused bits in the most significant chunk are sign-extended for signed types or zero-extended for unsigned types. [NOTE] ==== `_BitInt(32)` on RV64 has a special rule because it behaves consistently with regular `int` types. ==== Scalars that are 2×XLEN bits wide are passed in a pair of argument registers, with the low-order XLEN bits in the lower-numbered register and the high-order XLEN bits in the higher-numbered register. If no argument registers are available, the scalar is passed on the stack by value. If exactly one register is available, the low-order XLEN bits are passed in the register and the high-order XLEN bits are passed on the stack. Scalars wider than 2×XLEN bits are passed by reference and are replaced in the argument list with the address. Aggregates whose total size is no more than XLEN bits are passed in a register, with the fields laid out as though they were passed in memory. If no register is available, the aggregate is passed on the stack. Aggregates whose total size is no more than 2×XLEN bits are passed in a pair of registers; if only one register is available, the first XLEN bits are passed in a register and the remaining bits are passed on the stack. If no registers are available, the aggregate is passed on the stack. Bits unused due to padding, and bits past the end of an aggregate whose size in bits is not divisible by XLEN, are undefined. Aggregates or scalars passed on the stack are aligned to the greater of the type alignment and XLEN bits, but never more than the stack alignment. Aggregates larger than 2×XLEN bits are passed by reference and are replaced in the argument list with the address, as are {Cpp} aggregates with nontrivial copy constructors, destructors, or vtables. Fixed-length vectors are treated as aggregates. Empty structs or union arguments or return values are ignored by C compilers which support them as a non-standard extension. This is not the case for {Cpp}, which requires them to be sized types. Arguments passed by reference may be modified by the callee. Floating-point reals are passed the same way as aggregates of the same size; complex floating-point numbers are passed the same way as a struct containing two floating-point reals. (This constraint changes when the integer calling convention is augmented by the hardware floating-point calling convention.) In the base integer calling convention, variadic arguments are passed in the same manner as named arguments, with one exception. Variadic arguments with 2×XLEN-bit alignment and size at most 2×XLEN bits are passed in an *aligned* register pair (i.e., the first register in the pair is even-numbered), or on the stack by value if none is available. After a variadic argument has been passed on the stack, all future arguments will also be passed on the stack (i.e. the last argument register may be left unused due to the aligned register pair rule). Values are returned in the same manner as a first named argument of the same type would be passed. If such an argument would have been passed by reference, the caller allocates memory for the return value, and passes the address as an implicit first parameter. NOTE: There is no requirement that the address be returned from the function and so software should not assume that a0 will hold the address of the return value on return. The stack grows downwards (towards lower addresses) and the stack pointer shall be aligned to a 128-bit boundary upon procedure entry. The first argument passed on the stack is located at offset zero of the stack pointer on function entry; following arguments are stored at correspondingly higher addresses. In the standard ABI, the stack pointer must remain aligned throughout procedure execution. Non-standard ABI code must realign the stack pointer prior to invoking standard ABI procedures. The operating system must realign the stack pointer prior to invoking a signal handler; hence, POSIX signal handlers need not realign the stack pointer. In systems that service interrupts using the interruptee's stack, the interrupt service routine must realign the stack pointer if linked with any code that uses a non-standard stack-alignment discipline, but need not realign the stack pointer if all code adheres to the standard ABI. Procedures must not rely upon the persistence of stack-allocated data whose addresses lie below the stack pointer. Registers s0-s11 shall be preserved across procedure calls. No floating-point registers, if present, are preserved across calls. (This property changes when the integer calling convention is augmented by the hardware floating-point calling convention.) === Hardware Floating-Point Calling Convention The hardware floating-point calling convention adds eight floating-point argument registers, fa0-fa7, the first two of which are also used to return values. Values are passed in floating-point registers whenever possible, whether or not the integer registers have been exhausted. The remainder of this section applies only to named arguments. Variadic arguments are passed according to the integer calling convention. ABI_FLEN refers to the width of a floating-point register in the ABI. The ABI_FLEN must be no wider than the ISA's FLEN. The ISA might have wider floating-point registers than the ABI. For the purposes of this section, "struct" refers to a C struct with its hierarchy flattened, including any array fields. That is, `struct { struct { float f[1]; } a[2]; }` and `struct { float f0; float f1; }` are treated the same. Fields containing empty structs or unions are ignored while flattening, even in {Cpp}, unless they have nontrivial copy constructors or destructors. Fields containing zero-length bit-fields or zero-length arrays are ignored while flattening. Attributes such as `aligned` or `packed` do not interfere with a struct's eligibility for being passed in registers according to the rules below, i.e. `struct { int i; double d; }` and `+struct __attribute__((__packed__)) { int i; double d }+` are treated the same, as are `struct { float f; float g; }` and `+struct { float f; float g __attribute__ ((aligned (8))); }+`. NOTE: One exceptional case for the flattening rule is an array of empty structs or unions; C treats it as an empty field, but {Cpp} treats it as a non-empty field since {Cpp} defines the size of an empty struct or union as 1. i.e. for `struct { struct {} e[1]; float f; }` as the first argument, C will treat it like `struct { float f; }` and pass `f` in `fa0` as described below, whereas {Cpp} will pass the pass the entire aggregate in `a0` (XLEN = 64) or `a0` and `a1` (XLEN = 32), as described in the integer calling convention. Zero-length arrays of empty structs or union will be ignored for both C and {Cpp}. i.e. For `struct { struct {} e[0]; float f; };`, as the first argument, C and {Cpp} will treat it like `struct { float f; }` and pass `f` in `fa0` as described below. A real floating-point argument is passed in a floating-point argument register if it is no more than ABI_FLEN bits wide and at least one floating-point argument register is available. Otherwise, it is passed according to the integer calling convention. When a floating-point argument narrower than FLEN bits is passed in a floating-point register, it is 1-extended (NaN-boxed) to FLEN bits. A struct containing just one floating-point real is passed as though it were a standalone floating-point real. A struct containing two floating-point reals is passed in two floating-point registers, if neither real is more than ABI_FLEN bits wide and at least two floating-point argument registers are available. (The registers need not be an aligned pair and are assigned to the two reals in memory order.) Otherwise, it is passed according to the integer calling convention. A complex floating-point number, or a struct containing just one complex floating-point number, is passed as though it were a struct containing two floating-point reals. A struct containing one floating-point real and one integer (or bitfield), in either order, is passed in a floating-point register and an integer register, provided the floating-point real is no more than ABI_FLEN bits wide and the integer is no more than XLEN bits wide, and at least one floating-point argument register and at least one integer argument register is available. If the struct is passed in this manner, and the integer is narrower than XLEN bits, the remaining bits are unspecified. If the struct is not passed in this manner, then it is passed according to the integer calling convention. Unions are never flattened and are always passed according to the integer calling convention. Values are returned in the same manner as a first named argument of the same type would be passed. Floating-point registers fs0-fs11 shall be preserved across procedure calls, provided they hold values no more than ABI_FLEN bits wide. === Standard Vector Calling Convention Variant The _RISC-V V Vector Extension_<> defines a set of thirty-two vector registers, v0-v31. The _RISC-V Vector Extension Intrinsic Document_<> defines vector types which include vector mask types, vector data types, and tuple vector data types. A value of vector type can be stored in vector register groups. The remainder of this section applies only to named vector arguments, other named arguments and return values follow the standard calling convention. Variadic vector arguments are passed by reference. v0 is used to pass the first vector mask argument to a function, and to return vector mask result from a function. v8-v23 are used to pass vector data arguments, tuple vector data arguments and the rest vector mask arguments to a function, and to return vector data and vector tuple results from a function. It must ensure that the entire contents of v1-v7 and v24-v31 are preserved across the call. Each vector data type and vector tuple type has an LMUL attribute that indicates a vector register group. The value of LMUL indicates the number of vector registers in the vector register group and requires the first vector register number in the vector register group must be a multiple of it. For example, the LMUL of `vint64m8_t` is 8, so v8-v15 vector register group can be allocated to this type, but v9-v16 can not because the v9 register number is not a multiple of 8. If LMUL is less than 1, it is treated as 1. If it is a vector mask type, its LMUL is 1. Each vector tuple type also has an NFIELDS attribute that indicates how many vector register groups the type contains. Thus a vector tuple type needs to take up LMUL×NFIELDS registers. The rules for passing vector arguments are as follows: 1. For the first vector mask argument, use v0 to pass it. 2. For vector data arguments or rest vector mask arguments, starting from the v8 register, if a vector register group between v8-v23 that has not been allocated can be found and the first register number is a multiple of LMUL, then allocate this vector register group to the argument and mark these registers as allocated. Otherwise, pass it by reference and are replaced in the argument list with the address. 3. For tuple vector data arguments, starting from the v8 register, if NFIELDS consecutive vector register groups between v8-v23 that have not been allocated can be found and the first register number is a multiple of LMUL, then allocate these vector register groups to the argument and mark these registers as allocated. Otherwise, pass it by reference and are replaced in the argument list with the address. NOTE: The registers assigned to the tuple vector data argument must be consecutive. For example, for the function `void foo(vint32m1_t a, vint32m2_t b, vint32m1x2_t c)`, v8 will be allocated to `a`, v10-v11 will be allocated to `b`, v12-v13 instead of v9 and v12 will beallocated to `c`. NOTE: It should be stressed that the search for the appropriate vector register groups starts at v8 each time and does not start at the next register after the registers are allocated for the previous vector argument. Therefore, it is possible that the vector register number allocated to a vector argument can be less than the vector register number allocated to previous vector arguments. For example, for the function `void foo (vint32m1_t a, vint32m2_t b, vint32m1_t c)`, according to the rules of allocation, v8 will be allocated to `a`, v10-v11 will be allocated to `b` and v9 will be allocated to `c`. This approach allows more vector registers to be allocated to arguments in some cases. Vector values are returned in the same manner as the first named argument of the same type would be passed. Vector types are disallowed in struct or union. Vector arguments and return values are disallowed to pass to an unprototyped function. NOTE: Functions that use the standard vector calling convention variant must be marked with `STO_RISCV_VARIANT_CC`, see <> for the meaning of `STO_RISCV_VARIANT_CC`. NOTE: `setjmp`/`longjmp` follow the standard calling convention, which clobbers all vector registers. Hence, the standard vector calling convention variant won't disrupt the `jmp_buf` ABI. NOTE: Functions that use the standard vector calling convention variant follow an additional name mangling rule for {Cpp}. For more details, see <>. === Standard Fixed-length Vector Calling Convention Variant This section defines the calling convention variant for fixed-length vectors. The intention of this variant is to pass fixed-length vectors via the vector registers. For the definition of a fixed-length vector, see <>. This variant is based on the standard vector calling convention variant: the register convention and the rules for passing arguments and return values are the same. NOTE: The reason we define a separate calling convention variant is that we would like to define a flexible convention to utilize the variable length feature in the vector extension, also considering embedded vector extensions, such as `Zve32x`. ABI_VLEN refers to the width of a vector register in the calling convention variant. The ABI_VLEN must be no wider than the ISA's VLEN, meaning that the ISA may support wider vector registers than the ABI, but the ABI's VLEN cannot exceed the ISA's VLEN. ABI_VLEN represents the width (in bits) of the vector registers available in the calling convention for fixed-length vectors. ABI_VLEN can vary from 32 bits (as in `Zve32x`) up to the maximum supported by the ISA. The flexibility of ABI_VLEN enables the convention to adapt to both low-end embedded systems and high-performance processors that utilize wider vector registers. The ABI_VLEN is a parameter of this calling convention variant. It could be set by a command line option for the compiler or specified by a function attribute in the source code. NOTE: We suggest the toolchain implementation set the default value of ABI_VLEN to 128, as it's the most common minimal requirement. However, it is not fixed to 128, since the ISA allows the VLEN to be only 32 bits or 64 bits. This also enables the utilization of the capacity of longer VLEN. Users can build with an optimized library with larger ABI_VLEN for better utilization of those cores with longer VLEN. A fixed-length vector argument is passed in one vector argument register if the size of the vector is less than or equal to ABI_VLEN bit. [NOTE] ==== Even in the absence of specific vector extension support for certain element types, such as `__bf16`, `_Float16`, `float`, or `double`, the standard fixed-length vector calling convention rules still apply. For example, even without the support of extensions like `Zvfbfmin`, `Zve32f`, or `Zve64d`, these element types will be passed according to the calling convention rules outlined here. Additionally, data types such as `__int128_t`, which currently do not have direct support in any vector extension, will also follow these rules. This design ensures that the calling convention remains forward-compatible, minimizing the need for continuous adjustments as new extensions and data types are introduced in the future. The consistency in applying these rules to unsupported element types guarantees a smooth transition when future vector extensions become available, allowing for seamless integration of new features without requiring significant changes to the calling convention. ==== A fixed-length vector argument is passed in two vector argument registers, similar to vector data arguments with LMUL=2 and following the same register constraints, if the size of the vector is greater than ABI_VLEN bits and less than or equal to 2×ABI_VLEN bits. A fixed-length vector argument is passed in four vector argument registers, similar to vector data arguments with LMUL=4 and following the same register constraints, if the size of the vector is greater than ABI_VLEN bits and less than or equal to 4×ABI_VLEN bits. A fixed-length vector argument is passed in eight vector argument registers, similar to vector data arguments with LMUL=4 and following the same register constraints, if the size of the vector is greater than ABI_VLEN bits and less than or equal to 8×ABI_VLEN bits. [NOTE] ==== Fixed-length vectors that are not a power-of-2 in size will be rounded up to the next power-of-2 length for the purpose of register allocation and handling. For instance, a vector type like `int32x3_t` (which contains three 32-bit integers) will be treated as an `int32x4_t` (a 128-bit vector, as LMUL=1 for ABI_VLEN=128) in the ABI, and passed accordingly. This ensures consistency in how vectors are handled and simplifies the process of argument passing. Example: Consider an `int32x3_t` vector (three 32-bit integers): - The vector's total size is 96 bits, which is not a power of 2. - The ABI will round up the size to 128 bits (corresponding to `int32x4_t`), meaning the vector will be passed using one vector argument register when ABI_VLEN=128. This rule applies to all non-power-of-2 fixed-length vectors, ensuring they are treated consistently across different ABI_VLEN settings. ==== A fixed-length vector argument is passed by reference and is replaced in the argument list with the address if it is larger than 8×ABI_VLEN bit or if there is a shortage of vector argument registers. A struct containing members with all fixed-length vectors will be passed in vector argument registers like a vector tuple type if all members have the same length, the length is less than or equal to 4×ABI_VLEN bit, and the size of the whole struct is less than or equal to 8×ABI_VLEN bit. If there are not enough vector argument registers to pass the entire struct, it will pass by reference and is replaced in the argument list with the address. Otherwise, it will use the rule defined in the hardware floating-point calling convention. A struct containing just one fixed-length vector or a fixed-length vector array of length one, will be flattened as a single fixed-length vector argument if the size of the vector is less than or equal to 8×ABI_VLEN bit. Structs with zero-length fixed-length arrays use the rule defined in the hardware floating-point calling convention, which means it won't consume vector argument register either in C or {Cpp}. A struct containing just one fixed-length vector array is passed as though it were a vector tuple type if the size of the base element for the array is less than or equal to 8×ABI_VLEN bit, and the size of the array is less than 8×ABI_VLEN bits. If there are not enough vector argument registers to pass the entire struct, it will pass by reference and is replaced in the argument list with the address. Otherwise, it will use the rule defined in the hardware floating-point calling convention. Unions with fixed-length vectors are always passed according to the integer calling convention. The details of vector argument register rules are the same as the standard vector calling convention variant. NOTE: Functions that use the standard fixed-length vector calling convention variant must be marked with STO_RISCV_VARIANT_CC. See <> for the meaning of STO_RISCV_VARIANT_CC. NOTE: Functions that use the standard fixed-length vector calling convention variant follow an additional name mangling rule for {Cpp}. For more details, see <>. [NOTE] ==== When ABI_VLEN is smaller than the VLEN, the number of vector argument registers utilized remains unchanged. However, in such cases, values are only placed in a portion of these vector argument registers, corresponding to the size of ABI_VLEN. The remaining portion of the vector argument registers, which extends beyond the ABI_VLEN, will remain idle. This means that while the full capacity of the vector argument registers may not be used, the allocation of these registers do not change, ensuring consistency in register usage regardless of the ABI_VLEN to VLEN ratio. Example: With ABI_VLEN at 32 bits and VLEN at 128 bits, consider passing an `int32x4_t` parameter (four 32-bit integers). Allocation: Four vector argument registers are allocated for `int32x4_t`, based on LMUL=4. Utilization: All four integers are placed in the first vector register, utilizing its full 128-bit capacity (VLEN), despite ABI_VLEN being 32 bits. Remaining Registers: The other three allocated registers remain unused and idle. .int32x4_t layout on different VLEN with ABI_VLEN at 32 bits: [cols="2,3,3,3,3"] [width=100%] |=== | VLEN | v8 | v9 | v10 | v11 | 32 | a | b | c | d | 64 | a, b | c, d | -, - | -, - | 128 | a, b, c, d | -, -, -, - | -, -, -, - | -, -, -, - | 256 | a, b, c, d, -, -, -, - | -, -, -, -, -, -, -, - | -, -, -, -, -, -, -, - | -, -, -, -, -, -, -, - |=== .int64x8_t layout on different VLEN with ABI_VLEN at 128 bits: [cols="2,3,3,3,3"] [width=100%] |=== | VLEN | v8 | v9 | v10 | v11 | 128 | a, b | c, d | e, f | g, h | 256 | a, b, c, d | e, f, g, h | -, -, -, - | -, -, -, - | 512 | a, b, c, d, e, f, g, h | -, -, -, -, -, -, -, - | -, -, -, -, -, -, -, - | -, -, -, -, -, -, -, - |=== `-` means that part are not used, and the value can be anything. ==== NOTE: In a single compilation unit, different functions may use different ABI_VLEN values. This means that ABI_VLEN is not uniform across the entire unit, allowing for function-specific optimization. However, this necessitates that users ensure consistency in ABI_VLEN between calling and called functions. It is the user's responsibility to verify that the ABI_VLEN matches on both sides of a function call to ensure correct operation and data handling. === ILP32E Calling Convention IMPORTANT: RV32E is not a ratified base ISA and so we cannot guarantee the stability of ILP32E, in contrast with the rest of this document. This documents the current implementation in GCC as of the time of writing, but may be subject to change. The ILP32E calling convention is designed to be usable with the RV32E ISA. This calling convention is the same as the integer calling convention, except for the following differences. The stack pointer need only be aligned to a 32-bit boundary. Registers x16-x31 do not participate in the calling convention, so there are only six argument registers, a0-a5, only two callee-saved registers, s0-s1, and only three temporaries, t0-t2. If used with an ISA that has any of the registers x16-x31 and f0-f31, then these registers are considered temporaries. The ILP32E calling convention is not compatible with ISAs that have registers that require load and store alignments of more than 32 bits. In particular, this calling convention must not be used with the D ISA extension. === RV64ILP32* Calling Convention IMPORTANT: RV64ILP32* ABIs are experimental. The RV64ILP32* calling convention is designed to be usable with the RV64* ISA. These calling conventions are composed of the integer & floating-point & vector calling conventions. When passed in registers or on the stack, pointer scalars (32-bit), narrower than XLEN bits (64-bit), are sign-extended to XLEN bits. === Named ABIs This specification defines the following named ABIs: [[abi-ilp32]] ILP32:: Integer calling-convention only, hardware floating-point calling convention is not used (i.e. <> and <>). [[abi-ilp32f]] ILP32F:: ILP32 with hardware floating-point calling convention for ABI_FLEN=32 (i.e. <> and <>). [[abi-ilp32d]] ILP32D:: ILP32 with hardware floating-point calling convention for ABI_FLEN=64 (i.e. <> and <>). [[abi-ilp32e]] ILP32E:: <> only, hardware floating-point calling convention is not used (i.e. <>, <>, and <>). [[abi-lp64]] LP64:: Integer calling-convention only, hardware floating-point calling convention is not used (i.e. <> and <>). [[abi-lp64f]] LP64F:: LP64 with hardware floating-point calling convention for ABI_FLEN=32 (i.e. <> and <>). [[abi-lp64d]] LP64D:: LP64 with hardware floating-point calling convention for ABI_FLEN=64 (i.e. <> and <>). [[abi-lp64q]] LP64Q:: LP64 with hardware floating-point calling convention for ABI_FLEN=128 (i.e. <> and <>). [[abi-rv64ilp32]] RV64ILP32:: Integer calling-convention only, hardware floating-point calling convention is not used (i.e. <> and <>). [[abi-rv64ilp32f]] RV64ILP32F:: RV64ILP32 with hardware floating-point calling convention for ABI_FLEN=32 (i.e. <> and <>). [[abi-rv64ilp32d]] RV64ILP32D:: RV64ILP32 with hardware floating-point calling convention for ABI_FLEN=64 (i.e. <> and <>). [[abi-rv64ilp32q]] RV64ILP32Q:: RV64ILP32 with hardware floating-point calling convention for ABI_FLEN=128 (i.e. <> and <>). The LP64* ABIs are only compatible with RV64* ISAs. The ILP32* are compatible with RV32* and RV64* ISAs. NOTE: RV64ILP32* ABIs are experimental. The *F ABIs require the *F ISA extension, the *D ABIs require the *D ISA extension, and the LP64Q ABI requires the Q ISA extension. NOTE: This means code targeting the Zfinx extension always uses the ILP32, ILP32E or LP64 integer calling-convention only ABIs as there is no dedicated hardware floating-point register file. === Default ABIs While various different ABIs are technically possible, for software compatibility reasons it is strongly recommended to use the following default ABIs for specific architectures: [horizontal] on RV32G:: <> on RV64G:: <> NOTE: Although RV64GQ systems can technically use <>, it is strongly recommended to use LP64D on general-purpose RV64GQ systems for compatibility with standard RV64G software. == Calling Convention for System Calls The calling convention for system calls does not fall within the scope of this document. Please refer to the documentation of the RISC-V execution environment interface (e.g OS kernel ABI, SBI). == C/{Cpp} Type Details === C/{Cpp} Type Sizes and Alignments There are two conventions for C/{Cpp} type sizes and alignments. ILP32, ILP32F, ILP32D, and ILP32E:: Use the following type sizes and alignments (based on the ILP32 convention): + .C/{Cpp} type sizes and alignments for ILP32 [cols="4,>2,>3,4"] [width=60%] |=== | Type | Size (Bytes) | Alignment (Bytes) | Note | bool/_Bool | 1 | 1 | | char | 1 | 1 | | short | 2 | 2 | | int | 4 | 4 | | long | 4 | 4 | | long long | 8 | 8 | | ++intmax_t++ | 8 | 8 | ++long long int++ | ++uintmax_t++ | 8 | 8 | ++unsigned long long int++ | void * | 4 | 4 | | +++__bf16+++ | 2 | 2 | Half precision floating point (bfloat16) | _Float16 | 2 | 2 | Half precision floating point (binary16 in IEEE 754-2008) | float | 4 | 4 | Single precision floating point (binary32 in IEEE 754-2008) | double | 8 | 8 | Double precision floating point (binary64 in IEEE 754-2008) | long double | 16 | 16 | Quadruple precision floating point (binary128 in IEEE 754-2008) | float _Complex | 8 | 4 | | double _Complex | 16 | 8 | | long double _Complex | 32 | 16 | | _BitInt(1 ≤ N ≤ 8) | 1 | 1 | | _BitInt(9 ≤ N ≤ 16) | 2 | 2 | | _BitInt(17 ≤ N ≤ 32) | 4 | 4 | | _BitInt(33 ≤ N ≤ 64) | 8 | 8 | | _BitInt(N > 64) | RoundUp(N / 64) * 8 | 8 | |=== LP64, LP64F, LP64D, and LP64Q:: Use the following type sizes and alignments (based on the LP64 convention): + .C/{Cpp} type sizes and alignments for LP64 [cols="4,>2,>3,4"] [width=60%] |=== | Type | Size (Bytes) | Alignment (Bytes) | Note | bool/_Bool | 1 | 1 | | char | 1 | 1 | | short | 2 | 2 | | int | 4 | 4 | | long | 8 | 8 | | long long | 8 | 8 | | ++intmax_t++ | 8 | 8 | ++long int++ | ++uintmax_t++ | 8 | 8 | ++unsigned long int++ | +++__int128+++ | 16 | 16 | | void * | 8 | 8 | | +++__bf16+++ | 2 | 2 | Half precision floating point (bfloat16) | _Float16 | 2 | 2 | Half precision floating point (binary16 in IEEE 754-2008) | float | 4 | 4 | Single precision floating point (binary32 in IEEE 754-2008) | double | 8 | 8 | Double precision floating point (binary64 in IEEE 754-2008) | long double | 16 | 16 | Quadruple precision floating point (binary128 in IEEE 754-2008) | float _Complex | 8 | 4 | | double _Complex | 16 | 8 | | long double _Complex | 32 | 16 | | _BitInt(1 ≤ N ≤ 8) | 1 | 1 | | _BitInt(9 ≤ N ≤ 16) | 2 | 2 | | _BitInt(17 ≤ N ≤ 32) | 4 | 4 | | _BitInt(33 ≤ N ≤ 64) | 8 | 8 | | _BitInt(65 ≤ N ≤ 128)| 16 | 16 | | _BitInt(N > 128) | RoundUp(N / 128) * 16 | 16 | |=== The alignment of `max_align_t` is 16. `CHAR_BIT` is 8. Structs and unions are aligned to the alignment of their most strictly aligned member. The size of any object is a multiple of its alignment. === Fixed-Length Vector Various compilers have support for fixed-length vector types, for example GCC and Clang both support declaring a type with `\\__attribute__\((vector_size(N))`, where N is a positive number larger than zero. The alignment requirement for the fixed length vector shall be equivalent to the alignment requirement of its elemental type. The size of the fixed length vector is determined by multiplying the size of its elemental type by the total number of elements within the vector. ==== _BitInt (N) `_BitInt (N)` is the type defined in C23, allow user to define an arbitrary-sized integer type, where N is a postive integer larger than zero. `_BitInt(N)` types are stored in memory as follows: - For N ≤ 2×XLEN: Types follow the standard scalar endianness rules. - For N > 2×XLEN: Types are divided into 2×XLEN-bit chunks. Each chunk follows the standard scalar endianness rules. The chunk ordering in memory follows the system endianness: on little-endian systems, lower-addressed chunks contain less significant bits; on big-endian systems, lower-addressed chunks contain more significant bits. NOTE: The big-endian ABI is still experimental and may change in the future. Unused bits of the `_BitInt(N)` are extended based on the signedness of the type when stored in memory or register. For signed `_BitInt(N)`, unused bits are sign-extended from the most significant bit of the N-bit value. For unsigned `_BitInt(N)`, unused bits are zero-extended. NOTE: According to the C language specification, `_BitInt(N)` without an explicit `unsigned` qualifier is treated as signed. For example, `_BitInt(66)` is signed and will use sign-extension for unused bits, while `unsigned _BitInt(66)` will use zero-extension. === C/{Cpp} Type Representations `char` is unsigned. Booleans (`bool`/`_Bool`) stored in memory or when being passed as scalar arguments are either `0` (`false`) or `1` (`true`). A null pointer (for all types) has the value zero. `_Float16` is as defined in the C ISO/IEC TS 18661-3 extension. `__bf16` has the same parameter passing and return rules as for `_Float16`. `_Complex` types have the same layout as a struct containing two fields of the corresponding real type (`float`, `double`, or `long double`), with the first member holding the real part and the second member holding the imaginary part. The type `size_t` is defined as `unsigned int` for RV32 and `unsigned long` for RV64. The type `ptrdiff_t` is defined as `int` for RV32 and `long` for RV64. === Bit-Fields Bit-fields are packed in little-endian fashion. A bit-field that would span the alignment boundary of its integer type is padded to begin at the next alignment boundary. For example, `struct { int x : 10; int y : 12; }` is a 32-bit type with `x` in bits 9-0, `y` in bits 21-10, and bits 31-22 undefined. By contrast, `struct { short x : 10; short y : 12; }` is a 32-bit type with `x` in bits 9-0, `y` in bits 27-16, and bits 31-28 and bits 15-10 undefined. Bit-fields which are larger than their integer types are only present in {Cpp} and are defined by the _Itanium {Cpp} ABI_ <>. The bit-field and containing struct are aligned on a boundary corresponding to the largest integral type smaller than the bit-field, up to 64-bit alignment on RV32 or 128-bit alignment on RV64. Any bits in excess of the size of the declared type are treated as padding. For example `struct { char x : 9; char y; }` is a 24-bit type with `x` in bits 7-0, `y` in bit 23-16, and bits 15-8 undefined; `struct { char x : 9; char y : 2 }` is a 16-bit type with `x` in bits 7-0, `y` in bit 10-9, and bits 8 and 15-11 undefined. Unnamed nonzero length bit-fields allocate space in the same fashion as named bitfields but do not affect the alignment of the containing struct. Zero length bit-fields are aligned relative to the start of the containing struct according to their declared type and, since they must be unnamed, do not affect the struct alignment. C requires bit-fields on opposite sides of a zero-length bitfield to be treated as separate memory locations for the purposes of data races. === va_list, va_start, and va_arg The `va_list` type has the same representation as `void*` and points to a sequence of zero or more arguments with preceding padding for alignment, formatted and aligned as variadic arguments passed on the stack according to the integer calling convention (<>). All standard calling conventions use the same representation for variadic arguments to allow `va_list` types to be shared between them. The `va_start` macro in a function initializes its `va_list` argument to point to the first address at which a variadic argument could be passed to the function. If all integer argument registers are used for named formal arguments, the first variadic argument will have been passed on the stack by the caller, and the `va_list` can point to the address immediately after the last named argument passed on the stack, or the `sp` value on entry if no named arguments were passed on the stack. If some integer argument registers were not used for named formal arguments, then the first variadic argument may have been passed in a register. The function is then expected to construct a _varargs save area_ immediately below the entry `sp` and fill it with the entry values of all integer argument registers not used for named arguments, in sequence. The `va_list` value can then be initialized to the start of the varargs save area, and it will iterate through any variadic arguments passed via registers before continuing to variadic arguments passed on the stack, if any. The `va_arg` macro will align its `va_list` argument, fetch a value, and increment the `va_list` according to the alignment and size of a variadic argument of the given type, which may not be the same as the alignment and size of the given type in memory. If the type is passed by reference, the size and alignment used will be those of a pointer, and the fetched pointer will be used as the address of the actual argument. The `va_copy` macro is a single pointer copy and the `va_end` macro performs no operation. === Vector Type Sizes and Alignments This section defines the sizes and alignments for the vector types defined in the _RISC-V Vector Extension Intrinsic Document_<>. The actual size of each type is determined by the hardware configuration, which is based on the content of the `vlenb` register. There are three classes of vector types: the vector mask types, the vector data types and the vector tuple types. .Type sizes and alignments for vector mask types [cols="4,3,>3,>2"] [width=80%] |=== | Internal Name | Type | Size (Bytes) | Alignment (Bytes) | __rvv_vbool1_t | vbool1_t | VLENB | 1 | __rvv_vbool2_t | vbool2_t | VLENB / 2 | 1 | __rvv_vbool4_t | vbool4_t | VLENB / 4 | 1 | __rvv_vbool8_t | vbool8_t | ceil(VLENB / 8) | 1 | __rvv_vbool16_t | vbool16_t | ceil(VLENB / 16) | 1 | __rvv_vbool32_t | vbool32_t | ceil(VLENB / 32) | 1 | __rvv_vbool64_t | vbool64_t | ceil(VLENB / 64) | 1 |=== include::vector_type_infos.adoc[] NOTE: The vector mask types utilize a portion of the space, while the remaining content may be undefined, both in the register and in memory. NOTE: Size must be a positive integer. [appendix] == Linux-Specific ABI NOTE: This section of the RISC-V calling convention specification only applies to Linux-based systems. In order to ensure compatibility between different implementations of the C library for Linux, we provide some extra definitions which only apply on those systems. These are noted in this section. === Linux-Specific C Type Sizes and Alignments The following definitions apply for all ABIs defined in this document. Here there is no differentiation between ILP32 and LP64 ABIs. .Linux-specific C type sizes and alignments [cols="2,>1,>1"] [width=80%] |=== | Type | Size (Bytes) | Alignment (Bytes) | wchar_t | 4 | 4 | wint_t | 4 | 4 |=== === Linux-Specific C Type Representations The following definitions apply for all ABIs defined in this document. Here there is no differentiation between ILP32 and LP64 ABIs. `wchar_t` is signed. `wint_t` is unsigned. [bibliography] == References * [[[riscv-v-extension]]] "RISC-V V vector extension specification" https://github.com/riscv/riscv-v-spec * [[[rvv-intrinsic-doc]]] "RISC-V Vector Extension Intrinsic Document" https://github.com/riscv-non-isa/rvv-intrinsic-doc