Commit Graph

289 Commits

Author SHA1 Message Date
Alice Carlotti
3b6b69205c aarch64: Add support for --march=armv9.6-a 2025-07-12 10:04:27 +01:00
Alice Carlotti
891fa528c2 aarch64: Refactor exclusion of reg names in immediates
When parsing immediate values, register names should not be
misinterpreted as symbols.  However, for backwards compatibility we need
to permit some newer register names within older instructions.  The
current mechanism for doing so depends on the list of explicit
architecture requirements for the instructions, which is fragile and
easy to forget, and grows increasingly messy as more architecture
features are added.

This patch add explicit flags to each opcode to indicate which set of
register names is disallowed in each instance.  These flags are
mandatory for all opcodes with immediate operands, which ensures that
the choice of disallowed names will always be deliberate and explicit.

This patch should have no functional change.
2025-07-12 10:04:26 +01:00
Ezra Sitorus
87dcc3ddd6 aarch64: Support for FEAT_SVE_AES2
FEAT_SVE_AES2 implements the SVE multi-vector Advanced Encryption
Standard and 128-bit destination element polynomial multiply long
instructions, when the PE is not in Streaming SVE mode.
2025-07-11 12:53:25 +01:00
Ezra Sitorus
621c0c3469 aarch64: Support for FEAT_LSUI
FEAT_LSUI introduces unprivileged variants of load and store instructions so
that clearing PSTATE.PAN is never required in privileged software.
2025-07-11 12:53:19 +01:00
Ezra Sitorus
b80240ecba aarch64: Support for FEAT_PCDPHINT
FEAT_PCDPHINT - Producer-consumer data placement hints - is an optional
ISA extension that provides hint instructions to indicate:
- a store in the current execution thread is generating data at a specific
location, which a thread of execution on one or more other observers is
waiting on.
- the thread of execution on the current PE will read a location that may not
yet have been written with the value to be consumed.

This extension introduces:
- STSHH, a hint instruction, with operands (policies) keep and strm
- PRFM *IR*, a new prefetch memory operand.
2025-07-11 12:53:09 +01:00
Alice Carlotti
e68a412e16 aarch64: Add support for FEAT_SVE2p2 and FEAT_SME2p2 2025-07-08 21:15:43 +01:00
Srinath Parvathaneni
5103708c01 aarch64: Add supports for FEAT_PoPS feature and DC instructions.
This patch add support for FEAT_PoPS feature which can be enabled
through +pops command line flag.

This patch also adds support for following DC instructions and the
spec can be found here [1].
1. "dc cigdvaps" enabled on passing +memtag+pops command line flags.
2. "dc civaps" enabled on passing +pops command line flag.

[1]: https://developer.arm.com/documentation/ddi0601/2025-03/AArch64-Instructions?lang=en
2025-06-25 13:34:59 +01:00
Ezra Sitorus
17cae8183b aarch64: Support for FEAT_LSFE
FEAT_LSFE - Large System Float Extension - implements A64 base atomic
floating-point in-memory instructions.
2025-06-19 14:48:13 +01:00
Ezra Sitorus
4a6d6c97ca aarch64: Support for FEAT_SVE_F16F32MM, FEAT_F8F16M, FEAT_F8F32MM
FEAT_SVE_F16F32MM introduces the SVE half-precision floating-point
matrix multiply-accumulate to single-precision instruction.

FEAT_F8F32MM introduces the Advanced SIMD 8-bit floating-point matrix
multiply-accumulate to single-precision instruction.

FEAT_F8F16MM introduces the Advanced SIMD 8-bit floating-point matrix
multiply-accumulate to half-precision instruction.
2025-06-19 14:36:33 +01:00
Ezra Sitorus
a1f853de0f aarch64: Support for FEAT_CMPBR
FEAT_CMPBR - Compare and branch instructions. This patch adds these
instructions:
- CB<CC> (register)
- CB<CC> (immediate)
- CBH<CC>
- CBB<CC>

where CC is one of the following:
- EQ
- NE
- GT
- GE
- LT
- LE
- HI
- HS
- LO
- LS
2025-06-19 14:30:34 +01:00
Ezra Sitorus
78155cbb35 aarch64: Add occmo flag for FEAT_OCCMO
FEAT_OCCMO support was introduced, but the feature flags were missing.
This patch adds these flags, as well as splitting up the tests to test
occmo vs occmo+memtag operands.
2025-06-19 14:05:14 +01:00
Ezra Sitorus
3165109751 aarch64: Support for FEAT_SVE_BFSCALE
FEAT_SVE_BFSCALE introduces the SVE BFSCALE instruction, when the PE is not in
Streaming SVE mode. If FEAT_SME2 is implemented, FEAT_SVE_BFSCALE also
introduces SME multi-vector Z-targeting BFloat16 scaling instructions, BFSCALE
and BFMUL.
2025-06-19 13:59:29 +01:00
Richard Ball
f9a37571ba aarch64: Add support for FEAT_FPRCVT
FEAT_FPRCVT introduces new versions of previous instructions.
The instructions are used to convert between floating points and
Integers. These new versions take as operands SIMD&FP registers
for both the source and destination register. FEAT_FPRCVT also
enables the use of some existing AdvSIMD instructions in
streaming mode. However, no changes are needed in gas to support this.
2025-06-12 01:39:24 +01:00
Yury Khrustalev
c97cba49cf aarch64: Add definitions for missing architecture bits
Complete macros for feature bits for v9.1-A, v9.2-A, v9.3-A,
and v9.4-A.
2025-06-11 09:05:07 +01:00
Richard Earnshaw
ab65e51fa9 aarch64: Increase the number of feature words to 3
Now that most of the effort of updating the number of feature words is
handled by macros, add an additional one, taking the number of
supported features to 192.
2025-06-09 15:42:35 +01:00
Richard Earnshaw
dccb302cf2 aarch64: use macro trickery to automate feature array size replication
There are quite a few macros that need to be changed when we need to
increase the number of words in the features data structure.  With
some macro trickery we can automate most of this so that a single
macro needs to be updated.

With C2X we could probably do even better by using recursion, but this
is still a much better situation than we had previously.

A static assertion is used to ensure that there is always enough space
in the flags macro for the number of feature bits we need to support.
2025-06-09 15:42:35 +01:00
Yury Khrustalev
ec5409b186 aarch64: Fix typos in opcode headers 2025-06-09 10:45:35 +01:00
Alice Carlotti
92f7d4ddde aarch64: Eliminate AARCH64_OPND_SVE_ADDR_R
Adjust parsing for AARCH64_OPND_SVE_ADDR_RR{_LSL*} operands to accept
implicit XZR offsets.  Add new AARCH64_OPND_SVE_ADDR_RM{_LSL*} operands
to support instructions where an XZR offset is allowed but must be
specified explicitly.  This allows the removal of the duplicate opcode
table entries using AARCH64_OPND_SVE_ADDR_R.
2025-05-09 20:19:30 +01:00
Andrew Carlotti
3b44637d9d aarch64: Fix sve2p1 gating and add missing instructions
Many FEAT_SVE2p1 instructions need to be enabled by either of two
different features (one for streaming mode, and one for non-streaming
mode).  This patch adds correct gating conditions for these
instructions.

There were also a few sve2p1 instructions missing altogether, so add
those as well.

The testsuite is modified to check for all alternative enablement
conditions.  In many cases this is done by adding an alternative
assembler commands to existing test files.  For some SME/SME2 tests,
only some of the instructions are enabled by +sve2p1, so these are
copied into a separate test.  For original SVE2p1 tests, the non-SME2p1
instructions have been moved to a separate test file.

There are also new tests for the newly added instructions.  These
include a couple of fixme comments relating to bad error reporting,
which should be investigated later.
2025-01-17 16:19:56 +00:00
Srinath Parvathaneni
308d7670f0 aarch64: Add support for FEAT_SME_B16B16 feature.
This patch adds support for SME ZA-targeting non-widening BFloat16 instructions,
under tick FEAT_SME_B16B16 and command line flag "+sme-b16b16".

FEAT_SME_B16B16 implements FEAT_SME2 and FEAT_SVE_B16B16, in accordance with that
"+sme-b16b16" enables "+sme2" and "+sve-b16b16".

Also the test files related to FEAT_SME_B16B16 are prefixed with sme-b16b16*.
eg: sme-b16b16-1.s, sme-b16b16-1.d.

The spec for this feature and instructions is availabe here [1]:
[1]: https://developer.arm.com/documentation/ddi0602/2024-06/SME-Instructions?lang=en
2025-01-10 16:47:51 +00:00
Srinath Parvathaneni
d8c923031e aarch64: Add support for FEAT_SVE_B16B16 feature.
In the current code, SVE2 Bfloat16 instructions are implemented with tick
FEAT_B16B16 and command line flag "+b16b16" and this feature was suspended
due to incomplete support.

In the new spec available here[1], FEAT_B16B16 is replaced with
FEAT_SVE_B16B16 and command line flag "+b16b16" is replace with "sve-b16b16".

Also the test files related to FEAT_SVE_B16B16 are prefixed with sve-b16b16*.
eg: sve-b16b16-sve2-1.s, sve-b16b16-sve2-1.d.

This patch supports the SVE Z-targeting non-widening BFloat16 instructions
with command line flag "+sve-b16b16+sve2".
[1]: https://developer.arm.com/documentation/ddi0602/2024-06/SVE-Instructions?lang=en
2025-01-10 16:47:30 +00:00
Andrew Carlotti
b5378decd2 aarch64: Rename AARCH64_OPND_SME_ZT0_INDEX2_12
Rename to AARCH64_OPND_SME_ZT0_INDEX_MUL_VL.
2025-01-10 16:24:33 +00:00
Andrew Carlotti
2dd36fcc80 aarch64: Remove redundant sme-lutv2 qualifiers and operands 2025-01-10 16:24:33 +00:00
Srinath Parvathaneni
7bbf34834d aarch64: Add support for FEAT_SME_F16F16 feature.
This patch adds support for FEAT_SME_F16F16 feature (Non-widening
half-precision FP16 to FP16 arithmetic for SME2), which is enabled
using command line flags +sme-f16f16 to -march (which enables both
FEAT_SME2 and FEAT_SME_F16F16).

There are couple of instructions (fadd and fsub variants) which should
be allowed by the assembler on either passing +sme-f16f16 or +sme-f8f16.
Those instructions are already supported in the current assembler, this
patch adds tests for those instructions as well.
2025-01-10 14:07:06 +00:00
Alan Modra
e8e7cf2abe Update year range in copyright notice of binutils files 2025-01-01 18:29:57 +10:30
Matthieu Longo
46dace1933 aarch64: improve debuggability on array of enum
The current space optmization on enum aarch64_opn_qualifier forced its
encoding using an unsigned char. This "hard-coded" optimization has the
bad consequence of making the array of such enums being completely
unreadable when debugging with GDB because the enum type is lost along
the way.
Keeping this space optimization, and the enum type as well, is possible
when the declaration of the enum is tagged with attribute((packed)).
attribute((packed)) is a GNU extension, and is wrapped in the macro
ATTRIBUTE_PACKED (defined in ansidecl.h), and should be used instead.
2024-11-08 11:35:46 +00:00
Matthieu Longo
c703d0aff5 aarch64: change returned type to bool to match semantic of functions 2024-11-08 11:35:46 +00:00
Matthieu Longo
f0af85dc9e aarch64: make comment clearer about the location
The enum aarch64_opnd_qualifiers in include/opcode/aarch64.h needs to
stay in sync with the array of struct operand_qualifier_data which
defines various properties for the different type of operands. For
instance, for:
- registers: the size of the register, the number of elements.
- immediates: lower and upper bits to determine the range of values.
2024-11-08 11:35:46 +00:00
Indu Bhagat
002ac05902 opcodes: aarch64: enforce checks on subclass flags in aarch64-gen.c
Enforce some checks on the newly added subclass flags:
  - If a subclass is set of one insn of an iclass, every insn of that
    iclass must have non-zero subclass field.
  - For all other iclasses, the subclass bits are zero for all insns.

include/
        * opcode/aarch64.h (enum aarch64_insn_class): Identify the
	maximum iclass enum value.

opcodes/
        * aarch64-gen.c (iclass_has_subclasses_p): New array of bool.
        (read_table): Enforce checks on subclass flags.
2024-07-18 20:54:14 -07:00
Indu Bhagat
04521e258e include: opcodes: aarch64: define new subclasses
The existing iclass information tells us the general shape and purpose
of the instructions.  In some cases, however, we need to further disect
the iclass on the basis of other finer-grain information.  E.g., for the
purpose of SCFI, we need to know whether a given insn with iclass of
ldst_* is a load or a store.  Similarly, whether a particular arithmetic
insn is an add or sub or mov, etc.

This patch defines new flags to demarcate the insns.  Also provide an
access function for subclass lookup.

Later, we will enforce (in aarch64-gen.c) that if an iclass has at least
one instruction with a non-zero subclass, all instructions of the iclass
must have a non-zero subclass information.  If none of the defined
subclasses are applicable (or not required for SCFI purposes),
F_SUBCLASS_OTHER can be used for such instructions.

include/
        * opcode/aarch64.h (F_SUBCLASS): New flag.
        (F_SUBCLASS_OTHER): Likewise.
        (F_LDST_LOAD): Likewise.
        (F_LDST_STORE): Likewise.
        (F_ARITH_ADD): Likewise.
        (F_ARITH_SUB): Likewise.
        (F_ARITH_MOV): Likewise.
        (F_BRANCH_CALL): Likewise.
        (F_BRANCH_RET): Likewise.
	(F_DP_TAG_ONLY): Likewise.
        (aarch64_opcode_subclass_p): New definition.
2024-07-18 20:54:14 -07:00
Srinath Parvathaneni
7bdb051fd6 aarch64: Add support for sme2.1 zero instructions.
This patch adds support for following sme2.1 zero instructions and
the spec is available here [1].

1. ZERO (single-vector).
2. ZERO (double-vector).
3. ZERO (quad-vector).

The VECTOR GROUP symbols VGx2 and VGx4 are optional for the assembler
for most of the sme and sve instructions. But for few of the sme2.1
zero instruction variants VECTOR GROUP symbols VGx2 and VGx4 are mandatory.
To address this a bit "F_VG_REQ" is introduced in this patch, on setting
F_VG_REQ bit in flags of aarch64_opcode forces the assembler to accept
instruction operand only having VECTOR GROUP symbols.

[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SME-Instructions?lang=en
2024-07-12 15:41:56 +01:00
Srinath Parvathaneni
6ab366f264 aarch64: Add support for sme2.1 movaz instructions.
This patch adds support for following sme2.1 movaz instructions and
the spec is available here [1].

1. MOVAZ (array to vector, two registers).
2. MOVAZ (array to vector, four registers).
3. MOVAZ (tile to vector, single).

[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SME-Instructions?lang=en
2024-07-12 15:40:48 +01:00
Srinath Parvathaneni
9858d3031e aarch64: Add support for sme2.1 luti2 and luti4 instructions.
This patch adds support for following sme2.1 luti2 and luti4 instructions, spec is
available here [1]

1. LUTI2 (two registers) strided.
2. LUTI2 (four registers) strided.
3. LUTI4 (two registers) strided.
4. LUTI4 (four registers) strided.

[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SME-Instructions?lang=en
2024-07-12 15:39:15 +01:00
srinath
de7a30ceaa aarch64: Add support for sve2p1 pmov instruction.
This patch adds support for followign SVE2p1 instruction, spec is available here [1].

1. PMOV (to vector)
2. PMOV (to predicate)

Both pmov (to vector) and pmov (to predicate) have destination scalable vector
register and source scalable vector register respectively as an operand with no
suffix and optional index. To handle this case we have added 8 new operands in
this patch.

AARCH64_OPND_SVE_Zn0_INDEX,      /* Zn[index], bits [9:5].  */
AARCH64_OPND_SVE_Zn1_17_INDEX,    /* Zn[index], bits [9:5,17].  */
AARCH64_OPND_SVE_Zn2_18_INDEX,    /* Zn[index], bits [9:5,18:17].  */
AARCH64_OPND_SVE_Zn3_22_INDEX,    /* Zn[index], bits [9:5,18:17,22].  */
AARCH64_OPND_SVE_Zd0_INDEX,      /* Zn[index], bits [4:0].  */
AARCH64_OPND_SVE_Zd1_17_INDEX,    /* Zn[index], bits [4:0,17].  */
AARCH64_OPND_SVE_Zd2_18_INDEX,    /* Zn[index], bits [4:0,18:17].  */
AARCH64_OPND_SVE_Zd3_22_INDEX,    /* Zn[index], bits [4:0,18:17,22].  */

Since the index of the <Zd> operand is optional, the index part is
dropped in disassembly in both the cases of "no index" or "zero index".

As per spec: PMOV <Zd>{[<imm>]}, <Pn>.D
             PMOV <Pn>.D, <Zd>{[<imm>]}

Example1:
	Assembly: pmov z5[0], p6.d
	Disassembly: pmov z5, p6.d

        Assembly: pmov z5, p6.d
        Disassembly: pmov z5, p6.d

Example2:
	Assembly: pmov p4.b, z5[0]
	Disassembly: pmov p4.b, z5

        Assembly: pmov p4.b, z5
        Disassembly: pmov p4.b, z5
[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions?lang=en
2024-07-08 17:48:23 +01:00
Matthieu Longo
f83675969b aarch64: add STEP2 feature and its associated registers
AArch64 defines new registers for the feature step2 (Enhanced Software Step
Extension). step2 is an Armv9.5-A feature.

This patch also adds relevant tests. Regression tested on aarch64-none-elf,
and no regression found.
2024-07-05 15:39:28 +01:00
Matthieu Longo
27e411ef5d aarch64: add SPMU2 feature and its associated registers
AArch64 defines new registers for the feature spmu2 (System Performance
Monitors Extension version 2). spmu2 is an Armv9.5-A feature.

This patch also adds relevant tests. Regression tested on aarch64-none-elf,
and no regression found.
2024-07-05 15:39:28 +01:00
Matthieu Longo
a15809c010 aarch64: add E3DSE feature and its associated registers
AArch64 defines new registers for the feature e3dse (Delegated SError
exceptions for EL3): vdisr_el3 and vdisr_el3. e3dse is an Armv9.5-A
feature.

This patch also adds relevant tests. Regression tested on aarch64-none-elf,
and no regression found.
2024-07-05 15:39:28 +01:00
Claudio Bantaloukas
032eb4f718 aarch64: Add support for Armv9.5-A architecture
The new -march=armv9.5-a flag enables access to the
mandatory cpa, lut and faminmax extensions.
Existing test cases for features are extended to verify they
work without additional flags.
2024-06-28 14:52:30 +01:00
Srinath Parvathaneni
4f2cb9d129 aarch64: Fix sve2p1 ld[1-4]/st[1-4]q instruction operands.
This patch fixes encoding and syntax for sve2p1 instructions ld[1-4]q/st[1-4]q
as mentioned below, for the issues reported here.
https://sourceware.org/pipermail/binutils/2024-February/132408.html

1) Previously all the ld[1-4]q/st[1-4]q instructions are wrongly added as
predicated instructions and this issue is fixed in this patch by replacing
"SVE2p1_INSNC" with "SVE2p1_INSN" macro.
2) Wrong first operand in all the ld[1-4]q/st[1-4]q instructions is fixed
by replacing "SVE_Zt" with "SVE_ZtxN".
3) Wrong operand qualifiers in ld1q and st1q instructions are also fixed in
this patch.
4) In ld1q/st1q the index in the second argument is optional and if index
   is xzr and is skipped in the assembly, the index field is ignored by the
   disassembler.

Fixing above mentioned issues helps with following:
1) ld1q and st1q first register operand accepts enclosed figure braces.
2) ld2q, ld3q, ld4q, st2q, st3q, and st4q instructions accepts wrapping
   sequence of vector registers.

For the instructions ld[2-4]q/st[2-4]q, tests for wrapping sequence of vector
registers are added along with short-form of operands for non-wrapping sequence.

I have added test using following logic:
ld2q {Z0.Q, Z1.Q}, p0/Z, [x0,  #0, MUL VL]  //raw insn encoding (all zeroes)
ld2q {Z31.Q, Z0.Q}, p0/Z, [x0,  #0, MUL VL] // encoding of <Zt1>
ld2q {Z0.Q, Z1.Q}, p7/Z, [x0,  #0, MUL VL] // encoding of <Pg>
ld2q {Z0.Q, Z1.Q}, p0/Z, [x30,  #0, MUL VL] // encoding of <Xm>
ld2q {Z0.Q, Z1.Q}, p0/Z, [x0,  #-16, MUL VL] // encoding of <imm> (low value)
ld2q {Z0.Q, Z1.Q}, p0/Z, [x0,  #14, MUL VL] // encoding of <imm> (high value)
ld2q {Z31.Q, Z0.Q}, p7/Z, [x30,  #-16, MUL VL] // encoding of all fields (all ones)
ld2q {Z30.Q, Z31.Q}, p1/Z, [x3,  #-2, MUL VL] // random encoding.

For all the above form of instructions the hyphenated form is preferred for
disassembly if there are more than two registers in the list, and the register
numbers are monotonically increasing in increments of one.
2024-06-25 13:38:48 +01:00
Srinath Parvathaneni
f50b1a3c1f aarch64: Fix sve2p1 extq instruction operands.
This patch fixes the syntax of sve2p1 "extq" instruction by modifying the operands
count to 4. A new operand AARCH64_OPND_SVE_UIMM4 is defined to handle the 4th
argument an 4-bit unsigned immediate of extq instruction. The instruction encoding
is updated to use constraint C_SCAN_MOVPRFX, to enable "extq" instruction to immediately
precede in program order by a MOVPRFX instruction. Also removed the unused operand
AARCH64_OPND_SVE_Zm_imm4.

This issues was reported here:
 https://sourceware.org/pipermail/binutils/2024-February/132408.html
2024-06-25 13:38:48 +01:00
Srinath Parvathaneni
f5f38efc0a aarch64: Fix sve2p1 dupq instruction operands.
This patch fixes the syntax of sve2p1 "dupq" instruction by modifying the way
2nd operand does the encoding and decoding using the [<imm>] value.

dupq makes use of already existing aarch64_ins_sve_index and aarch64_ext_sve_index
inserter and extractor functions. The definitions of aarch64_ins_sve_index_imm (inserter)
and aarch64_ext_sve_index_imm (extractor) is removed in this patch.

This issues was reported here:
 https://sourceware.org/pipermail/binutils/2024-February/132408.html
2024-06-25 13:38:48 +01:00
Srinath Parvathaneni
8e018c070c aarch64: Enable mandatory feature bits for v9.4-A.
This patch fixes the mandatory feature bits in v9.4-a architectures,
by enabling FEAT_SVE2p1 for Armv9.4-A architecture by default.
2024-06-25 13:38:01 +01:00
Andrew Carlotti
a6e529673a aarch64: Add SME FP8 multiplication instructions
This includes:
- FEAT_SME_F8F32 (+sme-f8f32)
- FEAT_SME_F8F16 (+sme-f8f16)

The FP16 addition/subtraction instructions originally added by
FEAT_SME_F16F16 haven't been added to Binutils yet.  They are also
required to be enabled if FEAT_SME_F8F16 is present, so they are
included in this patch.
2024-06-24 16:50:28 +01:00
Andrew Carlotti
59b78ab1c1 aarch64: Add FP8 Neon and SVE multiplication instructions
This includes all the instructions under the following features:
- FEAT_FP8FMA (+fp8fma)
- FEAT_FP8DOT4 (+fp8dot4)
- FEAT_FP8DOT2 (+fp8dot2)
- FEAT_SSVE_FP8FMA (+ssve-fp8fma)
- FEAT_SSVE_FP8DOT4 (+ssve-fp8dot4)
- FEAT_SSVE_FP8DOT2 (+ssve-fp8dot2)
2024-06-24 16:50:28 +01:00
saurabh.jha@arm.com
adea87e275 gas, aarch64: Add SME2 lutv2 extension
Introduces instructions for the SME2 lutv2 extension for AArch64. They
are documented in the following document:

  * ARM DDI0602

For both luti4 instructions, we introduced an operand called
SME_Znx2_BIT_INDEX. We use the existing function parse_vector_reg_list
for parsing but modified that function so that it can accept operands
without qualifiers and rejects instructions that have operands with
qualifiers but are not supposed to have operands with qualifiers.
For disassembly, we modified print_register_list so that it could
accept register lists without qualifiers.

For one luti4 instruction, we introduced a SME_Zdnx4_STRIDED. It is
similar to SME_Ztx4_STRIDED and we could use existing code for parsing,
encoding, and disassembly.

For movt instruction, we introduced an operand called SME_ZT0_INDEX2_12.
This is a ZT0 register with a bit index encoded in [13:12]. It is
similar to SME_ZT0_INDEX.

We also introduced an iclass named sme_size_12_b so that we can encode
size bits [13:12] correctly when only 'b' is allowed as qualifier.
2024-06-24 15:00:40 +01:00
Andrew Carlotti
835fb5ac2a aarch64: Enable +cssc for armv8.9-a
FEAT_CSSC is mandatory in the architecture from Armv8.9.
2024-06-23 13:59:01 +01:00
Claudio Bantaloukas
72476aca8f aarch64: add Branch Record Buffer extension instructions
The FEAT_BRBE extension provides two aliases of sys:
- brb iall (Invalidates all Branch records in the Branch Record Buffer)
- brb inj (Injects the Branch Record held in BRBINFINJ_EL1,
  BRBSRCINJ_EL1, and BRBTGTINJ_EL1 into the Branch Record Buffer)

This patch adds:
- the feature option "brbe" that must be added for the aliases to be available
- a new operand flag AARCH64_OPND_Rt_IN_SYS_ALIASES that warns in a comment
  when Rt is set to the non default value 0b11111 (it is constrained
  unpredictable whether the instruction is undefined or behaves as if the Rt
  field is set to 0b11111).
- a new operand flag AARCH64_OPND_BRBOP that encodes and decodes Op2 values
  from bit 5
- support for the two brb aliases above

See:
- https://developer.arm.com/documentation/ddi0602/2024-03/Base-Instructions/BRB--Branch-Record-Buffer--an-alias-of-SYS-?lang=en
- https://developer.arm.com/documentation/ddi0601/2024-03/AArch64-Instructions/BRB-INJ--Branch-Record-Injection-into-the-Branch-Record-Buffer?lang=en
- https://developer.arm.com/documentation/ddi0601/2024-03/AArch64-Instructions/BRB-IALL--Invalidate-the-Branch-Record-Buffer?lang=en
2024-06-12 14:58:35 +01:00
saurabh.jha@arm.com
444c60fe33 gas, aarch64: Add SVE2 lut extension
Introduces instructions for the SVE2 lut extension for AArch64. They are documented in the following links:
* luti2: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions/LUTI2--Lookup-table-read-with-2-bit-indices-?lang=en
* luti4: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions/LUTI4--Lookup-table-read-with-4-bit-indices-?lang=en

These instructions use new SVE2 vector operands. They are called
SVE_Zm1_23_INDEX, SVE_Zm2_22_INDEX, and Zm3_12_INDEX and they have
1 bit, 2 bit, and 3 bit indices respectively.

The lsb and width of these new operands are the same as many existing
operands but the convention is to give different names to fields that
serve different purpose so we introduced new fields in aarch64-opc.c
and aarch64-opc.h.

We made a design choice for the second operand of the halfword variant of
luti4 with two register tables. We could have either defined a new operand,
like SVE_Znx2, or we could have use the existing operand SVE_ZnxN. With
the new operand, we would need to implement constraints on register
lists based on either operand or opcode flag. With existing operand, we
could just existing constraint checks using opcode flag. We chose
the second approach and went with SVE_ZnxN and added opcode flag to
enforce lengths of vector register list operands. This way, we can reuse
the existing constraint check logic.
2024-05-28 17:28:29 +01:00
saurabh.jha@arm.com
c3bb4211d9 gas, aarch64: Add AdvSIMD lut extension
Introduces instructions for the Advanced SIMD lut extension for AArch64. They are documented in the following links:
* luti2: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI2--Lookup-table-read-with-2-bit-indices-?lang=en
* luti4: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI4--Lookup-table-read-with-4-bit-indices-?lang=en

These instructions needed definition of some new operands. We will first
discuss operands for the third operand of the instructions and then
discuss a vector register list operand needed for the second operand.

The third operands are vectors with bit indices and without type
qualifiers. They are called Em_INDEX1_14, Em_INDEX2_13, and Em_INDEX3_12
and they have 1 bit, 2 bit, and 3 bit indices respectively. For these
new operands, we defined new parsing case branch. The lsb and width of
these operands are the same as many existing but the convention is to
give different names to fields that serve different purpose so we
introduced new fields in aarch64-opc.c and aarch64-opc.h for these new
operands.

For the second operand of these instructions, we introduced a new
operand called LVn_LUT. This represents a vector register list with
stride 1. We defined new inserter and extractor for this new operand and
it is encoded in FLD_Rn. We are enforcing the number of registers in the
reglist using opcode flag rather than operand flag as this is what other
SIMD vector register list operands are doing. The disassembly also uses
opcode flag to print the correct number of registers.
2024-05-28 17:28:29 +01:00
Victor Do Nascimento
7d0383ad39 aarch64: fp8 convert and scale - add feature flags and related structures 2024-05-16 13:22:30 +01:00