Commit Graph

158 Commits

Author SHA1 Message Date
Richard Earnshaw
0454220d58 aarch64: Use an enum to refer to indices in the opcode table
The indices into the auto-generated tables for opcodes are relatively
unstable.  Adding a new opcode can permute the code significantly.
But most of this churn is down to changes in the index values.  To
minimize this use enumerated constants.  While the index values
change, the enumeration names will need to do so far less often, so
most of the changes in the generated code become localized to the
addition (occasionally removal) of opcodes.  This change also makes
the state-change comments unnecessary.  The enumeration names contain
the same information (and more), so these are simply deleted.

The enumeration values are placed in a new header file, aarch64-tbl-2.h,
so aarch64-gen gains a new option to build this header and the Makefile
rules are adjusted accordingly.
2025-07-21 18:18:51 +01:00
Richard Earnshaw
63de89b2c1 aarch64: use an enumeration for operand indices.
The generated aarch64 operand tables use index values into an array.  But if
the table of operands is modified by inserting a new operand into the middle
of the table, *all* the index values can change, leading to a lot of
churn in the generated output.

include/opcode/aarch64.h already provides an enumeration for the operands,
so make use of that instead of printing out the raw index values.
2025-07-21 18:17:51 +01:00
Richard Earnshaw
9db671074c aarch64: minor code cleanups to aarch64-gen.c
Fix some overly-long lines.
2025-07-21 18:15:39 +01:00
Ezra Sitorus
87dcc3ddd6 aarch64: Support for FEAT_SVE_AES2
FEAT_SVE_AES2 implements the SVE multi-vector Advanced Encryption
Standard and 128-bit destination element polynomial multiply long
instructions, when the PE is not in Streaming SVE mode.
2025-07-11 12:53:25 +01:00
Ezra Sitorus
621c0c3469 aarch64: Support for FEAT_LSUI
FEAT_LSUI introduces unprivileged variants of load and store instructions so
that clearing PSTATE.PAN is never required in privileged software.
2025-07-11 12:53:19 +01:00
Ezra Sitorus
b80240ecba aarch64: Support for FEAT_PCDPHINT
FEAT_PCDPHINT - Producer-consumer data placement hints - is an optional
ISA extension that provides hint instructions to indicate:
- a store in the current execution thread is generating data at a specific
location, which a thread of execution on one or more other observers is
waiting on.
- the thread of execution on the current PE will read a location that may not
yet have been written with the value to be consumed.

This extension introduces:
- STSHH, a hint instruction, with operands (policies) keep and strm
- PRFM *IR*, a new prefetch memory operand.
2025-07-11 12:53:09 +01:00
Ezra Sitorus
17cae8183b aarch64: Support for FEAT_LSFE
FEAT_LSFE - Large System Float Extension - implements A64 base atomic
floating-point in-memory instructions.
2025-06-19 14:48:13 +01:00
Ezra Sitorus
4a6d6c97ca aarch64: Support for FEAT_SVE_F16F32MM, FEAT_F8F16M, FEAT_F8F32MM
FEAT_SVE_F16F32MM introduces the SVE half-precision floating-point
matrix multiply-accumulate to single-precision instruction.

FEAT_F8F32MM introduces the Advanced SIMD 8-bit floating-point matrix
multiply-accumulate to single-precision instruction.

FEAT_F8F16MM introduces the Advanced SIMD 8-bit floating-point matrix
multiply-accumulate to half-precision instruction.
2025-06-19 14:36:33 +01:00
Ezra Sitorus
a1f853de0f aarch64: Support for FEAT_CMPBR
FEAT_CMPBR - Compare and branch instructions. This patch adds these
instructions:
- CB<CC> (register)
- CB<CC> (immediate)
- CBH<CC>
- CBB<CC>

where CC is one of the following:
- EQ
- NE
- GT
- GE
- LT
- LE
- HI
- HS
- LO
- LS
2025-06-19 14:30:34 +01:00
Ezra Sitorus
3165109751 aarch64: Support for FEAT_SVE_BFSCALE
FEAT_SVE_BFSCALE introduces the SVE BFSCALE instruction, when the PE is not in
Streaming SVE mode. If FEAT_SME2 is implemented, FEAT_SVE_BFSCALE also
introduces SME multi-vector Z-targeting BFloat16 scaling instructions, BFSCALE
and BFMUL.
2025-06-19 13:59:29 +01:00
Richard Ball
f9a37571ba aarch64: Add support for FEAT_FPRCVT
FEAT_FPRCVT introduces new versions of previous instructions.
The instructions are used to convert between floating points and
Integers. These new versions take as operands SIMD&FP registers
for both the source and destination register. FEAT_FPRCVT also
enables the use of some existing AdvSIMD instructions in
streaming mode. However, no changes are needed in gas to support this.
2025-06-12 01:39:24 +01:00
Alice Carlotti
fd45b1c1aa aarch64: Mark predicate-as-counter pseudo instructions
Using explicit pseudo aliases is clearer and more consistent with other
instruction aliases.

This does not change behaviour.  For the non-alias instructions
(everything except mov) we already picked the first matching entry for
disassembly by default.  For mov we picked the last matching aliased
entry, which remained the original alias since do_misc_decoding doesn't
recognise OP_MOV_PN_PN.
2025-05-09 20:27:22 +01:00
Alice Carlotti
f1c037989a aarch64: Fix dgh disassembly 2025-05-09 20:27:22 +01:00
Alice Carlotti
51df25b00f aarch64: Mark SME mova aliases
This will only change behaviour during disassembly with -M no-aliases.
2025-05-09 20:27:22 +01:00
Alice Carlotti
a8d71f52d0 aarch64: Mark rev64 as a pseudo instruction
This is more natural than raising the priority of rev with F_P1, and
is functionally equivalent.
2025-05-09 20:27:22 +01:00
Alice Carlotti
92f7d4ddde aarch64: Eliminate AARCH64_OPND_SVE_ADDR_R
Adjust parsing for AARCH64_OPND_SVE_ADDR_RR{_LSL*} operands to accept
implicit XZR offsets.  Add new AARCH64_OPND_SVE_ADDR_RM{_LSL*} operands
to support instructions where an XZR offset is allowed but must be
specified explicitly.  This allows the removal of the duplicate opcode
table entries using AARCH64_OPND_SVE_ADDR_R.
2025-05-09 20:19:30 +01:00
Andrew Carlotti
2dd36fcc80 aarch64: Remove redundant sme-lutv2 qualifiers and operands 2025-01-10 16:24:33 +00:00
Alan Modra
e8e7cf2abe Update year range in copyright notice of binutils files 2025-01-01 18:29:57 +10:30
Srinath Parvathaneni
6ab366f264 aarch64: Add support for sme2.1 movaz instructions.
This patch adds support for following sme2.1 movaz instructions and
the spec is available here [1].

1. MOVAZ (array to vector, two registers).
2. MOVAZ (array to vector, four registers).
3. MOVAZ (tile to vector, single).

[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SME-Instructions?lang=en
2024-07-12 15:40:48 +01:00
srinath
de7a30ceaa aarch64: Add support for sve2p1 pmov instruction.
This patch adds support for followign SVE2p1 instruction, spec is available here [1].

1. PMOV (to vector)
2. PMOV (to predicate)

Both pmov (to vector) and pmov (to predicate) have destination scalable vector
register and source scalable vector register respectively as an operand with no
suffix and optional index. To handle this case we have added 8 new operands in
this patch.

AARCH64_OPND_SVE_Zn0_INDEX,      /* Zn[index], bits [9:5].  */
AARCH64_OPND_SVE_Zn1_17_INDEX,    /* Zn[index], bits [9:5,17].  */
AARCH64_OPND_SVE_Zn2_18_INDEX,    /* Zn[index], bits [9:5,18:17].  */
AARCH64_OPND_SVE_Zn3_22_INDEX,    /* Zn[index], bits [9:5,18:17,22].  */
AARCH64_OPND_SVE_Zd0_INDEX,      /* Zn[index], bits [4:0].  */
AARCH64_OPND_SVE_Zd1_17_INDEX,    /* Zn[index], bits [4:0,17].  */
AARCH64_OPND_SVE_Zd2_18_INDEX,    /* Zn[index], bits [4:0,18:17].  */
AARCH64_OPND_SVE_Zd3_22_INDEX,    /* Zn[index], bits [4:0,18:17,22].  */

Since the index of the <Zd> operand is optional, the index part is
dropped in disassembly in both the cases of "no index" or "zero index".

As per spec: PMOV <Zd>{[<imm>]}, <Pn>.D
             PMOV <Pn>.D, <Zd>{[<imm>]}

Example1:
	Assembly: pmov z5[0], p6.d
	Disassembly: pmov z5, p6.d

        Assembly: pmov z5, p6.d
        Disassembly: pmov z5, p6.d

Example2:
	Assembly: pmov p4.b, z5[0]
	Disassembly: pmov p4.b, z5

        Assembly: pmov p4.b, z5
        Disassembly: pmov p4.b, z5
[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions?lang=en
2024-07-08 17:48:23 +01:00
Srinath Parvathaneni
4f2cb9d129 aarch64: Fix sve2p1 ld[1-4]/st[1-4]q instruction operands.
This patch fixes encoding and syntax for sve2p1 instructions ld[1-4]q/st[1-4]q
as mentioned below, for the issues reported here.
https://sourceware.org/pipermail/binutils/2024-February/132408.html

1) Previously all the ld[1-4]q/st[1-4]q instructions are wrongly added as
predicated instructions and this issue is fixed in this patch by replacing
"SVE2p1_INSNC" with "SVE2p1_INSN" macro.
2) Wrong first operand in all the ld[1-4]q/st[1-4]q instructions is fixed
by replacing "SVE_Zt" with "SVE_ZtxN".
3) Wrong operand qualifiers in ld1q and st1q instructions are also fixed in
this patch.
4) In ld1q/st1q the index in the second argument is optional and if index
   is xzr and is skipped in the assembly, the index field is ignored by the
   disassembler.

Fixing above mentioned issues helps with following:
1) ld1q and st1q first register operand accepts enclosed figure braces.
2) ld2q, ld3q, ld4q, st2q, st3q, and st4q instructions accepts wrapping
   sequence of vector registers.

For the instructions ld[2-4]q/st[2-4]q, tests for wrapping sequence of vector
registers are added along with short-form of operands for non-wrapping sequence.

I have added test using following logic:
ld2q {Z0.Q, Z1.Q}, p0/Z, [x0,  #0, MUL VL]  //raw insn encoding (all zeroes)
ld2q {Z31.Q, Z0.Q}, p0/Z, [x0,  #0, MUL VL] // encoding of <Zt1>
ld2q {Z0.Q, Z1.Q}, p7/Z, [x0,  #0, MUL VL] // encoding of <Pg>
ld2q {Z0.Q, Z1.Q}, p0/Z, [x30,  #0, MUL VL] // encoding of <Xm>
ld2q {Z0.Q, Z1.Q}, p0/Z, [x0,  #-16, MUL VL] // encoding of <imm> (low value)
ld2q {Z0.Q, Z1.Q}, p0/Z, [x0,  #14, MUL VL] // encoding of <imm> (high value)
ld2q {Z31.Q, Z0.Q}, p7/Z, [x30,  #-16, MUL VL] // encoding of all fields (all ones)
ld2q {Z30.Q, Z31.Q}, p1/Z, [x3,  #-2, MUL VL] // random encoding.

For all the above form of instructions the hyphenated form is preferred for
disassembly if there are more than two registers in the list, and the register
numbers are monotonically increasing in increments of one.
2024-06-25 13:38:48 +01:00
Srinath Parvathaneni
f50b1a3c1f aarch64: Fix sve2p1 extq instruction operands.
This patch fixes the syntax of sve2p1 "extq" instruction by modifying the operands
count to 4. A new operand AARCH64_OPND_SVE_UIMM4 is defined to handle the 4th
argument an 4-bit unsigned immediate of extq instruction. The instruction encoding
is updated to use constraint C_SCAN_MOVPRFX, to enable "extq" instruction to immediately
precede in program order by a MOVPRFX instruction. Also removed the unused operand
AARCH64_OPND_SVE_Zm_imm4.

This issues was reported here:
 https://sourceware.org/pipermail/binutils/2024-February/132408.html
2024-06-25 13:38:48 +01:00
Srinath Parvathaneni
f5f38efc0a aarch64: Fix sve2p1 dupq instruction operands.
This patch fixes the syntax of sve2p1 "dupq" instruction by modifying the way
2nd operand does the encoding and decoding using the [<imm>] value.

dupq makes use of already existing aarch64_ins_sve_index and aarch64_ext_sve_index
inserter and extractor functions. The definitions of aarch64_ins_sve_index_imm (inserter)
and aarch64_ext_sve_index_imm (extractor) is removed in this patch.

This issues was reported here:
 https://sourceware.org/pipermail/binutils/2024-February/132408.html
2024-06-25 13:38:48 +01:00
Andrew Carlotti
a6e529673a aarch64: Add SME FP8 multiplication instructions
This includes:
- FEAT_SME_F8F32 (+sme-f8f32)
- FEAT_SME_F8F16 (+sme-f8f16)

The FP16 addition/subtraction instructions originally added by
FEAT_SME_F16F16 haven't been added to Binutils yet.  They are also
required to be enabled if FEAT_SME_F8F16 is present, so they are
included in this patch.
2024-06-24 16:50:28 +01:00
Andrew Carlotti
59b78ab1c1 aarch64: Add FP8 Neon and SVE multiplication instructions
This includes all the instructions under the following features:
- FEAT_FP8FMA (+fp8fma)
- FEAT_FP8DOT4 (+fp8dot4)
- FEAT_FP8DOT2 (+fp8dot2)
- FEAT_SSVE_FP8FMA (+ssve-fp8fma)
- FEAT_SSVE_FP8DOT4 (+ssve-fp8dot4)
- FEAT_SSVE_FP8DOT2 (+ssve-fp8dot2)
2024-06-24 16:50:28 +01:00
saurabh.jha@arm.com
adea87e275 gas, aarch64: Add SME2 lutv2 extension
Introduces instructions for the SME2 lutv2 extension for AArch64. They
are documented in the following document:

  * ARM DDI0602

For both luti4 instructions, we introduced an operand called
SME_Znx2_BIT_INDEX. We use the existing function parse_vector_reg_list
for parsing but modified that function so that it can accept operands
without qualifiers and rejects instructions that have operands with
qualifiers but are not supposed to have operands with qualifiers.
For disassembly, we modified print_register_list so that it could
accept register lists without qualifiers.

For one luti4 instruction, we introduced a SME_Zdnx4_STRIDED. It is
similar to SME_Ztx4_STRIDED and we could use existing code for parsing,
encoding, and disassembly.

For movt instruction, we introduced an operand called SME_ZT0_INDEX2_12.
This is a ZT0 register with a bit index encoded in [13:12]. It is
similar to SME_ZT0_INDEX.

We also introduced an iclass named sme_size_12_b so that we can encode
size bits [13:12] correctly when only 'b' is allowed as qualifier.
2024-06-24 15:00:40 +01:00
Claudio Bantaloukas
72476aca8f aarch64: add Branch Record Buffer extension instructions
The FEAT_BRBE extension provides two aliases of sys:
- brb iall (Invalidates all Branch records in the Branch Record Buffer)
- brb inj (Injects the Branch Record held in BRBINFINJ_EL1,
  BRBSRCINJ_EL1, and BRBTGTINJ_EL1 into the Branch Record Buffer)

This patch adds:
- the feature option "brbe" that must be added for the aliases to be available
- a new operand flag AARCH64_OPND_Rt_IN_SYS_ALIASES that warns in a comment
  when Rt is set to the non default value 0b11111 (it is constrained
  unpredictable whether the instruction is undefined or behaves as if the Rt
  field is set to 0b11111).
- a new operand flag AARCH64_OPND_BRBOP that encodes and decodes Op2 values
  from bit 5
- support for the two brb aliases above

See:
- https://developer.arm.com/documentation/ddi0602/2024-03/Base-Instructions/BRB--Branch-Record-Buffer--an-alias-of-SYS-?lang=en
- https://developer.arm.com/documentation/ddi0601/2024-03/AArch64-Instructions/BRB-INJ--Branch-Record-Injection-into-the-Branch-Record-Buffer?lang=en
- https://developer.arm.com/documentation/ddi0601/2024-03/AArch64-Instructions/BRB-IALL--Invalidate-the-Branch-Record-Buffer?lang=en
2024-06-12 14:58:35 +01:00
saurabh.jha@arm.com
444c60fe33 gas, aarch64: Add SVE2 lut extension
Introduces instructions for the SVE2 lut extension for AArch64. They are documented in the following links:
* luti2: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions/LUTI2--Lookup-table-read-with-2-bit-indices-?lang=en
* luti4: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions/LUTI4--Lookup-table-read-with-4-bit-indices-?lang=en

These instructions use new SVE2 vector operands. They are called
SVE_Zm1_23_INDEX, SVE_Zm2_22_INDEX, and Zm3_12_INDEX and they have
1 bit, 2 bit, and 3 bit indices respectively.

The lsb and width of these new operands are the same as many existing
operands but the convention is to give different names to fields that
serve different purpose so we introduced new fields in aarch64-opc.c
and aarch64-opc.h.

We made a design choice for the second operand of the halfword variant of
luti4 with two register tables. We could have either defined a new operand,
like SVE_Znx2, or we could have use the existing operand SVE_ZnxN. With
the new operand, we would need to implement constraints on register
lists based on either operand or opcode flag. With existing operand, we
could just existing constraint checks using opcode flag. We chose
the second approach and went with SVE_ZnxN and added opcode flag to
enforce lengths of vector register list operands. This way, we can reuse
the existing constraint check logic.
2024-05-28 17:28:29 +01:00
saurabh.jha@arm.com
c3bb4211d9 gas, aarch64: Add AdvSIMD lut extension
Introduces instructions for the Advanced SIMD lut extension for AArch64. They are documented in the following links:
* luti2: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI2--Lookup-table-read-with-2-bit-indices-?lang=en
* luti4: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI4--Lookup-table-read-with-4-bit-indices-?lang=en

These instructions needed definition of some new operands. We will first
discuss operands for the third operand of the instructions and then
discuss a vector register list operand needed for the second operand.

The third operands are vectors with bit indices and without type
qualifiers. They are called Em_INDEX1_14, Em_INDEX2_13, and Em_INDEX3_12
and they have 1 bit, 2 bit, and 3 bit indices respectively. For these
new operands, we defined new parsing case branch. The lsb and width of
these operands are the same as many existing but the convention is to
give different names to fields that serve different purpose so we
introduced new fields in aarch64-opc.c and aarch64-opc.h for these new
operands.

For the second operand of these instructions, we introduced a new
operand called LVn_LUT. This represents a vector register list with
stride 1. We defined new inserter and extractor for this new operand and
it is encoded in FLD_Rn. We are enforcing the number of registers in the
reglist using opcode flag rather than operand flag as this is what other
SIMD vector register list operands are doing. The disassembly also uses
opcode flag to print the correct number of registers.
2024-05-28 17:28:29 +01:00
Saurabh Jha
f3f34f2b26 gas, aarch64: Add faminmax extension 2024-03-19 15:41:41 +00:00
Nick Clifton
0273b5967e Regenerate AArch64 opcodes files 2024-03-18 18:38:23 +00:00
Victor Do Nascimento
f1870e2fad aarch64: rcpc3: Regenerate aarch64-*-2.c files 2024-01-15 13:11:48 +00:00
Nick Clifton
2db11bdf84 Add generated source files and fix thinko in aarch64-asm.c 2024-01-15 11:45:42 +00:00
Victor Do Nascimento
e244fa1a6b aarch64: Regenerate aarch64-*-2.c files 2024-01-09 10:16:41 +00:00
Alan Modra
fd67aa1129 Update year range in copyright notice of binutils files
Adds two new external authors to etc/update-copyright.py to cover
bfd/ax_tls.m4, and adds gprofng to dirs handled automatically, then
updates copyright messages as follows:

1) Update cgen/utils.scm emitted copyrights.
2) Run "etc/update-copyright.py --this-year" with an extra external
   author I haven't committed, 'Kalray SA.', to cover gas testsuite
   files (which should have their copyright message removed).
3) Build with --enable-maintainer-mode --enable-cgen-maint=yes.
4) Check out */po/*.pot which we don't update frequently.
2024-01-04 22:58:12 +10:30
Andrea Corallo
d645278cdf aarch64: Add FEAT_ITE support
This patch add support for FEAT_ITE "Instrumentation Extension" adding
the "trcit" instruction.

This is enabled by the +ite march flag.
2023-12-19 15:35:49 +01:00
Andrea Corallo
db168da2e0 aarch64: Add FEAT_ECBHB support
This patch add support for FEAT_ECBHB "Exploitative control using
branch history information" adding the "clrbhb" instruction.  AFAIU
the same alias was originally added as "clearbhb" before the
architecture was finalized (Mandatory v8.9-a/v9.4-a; Optional
v8.0-a+/v9.0-a+).
2023-12-19 15:35:49 +01:00
Andrea Corallo
88b5a8ae13 aarch64: Add FEAT_SPECRES2 support
This patch add supports for FEAT_SPECRES2 "Enhanced speculation
restriction instructions" adding the "cosp" instruction.

This is mandatory v8.9-a/v9.4-a and optional v8.0-a+/v9.0-a+.  It is
enabled by the +predres2 march flag.
2023-12-19 15:35:49 +01:00
Victor Do Nascimento
f3f6c0df60 aarch64: Add LSE128 instructions
Implement, together with the necessary tests, the following new LSE128
atomic instructions:

  * Atomic bit clear on quadword in memory (ldclrp{a|l|al});
  * Atomic bit set on quadword in memory (ldsetp{a|l|al});
  * Swap quadword in memory (swpp{a|l|al});

gas/ChangeLog:

	* testsuite/gas/aarch64/lse128-atomic.d: New.
	* testsuite/gas/aarch64/lse128-atomic.s: Likewise.

opcodes/ChangeLog:

	* aarch64-tbl.h (ldclrp): new _LSE128_INSN entry.
	(ldclrpa):  Likewise.
	(ldclrpal): Likewise.
	(ldclrpl): Likewise.
	(ldsetp): Likewise.
	(ldsetpa): Likewise.
	(ldsetpal): Likewise.
	(ldsetpl): Likewise.
	(swpp): Likewise.
	(swppa): Likewise.
	(swppal): Likewise.
	(swppl): Likewise.
	* aarch64-asm-2.c: Regenerate.
	* aarch64-dis-2.c: Likewise.
	* aarch64-opc-2.c: Likewise.
2023-11-07 21:54:19 +00:00
Srinath Parvathaneni
c58f84d899 aarch64: Add support for GCSB DSYNC instruction.
This patch adds support for Guarded control stack data synchronization
instruction (GCSB DSYNC). This instruction is allocated to existing
HINT space and uses the HINT number 19 and to match this an entry is
added to the aarch64_hint_options array.
2023-11-02 13:09:26 +00:00
srinath
f985c2512a aarch64: Add support for GCS extension.
This patch adds for Guarded Control Stack Extension (GCS) extension. GCS feature is
optional from Armv9.4-A architecture and enabled by passing +gcs option to -march
(eg: -march=armv9.4-a+gcs) or using ".arch_extension gcs" directive in the assembly file.

Also this patch adds support for GCS instructions gcspushx, gcspopcx, gcspopx,
gcsss1, gcsss2, gcspushm, gcspopm, gcsstr and gcssttr.
2023-11-02 13:06:00 +00:00
Richard Sandiford
8ff429203d aarch64: Add the RPRFM instruction
This patch adds the RPRFM (range prefetch) instruction.
It was introduced as part of SME2, but it belongs to the
prefetch hint space and so doesn't require any specific
ISA flags.

The aarch64_rprfmop_array initialiser (deliberately) only
fills in the leading non-null elements.
2023-03-30 11:09:18 +01:00
Richard Sandiford
dfc12f9f53 aarch64: Add new SVE dot-product instructions
This patch adds the SVE FDOT, SDOT and UDOT instructions,
which are available when FEAT_SME2 is implemented.  The patch
also reorders the existing SVE_Zm3_22_INDEX to keep the
operands numerically sorted.
2023-03-30 11:09:17 +01:00
Richard Sandiford
6efa660124 aarch64: Add the SME2 shift instructions
There are two instruction formats here:

- SQRSHR, SQRSHRU and UQRSHR, which operate on lists of two
  or four registers.

- SQRSHRN, SQRSHRUN and UQRSHRN, which operate on lists of
  four registers.

These are the first SME2 instructions to have immediate operands.
The patch makes sure that, when parsing SME2 instructions with
immediate operands, the new predicate-as-counter registers are
parsed as registers rather than as #-less immediates.
2023-03-30 11:09:16 +01:00
Richard Sandiford
a8cb21aa06 aarch64: Add the SME2 MLALL and MLSLL instructions
SMLALL, SMLSLL, UMLALL and UMLSLL have the same format.
USMLALL and SUMLALL allow the same operand types as those
instructions, except that SUMLALL does not have the multi-vector
x multi-vector forms (which would be redundant with USMLALL).
2023-03-30 11:09:14 +01:00
Richard Sandiford
ed429b33c1 aarch64: Add the SME2 MLAL and MLSL instructions
The {BF,F,S,U}MLAL and {BF,F,S,U}MLSL instructions share the same
encoding.  They are the first instance of a ZA (as opposed to ZA tile)
operand having a range of offsets.  As with ZA tiles, the expected
range size is encoded in the operand-specific data field.
2023-03-30 11:09:13 +01:00
Richard Sandiford
80752eb098 aarch64: Add the SME2 FMLA and FMLS instructions 2023-03-30 11:09:13 +01:00
Richard Sandiford
e87ff6724f aarch64: Add the SME2 ADD and SUB instructions
Add support for the SME2 ADD. SUB, FADD and FSUB instructions.
SUB and FSUB have the same form as ADD and FADD, except that
ADD also has a 2-operand accumulating form.

The 64-bit ADD/SUB instructions require FEAT_SME_I16I64 and the
64-bit FADD/FSUB instructions require FEAT_SME_F64F64.

These are the first instructions to have tied register list
operands, as opposed to tied single registers.

The parse_operands change prevents unsuffixed Z registers (width==-1)
from being treated as though they had an Advanced SIMD-style suffix
(.4s etc.).  It means that:

  Error: expected element type rather than vector type at operand 2 -- `add za\.s\[w8,0\],{z0-z1}'

becomes:

  Error: missing type suffix at operand 2 -- `add za\.s\[w8,0\],{z0-z1}'
2023-03-30 11:09:13 +01:00
Richard Sandiford
cbd11b8818 aarch64: Add the SME2 ZT0 instructions
SME2 adds lookup table instructions for quantisation.  They use
a new lookup table register called ZT0.

LUTI2 takes an unsuffixed SVE vector index of the form Zn[<imm>],
which is the first time that this syntax has been used.
2023-03-30 11:09:12 +01:00
Richard Sandiford
99e01a66b4 aarch64: Add the SME2 predicate-related instructions
Implementation-wise, the main things to note here are:

- the WHILE* instructions have forms that return a pair of predicate
  registers.  This is the first time that we've had lists of predicate
  registers, and they wrap around after register 15 rather than after
  register 31.

- the predicate-as-counter WHILE* instructions have a fourth operand
  that specifies the vector length.  We can treat this as an enumeration,
  except that immediate values aren't allowed.

- PEXT takes an unsuffixed predicate index of the form PN<n>[<imm>].
  This is the first instance of a vector/predicate index having
  no suffix.
2023-03-30 11:09:12 +01:00