Allow userspace reads to `VMX_CONTROL_SECONDARY_PROCESSOR_CONTROLS`
field of VMCS. seL4 already allow writes to the field so no reason to
not allow reads.
Signed-off-by: Bill Nguyen <bill.nguyen@unsw.edu.au>
Previously, X86_64_VTX_64BIT_GUESTS forced the VCPU into 64-bit mode by
hardcoding the IA-32e bit in VM-Entry Controls. This prevented userspace
from starting a VCPU in real mode, which break cases where you want the
guest to be able to manage its own mode transitions. Such as booting
an UEFI firmware.
This change allows the guest to manage its own mode transitions
(real -> protected -> long mode). The X86_64_VTX_64BIT_GUESTS config is
now scoped strictly to enabling 64-bit features like MSR access.
Signed-off-by: Bill Nguyen <bill.nguyen@unsw.edu.au>
Allow userspace reads and writes to `VMX_CONTROL_ENTRY_CONTROLS` field
of VMCS. Which is primarily used for enabling long mode and loading of
certain MSRs on VM Entry. As far as I know this field only controls the
behaviour of the VCPU and would not allow userspace to violate any
kernel properties such as disabling external interrupt exiting or EPT.
The effects of bits in this field are defined in
Chapter 27.8 "VM-ENTRY CONTROL FIELDS" of the
Intel SDM Order Number: 325462-090US February 2026.
Signed-off-by: Bill Nguyen <bill.nguyen@unsw.edu.au>
Allow CR0.PE and CR0.PG to be cleared if "Unrestricted Guest" mode is
supported by the host CPU. This mode allows guest software to run in
unpaged protected mode or in real-address mode. From [1] with Indan's
feedback applied.
[1] https://github.com/seL4/seL4/pull/1438
Signed-off-by: Bill Nguyen <bill.nguyen@unsw.edu.au>
Co-authored-by: Heyang Zhou <hello@su3.io>
When original.family >= 0xF, the extended model was incorrectly
assigned to ci->display.family instead of ci->display.model. This
matches the Intel implementation and AMD CPUID specification.
Signed-off-by: Weixie Cui <cuiweixie@gmail.com>
Made-with: Cursor
Previously, CMake defaults to adding various flags to the compile
options, such as `-O3`, `-DNDEBUG`, and `-g`. Here is an excerpt from
CMakeCache.txt:
//Flags used by the C compiler during RELEASE builds.
CMAKE_C_FLAGS_RELEASE:STRING=-O3 -DNDEBUG
//Flags used by the C compiler during RELWITHDEBINFO builds.
CMAKE_C_FLAGS_RELWITHDEBINFO:STRING=-O2 -g -DNDEBUG
We instead change the CMAKE_BUILD_TYPE to 'None' so that CMake
doesn't try to be clever and append extra compile flags. Anything
that depends on seL4 will be unaffected as CMake variables are scoped.
This should have no meaningful change to the kernel's generated code,
as the compile flags were being overridden anyway.
Relevant parts of the build.ninja file for comparison:
- Now, `-DRELEASE=ON`:
FLAGS = -march=armv8-a -D__KERNEL_64__ -std=c99 -Wall -Werror
-Wstrict-prototypes -Wmissing-prototypes -Wnested-externs
-Wmissing-declarations -Wundef -Wpointer-arith -Wno-nonnull
-nostdinc -ffreestanding -fno-stack-protector
-fno-asynchronous-unwind-tables -fno-common -O2 -nostdlib
-fno-pic -fno-pie -mgeneral-regs-only -mno-outline-atomics
-E -CC -I/sel4test-manifest/build/kernel/generated_prune
- Before, `-DRELEASE=ON`:
FLAGS = -march=armv8-a -D__KERNEL_64__ -O3 -DNDEBUG -std=c99 -Wall
-Werror -Wstrict-prototypes -Wmissing-prototypes
-Wnested-externs -Wmissing-declarations -Wundef
-Wpointer-arith -Wno-nonnull -nostdinc -ffreestanding
-fno-stack-protector -fno-asynchronous-unwind-tables
-fno-common -O2 -nostdlib -fno-pic -fno-pie
-mgeneral-regs-only -mno-outline-atomics -E -CC
-I/sel4test-manifest/build/kernel/generated_prune
Note the lack of -O3 -DNDEBUG options.
Signed-off-by: Julia Vassiliki <julia.vassiliki@unsw.edu.au>
This was done way back in 2017: 0abc72027, but somehow
was missed from CMakeLists.txt and hyp_traps.S files.
Signed-off-by: Julia Vassiliki <julia.vassiliki@unsw.edu.au>
Only call the command on CMake files that may be the first file
evaluated from the project. Remove from all other files.
Signed-off-by: Kent McLeod <kent@kry10.com>
The AArch64 FPU trap helpers were modifying both CPTR_EL2.TFP and
CPTR_EL2.TCPAC when enabling or disabling FPU trapping. However, only
TFP controls trapping of FP/SIMD instructions from EL0/EL1 to EL2.
TCPAC traps accesses to CPACR_EL1 and is unrelated to FPU instruction
execution. Trapping CPACR_EL1 is unnecessary for seL4’s FPU handling
and can interfere with other architectural features that are
controlled via CPACR_EL1.
In particular, CPACR_EL1 is an Architectural Feature Access Control
Register and is not limited to FP/SIMD. For example, Morello uses
CPACR_EL1.CEN to enable CHERI instructions. Trapping CPACR_EL1 via
TCPAC would therefore require special handling or could break guests
that legitimately modify CPACR_EL1 for non-FPU features.
Since seL4 only needs to trap FP/SIMD instructions if FPU is disabled
per TCB/VM, it is sufficient to control CPTR_EL2.TFP alone. Remove the
modification of TCPAC from enableTrapFpu()/disableTrapFpu() and leave
CPACR_EL1 accesses untrapped.
This aligns the implementation with the architectural intent of the
AArch64 FP/SIMD trapping model and avoids assuming that CPACR_EL1 is
used solely for FPU control.
See Issue #1601 for more context and discussion.
Signed-off-by: Hesham Almatary <hesham.almatary@cl.cam.ac.uk>
This eases verification by using a local variable
which remains unchanged during execution of the
function.
This preserves semantics since refill_unblock_check
will not modify the tcbSchedContext field of any TCB.
Signed-off-by: Michael McInerney <michael.mcinerney@proofcraft.systems>
At the moment, the seL4 microkit does not setup TLS variable
support. The workaround has been to `#define __thread` (blank)
before including `<sel4/sel4.h>` in `<microkit.h>`, which causes
issues if `<microkit.h>` is included *after* `<sel4/sel4.h>`,
often with obscure linker errors to `__emutls_**` symbols.
Instead we add a libsel4 config option that allows us to build
it with __thread copied out. We introduce an LIBSEL4_THREAD_LOCAL
macro in a similar way to the existing LIBSEL4_INLINE macro.
Signed-off-by: Julia Vassiliki <julia.vassiliki@unsw.edu.au>
- Factor out the inline assembly from invokeSMCCall so that it can be
made a separate machine operation in the specification. This part of
the change should produce the same binary as before.
- Parse IPC buffer arguments in decode, not in invoke. This eliminates
the "buffer" argument to the invoke function and keeps argument
processing in decode.
- Use setMR instead of replicating its functionality.
- Reduce decode argument list to the ones that are used.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
No change to behaviour, the commit just makes sure all cases are
explicitly listed instead of falling through to default.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
The AArch64 kernel now has verification support for SMC caps. Remove
dependency of KernelAllowSMCCalls on "NOT KernelVerificationBuild" and
set KernelAllowSMCCalls to ON by default for AArch64 verification
builds.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
This in particular introduces the function
schedContext_unbindReply, which is used within
finaliseCap, as well as invokeSchedContext_Unbind.
Signed-off-by: Michael McInerney <michael.mcinerney@proofcraft.systems>
Done in order to ensure overflow does not occur. More
specifically, we call usToTicks on an argument only
when it has been checked to be at most MAX_PERIOD_US,
so that we know that overflow does not occur as a
result of the conversion. We also rephrase the check
on extra_refills. Rather than performing an addition,
which may in extreme cases overflow, we perform a
subtraction, which we know does not underflow, as a
result of our invariants.
Signed-off-by: Michael McInerney <michael.mcinerney@proofcraft.systems>
Not all platforms are 512-byte aligned. Specifically, 64-bit
platforms are 1024-byte byte aligned.
Signed-off-by: Julia Vassiliki <julia.vassiliki@unsw.edu.au>
Try detect if we are running as a guest based on the CPUID
and skip clock_sync_test if we are.
Similar as previous patches for skipping QEMU on ARM/RISC-V
but at run-time instead of build-time.
Tested with qemu-system-x86_64 using `-smp 4`, the test is now
skipped with regular QEMU as well as `-accel kvm`.
Signed-off-by: Ivan Velickovic <i.velickovic@unsw.edu.au>
schedContext_donate() is performance critical and should be fast,
but migrateTCB() assumes that there is a change in affinity and
calls fpuRelease() unconditionally, causing unnecessary FPU
saving and restoring.
Signed-off-by: Indan Zupancic <indan@nul.nu>
White space change only:
- left-align sizes within each block declaration
- min 2 spaces to separate field name from size
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
Eliminate #ifdefs for BF_CANONICAL_RANGE in bitfield specifications,
using the new field_ptr command. Use word_size expressions for some of
the padding fields to make clearer where the sizes come from.
The transformations in this commit are written to produce exactly
identical output for code and proofs. In some rare cases, padding
could in the future be rearranged to make more use of field_ptr, but
these edits would create code differences and are left for later.
It may now also be to share more blocks between generic 32 and 64
definitions if they only reference word_size. This is also left for
later to reduce noise.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
Add the new block command
field_ptr(align) name size
as an abbreviation for
padding size - canonical + align
field_high name canonical - align
The (align) part is optional. If left out, align is 0.
This is useful for removing #ifdefs for pointer storage that are trying
to achieve constant size over different canonical bit representations.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
Allow arithmetic integer expressions in all places where previously
only integer literals were allowed.
The operators +, -, *, / and % are supported. Additionally, in field,
field_high, and padding specifications, the constants "word_size" and
"canonical" are available.
This enables more readable bitfield specifications without changing
any of the code and proof generation. Produces identical output to
before.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
Doing an ISB is very expensive, about ~30 cycles on some CPUs,
so try to avoid them if possible.
For the writes to (most) system registers to be observed, a
'Context Synchronization event' needs to occur (see ARM DDI 0487
M.a §D24.1.2.2 / pD24-8367).
The kernel is not doing any FPU operations after disabling the FPU,
so there is never a need for an ISB there. Returning to user space
counts as a 'Context Synchronization event' and ensures the FPU enable
or disable has completed when user space starts executing.
An ISB after enabling the FPU is only needed in case the FPU was
disabled for the kernel and the kernel wants to do FPU operations
like saving or restoring the FPU state. The ISB ensures that the
FPU enable system register write is finished before FPU using
instructions are executed.
For Aarch64 non-HYP it is possible to disable the FPU for user space
(EL0), but keep it enabled for the kernel (EL1). This is done by
setting CPACR_EL1.FPEN to 1, which the code actually already did.
Aarch64 with virtualisation enabled is more complicated and needs
higher level changes to achieve the same, which aren't done here,
the change is limited to removing the ISB in disableFpu().
This partially fixes issue #1569.
Signed-off-by: Indan Zupancic <indan@nul.nu>
There is no switchToThread() call when a task toggles
seL4_TCBFlag_fpuDisabled for itself, so the FPU stays
disabled even when it was just enabled.
The disable case was handled properly as fpuRelease()
does the right thing already.
Signed-off-by: Indan Zupancic <indan@nul.nu>
SEL4_VMENTER_CALL_CONTROL_ENTRY_MR refers to the
VM Entry Interruption-Information Field of the VMCS [1].
Not the VM-Entry Controls, which are two different things.
[1]: src/arch/x86/object/vcpu.c: vcpu_update_state_sysvmenter()
Signed-off-by: Bill Nguyen <bill.nguyen@student.unsw.edu.au>
Writing too quickly to a UART risks loosing bytes when its internal
FIFO fills up. Also lines should be terminated with "\r\n" for
compatibility with 1975 hardware.
Signed-off-by: Mathieu Mirmont <mat@neutrality.ch>
In the definition of tcb_t, there's extra fields if there's
SMP or BENCHMARK_TRACK_UTILISATION or ARM_HYPERIVSOR_SUPPORT.
It looks like the combination of all three (which Microkit uses)
is not enough for 11 bits.
Signed-off-by: Ivan Velickovic <i.velickovic@unsw.edu.au>