... from the device UT listed by platform_gen. The kernel itself
does not care about this memory, and it is just given as device UT.
We also just remove the reserved array entirely from the return of
`get_physical_memory` since it only seems to be a footgun, it's
only used internally to affect what memory the kernel wants to use.
Co-authored-by: Kent McLeod <kent@kry10.com>
Signed-off-by: julia <git.ts@trainwit.ch>
`align_memory()` in hardware.py both modifies the first normal memory
region to adjust the base of it for alignment, and adds an extra
reserved region to our list of reserved regions. This then feeds
through `get_addrspace_exclude` which inverts the regions given
and turns it into the available "device memory" at user-level.
dev_mem = hardware.utils.memory.get_addrspace_exclude(
list(reserved) + phys_mem + kernel_devs, config)
Anything as an argument to this is not given as "device memory" by
the kernel. (It does not precisely match how the kernel works).
However, since `align_memory()` has adjusted both the phys_mem up
(which *would* have added this region as "device" memory) but also
added it to "reserved" region, which then made it disappear entirely,
as "reserved" regions are not exposed to userspace.
However, this **does not** match the behaviour of the kernel, as it was
not reserved, so this behaviour did not match the untypeds given to
userspace. This commit solves this by removing the extra reserved
region being added for that memory.
PR #1426 worked around this issue by removing the alignment on AArch64.
Whilst this fixed the issue that microkit was seeing, it just masked
the underlying issue. Reverting that PR, then applying this fix,
results in the following platform_gen.yaml:
devices:
- end: 0x1000000
start: 0x0
- end: 0xff800000
start: 0x3b400000
- end: 0xff841000
start: 0xff801000
- end: 0x100000000000
start: 0xff843000
memory:
- end: 0x3b400000
start: 0x1000000
Note especially the device region from 0x0 to 0x1000000; which is the
combination of the 0x0 to 0x1000 reserved region, and the 0x1000 to
0x1000000 reserved by the kernel's alignment requirement. Previously,
the platform_gen.yaml reported only the 0x0 to 0x1000 region,
devices:
- end: 0x1000
start: 0x0
- end: 0xff800000
start: 0x3b400000
- end: 0xff841000
start: 0xff801000
- end: 0x100000000000
start: 0xff843000
memory:
- end: 0x3b400000
start: 0x1000000
I will be following this commit up with a PR to instead make the
alignment-reserved region into a new memory region, since there's not
any reason why userspace can't use this memory.
This has been tested on a few platforms with sel4test and with
microkit on the pi4B.
Signed-off-by: julia <git.ts@trainwit.ch>
The Isabelle side of the binary verification tool chain currently does
not implement proof tactics for the "continue" statement. Use skip
(semicolon) instead for empty loop body.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
Wrap NUM_SGIS using wrap_config_set() so that verification automation
does not simplify it away automatically when it is 0.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
Make the treatment of spurious IRQs equal between handleInterruptEntry
and preemption handling. The infoflow proof requires that these behave
identically -- this was so far only true for platforms that define
handleSpuriousIRQ to be empty (which is almost all of them).
Adding platforms to the verification that have a non-empty body for
handleSpuriousIRQ requires equalising the calls.
Factor out the code that is common between preemption and interrupt
kernel entry into its own function checkInterrupt to remove duplication.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
Move setIRQTrigger from <arch>/hardware.h into interrupt.h together
with the rest of the IRQ interface, because it is now a visible
interface even for platforms and architectures that do not provide the
functionality.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
Provide a default dummy implementation for setIRQTrigger even for
non-GIC platforms where this operation is not supported. Otherwise,
the function is declared in the header, but has no implementation.
Implementing it as UNREACHABLE() means verification will show that the
function is not called.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
Adding a spec with precondition False means verification has to show
that the function is not called.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
Provide an implementation of the header function plat_SGITargetValid
even for non-GIC platforms. This function will not be called, but
verification has a general lemma about it, and the default
implementation makes that lemma true without adding special cases.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
Fail early with IllegalOperation if IssueSGISignal is not supported at
all for the platform.
This also helps with verification, because the later range check gets
simplified away automatically, leading to divergent code paths.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
There's a few cases in the kernel where the ksKernelEntry tracking
is not perfect, such as in SError reporting, and (I believe) a few
other places which I haven't tracked down to a cause - but some of
e.g. the RISC-V trap code where the first entry faults and the 2nd
proceeds can report stale information.
In these cases, the kernel says that the entry was via a certain
syscall or interrupt (etc), even though that was clearly not the
case because we know the kernel exited. Now we will print out this:
halting...
Kernel entry via Unknown (0)
The changes:
- When exiting the kernel, via `c_exit_hook()`, reset
`ksKernelEntry.path` to "Unknown".
An alternative here would have been add a global "valid" boolean
to the kernel state, but this requires modifying every site where
we set the ksKernelEntry.path to also set valid = true, which is
ugly.
- Remove Entry_UnimplementedDevice from entry_type_t as it is never
used, to leave enough room to add Entry_Unknown.
- Switch out the CONFIG_DEBUG_BUILD || BENCHMARK TRACK ENTRIES #if
in the x86 breakpoint code with the more concise
`TRACK_KERNEL_ENTRIES` define used elsewhere.
Signed-off-by: julia <git.ts@trainwit.ch>
Similar to #510 but for all other platforms. The idle_thread runs
without a stack and so cannot handle the stack prologue. This should
hopefully make the kernel rely less on FORCE_INLINE for this as well.
We create idle.S assembly files for each platform, as GCC does not
support `__attribute__((naked))` on AArch64 (GCC 13.2.0) and bails out.
Signed-off-by: julia <git.ts@trainwit.ch>
Helps the C compiler to recognise irqInvalid as a constant. This in
turn helps with binary verification, because irqInvalid is already
parsed as a constant in Isabelle since it is never written to.
See also the discussion in #1349 and #1324
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
The GIC platforms use irq_t for irqInvalid -- bring remaining AArch32
platforms in line with that.
irq_t is an unsigned type (when it is an integer) and enum constants are
signed. For the proofs to treat these platforms uniformly, they need to
be the same kind. They can't all be enums, because irq_t can be a more
complex type for SMP platforms.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
This was *fine*, because IT_ASID has an ASID_HIGH(IT_ASID) of
0, but the asid_pool_cap_new expects capASIDBase which should
instead be an aligned ASID ID (i.e. asidLowBits all 0).
Signed-off-by: julia <git.ts@trainwit.ch>
Also, change the definition of ASID_HIGH from
((a >> asidLowBits) & MASK(asidHighBits))
to
((a) >> asidLowBits)
because the mask is an unnecessary operation (this is handled by the
proofs). It is unnecessary because these (SW) ASIDs are kernel-only
and are never exposed to users, so they will never set bits asid from
the asidLow and asidHigh bits.
Signed-off-by: julia <git.ts@trainwit.ch>
Before:
static inline uint64_t PURE
pte_pte_table_ptr_get_pt_base_address(pte_t *pte_ptr) {
uint64_t ret;
assert(((pte_ptr->words[0] >> 0) & 0x400000000000003) ==
0x3ull);
After:
static inline uint64_t PURE
pte_pte_table_ptr_get_pt_base_address(pte_t *pte_ptr) {
uint64_t ret;
/* fail if union does not have the expected tag */
assert(((pte_ptr->words[0] >> 0) & 0x400000000000003) ==
0x3ull /* sliced tag pte_pte_table */);
Signed-off-by: julia <git.ts@trainwit.ch>
Memory regions in the boot code are represented as top-end-exclusive
and so printing them as [a..b] is at best misleading (especially
as the elfloader code that runs beforehand uses [a..b] for inclusive
ranges). The boot log now looks like this (QEMU AArch64):
ELF-loading image 'rootserver' to 40239000
paddr=[40239000..4061efff]
vaddr=[400000..7e5fff]
virt_entry=40e3f8
Enabling MMU and paging
Jumping to kernel-image entry point...
Bootstrapping kernel
Warning: Could not infer GIC interrupt target ID, assuming 0.
available phys memory regions: 1
[40000000..80000000)
reserved virt address space regions: 3
[ffffff8040000000..ffffff8040237000)
[ffffff8040237000..ffffff8040238e11)
[ffffff8040239000..ffffff804061f000)
Booting all finished, dropped to user space
Signed-off-by: julia <git.ts@trainwit.ch>
This was missed in cb8ee83f0cd2268b1d6b1cd5ca30b031db5896c4; their
use was removed from syscall_stub_gen but their typedefs remained.
Signed-off-by: julia <git.ts@trainwit.ch>
This would previously create regions that were off-by-1 on the upper
end with addresses such as 0x40001. This seems to have been fine, and
just created an extra unnecessary memory region covering the extra
bootinfo memory. But at the same time this was inconsistent with e.g.
calculate_rootserver_size() which contained the check:
size += extra_bi_size_bits > 0 ? BIT(extra_bi_size_bits) : 0;
and this created issues when I was experimenting with some boot code
changes as the bootinfo size accounting was inconsistent.
Signed-off-by: julia <git.ts@trainwit.ch>
Recent doxygen versions generate references for constants that are named
in the text. Since we do not produce a constant table in the manual
(neither markdown nor tex), these references point to nowhere and
produce a KeyError in the ref_dict on lookup.
The corresponding xml nodes are of this form:
<ref kindref="member"
refid="include_2sel4_2constants_8h_1a9a3...">seL4_TCBFlag</ref>
Ignore such dangling references and return the content (seL4_TCBFlag in
the example) instead. Run text escape on the content to cover
underscores etc in LaTeX.
Does not produce a warning since we expect this to be normal behaviour.
Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
Previously, this would error with a (confusing) warning since #546 about
Variable 'KernelArch' is not set
sel4test/settings.cmake had some code which complained about an invalid
platform, but it only ran if the kernel was found correctly:
set(valid_platforms ${KernelPlatform_all_strings})
set_property(CACHE PLATFORM PROPERTY STRINGS ${valid_platforms})
if(NOT "${PLATFORM}" IN_LIST valid_platforms)
message(FATAL_ERROR "Invalid PLATFORM selected: \"${PLATFORM}\"
Valid platforms are: \"${valid_platforms}\"")
endif()
Because of the CMake cache, if a correct platform was set, *then* an
incorrect platform, you could see this error, as the KernelArch would be
retained between builds in the cache. (Note: because of the cache, the
error message within the kernel is not triggered if we go from a valid
to an invalid platform. However the sel4test/settings.cmake one is).
Previously:
- If no platform is specified:
sel4test/build$ ../init-build.sh
loading initial cache file sel4test/projects/sel4test/settings.cmake
-- Set platform details from PLATFORM=
-- KernelPlatform:
-- Found seL4: sel4test/kernel
CMake Error at sel4test/kernel/configs/seL4Config.cmake:185
Variable 'KernelArch' is not set.
Call Stack (most recent call first):
sel4test/kernel/FindseL4.cmake:21 (include)
settings.cmake:32 (sel4_configure_platform_settings)
- If an invalid platform is specified:
sel4test/build$ ../init-build.sh -DPLATFORM=Cheshire
loading initial cache file sel4test/projects/sel4test/settings.cmake
-- Set platform details from PLATFORM=Cheshire
-- KernelPlatform: Cheshire
-- Found seL4: sel4test/kernel
CMake Error at sel4test/kernel/configs/seL4Config.cmake:185
Variable 'KernelArch' is not set.
Call Stack (most recent call first):
sel4test/kernel/FindseL4.cmake:21 (include)
settings.cmake:32 (sel4_configure_platform_settings)
Now, it looks like:
- If no platform is specified:
sel4test/build$ ../init-build.sh
loading initial cache file sel4test/projects/sel4test/settings.cmake
-- Set platform details from PLATFORM=
-- KernelPlatform:
-- Found seL4: sel4test/kernel
CMake Error at sel4test/kernel/configs/seL4Config.cmake:180
Variable 'KernelPlatform' is not set - is PLATFORM '' correct? Valid
platforms are
'allwinnerA20;am335x;apq8064;ariane;bcm2711;bcm2837;cheshire;...'
Call Stack (most recent call first):
sel4test/kernel/FindseL4.cmake:21 (include)
settings.cmake:32 (sel4_configure_platform_settings)
- If an invalid platform is specified:
sel4test/build$ ../init-build.sh -DPLATFORM=Cheshire
loading initial cache file sel4test/projects/sel4test/settings.cmake
-- Set platform details from PLATFORM=Cheshire
-- KernelPlatform: Cheshire
-- Found seL4: sel4test/kernel
CMake Error at sel4test/kernel/configs/seL4Config.cmake:180
Variable 'KernelPlatform' is not set - is PLATFORM 'Cheshire'
correct? Valid platforms are 'allwinnerA20;am335x;apq8064;...'
Call Stack (most recent call first):
sel4test/kernel/FindseL4.cmake:21 (include)
settings.cmake:32 (sel4_configure_platform_settings)
- If a valid platform is specified:
sel4test/build$ ./init-build.sh -DPLATFORM=cheshire
loading initial cache file sel4test/projects/sel4test/settings.cmake
-- Set platform details from PLATFORM=cheshire
-- KernelPlatform: cheshire
-- Found seL4: sel4test/kernel
-- platform cheshire supports multiple architectures, none was given
-- defaulting to: riscv64
-- Found GCC with prefix riscv64-none-elf-
Signed-off-by: julia <git.ts@trainwit.ch>
According to Arm documentation, the exception-handling routine
needs to zero the following bits in FPEXC:
DEX, IDF, IXF, UFF, OFF, DZF and IOF
In seL4, the user space fault handler should do this, as the kernel
doesn't know when user space is done handling the FPU trap.
In case of virtualisation, the guest kernel should clear these bits.
Now that the seL4 kernel doesn't handle FPU traps itself any more,
user space can handle asynchronous traps if it wants to.
Signed-off-by: Indan Zupancic <indan@nul.nu>
On 32-bit ARM the fpexc system register is set by loadFpuState,
which includes the FPU enable/disable bit FPEXC_EN_BIT. This
register is part of the usercontext and needs to be initialised
correctly, otherwise the FPU will be disabled by loadFpuState.
Before, this bug was hidden because the FPU was enabled lazily
after a trap. This bug just caused one extra FPU trap at first
FPU use for each task: The first handleFPUFault would fail to
enable the FPU, causing another FPU trap when user space gets
restarted.
On the second FPU fault, switchLocalFpuOwner calls enableFpu first
and then calls saveFpuState because ksActiveFPUState is set to the
current task's FPU state. Then it gets saved with FPEXC_EN_BIT set
and the task can continue with the FPU actually enabled.
This also means that with the old code, the initial FPEXC state
of each task was equal to the previous active FPU task's.
Signed-off-by: Indan Zupancic <indan@nul.nu>
Also release FPU when dissociating a vcpu, otherwise ksCurFPUOwner
may be wrong.
fpuRelease() sets ksCurFPUOwner to NULL, if this happens before
we return to the host, it may end up with the FPU disabled and no
FPU state loaded. Instead of hunting down and handling all obscure
corner cases where this might happen (dissociating vcpus, cross-core
operations), just check for this in vcpu_fpu_to_host().
Signed-off-by: Indan Zupancic <indan@nul.nu>
It's not only used when deleting a thread any more.
Swapping fpuRelease and dissociateVCPUTCB makes no practical
difference as they are independent, but it simplifies
verification slightly.
Signed-off-by: Indan Zupancic <indan@nul.nu>
Add config choice and change the default from XSAVEOPT to XSAVE.
The first config choice is used as the default option. Only XSAVE
is guaranteed to always work, the others require newer CPUs.
Get rid of dubious FPU state headers, we don't need them:
- XCOMP_BV_COMPACTED_FORMAT is set by xsavec or xsaves.
- We can init MXCSR with the ldmxcsr instruction.
Only system state should be configured in IA32_XSS_MSR,
setting FPU user state bits causes an exception.
All memory should be zeroed already, no need to do it again.
See also issue #179.
Signed-off-by: Indan Zupancic <indan@nul.nu>
Trivial change, except for x86 virtualisation and bootup.
Normally the seL4_TCBFlag_fpuDisabled flag is used to decide whether
to enable or disable the FPU. However, with x86 virtualisation there
is only one TCB. Use the task flag for the host and always enable FPU
for the guest.
When either the host or the guest's FPU is loaded, ksCurFPUOwner will
point to our TCB. saveFpuState and loadFpuState check fpu_active and
do the right thing depending on the current FPU user. To make this
work it was necessary to pass tcb_t to saveFpuState and loadFpuState.
Use the idle thread to store the initial FPU state. x86KSnullFpuState
is kept to simplify verification.
However, when Arch_initFpu is called, the idle thread hasn't been
created yet, so move the initialisation to after create_idle_thread,
but before create_initial_thread, as that leads to Arch_initFpuContext
using x86KSnullFpuState.
Also initialise VCPU FPU registers correctly for x86, otherwise the
initial state is wrong and can't be loaded when XSAVES is enabled.
Signed-off-by: Indan Zupancic <indan@nul.nu>
On x86 both the VM monitor as the VM guest have their own FPU state.
The FPU of the monitor is controlled with seL4_TCBFlag_fpuDisabled
TCB flag. For the guest running on the VCPU the FPU will always be
enabled.
Signed-off-by: Indan Zupancic <indan@nul.nu>