forked from Imagelibrary/seL4
287 lines
15 KiB
TeX
287 lines
15 KiB
TeX
%
|
|
% Copyright 2014, General Dynamics C4 Systems
|
|
%
|
|
% SPDX-License-Identifier: GPL-2.0-only
|
|
%
|
|
|
|
\chapter{\label{ch:ipc}Message Passing (IPC)}
|
|
|
|
The seL4 microkernel provides a message-passing IPC mechanism for communication
|
|
between threads. The same mechanism is also used for communication with
|
|
kernel-provided services. Messages are sent by invoking a capability to a
|
|
kernel object. Messages sent to \obj{Endpoint}s are destined for other
|
|
threads, while messages sent to other objects are processed by the kernel. This
|
|
chapter describes the common message format, endpoints,
|
|
and how they can be used for communication between applications.
|
|
|
|
\section{Message Registers}
|
|
\label{sec:messageinfo}
|
|
|
|
Each message contains a number of message words and optionally a number of
|
|
capabilities.
|
|
The message words are sent to or received from a thread by placing them in its \emph{message registers}.
|
|
The message registers are numbered and the first few message registers are implemented
|
|
using physical CPU registers, while the rest are backed by a fixed region of
|
|
memory called the \emph{IPC buffer}.
|
|
The reason for this design is efficiency:
|
|
very short messages need not use the memory.
|
|
The IPC buffer is assigned to the calling thread (see \autoref{sec:threads} and \autoref{api:tcb_setipcbuffer}).
|
|
%FIXME: seL4_TCB_SetIPCBuffer is only mentioned in the API reference!
|
|
|
|
Every IPC message also has a tag (structure \texttt{seL4\_MessageInfo\_t}). The
|
|
tag consists of four fields: the label, message length, number of capabilities
|
|
(the \texttt{extraCaps} field) and the \texttt{capsUnwrapped} field. The
|
|
message length and number of capabilities determine either the number of
|
|
message registers and capabilities that the sending thread wishes to transfer,
|
|
or the number of message registers and capabilities that were actually
|
|
transferred. The label is not interpreted by the
|
|
kernel and is passed unmodified as the first data payload of the message. The
|
|
label may, for example, be used to specify a requested operation. The
|
|
\texttt{capsUnwrapped} field is used only on the receive side, to indicate the
|
|
manner in which capabilities were received. It is described in
|
|
\autoref{sec:cap-transfer}.
|
|
|
|
% FIXME: a little too low-level, perhaps?
|
|
|
|
\newcommand{\ipcparam}[4]{\texttt{#1} \emph{#2}&\texttt{#3}\\ }
|
|
\begin{table}[htb]
|
|
\begin{center}
|
|
\begin{tabularx}{\textwidth}{p{0.28\textwidth}p{0.18\textwidth}X}
|
|
\toprule
|
|
\textbf{Type} & \textbf{Name} & \textbf{Description} \\
|
|
\midrule
|
|
\ipcparam{seL4\_MessageInfo\_t}{}{tag}{Message tag}
|
|
\ipcparam{seL4\_Word[]}{}{msg}{Message contents}
|
|
\ipcparam{seL4\_Word}{}{userData}{Base address of the structure, used by
|
|
supporting user libraries}
|
|
\ipcparam{seL4\_CPtr[]}{(in)}{caps}{Capabilities to transfer}
|
|
\ipcparam{seL4\_CapData\_t[]}{(out)}{badges}{Badges for
|
|
endpoint capabilities received}
|
|
\ipcparam{seL4\_CPtr}{}{receiveCNode}{CPtr to a CNode from which to
|
|
find
|
|
the receive slot}
|
|
\ipcparam{seL4\_CPtr}{}{receiveIndex}{CPtr to the receive slot
|
|
relative to \texttt{receiveCNode}}
|
|
\ipcparam{seL4\_Word}{}{receiveDepth}{Number of bits of
|
|
\texttt{receiveIndex} to
|
|
use}
|
|
\bottomrule
|
|
\end{tabularx}
|
|
\caption{\label{tbl:ipcbuffer}Fields of the
|
|
\texttt{seL4\_IPCBuffer} structure. Note that
|
|
\texttt{badges} and \texttt{caps} use the same area of memory in
|
|
the structure.}
|
|
\end{center}
|
|
\end{table}
|
|
|
|
The kernel assumes that the IPC buffer contains a structure of type
|
|
\texttt{seL4\_IPCBuffer} as defined in \autoref{tbl:ipcbuffer}. The
|
|
kernel uses as many physical registers as possible to transfer IPC
|
|
messages. When more arguments are transferred than physical message
|
|
registers are available, the kernel begins using the IPC buffer's
|
|
\texttt{msg} field to transfer arguments. However, it leaves room in
|
|
this array for the physical message registers. For example, if an IPC
|
|
transfer or kernel object invocation required
|
|
4 message registers (and there are only 2 physical message registers
|
|
available on this architecture) then arguments 1 and 2 would be
|
|
transferred via message registers and arguments 3 and 4 would be in
|
|
\texttt{msg[2]} and \texttt{msg[3]}.
|
|
This allows the user-level object-invocation stubs to copy the arguments passed in physical registers to
|
|
the space left in the \texttt{msg} array if desired.
|
|
The situation is similar for the tag field.
|
|
There is space for this field in the \texttt{seL4\_IPCBuffer} structure, which the kernel ignores.
|
|
User level stubs
|
|
may wish to copy the message tag from its CPU register to this field, although
|
|
the user level stubs provided with the kernel do not do this.
|
|
|
|
\section{Endpoints}
|
|
|
|
\obj{Endpoint}s allow a small amount
|
|
of data and capabilities (namely the IPC buffer) to be transferred between two
|
|
threads. \obj{Endpoint} objects are invoked directly using the seL4 system calls
|
|
described in \autoref{sec:syscalls}.
|
|
|
|
IPC \obj{Endpoints} uses a rendezvous model and as such is
|
|
synchronous and blocking. An \obj{Endpoint} object may queue
|
|
threads either to send or to receive. If no receiver is ready, threads
|
|
performing the \apifunc{seL4\_Send}{sel4_send} or \apifunc{seL4\_Call}{sel4_call}
|
|
system calls will wait in a queue for the first available receiver. Likewise, if
|
|
no sender is ready, threads performing the \apifunc{seL4\_Recv}{sel4_recv}
|
|
system call or the second half of \apifunc{seL4\_ReplyRecv}{sel4_replyrecv}
|
|
will wait for the first available sender.
|
|
|
|
Trying to Send or Call without the Write right will fail and return an error. In
|
|
the case of Send the error is ignored (The kernel isn't allowed to reply). Thus
|
|
there is no way of knowing that a send has failed because of missing right.
|
|
On the other hand calling \apifunc{seL4\_Recv}{sel4_recv} with a endpoint capability that
|
|
does not have the Read right will raise a fault, see \autoref{sec:faults}. This
|
|
because otherwise the error message would be indistinguishable from a normal
|
|
message received from another thread via the endpoint.
|
|
|
|
\subsection{Endpoint Badges\label{s:ep-badge}}
|
|
\label{sec:ep-badges}
|
|
|
|
Endpoint capabilities may be \emph{minted} to
|
|
create a new endpoint capability with a \emph{badge} attached to it, a data
|
|
word chosen by the invoker of the \emph{mint} operation. When a message is sent to an endpoint using a badged
|
|
capability, the badge is transferred to the receiving thread's
|
|
\texttt{badge} register.
|
|
|
|
An endpoint capability with a zero badge is said to be \emph{unbadged}.
|
|
Such a capability can be badged with the \apifunc{seL4\_CNode\_Mint}{cnode_mint}
|
|
invocation on the \obj{CNode} containing the capability. Endpoint
|
|
capabilities with badges cannot be unbadged, rebadged or used to create
|
|
child capabilities with different badges.
|
|
|
|
On 32-bit platforms, only the low 28 bits of the badge are available for use.
|
|
The kernel will silently ignore any usage of the high 4 bits.
|
|
On 64-bit platforms, 64 bits are available for badges.
|
|
|
|
\subsection{Capability Transfer}
|
|
\label{sec:cap-transfer}
|
|
|
|
Messages may contain capabilities, which will be copied to the
|
|
receiver, provided that the endpoint capability
|
|
invoked by the sending thread has Grant rights. An attempt to send
|
|
capabilities using an endpoint capability without the Grant right will
|
|
result in transfer of the raw message, without any capability transfer.
|
|
|
|
Capabilities to be sent in a message are specified in the sending thread's
|
|
IPC buffer in the \texttt{caps} field. Each entry in that array is interpreted
|
|
as a CPtr in the sending thread's capability space. The number of capabilities
|
|
to send is specified in the \texttt{extraCaps} field of the message tag.
|
|
|
|
The receiver specifies the slot
|
|
in which it is willing to receive a capability, with three fields within the IPC buffer: \texttt{receiveCNode}, \texttt{receiveIndex} and \texttt{receiveDepth}.
|
|
These fields specify the root CNode, capability address and number of bits to resolve, respectively, to find
|
|
the slot in which to put the capability. Capability
|
|
addressing is described in \autoref{sec:cap_addressing}.
|
|
|
|
Note that receiving threads may specify only one receive slot, whereas a
|
|
sending thread may include multiple capabilities in the message. Messages
|
|
containing more than one capability may be interpreted by kernel objects. They
|
|
may also be sent to receiving threads in the case where some of the extra
|
|
capabilities in the message can be \emph{unwrapped}.
|
|
|
|
If the n-th capability in the message refers to the endpoint through
|
|
which the message is sent, the capability is \emph{unwrapped}: its badge is placed into
|
|
the n-th
|
|
position of the receiver's badges array, and the kernel sets the n-th bit (counting from the
|
|
least significant) in the \texttt{capsUnwrapped} field of the message
|
|
tag. The capability itself is not transferred, so the receive slot may be used
|
|
for another capability.
|
|
|
|
A capability that is not unwrapped is transferred by copying it from the
|
|
sender's CNode slot to the receiver's CNode slot. The sender retains access
|
|
to the sent capability.
|
|
|
|
If a receiver gets a message whose tag has an \texttt{extraCaps} of 2 and a
|
|
\texttt{capsUnwrapped} of 2, then the first capability in the message was
|
|
transferred to the specified receive slot and the second capability was
|
|
unwrapped, placing its badge in \texttt{badges[1]}. There may have been a
|
|
third capability in the sender's message which could not be unwrapped.
|
|
|
|
\subsection{Errors}
|
|
|
|
Errors in capability transfers can occur at two places: in the send
|
|
phase or in the receive phase. In the send phase, all capabilities that
|
|
the caller is attempting to send are looked up to ensure that they exist
|
|
before the send is initiated in the kernel. If the lookup fails for any
|
|
reason, \apifunc{seL4\_Send}{sel4_send} and \apifunc{seL4\_Call}{sel4_call} system calls immediately abort and
|
|
no IPC or capability transfer takes place. The system call will return
|
|
a lookup failure error as described in \autoref{sec:errors}.
|
|
|
|
In the receive phase, seL4 transfers capabilities in the order that they
|
|
are found in the sending thread's IPC buffer \texttt{caps} array
|
|
and terminates as soon as an error is encountered. Possible error
|
|
conditions are:
|
|
|
|
\begin{itemize}
|
|
\item A source capability cannot be looked up. Although the presence
|
|
of the source capabilities is checked when the sending thread
|
|
performs the send system call, this error may still occur. The sending
|
|
thread may have been blocked on the endpoint for some time before it
|
|
was paired with a receiving thread. During this time, its
|
|
CSpace may have changed and the source capability pointers may
|
|
no longer be valid.
|
|
|
|
\item The destination slot cannot be looked up. Unlike the send
|
|
system call, the \apifunc{seL4\_Recv}{sel4_recv} system call does not check that the
|
|
destination slot exists and is empty before it initiates the receive.
|
|
Hence, the \apifunc{seL4\_Recv}{sel4_recv} system call will not fail with an error if the
|
|
destination slot is invalid and will instead transfer badged
|
|
capabilities until an attempt to save a capability to the
|
|
destination slot is made.
|
|
|
|
\item The capability being transferred cannot be derived. See
|
|
\autoref{sec:cap_derivation} for details.
|
|
\end{itemize}
|
|
|
|
An error will not void the entire transfer, it will just end it
|
|
prematurely. The capabilities processed before the failure are still
|
|
transferred and the \texttt{extraCaps} field in the receiver's IPC
|
|
buffer is set to the number of capabilities transferred up to failure.
|
|
No error message will be returned to the receiving thread in any of the
|
|
above cases.
|
|
|
|
\subsection{Calling and Replying}
|
|
\label{sec:ep-cal}
|
|
|
|
As explained in \autoref{sec:sys_call}, when the user calls
|
|
\apifunc{seL4\_Call}{sel4_call} on an endpoint capability,
|
|
some specific actions are taken. First a call will do exactly the same action as
|
|
a normal \apifunc{seL4\_Send}{sel4_send}. Then after the rendezvous and all the
|
|
normal IPC procedure happened, instead of returning directly to the caller,
|
|
\apifunc{seL4\_Call}{sel4_call} will check if either Grant or GrantReply are
|
|
present on the invoked endpoint capability:
|
|
|
|
\begin{itemize}
|
|
\item If this is not the case, the caller thread is suspended as if
|
|
\apifunc{seL4\_TCB\_Suspend}{tcb_suspend} was called on it. The send part of
|
|
the call would still have been performed as usual.
|
|
\item If this is the case. A reply capability is set in a specific slot of the
|
|
receiver TCB. The Grant right of that reply capability is set by copying the Grant
|
|
right of the endpoint capability invoked by the receiver in
|
|
\apifunc{seL4\_Recv}{sel4_recv}.
|
|
Then, the caller thread is blocked waiting for the reply.
|
|
\end{itemize}
|
|
|
|
A reply capability points directly to the caller thread and once the call has
|
|
been performed is completely unrelated to the original \obj{Endpoint}. Even if
|
|
the latter was destroyed, the reply capability would still exist and point to
|
|
the caller who would still be waiting for a reply.
|
|
|
|
The generated reply capability can then be either invoked in place (in the
|
|
specific TCB slot) with the \apifunc{seL4\_Reply}{sel4_reply} or saved to an
|
|
addressable slot using \apifunc{seL4\_CNode\_SaveCaller}{cnode_savecaller} to be
|
|
invoked later with \apifunc{seL4\_Send}{sel4_send}. The specific slot cannot be
|
|
directly addressed with any CPtr as it is not part of any CSpace.
|
|
|
|
A reply capability is invoked in the same way as a normal send on a
|
|
\obj{Endpoint}. A reply capability has implicitly the Write right, so the
|
|
message will always go through. Transferring caps in the reply can only happen
|
|
if the reply capability has the Grant right and is done in exactly the same way
|
|
as in a normal IPC transfer as described in \autoref{sec:cap-transfer}.
|
|
|
|
The main difference with a normal endpoint transfer is that the kernel guarantees
|
|
that invoking a reply capability cannot block: If you own a reply capability,
|
|
then the thread it points to is waiting for a reply. However a reply capability
|
|
is a non-owning reference, contrary to all the other capabilities. That means that
|
|
if the caller thread is destroyed or modified in any way that would render
|
|
a reply impossible (for example being suspended with
|
|
\apifunc{seL4\_TCB\_Suspend}{tcb_suspend}), the kernel would immediately destroy
|
|
the reply capability.
|
|
|
|
Once the reply capability has been invoked, the caller receives the message as if
|
|
it has been performing a \apifunc{seL4\_Recv}{sel4_recv} and just received the
|
|
message. In particular, it starts running again.
|
|
|
|
The \apifunc{seL4\_Call}{sel4_call} operation exists not only for
|
|
efficiency reasons (combining two operations into a single system
|
|
call). It differs from
|
|
\apifunc{seL4\_Send}{sel4_send} immediately followed by
|
|
\apifunc{seL4\_Recv}{sel4_recv} in ways that allow certain system setup to work
|
|
much more efficiently with much less setup that with a traditional setup.
|
|
In particular, it is guaranteed that the reply received by the caller comes from
|
|
the thread that received the call without having to check any kind of badge.
|