Make sure also the size is cache aligned since otherwise we may have
some overlap with the next allocation block. A cache invalidate on this
area would be fatal.
Add a local context structure to the SMP lock API for acquire and
release pairs. This context can be used to store the ISR level and
profiling information. It may be later used to enable more
sophisticated lock algorithms, e.g. MCS locks.
There is only one lock that cannot be used with a local context. This
is the per-CPU lock since here we would have to transfer the local
context through a context switch which is very complicated.