Allocate the per-CPU data for secondary processors directly from the
heap areas before heap initialization and not via
_Workspace_Allocate_aligned(). This avoids dependency on the workspace
allocator. It fixes also a problem on some platforms (e.g. QorIQ) where
at this early point in the system initialization the top of the RAM is
used by low-level startup code on secondary processors (boot pages).
Update #3507.