First, a mistake in the buffer size computation for cache flush
and invalidate has been corrected.
GCC __attribute__((aligned(64))) should work, and it does work for
local variables; the code ensures correct stack alignment. However,
the attribute has to be moved to the type declaration to ensure that
the structure size is affected by it as well. Even then it does not
seem to work reliably, for reasons that are not fully understood. It
may be that the stack area between the frame start and the end of the
local buffer is accessed during a context switch, or that some stack
prefetch during return pulls cache lines belonging to the buffer back
into the cache. Extending the buffer by one more cache line of padding
helps there.
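A minimal sketch of the arrangement described above; the buffer layout
and names are illustrative, not the actual VideoCore mailbox buffer:

#include <stdint.h>

/* Data cache line size; 64 bytes is assumed here. */
#define VC_CACHE_LINE_SIZE 64

/* The attribute sits on the type (not the variable) so that sizeof() is
 * rounded up to a cache line multiple; the extra trailing line guards
 * against neighbouring stack accesses pulling parts of the buffer back
 * into the cache. */
typedef struct {
  uint32_t data[32];
  uint8_t  guard[VC_CACHE_LINE_SIZE];  /* one spare cache line of padding */
} __attribute__((aligned(VC_CACHE_LINE_SIZE))) vc_aligned_buffer;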
In the longer term, the buffer should be moved to a static area or
allocated from cache-aligned dynamic memory. Serialization of
concurrent calls to the VideoCore operations should be added as well,
but the problem is that some calls are required during workspace and
MMU setup, so a variant without the need for a mutex would be required
too. The framebuffer setup code and other VideoCore calls now check
for errors more precisely and do not proceed with incorrect data.
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
The mutex objects use the owner field of the thread queues for the mutex
owner. Use this and add deadlock detection to
_Thread_queue_Enqueue_critical() for thread queues with an owner.
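A minimal sketch of the idea, with hypothetical types standing in for
the real thread and thread queue structures: follow the chain of thread
queue owners starting at the queue to be blocked on; if the walk reaches
the enqueueing thread itself, blocking would deadlock.

#include <stdbool.h>
#include <stddef.h>

struct thread;

struct thread_queue {
  struct thread *owner;              /* current owner, NULL if none */
};

struct thread {
  struct thread_queue *wait_queue;   /* queue the thread blocks on, or NULL */
};

static bool would_deadlock(const struct thread_queue *queue,
                           const struct thread *enqueueing)
{
  const struct thread *owner = queue->owner;

  while (owner != NULL) {
    if (owner == enqueueing) {
      return true;                   /* owner chain leads back to us */
    }
    if (owner->wait_queue == NULL) {
      return false;                  /* chain ends at a ready/running thread */
    }
    owner = owner->wait_queue->owner;
  }

  return false;
}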
Update #2412.
Update #2556.
Close #2765.
The _Thread_Lock_acquire() function had a potentially infinite run-time
due to the lack of fairness at the atomic operations level.
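For illustration only (C11 atomics, not the RTEMS implementation): a
plain test-and-set loop lets the same CPU win the lock repeatedly, while
a ticket lock grants it in FIFO order, so no acquirer is starved.

#include <stdatomic.h>

typedef struct {
  atomic_uint next_ticket;   /* ticket handed to the next acquirer */
  atomic_uint now_serving;   /* ticket currently allowed to enter */
} ticket_lock;

static void ticket_lock_acquire(ticket_lock *lock)
{
  unsigned int ticket =
    atomic_fetch_add_explicit(&lock->next_ticket, 1, memory_order_relaxed);

  while (atomic_load_explicit(&lock->now_serving, memory_order_acquire)
         != ticket) {
    /* spin until our ticket is served */
  }
}

static void ticket_lock_release(ticket_lock *lock)
{
  atomic_fetch_add_explicit(&lock->now_serving, 1, memory_order_release);
}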
Update #2412.
Update #2556.
Update #2765.
Move the priority change due to priority inheritance to the thread queue
enqueue operation to simplify the locking on SMP configurations.
Update #2412.
Update #2556.
Update #2765.
Raise the priority under thread queue lock protection and omit the
superfluous thread queue priority change, since the thread is extracted
anyway. The unblock operation will pick up the new priority.
Update #2412.
Update #2556.
Update #2765.
* Fix a bug where elapsed time calculations misused absolute time
arguments in nanosleep_helper, by passing the requested relative interval.
* Fix a bug where absolute timeouts were truncated, by passing the
full 64-bit value to Thread_queue_Enqueue (see the sketch below).
* Share the yield logic between nanosleep and clock_nanosleep.
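Illustrative only, not the actual helper: the point of the second fix is
to keep the expiration time in a 64-bit quantity all the way down instead
of narrowing it on the way to the timeout machinery.

#include <stdint.h>
#include <time.h>

/* Convert an absolute timespec to 64-bit nanoseconds without truncation. */
static uint64_t timespec_to_nanoseconds(const struct timespec *ts)
{
  return (uint64_t) ts->tv_sec * UINT64_C(1000000000) + (uint64_t) ts->tv_nsec;
}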
updates #2732
Sleeping with CLOCK_REALTIME should use the WATCHDOG_ABSOLUTE
clock discipline for the threadq so that the timeout interval
may change in case the clock source changes. Similarly,
CLOCK_MONOTONIC uses the WATCHDOG_RELATIVE threadq that will
only wake up the thread after the requested count of ticks has elapsed.
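A usage example of the affected POSIX API (standard clock_nanosleep(),
not RTEMS internals): an absolute CLOCK_REALTIME sleep wakes earlier or
later if the realtime clock is stepped, while a relative CLOCK_MONOTONIC
sleep always waits the requested interval.

#include <time.h>

static int sleep_until_realtime(const struct timespec *abstime)
{
  /* TIMER_ABSTIME selects the absolute (WATCHDOG_ABSOLUTE) behaviour */
  return clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, abstime, NULL);
}

static int sleep_for_monotonic(const struct timespec *interval)
{
  /* relative sleep, mapped to the WATCHDOG_RELATIVE behaviour */
  return clock_nanosleep(CLOCK_MONOTONIC, 0, interval, NULL);
}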
updates #2732
Clock disciplines may be WATCHDOG_RELATIVE, WATCHDOG_ABSOLUTE,
or WATCHDOG_NO_TIMEOUT. A discipline of WATCHDOG_RELATIVE with
a timeout of WATCHDOG_NO_TIMEOUT is equivalent to a discipline
of WATCHDOG_NO_TIMEOUT.
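A minimal sketch of the stated equivalence, with hypothetical names
standing in for the internal discipline and timeout types:

#include <stdbool.h>
#include <stdint.h>

typedef enum {
  DISCIPLINE_NO_TIMEOUT,   /* stands in for WATCHDOG_NO_TIMEOUT */
  DISCIPLINE_RELATIVE,     /* stands in for WATCHDOG_RELATIVE */
  DISCIPLINE_ABSOLUTE      /* stands in for WATCHDOG_ABSOLUTE */
} discipline;

#define TIMEOUT_NONE UINT64_C(0)

/* A relative discipline with no timeout value blocks forever, exactly
 * like a discipline of "no timeout". */
static bool blocks_forever(discipline d, uint64_t timeout)
{
  return d == DISCIPLINE_NO_TIMEOUT
    || (d == DISCIPLINE_RELATIVE && timeout == TIMEOUT_NONE);
}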
updates #2732
Using conditional branches to find set bits is extremely inefficient,
and with asynchronous delivery from different interrupt sources it
leads to total confusion of the branch prediction unit.
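For illustration, generic C rather than the driver code: a
count-trailing-zeros builtin finds the first pending source without
data-dependent branches.

#include <stdint.h>

/* Branchy search: one conditional branch per bit position, with a pattern
 * the predictor cannot learn when sources fire asynchronously. */
static int first_pending_branchy(uint32_t pending)
{
  for (int i = 0; i < 32; ++i) {
    if ((pending & (UINT32_C(1) << i)) != 0) {
      return i;
    }
  }
  return -1;
}

/* Branch-free alternative: the GCC builtin compiles to a short bit-scan
 * sequence (RBIT + CLZ on ARMv7). */
static int first_pending_ctz(uint32_t pending)
{
  return pending != 0 ? __builtin_ctz(pending) : -1;
}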
The exception table setup is handled by the common CPU architecture
support. For the ARM architecture, it can be found in the file
rtems/c/src/lib/libbsp/arm/shared/start/start.S
and ends at the bsp_vector_table_copy_done label.
The actual table content can be found at
bsp_start_vector_table_begin.
For ARMv7-A, and even other variants with hypervisor mode support,
it is not necessary to copy the table to address 0 at all,
because a CP15 register can be used to specify an alternative
table start address:
arm_cp15_set_vector_base_address(&bsp_start_vector_table_begin);
ARMv7-M has a register to set the exception table base as well.
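For the ARMv7-M case this amounts to writing the table address into the
Vector Table Offset Register; a sketch, where the register address comes
from the ARMv7-M architecture manual and reusing the start symbol is an
assumption:

#include <stdint.h>

/* VTOR lives in the System Control Block at 0xE000ED08. */
#define ARMV7M_SCB_VTOR (*(volatile uint32_t *) 0xE000ED08u)

extern char bsp_start_vector_table_begin[];  /* provided by the start code */

static void armv7m_set_exception_table_base(void)
{
  ARMV7M_SCB_VTOR = (uint32_t) (uintptr_t) bsp_start_vector_table_begin;
}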
The rtems_monitor_task() sets up/updates the termios attributes
of the opened TTY, and if some other output is in progress
this leads to a lockup.
It would be better to use a termios API function which
would call drainOutput() in rtems/cpukit/libcsupport/src/termios.c,
but that functionality is not accessible outside of the core termios
implementation.
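A sketch of what the public termios API offers; whether it is usable in
the monitor's context is exactly the open question above:

#include <termios.h>

/* TCSADRAIN makes tcsetattr() wait until all queued output has been
 * transmitted before applying the new attributes; tcdrain() provides the
 * wait on its own. */
static int monitor_update_tty_attributes(int fd, const struct termios *attributes)
{
  return tcsetattr(fd, TCSADRAIN, attributes);
}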
The loop waiting for the last character to be sent has to be there
anyway, because the hardware does not provide a Tx machine/shift
register empty interrupt.
Memory content changes caused by relocation have to be
propagated to the memory/cache level which is used/snooped
during instruction cache fill.
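For illustration, the kind of maintenance needed, expressed with the
generic RTEMS cache manager calls (the actual commit touches BSP-level
code):

#include <rtems.h>
#include <stddef.h>

/* After relocating code: write the new instructions out of the data cache
 * and discard any stale lines from the instruction cache before the CPU is
 * allowed to fetch them. */
static void sync_relocated_code(void *code, size_t size)
{
  rtems_cache_flush_multiple_data_lines(code, size);
  rtems_cache_invalidate_multiple_instruction_lines(code, size);
}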
Closes #2438
Enable even the first megabyte of SDRAM to be cacheable, now that the
problems with stale cache content have been resolved by the previous
commit.
Because the major part of an application usually fits into the first
megabyte, this speeds up the Dhrystone test application by a factor of 40.
This fixes strange behavior where stale content stored in the level 2
cache before RTEMS was started from U-Boot reappeared after the MMU was
enabled and the vector table was shadowed at the start of SDRAM.
The following cache operations should now work on most cores:
rtems_cache_flush_entire_data()
rtems_cache_invalidate_entire_data()
rtems_cache_invalidate_entire_instruction()
Instruction cache invalidate works on the first level only for now.
Data cache operations are extended to ensure flush/invalidate
on all cache levels.
The CP15 arm_cp15_data_cache_clean_all_levels() function is extended
to continue through unified levels too (ctype = 4).
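A sketch of the level walk involved; the per-level clean helper is
hypothetical, the real code lives in the CP15 support:

#include <stdint.h>

/* Hypothetical helper: clean one data cache level by set/way. */
extern void clean_data_cache_level_by_set_and_way(uint32_t level);

/* The CLIDR holds one 3-bit cache type per level: 0 = none,
 * 1 = instruction only, 2 = data only, 3 = separate instruction and data,
 * 4 = unified.  The walk must not stop when it reaches a unified level. */
static void data_cache_clean_all_levels(uint32_t clidr)
{
  for (uint32_t level = 0; level < 7; ++level) {
    uint32_t ctype = (clidr >> (3 * level)) & 0x7;

    if (ctype == 0) {
      break;                  /* no caches at this level or above */
    }
    if (ctype >= 2) {         /* data, separate, or unified cache */
      clean_data_cache_level_by_set_and_way(level);
    }
  }
}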
Disabling the MMU requires complex cache flushing and invalidation
operations. There is almost no way to do that correctly
on an SMP system without stopping all other CPUs. On the other hand,
there is a documented sequence of operations which should be used
according to the ARM manual, and it guarantees even distribution
of the maintenance operations to the other cores for the last generation
of Cortex-A cores with the multiprocessor extensions.
This change could require the addition of an appropriate entry
to arm_cp15_start_mmu_config_table for some BSPs to ensure
that the MMU table stays accessible after the MMU is enabled:
{
.begin = (uint32_t) bsp_translation_table_base,
.end = (uint32_t) bsp_translation_table_base + 0x4000,
.flags = ARMV7_MMU_DATA_READ_WRITE_CACHED
}