CONFIG_FREERTOS_ISR_STACKSIZE was set to 2100 when ELF core dump was
enabled, which resulted in a non-16-byte-aligned interrupt stack
offset. This triggered "is SP corrupted" check in the backtrace,
terminating the backtrace early.
Fix the default value, and make sure that the stack is always aligned,
regardless of the value of CONFIG_FREERTOS_ISR_STACKSIZE.
This MR uses an intermediary function `start_app` to call after system
initialization instead of `app_main`.
In RTOS builds, freertos provides `start_app` and calls `app_main`.
In non-RTOS builds, user provides `start_app` directly.
Changes the startup flow to the ff:
hardware -> core libraries init -> other libraries init -> os
init (optional) -> app_main
- hardware init resides in the port layer, and is the entry point
- core libraries init executes init functions of core components
- other libraries init executes init functions of other components (weak
references)
- after other lib is init, the app_main function is called, however,
an OS can wrap the real call to app_main to init its own stuff, and
*then* call the real app_main
FreeRTOS scheduler uses additional stack space, as in some functions
variables are placed onto the stack instead of registers.
This issue resulted in occasional stack overflows in dport task, when
compiling at -O0 optimization level.
- Increase the configMINIMAL_STACK_SIZE to 1kB.
- Enable the watchpoint at the end of stack in CI startup test for
this optimization level.
This fixes the issue where XTOS_SET_INTLEVEL would lower INTLEVEL from
4 to 3, when eTaskGetState is invoked during the core dump, triggered
from the interrupt watchdog.
`xQueueGenericCreateStatic` is placed into flash by the linker script to
reduce IRAM usage. This will also cause the `xRingbufferCreate` not
not callable when cache is disabled.
The SPI bus lock on SPI1 introduces two side effects:
1. The device lock for the main flash requires the
`CONFIG_FREERTOS_SUPPORT_STATIC_ALLOCATION` to be selected, however this
option is disabled by default in earlier IDF versions. Some developers
may find their project cannot be built by their old sdkconfig files.
2. Usually we don't need the lock on the SPI1 bus, due to it's
restrictions. However the overhead still exists in this case, the IRAM
cost for static version of semaphore functions, and the time cost when
getting and releasing the lock.
This commit:
1. Add a CONFIG_SPI_FLASH_BYPASS_MAIN_LOCK option, which will forbid the
space cost, as well as the initialization of the main bus lock.
2. When the option is not selected, the bus lock is used, the
`CONFIG_FREERTOS_SUPPORT_STATIC_ALLOCATION` will be selected explicitly.
3. Revert default value of `CONFIG_FREERTOS_SUPPORT_STATIC_ALLOCATION`
to `n`.
introduced in 49a48644e4.
Closes https://github.com/espressif/esp-idf/issues/5046
Configurable option to use IRAM as byte accessible memory (in single core mode) using
load-store (non-word aligned and non-word size IRAM access specific) exception handlers.
This allows to use IRAM for use-cases where certain performance penalty
(upto 170 cpu cycles per load or store operation) is acceptable. Additional configuration
option has been provided to redirect mbedTLS specific in-out content length buffers to
IRAM (in single core mode), allows to save 20KB per TLS connection.
1. Clarify THREADPTR calculation in FreeRTOS code, explaining where
the constant 0x10 offset comes from.
2. On the ESP32-S2, .flash.rodata section had different default
alignment (8 bytes instead of 16), which resulted in different offset
of the TLS sections. Unfortunately I haven’t found a way to query
section alignment from C code, or to use a constant value to define
section alignment in the linker script. The linker scripts are
modified to force a fixed 16 byte alignment for .flash.rodata on the
ESP32 and ESP32-S2beta. Note that the base address of .flash.rodata
was already 16 byte aligned, so this has not changed the actual
memory layout of the application.
Full explanation of the calculation below.
Assume we have the TLS template section base address
(tls_section_vma), the address of a TLS variable in the template
(address), and the final relocation value (offset). The linker
calculates:
offset = address - tls_section_vma + align_up(TCB_SIZE, alignment).
At run time, the TLS section gets copied from _thread_local_start
(in .rodata) to task_thread_local_start. Let’s assume that an address
of a variable in the runtime TLS section is runtime_address.
Access to this address will happen by calculating THREADPTR + offset.
So, by a series of substitutions:
THREADPTR + offset = runtime_address THREADPTR = runtime_address - offset
THREADPTR = runtime_address - (address - tls_section_vma + align_up(TCB_SIZE, alignment)) THREADPTR = (runtime_address - address) + tls_section_vma - align_up(TCB_SIZE, alignment)
The difference between runtime_address and address is same as the
difference between task_thread_local_start and _thread_local_start.
And tls_section_vma is the address of .rodata section, i.e.
_rodata_start. So we arrive to
THREADPTR = task_thread_local_start - _thread_local_start + _rodata_start - align_up(TCB_SIZE, alignment).
The idea with TCB_SIZE being added to the THREADPTR when computing
the relocation was to let the OS save TCB pointer in the TREADPTR
register. The location of the run-time TLS section was assumed to be
immediately after the TCB, aligned to whatever the section alignment
was. However in our case the problem is that the run-time TLS section
is stored not next to the TCB, but at the top of the stack. Plus,
even if it was stored next to the TCB, the size of a FreeRTOS TCB is
not equal to 8 bytes (TCB_SIZE hardcoded in the linker). So we have
to calculate THREADPTR in a slightly obscure way, to compensate for
these differences.
Closes IDF-1239
spin_lock: cleaned-up port files and removed portmux files
components/soc: decoupled compare and set operations from FreeRTOS
soc/spinlock: filled initial implementation of spinlock refactor
It will decouple the spinlocks into separated components with not depencences of freertos
an similar interface was provided focusing the readabillity and maintenance, also
naming to spinlocks were adopted. On FreeRTOS side the legacy portMUX macros
gained a form of wrapper functions that calls the spinlocks component thus
minimizing the impact on RTOS side.
This feature aims to close IDF-967
soc/spinlock: spinlocks passed on unit test, missing test corner cases
components/compare_set: added better function namings plus minor performance optimization on spinlocks
soc/spinlock: code reordering to remove ISC C90 mix error
freertos/portmacro: gor rid of critical sections multiline macros, placed inline functions instead
soc/spinlock: improved spinlock performance from internal RAM
For cases where the spinlock is executed from IRAM, there is no
need to check where the spinlock object is placed on memory,
removing this checks caused a great improvement on performance.
components/freertos: cleaned up multicore option scheduler.
components/freertos: more cleanup and test optimization to present realistic results
components/freertos: remove unused macros of optimized task selection when multicore is used
freertos/Kconfig: fix trailing space on optimized scheduler option
freertos/tests: moved test context variables inside of test task.
The public variables used on scheduling time test now were packed into a structure allocated on test case task stack and passed to tasks as arguments saving RAM comsumption.
FreeRTOS have an platform dependent configuration to enable selection task in a optimized way.
Provided the platform dependent functions in order to allow the scheduler to use the optimized algorithms by telling to the port layer where to found bitscan instruction i.e. NSAU.
This closes IDF-1116
components/freertos: added option to disable the optimized scheduler