The Xtensa FreeRTOS port does not save the threadptr register when
doing a voluntary yield. This can result in a crash when multiple
tasks used the threadptr register and call "taskYIELD()".
This commit adds the threadptr register to the solicited stack frame.
Since dd849ffc, _rodata_start label has been moved to a different
linker output section from where the TLS templates (.tdata, .tbss)
are located. Since link-time addresses of thread-local variables are
calculated relative to the section start address, this resulted in
incorrect calculation of THREADPTR/$tp registers.
Fix by introducing new linker label, _flash_rodata_start, which points
to the .flash.rodata output section where TLS variables are located,
and use it when calculating THREADPTR/$tp.
Also remove the hardcoded rodata section alignment for Xtensa targets.
Alignment of rodata can be affected by the user application, which is
the issue dd849ffc was fixing. To accommodate any possible alignment,
save it in a linker label (_flash_rodata_align) and then use when
calculating THREADPTR. Note that this is not required on RISC-V, since
this target doesn't use TPOFF.
Fixes issue with DPORT init task, this task uses minimum stack size and may not be
enough if stack smashing detection is set to Overall mode.
Also reworks the way we calculate minimum stack to allow for adding multiple
contributing factors.
Closes https://github.com/espressif/esp-idf/issues/6403
Commit 891eb3b0 was fixing an issue with PS and EPC1 not being
preserved after the window spill procedure. It did so by saving PS in
a2 and EPC1 in a4. However the a4 register may be a live register of
another window in the call stack, and if it is overwritten and then
spilled to the stack, then the corresponding register value will end
up being corrupted. In practice the problem would show up as an
IllegalInstruction exception, when trying to return from a function
when a0 value was 0x40020.
Fix by using a0 register instead of a4 as scratch. Also fix a comment
about xthal_save_extra_nw, as this function in fact doesn't clobber
a4 or a5 because XCHAL_NCP_NUM_ATMPS is defined as 1.
Closes https://github.com/espressif/esp-idf/issues/5758
CONFIG_FREERTOS_ISR_STACKSIZE was set to 2100 when ELF core dump was
enabled, which resulted in a non-16-byte-aligned interrupt stack
offset. This triggered "is SP corrupted" check in the backtrace,
terminating the backtrace early.
Fix the default value, and make sure that the stack is always aligned,
regardless of the value of CONFIG_FREERTOS_ISR_STACKSIZE.
`xQueueGenericCreateStatic` is placed into flash by the linker script to
reduce IRAM usage. This will also cause the `xRingbufferCreate` not
not callable when cache is disabled.
The SPI bus lock on SPI1 introduces two side effects:
1. The device lock for the main flash requires the
`CONFIG_FREERTOS_SUPPORT_STATIC_ALLOCATION` to be selected, however this
option is disabled by default in earlier IDF versions. Some developers
may find their project cannot be built by their old sdkconfig files.
2. Usually we don't need the lock on the SPI1 bus, due to it's
restrictions. However the overhead still exists in this case, the IRAM
cost for static version of semaphore functions, and the time cost when
getting and releasing the lock.
This commit:
1. Add a CONFIG_SPI_FLASH_BYPASS_MAIN_LOCK option, which will forbid the
space cost, as well as the initialization of the main bus lock.
2. When the option is not selected, the bus lock is used, the
`CONFIG_FREERTOS_SUPPORT_STATIC_ALLOCATION` will be selected explicitly.
3. Revert default value of `CONFIG_FREERTOS_SUPPORT_STATIC_ALLOCATION`
to `n`.
introduced in 49a48644e4.
Closes https://github.com/espressif/esp-idf/issues/5046
Configurable option to use IRAM as byte accessible memory (in single core mode) using
load-store (non-word aligned and non-word size IRAM access specific) exception handlers.
This allows to use IRAM for use-cases where certain performance penalty
(upto 170 cpu cycles per load or store operation) is acceptable. Additional configuration
option has been provided to redirect mbedTLS specific in-out content length buffers to
IRAM (in single core mode), allows to save 20KB per TLS connection.
1. Clarify THREADPTR calculation in FreeRTOS code, explaining where
the constant 0x10 offset comes from.
2. On the ESP32-S2, .flash.rodata section had different default
alignment (8 bytes instead of 16), which resulted in different offset
of the TLS sections. Unfortunately I haven’t found a way to query
section alignment from C code, or to use a constant value to define
section alignment in the linker script. The linker scripts are
modified to force a fixed 16 byte alignment for .flash.rodata on the
ESP32 and ESP32-S2beta. Note that the base address of .flash.rodata
was already 16 byte aligned, so this has not changed the actual
memory layout of the application.
Full explanation of the calculation below.
Assume we have the TLS template section base address
(tls_section_vma), the address of a TLS variable in the template
(address), and the final relocation value (offset). The linker
calculates:
offset = address - tls_section_vma + align_up(TCB_SIZE, alignment).
At run time, the TLS section gets copied from _thread_local_start
(in .rodata) to task_thread_local_start. Let’s assume that an address
of a variable in the runtime TLS section is runtime_address.
Access to this address will happen by calculating THREADPTR + offset.
So, by a series of substitutions:
THREADPTR + offset = runtime_address THREADPTR = runtime_address - offset
THREADPTR = runtime_address - (address - tls_section_vma + align_up(TCB_SIZE, alignment)) THREADPTR = (runtime_address - address) + tls_section_vma - align_up(TCB_SIZE, alignment)
The difference between runtime_address and address is same as the
difference between task_thread_local_start and _thread_local_start.
And tls_section_vma is the address of .rodata section, i.e.
_rodata_start. So we arrive to
THREADPTR = task_thread_local_start - _thread_local_start + _rodata_start - align_up(TCB_SIZE, alignment).
The idea with TCB_SIZE being added to the THREADPTR when computing
the relocation was to let the OS save TCB pointer in the TREADPTR
register. The location of the run-time TLS section was assumed to be
immediately after the TCB, aligned to whatever the section alignment
was. However in our case the problem is that the run-time TLS section
is stored not next to the TCB, but at the top of the stack. Plus,
even if it was stored next to the TCB, the size of a FreeRTOS TCB is
not equal to 8 bytes (TCB_SIZE hardcoded in the linker). So we have
to calculate THREADPTR in a slightly obscure way, to compensate for
these differences.
Closes IDF-1239
spin_lock: cleaned-up port files and removed portmux files
components/soc: decoupled compare and set operations from FreeRTOS
soc/spinlock: filled initial implementation of spinlock refactor
It will decouple the spinlocks into separated components with not depencences of freertos
an similar interface was provided focusing the readabillity and maintenance, also
naming to spinlocks were adopted. On FreeRTOS side the legacy portMUX macros
gained a form of wrapper functions that calls the spinlocks component thus
minimizing the impact on RTOS side.
This feature aims to close IDF-967
soc/spinlock: spinlocks passed on unit test, missing test corner cases
components/compare_set: added better function namings plus minor performance optimization on spinlocks
soc/spinlock: code reordering to remove ISC C90 mix error
freertos/portmacro: gor rid of critical sections multiline macros, placed inline functions instead
soc/spinlock: improved spinlock performance from internal RAM
For cases where the spinlock is executed from IRAM, there is no
need to check where the spinlock object is placed on memory,
removing this checks caused a great improvement on performance.
components/freertos: cleaned up multicore option scheduler.
components/freertos: more cleanup and test optimization to present realistic results
components/freertos: remove unused macros of optimized task selection when multicore is used
freertos/Kconfig: fix trailing space on optimized scheduler option
freertos/tests: moved test context variables inside of test task.
The public variables used on scheduling time test now were packed into a structure allocated on test case task stack and passed to tasks as arguments saving RAM comsumption.
FreeRTOS have an platform dependent configuration to enable selection task in a optimized way.
Provided the platform dependent functions in order to allow the scheduler to use the optimized algorithms by telling to the port layer where to found bitscan instruction i.e. NSAU.
This closes IDF-1116
components/freertos: added option to disable the optimized scheduler
bugfix/pthread: fix pthread_once() race condiion possibility adding critical section in compare and set function
Closes IDFGH-2448
See merge request espressif/esp-idf!7236