Since dd849ffc, _rodata_start label has been moved to a different
linker output section from where the TLS templates (.tdata, .tbss)
are located. Since link-time addresses of thread-local variables are
calculated relative to the section start address, this resulted in
incorrect calculation of THREADPTR/$tp registers.
Fix by introducing new linker label, _flash_rodata_start, which points
to the .flash.rodata output section where TLS variables are located,
and use it when calculating THREADPTR/$tp.
Also remove the hardcoded rodata section alignment for Xtensa targets.
Alignment of rodata can be affected by the user application, which is
the issue dd849ffc was fixing. To accommodate any possible alignment,
save it in a linker label (_flash_rodata_align) and then use when
calculating THREADPTR. Note that this is not required on RISC-V, since
this target doesn't use TPOFF.
It is now possible to have any alignment restriction on rodata in the user
applicaiton. It will not affect the first section which must be aligned
on a 16-byte bound.
Closes https://github.com/espressif/esp-idf/issues/6719
rodata dummy section has now the same alignment as flash text section,
and at least the same size. For these reasons, the cache will map
correctly the following rodata section.
The CPU might prefetch instructions, which means it in some cases
will try to fetch instruction located after the last instruction in
flash.text.
Add dummy bytes to ensure fetching these wont result in an error,
e.g. MMU exceptions