doc: Rewrite heap debugging docs to add function/struct links & improve actionability

This commit is contained in:
Angus Gratton 2017-10-02 14:21:44 +11:00 committed by Angus Gratton
parent 530bca1813
commit 325bd3a4dc
2 changed files with 43 additions and 37 deletions

View File

@ -15,11 +15,11 @@ Heap Information
To obtain information about the state of the heap:
- ``xPortGetFreeHeapSize()`` is a FreeRTOS function which returns the number of free bytes in the (data memory) heap. This is equivalent to ``heap_caps_get_free_size(MALLOC_CAP_8BIT)``.
- ``heap_caps_get_free_size()`` can also be used to return the current free memory for different memory capabilities.
- ``heap_caps_get_largest_free_block()`` can be used to return the largest free block in the heap. This is the largest single allocation which is currently possible. Tracking this value and comparing to total free heap allows you to detect heap fragmentation.
- ``xPortGetMinimumEverFreeHeapSize()`` and the related ``heap_caps_get_minimum_free_size()`` can be used to track the heap "low water mark" since boot.
- ``heap_caps_get_info`` returns a ``multi_heap_info_t`` structure which contains the information from the above functions, plus some additional heap-specific data (number of allocations, etc.)
- :cpp:func:`xPortGetFreeHeapSize` is a FreeRTOS function which returns the number of free bytes in the (data memory) heap. This is equivalent to calling ``heap_caps_get_free_size(MALLOC_CAP_8BIT)``.
- :cpp:func:`heap_caps_get_free_size` can also be used to return the current free memory for different memory capabilities.
- :cpp:func:`heap_caps_get_largest_free_block` can be used to return the largest free block in the heap. This is the largest single allocation which is currently possible. Tracking this value and comparing to total free heap allows you to detect heap fragmentation.
- :cpp:func:`xPortGetMinimumEverFreeHeapSize` and the related :cpp:func:`heap_caps_get_minimum_free_size` can be used to track the heap "low water mark" since boot.
- :cpp:func:`heap_caps_get_info` returns a :cpp:class:`multi_heap_info_t` structure which contains the information from the above functions, plus some additional heap-specific data (number of allocations, etc.)
.. _heap-corruption:
@ -40,33 +40,47 @@ The heap implementation (``multi_heap.c``, etc.) includes a lot of assertions wh
If a heap integrity assertion fails, a line will be printed like ``CORRUPT HEAP: multi_heap.c:225 detected at 0x3ffbb71c``. The memory address which is printed is the address of the heap structure which has corrupt content.
It's also possible to manually check heap integrity by calling the ``heap_caps_check_integrity()`` function (see below). This function checks all of requested heap memory for integrity, and can be used even if assertions are disabled. If the integrity check prints an error, it will contain the address(es) of corrupt heap structures.
It's also possible to manually check heap integrity by calling :cpp:func:`heap_caps_check_integrity` function. This function checks all of requested heap memory for integrity, and can be used even if assertions are disabled. If the integrity check prints an error, it will also contain the address(es) of corrupt heap structures.
Finding Heap Corruption
^^^^^^^^^^^^^^^^^^^^^^^
Memory corruption can be one of the hardest classes of bugs to find and fix, as one area of memory can be corrupted from a totally different place. Some tips:
- A crash with a ``CORRUPT HEAP:`` message will usually include a stack trace, but this stack trace is rarely useful. The crash is the symptom of memory corruption when the system realises the heap is corrupt, but usually the corruption happened elsewhere and earlier in time.
- Increasing the Heap memory debugging `Configuration`_ level to "Light impact" or "Comprehensive" can give you a more accurate message with the first corrupt memory address.
- Adding regular calls to :cpp:func:`heap_caps_check_integrity` in your code will help you pin down the exact time that the corruption happened. You can move these checks around to "close in on" the section of code that corrupted the heap.
- Based on the memory address which is being corrupted, you can use :ref:`JTAG debugging <jtag-debugging-introduction>` to set a watchpoint on this address and have the CPU halt when it is written to.
- If you don't have JTAG, but you do know roughly when the corruption happens, then you can set a watchpoint in software just beforehand via :cpp:func:`esp_set_watchpoint`. A fatal exception will occur when the watchpoint triggers. For example ``esp_set_watchpoint(0, (void *)addr, 4, ESP_WATCHPOINT_STORE``. Note that watchpoints are per-CPU and are set on the current running CPU only, so if you don't know which CPU is corrupting memory then you will need to call this function on both CPUs.
- For buffer overflows, `heap tracing`_ in ``HEAP_TRACE_ALL`` mode lets you see which callers are allocating from heap. If you can find the function which allocates memory with an address immediately before the address which is corrupted, this will probably be the function which overflows the buffer.
Configuration
^^^^^^^^^^^^^
In ``make menuconfig``, under ``Component config`` there is a menu ``Heap memory debugging``. The setting ``Heap corruption detection`` can be set to one of three levels:
Temporarily increasing the heap corruption detection level can give more detailed information about heap corruption errors.
In ``make menuconfig``, under ``Component config`` there is a menu ``Heap memory debugging``. The setting :ref:`CONFIG_HEAP_CORRUPTION_DETECTION` can be set to one of three levels:
Basic (no poisoning)
^^^^^^^^^^^^^^^^^^^^
++++++++++++++++++++
This is the default level. No special heap corruption features are enabled, but checks will fail if any of the heap's internal data structures are overwritten or corrupted. This usually indicates a buffer overrun or out of bounds write.
This is the default level. No special heap corruption features are enabled, but provided assertions are enabled (the default configuration) then a heap corruption error will be printed if any of the heap's internal data structures appear overwritten or corrupted. This usually indicates a buffer overrun or out of bounds write.
If assertions are enabled, an assertion will also trigger if a double-free occurs (ie the same memory is freed twice).
If assertions are enabled, an assertion will also trigger if a double-free occurs (the same memory is freed twice).
Light impact
^^^^^^^^^^^^
++++++++++++
At this level, heap memory is additionally "poisoned" with head and tail "canary bytes" before and after each block which is allocated. If an application writes outside the bounds of the allocated buffer, the canary bytes will be corrupted and the integrity check will fail.
"Basic" heap corruption checks can also detect out of bounds writes, but this setting is more precise as even a single byte overrun will always be detected. With Basic heap checks, the number of overrun bytes before a failure is detected will depend on the properties of the heap.
"Basic" heap corruption checks can also detect most out of bounds writes, but this setting is more precise as even a single byte overrun will always be detected. With Basic heap checks, the number of overrun bytes before a failure is detected will depend on the properties of the heap.
Similar to other heap checks, these "canary bytes" are checked via assertion whenever memory is freed and can also be checked manually via ``heap_caps_check_integrity()``.
Similar to other heap checks, these "canary bytes" are checked via assertion whenever memory is freed and can also be checked manually via :cpp:func:`heap_caps_check_integrity`.
This level increases memory usage, each individual allocation will use 9 to 12 additional bytes of memory (depending on alignment).
Comprehensive
^^^^^^^^^^^^^
+++++++++++++
This level incorporates the "light impact" detection features plus additional checks for uninitialised-access and use-after-free bugs. In this mode, all freshly allocated memory is filled with the pattern 0xCE, and all freed memory is filled with the pattern 0xFE.
@ -78,15 +92,6 @@ If the IDF heap allocator fails because the pattern 0xFEFEFEFE was not found in
Enabling "Comprehensive" detection has a substantial runtime performance impact (as all memory needs to be set to the allocation patterns each time a malloc/free completes, and the memory also needs to be checked each time.)
Finding Heap Corruption
^^^^^^^^^^^^^^^^^^^^^^^
Memory corruption can be one of the hardest classes of bugs to find and fix, as one area of memory can be corrupted from a totally different place. Some tips:
- Once you know the address (in memory) which is being corrupted, you can set a watchpoint on this address via JTAG to have the CPU halt when it is written to.
- If you don't have JTAG, but you do know roughly when the corruption happens, then you can set a watchpoint in software. A fatal exception will occur when the watchpoint triggers. For example ``esp_set_watchpoint(0, (void *)addr, 4, ESP_WATCHPOINT_STORE``. Note that the watchpoint is set on the current running CPU only, so if you don't know which CPU is corrupting memory then you will need to call this function on both CPUs.
- For buffer overflows, `heap tracing`_ in ``HEAP_TRACE_ALL`` mode lets you see which callers allocate the memory address(es) immediately before the address which is being corrupted. There is a strong chance this is the code which overflows the buffer.
.. _heap-tracing:
Heap Tracing
@ -94,7 +99,9 @@ Heap Tracing
Heap Tracing allows tracing of code which allocates/frees memory.
Note: Heap tracing "standalone" mode is currently implemented, meaning that tracing does not require any external hardware but uses internal memory to hold trace data. Heap tracing via JTAG trace port is also planned.
.. note::
Heap tracing "standalone" mode is currently implemented, meaning that tracing does not require any external hardware but uses internal memory to hold trace data. Heap tracing via JTAG trace port is also planned.
Heap tracing can perform two functions:
@ -104,15 +111,15 @@ Heap tracing can perform two functions:
How To Diagnose Memory Leaks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you suspect a memory leak, the first step is to figure out which part of the program is leaking memory. Use the ``xPortGetFreeHeapSize()``, ``heap_caps_get_free()``, and related functions to track memory use over the life of the application. Try to narrow the leak down to a single function or sequence of functions where free memory always decreases and never recovers.
If you suspect a memory leak, the first step is to figure out which part of the program is leaking memory. Use the :cpp:func:`xPortGetFreeHeapSize`, :cpp:func:`heap_caps_get_free`, and related functions to track memory use over the life of the application. Try to narrow the leak down to a single function or sequence of functions where free memory always decreases and never recovers.
Once you've identified the code which you think is leaking:
- Under ``make menuconfig``, navigate to ``Component settings`` -> ``Heap Memory Debugging`` and set ``Enable heap tracing``.
- Call the function ``heap_trace_init_standalone()`` early in the program, to register a buffer which can be used to record the memory trace.
- Call the function ``heap_trace_start()`` to begin recording all mallocs/frees in the system. Call this immediately before the piece of code which you suspect is leaking memory.
- Call the function ``heap_trace_stop()`` to stop the trace once the suspect piece of code has finished executing.
- Call the function ``heap_trace_dump()`` to dump the results of the heap trace.
- Under ``make menuconfig``, navigate to ``Component settings`` -> ``Heap Memory Debugging`` and set :ref:`CONFIG_HEAP_TRACING`.
- Call the function :cpp:func:`heap_trace_init_standalone` early in the program, to register a buffer which can be used to record the memory trace.
- Call the function :cpp:func:`heap_trace_start` to begin recording all mallocs/frees in the system. Call this immediately before the piece of code which you suspect is leaking memory.
- Call the function :cpp:func:`heap_trace_stop` to stop the trace once the suspect piece of code has finished executing.
- Call the function :cpp:func:`heap_trace_dump` to dump the results of the heap trace.
An example::
@ -186,11 +193,11 @@ When heap tracing is running, heap allocation/free operations are substantially
False-Positive Memory Leaks
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Not everything printed by ``heap_trace_dump()`` is necessarily a memory leak. Among things which may show up here, but are not memory leaks:
Not everything printed by :cpp:func:`heap_trace_dump` is necessarily a memory leak. Among things which may show up here, but are not memory leaks:
- Any memory which is allocated after ``heap_trace_start()`` but then freed after ``heap_trace_stop()`` will appear in the leak dump.
- Allocations may be made by other tasks in the system. Depending on the timing of these tasks, it's quite possible this memory is freed after ``heap_trace_stop()`` is called.
- The first time a task uses stdio - for example, when it calls printf() - a lock (RTOS mutex semaphore) is allocated by the libc. This allocation lasts until the task is deleted.
- Any memory which is allocated after :cpp:func:`heap_trace_start` but then freed after :cpp:func:`heap_trace_stop` will appear in the leak dump.
- Allocations may be made by other tasks in the system. Depending on the timing of these tasks, it's quite possible this memory is freed after :cpp:func:`heap_trace_stop` is called.
- The first time a task uses stdio - for example, when it calls ``printf()`` - a lock (RTOS mutex semaphore) is allocated by the libc. This allocation lasts until the task is deleted.
- The Bluetooth, WiFi, and TCP/IP libraries will allocate heap memory buffers to handle incoming or outgoing data. These memory buffers are usually short lived, but some may be shown in the heap leak trace if the data was received/transmitted by the lower levels of the network while the leak trace was running.
- TCP connections will continue to use some memory after they are closed, because of the ``TIME_WAIT`` state. After the ``TIME_WAIT`` period has completed, this memory will be freed.

View File

@ -14,10 +14,10 @@ issues.
However, in order to fully make use of all of the memory types and their characteristics, esp-idf also has a
capabilities-based heap memory allocator. If you want to have memory with certain properties (for example, DMA-capable
memory, or executable memory), you can create an OR-mask of the required capabilities and pass that to
``heap_caps_malloc()``. For instance, the standard ``malloc()`` implementation internally allocates memory via
:cpp:func:`heap_caps_malloc`. For instance, the standard ``malloc()`` implementation internally allocates memory via
``heap_caps_malloc(size, MALLOC_CAP_8BIT)`` in order to get data memory that is byte-addressable.
Because malloc uses this allocation system as well, memory allocated using ``heap_caps_malloc()`` can be freed by calling
Because malloc uses this allocation system as well, memory allocated using :cpp:func:`heap_caps_malloc` can be freed by calling
the standard ``free()`` function.
The "soc" component contains a list of memory regions for the chip, along with the type of each memory (aka its tag) and the associated capabilities for that memory type. On startup, a separate heap is initialised for each contiguous memory region. The capabilities-based allocator chooses the best heap for each allocation, based on the requested capabilities.
@ -74,4 +74,3 @@ API Reference - Multi Heap API
(Note: The multi heap API is used internally by the heap capabilities allocator. Most IDF programs will never need to call this API directly.)
.. include:: /_build/inc/multi_heap.inc