GY-63_MS5611/libraries/RunningMedian/README.md
2023-07-13 14:27:45 +02:00


[![Arduino CI](https://github.com/RobTillaart/RunningMedian/workflows/Arduino%20CI/badge.svg)](https://github.com/marketplace/actions/arduino_ci)
[![Arduino-lint](https://github.com/RobTillaart/RunningMedian/actions/workflows/arduino-lint.yml/badge.svg)](https://github.com/RobTillaart/RunningMedian/actions/workflows/arduino-lint.yml)
[![JSON check](https://github.com/RobTillaart/RunningMedian/actions/workflows/jsoncheck.yml/badge.svg)](https://github.com/RobTillaart/RunningMedian/actions/workflows/jsoncheck.yml)
[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](https://github.com/RobTillaart/RunningMedian/blob/master/LICENSE)
[![GitHub release](https://img.shields.io/github/release/RobTillaart/RunningMedian.svg?maxAge=3600)](https://github.com/RobTillaart/RunningMedian/releases)
# RunningMedian
Arduino library to determine the running median by means of a circular buffer.
## Description
Running Median looks like a running average with a small but important twist.
Running average averages the last N samples, while running median takes
the last N samples, sorts them and takes the middle one, or the average of the
middle two in case the number of samples in the buffer is even.

Important differences between running average and running median:
- Running median will return real data (e.g. a real sample from a sensor)
  if one uses an odd buffer size (therefore preferred).
  Running average may return a value that was never sampled.
- Running median gives zero weight to outliers and 100% to the middle sample,
  whereas running average gives the same weight to all samples.
- Running median will often return the same value for some time.
- As one knows the values in the buffer, one can predict in advance the maximum
  change of the running median over the next steps.
- Running median is slower, as one needs to keep the values in time order
  to remove the oldest, and also keep them sorted to be able to select the median.
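
The effect of an outlier on both filters can be sketched in plain C++, independent of this library. `windowMedian` and `windowAverage` are hypothetical helper names used only for this illustration; they are not part of the RunningMedian API, and the sketch sorts a copy of the window instead of maintaining a circular buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Median of a window: sort a copy and take the middle element,
// or the average of the two middle elements when the count is even.
// Returns 0 for an empty window (a simplification for this sketch).
float windowMedian(std::vector<float> window)
{
  if (window.empty()) return 0.0f;
  std::sort(window.begin(), window.end());
  const std::size_t n = window.size();
  if (n % 2 == 1) return window[n / 2];
  return (window[n / 2 - 1] + window[n / 2]) / 2.0f;
}

// Plain average of the same window: every sample has equal weight.
float windowAverage(const std::vector<float>& window)
{
  if (window.empty()) return 0.0f;
  const float sum = std::accumulate(window.begin(), window.end(), 0.0f);
  return sum / static_cast<float>(window.size());
}
```

For the window {20, 21, 500, 19, 20} the median is 20, a value that was really sampled, while the average is 116 because the outlier 500 is weighted like every other sample.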
#### Note: MEDIAN_MAX_SIZE
The maximum size of the internal buffer is defined by **MEDIAN_MAX_SIZE** and is
set to 255 (since version 0.3.1). The memory currently allocated is in the order
of 5 bytes per element plus some overhead, so 255 elements take ~1300 bytes.
For an Arduino UNO, with 2 KB of RAM, this is quite a bit.

With larger sizes the performance penalty of keeping the internal array sorted
is large. For most applications a much lower value, e.g. 19, works well, and
sorting is in the order of 100x faster than with 255 elements.
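
The memory figure can be checked with a small calculation. The sketch below assumes that the 5 bytes per element consist of one 4-byte float value plus one 1-byte sort index; that breakdown is an assumption made for the illustration, and `bufferBytes` is a hypothetical helper, not a library function.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: each element costs one float value plus one uint8_t index,
// which is 5 bytes on platforms with a 4-byte float.
constexpr std::size_t kBytesPerElement = sizeof(float) + sizeof(std::uint8_t);

// Estimated buffer memory in bytes, excluding the fixed per-object overhead.
constexpr std::size_t bufferBytes(std::size_t elements)
{
  return elements * kBytesPerElement;
}
```

With these assumptions `bufferBytes(255)` gives 1275 bytes and `bufferBytes(19)` gives 95 bytes. If the sorting cost grows roughly with the square of the element count, the ratio (255 / 19)^2 is about 180, which agrees with the order of 100x mentioned above.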
#### Note: Configurable Options
Several options can be configured via defines at compile time:
- **RUNNING_MEDIAN_USE_MALLOC**: bool
  - true (default): Dynamic memory allocation is used for the buffer.
  - false: Static buffers of size MEDIAN_MAX_SIZE are used.
- **MEDIAN_MIN_SIZE**: uint8_t
  - Dynamic / Static: The buffer stores at least this many items.
  - Should be at least 3.
- **MEDIAN_MAX_SIZE**: uint8_t
  - Dynamic: Not used.
  - Static: The buffer stores at most this many items.
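
How such options typically interact can be sketched with the common `#ifndef` pattern, where a default applies only if the build has not defined the option already (for instance through a compiler flag). This is an illustration and not a copy of the library header: the default of 3 for MEDIAN_MIN_SIZE is an assumption derived from the note above, and `clampSize` is a hypothetical helper.

```cpp
#include <cstdint>

// Defaults are used only when the option was not defined by the build.
#ifndef RUNNING_MEDIAN_USE_MALLOC
#define RUNNING_MEDIAN_USE_MALLOC true
#endif

#ifndef MEDIAN_MIN_SIZE
#define MEDIAN_MIN_SIZE 3      // assumed default for this sketch
#endif

#ifndef MEDIAN_MAX_SIZE
#define MEDIAN_MAX_SIZE 255
#endif

// Hypothetical helper: limit a requested buffer size to the configured range.
// The upper limit is applied only when static buffers are selected.
inline std::uint8_t clampSize(std::uint8_t requested)
{
  if (requested < MEDIAN_MIN_SIZE) return MEDIAN_MIN_SIZE;
#if !RUNNING_MEDIAN_USE_MALLOC
  if (requested > MEDIAN_MAX_SIZE) return MEDIAN_MAX_SIZE;
#endif
  return requested;
}
```

With the defaults shown, `clampSize(1)` returns 3 and `clampSize(19)` returns 19.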
#### Related
- https://github.com/RobTillaart/Correlation
- https://github.com/RobTillaart/GST - Golden standard test metrics
- https://github.com/RobTillaart/Histogram
- https://github.com/RobTillaart/RunningAngle
- https://github.com/RobTillaart/RunningAverage
- https://github.com/RobTillaart/RunningMedian
- https://github.com/RobTillaart/statHelpers - combinations & permutations
- https://github.com/RobTillaart/Statistic
## Interface
#### Constructor
- **RunningMedian(const uint8_t size)** Constructor, dynamically allocates memory.
- **~RunningMedian()** Destructor.
- **uint8_t getSize()** returns size of internal array.
- **uint8_t getCount()** returns the number of elements currently in use, getCount() <= getSize().
- **bool isFull()** returns true if the internal buffer is 100% filled.
#### Base functions
- **clear()** resets the internal buffer and variables, effectively emptying the buffer.
- **add(const float value)** adds a new value to the internal buffer,
  replacing the oldest element when the buffer is full.
- **float getMedian()** returns the median == middle element.
- **float getAverage()** returns the average of **all** the values in the internal buffer.
- **float getAverage(uint8_t nMedian)** returns the average of **the middle n** values.
  This effectively removes noise from the outliers in the samples.
  The function was improved in 0.3.8 to correct a bias, see #22.
- **float getMedianAverage(uint8_t nMedian)** almost the same as above,
  except it compensates for alignment bias, see #22.
  This is done by adjusting the nMedian parameter (-1 or +1) if needed.
- **float getHighest()** returns the largest value in the buffer.
- **float getLowest()** returns the smallest value in the buffer.
- **float getQuantile(const float quantile)** returns the value at the given quantile from the buffer.
  This value is often interpolated.
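
What "interpolated" means for a quantile can be sketched as follows. The code uses one common convention, linear interpolation between the two nearest sorted samples with the quantile given in the range 0..1; whether the library uses exactly this convention is an assumption. `quantileOf` is a hypothetical standalone function, not the library's **getQuantile()**.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Quantile by linear interpolation between neighbouring sorted samples.
// q is expected in the range 0..1; returns NAN for invalid input.
float quantileOf(std::vector<float> samples, float q)
{
  if (samples.empty() || q < 0.0f || q > 1.0f) return NAN;
  std::sort(samples.begin(), samples.end());

  // Fractional position of the quantile within the sorted samples.
  const float pos = q * static_cast<float>(samples.size() - 1);
  const std::size_t lo = static_cast<std::size_t>(std::floor(pos));
  const std::size_t hi = static_cast<std::size_t>(std::ceil(pos));
  const float frac = pos - static_cast<float>(lo);

  // Weighted mix of the two neighbours; equals samples[lo] when frac == 0.
  return (1.0f - frac) * samples[lo] + frac * samples[hi];
}
```

For the samples {10, 20, 30, 40, 50} a quantile of 0.5 gives 30, a real sample, while 0.125 falls halfway between the first two samples and gives 15.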
#### getMedianAverage(nMedian)
**getAverage(nMedian)** and **getMedianAverage(uint8_t nMedian)** differ.
When nMedian is odd and count is even, or vice versa, the middle N are not
perfectly in the middle.
By auto-adjusting nMedian (-1 or +1) this balance is restored.

Assume an internal size of 7 elements \[0..6], then
- **getAverage(4)** will average elements 1, 2, 3, 4.
- **getMedianAverage(4)** will adjust nMedian and average elements 2, 3, 4.

The example **RunningMedian_getMedianAverage.ino** shows the difference.

The implementation of **getMedianAverage(uint8_t nMedian)** is experimental
and might change in the future.
One idea is to count the top and bottom elements for only 50% where needed; however, that
implies at least 2 extra float multiplications.

It is possible that the name **getMedianAverage(uint8_t nMedian)**
will change in the future to be more descriptive.
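
The index arithmetic behind the example with 7 elements can be sketched as below. Both functions are hypothetical and only compute which sorted indices would be averaged. The rule used to adjust nMedian (shrink by one, or grow by one when nMedian is 1) is an assumption for this sketch; this document does not specify when the library applies -1 or +1.

```cpp
#include <cstdint>
#include <utility>

// Inclusive index range [first, last] of the middle n of `count` sorted elements.
// Expects 1 <= n; n is limited to count.
std::pair<std::uint8_t, std::uint8_t> middleWindow(std::uint8_t count, std::uint8_t n)
{
  if (n > count) n = count;
  const std::uint8_t first = static_cast<std::uint8_t>((count - n) / 2);
  const std::uint8_t last = static_cast<std::uint8_t>(first + n - 1);
  return { first, last };
}

// Same, but n is adjusted first so that its parity matches the parity of count,
// which centres the window exactly around the median.
std::pair<std::uint8_t, std::uint8_t> balancedMiddleWindow(std::uint8_t count, std::uint8_t n)
{
  if ((count % 2) != (n % 2))
  {
    // Assumed adjustment rule for this sketch.
    n = static_cast<std::uint8_t>((n > 1) ? n - 1 : n + 1);
  }
  return middleWindow(count, n);
}
```

For count = 7 and n = 4, `middleWindow` yields the range 1..4 and `balancedMiddleWindow` yields 2..4, matching the two cases listed above.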
#### Less used functions
- **float getElement(const uint8_t n)** returns the n'th element from the values in time order.
- **float getSortedElement(const uint8_t n)** returns the n'th element from the values in size order (sorted ascending).
- **float predict(const uint8_t n)** predicts the maximum change of the median after n additions.
  n must be smaller than **getSize()/2**.
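
The idea behind such a prediction can be sketched for an odd number of samples. The sketch assumes that every addition (which also drops the oldest sample) moves the median by at most one position in the sorted order, so after n additions it cannot move beyond the n'th neighbour on either side. `maxMedianChange` is a hypothetical standalone function, not the library's **predict()**.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Upper bound for the change of the median after n additions.
// Only handles an odd sample count; returns NAN otherwise or when n is too large.
float maxMedianChange(std::vector<float> samples, std::size_t n)
{
  const std::size_t mid = samples.size() / 2;
  if (samples.size() % 2 == 0 || n >= mid) return NAN;
  std::sort(samples.begin(), samples.end());

  const float median = samples[mid];
  const float down = median - samples[mid - n];  // largest possible drop
  const float up = samples[mid + n] - median;    // largest possible rise
  return std::max(down, up);
}
```

For the samples {1, 2, 3, 10, 20} and n = 1 the median is 3; it could drop to 2 or rise to 10, so the maximum change is 7.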
#### SearchMode optimization
Since 0.3.7 the internal sort has been optimized.
It is now possible to select between LINEAR (=0) and BINARY (=1) insertion sort.
Pre-0.3.7 used linear insertion sort, and the new linear version is slightly optimized.
For larger internal arrays the performance gain of BINARY mode is substantial.
- **void setSearchMode(uint8_t searchMode = 0)** 0 = linear, 1 = binary - see table below.
Other values will set the searchMode to linear.
- **uint8_t getSearchMode()** returns the set mode

| searchMode | value | notes |
|:------------:|:-------:|:-------|
| LINEAR | 0 | fastest for smaller internal buffers (default) |
| BINARY | 1 | faster for larger internal buffers |

The buffer size at which both methods are equally fast depends on the board / clock used.
Give it a try, and let me know your findings.
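
The difference between the two modes can be sketched as two ways of finding the insertion position for a new value in an already sorted array. This is a plain C++ illustration operating on a sorted vector, not the library's internal code; `findLinear` and `findBinary` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Linear search: walk from the start until an element is not smaller than value.
std::size_t findLinear(const std::vector<float>& sorted, float value)
{
  std::size_t pos = 0;
  while (pos < sorted.size() && sorted[pos] < value) pos++;
  return pos;
}

// Binary search: halve the candidate range [lo, hi) in every step.
std::size_t findBinary(const std::vector<float>& sorted, float value)
{
  std::size_t lo = 0;
  std::size_t hi = sorted.size();
  while (lo < hi)
  {
    const std::size_t mid = lo + (hi - lo) / 2;
    if (sorted[mid] < value) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}
```

Both functions return the same position. The linear version needs up to N comparisons but has very little overhead per step, while the binary version needs about log2(N) comparisons, which pays off for larger buffers.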
## Operation
See examples.
## Future
- improve documentation.
- check for optimizations.
- get the median without (full) sorting. QuickSelect()
- move all code to .cpp file
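
As a rough idea of what getting the median without full sorting could look like, the sketch below uses `std::nth_element` from the C++ standard library, which performs a selection instead of a complete sort. It is not part of this library, `medianBySelection` is a hypothetical name, and `<algorithm>` may not be available on every Arduino toolchain.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Median via selection: only the middle element is put in its sorted position.
float medianBySelection(std::vector<float> samples)
{
  if (samples.empty()) return NAN;
  const std::size_t mid = samples.size() / 2;
  std::nth_element(samples.begin(), samples.begin() + mid, samples.end());
  const float upper = samples[mid];
  if (samples.size() % 2 == 1) return upper;

  // Even count: the lower middle is the largest element left of mid.
  const float lower = *std::max_element(samples.begin(), samples.begin() + mid);
  return (lower + upper) / 2.0f;
}
```

For {5, 1, 4, 2, 3} this returns 3, and for {4, 1, 3, 2} it returns 2.5.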