2023-07-07 12:31:58 -04:00

[![Arduino CI](https://github.com/RobTillaart/Gauss/workflows/Arduino%20CI/badge.svg)](https://github.com/marketplace/actions/arduino_ci)
[![Arduino-lint](https://github.com/RobTillaart/Gauss/actions/workflows/arduino-lint.yml/badge.svg)](https://github.com/RobTillaart/Gauss/actions/workflows/arduino-lint.yml)
[![JSON check](https://github.com/RobTillaart/Gauss/actions/workflows/jsoncheck.yml/badge.svg)](https://github.com/RobTillaart/Gauss/actions/workflows/jsoncheck.yml)
[![GitHub issues](https://img.shields.io/github/issues/RobTillaart/Gauss.svg)](https://github.com/RobTillaart/Gauss/issues)

[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](https://github.com/RobTillaart/Gauss/blob/master/LICENSE)
[![GitHub release](https://img.shields.io/github/release/RobTillaart/Gauss.svg?maxAge=3600)](https://github.com/RobTillaart/Gauss/releases)
[![PlatformIO Registry](https://badges.registry.platformio.org/packages/robtillaart/library/Gauss.svg)](https://registry.platformio.org/libraries/robtillaart/Gauss)

# Gauss

Library for Gauss probability math (normal distribution).

## Description

Gauss is an experimental Arduino library to approximate the probability that a value is
smaller or larger than a given value.
This assumes a Gaussian distribution with parameters **mean** and **stddev**
(a.k.a. average / mu / µ and standard deviation / sigma / σ).
If these parameters are not given, mean == 0 and stddev == 1 are used by default.
This is the normalized Gaussian distribution.

The values of the functions are approximated with a **MultiMap()** based
linear interpolation over a lookup table of 34 points.
- Version 0.1.x uses the **MultiMap** library, which must be installed too (see Related below).
- Version 0.2.0 and above embed an optimized version, so **MultiMap** is not needed.

Note: The number of lookup points might change in the future, keeping a balance between
accuracy and footprint.

#### Accuracy / precision

The version 0.2.0 lookup table has 34 points with 8 decimals.
This matches the precision of the float data type.
Do not expect 8 decimals of accuracy / precision, as the interpolation is linear.

A first investigation (range 0.0 - 1.3) shows:
- maximum error ~ 0.0003016 <= 0.031%
- average error ~ 0.0001433 <= 0.015%

I expect that this accuracy is sufficient for many applications.

The 34 points are in a (mostly) equidistant table.
Searching the interpolation points is optimized in version 0.2.0.
The table uses the symmetry of the distribution to reduce the number of points.

Values of the table are calculated with the ```NORM.DIST(x, mean, stddev, true)```
spreadsheet function.

Note: 0.1.0 used 32 points with 4 decimals. Reducing the number of points still needs investigation.

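
The lookup idea described above can be sketched in plain C++ for a desktop compiler. This is a minimal sketch under stated assumptions, not the library's code: the 0.1 spacing, the range 0.0 .. 3.3 and the names `exactCDF`, `tableCDF` and `maxError` are all hypothetical, and the table is filled with `std::erfc` instead of spreadsheet values.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical layout (an assumption, not the library's table):
// 34 equidistant points covering z = 0.0 .. 3.3 in steps of 0.1.
constexpr std::size_t kPoints = 34;
constexpr double kStep = 0.1;

// Exact CDF of the normalized distribution, used to fill the table
// and as the reference when measuring the interpolation error.
double exactCDF(double z)
{
  return 0.5 * std::erfc(-z / std::sqrt(2.0));
}

// Approximate CDF: linear interpolation in the table for z >= 0,
// symmetry CDF(-z) = 1 - CDF(z) for z < 0.
double tableCDF(double z)
{
  static double table[kPoints];
  static bool filled = false;
  if (!filled)
  {
    for (std::size_t i = 0; i < kPoints; i++) table[i] = exactCDF(i * kStep);
    filled = true;
  }

  if (z < 0) return 1.0 - tableCDF(-z);        // use the symmetry
  const double zMax = (kPoints - 1) * kStep;
  if (z >= zMax) return table[kPoints - 1];    // clamp beyond the table

  // Equidistant points allow a direct index instead of a search.
  std::size_t idx = static_cast<std::size_t>(z / kStep);
  if (idx > kPoints - 2) idx = kPoints - 2;
  const double frac = (z - idx * kStep) / kStep;
  return table[idx] + frac * (table[idx + 1] - table[idx]);
}

// Largest absolute error of tableCDF() on [from, to], sampled every 0.001.
double maxError(double from, double to)
{
  double worst = 0.0;
  for (double z = from; z <= to; z += 0.001)
  {
    const double err = std::fabs(tableCDF(z) - exactCDF(z));
    if (err > worst) worst = err;
  }
  return worst;
}
```

With this assumed spacing, `maxError(0.0, 1.3)` comes out at about 0.0003, the same order as the maximum error quoted above. A denser table lowers the error at the cost of a larger footprint.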
#### Applications

- use as a filter, e.g. detect values above N1 sigma and below N2 sigma.
- compare historic data to current data, e.g. temperature.
  - transforming to sigma makes it scale (C / F / K) independent.
- fill a bag (etc.) until a certain weight is reached (+- N sigma).
- compare population data with an individual, e.g. Body Mass Index (BMI).

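
The filter and the scale independence mentioned above can be illustrated with a few lines of plain C++. This is only a sketch: `toSigma`, `insideBand`, `cToF` and `cToFDelta` are hypothetical helper names, not part of the library.

```cpp
// Number of standard deviations that value lies from mean.
double toSigma(double value, double mean, double stddev)
{
  return (value - mean) / stddev;
}

// Filter: true if value lies above n1 sigma and below n2 sigma (n1 < n2).
bool insideBand(double value, double mean, double stddev, double n1, double n2)
{
  const double s = toSigma(value, mean, stddev);
  return (s > n1) && (s < n2);
}

// Celsius to Fahrenheit for a temperature and for a temperature difference.
// A stddev is a difference, so it scales without the 32 degree offset.
double cToF(double c)      { return c * 9.0 / 5.0 + 32.0; }
double cToFDelta(double d) { return d * 9.0 / 5.0; }
```

For example, 25 C with mean 20 C and stddev 4 C is +1.25 sigma. Converting all three to Fahrenheit (77, 68 and 7.2) gives the same +1.25 sigma, so a filter expressed in sigma works unchanged in either scale.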
#### Character

| parameter   | name   | ALT-code   | char  | notes                  |
|:-----------:|:------:|:----------:|:-----:|:-----------------------|
| mean        | mu     | ALT-230    | µ     |                        |
| stddev      | sigma  | ALT-229    | σ     |                        |
| CDF         | phi    | ALT-232    | Φ     | ALT-237 for lower case |

- https://altcodesguru.com/greek-alt-codes.html

#### Related

- https://en.wikipedia.org/wiki/Normal_distribution
- https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_probability/bs704_probability9.html
- https://github.com/RobTillaart/Multimap
- https://github.com/RobTillaart/Statistic (more stat links there).

## Interface

```cpp
#include "Gauss.h"
```

#### Base

- **Gauss()** constructor. Uses mean = 0 and stddev = 1 by default.
- **bool begin(float mean = 0, float stddev = 1)** sets the mean and stddev.
Returns true if stddev > 0, which is the expected case.
Returns false if stddev <= 0, although using such a value may be a deliberate user choice.
Note that if ```stddev == 0```, probabilities cannot be calculated
as the distribution is not Gaussian.
The default values (0, 1) give the normalized Gaussian distribution.
**begin()** can be called at any time to change the mean and/or stddev.
- **float getMean()** returns the current mean.
- **float getStddev()** returns the current stddev.

#### Probability

Probability functions return NAN if stddev == 0.
Return values are given as a float 0.0 .. 1.0.
Multiply probabilities by 100.0 to get the value as a percentage.

- **float P_smaller(float f)** returns probability **P(x < f)**.
A.k.a. **CDF()** Cumulative Distribution Function.
- **float P_larger(float f)** returns probability **P(x > f)**.
As the distribution is continuous **P_larger(f) == 1 - P_smaller(f)**.
- **float P_between(float f, float g)** returns probability **P(f < x < g)**.
  - if f >= g ==> returns 1.0
- **float P_equal(float f)** returns probability **P(x == f)**.
This uses the bell curve formula, so the result is the probability density at f.
- **float P_outside(float f, float g)** returns probability **P(x < f) + P(g < x)**.
  - note that f should be smaller than or equal to g
  - **P_outside() = 1 - P_between()**

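
The relations between the probability functions can be made concrete with a small stand-in. This is a hedged sketch in plain C++, not the library's implementation: `GaussSketch` is a hypothetical type, it works in double, and it evaluates the exact CDF with `std::erfc` instead of the interpolated lookup table. It only covers the case f < g.

```cpp
#include <cmath>

// Hypothetical stand-in for illustration; not the library's code.
struct GaussSketch
{
  double mean = 0.0;
  double stddev = 1.0;

  // P(x < f); NAN if stddev == 0, as described above.
  double P_smaller(double f) const
  {
    if (stddev == 0) return NAN;
    const double z = (f - mean) / stddev;
    return 0.5 * std::erfc(-z / std::sqrt(2.0));
  }

  // P(x > f) is the complement of P(x < f).
  double P_larger(double f) const { return 1.0 - P_smaller(f); }

  // P(f < x < g); this sketch assumes f < g.
  double P_between(double f, double g) const
  {
    return P_smaller(g) - P_smaller(f);
  }

  // P(x < f) + P(g < x) is the complement of P_between().
  double P_outside(double f, double g) const
  {
    return 1.0 - P_between(f, g);
  }

  // Bell curve formula: the probability density at f.
  double P_equal(double f) const
  {
    if (stddev == 0) return NAN;
    const double pi = 3.14159265358979323846;
    const double z = (f - mean) / stddev;
    return std::exp(-0.5 * z * z) / (stddev * std::sqrt(2.0 * pi));
  }
};
```

For the normalized distribution, `P_between(-1, 1)` evaluates to roughly 0.6827 and `P_outside(-1, 1)` to roughly 0.3173, the familiar one sigma band.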
#### Normalize

- **float normalize(float f)** normalizes a value to the normalized distribution.
E.g. if mean == 50 and stddev == 14, then 71 ==> +1.5 sigma.
Is equal to the number of **stddevs()**.
- **float denormalize(float f)** reverses normalize().
E.g. what value has a deviation of 1.73 stddev?
- **float stddevs(float f)** returns the number of stddevs from the mean.
Identical to **normalize()**.

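
The arithmetic behind the two directions is short enough to write out. The helpers below are hypothetical free functions that mirror the description above; the library's own functions take only the value, as they use the mean and stddev set with **begin()**.

```cpp
// Hypothetical helpers for illustration; not the library's code.

// Number of stddevs that f lies from the mean.
double normalizeValue(double f, double mean, double stddev)
{
  return (f - mean) / stddev;
}

// Value that lies z stddevs from the mean; the inverse of normalizeValue().
double denormalizeValue(double z, double mean, double stddev)
{
  return mean + z * stddev;
}
```

With mean == 50 and stddev == 14, `normalizeValue(71, 50, 14)` gives 1.5, and `denormalizeValue(1.73, 50, 14)` gives 50 + 1.73 * 14 = 74.22.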
#### Other

Wrapper functions:

- **float bellCurve(float f)** returns probability **P(x == f)**.
- **float CDF(float f)** returns probability **P(x < f)**.

## Performance

Indicative numbers for 1000 calls, timing in microseconds.

Arduino UNO, 16 MHz, IDE 1.8.19

| function      |  0.1.0   |  0.1.1   |  0.2.0   |  notes       |
|:--------------|:--------:|:--------:|:--------:|:-------------|
| P_smaller     |  375396  |  365964  |  159536  |              |
| P_larger      |  384368  |  375032  |  169056  |              |
| P_between     |  265624  |  269176  |  150148  |              |
| normalize     |  44172   |  23024   |  23024   |              |
| bellCurve     |  255728  |  205460  |  192524  |              |
| approx.bell   |  764028  |  719184  |  333172  | see examples |

ESP32, 240 MHz, IDE 1.8.19

| function      |  0.1.0   |  0.1.1   |  0.2.0   |  notes       |
|:--------------|:--------:|:--------:|:--------:|:-------------|
| P_smaller     |    -     |   4046   |   1498   |              |
| P_larger      |    -     |   4043   |   1516   |              |
| P_between     |    -     |   3023   |   1569   |              |
| normalize     |    -     |   592    |   585    |              |
| bellCurve     |    -     |  13522   |  13133   |              |
| approx.bell   |    -     |   7300   |   2494   |              |

## Future

#### Must

- documentation
- test test test

#### Should

#### Could

- add examples
- add unit tests
- **VAL(probability = 0.75)** ==> 134 whatever
  - Returns the value for which the **CDF()** is at least probability.
  - Inverse of **P_smaller()** (how? binary search)

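
The binary search idea for the inverse of **P_smaller()** could look like the sketch below. It is one possible approach under stated assumptions, not a planned implementation: `cdf` and `valueForProbability` are hypothetical names, the exact CDF via `std::erfc` stands in for **P_smaller()**, and the search assumes stddev > 0, 0 < probability < 1 and a range of mean +- 8 stddev.

```cpp
#include <cmath>

// Exact CDF of a Gaussian, used here as a stand-in for P_smaller().
double cdf(double x, double mean, double stddev)
{
  return 0.5 * std::erfc(-(x - mean) / (stddev * std::sqrt(2.0)));
}

// Hypothetical inverse: the value for which the CDF reaches the given
// probability, found by bisection. Works because the CDF is monotonic.
double valueForProbability(double probability, double mean, double stddev)
{
  double lo = mean - 8.0 * stddev;
  double hi = mean + 8.0 * stddev;
  for (int i = 0; i < 60; i++)    // each step halves the interval
  {
    const double mid = 0.5 * (lo + hi);
    if (cdf(mid, mean, stddev) < probability) lo = mid;
    else hi = mid;
  }
  return hi;                      // CDF(hi) is at least probability
}
```

For the normalized distribution, `valueForProbability(0.75, 0, 1)` converges to about 0.6745. On a small microcontroller fewer iterations would probably be enough, since float precision is reached much earlier.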
#### Won't (unless requested)

- equality test Gauss objects
- does the stddev need to be positive? Yes.
- what happens if negative values are allowed? P curve is reversed.
- move code to .cpp file? (rather small lib).
- **void setMean(float f)** can be done with begin()
- **void setStddev(float f)** can be done with begin()
- optimize accuracy
  - (-6 .. 0) might be more accurate (significant digits)?

## Support

If you appreciate my libraries, you can support the development and maintenance.
Improve the quality of the libraries by providing issues and Pull Requests, or
donate through PayPal or GitHub sponsors.

Thank you,