0.1.5 infiniteAverage

rob tillaart 2021-12-20 16:20:02 +01:00
parent 0fb67ed221
commit 7935abba8a
10 changed files with 140 additions and 37 deletions

@ -1,6 +1,6 @@
MIT License
Copyright (c) 2021-2021 Rob Tillaart
Copyright (c) 2021-2022 Rob Tillaart
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal

@ -6,25 +6,26 @@
[![GitHub release](https://img.shields.io/github/release/RobTillaart/infiniteAverage.svg?maxAge=3600)](https://github.com/RobTillaart/infiniteAverage/releases)
# infinteAverage
# infiniteAverage
Arduino Library to calculate an average of many samples
Arduino Library to calculate an average of many samples.
## Description
This library is an experimental library that cascades a float and a uint32_t type.
It was created from an idea when an overflow was encountered in my Statistic Class
due too many samples.
due to too many samples. https://github.com/RobTillaart/statistic
#### Problem
As a 32 bit float has ~7 significant decimal digits of precision, after 10 million additions the sum
becomes 7 orders of magnitude larger than individual samples. From that moment
the addition will not increase the sum correctly or even not at all.
definitely becomes 7 orders of magnitude larger than individual samples.
From that moment the addition will not increase the sum correctly or even not at all.
(assume you add values between 0-100 e.g. temperatures)
Taking the average is taking the sum and divide that by the count of the numbers.
Taking the average means taking the sum of the samples and dividing it by the count.
Only if the count is fixed one could divide the samples first and then sum them.
This library supports the first scenario.
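
The stalling sum described above can be reproduced with a minimal sketch. The helper `naiveFloatSum()` is hypothetical and not part of the library; it assumes IEEE 754 single precision floats and simply adds the same value to a plain float many times.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only, not part of infiniteAverage.
// Adds `value` to a plain float accumulator `n` times and returns the sum.
float naiveFloatSum(float value, uint32_t n)
{
  float sum = 0.0f;
  for (uint32_t i = 0; i < n; i++)
  {
    // Once sum is about 2^24 times larger than value,
    // this addition no longer changes sum.
    sum += value;
  }
  return sum;
}
```

On a platform with IEEE 754 floats, `naiveFloatSum(1.0f, 20000000)` should return 16777216.0 (2^24) instead of 20000000.0, because 16777217 cannot be represented in a 32 bit float.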
@ -35,16 +36,18 @@ To cope with the overflow problem, this lib uses an float combined with an uint3
The float is used for the decimal part and the uint32_t for the whole part.
In theory this should give about 15 significant digits for the average in a 9.6 format.
but this precision is only internal to do some math. WHen the average() is calculated
But this precision is only used internally for the math. When average() is calculated
the value returned is "just" a float.
(since 0.1.2)
If the library detects that there are 2 billion++ (0x8000000) added or if the whole
part of the sum reaches that number, all internal counters are divided by 2.
That does not affect the minimum and maximum and the average only slightly.
If the library detects that 4294967000 (almost 2^32) samples have been added, or
that the whole part of the internal sum has reached a threshold (default 2^30, about 1 billion),
the internal counter and sum are divided by 2.
This does not affect the minimum and maximum, and changes the average only very slightly.
Since version 0.1.4 users can change this threshold and adjust it to data added.
NB if you add only small values e.g between 0..100 this threshold may be at 4 billion.
Since 0.1.4 users can change this threshold and adjust it to data added.
Depending on the data and the maximum value per sample this can have side effects.
Use at your own risk.
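
The scale-back mechanism can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the struct `HalvingAverage` and its member names are hypothetical, samples are assumed to be non-negative and far below 2^32, and minimum and maximum tracking is left out.

```cpp
#include <cmath>
#include <cstdint>

// Illustrative sketch only; names are hypothetical, not the library's API.
// Assumes non-negative samples that are far smaller than 2^32.
struct HalvingAverage
{
  uint32_t whole = 0;                // integer part of the running sum
  float    frac = 0.0f;              // fractional part of the sum, kept in [0, 1)
  uint32_t count = 0;                // number of samples, possibly scaled back
  uint32_t threshold = (1UL << 30);  // whole part at which sum and count are halved

  void add(float value)
  {
    count++;
    frac += value;
    uint32_t carry = (uint32_t)frac;  // move the integer part into `whole`
    whole += carry;
    frac -= (float)carry;

    if ((whole >= threshold) || (count >= 4294967000UL))
    {
      if (whole & 1) frac += 1.0f;    // keep the 0.5 that whole / 2 would drop
      count /= 2;
      whole /= 2;
      frac /= 2;
    }
  }

  float average() const
  {
    if (count == 0) return NAN;
    return (float)(((double)whole + frac) / count);
  }
};
```

With `threshold` set to `1UL << 20` and 100000 samples of 50.0 added, `count` should end up well below 100000 while `average()` still reports 50.0, because sum and count are always halved together.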
#### Conclusion (for now)
@ -73,18 +76,38 @@ First get more hands-on experience with it.
- **float decimals()** returns the internal float = decimals part.
- **uint32_t whole()** returns the internal whole part.
- **uint32_t count()** returns the number of values added.
Note this may be scaled back a factor of 2 or more.
Note this may be scaled back by a power of 2 (2, 4, 8, 16, ...).
- **float average()** returns the average in float format, or NAN if count == 0
- **float minimum()** returns the minimum in float format, or NAN if count == 0
- **float maximum()** returns the maximum in float format, or NAN if count == 0
#### Threshold
(since 0.1.4) User can set the value when the sum and count are divided by two, to prevent internal counters to overflow. Default at startup this value is (1UL << 31).
### 0.1.4
Users can set a threshold value to prevent the internal sum from overflowing.
By default this value is (1UL << 30) at startup; depending on the maximum value
per sample added it should be set lower.
When the threshold is reached both the sum and the internal counter are divided by 2.
This keeps the average almost the same.
The internal sample counter triggers the divide by 2 action when 4294967000
samples have been added. That is a lot: roughly 1 sample per second for 136 years,
or 1000 samples per second for almost 50 days.
- **void setDivideThreshold(uint32_t threshold)**
- **uint32_t getDivideThreshold()**
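
How long it takes to reach the sample counter limit can be estimated with a small calculation. The names `COUNT_LIMIT` and `daysUntilCountLimit()` are hypothetical and only serve this example; the constant is the limit mentioned in the text.

```cpp
#include <cstdint>

// Illustration only: the sample count at which the divide by 2 is triggered.
const uint32_t COUNT_LIMIT = 4294967000UL;

// Days of continuous sampling before the count limit is reached.
double daysUntilCountLimit(double samplesPerSecond)
{
  const double secondsPerDay = 86400.0;
  return COUNT_LIMIT / samplesPerSecond / secondsPerDay;
}
```

At 1000 samples per second this gives about 49.7 days; at 1 sample per second about 49710 days, which is roughly 136 years.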
### 0.1.5
- Fixed a rounding error of the whole part when dividing by 2.
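
A minimal sketch of why the fix is needed, based on the statements in the infiniteAverage.h change of this commit. The helper `halveSum()` is hypothetical; the sum is assumed to be stored as a whole part plus a fractional part in [0, 1).

```cpp
#include <cstdint>

// Hypothetical helper for illustration only.
// Halves a sum stored as whole part + fractional part.
void halveSum(uint32_t &whole, float &frac)
{
  // Integer division drops 0.5 when whole is odd,
  // so move one unit into the fractional part first.
  if (whole & 1) frac += 1.0f;
  whole /= 2;
  frac /= 2;
}
```

For example, a sum of 7.5 (whole = 7, frac = 0.5) becomes whole = 3 and frac = 0.75, which is 3.75. Without the extra line the result would be 3.25.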
The threshold value should be as large as possible to get an accurate average.
If the threshold is small compared to the maximum sample value, there will be
side effects that might break your project: the average will tend towards the
average of the last added values. So be careful!
## Operation
See examples
@ -102,10 +125,11 @@ to get around 28 significant digits => 18.10 format
This would allow to adjust to known order of size of the numbers.
(e.g. if numbers are all in the billions the uint32_t would overflow very fast)
- investigate other math with this data type, starting with + - / \* ?
- printable interface?
- printable interface? sprintf() ?
- play if time permits.
- update documentation
- add examples
- \_overflow => \_wholePart
**0.2.0**
- add negative numbers

@ -1,11 +1,8 @@
//
// FILE: IA_test.ino
// AUTHOR: Rob Tillaart
// VERSION: 0.1.0
// PURPOSE: demo
// DATE: 2021-01-21
// (c) : MIT
//
#include "infiniteAverage.h"
@ -21,12 +18,13 @@ void setup()
Serial.println(__FILE__);
IA.reset();
IA.setDivideThreshold(1024);
while (1)
{
IA.add(random(10000) * 0.0001);
if (millis() - lastTime >= 1000)
if (millis() - lastTime >= 500)
{
lastTime = millis();
Serial.print(IA.count());
@ -48,3 +46,4 @@ void loop()
// -- END OF FILE --

@ -0,0 +1,55 @@
//
// FILE: IA_test_threshold.ino
// AUTHOR: Rob Tillaart
// PURPOSE: demo
// DATE: 2021-12-20
#include "infiniteAverage.h"
IAVG iavg;
void setup()
{
Serial.begin(115200);
Serial.println(__FILE__);
iavg.reset();
for (int i = 0; i < 1000; i++)
{
iavg.add(1.0 * i);
}
Serial.print(iavg.count());
Serial.print("\t");
Serial.print(iavg.whole());
Serial.print("\t");
Serial.print(iavg.average());
Serial.print("\n");
// shows the effects of small thresholds with non-uniform data
for (uint32_t th = 2048; th <= 1000000UL; th *= 2)
{
iavg.reset();
iavg.setDivideThreshold(th);
for (int i = 0; i < 1000; i++)
{
iavg.add(1.0 * i);
}
Serial.print(th);
Serial.print("\t");
Serial.print(iavg.count());
Serial.print("\t");
Serial.print(iavg.whole());
Serial.print("\t");
Serial.print(iavg.average());
Serial.print("\n");
}
}
void loop()
{
}
// -- END OF FILE --

@ -1,11 +1,9 @@
//
// FILE: IA_test.ino
// AUTHOR: Rob Tillaart
// VERSION: 0.1.0
// PURPOSE: demo
// DATE: 2021-01-21
// (c) : MIT
//
#include "infiniteAverage.h"
@ -60,3 +58,4 @@ void loop()
// -- END OF FILE --

@ -2,14 +2,14 @@
//
// FILE: infiniteAverage.h
// AUTHOR: Rob Tillaart
// VERSION: 0.1.4
// VERSION: 0.1.5
// PURPOSE: Calculate the average of a very large number of values.
// URL: https://github.com/RobTillaart/I2C_24FC1025
#include "Arduino.h"
#define IAVG_LIB_VERSION (F("0.1.4"))
#define IAVG_LIB_VERSION (F("0.1.5"))
class IAVG
@ -52,9 +52,11 @@ public:
_overflow++;
_sum -= 1;
}
// scale back factor 2 when overflow comes near
if ((_count >= _threshold)|| (_overflow >= _threshold))
// scale back factor 2 when overflow comes near
// TODO abs(_overflow)
if ( (_overflow >= _threshold) || (_count >= 4294967000 ) )
{
if (_overflow & 1) _sum += 1.0; // fix rounding error.
_count /= 2;
_overflow /= 2;
_sum /= 2;
@ -119,7 +121,7 @@ private:
float _maximum = 0;
uint32_t _overflow = 0;
uint32_t _count = 0;
uint32_t _threshold = (1UL << 31);
uint32_t _threshold = (1UL << 30);
};

@ -4,12 +4,12 @@
infinteAverage KEYWORD1
IAVG KEYWORD1
# Methods and Functions (KEYWORD2)
IAVG KEYWORD2
# Methods and Functions (KEYWORD2)
reset KEYWORD2
add KEYWORD2
average KEYWORD2
decimals KEYWORD2
whole KEYWORD2
count KEYWORD2

@ -15,8 +15,9 @@
"type": "git",
"url": "https://github.com/RobTillaart/infiniteAverage"
},
"version": "0.1.4",
"version": "0.1.5",
"license": "MIT",
"frameworks": "arduino",
"platforms": "*"
"platforms": "*",
"headers": "infiniteAverage.h"
}

@ -1,5 +1,5 @@
name=infiniteAverage
version=0.1.4
version=0.1.5
author=Rob Tillaart <rob.tillaart@gmail.com>
maintainer=Rob Tillaart <rob.tillaart@gmail.com>
sentence=Experimental Arduino Library to calculate a high precision average of many samples

@ -37,6 +37,7 @@
unittest_setup()
{
fprintf(stderr, "IAVG_LIB_VERSION: %s\n", (char *) IAVG_LIB_VERSION);
}
@ -47,8 +48,6 @@ unittest_teardown()
unittest(test_constructor)
{
fprintf(stderr, "VERSION: %s\n", IAVG_LIB_VERSION);
IAVG iavg;
assertEqual(0, iavg.count());
@ -59,8 +58,6 @@ unittest(test_constructor)
unittest(test_add)
{
fprintf(stderr, "VERSION: %s\n", IAVG_LIB_VERSION);
IAVG iavg;
iavg.add(10000000);
@ -77,6 +74,32 @@ unittest(test_add)
}
unittest(test_threshold)
{
IAVG iavg;
iavg.reset();
for (int i = 0; i < 1000; i++)
{
iavg.add(1.0 * i);
}
fprintf(stderr, "%d \t%d \t%f\n", iavg.count(), iavg.whole(), iavg.average());
// shows the effects of (relative) small thresholds with non-uniform data
for (uint32_t th = 100000; th < 1000000; th += 100000)
{
iavg.reset();
iavg.setDivideThreshold(th);
for (int i = 0; i < 1000; i++)
{
iavg.add(1.0 * i);
}
fprintf(stderr, "%3d %d \t%d \t%f\n", th, iavg.count(), iavg.whole(), iavg.average());
}
}
unittest_main()