Merge branch 'm1'

Rob Tillaart 2013-08-17 16:28:51 +02:00
commit 7730415547
5 changed files with 85 additions and 32 deletions

@ -0,0 +1,27 @@
FAQ ARDUINO STATISTICS LIB
=====================================
Q: Are individual samples still available?
No. The values added to the library are not stored, because that would use up memory very quickly. Instead, a few running values are kept, from which the most important statistics are calculated.
Q: How many samples can the lib hold? (internal variables and overflow)
The sample counter is an unsigned long (since 0.3.2), allowing up to 2^32 samples. In practice 'strange' things might happen before this number is reached. There are two internal variables: _sum, the sum of the values, and _ssq, the sum of the squared values (Statistic.cpp keeps a sum of squared differences, _ssqdif, instead). Both can overflow, and the squared sum in particular grows fast. The library does not protect against this.
There is a workaround for this (to some extent) if one knows the approximate average of the samples in advance. Before adding values to the lib, subtract the expected average, so that the sum of the samples stays around zero. This workaround has no influence on the standard deviation. !! Do not forget to add the expected average back to the calculated average. (Q: could this be built into the lib?)
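The offset workaround could look roughly like the sketch below. OffsetStats and its members are hypothetical names used only for illustration; they are not part of the library, and the sketch keeps plain sums rather than the library's internal variables.

```cpp
#include <cmath>

// Hypothetical helper, not part of the Statistic library: accumulates the
// samples after subtracting an expected average chosen by the caller, so
// that the running sums stay close to zero.
struct OffsetStats {
    float offset;          // expected average, supplied by the caller
    unsigned long n = 0;   // number of samples added
    float sum = 0.0f;      // sum of (x - offset)
    float ssq = 0.0f;      // sum of (x - offset)^2

    explicit OffsetStats(float expectedAverage) : offset(expectedAverage) {}

    void add(float x) {
        float d = x - offset;
        sum += d;
        ssq += d * d;
        n++;
    }

    // Add the offset back to recover the average of the original samples.
    float average() const { return n ? offset + sum / n : NAN; }

    // Shifting every sample by the same offset does not change the spread.
    float pop_stdev() const {
        if (n == 0) return NAN;
        float m = sum / n;
        float var = ssq / n - m * m;
        return std::sqrt(var < 0.0f ? 0.0f : var);  // clamp rounding noise
    }
};
```

For example, after adding 1000, 1001 and 1002 to OffsetStats s(1001.0f), the internal sum is 0 while s.average() still reports 1001.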
Q: How about the precision of the library?
The precision of the internal variables is limited because they are 32-bit floats (IEEE754). If the internal variable _sum has a large value, adding relatively small values to the dataset will no longer change it. The same is true for _ssq. One might argue that, statistically speaking, these small values are less significant, but that is in fact wrong.
There is a workaround for this (to some extent). If one has the samples in an array or on disk, one can sort them in increasing order of absolute value and add them from this sorted list. This minimizes the error, but it works only if all the samples are available and may be added in sorted order.
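A minimal desktop C++ sketch of the sorting workaround, assuming the samples fit in memory. sortedSum and naiveSum are illustrative names, not library functions.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Sums the samples in increasing order of absolute value, so the small
// values are accumulated before a large running total can swallow them.
// The vector is taken by value because the function sorts its own copy.
float sortedSum(std::vector<float> samples) {
    std::sort(samples.begin(), samples.end(),
              [](float a, float b) { return std::fabs(a) < std::fabs(b); });
    float sum = 0.0f;
    for (float x : samples) sum += x;
    return sum;
}

// Adds the same data in the order given, for comparison.
float naiveSum(const std::vector<float>& samples) {
    float sum = 0.0f;
    for (float x : samples) sum += x;
    return sum;
}
```

With IEEE754 single-precision floats and the samples {16777216, 1, 1, 1, 1}, naiveSum loses every 1 and returns 16777216, because 2^24 + 1 is not representable, whereas sortedSum adds the four ones first and returns 16777220.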
Q: When will internal var's overflow? esp. squared sum
IEEE754 32-bit floats have a maximum value of +- 3.4028235E+38. The internal sums overflow when they exceed this value.
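As a rough illustration of what this limit means for the squared sum, the sketch below estimates how many samples of a typical magnitude v fit before a float accumulator exceeds FLT_MAX. samplesUntilOverflow is a hypothetical helper and the estimate assumes every sample has about the same magnitude.

```cpp
#include <cfloat>

// Rough estimate, not a library function: the number of samples of typical
// magnitude v whose squares can be summed before a float accumulator
// exceeds FLT_MAX. Computed in double so the estimate itself cannot
// overflow; v == 0 yields infinity.
double samplesUntilOverflow(double v) {
    return static_cast<double>(FLT_MAX) / (v * v);
}
```

For v = 1023 (for instance the largest 10-bit analogRead() reading) the result is far larger than the 2^32 samples the counter can hold, so for such data the loss of precision described above becomes a problem long before overflow does.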
Q: Why are there two functions for stdev?
There are two stdev functions: pop_stdev(), the population stdev, and unbiased_stdev(), the unbiased stdev.
See Wikipedia for a detailed description of the difference between the two.
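The difference comes down to the divisor. The two-pass sketch below is for illustration only (the library computes its values in one pass); sumSquaredDiff, popStdev and unbiasedStdev are hypothetical names.

```cpp
#include <cmath>
#include <vector>

// Sum of squared differences from the mean, computed in two passes.
// Callers are expected to pass a non-empty vector.
float sumSquaredDiff(const std::vector<float>& x) {
    float mean = 0.0f;
    for (float v : x) mean += v;
    mean /= x.size();
    float ssd = 0.0f;
    for (float v : x) ssd += (v - mean) * (v - mean);
    return ssd;
}

// Population stdev: divide by N. Use when the samples are the whole population.
float popStdev(const std::vector<float>& x) {
    if (x.empty()) return NAN;
    return std::sqrt(sumSquaredDiff(x) / x.size());
}

// Unbiased stdev: divide by N - 1. Use when the samples are drawn from a
// larger population.
float unbiasedStdev(const std::vector<float>& x) {
    if (x.size() < 2) return NAN;
    return std::sqrt(sumSquaredDiff(x) / (x.size() - 1));
}
```

For the samples {2, 4, 4, 4, 5, 5, 7, 9} the sum of squared differences is 32, so popStdev gives sqrt(32/8) = 2 and unbiasedStdev gives sqrt(32/7), roughly 2.138. The gap between the two shrinks as the number of samples grows.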

@ -0,0 +1,11 @@
2012-05-19
-------------
This is a simple statistics library for the Arduino, version: 0.3.1
previous versions are not available.
2013-08-17
------------
version: 0.3.2
http://arduino.cc/playground/Main/Statistics

@ -1,6 +1,6 @@
//
// FILE: Statistic.cpp
// AUTHOR: Rob dot Tillaart at gmail dot com
// AUTHOR: Rob dot Tillaart at gmail dot com
// modified at 0.3 by Gil Ross at physics dot org
// VERSION: see STATISTIC_LIB_VERSION in .h
// PURPOSE: Recursive statistical library for Arduino
@ -9,22 +9,22 @@
// Rob Tillaart's Statistic library uses one-pass of the data (allowing
// each value to be discarded), but expands the Sum of Squares Differences to
// difference the Sum of Squares and the Average Squared. This is susceptible
// to bit length precision errors with the float type (only 5 or 6 digits
// to bit length precision errors with the float type (only 5 or 6 digits
// absolute precision) so for long runs and high ratios of
// the average value to standard deviation the estimate of the
// the average value to standard deviation the estimate of the
// standard error (deviation) becomes the difference of two large
// numbers and will tend to zero.
//
// For small numbers of iterations and small Average/SE the original code is
// likely to work fine.
// It should also be recognised that for very large samples, questions
// It should also be recognised that for very large samples, questions
// of stability of the sample assume greater importance than the
// correctnness of the asymptotic estimators.
// correctness of the asymptotic estimators.
//
// This recursive algorithm, which takes slightly more computation per
// iteration is numerically stable.
// It updates the number, mean, max, min and SumOfSquaresDiff each step to
// deliver max min average, population standard error (standard deviation) and
// deliver max min average, population standard error (standard deviation) and
// unbiased SE.
// -------------
//
@ -34,8 +34,13 @@
// 0.2.01 - 2010-10-30
// added minimum, maximum, unbiased stdev,
// changed counter to long -> int overflows @32K samples
// 0.3 - branched from 0.2.01 version of Rob Tillaart's code
// 0.3 - 2011-01-07
// branched from 0.2.01 version of Rob Tillaart's code
// 0.3.1 - minor edits
// 0.3.2 - 2012-11-10
// minor edits
// changed count -> unsigned long allows for 2^32 samples
// added variance()
//
// Released to the public domain
//
@ -49,14 +54,14 @@ Statistic::Statistic()
// resets all counters
void Statistic::clear()
{
{
_cnt = 0;
_sum = 0.0;
_min = 0.0;
_max = 0.0;
#ifdef STAT_USE_STDEV
_ssqdif = 0.0; // not _ssq but sum of square differences
// which is SUM(from i = 1 to N) of
// which is SUM(from i = 1 to N) of
// (f(i)-_ave_N)**2
#endif
}
@ -70,22 +75,22 @@ void Statistic::add(float f)
_max = f;
} else {
if (f < _min) _min = f;
if (f > _max) _max = f;
if (f > _max) _max = f;
}
_sum += f;
_cnt++;
#ifdef STAT_USE_STDEV
if (_cnt >1)
#ifdef STAT_USE_STDEV
if (_cnt >1)
{
_store = (_sum / _cnt - f);
_ssqdif = _ssqdif + _cnt * _store * _store / (_cnt-1);
}
}
#endif
}
// returns the number of values added
long Statistic::count()
unsigned long Statistic::count()
{
return _cnt;
}
@ -118,7 +123,14 @@ float Statistic::maximum()
// Population standard deviation = s = sqrt [ SUM ( Xi - µ )**2 / N ]
// http://www.suite101.com/content/how-is-standard-deviation-used-a99084
#ifdef STAT_USE_STDEV
#ifdef STAT_USE_STDEV
float Statistic::variance()
{
if (_cnt == 0) return NAN; // otherwise DIV0 error
return _ssqdif / _cnt;
}
float Statistic::pop_stdev()
{
if (_cnt == 0) return NAN; // otherwise DIV0 error
@ -130,5 +142,6 @@ float Statistic::unbiased_stdev()
if (_cnt < 2) return NAN; // otherwise DIV0 error
return sqrt( _ssqdif / (_cnt - 1));
}
#endif
// END OF FILE

@ -1,8 +1,8 @@
#ifndef Statistic_h
#define Statistic_h
//
//
// FILE: Statistic.h
// AUTHOR: Rob dot Tillaart at gmail dot com
// AUTHOR: Rob dot Tillaart at gmail dot com
// modified at 0.3 by Gil Ross at physics dot org
// PURPOSE: Recursive Statistical library for Arduino
// HISTORY: See Statistic.cpp
@ -16,27 +16,28 @@
#include <math.h>
#define STATISTIC_LIB_VERSION "0.3.1"
#define STATISTIC_LIB_VERSION "0.3.2"
class Statistic
class Statistic
{
public:
Statistic();
void clear();
void add(float);
long count();
unsigned long count();
float sum();
float average();
float minimum();
float maximum();
#ifdef STAT_USE_STDEV
float variance();
float pop_stdev(); // population stdev
float unbiased_stdev();
#endif
protected:
long _cnt;
unsigned long _cnt;
float _store; // store to minimise computation
float _sum;
float _min;

@ -1,41 +1,42 @@
//
// FILE: Average.pde
// AUTHOR: Rob dot Tillaart at gmail dot com
// VERSION: 0.1
// FILE: Average.ino
// AUTHOR: Rob dot Tillaart at gmail dot com
// VERSION: 0.2
// PURPOSE: Sample sketch for statistic library Arduino
//
#include "Statistic.h"
Statistic myStats;
Statistic myStats;
void setup(void)
void setup(void)
{
Serial.begin(9600);
Serial.println("Demo Statistics lib Average");
Serial.print("Demo Statistics lib ");
Serial.print(STATISTIC_LIB_VERSION);
myStats.clear(); //explicitly start clean
}
void loop(void)
void loop(void)
{
long rn = random(0, 9999);
myStats.add(rn/100.0 + 1);
if (myStats.count() == 10000)
{
Serial.print(" Count: ");
Serial.println(myStats.count());
Serial.println(myStats.count());
Serial.print(" Min: ");
Serial.println(myStats.minimum(),4);
Serial.print(" Max: ");
Serial.println(myStats.maximum(),4);
Serial.println(myStats.maximum(),4);
Serial.print(" Average: ");
Serial.println(myStats.average(), 4);
// uncomment in Statistic.h file to use stdev
// uncomment in Statistic.h file to use stdev
#ifdef STAT_USE_STDEV
Serial.print(" pop stdev: ");
Serial.println(myStats.pop_stdev(), 4);
Serial.print(" unbias stdev: ");
Serial.println(myStats.unbiased_stdev(), 4);
Serial.println(myStats.unbiased_stdev(), 4);
#endif
Serial.println("=====================================");
myStats.clear();