GY-63_MS5611/libraries/Correlation/README.md
2021-01-29 12:31:58 +01:00

Arduino CI License: MIT GitHub release

Correlation

Arduino Library to determine linear correlation between X and Y datasets.

Description

This library calculates the coefficients of the linear correlation between two (relatively small) datasets. The size of these datasets is 20 elements by default, but this can be changed in the constructor.

The formula of the correlation is expressed as Y = A + B * X.
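To make the roles of A, B and R concrete, here is a minimal standalone sketch of a textbook least-squares fit. It is not taken from the library source; the `LineFit` struct and the `fitLine()` function are hypothetical names used only for illustration, and the library may compute its values differently.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical result type for a fit of the form Y = A + B * X.
struct LineFit {
  float A;   // intercept
  float B;   // slope
  float R;   // correlation coefficient, in the range [-1, 1]
  bool ok;   // false when the fit could not be computed
};

// Textbook least-squares fit over n (x, y) pairs (illustration only).
LineFit fitLine(const float* x, const float* y, size_t n) {
  LineFit f = {0.0f, 0.0f, 0.0f, false};
  if (n == 0) return f;  // nothing to calculate

  // Averages of both datasets.
  float avgX = 0.0f, avgY = 0.0f;
  for (size_t i = 0; i < n; i++) {
    avgX += x[i];
    avgY += y[i];
  }
  avgX /= n;
  avgY /= n;

  // Sums of squared deviations and of cross products.
  float sxx = 0.0f, syy = 0.0f, sxy = 0.0f;
  for (size_t i = 0; i < n; i++) {
    float dx = x[i] - avgX;
    float dy = y[i] - avgY;
    sxx += dx * dx;
    syy += dy * dy;
    sxy += dx * dy;
  }

  // If all X or all Y values are equal, R is undefined; report failure.
  if (sxx == 0.0f || syy == 0.0f) return f;

  f.B = sxy / sxx;
  f.A = avgY - f.B * avgX;
  f.R = sxy / std::sqrt(sxx * syy);
  f.ok = true;
  return f;
}
```

For the points (1, 3), (2, 5), (3, 7), (4, 9) this gives A = 1, B = 2 and R = 1, since the data lie exactly on Y = 1 + 2 * X.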

Please note that each Correlation instance uses about 50 bytes, plus 2 floats (8 bytes) per pair of elements. So ~120 elements will use about 50% of the 2 KB RAM of an UNO.

Use with care.
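The arithmetic behind that estimate can be sketched as follows. The 50-byte overhead is the approximate figure quoted above, and the constant and function names are hypothetical, chosen only for this illustration.

```cpp
#include <cstddef>

// Approximate fixed cost per instance, as quoted in the text above.
constexpr size_t kOverheadBytes = 50;
// Two 4-byte floats are stored per (x, y) pair.
constexpr size_t kBytesPerPair = 2 * 4;
// SRAM available on an Arduino UNO (ATmega328P).
constexpr size_t kUnoSramBytes = 2048;

// Rough RAM estimate in bytes for an instance holding `pairs` elements.
constexpr size_t estimateBytes(size_t pairs) {
  return kOverheadBytes + pairs * kBytesPerPair;
}

// The same estimate as a whole percentage of the UNO's SRAM (rounded down).
constexpr size_t estimatePercentOfUno(size_t pairs) {
  return estimateBytes(pairs) * 100 / kUnoSramBytes;
}
```

With 120 pairs this comes to 50 + 120 * 8 = 1010 bytes, just under half of the UNO's 2048 bytes. The default size of 20 needs roughly 210 bytes.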

Interface

Constructor

  • Correlation(uint8_t size = 20) allocates the array needed and resets internal admin.
  • ~Correlation() frees the allocated arrays.

Base functions

  • bool add(x, y) adds a pair of floats to the internal storage. Returns true if the pair is added, and false when the internal arrays are full. When running correlation is enabled, a full array does not reject the pair: it replaces the oldest element and returns true.
  • uint8_t count() returns the number of items in the internal arrays.
  • uint8_t size() returns the size of the internal arrays.
  • void clear() resets the data structure to its start condition (zero elements added).
  • bool calculate() does the math to calculate the correlation parameters A, B and R. This function is called automatically when needed, but you can also call it yourself at a more convenient time. Returns false if there is nothing to calculate (count == 0).
  • float getA() returns the A parameter of formula Y = A + B * X
  • float getB() returns the B parameter of formula Y = A + B * X
  • float getR() returns the correlation coefficient R. The closer R is to 0, the less correlation there is between X and Y. The correlation can be positive or negative. Most often R squared, sqr(R), is used.
  • float getRsquare() returns sqr(R), which is always between 0 and 1.
  • float getEsquare() returns the error squared to get an indication of the quality of the relation.
  • float getAvgX() returns the average of all elements in the X dataset.
  • float getAvgY() returns the average of all elements in the Y dataset.
  • float getEstimateX(y) calculates the estimated X for a given Y.
  • float getEstimateY(x) calculates the estimated Y for a given X.

Running correlation

  • setRunningCorrelation(true | false) sets the internal variable runningMode which allows add() to overwrite old elements in the internal arrays.

The running correlation is calculated over the most recently added elements, up to the size of the internal arrays. This allows for more adaptive formula finding, e.g. finding the relation between temperature and humidity per hour.
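The overwrite behaviour can be pictured as a ring buffer. The sketch below is a standalone illustration of that idea under the add() semantics described above; `PairBuffer` and its members are hypothetical names, and the library's internal bookkeeping may differ.

```cpp
#include <cstdint>

// Hypothetical fixed-size store for (x, y) pairs.
template <uint8_t N>
struct PairBuffer {
  static_assert(N > 0, "buffer needs at least one slot");

  float x[N];
  float y[N];
  uint8_t count = 0;     // number of valid pairs stored
  uint8_t next = 0;      // slot the next pair will be written to
  bool running = false;  // when true, a full buffer overwrites its oldest pair

  // Returns false only when the buffer is full and running mode is off.
  bool add(float xv, float yv) {
    if (count == N && !running) return false;
    x[next] = xv;
    y[next] = yv;
    next = (next + 1) % N;    // wrap around to the oldest slot
    if (count < N) count++;   // count saturates at N
    return true;
  }
};
```

With a `PairBuffer<3>`, the fourth add() is rejected while `running` is false. Once `running` is set to true, the same call succeeds and overwrites slot 0, which held the oldest pair, so the fit always reflects the latest three pairs.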

Statistical

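These functions give an indication of the "trusted interval" for estimations. The idea is that for getEstimateX(), the further the input lies outside the range defined by getMinX() and getMaxX(), the less the result can be trusted. It also depends on R of course. The same holds for getEstimateY() with getMinY() and getMaxY().

  • float getMinX() returns the minimum value in the X dataset.
  • float getMaxX() returns the maximum value in the X dataset.
  • float getMinY() returns the minimum value in the Y dataset.
  • float getMaxY() returns the maximum value in the Y dataset.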
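One possible way to turn the min/max values into a rough trust indicator is sketched below. This is a heuristic chosen for illustration, not something the library provides; `extrapolationRatio()` is a hypothetical helper, and `lo` and `hi` stand for values such as those returned by getMinX() and getMaxX().

```cpp
#include <cmath>

// Returns 0 when v lies inside [lo, hi] (interpolation). Otherwise returns
// how far v lies beyond the nearest bound, as a fraction of the range width.
// A zero-width range yields infinity for any v outside it.
float extrapolationRatio(float v, float lo, float hi) {
  if (v >= lo && v <= hi) return 0.0f;
  float span = hi - lo;
  float dist = (v < lo) ? (lo - v) : (v - hi);
  return (span > 0.0f) ? dist / span : INFINITY;
}
```

For an observed range of 0 to 10, a value of 5 gives 0, a value of 15 gives 0.5, and a value of -10 gives 1.0. The larger the ratio, the more the estimate relies on extrapolation.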

Debugging / educational

Normally not used.

  • bool setXY(uint8_t idx, float x, float y) overwrites a pair of values. Returns true on success; idx must be < count.
  • bool setX(uint8_t idx, float x) overwrites single X.
  • bool setY(uint8_t idx, float y) overwrites single Y.
  • float getX(uint8_t idx) returns single value.
  • float getY(uint8_t idx) returns single value.
  • float getSumXiYi() returns sum(Xi * Yi).
  • float getSumXi2() returns sum(Xi * Xi).
  • float getSumYi2() returns sum(Yi * Yi).

Future

  • Template version: the constructor should get a TYPE parameter, as this allows smaller data types to be analyzed using less memory.

Operation

See the example.