stats: module of statistical functions

PyHdust stats module: statistical tools

license:

GNU GPL v3.0 https://github.com/danmoser/pyhdust/blob/master/LICENSE

pyhdust.stats.cdf(x, xlim=None, savefig=False)

Display the CDF (cumulative distribution function) of a sample x.

A comparison is made with a Gaussian CDF and a linear one.
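As an illustration of what such a comparison involves, the sketch below builds an empirical CDF with NumPy and evaluates a Gaussian and a linear (uniform) reference on the same points. It is not the pyhdust implementation: the helper names `empirical_cdf`, `gaussian_cdf` and `linear_cdf` are hypothetical, and no plot is produced.

```python
import math
import numpy as np

def empirical_cdf(x):
    """Return the sorted sample and the empirical CDF evaluated at each value."""
    xs = np.sort(np.asarray(x, dtype=float))
    # The i-th sorted value has a fraction i/n of the sample at or below it.
    cdf = np.arange(1, xs.size + 1) / xs.size
    return xs, cdf

def gaussian_cdf(xs, mu, sigma):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    z = (xs - mu) / (sigma * math.sqrt(2.0))
    return np.array([0.5 * (1.0 + math.erf(v)) for v in z])

def linear_cdf(xs):
    """CDF of a uniform distribution spanning the range of the sorted sample."""
    return (xs - xs[0]) / (xs[-1] - xs[0])

rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=1.0, size=1000)

xs, cdf = empirical_cdf(sample)
gauss = gaussian_cdf(xs, sample.mean(), sample.std())
linear = linear_cdf(xs)

# Largest vertical distance between the empirical CDF and each reference.
max_dev_gauss = np.max(np.abs(cdf - gauss))
max_dev_linear = np.max(np.abs(cdf - linear))
print(max_dev_gauss, max_dev_linear)
```

For a normally distributed sample, the distance to the Gaussian reference should come out much smaller than the distance to the linear one.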

pyhdust.stats.corr_coef(x, y, clear_nan=True)

Pearson correlation coefficient for two arrays x and y of the same length.

See also scipy.stats.pearsonr()
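A minimal NumPy sketch of the same quantity is given below. The function name `pearson_r` is hypothetical, and the handling of `clear_nan` (dropping every pair in which either value is NaN) is an assumption about what the flag means, not a description of the pyhdust code.

```python
import numpy as np

def pearson_r(x, y, clear_nan=True):
    """Pearson correlation coefficient of two equal-length arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if clear_nan:
        # Assumption: a pair is discarded if either of its values is NaN.
        keep = ~(np.isnan(x) | np.isnan(y))
        x, y = x[keep], y[keep]
    # Off-diagonal element of the 2x2 correlation matrix.
    return np.corrcoef(x, y)[0, 1]

x = np.array([1.0, 2.0, 3.0, np.nan, 5.0])
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

r = pearson_r(x, y)  # the NaN pair is ignored; the rest is exactly linear
print(r)
```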

pyhdust.stats.corr_coef_cov(x, y, clear_nan=True)

Correlation coefficient based on the covariance of two arrays x and y of the same length.

\(\rho(x,y)=\mathrm{Cov}(x,y)/\sqrt{\mathrm{Var}(x)\,\mathrm{Var}(y)}\)

If \(\rho(x,y)=0\), we say that x and y are “uncorrelated”. If two variables are independent, their correlation is 0. However, as with covariance, the converse does not hold: a correlation of 0 does not imply independence.
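The sketch below evaluates the formula directly with NumPy and shows a case where the correlation vanishes although y is completely determined by x. The function name `corr_from_cov` is hypothetical and the population (not sample) covariance and variance are used; this is an illustration, not the pyhdust implementation.

```python
import numpy as np

def corr_from_cov(x, y):
    """rho(x, y) = Cov(x, y) / sqrt(Var(x) * Var(y)), population statistics."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / np.sqrt(x.var() * y.var())

x = np.linspace(-1.0, 1.0, 201)  # symmetric around zero

rho_linear = corr_from_cov(x, 3.0 * x + 1.0)  # exact linear relation
rho_dependent = corr_from_cov(x, x**2)        # y depends on x, yet rho vanishes
print(rho_linear, rho_dependent)
```

Here `rho_linear` is 1 and `rho_dependent` is 0 to within floating-point precision, because the dependence of x**2 on x is symmetric rather than linear.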

pyhdust.stats.corr_coef_cov_with_err(x, y, yerr, xerr=None, clear_nan=True, nsample=1000)

TO BE DONE. Correlation coefficient based on the covariance of two arrays x and y of the same length.

\(\rho(x,y)=\mathrm{Cov}(x,y)/\sqrt{\mathrm{Var}(x)\,\mathrm{Var}(y)}\)

If \(\rho(x,y)=0\), we say that x and y are “uncorrelated”. If two variables are independent, their correlation is 0. However, as with covariance, the converse does not hold: a correlation of 0 does not imply independence.
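Since this function is marked as unfinished, the following is only one possible reading of its signature, not the pyhdust behaviour. It assumes that `yerr` and `xerr` are Gaussian 1-sigma uncertainties and that `nsample` is the number of Monte Carlo realisations; the name `corr_with_err` and the `seed` argument are hypothetical.

```python
import numpy as np

def corr_with_err(x, y, yerr, xerr=None, nsample=1000, seed=None):
    """Monte Carlo estimate of the correlation coefficient and its scatter."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    yerr = np.broadcast_to(np.asarray(yerr, dtype=float), y.shape)
    if xerr is None:
        xerr = np.zeros_like(x)  # no uncertainty on x
    else:
        xerr = np.broadcast_to(np.asarray(xerr, dtype=float), x.shape)

    rhos = np.empty(nsample)
    for i in range(nsample):
        # Perturb every point within its (assumed Gaussian) uncertainty.
        xs = rng.normal(x, xerr)
        ys = rng.normal(y, yerr)
        rhos[i] = np.corrcoef(xs, ys)[0, 1]
    return rhos.mean(), rhos.std()

x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0
rho_mean, rho_std = corr_with_err(x, y, yerr=0.5, nsample=200, seed=1)
print(rho_mean, rho_std)
```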

pyhdust.stats.corr_coef_spearman(x, y, clear_nan=True)

Spearman’s correlation coefficient for two arrays x and y of the same length.

See also scipy.stats.spearmanr()
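Spearman’s coefficient is the Pearson coefficient of the ranks of the data, which the NumPy sketch below makes explicit. The helpers `ranks` and `spearman_r` are hypothetical and assume there are no tied values (ties would need averaged ranks); this is not the pyhdust implementation.

```python
import numpy as np

def ranks(a):
    """Ranks 1..n of the values in a, assuming no ties."""
    a = np.asarray(a, dtype=float)
    order = np.argsort(a)
    r = np.empty(a.size, dtype=float)
    r[order] = np.arange(1, a.size + 1)
    return r

def spearman_r(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return np.corrcoef(ranks(x), ranks(y))[0, 1]

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.exp(x)  # monotonic but strongly non-linear

rho_s = spearman_r(x, y)       # monotonic relation: ranks agree exactly
r_p = np.corrcoef(x, y)[0, 1]  # Pearson is reduced by the non-linearity
print(rho_s, r_p)
```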

pyhdust.stats.mad(data, axis=None)

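Return 1.48 × MAD (median absolute deviation). The factor of about 1.48 scales the MAD so that it estimates the standard deviation of normally distributed data.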

The MAD is a robust statistic, being more resilient to outliers in a data set than the standard deviation.
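The robustness can be seen in a short NumPy sketch, given here as an illustration, not as the pyhdust code. The name `scaled_mad` is hypothetical and the factor 1.48 is taken from the description above.

```python
import numpy as np

def scaled_mad(data, axis=None):
    """1.48 times the median absolute deviation from the median."""
    data = np.asarray(data, dtype=float)
    # keepdims lets the median broadcast against the data along `axis`.
    med = np.median(data, axis=axis, keepdims=True)
    return 1.48 * np.median(np.abs(data - med), axis=axis)

rng = np.random.default_rng(42)
clean = rng.normal(0.0, 1.0, size=10_000)

# Replace 1% of the points with a large outlier value.
dirty = clean.copy()
dirty[:100] = 1000.0

print(clean.std(), scaled_mad(clean))
print(dirty.std(), scaled_mad(dirty))
```

The outliers inflate the standard deviation by roughly two orders of magnitude, while the scaled MAD stays close to 1.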

pyhdust.stats.means(inarr, wtharr=None, quiet=False)

Calculate many “means” for a given input array inarr.

wtharr is the weights array (e.g., inverse of the uncertainty).

Returns simple, geom, harm, rms, median, mode (the arithmetic, geometric and harmonic means, the root mean square, the median and the mode).
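The definitions of most of these quantities can be sketched in a few lines of NumPy. The function `many_means` is hypothetical; how pyhdust applies the weights is not stated here, so the weighted forms below are an assumption, the median is left unweighted, and the mode is omitted.

```python
import numpy as np

def many_means(arr, weights=None):
    """Weighted arithmetic, geometric, harmonic and RMS means, plus the median."""
    arr = np.asarray(arr, dtype=float)
    if weights is None:
        w = np.ones_like(arr)
    else:
        w = np.asarray(weights, dtype=float)
    wsum = np.sum(w)

    simple = np.sum(w * arr) / wsum
    geom = np.exp(np.sum(w * np.log(arr)) / wsum)  # needs positive values
    harm = wsum / np.sum(w / arr)                  # needs non-zero values
    rms = np.sqrt(np.sum(w * arr**2) / wsum)
    median = np.median(arr)                        # unweighted
    return simple, geom, harm, rms, median

simple, geom, harm, rms, median = many_means([1.0, 2.0, 4.0])
print(simple, geom, harm, rms, median)
```

For positive data the results satisfy harm ≤ geom ≤ simple ≤ rms, which gives a quick sanity check on any implementation.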

pyhdust.stats.snr(count_rate, texp=1.0, nexp=1, npix=10.0, bg=10.0, dk=0.0, ron=2.0, var=0.0)

Calculate the signal-to-noise ratio based on Poisson statistics.

Parameters:
  • count_rate – rate of counts (e-/time)

  • npix – number of pixels for the given count

  • bg – background rate per pixel (e-/time)

  • dk – dark rate per pixel (e-/time)

  • ron – readout noise (single pixel, in e-)

  • var – variance on the source errors (e-)
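For orientation, the sketch below combines these parameters in the form of the commonly used CCD signal-to-noise equation. It is an assumption that pyhdust follows this form: the function name `snr_sketch`, the reading of `texp` and `nexp` as exposure time and number of exposures, and the way `var` enters the noise budget are all guesses made for illustration.

```python
import math

def snr_sketch(count_rate, texp=1.0, nexp=1, npix=10.0,
               bg=10.0, dk=0.0, ron=2.0, var=0.0):
    """Signal-to-noise ratio from a CCD-equation-style noise budget."""
    # Source electrons collected over all exposures.
    signal = count_rate * texp * nexp
    # Per-pixel variance of one exposure: background and dark current are
    # Poisson (variance equals counts); each read adds ron**2.
    per_pixel = (bg + dk) * texp + ron**2
    # Assumption: `var` is an extra variance term added once to the total.
    variance = signal + nexp * npix * per_pixel + var
    return signal / math.sqrt(variance)

ideal = snr_sketch(100.0, bg=0.0, ron=0.0)  # photon noise only
with_noise = snr_sketch(100.0)              # default background and readout
stacked = snr_sketch(100.0, nexp=4)         # four identical exposures
print(ideal, with_noise, stacked)
```

With photon noise only, the ratio reduces to the square root of the collected counts, and stacking four identical exposures doubles the ratio as long as `var` is zero.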

pyhdust.stats.summary(x, verbose=False)

Returns the summary of the variable: “median”, “minus sigma” and “plus sigma” ROBUST values (i.e., median and [15.9, 84.1] percentiles).

Example:

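import numpy as np
import pyhdust.stats as stt

# Compare the classical mean and standard deviation with the robust summary
# for Gaussian samples of increasing size (1 to 10**7 points, centred on 2).
for i in range(8):
    a = np.random.randn(10**i) + 2
    print(np.average(a), np.std(a), stt.summary(a))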
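
The percentile arithmetic behind such a summary can be sketched with NumPy alone. The function `robust_summary` is hypothetical, and returning the two sigma values as positive offsets from the median is an assumption about the output format, not the documented pyhdust behaviour.

```python
import numpy as np

def robust_summary(x):
    """Median and the distances to the 15.9 and 84.1 percentiles."""
    p_lo, med, p_hi = np.percentile(x, [15.9, 50.0, 84.1])
    # Assumption: the sigmas are reported as offsets from the median.
    return med, med - p_lo, p_hi - med

rng = np.random.default_rng(7)
a = rng.normal(loc=2.0, scale=1.0, size=100_000)

med, minus_sigma, plus_sigma = robust_summary(a)
print(med, minus_sigma, plus_sigma)
```

For a Gaussian sample, both offsets approach the standard deviation, because the 15.9 and 84.1 percentiles sit one sigma below and above the median.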