stats.py module
(Requires pstat.py module.)
Written by: Gary Strangman. Last modified: Dec 28, 2000.
A collection of basic statistical functions for python. The function names appear below.
IMPORTANT: There are really *3* sets of functions. The first set has an 'l' prefix and can be used with list or tuple arguments. The second set has an 'a' prefix and accepts NumPy array arguments; these functions are defined only when NumPy is available on the system. The third set has NO prefix (i.e., has the name that appears below). Functions in this set are members of a "Dispatch" class, c/o David Ascher, which allows different functions to be called depending on the type of the passed arguments. Thus, stats.mean is a member of the Dispatch class: stats.mean(range(20)) will call stats.lmean(range(20)), while stats.mean(Numeric.arange(20)) will call stats.amean(Numeric.arange(20)). This is a handy way to keep function names consistent when different argument types require different functions.
Having implemented the Dispatch class, however, means that to get info on a given function you must use the REAL function name: "print stats.lmean.__doc__" and "print stats.amean.__doc__" work fine, while "print stats.mean.__doc__" prints the doc for the Dispatch class.
NUMPY FUNCTIONS ('a' prefix) generally have more argument options but should otherwise be consistent with the corresponding list functions.
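The type-based dispatch described above can be sketched in a few lines. This is only an illustration of the idea, not the module's actual Dispatch class: the names TypeDispatch, list_mean and array_mean are hypothetical, and the standard-library array.array stands in for a NumPy array so that the sketch runs without extra packages (Python 3 is assumed).

```python
from array import array


class TypeDispatch:
    """Call a different function depending on the type of the first argument."""

    def __init__(self, *pairs):
        # Each pair is (function, tuple_of_types), mirroring the
        # Dispatch((lmean, (ListType, TupleType)), ...) lines in this module.
        self._table = {}
        for func, types in pairs:
            for t in types:
                self._table[t] = func

    def __call__(self, first, *args, **kwargs):
        # Look up the exact type of the first argument and forward the call.
        func = self._table.get(type(first))
        if func is None:
            raise TypeError("no function registered for %s" % type(first).__name__)
        return func(first, *args, **kwargs)


def list_mean(values):
    """Hypothetical list/tuple version."""
    return sum(values) / float(len(values))


def array_mean(values):
    """Hypothetical array version (array.array stands in for a NumPy array)."""
    return sum(values) / float(len(values))


mean = TypeDispatch((list_mean, (list, tuple)), (array_mean, (array,)))

list_result = mean([1, 2, 3, 4])              # routed to list_mean
array_result = mean(array("d", [1.0, 3.0]))   # routed to array_mean
```

In this sketch, mean.__doc__ evaluates to the docstring of TypeDispatch rather than that of list_mean or array_mean, which is the same effect the text warns about for stats.mean.__doc__.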
Disclaimers: The function list is obviously incomplete and, worse, the functions are not optimized. All functions have been tested (some more so than others), but they are far from bulletproof. Thus, as with any free software, no warranty or guarantee is expressed or implied. :-) A few extra functions that don't appear in the list below can be found by interested treasure-hunters. These functions don't necessarily have both list and array versions but were deemed useful.
CENTRAL TENDENCY: geometricmean, harmonicmean, mean, median, medianscore, mode
MOMENTS: moment, variation, skew, kurtosis, skewtest (for Numpy arrays only), kurtosistest (for Numpy arrays only), normaltest (for Numpy arrays only)
ALTERED VERSIONS: tmean (for Numpy arrays only), tvar (for Numpy arrays only), tmin (for Numpy arrays only), tmax (for Numpy arrays only), tstdev (for Numpy arrays only), tsem (for Numpy arrays only), describe
FREQUENCY STATS: itemfreq, scoreatpercentile, percentileofscore, histogram, cumfreq, relfreq
VARIABILITY: obrientransform, samplevar, samplestdev, signaltonoise (for Numpy arrays only), var, stdev, sterr, sem, z, zs, zmap (for Numpy arrays only)
TRIMMING FCNS: threshold (for Numpy arrays only), trimboth, trim1, round (round all vals to 'n' decimals; Numpy only)
CORRELATION FCNS: covariance (for Numpy arrays only), correlation (for Numpy arrays only), paired, pearsonr, spearmanr, pointbiserialr, kendalltau, linregress
INFERENTIAL STATS: ttest_1samp, ttest_ind, ttest_rel, chisquare, ks_2samp, mannwhitneyu, ranksums, wilcoxont, kruskalwallish, friedmanchisquare
PROBABILITY CALCS: chisqprob, erfcc, zprob, ksprob, fprob, betacf, gammln, betai
ANOVA FUNCTIONS: F_oneway, F_value
SUPPORT FUNCTIONS: writecc, incr, sign (for Numpy arrays only), sum, cumsum, ss, summult, sumdiffsquared, square_of_sums, shellsort, rankdata, outputpairedstats, findwithin
|
|||
|
Dispatch The Dispatch class, care of David Ascher, allows different functions to be called depending on the argument types. |
|
|||
|
lgeometricmean(inlist) Calculates the geometric mean of the values in the passed list. |
||
|
lharmonicmean(inlist) Calculates the harmonic mean of the values in the passed list. |
||
|
lmean(inlist) Returns the arithmetic mean of the values in the passed list. |
||
|
lmedian(inlist,
numbins=1000) Returns the computed median value of a list of numbers, given the number of bins to use for the histogram (more bins brings the computed value closer to the median score, default number of bins = 1000). |
||
|
lmedianscore(inlist) Returns the 'middle' score of the passed list. |
||
|
lmode(inlist) Returns a list of the modal (most common) score(s) in the passed list. |
||
|
lmoment(inlist,
moment=1) Calculates the nth moment about the mean for a sample (defaults to the 1st moment). |
||
|
lvariation(inlist) Returns the coefficient of variation, as defined in CRC Standard Probability and Statistics, p.6. |
||
|
lskew(inlist) Returns the skewness of a distribution, as defined in Numerical Recipes (alternate defn in CRC Standard Probability and Statistics, p.6.) |
||
|
lkurtosis(inlist) Returns the kurtosis of a distribution, as defined in Numerical Recipes (alternate defn in CRC Standard Probability and Statistics, p.6.) |
||
|
ldescribe(inlist) Returns some descriptive statistics of the passed list (assumed to be 1D). |
||
|
litemfreq(inlist) Returns a list of pairs. |
||
|
lscoreatpercentile(inlist,
percent) Returns the score at a given percentile relative to the distribution given by inlist. |
||
|
lpercentileofscore(inlist,
score,
histbins=10,
defaultlimits=None) Returns the percentile value of a score relative to the distribution given by inlist. |
||
|
lhistogram(inlist,
numbins=10,
defaultreallimits=None,
printextras=0) Returns (i) a list of histogram bin counts, (ii) the smallest value of the histogram binning, and (iii) the bin width (the last 2 are not necessarily integers). |
||
|
lcumfreq(inlist,
numbins=10,
defaultreallimits=None) Returns a cumulative frequency histogram, using the histogram function. |
||
|
lrelfreq(inlist,
numbins=10,
defaultreallimits=None) Returns a relative frequency histogram, using the histogram function. |
||
|
lobrientransform(*args) Computes a transform on input data (any number of columns). |
||
|
lsamplevar(inlist) Returns the variance of the values in the passed list using N for the denominator (i.e., DESCRIBES the sample variance only). |
||
|
lsamplestdev(inlist) Returns the standard deviation of the values in the passed list using N for the denominator (i.e., DESCRIBES the sample stdev only). |
||
|
lvar(inlist) Returns the variance of the values in the passed list using N-1 for the denominator (i.e., for estimating population variance). |
||
|
lstdev(inlist) Returns the standard deviation of the values in the passed list using N-1 in the denominator (i.e., to estimate population stdev). |
||
|
lsterr(inlist) Returns the standard error of the values in the passed list using N-1 in the denominator (i.e., to estimate population standard error). |
||
|
lsem(inlist) Returns the estimated standard error of the mean (sx-bar) of the values in the passed list. |
||
|
lz(inlist,
score) Returns the z-score for a given input score, given that score and the list from which that score came. |
||
|
lzs(inlist) Returns a list of z-scores, one for each score in the passed list. |
||
|
ltrimboth(l,
proportiontocut) Slices off the passed proportion of items from BOTH ends of the passed list (i.e., with proportiontocut=0.1, slices 'leftmost' 10% AND 'rightmost' 10% of scores). |
||
|
ltrim1(l,
proportiontocut,
tail='right') Slices off the passed proportion of items from ONE end of the passed list (i.e., if proportiontocut=0.1, slices off 'leftmost' or 'rightmost' 10% of scores). |
||
|
lpaired(x,
y) Interactively determines the type of data and then runs the appropriate statistic for paired group data. |
||
|
lpearsonr(x,
y) Calculates a Pearson correlation coefficient and the associated probability value. |
||
|
lspearmanr(x,
y) Calculates a Spearman rank-order correlation coefficient. |
||
|
lpointbiserialr(x,
y) Calculates a point-biserial correlation coefficient and the associated probability value. |
||
|
lkendalltau(x,
y) Calculates Kendall's tau ... |
||
|
llinregress(x,
y) Calculates a regression line on x,y pairs. |
||
|
lttest_1samp(a,
popmean,
printit=0,
name='Sample',
writemode='a') Calculates the t-obtained for the independent samples T-test on ONE group of scores a, given a population mean. |
||
|
lttest_ind(a,
b,
printit=0,
name1='Samp1',
name2='Samp2',
writemode='a') Calculates the t-obtained T-test on TWO INDEPENDENT samples of scores a, and b. |
||
|
lttest_rel(a,
b,
printit=0,
name1='Sample1',
name2='Sample2',
writemode='a') Calculates the t-obtained T-test on TWO RELATED samples of scores, a and b. |
||
|
lchisquare(f_obs,
f_exp=None) Calculates a one-way chi square for list of observed frequencies and returns the result. |
||
|
lks_2samp(data1,
data2) Computes the Kolmogorov-Smirnov statistic on 2 samples. |
||
|
lmannwhitneyu(x,
y) Calculates a Mann-Whitney U statistic on the provided scores and returns the result. |
||
|
ltiecorrect(rankvals) Corrects for ties in Mann Whitney U and Kruskal Wallis H tests. |
||
|
lranksums(x,
y) Calculates the rank sums statistic on the provided scores and returns the result. |
||
|
lwilcoxont(x,
y) Calculates the Wilcoxon T-test for related samples and returns the result. |
||
|
lkruskalwallish(*args) The Kruskal-Wallis H-test is a non-parametric ANOVA for 3 or more groups, requiring at least 5 subjects in each group. |
||
|
lfriedmanchisquare(*args) Friedman Chi-Square is a non-parametric, one-way within-subjects ANOVA. |
||
|
lchisqprob(chisq,
df) Returns the (1-tailed) probability value associated with the provided chi-square value and df. |
||
|
lerfcc(x) Returns the complementary error function erfc(x) with fractional error everywhere less than 1.2e-7. |
||
|
lzprob(z) Returns the area under the normal curve 'to the left of' the given z value. |
||
|
lksprob(alam) Computes a Kolmogorov-Smirnov test significance level. |
||
|
lfprob(dfnum,
dfden,
F) Returns the (1-tailed) significance level (p-value) of an F statistic given the degrees of freedom for the numerator (dfR-dfF) and the degrees of freedom for the denominator (dfF). |
||
|
lbetacf(a,
b,
x) This function evaluates the continued fraction form of the incomplete Beta function, betai. |
||
|
lgammln(xx) Returns the natural logarithm of the gamma function of xx. |
||
|
lbetai(a,
b,
x) Returns the incomplete beta function: |
||
|
lF_oneway(*lists) Performs a 1-way ANOVA, returning an F-value and probability given any number of groups. |
||
|
lF_value(ER,
EF,
dfnum,
dfden) Returns an F-statistic given the following: |
||
|
writecc(listoflists,
file,
writetype='w',
extra=2) Writes a list of lists to a file in columns, customized by the max size of items within the columns (max size of items in col, +2 characters) to specified file. |
||
|
lincr(l,
cap) Simulate a counting system from an n-dimensional list. |
||
|
lsum(inlist) Returns the sum of the items in the passed list. |
||
|
lcumsum(inlist) Returns a list consisting of the cumulative sum of the items in the passed list. |
||
|
lss(inlist) Squares each value in the passed list, adds up these squares and returns the result. |
||
|
lsummult(list1,
list2) Multiplies elements in list1 and list2, element by element, and returns the sum of all resulting multiplications. |
||
|
lsumdiffsquared(x,
y) Takes pairwise differences of the values in lists x and y, squares these differences, and returns the sum of these squares. |
||
|
lsquare_of_sums(inlist) Adds the values in the passed list, squares the sum, and returns the result. |
||
|
lshellsort(inlist) Shellsort algorithm. |
||
|
lrankdata(inlist) Ranks the data in inlist, dealing with ties appropriately. |
||
|
outputpairedstats(fname,
writemode,
name1,
n1,
m1,
se1,
min1,
max1,
name2,
n2,
m2,
se2,
min2,
max2,
statname,
stat,
prob) Prints, or writes to a file, stats for two groups, using the name, n, mean, sterr, min and max for each group, as well as the statistic name, its value, and the associated p-value. |
||
|
lfindwithin(data) Returns an integer representing a binary vector, where 1=within- subject factor, 0=between. |
||
|
ageometricmean(inarray,
dimension=None,
keepdims=0) Calculates the geometric mean of the values in the passed array. |
||
|
aharmonicmean(inarray,
dimension=None,
keepdims=0) Calculates the harmonic mean of the values in the passed array. |
||
|
amean(inarray,
dimension=None,
keepdims=0) Calculates the arithmetic mean of the values in the passed array. |
||
|
amedian(inarray,
numbins=1000) Calculates the COMPUTED median value of an array of numbers, given the number of bins to use for the histogram (more bins approaches finding the precise median value of the array; default number of bins = 1000). |
||
|
amedianscore(inarray,
dimension=None) Returns the 'middle' score of the passed array. |
||
|
amode(a,
dimension=None) Returns an array of the modal (most common) score in the passed array. |
||
|
atmean(a,
limits=None,
inclusive=(1,1)) Returns the arithmetic mean of all values in an array, ignoring values strictly outside the sequence passed to 'limits'. |
||
|
atvar(a,
limits=None,
inclusive=(1,1)) Returns the sample variance of values in an array, (i.e., using N-1), ignoring values strictly outside the sequence passed to 'limits'. |
||
|
atmin(a,
lowerlimit=None,
dimension=None,
inclusive=1) Returns the minimum value of a, along dimension, including only values greater than (or equal to, if inclusive=1) lowerlimit. |
||
|
atmax(a,
upperlimit,
dimension=None,
inclusive=1) Returns the maximum value of a, along dimension, including only values less than (or equal to, if inclusive=1) upperlimit. |
||
|
atstdev(a,
limits=None,
inclusive=(1,1)) Returns the standard deviation of all values in an array, ignoring values strictly outside the sequence passed to 'limits'. |
||
|
atsem(a,
limits=None,
inclusive=(1,1)) Returns the standard error of the mean for the values in an array, (i.e., using N for the denominator), ignoring values strictly outside the sequence passed to 'limits'. |
||
|
amoment(a,
moment=1,
dimension=None) Calculates the nth moment about the mean for a sample (defaults to the 1st moment). |
||
|
avariation(a,
dimension=None) Returns the coefficient of variation, as defined in CRC Standard Probability and Statistics, p.6. |
||
|
askew(a,
dimension=None) Returns the skewness of a distribution (normal ==> 0.0; >0 means extra weight in left tail). |
||
|
akurtosis(a,
dimension=None) Returns the kurtosis of a distribution (normal ==> 3.0; >3 means heavier in the tails, and usually more peaked). |
||
|
adescribe(inarray,
dimension=None) Returns several descriptive statistics of the passed array. |
||
|
askewtest(a,
dimension=None) Tests whether the skew is significantly different from a normal distribution. |
||
|
akurtosistest(a,
dimension=None) Tests whether a dataset has normal kurtosis (i.e., kurtosis=3(n-1)/(n+1)). Valid only for n>20. |
||
|
anormaltest(a,
dimension=None) Tests whether skew and/OR kurtosis of dataset differs from normal curve. |
||
|
aitemfreq(a) Returns a 2D array of item frequencies. |
||
|
ascoreatpercentile(inarray,
percent) Usage: ascoreatpercentile(inarray,percent) 0<percent<100 Returns: score at given percentile, relative to inarray distribution |
||
|
apercentileofscore(inarray,
score,
histbins=10,
defaultlimits=None) Note: result of this function depends on the values used to histogram the data(!). |
||
|
ahistogram(inarray,
numbins=10,
defaultlimits=None,
printextras=1) Returns (i) an array of histogram bin counts, (ii) the smallest value of the histogram binning, and (iii) the bin width (the last 2 are not necessarily integers). |
||
|
acumfreq(a,
numbins=10,
defaultreallimits=None) Returns a cumulative frequency histogram, using the histogram function. |
||
|
arelfreq(a,
numbins=10,
defaultreallimits=None) Returns a relative frequency histogram, using the histogram function. |
||
|
aobrientransform(*args) Computes a transform on input data (any number of columns). |
||
|
asamplevar(inarray,
dimension=None,
keepdims=0) Returns the sample variance of the values in the passed array (i.e., using N). |
||
|
asamplestdev(inarray,
dimension=None,
keepdims=0) Returns the sample standard deviation of the values in the passed array (i.e., using N). |
||
|
asignaltonoise(instack,
dimension=0) Calculates signal-to-noise. |
||
|
avar(inarray,
dimension=None,
keepdims=0) Returns the estimated population variance of the values in the passed array (i.e., N-1). |
||
|
astdev(inarray,
dimension=None,
keepdims=0) Returns the estimated population standard deviation of the values in the passed array (i.e., N-1). |
||
|
asterr(inarray,
dimension=None,
keepdims=0) Returns the estimated population standard error of the values in the passed array (i.e., N-1). |
||
|
asem(inarray,
dimension=None,
keepdims=0) Returns the standard error of the mean (i.e., using N) of the values in the passed array. |
||
|
az(a,
score) Returns the z-score of a given input score, given the array from which that score came. |
||
|
azs(a) Returns a 1D array of z-scores, one for each score in the passed array, computed relative to the passed array. |
||
|
azmap(scores,
compare,
dimension=0) Returns an array of z-scores the shape of scores (e.g., [x,y]), compared to array passed to compare (e.g., [time,x,y]). |
||
|
around(a,
digits=1) Rounds all values in array a to 'digits' decimal places. |
||
|
athreshold(a,
threshmin=None,
threshmax=None,
newval=0) Like Numeric.clip() except that values <threshmin or >threshmax are replaced by newval instead of by threshmin/threshmax (respectively). |
||
|
atrimboth(a,
proportiontocut) Slices off the passed proportion of items from BOTH ends of the passed array (i.e., with proportiontocut=0.1, slices 'leftmost' 10% AND 'rightmost' 10% of scores). |
||
|
atrim1(a,
proportiontocut,
tail='right') Slices off the passed proportion of items from ONE end of the passed array (i.e., if proportiontocut=0.1, slices off 'leftmost' or 'rightmost' 10% of scores). |
||
|
acovariance(X) Computes the covariance matrix of a matrix X. |
||
|
acorrelation(X) Computes the correlation matrix of a matrix X. |
||
|
apaired(x,
y) Interactively determines the type of data in x and y, and then runs the appropriate statistic for paired group data. |
||
|
apearsonr(x,
y,
verbose=1) Calculates a Pearson correlation coefficient and returns p. |
||
|
aspearmanr(x,
y) Calculates a Spearman rank-order correlation coefficient. |
||
|
apointbiserialr(x,
y) Calculates a point-biserial correlation coefficient and the associated probability value. |
||
|
akendalltau(x,
y) Calculates Kendall's tau ... |
||
|
alinregress(*args) Calculates a regression line on two arrays, x and y, corresponding to x,y pairs. |
||
|
attest_1samp(a,
popmean,
printit=0,
name='Sample',
writemode='a') Calculates the t-obtained for the independent samples T-test on ONE group of scores a, given a population mean. |
||
|
attest_ind(a,
b,
dimension=None,
printit=0,
name1='Samp1',
name2='Samp2',
writemode='a') Calculates the t-obtained T-test on TWO INDEPENDENT samples of scores a, and b. |
||
|
attest_rel(a,
b,
dimension=None,
printit=0,
name1='Samp1',
name2='Samp2',
writemode='a') Calculates the t-obtained T-test on TWO RELATED samples of scores, a and b. |
||
|
achisquare(f_obs,
f_exp=None) Calculates a one-way chi square for array of observed frequencies and returns the result. |
||
|
aks_2samp(data1,
data2) Computes the Kolmogorov-Smirnov statistic on 2 samples. |
||
|
amannwhitneyu(x,
y) Calculates a Mann-Whitney U statistic on the provided scores and returns the result. |
||
|
atiecorrect(rankvals) Tie-corrector for ties in Mann Whitney U and Kruskal Wallis H tests. |
||
|
aranksums(x,
y) Calculates the rank sums statistic on the provided scores and returns the result. |
||
|
awilcoxont(x,
y) Calculates the Wilcoxon T-test for related samples and returns the result. |
||
|
akruskalwallish(*args) The Kruskal-Wallis H-test is a non-parametric ANOVA for 3 or more groups, requiring at least 5 subjects in each group. |
||
|
afriedmanchisquare(*args) Friedman Chi-Square is a non-parametric, one-way within-subjects ANOVA. |
||
|
achisqprob(chisq,
df) Returns the (1-tail) probability value associated with the provided chi-square value and df. |
||
|
aerfcc(x) Returns the complementary error function erfc(x) with fractional error everywhere less than 1.2e-7. |
||
|
azprob(z) Returns the area under the normal curve 'to the left of' the given z value. |
||
|
aksprob(alam) Returns the probability value for a K-S statistic computed via ks_2samp. |
||
|
afprob(dfnum,
dfden,
F) Returns the 1-tailed significance level (p-value) of an F statistic given the degrees of freedom for the numerator (dfR-dfF) and the degrees of freedom for the denominator (dfF). |
||
|
abetacf(a,
b,
x,
verbose=1) Evaluates the continued fraction form of the incomplete Beta function, betai. |
||
|
agammln(xx) Returns the natural logarithm of the gamma function of xx. |
||
|
abetai(a,
b,
x,
verbose=1) Returns the incomplete beta function: |
||
|
aglm(data,
para) Calculates a linear model fit ... |
||
|
aF_oneway(*args) Performs a 1-way ANOVA, returning an F-value and probability given any number of groups. |
||
|
aF_value(ER,
EF,
dfR,
dfF) Returns an F-statistic given the following: |
||
|
outputfstats(Enum, Eden, dfnum, dfden, f, prob) |
||
|
F_value_multivariate(ER,
EF,
dfnum,
dfden) Returns an F-statistic given the following: |
||
|
asign(a) Usage: asign(a) Returns: array shape of a, with -1 where a<0 and +1 where a>=0 |
||
|
asum(a,
dimension=None,
keepdims=0) An alternative to the Numeric.add.reduce function, which allows one to (1) collapse over multiple dimensions at once, and/or (2) retain all dimensions in the original array (squashing one down to size). |
||
|
acumsum(a,
dimension=None) Returns an array consisting of the cumulative sum of the items in the passed array. |
||
|
ass(inarray,
dimension=None,
keepdims=0) Squares each value in the passed array, adds these squares & returns the result. |
||
|
asummult(array1,
array2,
dimension=None,
keepdims=0) Multiplies elements in array1 and array2, element by element, and returns the sum (along 'dimension') of all resulting multiplications. |
||
|
asquare_of_sums(inarray,
dimension=None,
keepdims=0) Adds the values in the passed array, squares that sum, and returns the result. |
||
|
asumdiffsquared(a,
b,
dimension=None,
keepdims=0) Takes pairwise differences of the values in arrays a and b, squares these differences, and returns the sum of these squares. |
||
|
ashellsort(inarray) Shellsort algorithm. |
||
|
arankdata(inarray) Ranks the data in inarray, dealing with ties appropriately. |
||
|
afindwithin(data) Returns a binary vector, 1=within-subject factor, 0=between. |
|
|||
|
__version__ = 0.5
|
||
|
N = Numeric
|
||
|
LA = LinearAlgebra
|
||
|
geometricmean = Dispatch((lgeometricmean, (ListType, TupleType)), (a...
|
||
|
harmonicmean = Dispatch((lharmonicmean, (ListType, TupleType)), (ah...
|
||
|
mean = Dispatch((lmean, (ListType, TupleType)), (amean, (N....
|
||
|
median = Dispatch((lmedian, (ListType, TupleType)), (amedian,...
|
||
|
medianscore = Dispatch((lmedianscore, (ListType, TupleType)), (ame...
|
||
|
mode = Dispatch((lmode, (ListType, TupleType)), (amode, (N....
|
||
|
tmean = Dispatch((atmean, (N.ArrayType,)))
|
||
|
tvar = Dispatch((atvar, (N.ArrayType,)))
|
||
|
tstdev = Dispatch((atstdev, (N.ArrayType,)))
|
||
|
tsem = Dispatch((atsem, (N.ArrayType,)))
|
||
|
moment = Dispatch((lmoment, (ListType, TupleType)), (amoment,...
|
||
|
variation = Dispatch((lvariation, (ListType, TupleType)), (avari...
|
||
|
skew = Dispatch((lskew, (ListType, TupleType)), (askew, (N....
|
||
|
kurtosis = Dispatch((lkurtosis, (ListType, TupleType)), (akurto...
|
||
|
describe = Dispatch((ldescribe, (ListType, TupleType)), (adescr...
|
||
|
skewtest = Dispatch((askewtest, (ListType, TupleType)), (askewt...
|
||
|
kurtosistest = Dispatch((akurtosistest, (ListType, TupleType)), (ak...
|
||
|
normaltest = Dispatch((anormaltest, (ListType, TupleType)), (anor...
|
||
|
itemfreq = Dispatch((litemfreq, (ListType, TupleType)), (aitemf...
|
||
|
scoreatpercentile = Dispatch((lscoreatpercentile, (ListType, TupleType))...
|
||
|
percentileofscore = Dispatch((lpercentileofscore, (ListType, TupleType))...
|
||
|
histogram = Dispatch((lhistogram, (ListType, TupleType)), (ahist...
|
||
|
cumfreq = Dispatch((lcumfreq, (ListType, TupleType)), (acumfre...
|
||
|
relfreq = Dispatch((lrelfreq, (ListType, TupleType)), (arelfre...
|
||
|
obrientransform = Dispatch((lobrientransform, (ListType, TupleType)), ...
|
||
|
samplevar = Dispatch((lsamplevar, (ListType, TupleType)), (asamp...
|
||
|
samplestdev = Dispatch((lsamplestdev, (ListType, TupleType)), (asa...
|
||
|
signaltonoise = Dispatch((asignaltonoise, (N.ArrayType,)),)
|
||
|
var = Dispatch((lvar, (ListType, TupleType)), (avar, (N.Ar...
|
||
|
stdev = Dispatch((lstdev, (ListType, TupleType)), (astdev, (...
|
||
|
sterr = Dispatch((lsterr, (ListType, TupleType)), (asterr, (...
|
||
|
sem = Dispatch((lsem, (ListType, TupleType)), (asem, (N.Ar...
|
||
|
z = Dispatch((lz, (ListType, TupleType)), (az, (N.ArrayT...
|
||
|
zs = Dispatch((lzs, (ListType, TupleType)), (azs, (N.Arra...
|
||
|
threshold = Dispatch((athreshold, (N.ArrayType,)),)
|
||
|
trimboth = Dispatch((ltrimboth, (ListType, TupleType)), (atrimb...
|
||
|
trim1 = Dispatch((ltrim1, (ListType, TupleType)), (atrim1, (...
|
||
|
paired = Dispatch((lpaired, (ListType, TupleType)), (apaired,...
|
||
|
pearsonr = Dispatch((lpearsonr, (ListType, TupleType)), (apears...
|
||
|
spearmanr = Dispatch((lspearmanr, (ListType, TupleType)), (aspea...
|
||
|
pointbiserialr = Dispatch((lpointbiserialr, (ListType, TupleType)), (...
|
||
|
kendalltau = Dispatch((lkendalltau, (ListType, TupleType)), (aken...
|
||
|
linregress = Dispatch((llinregress, (ListType, TupleType)), (alin...
|
||
|
ttest_1samp = Dispatch((lttest_1samp, (ListType, TupleType)), (att...
|
||
|
ttest_ind = Dispatch((lttest_ind, (ListType, TupleType)), (attes...
|
||
|
ttest_rel = Dispatch((lttest_rel, (ListType, TupleType)), (attes...
|
||
|
chisquare = Dispatch((lchisquare, (ListType, TupleType)), (achis...
|
||
|
ks_2samp = Dispatch((lks_2samp, (ListType, TupleType)), (aks_2s...
|
||
|
mannwhitneyu = Dispatch((lmannwhitneyu, (ListType, TupleType)), (am...
|
||
|
tiecorrect = Dispatch((ltiecorrect, (ListType, TupleType)), (atie...
|
||
|
ranksums = Dispatch((lranksums, (ListType, TupleType)), (aranks...
|
||
|
wilcoxont = Dispatch((lwilcoxont, (ListType, TupleType)), (awilc...
|
||
|
kruskalwallish = Dispatch((lkruskalwallish, (ListType, TupleType)), (...
|
||
|
friedmanchisquare = Dispatch((lfriedmanchisquare, (ListType, TupleType))...
|
||
|
chisqprob = Dispatch((lchisqprob, (IntType, FloatType)), (achisq...
|
||
|
zprob = Dispatch((lzprob, (IntType, FloatType)), (azprob, (N...
|
||
|
ksprob = Dispatch((lksprob, (IntType, FloatType)), (aksprob, ...
|
||
|
fprob = Dispatch((lfprob, (IntType, FloatType)), (afprob, (N...
|
||
|
betacf = Dispatch((lbetacf, (IntType, FloatType)), (abetacf, ...
|
||
|
betai = Dispatch((lbetai, (IntType, FloatType)), (abetai, (N...
|
||
|
erfcc = Dispatch((lerfcc, (IntType, FloatType)), (aerfcc, (N...
|
||
|
gammln = Dispatch((lgammln, (IntType, FloatType)), (agammln, ...
|
||
|
F_oneway = Dispatch((lF_oneway, (ListType, TupleType)), (aF_one...
|
||
|
F_value = Dispatch((lF_value, (ListType, TupleType)), (aF_valu...
|
||
|
incr = Dispatch((lincr, (ListType, TupleType, N.ArrayType)),)
|
||
|
sum = Dispatch((lsum, (ListType, TupleType)), (asum, (N.Ar...
|
||
|
cumsum = Dispatch((lcumsum, (ListType, TupleType)), (acumsum,...
|
||
|
ss = Dispatch((lss, (ListType, TupleType)), (ass, (N.Arra...
|
||
|
summult = Dispatch((lsummult, (ListType, TupleType)), (asummul...
|
||
|
square_of_sums = Dispatch((lsquare_of_sums, (ListType, TupleType)), (...
|
||
|
sumdiffsquared = Dispatch((lsumdiffsquared, (ListType, TupleType)), (...
|
||
|
shellsort = Dispatch((lshellsort, (ListType, TupleType)), (ashel...
|
||
|
rankdata = Dispatch((lrankdata, (ListType, TupleType)), (arankd...
|
||
|
findwithin = Dispatch((lfindwithin, (ListType, TupleType)), (afin...
|
|
Calculates the geometric mean of the values in the passed list. That is: n-th root of (x1 * x2 * ... * xn). Assumes a '1D' list. Usage: lgeometricmean(inlist) |
Calculates the harmonic mean of the values in the passed list. That is: n / (1/x1 + 1/x2 + ... + 1/xn). Assumes a '1D' list. Usage: lharmonicmean(inlist) |
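The two formulas just given, n-th root of (x1 * x2 * ... * xn) and n / (1/x1 + 1/x2 + ... + 1/xn), can be checked with a short sketch. The function names are illustrative, and the geometric mean is computed through logarithms here, which may differ from how the module itself computes it.

```python
import math


def geometric_mean(values):
    """n-th root of (x1 * x2 * ... * xn); assumes every value is > 0."""
    n = len(values)
    # exp(mean(log x)) equals the n-th root of the product and avoids
    # overflow when many values are multiplied together.
    return math.exp(sum(math.log(v) for v in values) / n)


def harmonic_mean(values):
    """n / (1/x1 + 1/x2 + ... + 1/xn); assumes no value is zero."""
    return len(values) / sum(1.0 / v for v in values)


g = geometric_mean([1.0, 4.0, 16.0])   # cube root of 64, about 4.0
h = harmonic_mean([1.0, 2.0, 4.0])     # 3 / 1.75, about 1.714
```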
Returns the arithmetic mean of the values in the passed list. Assumes a '1D' list, but will function on the 1st dim of an array(!). Usage: lmean(inlist) |
Returns the computed median value of a list of numbers, given the number of bins to use for the histogram (more bins brings the computed value closer to the median score, default number of bins = 1000). See G.W. Heiman's Basic Stats (1st Edition), or CRC Probability & Statistics. Usage: lmedian (inlist, numbins=1000) |
Returns the 'middle' score of the passed list. If there is an even number of scores, the mean of the 2 middle scores is returned. Usage: lmedianscore(inlist) |
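The 'middle score' rule above (mean of the two middle scores when the count is even) might look like the following sketch. The name median_score is illustrative, and sorting a copy of the input is an assumption about how the middle is located.

```python
def median_score(values):
    """Middle value of the sorted scores; mean of the two middle ones if even."""
    ordered = sorted(values)      # work on a sorted copy, input is untouched
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2.0


odd_case = median_score([7, 1, 3])       # 3
even_case = median_score([4, 1, 3, 2])   # 2.5
```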
Returns a list of the modal (most common) score(s) in the passed list. If there is more than one such score, all are returned. The bin-count for the mode(s) is also returned. Usage: lmode(inlist) Returns: bin-count for mode(s), a list of modal value(s) |
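A minimal sketch of the return shape described above, a bin-count followed by the list of all modal values. The name mode_with_count and the use of collections.Counter are assumptions made for illustration, as is returning the modes in sorted order.

```python
from collections import Counter


def mode_with_count(values):
    """Return (count of the most common score, list of all scores with that count)."""
    counts = Counter(values)
    top = max(counts.values())
    modes = sorted(score for score, c in counts.items() if c == top)
    return top, modes


count, modes = mode_with_count([1, 2, 2, 3, 3, 4])   # (2, [2, 3])
```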
Calculates the nth moment about the mean for a sample (defaults to the 1st moment). Used to calculate coefficients of skewness and kurtosis. Usage: lmoment(inlist,moment=1) Returns: appropriate moment (r) from ... 1/n * SUM((inlist(i)-mean)**r) |
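The formula 1/n * SUM((inlist(i)-mean)**r) translates almost directly into code. The sketch below uses an illustrative function name and a small made-up data set.

```python
def moment(values, r=1):
    """r-th moment about the mean: 1/n * SUM((x_i - mean) ** r)."""
    n = len(values)
    mean = sum(values) / float(n)
    return sum((x - mean) ** r for x in values) / n


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # mean is 5.0
first = moment(data, 1)    # deviations from the mean always sum to zero
second = moment(data, 2)   # 32 / 8 = 4.0
```

The first moment about the mean is always zero, so the default of moment=1 is mainly useful as a sanity check.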
Returns the coefficient of variation, as defined in CRC Standard Probability and Statistics, p.6. Usage: lvariation(inlist) |
Returns the skewness of a distribution, as defined in Numerical Recipes (alternate defn in CRC Standard Probability and Statistics, p.6.) Usage: lskew(inlist) |
Returns the kurtosis of a distribution, as defined in Numerical Recipes (alternate defn in CRC Standard Probability and Statistics, p.6.) Usage: lkurtosis(inlist) |
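As a rough sketch, skewness and kurtosis can be built from moments about the mean. The moment-ratio forms used here (m3 / m2**1.5 and m4 / m2**2) are an assumption: they agree with the "normal ==> 3.0" remark on akurtosis elsewhere on this page, but the exact formulas are not spelled out in this section.

```python
def moment(values, r):
    """r-th moment about the mean."""
    n = len(values)
    mean = sum(values) / float(n)
    return sum((x - mean) ** r for x in values) / n


def skew(values):
    # Assumed definition: third moment scaled by the second moment ** 1.5.
    return moment(values, 3) / moment(values, 2) ** 1.5


def kurtosis(values):
    # Assumed definition: fourth moment scaled by the squared second moment.
    return moment(values, 4) / moment(values, 2) ** 2


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s = skew(data)       # 5.25 / 8.0 = 0.65625
k = kurtosis(data)   # 44.5 / 16.0 = 2.78125
```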
Returns some descriptive statistics of the passed list (assumed to be 1D). Usage: ldescribe(inlist) Returns: n, mean, standard deviation, skew, kurtosis |
Returns a list of pairs. Each pair consists of one of the scores in inlist and its frequency count. Assumes a 1D list is passed. Usage: litemfreq(inlist) Returns: a 2D frequency table (col [0:n-1]=scores, col n=frequencies) |
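The frequency table of [score, count] pairs can be sketched as follows. The name item_freq is illustrative, and ordering the rows by score is an assumption.

```python
from collections import Counter


def item_freq(values):
    """Return [[score, count], ...] with one row per distinct score."""
    counts = Counter(values)
    return [[score, counts[score]] for score in sorted(counts)]


table = item_freq([3, 1, 3, 2, 3, 1])   # [[1, 2], [2, 1], [3, 3]]
```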
Returns the score at a given percentile relative to the distribution given by inlist. Usage: lscoreatpercentile(inlist,percent) |
Returns the percentile value of a score relative to the distribution given by inlist. Formula depends on the values used to histogram the data(!). Usage: lpercentileofscore(inlist,score,histbins=10,defaultlimits=None) |
Returns (i) a list of histogram bin counts, (ii) the smallest value of the histogram binning, and (iii) the bin width (the last 2 are not necessarily integers). Default number of bins is 10. If no sequence object is given for defaultreallimits, the routine picks (usually non-pretty) bins spanning all the numbers in the inlist. Usage: lhistogram (inlist, numbins=10, defaultreallimits=None,printextras=0) Returns: list of bin values, lowerreallimit, binsize, extrapoints |
Returns a cumulative frequency histogram, using the histogram function. Usage: lcumfreq(inlist,numbins=10,defaultreallimits=None) Returns: list of cumfreq bin values, lowerreallimit, binsize, extrapoints |
Returns a relative frequency histogram, using the histogram function. Usage: lrelfreq(inlist,numbins=10,defaultreallimits=None) Returns: list of relfreq bin values, lowerreallimit, binsize, extrapoints |
Computes a transform on input data (any number of columns). Used to test for homogeneity of variance prior to running one-way stats. From Maxwell and Delaney, p.112. Usage: lobrientransform(*args) Returns: transformed data for use in an ANOVA |
Returns the variance of the values in the passed list using N for the denominator (i.e., DESCRIBES the sample variance only). Usage: lsamplevar(inlist) |
Returns the standard deviation of the values in the passed list using N for the denominator (i.e., DESCRIBES the sample stdev only). Usage: lsamplestdev(inlist) |
Returns the variance of the values in the passed list using N-1 for the denominator (i.e., for estimating population variance). Usage: lvar(inlist) |
Returns the standard deviation of the values in the passed list using N-1 in the denominator (i.e., to estimate population stdev). Usage: lstdev(inlist) |
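The difference between the N and N-1 denominators described above can be seen in a small sketch. The helper names sample_var and est_var are hypothetical, chosen only for this illustration.

```python
import math


def sample_var(values):
    """Variance with N in the denominator (describes the sample itself)."""
    n = len(values)
    mean = sum(values) / float(n)
    return sum((x - mean) ** 2 for x in values) / float(n)


def est_var(values):
    """Variance with N-1 in the denominator (estimates the population)."""
    n = len(values)
    mean = sum(values) / float(n)
    return sum((x - mean) ** 2 for x in values) / float(n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_var(data))             # 4.0
print(math.sqrt(sample_var(data)))  # 2.0
print(est_var(data))                # 32/7, slightly larger than 4.0
```

The standard deviations are the square roots of the corresponding variances.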
Returns the standard error of the values in the passed list using N-1 in the denominator (i.e., to estimate population standard error). Usage: lsterr(inlist) |
Returns the estimated standard error of the mean (sx-bar) of the values in the passed list. sem = stdev / sqrt(n) Usage: lsem(inlist) |
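The formula sem = stdev / sqrt(n) can be sketched as follows. This assumes stdev is the N-1 (population-estimate) standard deviation; the function name sem is used here only for illustration.

```python
import math


def sem(values):
    """Estimated standard error of the mean: stdev / sqrt(n)."""
    n = len(values)
    mean = sum(values) / float(n)
    # Assumption: stdev uses N-1 in the denominator.
    stdev = math.sqrt(sum((x - mean) ** 2 for x in values) / float(n - 1))
    return stdev / math.sqrt(n)


print(sem([2, 4, 4, 4, 5, 5, 7, 9]))
```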
Returns the z-score for a given input score, given that score and the list from which that score came. Not appropriate for population calculations. Usage: lz(inlist, score) |
Returns a list of z-scores, one for each score in the passed list. Usage: lzs(inlist) |
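A minimal sketch of the z-score calculation, assuming each score is standardized against the mean and the N-denominator standard deviation of the same list. The name z_scores is hypothetical.

```python
import math


def z_scores(values):
    """Return (x - mean) / sd for every x, with sd computed using N."""
    n = len(values)
    mean = sum(values) / float(n)
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / float(n))
    return [(x - mean) / sd for x in values]


print(z_scores([2, 4, 4, 4, 5, 5, 7, 9]))
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```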
Slices off the passed proportion of items from BOTH ends of the passed list (i.e., with proportiontocut=0.1, slices 'leftmost' 10% AND 'rightmost' 10% of scores). Assumes list is sorted by magnitude. Slices off LESS if proportion results in a non-integer slice index (i.e., conservatively slices off proportiontocut). Usage: ltrimboth (l,proportiontocut) Returns: trimmed version of list l |
Slices off the passed proportion of items from ONE end of the passed list (i.e., if proportiontocut=0.1, slices off 'leftmost' or 'rightmost' 10% of scores). Slices off LESS if proportion results in a non-integer slice index (i.e., conservatively slices off proportiontocut). Usage: ltrim1 (l,proportiontocut,tail='right') or set tail='left' Returns: trimmed version of list l |
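The "slices off LESS" rule can be illustrated by truncating the slice index to an integer. The names trim_both and trim_one are hypothetical, and truncation via int() is an assumption about how the conservative cut is obtained.

```python
def trim_both(sorted_values, proportion):
    """Drop `proportion` of the items from each end of a sorted list."""
    cut = int(proportion * len(sorted_values))  # truncation cuts less, never more
    return sorted_values[cut:len(sorted_values) - cut]


def trim_one(sorted_values, proportion, tail='right'):
    """Drop `proportion` of the items from one end of a sorted list."""
    cut = int(proportion * len(sorted_values))
    if tail == 'right':
        return sorted_values[:len(sorted_values) - cut]
    return sorted_values[cut:]


scores = list(range(1, 11))
print(trim_both(scores, 0.1))          # [2, 3, 4, 5, 6, 7, 8, 9]
print(trim_both(scores, 0.15))         # index 1.5 truncates to 1: same result
print(trim_one(scores, 0.2))           # [1, 2, 3, 4, 5, 6, 7, 8]
print(trim_one(scores, 0.2, 'left'))   # [3, 4, 5, 6, 7, 8, 9, 10]
```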
Interactively determines the type of data and then runs the appropriate statistic for paired group data. Usage: lpaired(x,y) Returns: appropriate statistic name, value, and probability |
Calculates a Pearson correlation coefficient and the associated probability value. Taken from Heiman's Basic Statistics for the Behav. Sci (2nd), p.195. Usage: lpearsonr(x,y) where x and y are equal-length lists Returns: Pearson's r value, two-tailed p-value |
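For orientation, the correlation coefficient itself can be computed as in the sketch below. It uses the textbook deviation-score formula, omits the p-value, and the name pearson_r is hypothetical; it is not necessarily the exact computational formula the module uses.

```python
import math


def pearson_r(x, y):
    """Pearson's r for two equal-length lists (coefficient only, no p-value)."""
    if len(x) != len(y):
        raise ValueError('x and y must have equal length')
    n = len(x)
    mean_x = sum(x) / float(n)
    mean_y = sum(y) / float(n)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))  # perfectly linear, r close to 1
```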
Calculates a Spearman rank-order correlation coefficient. Taken from Heiman's Basic Statistics for the Behav. Sci (1st), p.192. Usage: lspearmanr(x,y) where x and y are equal-length lists Returns: Spearman's r, two-tailed p-value |
Calculates a point-biserial correlation coefficient and the associated probability value. Taken from Heiman's Basic Statistics for the Behav. Sci (1st), p.194. Usage: lpointbiserialr(x,y) where x,y are equal-length lists Returns: Point-biserial r, two-tailed p-value |
Calculates Kendall's tau ... correlation of ordinal data. Adapted from function kendl1 in Numerical Recipes. Needs good test-routine.@@@ Usage: lkendalltau(x,y) Returns: Kendall's tau, two-tailed p-value |
Calculates a regression line on x,y pairs. Usage: llinregress(x,y) x,y are equal-length lists of x-y coordinates Returns: slope, intercept, r, two-tailed prob, sterr-of-estimate |
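The slope and intercept of a least-squares regression line can be sketched as follows. Only those two of the five returned values are shown, and the name fit_line is hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for equal-length lists x and y."""
    if len(x) != len(y):
        raise ValueError('x and y must have equal length')
    n = len(x)
    mean_x = sum(x) / float(n)
    mean_y = sum(y) / float(n)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))  # (2.0, 1.0)
```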
Calculates the t-obtained for the single-sample T-test on ONE group of scores a, given a population mean. If printit=1, results are printed to the screen. If printit='filename', the results are output to 'filename' using the given writemode (default=append). Usage: lttest_1samp(a,popmean,Name='Sample',printit=0,writemode='a') Returns: t-value, two-tailed prob |
Calculates the t-obtained for the T-test on TWO INDEPENDENT samples of scores, a and b. From Numerical Recipes, p.483. If printit=1, results are printed to the screen. If printit='filename', the results are output to 'filename' using the given writemode (default=append). Usage: lttest_ind(a,b,printit=0,name1='Samp1',name2='Samp2',writemode='a') Returns: t-value, two-tailed prob |
Calculates the t-obtained for the T-test on TWO RELATED samples of scores, a and b. From Numerical Recipes, p.483. If printit=1, results are printed to the screen. If printit='filename', the results are output to 'filename' using the given writemode (default=append). Usage: lttest_rel(a,b,printit=0,name1='Sample1',name2='Sample2',writemode='a') Returns: t-value, two-tailed prob |
Calculates a one-way chi square for list of observed frequencies and returns the result. If no expected frequencies are given, the total N is assumed to be equally distributed across all groups. Usage: lchisquare(f_obs, f_exp=None) f_obs = list of observed cell freq. Returns: chisquare-statistic, associated p-value |
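The chi-square statistic with the default "equally distributed" expectation can be sketched as below. The p-value is omitted and the name chisquare_stat is hypothetical.

```python
def chisquare_stat(f_obs, f_exp=None):
    """One-way chi-square statistic: sum of (observed - expected)^2 / expected."""
    k = len(f_obs)
    if f_exp is None:
        # No expected frequencies given: spread the total N evenly over all groups.
        f_exp = [sum(f_obs) / float(k)] * k
    return sum((o - e) ** 2 / float(e) for o, e in zip(f_obs, f_exp))


print(chisquare_stat([10, 20, 30]))                # 10.0
print(chisquare_stat([10, 20, 30], [10, 20, 30]))  # 0.0
```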
Computes the Kolmogorov-Smirnov statistic on 2 samples. From Numerical Recipes in C, page 493. Usage: lks_2samp(data1,data2) data1&2 are lists of values for 2 conditions Returns: KS D-value, associated p-value |
Calculates a Mann-Whitney U statistic on the provided scores and returns the result. Use only when the n in each condition is < 20 and you have 2 independent samples of ranks. NOTE: Mann-Whitney U is significant if the u-obtained is LESS THAN or equal to the critical value of U found in the tables. Equivalent to Kruskal-Wallis H with just 2 groups. Usage: lmannwhitneyu(x,y) Returns: u-statistic, one-tailed p-value (i.e., p(z(U))) |
Corrects for ties in Mann Whitney U and Kruskal Wallis H tests. See Siegel, S. (1956) Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill. Code adapted from |Stat rankind.c code. Usage: ltiecorrect(rankvals) Returns: T correction factor for U or H |
Calculates the rank sums statistic on the provided scores and returns the result. Use only when the n in each condition is > 20 and you have 2 independent samples of ranks. Usage: lranksums(x,y) Returns: a z-statistic, two-tailed p-value |
Calculates the Wilcoxon T-test for related samples and returns the result. A non-parametric T-test. Usage: lwilcoxont(x,y) Returns: a t-statistic, two-tail probability estimate |
The Kruskal-Wallis H-test is a non-parametric ANOVA for 3 or more groups, requiring at least 5 subjects in each group. This function calculates the Kruskal-Wallis H-test for 3 or more independent samples and returns the result. Usage: lkruskalwallish(*args) Returns: H-statistic (corrected for ties), associated p-value |
Friedman Chi-Square is a non-parametric, one-way within-subjects ANOVA. This function calculates the Friedman Chi-square test for repeated measures and returns the result, along with the associated probability value. It assumes 3 or more repeated measures. Only 3 levels requires a minimum of 10 subjects in the study. Four levels requires 5 subjects per level(??). Usage: lfriedmanchisquare(*args) Returns: chi-square statistic, associated p-value |
Returns the (1-tailed) probability value associated with the provided chi-square value and df. Adapted from chisq.c in Gary Perlman's |Stat. Usage: lchisqprob(chisq,df) |
Returns the complementary error function erfc(x) with fractional error everywhere less than 1.2e-7. Adapted from Numerical Recipes. Usage: lerfcc(x) |
Returns the area under the normal curve 'to the left of' the given z value. Thus: for z<0, zprob(z) = 1-tail probability; for z>0, 1.0-zprob(z) = 1-tail probability; for any z, 2.0*(1.0-zprob(abs(z))) = 2-tail probability. Adapted from z.c in Gary Perlman's |Stat. Usage: lzprob(z) |
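The relationships between the left-tail area and the 1-tail and 2-tail probabilities can be checked with a short sketch. Note that this version swaps in the standard library's math.erfc rather than the polynomial approximation adapted from |Stat, so the name zprob here refers to an illustrative stand-in only.

```python
import math


def zprob(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


z = 1.96
left_area = zprob(z)                      # about 0.975
one_tail = 1.0 - zprob(z)                 # z > 0, so use 1.0 - zprob(z)
two_tail = 2.0 * (1.0 - zprob(abs(z)))    # valid for any z
print(left_area, one_tail, two_tail)
```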
Computes a Kolmogorov-Smirnov t-test significance level. Adapted from Numerical Recipes. Usage: lksprob(alam) |
Returns the (1-tailed) significance level (p-value) of an F statistic given the degrees of freedom for the numerator (dfR-dfF) and the degrees of freedom for the denominator (dfF). Usage: lfprob(dfnum, dfden, F) where usually dfnum=dfbn, dfden=dfwn |
This function evaluates the continued fraction form of the incomplete Beta function, betai. (Adapted from: Numerical Recipes in C.) Usage: lbetacf(a,b,x) |
Returns the natural log of the gamma function of xx, where Gamma(z) = Integral(0,infinity) of t^(z-1)exp(-t) dt. (Adapted from: Numerical Recipes in C.) Usage: lgammln(xx) |
Returns the incomplete beta function: I-sub-x(a,b) = 1/B(a,b)*(Integral(0,x) of t^(a-1)(1-t)^(b-1) dt) where a,b>0 and B(a,b) = G(a)*G(b)/(G(a+b)) where G(a) is the gamma function of a. The continued fraction formulation is implemented here, using the betacf function. (Adapted from: Numerical Recipes in C.) Usage: lbetai(a,b,x) |
Performs a 1-way ANOVA, returning an F-value and probability given any number of groups. From Heiman, pp.394-7. Usage: F_oneway(*lists) where *lists is any number of lists, one per treatment group Returns: F value, one-tailed p-value |
Returns an F-statistic given the following: ER = error associated with the null hypothesis (the Restricted model); EF = error associated with the alternate hypothesis (the Full model); dfR-dfF = degrees of freedom of the numerator; dfF = degrees of freedom associated with the denominator/Full model. Usage: lF_value(ER,EF,dfnum,dfden) |
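The F-statistic described above compares the drop in error from the Restricted to the Full model against the Full model's error, each scaled by its degrees of freedom. A minimal sketch, with f_value as a hypothetical name:

```python
def f_value(er, ef, dfnum, dfden):
    """F = ((ER - EF) / dfnum) / (EF / dfden), with dfnum = dfR - dfF, dfden = dfF."""
    return ((er - ef) / float(dfnum)) / (ef / float(dfden))


# Restricted-model error 100, Full-model error 60, dfnum=2, dfden=12.
print(f_value(100, 60, 2, 12))  # 4.0
```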
Writes a list of lists to a file in columns, customized by the max size of items within the columns (max size of items in col, +2 characters) to specified file. File-overwrite is the default. Usage: writecc (listoflists,file,writetype='w',extra=2) Returns: None |
Simulate a counting system from an n-dimensional list. Usage: lincr(l,cap) l=list to increment, cap=max values for each list pos'n Returns: next set of values for list l, OR -1 (if overflow) |
Returns the sum of the items in the passed list. Usage: lsum(inlist) |
Returns a list consisting of the cumulative sum of the items in the passed list. Usage: lcumsum(inlist) |
Squares each value in the passed list, adds up these squares and returns the result. Usage: lss(inlist) |
Multiplies elements in list1 and list2, element by element, and returns the sum of all resulting multiplications. Must provide equal length lists. Usage: lsummult(list1,list2) |
Takes pairwise differences of the values in lists x and y, squares these differences, and returns the sum of these squares. Usage: lsumdiffsquared(x,y) Returns: sum[(x[i]-y[i])**2] |
Adds the values in the passed list, squares the sum, and returns the result. Usage: lsquare_of_sums(inlist) Returns: sum(inlist[i])**2 |
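The four sum helpers above differ only in where the squaring or multiplication happens. The sketch below uses hypothetical unprefixed names to make the contrast concrete.

```python
def ss(values):
    """Sum of squares: square each value, then add."""
    return sum(v * v for v in values)


def summult(list1, list2):
    """Sum of element-by-element products of two equal-length lists."""
    if len(list1) != len(list2):
        raise ValueError('lists must have equal length')
    return sum(a * b for a, b in zip(list1, list2))


def sumdiffsquared(x, y):
    """Sum of squared pairwise differences: sum[(x[i]-y[i])**2]."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def square_of_sums(values):
    """Square of the sum: add the values, then square."""
    return sum(values) ** 2


x, y = [1, 2, 3], [4, 5, 6]
print(ss(x), summult(x, y), sumdiffsquared(x, y), square_of_sums(x))
```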
Shellsort algorithm. Sorts a 1D-list. Usage: lshellsort(inlist) Returns: sorted-inlist, sorting-index-vector (for original list) |
Ranks the data in inlist, dealing with ties appropriately. Assumes a 1D inlist. Adapted from Gary Perlman's |Stat ranksort. Usage: lrankdata(inlist) Returns: a list of length equal to inlist, containing rank scores |
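One common way to handle ties when ranking is to give every tied score the mean of the ranks it spans. The sketch below assumes that convention; the name rank_data is hypothetical.

```python
def rank_data(values):
    """Return 1-based ranks for a 1D list; tied scores share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of scores equal to the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


print(rank_data([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
print(rank_data([3, 1, 2]))         # [3.0, 1.0, 2.0]
```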
Prints or writes to a file stats for two groups, using the name, n, mean, sterr, min and max for each group, as well as the statistic name, its value, and the associated p-value. Usage: outputpairedstats(fname,writemode, name1,n1,mean1,stderr1,min1,max1, name2,n2,mean2,stderr2,min2,max2, statname,stat,prob) Returns: None |
Returns an integer representing a binary vector, where 1=within-subject factor, 0=between. Input equals the entire data 2D list (i.e., column 0=random factor, column -1=measured values; those two are skipped). Note: input data is in |Stat format ... a list of lists ("2D list") with one row per measured value, first column=subject identifier, last column= score, one in-between column per factor (these columns contain level designations on each factor). See also stats.anova.__doc__. Usage: lfindwithin(data) data in |Stat format |
Calculates the geometric mean of the values in the passed array. That is: n-th root of (x1 * x2 * ... * xn). Defaults to ALL values in the passed array. Use dimension=None to flatten array first. REMEMBER: if dimension=0, it collapses over dimension 0 ('rows' in a 2D array) only, and if dimension is a sequence, it collapses over all specified dimensions. If keepdims is set to 1, the resulting array will have as many dimensions as inarray, with only 1 'level' per dim that was collapsed over. Usage: ageometricmean(inarray,dimension=None,keepdims=0) Returns: geometric mean computed over dim(s) listed in dimension |
Calculates the harmonic mean of the values in the passed array. That is: n / (1/x1 + 1/x2 + ... + 1/xn). Defaults to ALL values in the passed array. Use dimension=None to flatten array first. REMEMBER: if dimension=0, it collapses over dimension 0 ('rows' in a 2D array) only, and if dimension is a sequence, it collapses over all specified dimensions. If keepdims is set to 1, the resulting array will have as many dimensions as inarray, with only 1 'level' per dim that was collapsed over. Usage: aharmonicmean(inarray,dimension=None,keepdims=0) Returns: harmonic mean computed over dim(s) in dimension |
Calculates the arithmetic mean of the values in the passed array. That is: 1/n * (x1 + x2 + ... + xn). Defaults to ALL values in the passed array. Use dimension=None to flatten array first. REMEMBER: if dimension=0, it collapses over dimension 0 ('rows' in a 2D array) only, and if dimension is a sequence, it collapses over all specified dimensions. If keepdims is set to 1, the resulting array will have as many dimensions as inarray, with only 1 'level' per dim that was collapsed over. Usage: amean(inarray,dimension=None,keepdims=0) Returns: arithmetic mean calculated over dim(s) in dimension |
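The three mean formulas above can be compared on a flat list of positive values. This is a plain-Python sketch with hypothetical names; it covers only the flattened case and does not reproduce the dimension or keepdims handling of the array functions.

```python
import math


def geometric_mean(values):
    """n-th root of (x1 * x2 * ... * xn), computed via logs for stability."""
    return math.exp(sum(math.log(x) for x in values) / float(len(values)))


def harmonic_mean(values):
    """n / (1/x1 + 1/x2 + ... + 1/xn)."""
    return len(values) / sum(1.0 / x for x in values)


def arithmetic_mean(values):
    """1/n * (x1 + x2 + ... + xn)."""
    return sum(values) / float(len(values))


data = [1, 2, 4]
print(harmonic_mean(data), geometric_mean(data), arithmetic_mean(data))
```

For positive values that are not all equal, the harmonic mean is the smallest of the three and the arithmetic mean the largest.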
Calculates the COMPUTED median value of an array of numbers, given the number of bins to use for the histogram (more bins approaches finding the precise median value of the array; default number of bins = 1000). From G.W. Heiman's Basic Stats, or CRC Probability & Statistics. NOTE: THIS ROUTINE ALWAYS uses the entire passed array (flattens it first). Usage: amedian(inarray,numbins=1000) Returns: median calculated over ALL values in inarray |
Returns the 'middle' score of the passed array. If there is an even number of scores, the mean of the 2 middle scores is returned. Can function with 1D arrays, or on the FIRST dimension of 2D arrays (i.e., dimension can be None, to pre-flatten the array, or else dimension must equal 0). Usage: amedianscore(inarray,dimension=None) Returns: 'middle' score of the array, or the mean of the 2 middle scores |
Returns an array of the modal (most common) score in the passed array. If there is more than one such score, ONLY THE FIRST is returned. The bin-count for the modal values is also returned. Operates on whole array (dimension=None), or on a given dimension. Usage: amode(a, dimension=None) Returns: array of bin-counts for mode(s), array of corresponding modal values |
Returns the arithmetic mean of all values in an array, ignoring values strictly outside the sequence passed to 'limits'. Note: either limit in the sequence, or the value of limits itself, can be set to None. The inclusive list/tuple determines whether the lower and upper limiting bounds (respectively) are open/exclusive (0) or closed/inclusive (1). Usage: atmean(a,limits=None,inclusive=(1,1)) |
Returns the sample variance of values in an array, (i.e., using N-1), ignoring values strictly outside the sequence passed to 'limits'. Note: either limit in the sequence, or the value of limits itself, can be set to None. The inclusive list/tuple determines whether the lower and upper limiting bounds (respectively) are open/exclusive (0) or closed/inclusive (1). Usage: atvar(a,limits=None,inclusive=(1,1)) |
Returns the minimum value of a, along dimension, including only values greater than (or equal to, if inclusive=1) lowerlimit. If the limit is set to None, all values in the array are used. Usage: atmin(a,lowerlimit=None,dimension=None,inclusive=1) |
Returns the maximum value of a, along dimension, including only values less than (or equal to, if inclusive=1) upperlimit. If the limit is set to None, a limit larger than the max value in the array is used. Usage: atmax(a,upperlimit,dimension=None,inclusive=1) |
Returns the standard deviation of all values in an array, ignoring values strictly outside the sequence passed to 'limits'. Note: either limit in the sequence, or the value of limits itself, can be set to None. The inclusive list/tuple determines whether the lower and upper limiting bounds (respectively) are open/exclusive (0) or closed/inclusive (1). Usage: atstdev(a,limits=None,inclusive=(1,1)) |
Returns the standard error of the mean for the values in an array, (i.e., using N for the denominator), ignoring values strictly outside the sequence passed to 'limits'. Note: either limit in the sequence, or the value of limits itself, can be set to None. The inclusive list/tuple determines whether the lower and upper limiting bounds (respectively) are open/exclusive (0) or closed/inclusive (1). Usage: atsem(a,limits=None,inclusive=(1,1)) |
Calculates the nth moment about the mean for a sample (defaults to the 1st moment). Generally used to calculate coefficients of skewness and kurtosis. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Usage: amoment(a,moment=1,dimension=None) Returns: appropriate moment along given dimension |
Returns the coefficient of variation, as defined in CRC Standard Probability and Statistics, p.6. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Usage: avariation(a,dimension=None) |
Returns the skewness of a distribution (normal ==> 0.0; >0 means extra weight in right tail). Use askewtest() to see if it's close enough. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Usage: askew(a, dimension=None) Returns: skew of vals in a along dimension, returning ZERO where all vals equal |
Returns the kurtosis of a distribution (normal ==> 3.0; >3 means heavier in the tails, and usually more peaked). Use akurtosistest() to see if it's close enough. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Usage: akurtosis(a,dimension=None) Returns: kurtosis of values in a along dimension, and ZERO where all vals equal |
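Skewness and kurtosis are both ratios of moments about the mean. The sketch below assumes the usual definitions m3 / m2^1.5 and m4 / m2^2 with N in the denominator (so that a normal distribution gives kurtosis 3.0), works on a flat list only, and uses hypothetical names.

```python
def moment(values, k):
    """k-th moment about the mean, using N in the denominator."""
    n = len(values)
    mean = sum(values) / float(n)
    return sum((x - mean) ** k for x in values) / float(n)


def skew(values):
    """m3 / m2**1.5; returns 0.0 when all values are equal."""
    m2 = moment(values, 2)
    if m2 == 0:
        return 0.0
    return moment(values, 3) / m2 ** 1.5


def kurtosis(values):
    """m4 / m2**2; returns 0.0 when all values are equal."""
    m2 = moment(values, 2)
    if m2 == 0:
        return 0.0
    return moment(values, 4) / m2 ** 2


print(skew([1, 2, 3, 4, 5]))      # symmetric data: 0.0
print(kurtosis([1, 2, 3, 4, 5]))  # flatter than normal: below 3.0
print(skew([1, 1, 1, 10]))        # one large value on the right: positive
```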
Returns several descriptive statistics of the passed array. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Usage: adescribe(inarray,dimension=None) Returns: n, (min,max), mean, standard deviation, skew, kurtosis |
Tests whether the skew is significantly different from a normal distribution. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Usage: askewtest(a,dimension=None) Returns: z-score and 2-tail z-probability |
Tests whether a dataset has normal kurtosis (i.e., kurtosis=3(n-1)/(n+1)) Valid only for n>20. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Usage: akurtosistest(a,dimension=None) Returns: z-score and 2-tail z-probability, returns 0 for bad pixels |
Tests whether skew and/OR kurtosis of dataset differs from normal curve. Can operate over multiple dimensions. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Usage: anormaltest(a,dimension=None) Returns: z-score and 2-tail probability |
Returns a 2D array of item frequencies. Column 1 contains item values, column 2 contains their respective counts. Assumes a 1D array is passed. Usage: aitemfreq(a) Returns: a 2D frequency table (col [0:n-1]=scores, col n=frequencies) |
Usage: ascoreatpercentile(inarray,percent) 0<percent<100 Returns: score at given percentile, relative to inarray distribution |
Note: result of this function depends on the values used to histogram the data(!). Usage: apercentileofscore(inarray,score,histbins=10,defaultlimits=None) Returns: percentile-position of score (0-100) relative to inarray |
Returns (i) an array of histogram bin counts, (ii) the smallest value of the histogram binning, and (iii) the bin width (the last 2 are not necessarily integers). Default number of bins is 10. Defaultlimits can be None (the routine picks bins spanning all the numbers in the inarray) or a 2-sequence (lowerlimit, upperlimit). Returns all of the following: array of bin values, lowerreallimit, binsize, extrapoints. Usage: ahistogram(inarray,numbins=10,defaultlimits=None,printextras=1) Returns: (array of bin counts, bin-minimum, min-width, #-points-outside-range) |
Returns a cumulative frequency histogram, using the histogram function. Defaultreallimits can be None (use all data), or a 2-sequence containing lower and upper limits on values to include. Usage: acumfreq(a,numbins=10,defaultreallimits=None) Returns: array of cumfreq bin values, lowerreallimit, binsize, extrapoints |
Returns a relative frequency histogram, using the histogram function. Defaultreallimits can be None (use all data), or a 2-sequence containing lower and upper limits on values to include. Usage: arelfreq(a,numbins=10,defaultreallimits=None) Returns: array of relfreq bin values, lowerreallimit, binsize, extrapoints |
Computes a transform on input data (any number of columns). Used to test for homogeneity of variance prior to running one-way stats. Each array in *args is one level of a factor. If an F_oneway() run on the transformed data is found significant, variances are unequal. From Maxwell and Delaney, p.112. Usage: aobrientransform(*args) *args = 1D arrays, one per level of factor Returns: transformed data for use in an ANOVA |
Returns the sample variance of the values in the passed array (i.e., using N). Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Set keepdims=1 to return an array with the same number of dimensions as inarray. Usage: asamplevar(inarray,dimension=None,keepdims=0) |
Returns the sample standard deviation of the values in the passed array (i.e., using N). Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Set keepdims=1 to return an array with the same number of dimensions as inarray. Usage: asamplestdev(inarray,dimension=None,keepdims=0) |
Calculates signal-to-noise. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Usage: asignaltonoise(instack,dimension=0) Returns: array containing the value of (mean/stdev) along dimension, or 0 when stdev=0 |
Returns the estimated population variance of the values in the passed array (i.e., N-1). Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Set keepdims=1 to return an array with the same number of dimensions as inarray. Usage: avar(inarray,dimension=None,keepdims=0) |
Returns the estimated population standard deviation of the values in the passed array (i.e., N-1). Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Set keepdims=1 to return an array with the same number of dimensions as inarray. Usage: astdev(inarray,dimension=None,keepdims=0) |
Returns the estimated population standard error of the values in the passed array (i.e., N-1). Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Set keepdims=1 to return an array with the same number of dimensions as inarray. Usage: asterr(inarray,dimension=None,keepdims=0) |
Returns the standard error of the mean (i.e., using N) of the values in the passed array. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Set keepdims=1 to return an array with the same number of dimensions as inarray. Usage: asem(inarray,dimension=None, keepdims=0) |
Returns the z-score of a given input score, given the array from which that score came. Not appropriate for population calculations, nor for arrays > 1D. Usage: az(a, score) |
Returns a 1D array of z-scores, one for each score in the passed array, computed relative to the passed array. Usage: azs(a) |
Returns an array of z-scores the shape of scores (e.g., [x,y]), compared to array passed to compare (e.g., [time,x,y]). Assumes collapsing over dim 0 of the compare array. Usage: azmap(scores, compare, dimension=0) |
Rounds all values in array a to 'digits' decimal places. Usage: around(a,digits) Returns: a, where each value is rounded to 'digits' decimals |
Like Numeric.clip() except that values <threshmin or >threshmax are replaced by newval instead of by threshmin/threshmax (respectively). Usage: athreshold(a,threshmin=None,threshmax=None,newval=0) Returns: a, with values <threshmin or >threshmax replaced with newval |
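The difference from clipping is that out-of-range values are replaced by one fixed value rather than by the nearest limit. A plain-Python sketch on a list (the module's version operates on arrays; the name threshold and the list-based form are assumptions made for illustration):

```python
def threshold(values, threshmin=None, threshmax=None, newval=0):
    """Replace values below threshmin or above threshmax with newval."""
    result = []
    for v in values:
        too_low = threshmin is not None and v < threshmin
        too_high = threshmax is not None and v > threshmax
        result.append(newval if (too_low or too_high) else v)
    return result


print(threshold([1, 5, 10, 15, 20], threshmin=5, threshmax=15))
# [0, 5, 10, 15, 0]   (clipping would instead give [5, 5, 10, 15, 15])
```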
Slices off the passed proportion of items from BOTH ends of the passed array (i.e., with proportiontocut=0.1, slices 'leftmost' 10% AND 'rightmost' 10% of scores). You must pre-sort the array if you want "proper" trimming. Slices off LESS if proportion results in a non-integer slice index (i.e., conservatively slices off proportiontocut). Usage: atrimboth (a,proportiontocut) Returns: trimmed version of array a |
Slices off the passed proportion of items from ONE end of the passed array (i.e., if proportiontocut=0.1, slices off 'leftmost' or 'rightmost' 10% of scores). Slices off LESS if proportion results in a non-integer slice index (i.e., conservatively slices off proportiontocut). Usage: atrim1(a,proportiontocut,tail='right') or set tail='left' Returns: trimmed version of array a |
Computes the covariance matrix of a matrix X. Requires a 2D matrix input. Usage: acovariance(X) Returns: covariance matrix of X |
Computes the correlation matrix of a matrix X. Requires a 2D matrix input. Usage: acorrelation(X) Returns: correlation matrix of X |
Interactively determines the type of data in x and y, and then runs the appropriate statistic for paired group data. Usage: apaired(x,y) x,y = the two arrays of values to be compared Returns: appropriate statistic name, value, and probability |
Calculates a Pearson correlation coefficient and returns p. Taken from Heiman's Basic Statistics for the Behav. Sci (2nd), p.195. Usage: apearsonr(x,y,verbose=1) where x,y are equal length arrays Returns: Pearson's r, two-tailed p-value |
Calculates a Spearman rank-order correlation coefficient. Taken from Heiman's Basic Statistics for the Behav. Sci (1st), p.192. Usage: aspearmanr(x,y) where x,y are equal-length arrays Returns: Spearman's r, two-tailed p-value |
Calculates a point-biserial correlation coefficient and the associated probability value. Taken from Heiman's Basic Statistics for the Behav. Sci (1st), p.194. Usage: apointbiserialr(x,y) where x,y are equal length arrays Returns: Point-biserial r, two-tailed p-value |
Calculates Kendall's tau ... correlation of ordinal data. Adapted from function kendl1 in Numerical Recipes. Needs good test-cases.@@@ Usage: akendalltau(x,y) Returns: Kendall's tau, two-tailed p-value |
Calculates a regression line on two arrays, x and y, corresponding to x,y pairs. If a single 2D array is passed, alinregress finds dim with 2 levels and splits data into x,y pairs along that dim. Usage: alinregress(*args) args=2 equal-length arrays, or one 2D array Returns: slope, intercept, r, two-tailed prob, sterr-of-the-estimate |
Calculates the t-obtained for the single-sample T-test on ONE group of scores a, given a population mean. If printit=1, results are printed to the screen. If printit='filename', the results are output to 'filename' using the given writemode (default=append). Usage: attest_1samp(a,popmean,Name='Sample',printit=0,writemode='a') Returns: t-value, two-tailed prob |
Calculates the t-obtained for the T-test on TWO INDEPENDENT samples of scores, a and b. From Numerical Recipes, p.483. If printit=1, results are printed to the screen. If printit='filename', the results are output to 'filename' using the given writemode (default=append). Dimension can equal None (ravel array first), or an integer (the dimension over which to operate on a and b). Usage: attest_ind(a,b,dimension=None,printit=0, Name1='Samp1',Name2='Samp2',writemode='a') Returns: t-value, two-tailed p-value |
Calculates the t-obtained for the T-test on TWO RELATED samples of scores, a and b. From Numerical Recipes, p.483. If printit=1, results are printed to the screen. If printit='filename', the results are output to 'filename' using the given writemode (default=append). Dimension can equal None (ravel array first), or an integer (the dimension over which to operate on a and b). Usage: attest_rel(a,b,dimension=None,printit=0, name1='Samp1',name2='Samp2',writemode='a') Returns: t-value, two-tailed p-value |
Calculates a one-way chi square for array of observed frequencies and returns the result. If no expected frequencies are given, the total N is assumed to be equally distributed across all groups. Usage: achisquare(f_obs, f_exp=None) f_obs = array of observed cell freq. Returns: chisquare-statistic, associated p-value |
Computes the Kolmogorov-Smirnov statistic on 2 samples. Modified from Numerical Recipes in C, page 493. Not ufunc-like. Usage: aks_2samp(data1,data2) where data1 and data2 are 1D arrays Returns: KS D-value, p-value |
Calculates a Mann-Whitney U statistic on the provided scores and returns the result. Use only when the n in each condition is < 20 and you have 2 independent samples of ranks. REMEMBER: Mann-Whitney U is significant if the u-obtained is LESS THAN or equal to the critical value of U. Usage: amannwhitneyu(x,y) where x,y are arrays of values for 2 conditions Returns: u-statistic, one-tailed p-value (i.e., p(z(U))) |
Tie-corrector for ties in Mann Whitney U and Kruskal Wallis H tests. See Siegel, S. (1956) Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill. Code adapted from |Stat rankind.c code. Usage: atiecorrect(rankvals) Returns: T correction factor for U or H |
Calculates the rank sums statistic on the provided scores and returns the result. Usage: aranksums(x,y) where x,y are arrays of values for 2 conditions Returns: z-statistic, two-tailed p-value |
Calculates the Wilcoxon T-test for related samples and returns the result. A non-parametric T-test. Usage: awilcoxont(x,y) where x,y are equal-length arrays for 2 conditions Returns: t-statistic, two-tailed p-value |
The Kruskal-Wallis H-test is a non-parametric ANOVA for 3 or more groups, requiring at least 5 subjects in each group. This function calculates the Kruskal-Wallis H and associated p-value for 3 or more independent samples. Usage: akruskalwallish(*args) args are separate arrays for 3+ conditions Returns: H-statistic (corrected for ties), associated p-value |
Friedman Chi-Square is a non-parametric, one-way within-subjects ANOVA. This function calculates the Friedman Chi-square test for repeated measures and returns the result, along with the associated probability value. It assumes 3 or more repeated measures. Only 3 levels requires a minimum of 10 subjects in the study. Four levels requires 5 subjects per level(??). Usage: afriedmanchisquare(*args) args are separate arrays for 2+ conditions Returns: chi-square statistic, associated p-value |
Returns the (1-tail) probability value associated with the provided chi-square value and df. Heavily modified from chisq.c in Gary Perlman's |Stat. Can handle multiple dimensions. Usage: achisqprob(chisq,df) chisq=chisquare stat., df=degrees of freedom |
Returns the complementary error function erfc(x) with fractional error everywhere less than 1.2e-7. Adapted from Numerical Recipies. Can handle multiple dimensions. Usage: aerfcc(x) |
Returns the area under the normal curve 'to the left of' the given z value. Thus, : for z<0, zprob(z) = 1-tail probability for z>0, 1.0-zprob(z) = 1-tail probability for any z, 2.0*(1.0-zprob(abs(z))) = 2-tail probability Adapted from z.c in Gary Perlman's |Stat. Can handle multiple dimensions. Usage: azprob(z) where z is a z-value |
Returns the probability value for a K-S statistic computed via ks_2samp. Adapted from Numerical Recipes. Can handle multiple dimensions. Usage: aksprob(alam) |
Returns the 1-tailed significance level (p-value) of an F statistic given the degrees of freedom for the numerator (dfR-dfF) and the degrees of freedom for the denominator (dfF). Can handle multiple dims for F. Usage: afprob(dfnum, dfden, F) where usually dfnum=dfbn, dfden=dfwn |
Evaluates the continued fraction form of the incomplete Beta function, betai. (Adapted from: Numerical Recipes in C.) Can handle multiple dimensions for x. Usage: abetacf(a,b,x,verbose=1) |
Returns the natural log of the gamma function of xx, where Gamma(z) = Integral(0,infinity) of t^(z-1)exp(-t) dt. Adapted from: Numerical Recipes in C. Can handle multiple dims ... but probably doesn't normally have to. Usage: agammln(xx) |
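The Numerical Recipes routine named gammln computes the natural logarithm of the gamma function, which avoids overflow for large arguments. The sketch below uses Python's `math.lgamma` as a stand-in to show how a log-gamma value relates to Gamma itself and to the beta function B(a,b); the helper names are hypothetical.

```python
import math

def gamma_via_log(x):
    """Recover Gamma(x) from its logarithm (fine for moderate x)."""
    return math.exp(math.lgamma(x))

def log_beta(a, b):
    """ln B(a,b) = ln G(a) + ln G(b) - ln G(a+b), computed in log space."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

print(gamma_via_log(5))              # Gamma(5) = 4! = 24, up to rounding
print(math.exp(log_beta(2, 3)))      # B(2,3) = 1/12, up to rounding
```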
Returns the incomplete beta function: I-sub-x(a,b) = 1/B(a,b)*(Integral(0,x) of t^(a-1)(1-t)^(b-1) dt) where a,b>0 and B(a,b) = G(a)*G(b)/(G(a+b)) where G(a) is the gamma function of a. The continued fraction formulation is implemented here, using the betacf function. (Adapted from: Numerical Recipes in C.) Can handle multiple dimensions. Usage: abetai(a,b,x,verbose=1) |
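To make the definition concrete, the sketch below evaluates I-sub-x(a,b) by direct midpoint-rule integration rather than by the continued fraction. This is a swapped-in technique for illustration only: the function name is hypothetical, and the sketch is restricted to a,b >= 1 so the integrand has no endpoint singularity.

```python
import math

def betai_by_quadrature(a, b, x, steps=20000):
    """Incomplete beta I_x(a,b) via midpoint-rule integration (a, b >= 1 only)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    if a < 1 or b < 1:
        raise ValueError("this sketch assumes a >= 1 and b >= 1")
    h = x / steps
    integral = 0.0
    for i in range(steps):
        t = (i + 0.5) * h                      # midpoint of the i-th slice
        integral += t ** (a - 1) * (1.0 - t) ** (b - 1)
    integral *= h
    # B(a,b) = G(a)*G(b)/G(a+b), computed through log-gamma
    beta_ab = math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))
    return integral / beta_ab

print(betai_by_quadrature(2, 1, 0.5))   # analytic value is x**2 = 0.25
print(betai_by_quadrature(2, 2, 0.5))   # symmetric case, analytic value 0.5
```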
Calculates a linear model fit ... anova/ancova/lin-regress/t-test/etc. Taken from: Peterson et al. Statistical limitations in functional neuroimaging I. Non-inferential methods and statistical models. Phil Trans Royal Soc Lond B 354: 1239-1260. Usage: aglm(data,para) Returns: statistic, p-value ??? |
Performs a 1-way ANOVA, returning an F-value and probability given any number of groups. From Heiman, pp.394-7. Usage: aF_oneway(*args) where *args is 2 or more arrays, one per treatment group Returns: f-value, probability |
Returns an F-statistic given the following: ER = error associated with the null hypothesis (the Restricted model) EF = error associated with the alternate hypothesis (the Full model) dfR = degrees of freedom associated with the Restricted model dfF = degrees of freedom associated with the Full model |
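The usual model-comparison form of this statistic is F = ((ER-EF)/(dfR-dfF)) / (EF/dfF), where the numerator degrees of freedom (dfR-dfF) and denominator degrees of freedom (dfF) match those described for afprob. A minimal sketch for scalar inputs follows; the function name and the example numbers are made up for illustration.

```python
def f_value_sketch(er, ef, df_r, df_f):
    """F statistic comparing a restricted model against a full model."""
    numerator = (er - ef) / float(df_r - df_f)   # error reduction per df spent
    denominator = ef / float(df_f)               # remaining error per df
    return numerator / denominator

# Hypothetical errors: restricted model 100 on 20 df, full model 60 on 18 df.
print(f_value_sketch(100.0, 60.0, 20, 18))
```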
|
Returns an F-statistic given the following: ER = error associated with the null hypothesis (the Restricted model) EF = error associated with the alternate hypothesis (the Full model) dfR = degrees of freedom associated with the Restricted model dfF = degrees of freedom associated with the Full model, where ER and EF are matrices from a multivariate F calculation. |
Usage: asign(a) Returns: array shape of a, with -1 where a<0 and +1 where a>=0 |
An alternative to the Numeric.add.reduce function, which allows one to (1) collapse over multiple dimensions at once, and/or (2) retain all dimensions in the original array (squashing one down to size). Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). If keepdims=1, the resulting array will have as many dimensions as the input array. Usage: asum(a, dimension=None, keepdims=0) Returns: array summed along 'dimension'(s), same _number_ of dims if keepdims=1 |
Returns an array consisting of the cumulative sum of the items in the passed array. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions, but this last one just barely makes sense). Usage: acumsum(a,dimension=None) |
Squares each value in the passed array, adds these squares & returns the result. Unfortunate function name. :-) Defaults to ALL values in the array. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). Set keepdims=1 to maintain the original number of dimensions. Usage: ass(inarray, dimension=None, keepdims=0) Returns: sum-along-'dimension' for (inarray*inarray) |
Multiplies elements in array1 and array2, element by element, and returns the sum (along 'dimension') of all resulting multiplications. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). A trivial function, but included for completeness. Usage: asummult(array1,array2,dimension=None,keepdims=0) |
Adds the values in the passed array, squares that sum, and returns the result. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). If keepdims=1, the returned array will have the same NUMBER of dimensions as the original. Usage: asquare_of_sums(inarray, dimension=None, keepdims=0) Returns: the square of the sum over dim(s) in dimension |
Takes pairwise differences of the values in arrays a and b, squares these differences, and returns the sum of these squares. Dimension can equal None (ravel array first), an integer (the dimension over which to operate), or a sequence (operate over multiple dimensions). keepdims=1 means the returned array has the same number of dimensions as the inputs (len(a.shape) = len(b.shape)). Usage: asumdiffsquared(a,b) Returns: sum[ravel(a-b)**2] |
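The four sum helpers described above are easy to confuse, so here is a sketch of each on flat Python lists (the dimension=None case). The `_sketch` names are hypothetical, and the dimension and keepdims options are not reproduced.

```python
def ss_sketch(values):
    """Sum of squares: square each value, then add."""
    return sum(v * v for v in values)

def square_of_sums_sketch(values):
    """Square of the sum: add the values, then square."""
    return sum(values) ** 2

def summult_sketch(xs, ys):
    """Sum of element-by-element products."""
    return sum(x * y for x, y in zip(xs, ys))

def sumdiffsquared_sketch(xs, ys):
    """Sum of squared pairwise differences."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys))

data = [1, 2, 3]
print(ss_sketch(data), square_of_sums_sketch(data))   # 14 and 36
# A common use: sum of squared deviations from the mean = ss - (sum)^2 / n
print(ss_sketch(data) - square_of_sums_sketch(data) / float(len(data)))
```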
Shellsort algorithm. Sorts a 1D-array. Usage: ashellsort(inarray) Returns: sorted-inarray, sorting-index-vector (for original array) |
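The pairing of a sorted result with an index vector can be sketched as follows: the index vector records, for each sorted position, where that value sat in the original sequence. This is a generic Shell sort with a halving gap sequence, written for plain lists; the function name is hypothetical and the gap sequence is an assumption.

```python
def shellsort_sketch(values):
    """Return (sorted copy, index vector) where sorted[k] == values[index[k]]."""
    svec = list(values)
    ivec = list(range(len(svec)))
    gap = len(svec) // 2
    while gap > 0:
        # Gapped insertion sort: bubble each element back in steps of 'gap'.
        for i in range(gap, len(svec)):
            j = i
            while j >= gap and svec[j - gap] > svec[j]:
                svec[j - gap], svec[j] = svec[j], svec[j - gap]
                ivec[j - gap], ivec[j] = ivec[j], ivec[j - gap]
                j -= gap
        gap //= 2
    return svec, ivec

data = [3, 1, 2]
sorted_data, index = shellsort_sketch(data)
print(sorted_data, index)
```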
Ranks the data in inarray, dealing with ties appropriately. Assumes a 1D inarray. Adapted from Gary Perlman's |Stat ranksort. Usage: arankdata(inarray) Returns: array of length equal to inarray, containing rank scores |
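One common way to deal with ties is to give each tied value the mean of the ranks the group would otherwise occupy. The sketch below assumes that convention (which may or may not match |Stat ranksort exactly) and uses a hypothetical function name.

```python
def rankdata_sketch(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

print(rankdata_sketch([10, 20, 20, 30]))   # the two 20s share rank 2.5
```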
Returns a binary vector, 1=within-subject factor, 0=between. Input equals the entire data array (i.e., column 1=random factor, last column = measured values). Usage: afindwithin(data) data in |Stat format |
|
__version__
|
N
|
LA
|
geometricmean
|
harmonicmean
|
mean
|
median
|
medianscore
|
mode
|
tmean
|
tvar
|
tstdev
|
tsem
|
moment
|
variation
|
skew
|
kurtosis
|
describe
|
skewtest
|
kurtosistest
|
normaltest
|
itemfreq
|
scoreatpercentile
|
percentileofscore
|
histogram
|
cumfreq
|
relfreq
|
obrientransform
|
samplevar
|
samplestdev
|
signaltonoise
|
var
|
stdev
|
sterr
|
sem
|
z
|
zs
|
threshold
|
trimboth
|
trim1
|
paired
|
pearsonr
|
spearmanr
|
pointbiserialr
|
kendalltau
|
linregress
|
ttest_1samp
|
ttest_ind
|
ttest_rel
|
chisquare
|
ks_2samp
|
mannwhitneyu
|
tiecorrect
|
ranksums
|
wilcoxont
|
kruskalwallish
|
friedmanchisquare
|
chisqprob
|
zprob
|
ksprob
|
fprob
|
betacf
|
betai
|
erfcc
|
gammln
|
F_oneway
|
F_value
|
incr
|
sum
|
cumsum
|
ss
|
summult
|
square_of_sums
|
sumdiffsquared
|
shellsort
|
rankdata
|
findwithin
|
Generated by Epydoc 3.0alpha3 on Fri Dec 22 20:11:38 2006 | http://epydoc.sourceforge.net |