Probability and statistics formula sheet
Symbols
Remember that \( \sum_{i=1}^{i=n} \) or \( \sum_{i=1}^{n} \) denotes the sum of a sequence of numbers, in this case from \(1\) to \(n\). Like this: \( \sum_{i=m}^{n} a_{i} = a_{m} + a_{m+1} + a_{m+2} + \cdots + a_{n-1} + a_{n} \).
| name | symbol |
| --- | --- |
| class amplitude | $$A$$ |
| class mark | $$CM$$ |
| event | $$E$$ |
| sample size | $$N,\ n$$ |
| absolute cumulative frequency | $$N_{i}$$ |
| absolute frequency | $$n_{i}$$ |
| relative frequency | $$f_{i}$$ |
| relative cumulative frequency | $$F_{i}$$ |
| mean absolute deviation | $$MD$$ |
| probability of an event | $$P\left( E \right)$$ |
| probability of the complement of an event | $$P\left( E^{\complement} \right)$$ |
| union of probabilities | $$P\left( A \cup B \right)$$ |
| intersection of probabilities | $$P\left( A \cap B \right)$$ |
| (conditional) probability of \(A\) given \(B\) | $$P\left( A \vert B \right)$$ |
| range | $$R$$ |
| sample space | $$S$$ |
| standard deviation (sample) | $$s$$ |
| variance (sample) | $$s^{2}$$ |
| sample elements | $$X_{i}$$ |
| average (sample) | $$\bar{x}$$ |
| median | $$\tilde{x}$$ |
| mode | $$\hat{x}$$ |
| value | $$x_{i}$$ |
| Fisher's moment coefficient of skewness | $$\gamma_{1}$$ |
| average (population) | $$\mu$$ |
| k-th central moment | $$\mu_{k}$$ |
| standard deviation (population) | $$\sigma$$ |
| variance (population) | $$\sigma^{2}$$ |
| sample space | $$\Omega$$ |
| impossible event | $$\emptyset$$ |
Statistics
| name | equation |
| --- | --- |
| ceiling function | $$\left\lceil -1.5 \right\rceil = -1,\quad \left\lceil 1.5 \right\rceil = 2$$ |
| floor function | $$\left\lfloor -1.5 \right\rfloor = -2,\quad \left\lfloor 1.5 \right\rfloor = 1$$ |
| absolute frequency | $$n_{i}$$ |
| average | $$\bar{x} = \frac{1}{n} \sum_{i = 1}^{n} x_{i}$$ |
| class mark | $$CM = \frac{\mathrm{Upper\ Limit} + \mathrm{Lower\ Limit}}{2}$$ |
| relative frequency | $$f_{i} = \frac{n_{i}}{n}$$ |
| absolute cumulative frequency | $$N_{i} = \sum_{j\leq i} n_{j}$$ |
| relative cumulative frequency | $$F_{i} = \sum_{j \leq i} \frac{n_{j}}{n}$$ |
| median | $$\tilde{x} = \begin{cases} x_{(n+1)/2} & n \text{ is odd} \\ \frac{x_{(n/2)} + x_{(n/2)+1}}{2} & n \text{ is even}\end{cases}$$ |
| median (grouped data) | $$\tilde{x} = L_{\tilde{x}} + A \left( \frac{\frac{n}{2} - F_{\tilde{x} - 1}}{f_{\tilde{x}}} \right)$$ |
| mode | $$\hat{x} = \operatorname{argmax}_{x_{i}} n_{i}$$ (the value with the highest absolute frequency) |
| arguments of the maximum | $$\begin{split}\operatorname{argmax}_{S} f &:= \underset{x \in S}{\operatorname{argmax}} f\left( x \right) \\ &:= \left\lbrace x \in S \mid f(s) \leq f(x)\ \forall s \in S \right\rbrace\end{split}$$ That is, the set of elements \( x \) of \( S \) at which \( f \) attains its maximum: \( f(s) \leq f(x) \) for every \( s \) in \( S \). |
| mode (grouped data) | $$\hat{x} = L_{\hat{x}} + A \left( \frac{f_{\hat{x}} - f_{\hat{x} - 1}}{\left( f_{\hat{x}} - f_{\hat{x} - 1} \right) + \left( f_{\hat{x}} - f_{\hat{x} + 1} \right)} \right)$$ |
| range | $$R = x_{max} - x_{min}$$ |
| variance | $$s^{2} = \frac{1}{n-1} \sum_{i = 1}^{n} \left(x_{i} - \bar{x}\right)^{2}$$ |
| variance (less rounding error) | $$s^{2} = \frac{1}{n-1} \left\lbrack \sum_{i = 1}^{n} x_{i}^{2} - \frac{1}{n} \left(\sum_{i = 1}^{n}x_{i}\right)^{2} \right\rbrack$$ |
| standard deviation | $$s \equiv \sqrt{s^{2}}$$ |
| sample average | $$\bar{x} = \sum_{i = 1}^{m} x_{i} f \left( x_{i} \right)$$ |
| sample variance | $$s^{2} = \frac{n}{n-1} \sum_{i = 1}^{m} \left(x_{i} - \bar{x}\right)^{2}f\left( x_{i} \right)$$ |
| sample variance (less rounding error) | $$s^{2} = \frac{1}{n-1} \left\lbrace \sum_{i = 1}^{m} x_{i}^{2}\, n f\left( x_{i} \right) - \frac{1}{n} \left\lbrack \sum_{i = 1}^{m} x_{i}\, n f \left( x_{i} \right)\right\rbrack^{2} \right\rbrace$$ |
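As a sanity check on the point statistics above, here is a minimal Python sketch (the data and variable names are my own, chosen for illustration) computing the average, median, mode, and sample variance both in its definitional form and in the "less rounding error" form; both variance forms should agree:

```python
from collections import Counter

data = [4, 1, 3, 4, 2, 4, 3]
n = len(data)

# average: x̄ = (1/n) Σ x_i
mean = sum(data) / n

# median: middle value for odd n, average of the two middle values for even n
s = sorted(data)
median = s[n // 2] if n % 2 == 1 else (s[n // 2 - 1] + s[n // 2]) / 2

# mode: argmax over values x_i of the absolute frequency n_i
mode = Counter(data).most_common(1)[0][0]

# sample variance: definitional form vs. the "less rounding error" form
var_def = sum((x - mean) ** 2 for x in data) / (n - 1)
var_alt = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)
```

For this data set the mean is 3.0, the median 3, the mode 4, and the two variance formulas agree algebraically (the second merely avoids subtracting the mean from each term).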
Measures of statistical dispersion

| | Mean Absolute Deviation | Standard deviation | Variance |
| --- | --- | --- | --- |
| Individual data | $$MD = \frac{\sum_{i = 1}^{n}\vert x_{i} - \bar{x} \vert}{n}$$ | $$\sigma = \sqrt{\frac{\sum_{i = 1}^{n}\left( x_{i} - \bar{x} \right)^{2}}{n}}$$ | $$\sigma^{2} = \frac{\sum_{i = 1}^{n}\left( x_{i} - \bar{x} \right)^{2}}{n}$$ |
| Frequency distribution | $$MD = \frac{\sum_{i = 1}^{n} f_{i} \cdot \vert x_{i} - \bar{x} \vert}{n}$$ | $$\sigma = \sqrt{\frac{\sum_{i = 1}^{n} f_{i} \left( x_{i} - \bar{x} \right)^{2}}{n}}$$ | $$\sigma^{2} = \frac{\sum_{i = 1}^{n} f_{i} \left( x_{i} - \bar{x} \right)^{2}}{n}$$ |
| Grouped data | $$MD = \frac{\sum_{i = 1}^{n} f_{CM_{i}} \cdot \vert CM_{i} - \bar{x} \vert}{n}$$ | $$\sigma = \sqrt{\frac{\sum_{i = 1}^{n} f_{CM_{i}} \left( CM_{i} - \bar{x} \right)^{2}}{n}}$$ | $$\sigma^{2} = \frac{\sum_{i = 1}^{n} f_{CM_{i}} \left( CM_{i} - \bar{x} \right)^{2}}{n}$$ |
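The individual-data column above can be sketched directly in Python (an illustration with made-up data; note the divisor here is \(n\), the population convention used in this table):

```python
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n

# mean absolute deviation: MD = Σ|x_i − x̄| / n
md = sum(abs(x - mean) for x in data) / n

# population variance and standard deviation: divisor n, not n − 1
var = sum((x - mean) ** 2 for x in data) / n
std = var ** 0.5
```

For this data set the mean is 5.0, MD is 1.5, the variance 4.0, and the standard deviation 2.0.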
Quantiles

| | Quartiles | Deciles | Percentiles |
| --- | --- | --- | --- |
| Position | $$q_{i} = \left( n + 1 \right) \frac{i}{4},\ i = 0, 1, \ldots, 4$$ | $$d_{i} = \left( n + 1 \right) \frac{i}{10},\ i = 0, 1, \ldots, 10$$ | $$p_{i} = \left( n + 1 \right) \frac{i}{100},\ i = 0, 1, \ldots, 100$$ |
| Value | $$Q_{i} = x_{\left\lfloor q_{i} \right\rfloor} + \left( q_{i} - \left\lfloor q_{i} \right\rfloor \right) \left( x_{\left\lfloor q_{i} \right\rfloor + 1} - x_{\left\lfloor q_{i} \right\rfloor} \right)$$ | $$D_{i} = x_{\left\lfloor d_{i} \right\rfloor} + \left( d_{i} - \left\lfloor d_{i} \right\rfloor \right) \left( x_{\left\lfloor d_{i} \right\rfloor + 1} - x_{\left\lfloor d_{i} \right\rfloor} \right)$$ | $$P_{i} = x_{\left\lfloor p_{i} \right\rfloor} + \left( p_{i} - \left\lfloor p_{i} \right\rfloor \right) \left( x_{\left\lfloor p_{i} \right\rfloor + 1} - x_{\left\lfloor p_{i} \right\rfloor} \right)$$ |
| Range | $$IQR = Q_{3} - Q_{1}$$ | $$IDR = D_{9} - D_{1}\ \mathrm{(most\ common)},\quad IDR = D_{b} - D_{a}$$ | $$IPR = P_{90} - P_{10}\ \mathrm{(most\ common)},\quad IPR = P_{b} - P_{a}$$ |
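The Position and Value rows combine into one interpolation routine. Below is a small Python sketch of that convention (the \((n+1)\)-position rule with linear interpolation between the bracketing 1-based order statistics; the function name and the clamping at the ends are my own choices for when \(q\) falls outside the data):

```python
def quantile(data, i, m):
    """Value at position q = (n + 1) * i / m, interpolating linearly
    between x_⌊q⌋ and x_⌊q⌋+1 (1-based order statistics).
    m = 4 gives quartiles, m = 10 deciles, m = 100 percentiles."""
    s = sorted(data)
    q = (len(s) + 1) * i / m
    lo = int(q)          # ⌊q⌋
    frac = q - lo        # fractional part of the position
    if lo < 1:           # position before the first value: clamp (assumption)
        return s[0]
    if lo >= len(s):     # position after the last value: clamp (assumption)
        return s[-1]
    return s[lo - 1] + frac * (s[lo] - s[lo - 1])

data = [15, 20, 35, 40, 50]
q1 = quantile(data, 1, 4)   # first quartile
q2 = quantile(data, 2, 4)   # second quartile = median
iqr = quantile(data, 3, 4) - q1
```

Note that other quantile conventions exist (exclusive vs. inclusive, different plotting positions); this sketch implements only the \((n+1)\) rule stated in the table.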
Histogram
Number of bins and width

| name | equation | notes |
| --- | --- | --- |
| bins from width | $$k = \left\lceil \frac{\max x - \min x}{h} \right\rceil$$ | $$k = \mathrm{number\ of\ bins},\ h = \mathrm{bin\ width}$$ |
| Square-root choice | $$k = \left\lceil \sqrt{n} \right\rceil$$ | |
| Sturges' formula | $$k = \left\lceil \log_{2} n \right\rceil + 1 = \left\lceil \frac{\log_{10} n}{\log_{10} 2} \right\rceil + 1$$ | Derived from a binomial distribution; implicitly assumes an approximately normal distribution. |
| Rice rule | $$k = \left\lceil 2\sqrt[3]{n} \right\rceil$$ | Alternative to Sturges' rule. |
| Doane's formula | $$k = 1 + \log_{2} n + \log_{2}\left( 1 + \frac{\vert g_{1} \vert}{\sigma_{g_{1}}} \right),\quad \sigma_{g_{1}} = \sqrt{\frac{6\left( n - 2 \right)}{\left( n + 1 \right)\left( n + 3 \right)}}$$ | Modification of Sturges' formula which attempts to improve its performance with non-normal data; \(g_{1}\) is the estimated 3rd-moment skewness of the distribution. |
| Scott's normal reference rule | $$h = \frac{3.5 \hat{\sigma}}{\sqrt[3]{n}}$$ | Where \(\hat{\sigma}\) is the sample standard deviation. |
| Freedman–Diaconis' choice | $$h = 2\frac{\mathrm{IQR}\left( x \right)}{\sqrt[3]{n}}$$ | Replaces the \(3.5\hat{\sigma}\) of Scott's rule with \(2\,\mathrm{IQR}\), which is less sensitive than the standard deviation to outliers in the data. |
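The bin-count rules above translate almost one-to-one into Python; a small sketch of the three simplest (function names are my own):

```python
import math

def sqrt_choice(n):
    # k = ⌈√n⌉
    return math.ceil(math.sqrt(n))

def sturges(n):
    # k = ⌈log₂ n⌉ + 1, as stated in the table above
    return math.ceil(math.log2(n)) + 1

def rice(n):
    # k = ⌈2 ∛n⌉
    return math.ceil(2 * n ** (1 / 3))

n = 100
bins = {"sqrt": sqrt_choice(n), "sturges": sturges(n), "rice": rice(n)}
```

For \(n = 100\) these give 10, 8, and 10 bins respectively, which shows how much the rules can disagree even on modest sample sizes.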
Probability

| name | equation |
| --- | --- |
| probability of an event | $$P\left( E \right) = \lim_{n \to \infty} f\left( E \right) = \lim_{n \to \infty} \frac{n_{E}}{n}$$ |
| union of probabilities | $$P\left(A \cup B\right) = \begin{cases} P\left(A\right) + P\left(B\right) - P\left(A \cap B\right) & A \cap B \neq \emptyset \\ P\left(A\right) + P\left(B\right) & A \cap B = \emptyset\end{cases}$$ |
| complement probability | $$P\left( E^{\complement} \right) = P\left( \neg E \right) = 1 - P\left( E \right)$$ |
| intersection of disjoint events | $$P\left( A \cap B \right) = 0$$ |
| intersection of independent events | $$P\left( A \cap B \right) = P\left( A \right)P\left( B \right)$$ |
| intersection of dependent events | $$P\left( A \cap B \right) = P\left( B \vert A \right) P\left( A \right) = P\left( A \vert B \right) P\left( B \right)$$ $$P\left( A \cap B^{\complement} \right) = P\left( B^{\complement} \vert A \right) P\left( A \right) = P\left( A \vert B^{\complement} \right) P\left( B^{\complement} \right)$$ $$P\left( A^{\complement} \cap B \right) = P\left( B \vert A^{\complement} \right) P\left( A^{\complement} \right) = P\left( A^{\complement} \vert B \right) P\left( B \right)$$ $$P\left( A^{\complement} \cap B^{\complement} \right) = P\left( B^{\complement} \vert A^{\complement} \right) P\left( A^{\complement} \right) = P\left( A^{\complement} \vert B^{\complement} \right) P\left( B^{\complement} \right)$$ |
| complement of dependent events | $$P\left( B \vert A \right) = 1 - P\left( B^{\complement} \vert A \right)$$ $$P\left( B \vert A^{\complement} \right) = 1 - P\left( B^{\complement} \vert A^{\complement} \right)$$ |
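A concrete check of the union and complement rules, using one roll of a fair die (my own example; `Fraction` keeps the arithmetic exact):

```python
from fractions import Fraction as F

# One fair die: A = "even number", B = "greater than 3"
p_a = F(3, 6)            # {2, 4, 6}
p_b = F(3, 6)            # {4, 5, 6}
p_a_and_b = F(2, 6)      # {4, 6}, so A ∩ B ≠ ∅

# inclusion-exclusion: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
p_union = p_a + p_b - p_a_and_b

# complement: P(¬A) = 1 − P(A)
p_not_a = 1 - p_a
```

Counting directly, \(A \cup B = \{2, 4, 5, 6\}\), i.e. \(4/6 = 2/3\), matching the inclusion-exclusion result.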
Permutations and Combinations
Remember, while permutations are ordered, combinations are not.

| name | equation |
| --- | --- |
| permutations of \(n\) different things | $$n! = 1 \cdot 2 \cdot 3 \cdots n$$ |
| permutations of \(n\) total elements of which only \(r\) are distinct, the \(i\)-th distinct element being repeated \(n_{i}\) times | $$\frac{n!}{n_{1}!n_{2}!\cdots n_{r}!}$$ |
| permutations of \(n\) different elements arranged in a circular manner | $$\left(n - 1\right)!$$ |
| permutations of a set of \(n\) different elements taking one subset of \(k\) chosen elements without repetition | $$\frac{n!}{\left( n-k\right)!}$$ |
| permutations of a set of \(n\) different elements taking one subset of \(k\) chosen elements with repetition | $$n^{k}$$ |
| combinations of a set of \(n\) different elements taking one subset of \(k\) chosen elements without repetition | $${n \choose k} = \frac{n!}{k!\left( n-k\right)!}$$ |
| combinations of a set of \(n\) different elements taking one subset of \(k\) chosen elements with repetition | $${n + k - 1 \choose k} = \frac{\left( n + k - 1\right)!}{k!\left( n - 1\right)!}$$ |
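Python's standard library covers most of these counts directly; the remaining two rows take one line each (a sketch, with `comb_with_rep` and the word example being my own):

```python
import math
from collections import Counter

n_perm = math.perm(5, 2)     # k-permutations without repetition: 5!/(5-2)!
n_comb = math.comb(5, 2)     # combinations without repetition: C(5, 2)

def comb_with_rep(n, k):
    # combinations with repetition: C(n + k - 1, k)
    return math.comb(n + k - 1, k)

# permutations with identical elements, e.g. rearrangements of "MISSISSIPPI":
# 11! / (1! · 4! · 4! · 2!) for M, I, S, P
word = "MISSISSIPPI"
multiset_perms = math.factorial(len(word))
for count in Counter(word).values():
    multiset_perms //= math.factorial(count)
```

`math.perm(5, 2)` is 20 and `math.comb(5, 2)` is 10, matching the formulas in the table; the "MISSISSIPPI" count comes out to 34650.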
Bayesian probability

| name | equation |
| --- | --- |
| conditional probability of \(A\) given \(B\) | $$P\left( A \vert B \right) = \frac{P\left( A \cap B \right)}{P\left( B \right)}$$ |
| Bayes' theorem (special case) | $$P\left( A \vert B \right) = \frac{P\left( B \vert A\right) P\left(A\right)}{P\left( B \right)} = \frac{P\left( B \vert A \right)P\left( A \right)}{P\left( B \vert A \right)P\left( A \right) + P\left(B \vert A^{\complement}\right) P\left( A^{\complement} \right)}$$ |
| Bayes' theorem (general) | $$P\left( A_{k} \vert B \right) = \frac{P\left( B \vert A_{k} \right)P\left( A_{k} \right)}{\sum_{j}P\left( B \vert A_{j} \right)P\left(A_{j} \right)}$$ |
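The special case of Bayes' theorem, with the denominator expanded over \(A\) and \(A^{\complement}\), maps directly to a few lines of Python. The screening-test numbers below are made up purely for illustration:

```python
# Hypothetical screening test (numbers invented for the example):
p_d = 0.01           # P(D): prevalence of the condition
p_pos_d = 0.95       # P(+|D): probability of a positive test given D
p_pos_not_d = 0.05   # P(+|¬D): false-positive rate

# denominator: P(+) = P(+|D)P(D) + P(+|¬D)P(¬D)
p_pos = p_pos_d * p_d + p_pos_not_d * (1 - p_d)

# Bayes' theorem: P(D|+) = P(+|D)P(D) / P(+)
p_d_pos = p_pos_d * p_d / p_pos
```

Even with a sensitive test, the low prior \(P(D) = 0.01\) keeps \(P(D \mid +)\) near 0.16 here, which is the usual cautionary point about base rates.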
Probability distributions

| name | equation |
| --- | --- |
| probability function (for discrete variables) | $$f\left( x \right) = P\left( X = x \right)$$ |
| probability density (for continuous variables) | $$f\left( x \right) = \frac{d F\left( x \right)}{dx}$$ |
| k-th raw moment (discrete) | $$E\left( x^{k} \right) = \sum_{i}x_{i}^{k}f\left( x_{i} \right)$$ |
| k-th raw moment (continuous) | $$E\left( x^{k} \right) = \int_{-\infty}^{\infty} x^{k} f\left( x \right)dx$$ |
| k-th central moment (discrete) | $$\mu_{k} = E\left\lbrack \left( x - \mu \right)^{k} \right\rbrack = \sum_{i} \left(x_{i} - \mu \right)^{k}f\left( x_{i} \right)$$ |
| k-th central moment (continuous) | $$\mu_{k} = E\left\lbrack \left( x - \mu \right)^{k} \right\rbrack = \int_{-\infty}^{\infty} \left( x - \mu \right)^{k} f\left( x \right)dx$$ |
| Fisher's moment coefficient of skewness | $$\gamma_{1} = \frac{\mu_{3}}{\sigma^{3}}$$ |
| Moment-generating function (discrete) | $$G\left( t \right) = E\left( e^{tx} \right) = \sum_{i} e^{t x_{i}}f\left( x_{i} \right)$$ |
| Moment-generating function (continuous) | $$G\left( t \right) = E\left( e^{tx} \right) = \int_{-\infty}^{\infty} e^{t x}f\left( x \right) dx$$ |
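For a discrete distribution the raw moments, central moments, and \(\gamma_{1}\) above reduce to weighted sums; a short Python sketch (the pmf here is a made-up example):

```python
# Hypothetical pmf given as value → probability pairs
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

def raw_moment(pmf, k):
    # E(x^k) = Σ x_i^k f(x_i)
    return sum(x ** k * p for x, p in pmf.items())

mu = raw_moment(pmf, 1)   # the mean is the first raw moment

def central_moment(pmf, k):
    # μ_k = Σ (x_i − μ)^k f(x_i)
    return sum((x - mu) ** k * p for x, p in pmf.items())

var = central_moment(pmf, 2)                   # μ₂ = σ²
gamma1 = central_moment(pmf, 3) / var ** 1.5   # γ₁ = μ₃ / σ³
```

For this pmf, \(\mu = 1.1\), \(\sigma^{2} = 0.49\), and \(\gamma_{1} \approx -0.14\): the small negative skewness reflects the slightly longer left tail.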
See also
Bayes' Theorem
Box Whisker Diagram
Probability distributions