Week2 - Probability Review
2.1 - Univariate random variable
Discrete random variable
Can only take a finite or countably infinite number of values. Its pdf \(p(x) = Pr(X = x)\) on the sample space \(S_X\) satisfies:
\(\forall x \in S_X : 0 \le p(x) \le 1\)
\(\forall x \notin S_X: p(x) = 0\)
\(\sum\limits_{x \in S_X} p(x) = 1\)
Bernoulli distribution
We note X=1 on success and X=0 on failure.
\(Pr(X=1)=\pi\) and \(Pr(X=0)=1-\pi\)
Then we have the pdf: \(p(x)=Pr(X=x)=\pi^x(1-\pi)^{1-x}\) for \(x \in \{0,1\}\)
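As a quick check of the Bernoulli pdf, here is a minimal Python sketch. Python is used only for illustration; the function name `bernoulli_pmf` and the value \(\pi = 0.3\) are assumptions, not part of the course material. The failure term carries the exponent \(1-x\), so that \(p(0) = 1-\pi\) and the two probabilities sum to 1.

```python
def bernoulli_pmf(x: int, pi: float) -> float:
    """p(x) = pi^x * (1 - pi)^(1 - x) for x in {0, 1}, and 0 elsewhere."""
    if x not in (0, 1):
        return 0.0
    return pi**x * (1 - pi) ** (1 - x)


pi = 0.3  # assumed success probability, for illustration only
probs = {x: bernoulli_pmf(x, pi) for x in (0, 1)}
total = sum(probs.values())

print(probs)  # p(0) = 1 - pi, p(1) = pi
print(total)  # the probabilities sum to 1
```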
Continuous random variable
In that case we have a probability curve \(f(x)\)
And we can measure probability on intervals A: \(Pr(X \in A) = \int_A f(x) dx\)
\(\forall x: f(x) \ge 0\) and \(\int_{-\infty}^\infty f(x) dx = 1\)
2.2 - Cumulative Distribution Function
The CDF F of a rv X is: \(F(x) = Pr(X \le x)\)
\(x_1 \lt x_2 \Rightarrow F(x_1) \le F(x_2)\)
\(F(-\infty) = 0\) and \(F(\infty) = 1\)
\(Pr(X \gt x) = 1 - F(x)\)
\(Pr(x_1 \lt X \le x_2) = F(x_2) - F(x_1)\)
\(\frac{d}{dx} F(x) = f(x)\) if X is a continuous rv.
2.3 - Quantiles
Given a rv X with continuous CDF \(F_X(x) = Pr(X \le x)\): the \(\alpha \cdot 100\%\) quantile of \(F_X\) for \(\alpha \in [0,1]\) is the value \(q_\alpha\) such that \(F_X(q_\alpha) = Pr(X \le q_\alpha) = \alpha\).
The area to the left of \(q_\alpha\) is \(\alpha\) under the probability curve.
If the inverse CDF function exists, then: \(q_\alpha = F_X^{-1}(\alpha)\)
The 50% quantile is also called the median
For a dist U[0,1] for instance we have \(F(x)=x \Rightarrow q_\alpha=\alpha\)
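When no closed-form inverse CDF is at hand, a quantile can be found numerically by solving \(F_X(q_\alpha) = \alpha\). The sketch below uses plain bisection and checks it on the U[0,1] case, where \(q_\alpha = \alpha\). The helper names `quantile` and `uniform_cdf` are hypothetical, and bisection is just one possible root-finding choice.

```python
def quantile(cdf, alpha, lo, hi, tol=1e-10):
    """Solve cdf(q) = alpha on [lo, hi] by bisection.

    Assumes cdf is continuous and non-decreasing, with
    cdf(lo) <= alpha <= cdf(hi).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < alpha:
            lo = mid  # the quantile lies to the right of mid
        else:
            hi = mid  # the quantile lies at or to the left of mid
    return (lo + hi) / 2


def uniform_cdf(x):
    """CDF of U[0,1]: F(x) = x on [0, 1], clipped outside."""
    return min(max(x, 0.0), 1.0)


median = quantile(uniform_cdf, 0.5, 0.0, 1.0)
q25 = quantile(uniform_cdf, 0.25, 0.0, 1.0)
print(median, q25)  # close to 0.5 and 0.25
```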
2.4 - Standard normal distribution
\[\Phi(x) = Pr(X \le x) = \int_{-\infty}^x \phi(z)dz\]
\[Pr(-1 \le X \le 1) \approx 0.68\]
\[Pr(-2 \le X \le 2) \approx 0.95\]
\[Pr(-3 \le X \le 3) \approx 0.997\]
In Excel:
In R:
We use pnorm to compute \(\Phi(z)\)
We use qnorm to compute \(\Phi^{-1}(\alpha)\)
We use dnorm to compute \(\phi(z)\)
\[Pr(X\le z) = 1 - Pr(X \ge z)\]
\[Pr(X\ge z) = Pr(X \le -z)\]
\[Pr(X\ge 0) = Pr(X \le 0) = 0.5\]
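The same quantities can be checked in Python with the standard-library class `statistics.NormalDist`, whose `cdf`, `inv_cdf` and `pdf` methods play the roles of R's pnorm, qnorm and dnorm. This is only an illustrative sketch; the helper name `prob_within` is an assumption.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_within(k):
    """Pr(-k <= Z <= k) = Phi(k) - Phi(-k)."""
    return Z.cdf(k) - Z.cdf(-k)


for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))  # roughly 0.6827, 0.9545, 0.9973

print(Z.inv_cdf(0.975))  # Phi^{-1}(0.975), about 1.96
print(Z.pdf(0.0))        # phi(0) = 1 / sqrt(2 * pi), about 0.3989

# Symmetry: Pr(Z >= z) = Pr(Z <= -z)
z = 1.3
upper_tail = 1 - Z.cdf(z)
lower_tail = Z.cdf(-z)
```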
2.5 - Expected Value and Standard Deviation
Shape characteristics of pdfs
Expected Value or Mean: Center of mass
Variance and standard deviation: spread about mean
Skewness: symmetry about mean
Kurtosis: Tail thickness
Expected value
Variance and Standard Deviation
\(g(X) = (X - E[X])^2 = (X - \mu_X)^2\)
\(Var(X) = \sigma_X^2 = E[g(X)] = E[(X-\mu_X)^2] = E[X^2] - \mu_X^2\)
\(SD(X) = \sigma_X = \sqrt{Var(X)}\)
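To see that the two variance formulas agree, here is a small Python sketch for a discrete rv, using \(E[X] = \sum_{x \in S_X} x \cdot p(x)\). The Bernoulli pdf with \(\pi = 0.3\) and the function names are assumptions chosen for illustration.

```python
import math


def expected_value(pmf):
    """E[X] = sum of x * p(x) over the sample space."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = E[(X - mu)^2], the definition."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def variance_shortcut(pmf):
    """Var(X) = E[X^2] - mu^2, the computational shortcut."""
    mu = expected_value(pmf)
    return sum(x**2 * p for x, p in pmf.items()) - mu**2


pmf = {0: 0.7, 1: 0.3}  # assumed Bernoulli pdf with pi = 0.3
mu = expected_value(pmf)
var = variance(pmf)
sd = math.sqrt(var)
print(mu, var, variance_shortcut(pmf), sd)
```

For a Bernoulli rv both formulas reduce to \(\pi(1-\pi)\), here \(0.3 \cdot 0.7 = 0.21\).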
2.6 - General Normal Distribution
\[f(x) = \frac{1}{\sqrt{2\pi \sigma_X^2}} exp\left( - \frac 12 \left(\frac{x-\mu_X}{\sigma_X} \right)^2\right)\]
Finding areas under General Normal Curve
In Excel:
NORMDIST(x, mu_X, sigma_X, cumulative): if cumulative is true, computes \(Pr(X \le x)\), otherwise computes the density \(f(x) = \frac{1}{\sqrt{2\pi \sigma_X^2}} exp\left( - \frac 12 \left(\frac{x-\mu_X}{\sigma_X} \right)^2\right)\)
NORMINV(alpha, mu, sigma) computes \(q_\alpha = \mu_X + \sigma_X \cdot z_\alpha\)
In R:
simulate data: rnorm(n,mean,sd)
compute CDF: pnorm(q, mean, sd)
compute quantiles: qnorm(p,mean,sd)
compute density: dnorm(x,mean, sd)
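For readers without R or Excel, the four operations can be sketched in Python with `statistics.NormalDist`. The parameters \(\mu_X = 0.05\), \(\sigma_X = 0.10\) and the seed are assumptions for illustration. The sketch also checks the relation \(q_\alpha = \mu_X + \sigma_X \cdot z_\alpha\).

```python
from statistics import NormalDist

mu_x, sigma_x = 0.05, 0.10  # assumed parameters, illustration only
X = NormalDist(mu_x, sigma_x)
Z = NormalDist()            # standard normal

draws = X.samples(5, seed=123)  # simulate data (like rnorm)
cdf_at_0 = X.cdf(0.0)           # Pr(X <= 0) (like pnorm)
density_at_mu = X.pdf(mu_x)     # f(mu_X) (like dnorm)

alpha = 0.05
q_direct = X.inv_cdf(alpha)                 # quantile (like qnorm)
q_via_z = mu_x + sigma_x * Z.inv_cdf(alpha)  # mu_X + sigma_X * z_alpha

print(draws)
print(cdf_at_0, density_at_mu)
print(q_direct, q_via_z)  # both about -0.1145
```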
2.7 - Standard deviation as measure of risk
For return rates, if we consider \(R_A \sim N(\mu_A,\sigma_A^2)\) and \(R_B \sim N(\mu_B,\sigma_B^2)\), then a higher expected return typically comes with higher risk: if \(\mu_A > \mu_B\), we will usually also find that \(\sigma_A > \sigma_B\).
2.8 - Normal Distribution: Appropriate for simple returns?
The Log-Normal Distribution
\(X \sim N(\mu_X,\sigma_X^2), -\infty \lt X \lt \infty\)
Then we can define \(Y = exp(X) \sim lognormal(\mu_X,\sigma_X^2), 0 \lt Y \lt \infty\)
\(E[Y] = \mu_Y = exp(\mu_X + \frac{\sigma_X^2}{2})\)
\(Var[Y] = \sigma_Y^2 = exp(2\mu_X + \sigma_X^2)(exp(\sigma_X^2)-1)\)
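The two moment formulas can be sanity-checked by simulation. The sketch below draws \(X \sim N(\mu_X,\sigma_X^2)\), exponentiates, and compares sample moments with the closed forms. The parameter values, the seed and the sample size are assumptions; the sample moments only approximate the formulas.

```python
import math
import random


def lognormal_mean(mu_x, sigma_x):
    """E[Y] = exp(mu_X + sigma_X^2 / 2)."""
    return math.exp(mu_x + sigma_x**2 / 2)


def lognormal_var(mu_x, sigma_x):
    """Var[Y] = exp(2 mu_X + sigma_X^2) * (exp(sigma_X^2) - 1)."""
    return math.exp(2 * mu_x + sigma_x**2) * (math.exp(sigma_x**2) - 1)


mu_x, sigma_x = 0.05, 0.10  # assumed parameters
rng = random.Random(42)     # fixed seed for reproducibility
n = 100_000

# Y = exp(X) with X ~ N(mu_X, sigma_X^2); every draw is strictly positive
draws = [math.exp(rng.gauss(mu_x, sigma_x)) for _ in range(n)]
sample_mean = sum(draws) / n
sample_var = sum((y - sample_mean) ** 2 for y in draws) / (n - 1)

print(lognormal_mean(mu_x, sigma_x), sample_mean)
print(lognormal_var(mu_x, sigma_x), sample_var)
```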
2.9 - Skewness and Kurtosis
Skewness - Measure of symmetry
\(g(X) = ((X - \mu_X)/\sigma_X)^3\)
\(Skew(X) = E\left[ \left(\frac{X - \mu_X}{\sigma_X} \right)^3 \right]\)
Skew(X)>0 is when we have a long “right tail”, i.e. the main “blob” is on the left.
Skew(X)<0 is when we have a long “left tail”, i.e. the main “blob” is on the right.
For symmetric distributions Skew(X)=0
For the log-normal rv \(Y = exp(X)\) with \(X \sim N(\mu_X,\sigma_X^2)\):
\[Skew(Y) = (exp(\sigma_X^2) +2) \sqrt{exp(\sigma_X^2) -1} \gt 0\]
Kurtosis - Measure of tail thickness
2.10 - Student's-t Distribution
Similar to the normal distribution but with fatter tails (i.e. larger kurtosis).
It has an additional parameter called the degrees of freedom, denoted \(v\).
We note \(X \sim t_v\), and the pdf is:
\[f(x) = \frac{\Gamma(\frac{v+1}{2})}{\sqrt{v\pi}\,\Gamma(\frac v2)} \left( 1 + \frac{x^2}{v}\right)^{- \frac{v+1}{2}}, ~~ -\infty \lt x \lt \infty, ~~ v > 0 \]
With \(\Gamma(z) = \int_0^\infty t^{z-1}e^{-t}dt\) denoting the gamma function.
When \(v \rightarrow \infty\), the Student-t distribution converges to the standard normal distribution.
The smaller the degrees of freedom parameter, the fatter the tails of the distribution.
In R we have the functions rt, pt, qt and dt for this distribution.
In practice, for \(v \ge 60\) the Student-t distribution is already very close to the standard normal.
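A short Python sketch can make the tail behaviour concrete. It evaluates the Student-t pdf with the normalising constant \(\Gamma(\frac{v+1}{2}) / (\sqrt{v\pi}\,\Gamma(\frac v2))\) and compares it with the standard normal pdf. The function names are hypothetical, and the log-gamma form is used only to keep large \(v\) from overflowing.

```python
import math


def t_pdf(x, v):
    """Student-t density with v > 0 degrees of freedom."""
    log_const = (
        math.lgamma((v + 1) / 2)
        - math.lgamma(v / 2)
        - 0.5 * math.log(v * math.pi)
    )
    return math.exp(log_const - (v + 1) / 2 * math.log1p(x * x / v))


def normal_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Tail comparison at x = 3: the smaller v, the more mass in the tail
for v in (3, 10, 60):
    print(v, t_pdf(3.0, v), normal_pdf(3.0))

# Near the centre, v = 60 is already close to the normal density
print(t_pdf(0.0, 60), normal_pdf(0.0))
```

With \(v = 1\) the formula gives the Cauchy density, so \(f(0) = 1/\pi\), which is a convenient check on the constant.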
2.11 - Linear Functions of Random Variables
Let X be a discrete or continuous rv with \(\mu_X = E[X]\) and \(\sigma_X^2 = Var(X)\)
We define a new rv Y, such that: \(Y = g(X) = a \cdot X + b\)
Then we have: \(\mu_Y = a \cdot \mu_X + b\), \(\sigma_Y^2 = a^2 \cdot \sigma_X^2\) and \(\sigma_Y = |a| \cdot \sigma_X\)
Linear function of Normal rv
\[\mu_Y = a \cdot \mu_X + b\]
\[\sigma_Y^2 = a^2 \cdot \sigma_X^2\]
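These two rules can be illustrated with Python's `statistics.NormalDist`, which supports multiplication by and addition of a constant. The parameters and coefficients below are assumptions; \(a\) is chosen negative on purpose to show that the standard deviation scales with \(|a|\), not \(a\).

```python
from statistics import NormalDist

X = NormalDist(mu=0.05, sigma=0.10)  # assumed parameters
a, b = -2.0, 1.0                      # assumed coefficients, a < 0 on purpose

Y = X * a + b  # Y = a * X + b is again normally distributed

print(Y.mean)      # a * mu_X + b
print(Y.variance)  # a^2 * sigma_X^2
print(Y.stdev)     # |a| * sigma_X, always non-negative
```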
Standardizing a Normal rv
\[\begin{align} Z & = \frac{X - \mu_X}{\sigma_X} = \frac{1}{\sigma_X} \cdot X - \frac{\mu_X}{\sigma_X} \\ & = a \cdot X + b \\ a & = \frac{1}{\sigma_X}, ~ b = -\frac{\mu_X}{\sigma_X} \end{align}\]
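Standardizing is what lets any normal probability be read off the standard normal CDF: \(Pr(X \le x) = \Phi\left(\frac{x-\mu_X}{\sigma_X}\right)\). A minimal Python sketch follows, with the function name and the parameter values assumed for illustration.

```python
from statistics import NormalDist


def normal_cdf_via_z(x, mu, sigma):
    """Pr(X <= x) for X ~ N(mu, sigma^2), computed as Phi((x - mu) / sigma)."""
    z = (x - mu) / sigma  # Z = a * X + b with a = 1/sigma, b = -mu/sigma
    return NormalDist().cdf(z)


mu_x, sigma_x = 0.05, 0.10  # assumed parameters
x = -0.10

via_z = normal_cdf_via_z(x, mu_x, sigma_x)
direct = NormalDist(mu_x, sigma_x).cdf(x)
print(via_z, direct)  # both equal Phi(-1.5), about 0.0668
```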
2.12 - (Example) Value at Risk
I.e. compute how much money we could lose with a specified probability \(\alpha\).
Assume R = simple monthly return. \(R \sim N(0.05, (0.10)^2)\)
\(\alpha\) is usually 5% or 1%.
End of month wealth \(W_1 = $10000 \cdot (1+R)\)
What is \(Pr(W_1 \lt $9000)\)?
What value of R produces \(W_1 = $9000\)?
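The two questions above can be worked through numerically. The Python sketch below uses the stated \(R \sim N(0.05, (0.10)^2)\) and \(W_0 = $10000\). The 5% VaR line assumes the convention \(VaR_\alpha = W_0 \cdot q_\alpha^R\), so a loss shows up as a negative number. Variable names are hypothetical.

```python
from statistics import NormalDist

W0 = 10_000.0
R = NormalDist(mu=0.05, sigma=0.10)  # simple monthly return

# W1 = W0 * (1 + R) = 9000  <=>  R = 9000 / W0 - 1
r_star = 9_000.0 / W0 - 1  # about -0.10

# Pr(W1 < 9000) = Pr(R < r_star)
prob_loss = R.cdf(r_star)  # Phi(-1.5), about 0.0668

# 5% quantile of R and the corresponding VaR (assumed sign convention)
alpha = 0.05
q_alpha = R.inv_cdf(alpha)  # about -0.1145
var_alpha = W0 * q_alpha    # about -1145 dollars

print(r_star, prob_loss)
print(q_alpha, var_alpha)
```

Read this way, there is a 5% chance of losing roughly $1145 or more over the month under the normality assumption.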
VaR for cc returns
We then:
Compute the alpha quantile of the normal dist for r: \(q_\alpha^r = \mu_r + \sigma_r z_\alpha\)
Convert the alpha quantile for r into an alpha quantile for R: \(q_\alpha^R = e^{q_\alpha^r} - 1\)
We compute the \(VaR_\alpha\) using \(q_\alpha^R\): \(VaR_\alpha = $W_0 \cdot q_\alpha^R\)
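The three steps can be sketched in Python as follows. The parameters \(r \sim N(0.05, (0.10)^2)\), \(W_0 = $10000\) and \(\alpha = 5\%\) are assumptions chosen to mirror the simple-return example, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def var_cc(w0, mu_r, sigma_r, alpha):
    """VaR for a continuously compounded return r ~ N(mu_r, sigma_r^2)."""
    z_alpha = NormalDist().inv_cdf(alpha)    # standard normal quantile
    q_r = mu_r + sigma_r * z_alpha           # step 1: quantile of r
    q_simple = math.exp(q_r) - 1             # step 2: quantile of simple return R
    return w0 * q_simple                     # step 3: VaR in dollars


# Assumed inputs, for illustration only
var_5 = var_cc(w0=10_000.0, mu_r=0.05, sigma_r=0.10, alpha=0.05)
var_1 = var_cc(w0=10_000.0, mu_r=0.05, sigma_r=0.10, alpha=0.01)
print(var_5)  # about -1082 dollars
print(var_1)  # a larger loss at the 1% level
```

Because \(e^{q} - 1 > -1\) for any \(q\), the loss computed this way can never exceed the initial wealth, unlike the normal model for simple returns.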