Week2 - Probability Review
2.1 - Univariate random variable
- A random variable (rv) X takes values in a sample space \(S_X\).
- Its values are distributed according to a probability distribution function (pdf).
Discrete random variable
- Can only take a finite (or at most countably infinite) number of values.
- \(\forall x \in S_X : 0 \le p(x) \le 1\)
- \(\forall x \notin S_X: p(x) = 0\)
- \(\sum\limits_{x \in S_X} p(x) = 1\)
Bernoulli distribution
- We note X=1 on success and X=0 on failure.
- \(Pr(X=1)=\pi\) and \(Pr(X=0)=1-\pi\)
- Then we have the pdf: \(p(x)=Pr(X=x)=\pi^x(1-\pi)^{1-x}\) for \(x \in \{0,1\}\)
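As a quick check of the Bernoulli pdf, here is a minimal Python sketch. The function name `bernoulli_pmf` and the value \(\pi = 0.3\) are illustrative assumptions, not part of the course material.

```python
def bernoulli_pmf(x: int, pi: float) -> float:
    """p(x) = pi^x * (1 - pi)^(1 - x) for x in {0, 1}, and 0 elsewhere."""
    if x not in (0, 1):
        return 0.0
    return pi**x * (1.0 - pi) ** (1 - x)


pi = 0.3  # hypothetical success probability
probs = [bernoulli_pmf(x, pi) for x in (0, 1)]  # [Pr(X=0), Pr(X=1)]
total = sum(probs)  # the probabilities over the sample space sum to 1
```

The exponent \(1-x\) on the failure term is what makes the formula return \(1-\pi\) at x=0 and \(\pi\) at x=1.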
Continuous random variable
- In that case we have a probability curve \(f(x)\)
- And we can measure probability on intervals A: \(Pr(X \in A) = \int_A f(x) dx\)
- \(\forall x: f(x) \ge 0\) and \(\int_{-\infty}^\infty f(x) dx = 1\)
Uniform distribution over [a,b]
- assuming b>a here.
- We note: \(X \sim U[a,b]\) and we have the pdf: \(f(x) = \begin{cases} \frac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{otherwise}\end{cases}\)
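For the uniform case the integral over an interval reduces to a length ratio. A small Python sketch under that reading (the helper names `uniform_pdf` and `uniform_prob` are made up for illustration):

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of U[a, b]: 1 / (b - a) inside [a, b], 0 outside. Assumes b > a."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_prob(lo: float, hi: float, a: float, b: float) -> float:
    """Pr(lo <= X <= hi) for X ~ U[a, b]: length of the overlap divided by (b - a)."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)


half = uniform_prob(0.25, 0.75, 0.0, 1.0)   # middle half of U[0, 1]
whole = uniform_prob(-1.0, 10.0, 2.0, 6.0)  # interval covering all of [2, 6]
```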
2.2 - Cumulative Distribution Function
- The CDF function F for a rv X is: \(F(x) = Pr(X \le x)\)
- \(x_1 \lt x_2 \Rightarrow F(x_1) \le F(x_2)\)
- \(F(-\infty) = 0\) and \(F(\infty) = 1\)
- \(Pr(X \gt x) = 1 - F(x)\)
- \(Pr(x_1 \lt X \le x_2) = F(x_2) - F(x_1)\)
- \(\frac{d}{dx} F(x) = f(x)\) if X is a continuous rv.
- Also note that for a continuous rv: \(Pr(X\le x) = Pr(X\lt x)\) and \(Pr(X=x)=0\)
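These CDF properties can be checked numerically. The sketch below uses Python's `statistics.NormalDist` as an example of a continuous rv; the evaluation points and the step size `h` are arbitrary choices.

```python
from statistics import NormalDist

X = NormalDist(0.0, 1.0)  # any continuous distribution would do here
x1, x2 = -0.5, 1.2        # arbitrary points with x1 < x2

interval_prob = X.cdf(x2) - X.cdf(x1)  # probability of landing between x1 and x2
upper_tail = 1.0 - X.cdf(x2)           # probability of exceeding x2

# dF/dx = f: a central finite difference of the CDF approximates the density.
h = 1e-5
slope = (X.cdf(x2 + h) - X.cdf(x2 - h)) / (2 * h)
density = X.pdf(x2)
```

Since X is continuous here, using strict or non-strict inequalities at the endpoints gives the same probabilities.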
2.3 - Quantiles
- Given a rv X with continuous CDF \(F_X(x) = Pr(X \le x)\): the \(\alpha \cdot 100\%\) quantile of \(F_X\) for \(\alpha \in [0,1]\) is the value \(q_\alpha\) such that \(F_X(q_\alpha) = Pr(X \le q_\alpha) = \alpha\).
- The area to the left of \(q_\alpha\) is \(\alpha\) under the probability curve.
- If the inverse CDF function exists, then: \(q_\alpha = F_X^{-1}(\alpha)\)
- The 50% quantile is also called the median
- For instance, for \(X \sim U[0,1]\) we have \(F(x)=x\) on \([0,1]\), hence \(q_\alpha=\alpha\)
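The definition \(F_X(q_\alpha) = \alpha\) can be solved numerically even when no closed-form inverse is at hand. A minimal bisection sketch in Python, assuming a continuous, increasing CDF and a bracketing interval supplied by the caller (the helper name is hypothetical):

```python
from statistics import NormalDist


def quantile_by_bisection(cdf, alpha, lo, hi, tol=1e-10):
    """Solve cdf(q) = alpha on [lo, hi], assuming cdf is continuous and increasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < alpha:
            lo = mid  # the quantile lies to the right of mid
        else:
            hi = mid  # the quantile lies at or to the left of mid
    return 0.5 * (lo + hi)


def uniform01_cdf(x):
    """CDF of U[0, 1]: F(x) = x on [0, 1]."""
    return min(max(x, 0.0), 1.0)


q_uniform = quantile_by_bisection(uniform01_cdf, 0.3, 0.0, 1.0)      # close to 0.3
median_normal = quantile_by_bisection(NormalDist().cdf, 0.5, -10, 10)  # close to 0
```

When the inverse CDF is available (for example `NormalDist().inv_cdf`), calling it directly is simpler and more accurate than bisection.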
2.4 - Standard normal distribution
- If X is a rv such that \(X \sim N(0,1)\), then: \(f(x) = \phi(x) = \frac{1}{\sqrt{2\pi}} \exp\left( - \frac12 x^2 \right)\) for \(-\infty \lt x \lt \infty\).
\[\Phi(x) = Pr(X \le x) = \int_{-\infty}^x \phi(z)dz\]
- We have the important ranges:
\[Pr(-1 \le X \le 1) \approx 0.68\] \[Pr(-2 \le X \le 2) \approx 0.95\] \[Pr(-3 \le X \le 3) \approx 0.997\]
- In Excel:
- we can use the function NORMSDIST(z) to get \(\Phi(z)\); the newer NORM.S.DIST(z, cumulative) returns \(\Phi(z)\) when cumulative is TRUE and \(\phi(z)\) when it is FALSE.
- we can use the function NORMSINV to get the \(\Phi^{-1}(\alpha)\) value.
- In R:
- We use pnorm to compute \(\Phi(z)\)
- We use qnorm to compute \(\Phi^{-1}(\alpha)\)
- We use dnorm to compute \(\phi(z)\)
- Other notable relations for the standard normal distribution:
\[Pr(X\le z) = 1 - Pr(X \ge z)\] \[Pr(X\ge z) = Pr(X \le -z)\] \[Pr(X\ge 0) = Pr(X \le 0) = 0.5\]
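For readers without Excel or R, the same three quantities are available in Python's standard library. This is only a sketch of equivalents: `cdf`, `inv_cdf` and `pdf` of `statistics.NormalDist` play the roles of pnorm, qnorm and dnorm.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def central_prob(k: float) -> float:
    """Pr(-k <= Z <= k) = Phi(k) - Phi(-k)."""
    return Z.cdf(k) - Z.cdf(-k)


within = {k: central_prob(k) for k in (1, 2, 3)}  # one, two and three sd ranges

z_05 = Z.inv_cdf(0.05)  # 5% quantile, Phi^{-1}(0.05)
phi_0 = Z.pdf(0.0)      # density at 0, equal to 1 / sqrt(2 * pi)

# Symmetry: Pr(Z >= z) = Pr(Z <= -z)
z = 1.3  # arbitrary point
upper = 1.0 - Z.cdf(z)
lower = Z.cdf(-z)
```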
2.5 - Expected Value and Standard Deviation
Shape characteristics of pdfs
- Expected Value or Mean: Center of mass
- Variance and standard deviation: spread about mean
- Skewness: symmetry about mean
- Kurtosis: Tail thickness
Expected value
- For discrete rv: \(E[X] = \mu_X = \sum\limits_{x \in S_X} x \cdot p(x)\)
- For continuous rv: \(E[X] = \mu_X = \int_{-\infty}^\infty x \cdot f(x) dx\)
- If \(X \sim N(0,1)\) then \(\mu_X = \int_{-\infty}^\infty x \cdot \frac{1}{\sqrt{2\pi}} e^{-\frac 12 x^2} dx = 0\)
- Let g(X) be some function of the rv X. Then
- For discrete rv: \(E[g(X)] = \sum\limits_{x \in S_X} g(x) \cdot p(x)\)
- For continuous rv: \(E[g(X)] = \int_{-\infty}^\infty g(x) \cdot f(x) dx\)
Variance and Standard Deviation
- \(g(X) = (X - E[X])^2 = (X - \mu_X)^2\)
- \(Var(X) = \sigma_X^2 = E[g(X)] = E[(X-\mu_X)^2] = E[X^2] - \mu_X^2\)
- \(SD(X) = \sigma_X = \sqrt{Var(X)}\)
- Note that Var(X) is in squared units of X, whereas SD(X) is in the same unit as X.
- Concretely:
- For discrete rv: \(\sigma_X^2 = \sum\limits_{x \in S_X} (x - \mu_X)^2 \cdot p(x)\)
- For continuous rv: \(\sigma_X^2 = \int_{-\infty}^\infty (x - \mu_X)^2 \cdot f(x) dx\)
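The discrete formulas translate almost literally into code. Below is a minimal Python sketch in which a pmf is stored as a dictionary mapping values to probabilities; the helper name `expect` and the Bernoulli(0.3) example are illustrative assumptions.

```python
def expect(g, pmf):
    """E[g(X)] = sum over x in S_X of g(x) * p(x), for a discrete rv."""
    return sum(g(x) * p for x, p in pmf.items())


pmf = {0: 0.7, 1: 0.3}  # hypothetical example: Bernoulli with pi = 0.3

mu = expect(lambda x: x, pmf)                       # E[X]
var = expect(lambda x: (x - mu) ** 2, pmf)          # E[(X - mu)^2]
var_shortcut = expect(lambda x: x**2, pmf) - mu**2  # E[X^2] - mu^2
sd = var**0.5                                       # same units as X
```

Both ways of computing the variance agree, which is a handy sanity check when working by hand.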
2.6 - General Normal Distribution
- If \(X \sim N(\mu_X,\sigma_X^2)\), then:
\[f(x) = \frac{1}{\sqrt{2\pi \sigma_X^2}} exp\left( - \frac 12 \left(\frac{x-\mu_X}{\sigma_X} \right)^2\right)\]
- Note that we still have about 68% of the probability in the range \([\mu_X - \sigma_X, \mu_X + \sigma_X]\).
- For this general normal distribution, we also have the relation with the standard normal distribution quantile function: \(q_\alpha = \mu_X + \sigma_X \cdot \Phi^{-1}(\alpha) = \mu_X + \sigma_X \cdot z_\alpha\)
Finding areas under General Normal Curve
- In Excel:
- NORMDIST(x, mu_X, sigma_X, cumulative): if cumulative is TRUE, computes \(Pr(X \le x)\), otherwise computes \(f(x) = \frac{1}{\sqrt{2\pi \sigma_X^2}} \exp\left( - \frac 12 \left(\frac{x-\mu_X}{\sigma_X} \right)^2\right)\)
- NORMINV(alpha, mu, sigma) computes \(q_\alpha = \mu_X + \sigma_X \cdot z_\alpha\)
- In R:
- simulate data: rnorm(n,mean,sd)
- compute CDF: pnorm(q, mean, sd)
- compute quantiles: qnorm(p,mean,sd)
- compute density: dnorm(x,mean, sd)
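The quantile relation \(q_\alpha = \mu_X + \sigma_X \cdot z_\alpha\) can be verified directly. A short Python sketch, with \(\mu_X = 0.05\) and \(\sigma_X = 0.10\) chosen only for illustration:

```python
from statistics import NormalDist

mu, sigma = 0.05, 0.10      # hypothetical parameters
X = NormalDist(mu, sigma)   # note: second argument is the sd, not the variance
Z = NormalDist()            # standard normal

alpha = 0.05
q_direct = X.inv_cdf(alpha)               # quantile of N(mu, sigma^2)
q_scaled = mu + sigma * Z.inv_cdf(alpha)  # mu + sigma * z_alpha

# Probability within one standard deviation of the mean
one_sigma = X.cdf(mu + sigma) - X.cdf(mu - sigma)
```

As with R's `pnorm(q, mean, sd)`, the scale parameter here is the standard deviation; passing the variance is a common mistake.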
2.7 - Standard deviation as measure of risk
- For return distributions, if \(R_A \sim N(\mu_A,\sigma_A^2)\) and \(R_B \sim N(\mu_B,\sigma_B^2)\) with \(\mu_A > \mu_B\), then we will typically also find that \(\sigma_A > \sigma_B\): a higher expected return usually comes with a higher risk.
2.8 - Normal Distribution: Appropriate for simple returns?
- Suppose we model a simple return as \(R_t \sim N(0.05,(0.50)^2)\). A simple return can never be below -1 (we cannot lose more than 100%), yet the model gives \(Pr(R_t < -1) = 0.018\), a nonzero probability for an impossible event.
- The normal distribution is more appropriate for continuously compounded (cc) returns:
- \(r_t = ln(1+R_t)\)
- \(r_t\) can take on values less than -1.
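The probability quoted for the simple-return model can be reproduced in a few lines. This Python sketch also shows that a large but feasible loss maps to a cc return below -1; the 90% loss is an arbitrary example.

```python
from math import log
from statistics import NormalDist

R = NormalDist(0.05, 0.50)  # simple return model with mean 0.05 and sd 0.50

# Probability the model assigns to the impossible event R < -1
p_below = R.cdf(-1.0)  # about 0.018

# A simple return of -90% is feasible, and its cc return is below -1
r_cc = log(1.0 + (-0.90))  # ln(0.10), roughly -2.30
```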
The Log-Normal Distribution
- \(X \sim N(\mu_X,\sigma_X^2), -\infty \lt X \lt \infty\)
- Then we can define \(Y = exp(X) \sim lognormal(\mu_X,\sigma_X^2), 0 \lt Y \lt \infty\)
- \(E[Y] = \mu_Y = exp(\mu_X + \frac{\sigma_X^2}{2})\)
- \(Var[Y] = \sigma_Y^2 = exp(2\mu_X + \sigma_X^2)(exp(\sigma_X^2)-1)\)
- The log-normal distribution has positive skew: it has a long “right tail”, i.e. the main “blob” is on the left.
- in R we have : rlnorm, plnorm, qlnorm and dlnorm.
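The moment formulas can be compared with a simulation. The following Python sketch is an illustration only: the parameters, the sample size and the seed are arbitrary, and `random.lognormvariate` stands in for R's rlnorm.

```python
import random
from math import exp
from statistics import fmean


def lognormal_mean(mu: float, sigma: float) -> float:
    """E[Y] = exp(mu + sigma^2 / 2) for Y = exp(X), X ~ N(mu, sigma^2)."""
    return exp(mu + sigma**2 / 2)


def lognormal_var(mu: float, sigma: float) -> float:
    """Var[Y] = exp(2 * mu + sigma^2) * (exp(sigma^2) - 1)."""
    return exp(2 * mu + sigma**2) * (exp(sigma**2) - 1)


mu, sigma = 0.05, 0.50   # hypothetical parameters of X
rng = random.Random(42)  # fixed seed so the run is reproducible
draws = [rng.lognormvariate(mu, sigma) for _ in range(100_000)]

sample_mean = fmean(draws)              # should be close to lognormal_mean(mu, sigma)
all_positive = min(draws) > 0.0         # Y = exp(X) is always strictly positive
```

Note that \(E[Y]\) is larger than \(\exp(\mu_X)\): the long right tail pulls the mean above the median.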
2.9 - Skewness and Kurtosis
Skewness - Measure of symmetry
- \(g(X) = ((X - \mu_X)/\sigma_X)^3\)
- \(Skew(X) = E\left[ \left(\frac{X - \mu_X}{\sigma_X} \right)^3 \right]\)
- Skew(X)>0 is when we have a long “right tail”, i.e. the main “blob” is on the left.
- Skew(X)<0 is when we have a long “left tail”, i.e. the main “blob” is on the right.
- For symmetric distributions Skew(X)=0
- For log normal distribution: \(Y \sim lognormal(\mu_X,\sigma_X^2)\) we have:
\[Skew(Y) = (exp(\sigma_X^2) +2) \sqrt{exp(\sigma_X^2) -1} \gt 0\]
Kurtosis - Measure of tail thickness
- \(g(X) = ((X-\mu_X)/\sigma_X)^4\)
- \(Kurt(X) = E\left[ \left( \frac{X-\mu_X}{\sigma_X}\right)^4 \right]\)
- For a general normal distribution \(X \sim N(\mu_X,\sigma_X^2)\) we get \(Kurt(X)=3\)
- We then define the Excess kurtosis = Kurt(X) - 3.
- If Excess kurtosis(X) > 0 ⇒ X has fatter tails than normal distribution
- If Excess kurtosis(X) < 0 ⇒ X has thinner tails than normal distribution
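Skewness and kurtosis are just the third and fourth moments of the standardized variable. A minimal Python sketch for discrete distributions follows; the two example pmfs and the function name are hypothetical.

```python
def standardized_moment(pmf, k):
    """E[((X - mu) / sigma)^k] for a discrete rv given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    sd = sum((x - mu) ** 2 * p for x, p in pmf.items()) ** 0.5
    return sum(((x - mu) / sd) ** k * p for x, p in pmf.items())


symmetric = {-1: 0.25, 0: 0.5, 1: 0.25}  # symmetric about 0
right_tailed = {0: 0.7, 1: 0.3}          # most mass on the left

skew_sym = standardized_moment(symmetric, 3)             # 0 by symmetry
excess_kurt_sym = standardized_moment(symmetric, 4) - 3  # negative: thin tails
skew_right = standardized_moment(right_tailed, 3)        # positive
```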
2.10 - Student's-t Distribution
- Similar to the normal distribution but with fatter tails (i.e. larger kurtosis).
- It has an additional parameter called the degrees of freedom, denoted v.
- We note \(X \sim t_v\), and the pdf is:
\[f(x) = \frac{\Gamma(\frac{v+1}{2})}{\sqrt{v\pi}\,\Gamma(\frac v2)} \left( 1 + \frac{x^2}{v}\right)^{- \frac{v+1}{2}}, ~~ -\infty \lt x \lt \infty, ~~ v > 0 \]
- With \(\Gamma(z) = \int_0^\infty t^{z-1}e^{-t}dt\) denoting the gamma function.
- When \(v \rightarrow \infty\), the Student's t distribution converges to the standard normal distribution.
- The smaller the degree of freedom parameter, the fatter are the tails of the distribution.
- Properties of this distribution are:
- \(E[X] = 0, ~~ v>1\)
- \(Var(X) = \frac{v}{v-2}, ~~ v > 2\)
- \(Skew(X) = 0, ~~ v > 3\)
- \(Kurt(X) = \frac{6}{v-4} + 3\), so \(\text{excess kurt}(X) = \frac{6}{v-4}, ~~ v > 4\)
- in R we have the functions: rt, pt, qt and dt related to this distribution.
- In practice, for \(v \ge 60\) the Student's t distribution is already very close to the standard normal distribution.
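To see the fat tails and the convergence to the normal distribution, the density can be coded from the gamma function. This Python sketch uses the standard Student's t density with normalizing constant \(\Gamma(\frac{v+1}{2}) / (\sqrt{v\pi}\,\Gamma(\frac v2))\); the function names are illustrative.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist


def t_pdf(x: float, v: float) -> float:
    """Student's t density with v > 0 degrees of freedom."""
    c = gamma((v + 1) / 2) / (sqrt(v * pi) * gamma(v / 2))
    return c * (1 + x * x / v) ** (-(v + 1) / 2)


def t_excess_kurtosis(v: float) -> float:
    """Excess kurtosis 6 / (v - 4), defined only for v > 4."""
    if v <= 4:
        raise ValueError("excess kurtosis requires v > 4")
    return 6 / (v - 4)


normal = NormalDist()
tail_ratio = t_pdf(3.0, 5) / normal.pdf(3.0)       # > 1: fatter tail at x = 3
gap_at_60 = abs(t_pdf(0.0, 60) - normal.pdf(0.0))  # small: close to normal
```

`math.gamma` overflows for large arguments, so this direct formula is only meant for moderate v; a log-gamma based version would be needed for very large degrees of freedom.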
2.11 - Linear Functions of Random Variables
- Let X be a discrete or continuous rv with \(\mu_X = E[X]\) and \(\sigma_X^2 = Var(X)\)
- We define a new rv Y, such that: \(Y = g(X) = a \cdot X + b\)
- Then we have: \(\mu_Y = a \cdot \mu_X + b\), \(\sigma_Y^2 = a^2 \cdot \sigma_X^2\) and \(\sigma_Y = |a| \cdot \sigma_X\)
Linear function of Normal rv
- Let \(X \sim N(\mu_X,\sigma_X^2)\) and define \(Y = a \cdot X + b\). Then: \(Y \sim N(\mu_Y,\sigma_Y^2)\) with:
\[\mu_Y = a \cdot \mu_X + b\] \[\sigma_Y^2 = a^2 \cdot \sigma_X^2\]
Standardizing a Normal rv
- Let \(X \sim N(\mu_X,\sigma_X^2)\). The standardized rv Z is created using:
\[\begin{align} Z & = \frac{X - \mu_X}{\sigma_X} = \frac{1}{\sigma_X} \cdot X - \frac{\mu_X}{\sigma_X} \\ & = a \cdot X + b \\ a & = \frac{1}{\sigma_X}, ~ b = -\frac{\mu_X}{\sigma_X} \end{align}\]
- Thus we get: \(Z \sim N(0,1)\).
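Both the linear-transformation rule and standardization can be checked with `statistics.NormalDist`, which supports multiplying by and adding constants. The numbers below are arbitrary; a negative a is used on purpose to show that the standard deviation scales with \(|a|\).

```python
from statistics import NormalDist

mu_x, sigma_x = 0.05, 0.10  # hypothetical parameters of X
a, b = -2.0, 1.0            # hypothetical linear map Y = a * X + b

X = NormalDist(mu_x, sigma_x)
Y = X * a + b               # NormalDist scaled and shifted by constants

mu_y = a * mu_x + b         # a * mu_X + b
sigma_y = abs(a) * sigma_x  # |a| * sigma_X, never negative

# Standardizing: Pr(X <= x) = Phi((x - mu_X) / sigma_X)
x = -0.10
z = (x - mu_x) / sigma_x
p_direct = X.cdf(x)
p_standardized = NormalDist().cdf(z)
```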
2.12 - (Example) Value at Risk
- I.e. compute how much money we could lose with a specified probability \(\alpha\).
- Assume R = simple monthly return. \(R \sim N(0.05, (0.10)^2)\)
- \(\alpha\) is usually 5% or 1%.
- End of month wealth \(W_1 = $10000 \cdot (1+R)\)
- What is \(Pr(W_1 \lt $9000)\)?
- What value of R produces \(W_1 = $9000\)?
- In general, the \(\alpha \times 100\%\) Value-at-Risk \((VaR_\alpha)\) for an initial investment of \($W_0\) is computed as: \(VaR_\alpha = $W_0 \times q_\alpha\) where \(q_\alpha\) is the \(\alpha\) quantile of the simple return distribution.
- Note that the VaR is often reported as a positive number (the size of the loss) instead of a negative value.
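The two questions and the VaR formula for this example can be worked through in Python. This is a sketch under the stated assumption \(R \sim N(0.05, (0.10)^2)\) and \(W_0 = $10000\); the function name `value_at_risk` is made up for illustration.

```python
from statistics import NormalDist

W0 = 10_000.0               # initial wealth
R = NormalDist(0.05, 0.10)  # simple monthly return, mean 0.05 and sd 0.10

# W1 = W0 * (1 + R) = 9000  =>  R = 9000 / W0 - 1 = -0.10
r_star = 9_000.0 / W0 - 1.0
p_loss = R.cdf(r_star)      # Pr(W1 < 9000) = Pr(R < -0.10), about 0.067


def value_at_risk(w0: float, dist: NormalDist, alpha: float) -> float:
    """VaR_alpha = W0 * q_alpha, with q_alpha the alpha quantile of the simple return."""
    return w0 * dist.inv_cdf(alpha)


var_05 = value_at_risk(W0, R, 0.05)  # about -1145 (often reported as 1145)
var_01 = value_at_risk(W0, R, 0.01)  # about -1826
```

Read as: with 5% probability, the loss over the month is about $1145 or more.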
VaR for cc returns
- \(r = \ln(1+R)\)
- We assume \(r \sim N(\mu_r,\sigma_r^2)\)
- We then:
- Compute the alpha quantile of the normal dist for r: \(q_\alpha^r = \mu_r + \sigma_r z_\alpha\)
- Convert the alpha quantile for r into an alpha quantile for R: \(q_\alpha^R = e^{q_\alpha^r} - 1\)
- We compute the \(VaR_\alpha\) using \(q_\alpha^R\): \(VaR_\alpha = $W_0 \cdot q_\alpha^R\)
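The three steps above can be sketched as a single Python function. The parameter values in the example call are assumptions chosen for illustration, not figures from the course.

```python
from math import exp
from statistics import NormalDist


def var_cc(w0: float, mu_r: float, sigma_r: float, alpha: float) -> float:
    """VaR_alpha when the cc return r = ln(1 + R) is N(mu_r, sigma_r^2)."""
    z_alpha = NormalDist().inv_cdf(alpha)
    q_r = mu_r + sigma_r * z_alpha  # step 1: alpha quantile of the cc return
    q_R = exp(q_r) - 1.0            # step 2: convert to a simple-return quantile
    return w0 * q_R                 # step 3: scale by initial wealth


# Hypothetical inputs: W0 = 10000, mu_r = 0.05, sigma_r = 0.10, alpha = 5%
var_05 = var_cc(10_000.0, 0.05, 0.10, 0.05)  # about -1082
```

Because \(e^{q} - 1 \gt -1\) for any q, the loss implied by this approach can never exceed the initial investment, unlike with normally distributed simple returns.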