====== Week 2 - Probability Review ====== ===== 2.1 - Univariate random variable ===== * A random variable (rv) X can take values on a **sample space** \(S_X\). * It is distributed following a **probability distribution function** (pdf). ==== Discrete random variable ==== * Can only take a finite (or countably infinite) number of values. * \(\forall x \in S_X : 0 \le p(x) \le 1\) * \(\forall x \notin S_X: p(x) = 0\) * \(\sum\limits_{x \in S_X} p(x) = 1\) === Bernoulli distribution === * We note X=1 on success and X=0 on failure. * \(Pr(X=1)=\pi\) and \(Pr(X=0)=1-\pi\) * Then we have the pdf: \(p(x)=Pr(X=x)=\pi^x(1-\pi)^{1-x}\) for \(x \in \{0,1\}\) ==== Continuous random variable ==== * In that case we have a **probability curve** \(f(x)\) * And we can measure probability on intervals A: \(Pr(X \in A) = \int_A f(x) dx\) * \(\forall x: f(x) \ge 0\) and \(\int_{-\infty}^\infty f(x) dx = 1\) === Uniform distribution over [a,b] === * assuming b>a here. * We note: \(X \sim U[a,b]\) and we have the pdf: \(f(x) = \begin{cases} \frac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{otherwise}\end{cases}\) ===== 2.2 - Cumulative Distribution Function ===== * The CDF F for a rv X is: \(F(x) = Pr(X \le x)\) * \(x_1 \lt x_2 \Rightarrow F(x_1) \le F(x_2)\) * \(F(-\infty) = 0\) and \(F(\infty) = 1\) * \(Pr(X \ge x) = 1 - F(x)\) * \(Pr(x_1 \le X \le x_2) = F(x_2) - F(x_1)\) * \(\frac{d}{dx} F(x) = f(x)\) if X is a continuous rv. * Also note that for a continuous rv: \(Pr(X\le x) = Pr(X\lt x)\) and \(Pr(X=x)=0\) ===== 2.3 - Quantiles ===== * Given a rv X with continuous CDF \(F_X(x) = Pr(X \le x)\): the \(\alpha \cdot 100\%\) quantile of \(F_X\) for \(\alpha \in [0,1]\) is the value \(q_\alpha\) such that \(F_X(q_\alpha) = Pr(X \le q_\alpha) = \alpha\). * The area under the probability curve to the left of \(q_\alpha\) is \(\alpha\). 
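The quantile definition can be checked numerically. Here is a minimal stdlib-only Python sketch (the helper names `norm_cdf` and `norm_quantile` are my own; in R these are exactly `pnorm` and `qnorm`): it builds the standard normal CDF from the error function and inverts it by bisection, confirming that \(F_X(q_\alpha) = \alpha\).

```python
import math

def norm_cdf(x):
    """Standard normal CDF Phi(x), expressed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_quantile(alpha, lo=-10.0, hi=10.0, tol=1e-10):
    """Invert the CDF by bisection: find q with Phi(q) = alpha."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if norm_cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

q05 = norm_quantile(0.05)
print(round(q05, 3))            # -1.645
print(round(norm_cdf(q05), 6))  # recovers alpha = 0.05
```

Bisection works here because a continuous CDF is monotonically increasing, which is also why the inverse \(F_X^{-1}\) is well defined.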
* If the inverse CDF \(F_X^{-1}\) exists, then: \(q_\alpha = F_X^{-1}(\alpha)\) * The 50% quantile is also called the **median** * For the dist U[0,1] for instance we have \(F(x)=x \Rightarrow q_\alpha=\alpha\) ===== 2.4 - Standard normal distribution ===== * If X is a rv such that \(X \sim N(0,1)\), then: \(f(x) = \phi(x) = \frac{1}{\sqrt{2\pi}} exp\left( - \frac12 x^2 \right)\) for \(-\infty \lt x \lt \infty\). \[\Phi(x) = Pr(X \le x) = \int_{-\infty}^x \phi(z)dz\] * We have the important ranges: \[Pr(-1 \le X \le 1) \approx 0.67\] \[Pr(-2 \le X \le 2) \approx 0.95\] \[Pr(-3 \le X \le 3) \approx 0.99\] * In Excel: * we can use the function NORMSDIST to get the \(\Phi(z)\) values. * we can use the function NORMSINV to get the \(\Phi^{-1}(\alpha)\) value. * In R: * We use **pnorm** to compute \(\Phi(z)\) * We use **qnorm** to compute \(\Phi^{-1}(\alpha)\) * We use **dnorm** to compute \(\phi(z)\) * Other noticeable relations on the std normal distribution: \[Pr(X\le z) = 1 - Pr(X \ge z)\] \[Pr(X\ge z) = Pr(X \le -z)\] \[Pr(X\ge 0) = Pr(X \le 0) = 0.5\] ===== 2.5 - Expected Value and Standard Deviation ===== ==== Shape characteristics of pdfs ==== * **Expected Value or Mean**: center of mass * **Variance and standard deviation**: spread about the mean * **Skewness**: symmetry about the mean * **Kurtosis**: tail thickness ==== Expected value ==== * For discrete rv: \(E[X] = \mu_X = \sum\limits_{x \in S_X} x \cdot p(x)\) * For continuous rv: \(E[X] = \mu_X = \int_{-\infty}^\infty x \cdot f(x) dx\) * If \(X \sim N(0,1)\) then \(\mu_X = \int_{-\infty}^\infty x \cdot \frac{1}{\sqrt{2\pi}} e^{-\frac 12 x^2} dx = 0\) * Let g(X) be some function of the rv X. 
Then * For discrete rv: \(E[g(X)] = \sum\limits_{x \in S_X} g(x) \cdot p(x)\) * For continuous rv: \(E[g(X)] = \int_{-\infty}^\infty g(x) \cdot f(x) dx\) ==== Variance and Standard Deviation ==== * \(g(X) = (X - E[X])^2 = (X - \mu_X)^2\) * \(Var(X) = \sigma_X^2 = E[g(X)] = E[(X-\mu_X)^2] = E[X^2] - \mu_X^2\) * \(SD(X) = \sigma_X = \sqrt{Var(X)}\) * Note that Var(X) is in squared units of X, whereas SD(X) is in the same unit as X. * Concretely: * For discrete rv: \(\sigma_X^2 = \sum\limits_{x \in S_X} (x - \mu_X)^2 \cdot p(x)\) * For continuous rv: \(\sigma_X^2 = \int_{-\infty}^\infty (x - \mu_X)^2 \cdot f(x) dx\) ===== 2.6 - General Normal Distribution ===== * If \(X \sim N(\mu_X,\sigma_X^2)\), then: \[f(x) = \frac{1}{\sqrt{2\pi \sigma_X^2}} exp\left( - \frac 12 \left(\frac{x-\mu_X}{\sigma_X} \right)^2\right)\] * Note that we still have about 67% of probability in the range \([\mu_X - \sigma_X, \mu_X + \sigma_X]\). * For this general normal distribution, we also have the relation with the standard normal quantile function: \(q_\alpha = \mu_X + \sigma_X \cdot \Phi^{-1}(\alpha) = \mu_X + \sigma_X \cdot z_\alpha\) ==== Finding areas under General Normal Curve ==== * In Excel: * NORMDIST(x,mu_X,sigma_X,cumulative): if cumulative==TRUE, computes \(Pr(X \le x)\), otherwise computes the density \(f(x) = \frac{1}{\sqrt{2\pi \sigma_X^2}} exp\left( - \frac 12 \left(\frac{x-\mu_X}{\sigma_X} \right)^2\right)\) * NORMINV(alpha, mu, sigma) computes \(q_\alpha = \mu_X + \sigma_X \cdot z_\alpha\) * In R: * simulate data: rnorm(n,mean,sd) * compute CDF: pnorm(q, mean, sd) * compute quantiles: qnorm(p,mean,sd) * compute density: dnorm(x,mean, sd) ===== 2.7 - Standard deviation as measure of risk ===== * For return rates, if we consider \(R_A \sim N(\mu_A,\sigma_A^2)\) and \(R_B \sim N(\mu_B,\sigma_B^2)\), then typically higher expected return comes with higher risk: if \(\mu_A > \mu_B\), we will usually also find that \(\sigma_A > \sigma_B\). ===== 2.8 - Normal Distribution: Appropriate for simple returns? 
===== * If we model a return \(R_t \sim N(0.05,(0.50)^2)\), then even though we know that \(R_t \ge -1\), the model gives \(Pr(R_t < -1) = 0.018\) (which is wrong!). * The normal distribution is more appropriate for cc returns: * \(r_t = ln(1+R_t)\) * \(r_t\) can take on values less than -1. ==== The Log-Normal Distribution ==== * \(X \sim N(\mu_X,\sigma_X^2), -\infty \lt X \lt \infty\) * Then we can define \(Y = exp(X) \sim lognormal(\mu_X,\sigma_X^2), 0 \lt Y \lt \infty\) * \(E[Y] = \mu_Y = exp(\mu_X + \frac{\sigma_X^2}{2})\) * \(Var(Y) = \sigma_Y^2 = exp(2\mu_X + \sigma_X^2)(exp(\sigma_X^2)-1)\) * The log-normal distribution has **positive skew**: a long "right tail", i.e. the main "blob" is on the left. * in R we have: rlnorm, plnorm, qlnorm and dlnorm. ===== 2.9 - Skewness and Kurtosis ===== ==== Skewness - Measure of symmetry ==== * \(g(X) = ((X - \mu_X)/\sigma_X)^3\) * \(Skew(X) = E\left[ \left(\frac{X - \mu_X}{\sigma_X} \right)^3 \right]\) * Skew(X)>0 when we have a long "right tail", i.e. the main "blob" is on the left. * Skew(X)<0 when we have a long "left tail", i.e. the main "blob" is on the right. * For symmetric distributions Skew(X)=0 * For the log-normal distribution \(Y \sim lognormal(\mu_X,\sigma_X^2)\) we have: \[Skew(Y) = (exp(\sigma_X^2) +2) \sqrt{exp(\sigma_X^2) -1} \gt 0\] ==== Kurtosis - Measure of tail thickness ==== * \(g(X) = ((X-\mu_X)/\sigma_X)^4\) * \(Kurt(X) = E\left[ \left( \frac{X-\mu_X}{\sigma_X}\right)^4 \right]\) * For a general normal distribution \(X \sim N(\mu_X,\sigma_X^2)\) we get \(Kurt(X)=3\) * We then define the **Excess kurtosis** = Kurt(X) - 3. * If Excess kurtosis(X) > 0 => X has fatter tails than the normal distribution * If Excess kurtosis(X) < 0 => X has thinner tails than the normal distribution ===== 2.10 - Student's-t Distribution ===== * Similar to the normal distribution but with fatter tails (i.e. larger kurtosis). * It has an additional parameter called the **degrees of freedom** "v". 
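The skewness and kurtosis measures are easy to estimate from simulated draws. A stdlib-only Python sketch (the helper name `sample_skew_kurt` is illustrative; in R one would simulate with `rnorm`/`rlnorm`): it checks that normal draws give skewness near 0 and kurtosis near 3, and evaluates the closed-form log-normal skewness with \(\sigma_X = 1\).

```python
import math
import random

def sample_skew_kurt(xs):
    """Estimate Skew(X) = E[((X-mu)/sigma)^3] and Kurt(X) = E[((X-mu)/sigma)^4]."""
    n = len(xs)
    mu = sum(xs) / n
    sd = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    skew = sum(((x - mu) / sd) ** 3 for x in xs) / n
    kurt = sum(((x - mu) / sd) ** 4 for x in xs) / n
    return skew, kurt

random.seed(42)
draws = [random.gauss(0.0, 1.0) for _ in range(200_000)]
skew, kurt = sample_skew_kurt(draws)
print(skew, kurt)        # near 0 and 3: symmetric, normal-thickness tails

# Closed-form log-normal skewness, sigma_X^2 = 1:
skew_ln = (math.exp(1.0) + 2) * math.sqrt(math.exp(1.0) - 1)
print(round(skew_ln, 3))  # 6.185, strongly right-skewed
```

The large positive value for the log-normal case matches its long right tail, while the normal draws show zero skew and zero excess kurtosis up to sampling noise.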
* We note \(X \sim t_v\), and the pdf is: \[f(x) = \frac{\Gamma(\frac{v+1}{2})}{\sqrt{v\pi}\,\Gamma(\frac v2)} \left( 1 + \frac{x^2}{v}\right)^{- \frac{v+1}{2}}, ~~ -\infty \lt x \lt \infty, ~~ v > 0 \] * With \(\Gamma(z) = \int_0^\infty t^{z-1}e^{-t}dt\) denoting the gamma function. * When \(v \rightarrow \infty\) the Student-t distribution converges to the standard normal distribution. * The smaller the degrees of freedom parameter, the fatter the tails of the distribution. * Properties of this distribution are: * \(E[X] = 0, ~~ v>1\) * \(Var(X) = \frac{v}{v-2}, ~~ v > 2\) * \(Skew(X) = 0, ~~ v > 3\) * \(excess~kurt(X) = \frac{6}{v-4}, ~~ v > 4\) * in R we have the functions: rt, pt, qt and dt related to this distribution. * In practice if v=60 then we can already consider that we have the normal distribution. ===== 2.11 - Linear Functions of Random Variables ===== * Let X be a discrete or continuous rv with \(\mu_X = E[X]\) and \(\sigma_X^2 = Var(X)\) * We define a new rv Y, such that: \(Y = g(X) = a \cdot X + b\) * Then we have: \(\mu_Y = a \cdot \mu_X + b\) and \(\sigma_Y = |a| \cdot \sigma_X\) ==== Linear function of a Normal rv ==== * Let \(X \sim N(\mu_X,\sigma_X^2)\) and define \(Y = a \cdot X + b\). Then: \(Y \sim N(\mu_Y,\sigma_Y^2)\) with: \[\mu_Y = a \cdot \mu_X + b\] \[\sigma_Y^2 = a^2 \cdot \sigma_X^2\] ==== Standardizing a Normal rv ==== * Let \(X \sim N(\mu_X,\sigma_X^2)\). The standardized rv Z is created using: \[\begin{align} Z & = \frac{X - \mu_X}{\sigma_X} = \frac{1}{\sigma_X} \cdot X - \frac{\mu_X}{\sigma_X} \\ & = a \cdot X + b \\ a & = \frac{1}{\sigma_X}, ~ b = -\frac{\mu_X}{\sigma_X} \end{align}\] * Thus we get: \(Z \sim N(0,1)\). ===== 2.12 - (Example) Value at Risk ===== * Eg. compute how much money we could lose with a specified probability \(\alpha\). * Assume R = simple monthly return, \(R \sim N(0.05, (0.10)^2)\) * \(\alpha\) is usually 5% or 1%. 
* End of month wealth: \(W_1 = \$10000 \cdot (1+R)\) * What is \(Pr(W_1 \lt \$9000)\)? * What value of R produces \(W_1 = \$9000\)? * In general, the \(\alpha \times 100\%\) Value-at-Risk \((VaR_\alpha)\) for an initial investment of \(\$W_0\) is computed as: \(VaR_\alpha = \$W_0 \times q_\alpha\) where \(q_\alpha\) is the \(\alpha\) quantile of the simple return distribution. * Note that the VaR is often reported as a positive number instead of a negative value. ==== VaR for cc returns ==== * \(r = ln(1+R)\) * We assume \(r \sim N(\mu_r,\sigma_r^2)\) * We then: * Compute the \(\alpha\) quantile of the normal dist for r: \(q_\alpha^r = \mu_r + \sigma_r z_\alpha\) * Convert the \(\alpha\) quantile for r into an \(\alpha\) quantile for R: \(q_\alpha^R = e^{q_\alpha^r} - 1\) * We compute the \(VaR_\alpha\) using \(q_\alpha^R\): \(VaR_\alpha = \$W_0 \cdot q_\alpha^R\)
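The steps above can be sketched end to end with stdlib-only Python, using `statistics.NormalDist` as a stand-in for the R functions `pnorm`/`qnorm`. The numbers follow the example (\(W_0 = \$10000\), \(R \sim N(0.05, (0.10)^2)\), \(\alpha = 5\%\)); for the cc-return variant I assume, purely for illustration, the same \(\mu_r = 0.05\) and \(\sigma_r = 0.10\).

```python
import math
from statistics import NormalDist

W0, alpha = 10_000, 0.05
R = NormalDist(mu=0.05, sigma=0.10)  # simple monthly return distribution

# Pr(W1 < $9000): W1 = W0*(1+R) < 9000  <=>  R < -0.10
print(round(R.cdf(-0.10), 3))        # 0.067

# 5% VaR from the simple-return quantile q_alpha = mu + sigma*z_alpha:
q_R = R.inv_cdf(alpha)
var_simple = W0 * q_R
print(round(var_simple))             # -1145, i.e. a roughly $1,145 loss

# cc-return variant: take the r quantile, map it back to a simple return.
r = NormalDist(mu=0.05, sigma=0.10)  # assumed r ~ N(0.05, 0.10^2)
q_R_cc = math.exp(r.inv_cdf(alpha)) - 1
print(round(W0 * q_R_cc))            # -1082
```

Note the cc-return VaR is smaller in magnitude: mapping the r quantile through \(e^{q_\alpha^r} - 1\) respects the \(R \ge -1\) bound that the plain normal model for simple returns violates.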