Week3 - Probability Review Continued

The location-scale model reverses standardization: we build X from a standard normal Z as \(X = \mu_X + \sigma_X \cdot Z, ~~ Z \sim N(0,1)\)
Here the location is the mean \(\mu_X\) and the scale is the standard deviation \(\sigma_X\).

For \(Z \sim N(0,1)\), the quantile \(z_\alpha\) satisfies \(Pr(Z \le z_\alpha) = \alpha\)
For the general normal distribution, we get: \(q_\alpha^X = \mu_X + \sigma_X \cdot z_\alpha \)
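
The quantile formula can be checked numerically. The sketch below uses Python's standard library; the parameter values `mu_x` and `sigma_x` and the helper name `normal_quantile` are assumptions chosen only for illustration.

```python
from statistics import NormalDist

def normal_quantile(alpha, mu, sigma):
    """Quantile of N(mu, sigma^2) via the location-scale formula."""
    z_alpha = NormalDist(0.0, 1.0).inv_cdf(alpha)  # standard normal quantile
    return mu + sigma * z_alpha

# Hypothetical parameters, not taken from the notes.
mu_x, sigma_x = 0.05, 0.10
q_05 = normal_quantile(0.05, mu_x, sigma_x)

# Cross-check against the quantile computed directly from N(mu_x, sigma_x^2).
q_direct = NormalDist(mu_x, sigma_x).inv_cdf(0.05)
print(q_05, q_direct)
```

Both numbers should agree up to floating-point error, since the second call is the same quantile computed without going through Z.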

Let X and Y be two discrete rvs with joint pmf \(p(x,y) = Pr(X = x, Y = y)\).
The joint sample space is denoted \(S_{XY}\).

\(p(x) = Pr(X = x) = \sum\limits_{y \in S_Y} p(x,y)\) and similarly:
\(p(y) = Pr(Y = y) = \sum\limits_{x \in S_X} p(x,y)\)
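
A minimal sketch of these sums, assuming a small hypothetical joint pmf stored as a Python dict (the table values and the helper name `marginals` are illustrative, not from the notes):

```python
from collections import defaultdict

# Hypothetical joint pmf p(x, y) with S_X = {0, 1, 2, 3} and S_Y = {0, 1}.
joint = {
    (0, 0): 1/8, (0, 1): 0.0,
    (1, 0): 2/8, (1, 1): 1/8,
    (2, 0): 1/8, (2, 1): 2/8,
    (3, 0): 0.0, (3, 1): 1/8,
}

def marginals(joint):
    """Sum the joint pmf over the other variable to get p(x) and p(y)."""
    p_x, p_y = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        p_x[x] += p  # sum over y for fixed x
        p_y[y] += p  # sum over x for fixed y
    return dict(p_x), dict(p_y)

p_x, p_y = marginals(joint)
print(p_x)
print(p_y)
```

Each marginal sums to 1, as any pmf must.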

Suppose we know Y = 0. How does this affect the probabilities of X?

\[\begin{align} Pr(X=0 | Y=0) & = \frac{Pr(X=0, Y=0)}{Pr(Y=0)} \\ & = \frac{\text{joint probability}}{\text{marginal probability}}\end{align}\]

⇒ If X depends on Y, then in general \(Pr(X=0|Y=0) \neq Pr(X=0)\)

So we have \(p(x|y) = Pr(X=x | Y=y) = \frac{Pr(X=x, Y=y)}{Pr(Y=y)} \)
And \(p(y|x) = Pr(Y=y | X=x) = \frac{Pr(X=x, Y=y)}{Pr(X=x)} \)

\(\mu_{X|Y=y} = E[X|Y=y] = \sum\limits_{x \in S_X} x \cdot Pr(X=x |Y=y)\)
\(\mu_{Y|X=x} = E[Y|X=x] = \sum\limits_{y \in S_Y} y \cdot Pr(Y=y |X=x)\)

\(\sigma_{X|Y=y}^2 = Var(X|Y=y) = \sum\limits_{x \in S_X} (x-\mu_{X|Y=y})^2 \cdot Pr(X=x |Y=y)\)
and similarly for \(\sigma_{Y|X=x}^2\)
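
The conditional pmf, mean and variance can be computed directly from a joint table. The sketch below assumes a hypothetical joint pmf; the table and the helper names `conditional_x_given_y` and `mean_var` are illustrative only.

```python
# Hypothetical joint pmf p(x, y) with S_X = {0, 1, 2, 3} and S_Y = {0, 1}.
joint = {
    (0, 0): 1/8, (0, 1): 0.0,
    (1, 0): 2/8, (1, 1): 1/8,
    (2, 0): 1/8, (2, 1): 2/8,
    (3, 0): 0.0, (3, 1): 1/8,
}

def conditional_x_given_y(joint, y):
    """Return p(x | y) = p(x, y) / p(y) as a dict keyed by x."""
    p_y = sum(p for (_, yy), p in joint.items() if yy == y)
    if p_y == 0:
        raise ValueError("conditioning event has probability zero")
    return {x: p / p_y for (x, yy), p in joint.items() if yy == y}

def mean_var(pmf):
    """Mean and variance of a pmf given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())
    return mu, var

# Conditional distribution of X given Y = 0, with its mean and variance.
cond = conditional_x_given_y(joint, 0)
mu_c, var_c = mean_var(cond)

# Unconditional mean and variance of X, for comparison.
p_x = {}
for (x, _), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p
mu_u, var_u = mean_var(p_x)

print(mu_c, var_c, mu_u, var_u)
```

For this particular table, conditioning on Y = 0 shifts the mean of X and gives a variance below the unconditional one.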

On average, conditioning reduces variance: by the law of total variance, \(E[Var(X|Y)] \le Var(X)\). For a particular value y, however, \(Var(X|Y=y)\) can exceed \(Var(X)\).

The rvs X and Y are independent if and only if: \(p(x,y) = p(x) \cdot p(y) ~~ \forall x \in S_X, ~ \forall y \in S_Y\)
If X and Y are independent, then:

\[p(x|y) = p(x) ~~ \forall x \in S_X, ~ \forall y \in S_Y\]
\[p(y|x) = p(y) ~~ \forall x \in S_X, ~ \forall y \in S_Y\]
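
The factorization condition is easy to test on a finite table. This is a sketch under the assumption that the joint pmf is stored as a dict; both tables and the helper name `is_independent` are hypothetical.

```python
from itertools import product

def is_independent(joint, tol=1e-12):
    """Check p(x, y) == p(x) * p(y) for every (x, y) in S_X x S_Y."""
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    p_x = {x: sum(joint.get((x, y), 0.0) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), 0.0) for x in xs) for y in ys}
    return all(
        abs(joint.get((x, y), 0.0) - p_x[x] * p_y[y]) <= tol
        for x, y in product(xs, ys)
    )

# Hypothetical table in which X and Y are dependent.
dependent = {
    (0, 0): 1/8, (0, 1): 0.0,
    (1, 0): 2/8, (1, 1): 1/8,
    (2, 0): 1/8, (2, 1): 2/8,
    (3, 0): 0.0, (3, 1): 1/8,
}

# Hypothetical table built as a product of marginals, so independent by construction.
marg_x = {0: 0.5, 1: 0.5}
marg_y = {0: 0.25, 1: 0.75}
independent = {(x, y): marg_x[x] * marg_y[y] for x in marg_x for y in marg_y}

print(is_independent(dependent), is_independent(independent))
```

The check has to cover every pair in \(S_X \times S_Y\); a single cell that fails the product rule is enough to rule out independence.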

The joint pdf of X and Y is a non-negative function f(x,y) such that:

\[ \int_{-\infty}^\infty \int_{-\infty}^\infty f(x,y) dx dy = 1\]

Let \([x_1,x_2]\) and \([y_1,y_2]\) be intervals on the real line. Then:

\[Pr(x_1 \le X \le x_2, y_1 \le Y \le y_2) = \int_{x_1}^{x_2} \int_{y_1}^{y_2} f(x,y) ~ dy ~ dx\]

Given continuous rvs X and Y, the marginal pdfs are:
- \(f(x) = \int_{-\infty}^\infty f(x,y) dy \)
- \(f(y) = \int_{-\infty}^\infty f(x,y) dx \)

The conditional pdf of X given Y=y is: \(f(x|y) = \frac{f(x,y)}{f(y)}\)
The conditional pdf of Y given X=x is: \(f(y|x) = \frac{f(x,y)}{f(x)}\)

Conditional means are computed as:

\[\mu_{X|Y=y} = E[X|Y=y] = \int_{-\infty}^\infty x \cdot f(x|y) ~ dx\]

Conditional variances are computed as:

\[\sigma_{X|Y=y}^2 = Var(X|Y=y) = \int_{-\infty}^\infty (x-\mu_{X|Y=y})^2 f(x|y) ~ dx\]
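
As a worked illustration (the density here is a hypothetical choice, not taken from the notes), take \(f(x,y) = x + y\) on \(0 \le x, y \le 1\) and zero elsewhere. It integrates to 1, and the marginal pdf, conditional pdf and conditional mean follow from the definitions above:

\[\begin{align} f(y) & = \int_0^1 (x+y) ~ dx = y + \frac 12 \\ f(x|y) & = \frac{f(x,y)}{f(y)} = \frac{x+y}{y + \frac 12} \\ E[X|Y=y] & = \int_0^1 x \cdot \frac{x+y}{y+\frac 12} ~ dx = \frac{\frac 13 + \frac y2}{y + \frac 12} = \frac{2+3y}{3(2y+1)}\end{align}\]

For instance, \(E[X|Y=0] = \frac 23\) and \(E[X|Y=1] = \frac 59\), so the conditional mean of X changes with y.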

Let X and Y be continuous rv. X and Y are independent if and only if:

\[f(x,y) = f(x)f(y)\]

Or equivalently:

\[f(x|y) = f(x), ~~ \text{for } -\infty \lt x,y \lt \infty\]
\[f(y|x) = f(y), ~~ \text{for } -\infty \lt x,y \lt \infty\]

Example: if \(X \sim N(0,1)\) and \(Y \sim N(0,1)\) are independent, then

\[f(x,y) = f(x)f(y) = \frac{1}{2\pi} e^{- \frac 12 (x^2+y^2)}\]

In R, the mvtnorm package can be used to compute such multivariate normal integrals.
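
To make the rectangle probability concrete without R, here is a rough Python sketch that integrates the independent standard normal density with a plain midpoint rule. It is a stand-in for illustration only, not the mvtnorm package; the function names and the grid size `n` are assumptions.

```python
from math import exp, pi
from statistics import NormalDist

def f(x, y):
    """Joint pdf of two independent N(0,1) rvs."""
    return exp(-0.5 * (x * x + y * y)) / (2.0 * pi)

def rect_prob(x1, x2, y1, y2, n=200):
    """Midpoint-rule approximation of Pr(x1 <= X <= x2, y1 <= Y <= y2)."""
    hx, hy = (x2 - x1) / n, (y2 - y1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * hx
        for j in range(n):
            y = y1 + (j + 0.5) * hy
            total += f(x, y)
    return total * hx * hy

# Under independence the probability factorizes into two univariate pieces.
phi = NormalDist().cdf
approx = rect_prob(-1.0, 1.0, -1.0, 1.0)
exact = (phi(1.0) - phi(-1.0)) ** 2
print(approx, exact)
```

The comparison value relies on independence: the double integral splits into the product of two one-dimensional normal probabilities.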

Covariance: measures direction but not strength of linear relationship between 2 rv's:

\[\begin{align} \sigma_{XY} & = E[(X-\mu_X)(Y - \mu_Y)] \\ & = \sum\limits_{(x,y) \in S_{XY}} (x-\mu_X)(y-\mu_Y) \cdot p(x,y) ~~ \text{(for discrete rvs)} \\ & = \int_{-\infty}^\infty \int_{-\infty}^\infty (x-\mu_X)(y-\mu_Y) f(x,y) ~ dx ~ dy ~~ \text{(for continuous rvs)}\end{align}\]

Correlation: measures direction and strength of linear relationship between 2 rv's:

\[\begin{align} \rho_{XY} & = Cor(X,Y) = \frac{Cov(X,Y)}{SD(X) \cdot SD(Y)} \\ & = \frac{\sigma_{XY}}{\sigma_X \cdot \sigma_Y} = \text{scaled covariance}\end{align}\]

⇒ This is sometimes called the Pearson correlation.

\(Cov(X,Y) = Cov(Y,X)\)
\(Cov(aX, bY) = a \cdot b \cdot Cov(X,Y)\)
\(Cov(X,X) = Var(X)\)

X and Y are independent ⇒ \(Cov(X,Y) = 0\)
But \(Cov(X,Y) = 0\) does not imply that X and Y are independent.
\(Cov(X,Y) = E[XY] - E[X]E[Y]\)
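
A small counterexample shows zero covariance without independence. The construction below (X uniform on {-1, 0, 1} and Y = X^2) is a standard textbook example, not taken from the notes; the helper name `expect` is an assumption.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X^2, so Y is fully determined by X.
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

def expect(g, joint):
    """E[g(X, Y)] under a discrete joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x, joint)
mu_y = expect(lambda x, y: y, joint)

# Covariance two ways: the definition and the shortcut E[XY] - E[X]E[Y].
cov_def = expect(lambda x, y: (x - mu_x) * (y - mu_y), joint)
cov_short = expect(lambda x, y: x * y, joint) - mu_x * mu_y

# Independence would require p(0, 0) == p_X(0) * p_Y(0).
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
print(cov_def, cov_short, joint[(0, 0)], p_x0 * p_y0)
```

Both covariance computations give zero, yet the joint probability at (0, 0) differs from the product of the marginals, so X and Y are dependent.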

Correlation is always bounded between -1 and 1 (it is unit free).

\(-1 \le \rho_{XY} \le 1\)
\(\rho_{XY} = 1\) if Y = aX + b and a > 0
\(\rho_{XY} = -1\) if Y = aX + b and a < 0
\(\rho_{XY} = 0\) if and only if \(\sigma_{XY} = 0\)
\(\rho_{XY} = 0\) does **not** imply that X and Y are independent in general.
\(\rho_{XY} = 0\) does imply that X and Y are independent if they are jointly (bivariate) normally distributed.
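
The two boundary cases can be verified in a few lines. Assuming \(Y = aX + b\) with \(a \neq 0\) and \(\sigma_X > 0\):

\[\begin{align} \sigma_{XY} & = E[(X-\mu_X)(aX + b - a\mu_X - b)] = a \cdot \sigma_X^2 \\ \sigma_Y & = \sqrt{a^2 \sigma_X^2} = |a| \cdot \sigma_X \\ \rho_{XY} & = \frac{a \cdot \sigma_X^2}{\sigma_X \cdot |a| \cdot \sigma_X} = \frac{a}{|a|}\end{align}\]

So \(\rho_{XY} = 1\) when a > 0 and \(\rho_{XY} = -1\) when a < 0, whatever the value of b.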

Let X and Y be distributed bivariate normal. The joint pdf is given by:

\[f(x,y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1-\rho^2}} \exp \left[ - \frac{1}{2(1-\rho^2)} \left[ \left( \frac{x - \mu_X}{\sigma_X} \right)^2 + \left( \frac{y - \mu_Y}{\sigma_Y} \right)^2 - \frac{2 \rho(x-\mu_X)(y-\mu_Y)}{\sigma_X \sigma_Y} \right] \right] \]

where \(\rho = \rho_{XY}\) is the correlation between X and Y.
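
A minimal sketch of the bivariate normal density as a Python function, written in terms of the standardized values \((x-\mu_X)/\sigma_X\) and \((y-\mu_Y)/\sigma_Y\). The function name `bvn_pdf` and the test points are assumptions for illustration.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def bvn_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Bivariate normal density; requires -1 < rho < 1 and positive sigmas."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    zx = (x - mu_x) / sigma_x  # standardized x
    zy = (y - mu_y) / sigma_y  # standardized y
    quad = zx * zx + zy * zy - 2.0 * rho * zx * zy
    norm = 2.0 * pi * sigma_x * sigma_y * sqrt(1.0 - rho * rho)
    return exp(-quad / (2.0 * (1.0 - rho * rho))) / norm

# With rho = 0 the density should factor into the two univariate normal pdfs.
joint_val = bvn_pdf(0.3, -0.2, 0.0, 0.0, 1.0, 1.0, 0.0)
product_val = NormalDist().pdf(0.3) * NormalDist().pdf(-0.2)
print(joint_val, product_val)
```

The factorization at \(\rho = 0\) is the reason zero correlation implies independence in the bivariate normal case.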

Let X and Y be rv's. We define Z as: Z = aX + bY. Then:
\(\mu_Z = a \cdot \mu_X + b \cdot \mu_Y\)
and \(\sigma_Z^2 = a^2 \sigma_X^2 + b^2 \sigma_Y^2 + 2a\cdot b \cdot \sigma_{XY}\)
If X and Y are jointly normal, with \(X \sim N(\mu_X,\sigma_X^2)\) and \(Y \sim N(\mu_Y,\sigma_Y^2)\), then \(Z \sim N(\mu_Z,\sigma_Z^2)\)
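
The mean and variance of Z follow from linearity of expectation and the definition of covariance. A short derivation, using only the symbols defined above:

\[\begin{align} \mu_Z & = E[aX + bY] = a \cdot E[X] + b \cdot E[Y] \\ Z - \mu_Z & = a(X - \mu_X) + b(Y - \mu_Y) \\ \sigma_Z^2 & = E[(Z-\mu_Z)^2] \\ & = a^2 E[(X-\mu_X)^2] + b^2 E[(Y-\mu_Y)^2] + 2a \cdot b \cdot E[(X-\mu_X)(Y-\mu_Y)] \\ & = a^2 \sigma_X^2 + b^2 \sigma_Y^2 + 2a \cdot b \cdot \sigma_{XY}\end{align}\]

The cross term is what distinguishes this from the independent case, where \(\sigma_{XY} = 0\).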

Portfolio return = \(R_P = x_A \cdot R_A + x_B \cdot R_B\)
\(x_A + x_B = 1\)
\(Cov(R_A,R_B) = \sigma_{AB}\) and \(Cor(R_A,R_B) = \frac {\sigma_{AB}}{\sigma_A \sigma_B}\)

\(E[R_P] = x_A \mu_A + x_B \mu_B\)
\(Var(R_P) = x_A^2 \sigma_A^2 + x_B^2 \sigma_B^2 + 2 x_A x_B \sigma_{AB}\)
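
A minimal sketch of the portfolio formulas in Python. All numbers are hypothetical and chosen only to make the arithmetic easy to follow; the function name `portfolio_moments` is an assumption.

```python
from math import sqrt

def portfolio_moments(x_a, mu_a, mu_b, sigma_a, sigma_b, rho_ab):
    """Mean and variance of R_P = x_A * R_A + x_B * R_B with x_A + x_B = 1."""
    x_b = 1.0 - x_a
    sigma_ab = rho_ab * sigma_a * sigma_b  # covariance from correlation
    mean = x_a * mu_a + x_b * mu_b
    var = (x_a ** 2 * sigma_a ** 2
           + x_b ** 2 * sigma_b ** 2
           + 2.0 * x_a * x_b * sigma_ab)
    return mean, var

# Hypothetical inputs: equal weights and a mildly negative correlation.
mean_p, var_p = portfolio_moments(0.5, 0.10, 0.05, 0.20, 0.10, -0.2)
sd_p = sqrt(var_p)
print(mean_p, var_p, sd_p)
```

With these inputs the portfolio standard deviation comes out well below that of asset A alone, because the negative covariance term offsets part of the variance.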

Let \(Z = \sum\limits_{i=1}^N a_i X_i\)
Then: \(\mu_Z = \sum\limits_{i=1}^N a_i \mu_i\)
If the \(X_i\) are jointly normally distributed (for example, independent normals), then Z is also normally distributed.
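
The notes give only the mean for the N-variable case. The variance, a standard result added here as an assumption about where the notes are heading, is \(\sigma_Z^2 = \sum_{i=1}^N \sum_{j=1}^N a_i a_j \sigma_{ij}\), where \(\sigma_{ij} = Cov(X_i, X_j)\) and \(\sigma_{ii} = Var(X_i)\). A sketch with hypothetical inputs:

```python
def linear_combo_moments(a, mu, cov):
    """Mean and variance of Z = sum_i a[i] * X_i.

    a   : list of weights a_i
    mu  : list of means mu_i
    cov : N x N covariance matrix as a list of lists, cov[i][j] = sigma_ij
    """
    n = len(a)
    if len(mu) != n or len(cov) != n or any(len(row) != n for row in cov):
        raise ValueError("dimension mismatch")
    mean = sum(a[i] * mu[i] for i in range(n))
    var = sum(a[i] * a[j] * cov[i][j] for i in range(n) for j in range(n))
    return mean, var

# Hypothetical two-variable case, which should match the two-variable formula.
a = [0.5, 0.5]
mu = [0.10, 0.05]
cov = [[0.04, -0.004],
       [-0.004, 0.01]]
mean_z, var_z = linear_combo_moments(a, mu, cov)
print(mean_z, var_z)
```

For N = 2 the double sum collapses to \(a_1^2 \sigma_1^2 + a_2^2 \sigma_2^2 + 2 a_1 a_2 \sigma_{12}\), the same expression as for Z = aX + bY.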

3.1 - Location-scale Model

Quantiles of normal distribution

3.2 - Bivariate Discrete Distributions

Marginal pdfs

Conditional probability

Conditional Mean and Variance

Independence

3.3 - Bivariate Continuous Distributions

Marginal and conditional distributions

Independence

3.4 - Covariance

Properties of Covariance

3.5 - Correlation and the Bivariate Normal Distribution

Properties of Correlation

Bivariate normal distribution

3.6 - Linear Combination of 2 random Variables

3.7 - Portfolio Example

Linear combination of N rvs