Table of Contents

VI - Logistic Regression

6.1 - Classification

6.2 - Hypothesis Representation

6.3 - Decision Boundary

6.4 - Cost Function

\[Cost(h_\theta(x),y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y=1 \\ -\log(1 - h_\theta(x)) & \text{if } y=0 \end{cases}\]
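
A minimal sketch of this per-example cost in Python/NumPy, assuming the sigmoid hypothesis $h_\theta(x) = g(\theta^T x)$ from 6.2 (the helper names here are illustrative, not from the course):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def cost_piecewise(h, y):
    """Per-example cost: -log(h) if y == 1, -log(1 - h) if y == 0."""
    return -np.log(h) if y == 1 else -np.log(1.0 - h)
```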

6.5 - Simplified Cost Function and Gradient Descent

\[Cost(h_\theta(x),y) = -y\,\log(h_\theta(x)) - (1-y)\,\log(1 - h_\theta(x))\]

\[ J(\theta) = -\frac{1}{m} \sum\limits_{i=1}^m \left[ y^{(i)}\,\log(h_\theta(x^{(i)})) + (1-y^{(i)})\,\log(1 - h_\theta(x^{(i)})) \right] \]
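
As a sketch, the full cost $J(\theta)$ vectorizes naturally in NumPy (here `X` is assumed to be the $m \times (n+1)$ design matrix with a leading column of ones, and `y` a length-$m$ vector of 0/1 labels):

```python
import numpy as np

def cost(theta, X, y):
    """J(theta) = -(1/m) * sum[ y*log(h) + (1-y)*log(1-h) ]."""
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))  # h_theta(x) for every example
    return -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / y.size
```

(In practice `h` is often clipped slightly away from 0 and 1 so the logs stay finite.)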

\[ \frac{\partial}{\partial\theta_j}J(\theta) = \frac{1}{m} \sum\limits_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)} \]

⇒ This is exactly the same partial derivative as in linear regression, except that the hypothesis $h_\theta(x)$ is now the sigmoid of $\theta^T x$ rather than $\theta^T x$ itself.
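
The gradient vectorizes the same way; a sketch under the same assumptions as above, with one batch gradient-descent update shown as a comment (`alpha` is a hypothetical learning rate, not a name from the course):

```python
import numpy as np

def gradient(theta, X, y):
    """dJ/dtheta_j = (1/m) * sum_i (h_theta(x^(i)) - y^(i)) * x_j^(i),
    i.e. (1/m) * X^T (h - y) in matrix form."""
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    return X.T @ (h - y) / y.size

# One gradient-descent step:
# theta = theta - alpha * gradient(theta, X, y)
```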

6.6 - Advanced Optimization

6.7 - Multiclass Classification: One-vs-all