\[J(\theta) = \frac{1}{2m} \left[ \sum\limits_{i=1}^m (h_{\theta}(x^{(i)}) - y^{(i)})^2 + \lambda \sum\limits_{j=1}^n \theta_j^2 \right]\]
\[ \frac{\partial}{\partial\theta_j}J(\theta) = \frac{1}{m} \left( \sum\limits_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)} \right) + \frac{\lambda}{m} \theta_j \]
⇒ Note that the previous derivative does not apply for \(\theta_0\).
\[ \theta_j := \theta_j (1 - \alpha \frac{\lambda}{m}) - \alpha \sum\limits_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)} \]
\[\theta = (X^TX + \lambda ~ P)^{-1}X^Ty\]
where P is similar to the identity matrix from \(\mathbb{R}^{(n+1) \times (n+1)}\) except that \(P_{11} = 0\).