====== Lecture 1 ======

===== 1.1 - Why do we need machine learning? =====

  * Because we don't know how to write the corresponding program!
  * Machine learning is good at:
    * Recognizing patterns
    * Recognizing anomalies
    * Prediction
  * Here we use the MNIST database of hand-written digits.

===== 1.2 - What are neural networks? =====

  * Each neuron has:
    * An axon
    * A dendritic tree
  * Synapses can adapt. They are very slow, and very low power.

===== 1.3 - Some simple models of neurons =====

  * For a **linear neuron**, the output is: \(y = b + \sum\limits_i x_i w_i\)
    * where \(b\) is a bias term, \(x_i\) the activity on input line \(i\), and \(w_i\) the weight of input \(i\).

==== Binary threshold neurons ====

  * A threshold decides whether we output 0 or 1; two equivalent formulations:
    * \(z = \sum\limits_i x_i w_i\), then \(y = 1\) if \(z > \theta\), else \(y = 0\).
    * \(z = b + \sum\limits_i x_i w_i\) (with \(b = -\theta\)), then \(y = 1\) if \(z > 0\), else \(y = 0\).

==== Rectified linear neurons ====

  * \(z = b + \sum\limits_i x_i w_i\), then \(y = z\) if \(z > 0\), else \(y = 0\).

==== Sigmoid neurons ====

  * \(z = b + \sum\limits_i x_i w_i\), then \(y = \frac{1}{1+e^{-z}}\)

==== Stochastic binary neurons ====

  * Same \(z = b + \sum\limits_i x_i w_i\), but the output is binary, with \(p(y=1) = \frac{1}{1+e^{-z}}\)

===== 1.4 - A simple example of learning =====

  * A simple neural net with one input layer and one output layer.
  * To train the network, we **increment** the weights from active pixels to the correct class.
  * We also **decrement** the weights from active pixels to whatever class the network actually guesses.

===== 1.5 - Three types of learning =====

  * Supervised learning
    * Regression
    * Classification
  * Reinforcement learning
  * Unsupervised learning

==== Supervised learning ====

  * We start by choosing a **model class**: \(y = f(x; W)\)
  * Learning means adjusting the parameters \(W\) to reduce the discrepancy between the target output \(t\) and the model output \(y\).
  * For regression, we usually measure the error with a term such as \(\frac{1}{2}(t-y)^2\)
  * For classification, other error measures are used.
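The neuron models of section 1.3 can be sketched in a few lines of plain Python; the function names below are mine, not from the lecture:

```python
import math
import random

def linear(x, w, b):
    """Linear neuron: y = b + sum_i x_i * w_i."""
    return b + sum(xi * wi for xi, wi in zip(x, w))

def binary_threshold(x, w, b):
    """Output 1 if b + sum_i x_i * w_i > 0, else 0 (theta = -b)."""
    return 1 if linear(x, w, b) > 0 else 0

def relu(x, w, b):
    """Rectified linear: output z when z > 0, else 0."""
    z = linear(x, w, b)
    return z if z > 0 else 0.0

def sigmoid(x, w, b):
    """Squash z smoothly into (0, 1)."""
    z = linear(x, w, b)
    return 1.0 / (1.0 + math.exp(-z))

def stochastic_binary(x, w, b):
    """Treat the sigmoid output as the probability of emitting a 1."""
    return 1 if random.random() < sigmoid(x, w, b) else 0
```

Note that all five neurons share the same weighted sum \(z\); they differ only in how the output is produced from it.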
==== Reinforcement learning ====

  * Learning is driven by rewards, which may arrive far in the future; this delay makes it difficult.

==== Unsupervised learning ====

  * Useful to get "an understanding" (e.g. an internal representation) of the input without labeling it.
  * Provides compact, low-dimensional representations of the input (PCA also does this, but PCA is linear).
  * Provides economical high-dimensional representations of the input.
  * Finds sensible clusters in the input.
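One of the uses listed above, finding sensible clusters in unlabeled input, can be illustrated with a minimal k-means sketch. K-means is not named in the lecture; it is just a standard example of unsupervised clustering, and the starting centres are passed in explicitly here to keep the example deterministic:

```python
def kmeans(points, centres, iters=20):
    """Plain k-means: repeatedly assign each point to its nearest
    centre, then move each centre to the mean of its assigned points."""
    centres = list(centres)
    k = len(centres)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # squared Euclidean distance to each centre
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2
                                      for a, b in zip(p, centres[i])))
            clusters[j].append(p)
        for i, c in enumerate(clusters):
            if c:  # keep the old centre if a cluster empties out
                centres[i] = tuple(sum(col) / len(c) for col in zip(*c))
    return centres

# Two obvious blobs, around (0, 0) and (10, 10):
data = [(0.1, 0.0), (0.0, 0.2), (-0.1, 0.1),
        (10.0, 9.9), (9.8, 10.1), (10.2, 10.0)]
found = sorted(kmeans(data, [data[0], data[3]]))
```

With one starting centre in each blob, the centres converge to the two blob means after the first iteration.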