====== Lecture 4 ======

===== 3.1 - Learning to predict the next word =====

  * Nothing relevant here.

===== 3.2 - A brief diversion into cognitive science =====

  * **feature theory**: a concept is a set of semantic features.
  * **structuralist theory**: the meaning of a concept lies in its relationships to other concepts.

===== 3.3 - Another diversion: The softmax output function =====

  * Each neuron in the output layer receives a total input of \(z_i\) and outputs a value \(y_i\) that also depends on the inputs to the other neurons in its group: \(y_i = \frac{e^{z_i}}{\sum\limits_{j \in group} e^{z_j}}\)
  * The derivative of the softmax is simple: \(\frac{\partial y_i}{\partial z_i} = y_i (1 - y_i)\)

==== Cross-entropy: the right cost function to use with softmax ====

  * \(C = - \sum\limits_j t_j \log(y_j)\)
  * C has a very big gradient when the target value is 1 and the output is almost zero (i.e. a very steep derivative when the answer is very wrong).
  * \(\frac{\partial C}{\partial z_i} = y_i - t_i\)
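The softmax formula and its diagonal derivative \(\frac{\partial y_i}{\partial z_i} = y_i (1 - y_i)\) can be checked numerically. A minimal sketch in pure Python (the function name ''softmax'' and the test values are my own, not from the lecture):

```python
import math

def softmax(z):
    """Softmax over a group of total inputs z_i: y_i = e^{z_i} / sum_j e^{z_j}."""
    m = max(z)                          # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

z = [1.0, 2.0, 3.0]
y = softmax(z)

# Check dy_i/dz_i = y_i * (1 - y_i) against a finite-difference estimate.
i, eps = 0, 1e-6
z_plus = list(z)
z_plus[i] += eps
numeric = (softmax(z_plus)[i] - y[i]) / eps
analytic = y[i] * (1 - y[i])
```

Because the outputs are normalised by the sum over the group, they are all positive and sum to 1, so the group behaves like a probability distribution.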
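The gradient \(\frac{\partial C}{\partial z_i} = y_i - t_i\) can be verified the same way (a sketch under the same assumptions; ''cross_entropy'' is my own helper name):

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(t, z):
    """C = -sum_j t_j * log(y_j), with y = softmax(z)."""
    y = softmax(z)
    return -sum(tj * math.log(yj) for tj, yj in zip(t, y))

z = [0.5, -1.0, 2.0]
t = [0.0, 0.0, 1.0]                     # one-hot target
y = softmax(z)

# Analytic gradient: dC/dz_i = y_i - t_i
grad = [yi - ti for yi, ti in zip(y, t)]

# Finite-difference gradient for comparison
eps = 1e-6
numeric_grad = []
for i in range(len(z)):
    z_plus = list(z)
    z_plus[i] += eps
    numeric_grad.append((cross_entropy(t, z_plus) - cross_entropy(t, z)) / eps)
```

Note how simple the combined gradient is: the log in the cross-entropy cancels the exponential in the softmax, leaving just \(y_i - t_i\).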