* feature theory: a concept is a set of semantic features
* structuralist theory: the meaning of a concept lies in its relationships to other concepts.
3.3 - Another diversion: The softmax output function
Each neuron in the output layer receives a total input \(z_i\) and outputs a value \(y_i\) that also depends on the total inputs of the other neurons in the softmax group: \(y_i = \frac{e^{z_i}}{\sum\limits_{j \in group} e^{z_j}}\)
The derivative of a softmax output with respect to its own total input is simple: \(\frac{\partial y_i}{\partial z_i} = y_i (1 - y_i)\)
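A minimal sketch of the softmax and its diagonal derivative in plain Python, assuming the group is passed as a list of total inputs. The function names are illustrative, and subtracting the maximum before exponentiating is a common numerical-stability trick that is not part of the notes. The finite-difference check is only there to compare against the \(y_i (1 - y_i)\) formula.

```python
import math

def softmax(z):
    """Softmax over a list of total inputs z (one group)."""
    m = max(z)  # shift by the max for numerical stability; the result is unchanged
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_diag_derivative(z):
    """Analytic dy_i/dz_i = y_i * (1 - y_i) for every i."""
    return [yi * (1.0 - yi) for yi in softmax(z)]

def numeric_diag_derivative(z, eps=1e-6):
    """Central finite-difference estimate of dy_i/dz_i, used as a sanity check."""
    grads = []
    for i in range(len(z)):
        z_plus = list(z)
        z_minus = list(z)
        z_plus[i] += eps
        z_minus[i] -= eps
        grads.append((softmax(z_plus)[i] - softmax(z_minus)[i]) / (2 * eps))
    return grads

z = [1.0, 2.0, 0.5]  # arbitrary example inputs
print(softmax(z))                  # outputs lie in (0, 1) and sum to 1
print(softmax_diag_derivative(z))  # analytic derivative
print(numeric_diag_derivative(z))  # should agree closely with the line above
```

Because every \(y_i\) shares the same denominator, changing one \(z_i\) also changes the other outputs. The formula above covers only the derivative of an output with respect to its own input.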
Cross-entropy: the right cost function to use with softmax
\(C = - \sum\limits_j t_j \log(y_j)\)
C has a very large gradient when the target value is 1 and the output is close to zero (i.e. the derivative is very steep when the answer is very wrong).
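The steepness can be seen numerically with a small sketch, assuming a one-hot target vector and plain Python lists. The helper names are illustrative. The identity \(\frac{\partial C}{\partial z_i} = y_i - t_i\) used in `grad_wrt_inputs` is the standard result for softmax combined with cross-entropy when the targets sum to 1; it is not derived in these notes.

```python
import math

def softmax(z):
    """Softmax over a list of total inputs z."""
    m = max(z)  # shift by the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(t, y):
    """C = -sum_j t_j * log(y_j); terms with t_j == 0 contribute nothing."""
    return -sum(tj * math.log(yj) for tj, yj in zip(t, y) if tj > 0)

def grad_wrt_outputs(t, y):
    """dC/dy_j = -t_j / y_j: blows up as y_j -> 0 when t_j == 1."""
    return [-tj / yj for tj, yj in zip(t, y)]

def grad_wrt_inputs(t, y):
    """dC/dz_i = y_i - t_i (assumes the targets sum to 1)."""
    return [yi - ti for ti, yi in zip(t, y)]

t = [1.0, 0.0, 0.0]                    # the correct class is the first one
y_good = softmax([4.0, 0.0, 0.0])      # confident and right
y_bad = softmax([-4.0, 0.0, 0.0])      # confident and wrong

for name, y in (("good", y_good), ("bad", y_bad)):
    print(name, cross_entropy(t, y), grad_wrt_outputs(t, y)[0], grad_wrt_inputs(t, y)[0])
```

For the wrong answer, \(\partial C / \partial y\) at the target unit is far larger in magnitude than for the right answer, while \(\partial C / \partial z\) stays bounded between -1 and 1.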