2. Algorithms and architectures
Three architectures have been implemented so far: Kohonen self organizing maps, counter-propagation neural networks, and back-propagation neural networks.
Kohonen self organizing maps
Kohonen self organizing maps were implemented as described in T. Kohonen, Self-Organization and Associative Memory, Springer, Berlin, 1988. A description of the method is available in the paper J. Gasteiger, J. Zupan, "Neural Networks in Chemistry", Angew. Chem. Int. Ed. Engl. 32 (1993) 503-527.
The topology of the network is always toroidal; the network size is chosen by the user (between 5x5 and 29x29). The user also chooses the number of epochs and the initial learning span.
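As an illustration of what a toroidal topology implies, the following minimal Python sketch computes the distance between two neurons on a map whose edges wrap around. The function name is hypothetical, and the use of square rings (Chebyshev distance) as the grid metric is an assumption: the text above does not state which metric JATOON uses.

```python
def toroidal_distance(a, b, rows, cols):
    """Ring distance between grid positions a and b on a torus.

    Assumption: neighbourhoods are square rings (Chebyshev distance);
    the grid metric actually used by JATOON is not stated.
    """
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    # The map wraps around in both directions, so take the shorter way.
    dr = min(dr, rows - dr)
    dc = min(dc, cols - dc)
    return max(dr, dc)
```

On a 5x5 map, opposite corners such as (0, 0) and (4, 4) are direct neighbours under this sketch, which is the practical consequence of having no edges.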
Before training, the weights at level i are initialized with random values between a - sd and a + sd, where a and sd are, respectively, the average and the standard deviation of the i-th variable over the objects in the training set.
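The initialization can be sketched in Python as follows. This is only an illustration under stated assumptions: the random values are drawn uniformly, the population standard deviation is used, and the names (init_weights, training_set) are hypothetical. The text above specifies only the interval.

```python
import random
import statistics

def init_weights(training_set, rows, cols, rng=None):
    """Return a rows x cols grid of weight vectors.

    Assumptions: uniform sampling and population standard deviation;
    only the interval [a - sd, a + sd] comes from the description.
    """
    rng = rng or random.Random()
    columns = list(zip(*training_set))        # one tuple per variable
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.pstdev(col) for col in columns]
    return [
        [
            [rng.uniform(a - sd, a + sd) for a, sd in zip(means, sds)]
            for _ in range(cols)
        ]
        for _ in range(rows)
    ]
```

A variable that is constant over the training set has sd = 0, so all of its weights start at exactly that constant value.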
During one epoch, n random objects from the training set are fed to the network in random order (n is the number of objects in the training set). Every time an object is presented to the network during training, one neuron is chosen as the winning neuron. This is the neuron whose weights are most similar to the object (the neuron at minimum Euclidean distance from the object). The weights of the winning neuron are adapted so that they become even more similar to the object; the weights of the neurons in its neighborhood are also adapted, although to a smaller extent. The radius, dL, of the neighborhood decreases during the training according to expression 2.1.
dL = ILS - epoch * ILS / epochmax (2.1)
where ILS is the initial learning span, epochmax is the epoch at which the training will be stopped, and epoch is the current epoch.
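Expression 2.1 translates directly into a one-line function. The Python sketch below uses hypothetical names; whether JATOON rounds or truncates the resulting radius is not stated, so the value is returned as a float.

```python
def neighbourhood_radius(epoch, epoch_max, ils):
    """Neighbourhood radius dL from expression 2.1.

    ils is the initial learning span; the radius shrinks linearly
    and reaches zero when epoch equals epoch_max.
    """
    return ils - epoch * ils / epoch_max
```

For example, with an initial learning span of 4 and 10 epochs, the radius is 2.0 at epoch 5.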
The weights are corrected according to expression 2.2.
wi(new) = wi(old) + (obji - wi(old)) * (0.4995 * (epochmax - epoch) / (epochmax - 1) + 0.0005) * (1 - d / (dL + 1)) (2.2)
where wi(new) is the i-th weight of the neuron after the correction, wi(old) is the i-th weight of the neuron before the correction, obji is the i-th variable of the object, and d is the distance from the winning neuron to the neuron being corrected.
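A minimal Python sketch of expression 2.2 is given below, with hypothetical function names. Two points are assumptions rather than statements from the text: neurons farther from the winner than dL are left unchanged, and epochs are counted from 1 (the rate term then runs from 0.5 at epoch 1 down to 0.0005 at the last epoch).

```python
def learning_rate(epoch, epoch_max):
    """Epoch-dependent term of expression 2.2 (requires epoch_max > 1)."""
    return 0.4995 * (epoch_max - epoch) / (epoch_max - 1) + 0.0005

def corrected_weights(weights, obj, d, d_l, epoch, epoch_max):
    """Apply expression 2.2 to one neuron's weight vector.

    d is the distance to the winning neuron, d_l the current radius.
    Assumption: neurons outside the neighbourhood (d > d_l) are not
    corrected.
    """
    if d > d_l:
        return list(weights)
    factor = learning_rate(epoch, epoch_max) * (1 - d / (d_l + 1))
    return [w + (x - w) * factor for w, x in zip(weights, obj)]
```

The winning neuron itself (d = 0) receives the full correction; the term 1 - d/(dL + 1) scales it down linearly for neurons farther away.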
Counter-propagation neural networks
Counter-propagation neural networks were implemented as described in R. Hecht-Nielsen, Appl. Opt. 1987, 26, 4979.
A counter-propagation neural network (CPG NN) has a Kohonen layer on top of an output layer. Each neuron in the Kohonen layer corresponds to one neuron in the output layer. In the current version of JATOON the output layer always has one dimension. During training, the winning neurons are chosen by the Kohonen layer (exactly as in a Kohonen NN). However, the weights of the Kohonen layer are not the only ones adapted: the weights of the output layer are also adapted so that they become closer to the output value of the presented object.
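The two-layer correction can be illustrated with the Python sketch below. It is an assumption, not a description of JATOON's code: the text does not give the formula for the output layer, so the sketch simply reuses the same correction factor for both layers. The names cpg_update and factor are hypothetical.

```python
def cpg_update(input_weights, output_weight, obj, target, factor):
    """Correct one neuron of a CPG network.

    input_weights: the neuron's Kohonen-layer weights
    output_weight: the neuron's single output-layer weight
    target:        the known output value of the presented object
    factor:        correction factor between 0 and 1 (assumed to be
                   the same for both layers)
    """
    new_inputs = [w + (x - w) * factor for w, x in zip(input_weights, obj)]
    new_output = output_weight + (target - output_weight) * factor
    return new_inputs, new_output
```

Note that the target value plays no part in choosing the winning neuron; it only enters through the output-layer correction.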
Training a CPG NN and training a Kohonen NN are similar tasks, and they were implemented in JATOON in a similar way. Nevertheless, the training of a CPG NN is supervised, whereas a Kohonen NN learns by unsupervised training (no 'right' output is taught for a given input).
When an object is presented to a Kohonen NN, it is mapped onto the surface, and the only result is the position of the neuron that is excited. In a CPG NN, there are two results: the position of the winning neuron and its output value.
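The difference between the two results can be sketched in Python as follows. The data layout (nested lists indexed by row and column) and the function names are hypothetical, and ties between equally close neurons are resolved in favour of the first one found, which is an assumption.

```python
def find_winner(kohonen_weights, obj):
    """Return the (row, col) of the neuron closest to obj."""
    best_pos, best_dist = None, float("inf")
    for r, row in enumerate(kohonen_weights):
        for c, weights in enumerate(row):
            # The squared Euclidean distance ranks neurons in the same
            # order as the Euclidean distance itself.
            dist = sum((w - x) ** 2 for w, x in zip(weights, obj))
            if dist < best_dist:
                best_pos, best_dist = (r, c), dist
    return best_pos

def cpg_result(kohonen_weights, output_weights, obj):
    """Return the winner's position and its output value."""
    r, c = find_winner(kohonen_weights, obj)
    return (r, c), output_weights[r][c]
```

For a Kohonen NN, find_winner alone gives the result; for a CPG NN, cpg_result additionally looks up the output-layer value stored at the winning position.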
Back-propagation neural networks
The back-propagation neural network is the best-known neural network architecture. In JATOON it was implemented as described in D. E. Rumelhart, G. E. Hinton, R. J. Williams in Parallel Distributed Processing, Vol. 1 (Eds.: D. E. Rumelhart, J. L. McClelland, and the PDP Research Group), MIT Press, Cambridge, MA, USA, 1986, pp. 318–362. A description of the method is available in the paper J. Gasteiger, J. Zupan, "Neural Networks in Chemistry", Angew. Chem. Int. Ed. Engl. 32 (1993) 503-527.
Architecture: the network is implemented with three layers, the input layer, the hidden layer, and the output layer. Each neuron is connected to all the neurons of the layer above, and to all the neurons of the layer below. The user can choose the number of input neurons (up to 150), hidden neurons (up to 30) and output neurons (up to 30).
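To make the fully connected three-layer structure concrete, here is a minimal forward pass in Python. The sigmoid transfer function and the absence of bias terms are assumptions made for brevity; the text above does not specify either, and all names are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic transfer function (assumed, not stated in the text)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_output(inputs, weights):
    """weights[j][i] connects neuron i below to neuron j above."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)))
        for row in weights
    ]

def forward(inputs, hidden_weights, output_weights):
    """Propagate inputs through the hidden layer to the output layer."""
    hidden = layer_output(inputs, hidden_weights)
    return layer_output(hidden, output_weights)
```

With the limits given above, hidden_weights would have at most 30 rows of at most 150 weights each, and output_weights at most 30 rows of at most 30 weights each.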
Training: the network learns by correction of the weights. Before training, the weights are randomly initialized between -1/m and 1/m (m is the number of neurons in the layer below). All the weights are corrected after each object from the training set is submitted to the network, and the correction of the weights is performed by the 'back-propagation of errors' algorithm. Two parameters influence the corrections: the rate parameter and the momentum parameter. Both are defined by the user before the training starts. During one epoch, n random objects from the training set are fed to the network in random order (n is the number of objects in the training set). The user also decides for how many epochs the network should train.
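The initialization and the role of the two parameters can be sketched in Python as follows. The weight change shown is the standard delta rule with momentum from the back-propagation literature; that JATOON applies it in exactly this form is an assumption, as are uniform sampling and the function names.

```python
import random

def init_layer(n_below, n_above, rng=None):
    """Random weights between -1/m and 1/m, with m = n_below.

    Assumption: values are drawn uniformly within the interval.
    """
    rng = rng or random.Random()
    limit = 1.0 / n_below
    return [
        [rng.uniform(-limit, limit) for _ in range(n_below)]
        for _ in range(n_above)
    ]

def weight_change(rate, delta, input_value, momentum, previous_change):
    """Delta rule with momentum for a single weight.

    delta is the back-propagated error of the neuron above and
    input_value the signal arriving through this weight.
    """
    return rate * delta * input_value + momentum * previous_change
```

The momentum term carries over a fraction of the previous correction, so successive corrections in the same direction reinforce each other.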
The ability of the network to apply its knowledge to new situations should be monitored with a test set. After every epoch, the objects of the test set are fed to the network and the outputs are compared to the true outputs. The error for the training set usually decreases throughout the training until it stabilizes. In contrast, the error for the test set usually decreases up to a point, and then the trend inverts; beyond this point the network is considered to be becoming 'over trained'.
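One simple way to locate the point where the test-set trend inverts is to record the test error after every epoch and pick the minimum. The Python sketch below assumes a root-mean-square error as the measure of disagreement (the text does not say which measure is used) and uses hypothetical names.

```python
import math

def rmse(predicted, expected):
    """Root-mean-square error between two equally long sequences."""
    total = sum((p - e) ** 2 for p, e in zip(predicted, expected))
    return math.sqrt(total / len(expected))

def best_epoch(test_errors):
    """Return the 1-based epoch with the lowest test-set error.

    test_errors[k] is the error measured after epoch k + 1; if the
    minimum occurs more than once, the earliest epoch is returned.
    """
    return test_errors.index(min(test_errors)) + 1
```

For a test-error history of 0.9, 0.5, 0.3, 0.35, 0.4, the minimum lies at epoch 3; training beyond that epoch would fall into the 'over trained' region described above.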