In this section, I look at an example of a limit on what kinds of thing can be learnt.

Linear associators are one of the simplest kinds of neural net, and
you've probably met them already. For those who haven't, here's a brief
description. See page 161 of *Understanding Cognitive Science* by
McTear (PSY KH:M 025) for a nice intro. Note that although I talk about
neurons and synapses, these nets (and indeed most connectionist models)
are not at all like real neurons.

Consider the following type of net. There are three inputs, i1, i2, and i3. There are three output neurons, o1, o2, and o3. Each input has synaptic connections to each of the three outputs. Thus, i1 has one connection to o1, another to o2, and a third to o3. The same for i2 and i3. Altogether, this gives nine connections.

Each connection has a given numeric *weight*. In general, I'll use wjk
to denote the weight on the connection from input ik to output oj. Each
output is calculated as follows. Multiply each of the inputs by the
corresponding weight, and then add them. In symbols:

    o1 = w11*i1 + w12*i2 + w13*i3
    o2 = w21*i1 + w22*i2 + w23*i3
    o3 = w31*i1 + w32*i2 + w33*i3

This architecture can be extended to any number of units, not just three. In fact, the number of inputs does not need to be the same as the number of outputs: the principles I describe below will still work. By setting the weights suitably (``training'' the net), such nets can act as recognisers, such that if you put one pattern in, you get a one on the first output and zeros elsewhere; if you put another pattern in, you get a one on the second output and zeros elsewhere; and so on.
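The calculation above is just a matrix-vector product, which makes it easy to sketch. The particular weights and input values below are invented for illustration only:

```python
import numpy as np

# Weight matrix W: entry W[j, k] is the weight on the connection
# from input k to output j. These numbers are made up.
W = np.array([[1.0, 0.0, -1.0],
              [0.5, 2.0,  0.0],
              [0.0, 1.0,  1.0]])

def associate(inputs):
    """Each output is the weighted sum of the inputs."""
    return W @ inputs

i = np.array([1.0, 2.0, 3.0])
print(associate(i))   # first output: 1*1 + 0*2 + (-1)*3 = -2.0
```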

Question 1: assume we have a linear associator with two inputs and one output. Can we train it to make the following association:

    inputs (1, 1) → output 1
    inputs (0, 0) → output 1
    inputs (1, 0) → output 0
    inputs (0, 1) → output 0

i.e. to give an output only when the inputs are the same? You may recognise this as (the complement of) the ``exclusive-or'' problem.

Answer 1: No. All linear associators are what mathematicians call *linear* - hence the name. This means they obey two rules:

- If you multiply each input by the same number c, then the
outputs will all be multiplied by c. I.e. if the net gives a pattern
of outputs O for a pattern of inputs I, it will give cO for cI.
- Suppose that for pattern I1 of inputs, the net gives pattern O1 of outputs; and that for pattern I2, it gives pattern O2. Then for inputs I1+I2, we get outputs O1+O2.

In particular, taking c = 0 in the first rule shows that the all-zero input must always give the all-zero output - so no linear associator can give an output of 1 for the inputs (0, 0), as the association above demands.
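Here is a quick numerical illustration of the c = 0 case. The weights are arbitrary; any values would show the same thing:

```python
import numpy as np

# A two-input, one-output linear associator with arbitrary weights.
w = np.array([0.7, -1.3])

def associate(inputs):
    return w @ inputs

# Rule 1 with c = 0: scaling any input pattern by 0 scales the output
# by 0, so the input (0, 0) always produces output 0 - never the 1
# that the "both inputs the same" association requires.
print(associate(np.array([0.0, 0.0])))   # 0.0, whatever the weights are
```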

The original Perceptron suffered from such a limitation, so you'll find
stuff about linear separability in most accounts of the Perceptron, e.g.
*Introduction to the theory of neural computation* by Hertz et al
(PSY KH:H 044). You could not, for example, train the Perceptron to
recognise all patterns containing exactly one dot while rejecting all
other images - see Crevier p 105.

Question 2: You read a paper whose author claims to have made a pattern recogniser that gives an output of 1 for one particular input pattern and 0 for every other input. He's done this by connecting 37 linear associators in sequence. Do you believe him?

Answer 2: This is also impossible. If you connect any number of linear associators together, the result is still linear (and can be represented as just one associator). I once heard (maybe apocryphally) of an academic who wrote several papers claiming to have implemented non-linear operations by combining linear nets!
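The collapse is easy to check as matrix algebra: feeding a pattern through a chain of associators just multiplies their weight matrices together, so the whole chain behaves as a single associator. A sketch with made-up random weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# A chain of 37 linear associators (2 inputs, 2 outputs each),
# with randomly chosen weights.
layers = [rng.standard_normal((2, 2)) for _ in range(37)]

def chain(inputs):
    """Feed the inputs through every associator in sequence."""
    for W in layers:
        inputs = W @ inputs
    return inputs

# The equivalent single associator: the product of all weight matrices.
combined = layers[0]
for W in layers[1:]:
    combined = W @ combined

i = np.array([1.0, -2.0])
assert np.allclose(chain(i), combined @ i)
print("chain of 37 == one associator")
```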

Wed Feb 14 23:47:23 GMT 1996