<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN'
'http://www.w3.org/TR/MathML2/dtd/xhtml-math11-f.dtd' [
	<!ENTITY deg "&#x00b0;">
	<!ENTITY le "&#x2264;">
	<!ENTITY forall "&#x2200;">
	<!ENTITY Delta "&#x2206;">
	<!ENTITY eta "&#x03b7;">
	<!ENTITY sum "&#x2211;">
	<!ENTITY prime "&#x2032;">
	<!ENTITY sigma "&#x03c3;">
	<!ENTITY sdot "&#x22c5;">
	<!ENTITY af "&#x2061;">
	<!ENTITY pi "&#x03c0;">
]>
<html xmlns='http://www.w3.org/1999/xhtml'>
<head>
  <meta http-equiv='Content-Type' content='text/html; charset=UTF-8' />
  <link rel='stylesheet' type='text/css' href='main.css' />
  <title>Saccadic eye movements</title>
</head>

<body>
<h1>Saccadic eye movements: oculomotor control in the superior colliculus</h1>

<div class='author'>
<a href='mailto:dog&#064;bluezoo.org'>Christopher Burdess</a>
</div>

<div>
 <b>Table of contents</b>
</div>
<ul>
  <li><a href='#introduction'>Introduction</a></li>
  <li><a href='#method'>Method</a></li>
  <li><a href='#implementation'>Implementation</a></li>
  <li><a href='#discussion'>Discussion</a></li>
  <li><a href='#references'>References</a></li>
</ul>

<div>
Note: this document contains equations in the 
<a href='http://www.w3.org/Math/'>MathML</a> format. If your browser cannot
understand this format, the above link may help to locate one that does.
You may also find a PDF copy <a href='saccades.pdf'>here</a>.
</div>
<hr />

<h3 id='introduction'>Introduction</h3>

<div>
An important part of human perceptual activity is the numerous eye movements
that must be made during even the shortest task. The organisation of
photoreceptors in the eye is such that the vast majority are located in a
small, shallowly depressed region in the centre of the retina called the
<i>fovea</i>, a window that receives light stimuli subtending approximately
1° of visual field. In trichromatic humans, most of the colour-sensitive cone
photoreceptors are located in this area, and the proportion of
contrast-sensitive rod receptors dominates in this region as well. It is
therefore often necessary to relocate the position of the eye so that the
image of an object that requires detailed perceptual analysis projects onto
this high-resolution area. The method by which this is done is known as a
<i>saccade</i> - an oculomotor reaction that moves the eyeball such that the
image of interest is fovealised.
</div>

<div>
A great deal of neurophysiological research has been carried out within the
field of oculomotor control. <a href='#SN1987'>Sparks &amp; Nelson [1987]</a>
have shown that saccades are triggered in the superior colliculus at the
dorsal side of the mesencephalon. A topographic mapping is in effect between
retinal photoreceptors and groups of neurons in the upper layer of the
superior colliculus, i.e. the neighbourhood of retinal neurons corresponds
proportionally to the neighbourhood of collicular neurons in such a way that a
<i>retinotopic map</i> is created in the superior colliculus. In a similar
way, the topographic organisation of neurons in the lower layer of the
superior colliculus corresponds to saccades triggered by neuronal stimulation
in this layer. Experimentation has shown that the intensity of artificial
stimulation of collicular neurons does not significantly affect the direction
of the resultant saccades: only their topographic location is important in
this respect.
</div>

<div>
This paper is a continuation of the work done by <a href='#RMS1992'>Ritter et
al [1992]</a> on the implementation of a model of the superior colliculus that
can learn saccadic behaviour, based on Robinson’s (1972) <i>fovealisation
hypothesis</i>, the hypothesis that the connections between the upper and
lower layers of the superior colliculus serve to learn and represent saccadic
motor movements to centre the object of interest in the visual field.
</div>

<div>
The method by which the model learns the saccades is in accordance with
current knowledge of the process of saccadic correction occurring in the
oculomotor system. When a saccade is triggered, the image may or may not be
fovealised as a result. If it is not, a corrective saccade is triggered, which
also may or may not cause fovealisation of the desired image. However, if the
image lies closer to the fovea after the corrective saccade than after the
first, the saccade weight for the first is modified to approach the vector sum
of the two saccade weights. Ritter et al describe this process
as a form of supervised learning. However, in the light of the fact that no
learning steps are made if the first saccade correctly fovealises the image or
if the corrective saccade leads the image further from the fovea than it was
after the first, and that learning occurs without resort to any kind of global
error measure, I prefer to consider this algorithm to be a form of
reinforcement learning.
</div>

<h3 id='method'>Method</h3>

<div>
As in <a href='#RMS1992'>Ritter et al’s [1992]</a> model, I have used a single
layer of formal neurons to represent the two layers of efferent and afferent
neurons in the superior colliculus, where the activation or excitation of
these neurons represents a receptive field of variable width centred on an
internal representation of the retina with respect to the object of interest.
Thus, there are two sets of connection weight values: firstly, those leading
to the formal neuron layer, or <i>lattice</i>, which are termed <i>lattice
weights</i>, and which are responsible for determining the receptive fields of
the lattice; and secondly, those values representing the saccade that is
triggered when the corresponding neuron in the lattice is excited.
</div>

<div class='figure'>
<img src='architecture.png' alt='architecture' />
<br />
<small>Figure 1: Schematic of the architecture of the model. White nodes
represent the stimulus, dark grey nodes represent the direction of the
saccade. In the centre, the light grey nodes represent the lattice units. Not
all lattice units are shown.</small>
</div>

<div>
Ritter et al use a simplification of the neurophysiological reality to
determine the saccade:
</div>

<blockquote>
  <div class='quote'>
  “...we describe the correspondence between the visual stimulus and the
  resultant saccade simply by a pair of values
  <math xmlns='http://www.w3.org/1998/Math/MathML'>
    <mfenced>
      <msub>
        <mi mathvariant='bold'>w</mi>
        <mi>r</mi>
      </msub>
      <msubsup>
        <mi mathvariant='bold'>w</mi>
        <mi>r</mi>
        <mrow>
          <mo>(</mo>
          <mtext>out</mtext>
          <mo>)</mo>
        </mrow>
      </msubsup>
    </mfenced>
  </math>
  of the centrally localized neuron. In reality, the resultant saccade is
  determined by a group of excited neurons localized at
  <math xmlns='http://www.w3.org/1998/Math/MathML'>
    <mi>r</mi>
  </math>
  .”
  </div>
  <div class='caption'>
  <a href='#RMS1992'>Ritter et al [1992]</a>: 146
  </div>
</blockquote>

<div>
The model outlined in this paper uses just such a group of neurons to
determine the actual saccade output. Specifically, a Gaussian distribution
over all the neurons in the lattice, centred on the neuron with the receptive
field centred on the location of the stimulus vector in retinal space, is used
to calculate the resultant saccadic movement. It is hoped that this endows the
model with a greater degree of neurobiological plausibility.
</div>

<div>
The lattice architecture is ring-shaped, as in Ritter et al’s model, with 20
concentric rings (radial layers) of 30 circumferential neurons each. The rotational
symmetry of this architecture is suggested by the rotational symmetry of the
input space. Ritter et al additionally mention that
</div>

<blockquote>
  <div class='quote'>
  “Every neural unit has two radial and two circumferential
  neighbours.”
  </div>
  <div class='caption'>
  <a href='#RMS1992'>Ritter et al [1992]</a>: 150
  </div>
</blockquote>

<div>
This cannot be true of every unit: what they are describing is a torus rather
than a cobweb. In a ring-shaped lattice, the neurons of the innermost (foveal)
and outermost (peripheral) rings have only one radial neighbour, and therefore
only three neighbours in total. We will return to
another problem with their description of the model later. The choice of
stimulus vector, as in Ritter et al’s model, is random yet determined by a
fixed probability density
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
    <mi mathvariant='normal'>P</mi>
    <mo>(</mo>
    <msup>
      <mi mathvariant='bold'>v</mi>
      <mi>L</mi>
    </msup>
    <mo>)</mo>
  </mrow>
</math>
corresponding to a Gaussian distribution centred on the centre of the retina,
with a width of 40&deg; and a maximum eccentricity of 90&deg;. This function
corresponds closely to the distribution of receptors on the retina with
distance from the centre, as noted by <a href='#K1982'>Korn [1982]</a>. No
stimuli are presented within the radius of the fovea (1&deg;). The initial
state of the lattice and saccade weights is also randomised.
</div>
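<div>
The sampling procedure just described can be sketched in a few lines of
Python. This is a minimal illustration rather than the code used for the
simulations: it assumes that the 40&deg; width is the standard deviation of an
isotropic two-dimensional Gaussian over retinal position, and the names
<code>sample_stimulus</code>, <code>GAUSSIAN_WIDTH</code>,
<code>FOVEA_RADIUS</code> and <code>VISUAL_FIELD_RADIUS</code> are
hypothetical.
</div>
<pre>
```python
import math
import random

VISUAL_FIELD_RADIUS = 90.0  # degrees; maximum eccentricity of a stimulus
FOVEA_RADIUS = 1.0          # degrees; no stimuli are presented inside this radius
GAUSSIAN_WIDTH = 40.0       # degrees; treated as a standard deviation (assumption)


def sample_stimulus(rng):
    """Draw one stimulus position (x, y) in degrees by rejection sampling."""
    while True:
        x = rng.gauss(0.0, GAUSSIAN_WIDTH)
        y = rng.gauss(0.0, GAUSSIAN_WIDTH)
        eccentricity = math.hypot(x, y)
        # Keep only points outside the fovea and inside the visual field.
        if VISUAL_FIELD_RADIUS >= eccentricity >= FOVEA_RADIUS:
            return (x, y)
```
</pre>
<div>
Rejection sampling keeps the sketch short; if the width were instead meant as
a Gaussian over eccentricity alone, only the two <code>rng.gauss</code> lines
would need to change.
</div>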

<div>
The algorithm for the development of this model of the superior colliculus is
as follows:
</div>
<ol>
  <li>Present a vector
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <msup>
        <mi mathvariant='bold'>v</mi>
        <mi>L</mi>
      </msup>
    </math>
    in accordance with the probability distribution
	<math xmlns='http://www.w3.org/1998/Math/MathML'>
      <mrow>
        <mi mathvariant='normal'>P</mi>
        <mo>(</mo>
        <msup>
          <mi mathvariant='bold'>v</mi>
          <mi>L</mi>
        </msup>
        <mo>)</mo>
      </mrow>
    </math>
    above.</li>
  <li>Find the centre of excitation
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <msup>
        <mi>c</mi>
        <mi>L</mi>
      </msup>
    </math>
    in the lattice according to the “winner-takes-all” condition
	<div class='equation'>
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <mrow>
        <mrow>
          <mo fence='true'>&#x2225;</mo>
          <msup>
            <mi mathvariant='bold'>v</mi>
            <mi>L</mi>
          </msup>
          <mo>-</mo>
          <msubsup>
            <mi mathvariant='bold'>w</mi>
            <msup>
              <mi>c</mi>
              <mi>L</mi>
            </msup>
            <mi>L</mi>
          </msubsup>
          <mo fence='true'>&#x2225;</mo>
        </mrow>
        <mo>&le;</mo>
        <mrow>
          <mo fence='true'>&#x2225;</mo>
          <msup>
            <mi mathvariant='bold'>v</mi>
            <mi>L</mi>
          </msup>
          <mo>-</mo>
          <msubsup>
            <mi mathvariant='bold'>w</mi>
            <mi>i</mi>
            <mi>L</mi>
          </msubsup>
          <mo fence='true'>&#x2225;</mo>
        </mrow>
        <mo>&forall;</mo>
        <mi>i</mi>
      </mrow>
    </math>
	</div>
  </li>
  <li>Perform the learning step for the lattice weights to form a Kohonen-like
    topology-conserving map onto the lattice
	<div class='equation'>
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <mrow>
        <mo>&Delta;</mo>
        <msubsup>
          <mi mathvariant='bold'>w</mi>
          <mi>i</mi>
          <mi>L</mi>
        </msubsup>
        <mo>=</mo>
        <msup>
          <mi>&eta;</mi>
          <mi>L</mi>
        </msup>
        <msubsup>
          <mi>h</mi>
          <mrow>
            <mi>i</mi>
            <mo>,</mo>
            <msup>
              <mi>c</mi>
              <mi>L</mi>
            </msup>
          </mrow>
          <mi>L</mi>
        </msubsup>
        <mo>(</mo>
        <msup>
          <mi mathvariant='bold'>v</mi>
          <mi>L</mi>
        </msup>
        <mo>-</mo>
        <msubsup>
          <mi mathvariant='bold'>w</mi>
          <mi>i</mi>
          <mi>L</mi>
        </msubsup>
        <mo>)</mo>
        <mo>&forall;</mo>
        <mi>i</mi>
      </mrow>
    </math>
	</div>
  </li>
  <li>Trigger the saccade centred on
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <msup>
        <mi>c</mi>
        <mi>L</mi>
      </msup>
    </math>
    so that the new position
	<math xmlns='http://www.w3.org/1998/Math/MathML'>
      <msup>
        <mi mathvariant='bold'>v</mi>
        <mi>S</mi>
      </msup>
    </math>
    of the image occurs at
	<div class='equation'>
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <mrow>
        <msup>
          <mi mathvariant='bold'>v</mi>
          <mi>S</mi>
        </msup>
        <mo>=</mo>
        <msup>
          <mi mathvariant='bold'>v</mi>
          <mi>L</mi>
        </msup>
        <mo>+</mo>
        <mfrac>
          <mrow>
            <munder>
              <mo>&sum;</mo>
              <mi>i</mi>
            </munder>
            <msubsup>
              <mi>h</mi>
              <mrow>
                <mi>i</mi>
                <mo>,</mo>
                <msup>
                  <mi>c</mi>
                  <mi>L</mi>
                </msup>
              </mrow>
              <mi>S</mi>
            </msubsup>
            <msubsup>
              <mi mathvariant='bold'>w</mi>
              <mi>i</mi>
              <mi>S</mi>
            </msubsup>
          </mrow>
          <mrow>
            <munder>
              <mo>&sum;</mo>
              <mi>i</mi>
            </munder>
            <msubsup>
              <mi>h</mi>
              <mrow>
                <mi>i</mi>
                <mo>,</mo>
                <msup>
                  <mi>c</mi>
                  <mi>L</mi>
                </msup>
              </mrow>
              <mi>S</mi>
            </msubsup>
          </mrow>
        </mfrac>
      </mrow>
    </math>
	</div>
  </li>
  <li>If the image is now in the fovea, i.e.
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <mrow>
        <mo fence='true'>&#x2225;</mo>
        <msup>
          <mi mathvariant='bold'>v</mi>
          <mi>S</mi>
        </msup>
        <mo fence='true'>&#x2225;</mo>
        <mo>&lt;</mo>
        <msub>
          <mi>r</mi>
          <mtext>fovea</mtext>
        </msub>
      </mrow>
    </math>
    , go to step (1)</li>
  <li>Find the new centre of excitation
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <msup>
        <mi>c</mi>
        <mi>S</mi>
      </msup>
    </math>
    in the lattice for the stimulus
	<math xmlns='http://www.w3.org/1998/Math/MathML'>
      <msup>
        <mi mathvariant='bold'>v</mi>
        <mi>S</mi>
      </msup>
    </math>
    according to
	<div class='equation'>
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <mrow>
        <mo fence='true'>&#x2225;</mo>
        <msup>
          <mi mathvariant='bold'>v</mi>
          <mi>S</mi>
        </msup>
        <mo>-</mo>
        <msubsup>
          <mi mathvariant='bold'>w</mi>
          <msup>
            <mi>c</mi>
            <mi>S</mi>
          </msup>
          <mi>L</mi>
        </msubsup>
        <mo fence='true'>&#x2225;</mo>
        <mo>&le;</mo>
        <mo fence='true'>&#x2225;</mo>
        <msup>
          <mi mathvariant='bold'>v</mi>
          <mi>S</mi>
        </msup>
        <mo>-</mo>
        <msubsup>
          <mi mathvariant='bold'>w</mi>
          <mi>i</mi>
          <mi>L</mi>
        </msubsup>
        <mo fence='true'>&#x2225;</mo>
        <mo>&forall;</mo>
        <mi>i</mi>
      </mrow>
    </math>
	</div>
  </li>
  <li>Trigger the saccade centred on
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <msup>
        <mi>c</mi>
        <mi>S</mi>
      </msup>
    </math>
    so that the new position
	<math xmlns='http://www.w3.org/1998/Math/MathML'>
      <msup>
        <mi mathvariant='bold'>v</mi>
        <mi>S&prime;</mi>
      </msup>
    </math>
    of the image occurs at
	<div class='equation'>
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <mrow>
        <msup>
          <mi mathvariant='bold'>v</mi>
          <mi>S&prime;</mi>
        </msup>
        <mo>=</mo>
        <msup>
          <mi mathvariant='bold'>v</mi>
          <mi>S</mi>
        </msup>
        <mo>+</mo>
        <mfrac>
          <mrow>
            <munder>
              <mo>&sum;</mo>
              <mi>i</mi>
            </munder>
            <msubsup>
              <mi>h</mi>
              <mrow>
                <mi>i</mi>
                <mo>,</mo>
                <msup>
                  <mi>c</mi>
                  <mi>S</mi>
                </msup>
              </mrow>
              <mi>S</mi>
            </msubsup>
            <msubsup>
              <mi mathvariant='bold'>w</mi>
              <mi>i</mi>
              <mi>S</mi>
            </msubsup>
          </mrow>
          <mrow>
            <munder>
              <mo>&sum;</mo>
              <mi>i</mi>
            </munder>
            <msubsup>
              <mi>h</mi>
              <mrow>
                <mi>i</mi>
                <mo>,</mo>
                <msup>
                  <mi>c</mi>
                  <mi>S</mi>
                </msup>
              </mrow>
              <mi>S</mi>
            </msubsup>
          </mrow>
        </mfrac>
      </mrow>
    </math>
	</div>
  </li>
  <li>If this corrective saccade is an improvement, i.e.
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <mrow>
        <mo fence='true'>&#x2225;</mo>
        <msup>
          <mi mathvariant='bold'>v</mi>
          <mi>S&prime;</mi>
        </msup>
        <mo fence='true'>&#x2225;</mo>
        <mo>&lt;</mo>
        <mo fence='true'>&#x2225;</mo>
        <msup>
          <mi mathvariant='bold'>v</mi>
          <mi>S</mi>
        </msup>
        <mo fence='true'>&#x2225;</mo>
      </mrow>
    </math>
    , perform the learning step for the saccade weights according to
	<div class='equation'>
    <math xmlns='http://www.w3.org/1998/Math/MathML'>
      <mrow>
        <mo>&Delta;</mo>
        <msubsup>
          <mi mathvariant='bold'>w</mi>
          <mi>i</mi>
          <mi>S</mi>
        </msubsup>
        <mo>=</mo>
        <msup>
          <mi>&eta;</mi>
          <mi>S</mi>
        </msup>
        <msubsup>
          <mi>h</mi>
          <mrow>
            <mi>i</mi>
            <mo>,</mo>
            <msup>
              <mi>c</mi>
              <mi>L</mi>
            </msup>
          </mrow>
          <mi>S</mi>
        </msubsup>
        <mo>(</mo>
        <mo>(</mo>
        <msubsup>
          <mi mathvariant='bold'>w</mi>
          <msup>
            <mi>c</mi>
            <mi>L</mi>
          </msup>
          <mi>S</mi>
        </msubsup>
        <mo>+</mo>
        <msubsup>
          <mi mathvariant='bold'>w</mi>
          <msup>
            <mi>c</mi>
            <mi>S</mi>
          </msup>
          <mi>S</mi>
        </msubsup>
        <mo>)</mo>
        <mo>-</mo>
        <msubsup>
          <mi mathvariant='bold'>w</mi>
          <mi>i</mi>
          <mi>S</mi>
        </msubsup>
        <mo>)</mo>
        <mo>&forall;</mo>
        <mi>i</mi>
      </mrow>
    </math>
	</div>
  </li>
  <li>Go to step (1)</li>
</ol>
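<div>
The steps above can be sketched as a single Python function. This is an
illustrative sketch under stated assumptions, not the original implementation:
the neighbourhood terms are taken to have the form exp(-d^2 / (2 sigma^2)),
<code>dist[i][u]</code> is a hypothetical precomputed table of lattice
distances, and all function and argument names are invented for the example.
Following the description of saccadic correction in the Introduction, the
learning target in step 8 is taken to be the sum of the saccade weights at the
first and second centres of excitation, and the corrective saccade in step 7
is weighted by the neighbourhood of the second centre.
</div>
<pre>
```python
import math


def gaussian(distance, sigma):
    """Assumed neighbourhood form: exp(-d^2 / (2 sigma^2))."""
    return math.exp(-(distance * distance) / (2.0 * sigma * sigma))


def norm(v):
    return math.hypot(v[0], v[1])


def winner(v, lattice_w):
    """Index of the unit whose lattice weight lies nearest to v (steps 2 and 6)."""
    return min(range(len(lattice_w)),
               key=lambda i: norm((v[0] - lattice_w[i][0], v[1] - lattice_w[i][1])))


def saccade(v, centre, saccade_w, dist, sigma_s):
    """Image position after a population saccade centred on `centre` (steps 4 and 7)."""
    h = [gaussian(dist[i][centre], sigma_s) for i in range(len(saccade_w))]
    total = sum(h)
    dx = sum(h_i * w[0] for h_i, w in zip(h, saccade_w)) / total
    dy = sum(h_i * w[1] for h_i, w in zip(h, saccade_w)) / total
    return (v[0] + dx, v[1] + dy)


def pull(weights, centre, target, rate, sigma, dist):
    """Move every weight towards target, scaled by rate and the neighbourhood of centre."""
    for i, w in enumerate(weights):
        step = rate * gaussian(dist[i][centre], sigma)
        weights[i] = (w[0] + step * (target[0] - w[0]),
                      w[1] + step * (target[1] - w[1]))


def training_step(v_l, lattice_w, saccade_w, dist,
                  eta_l, sigma_l, eta_s, sigma_s, r_fovea):
    """Run steps 2 to 8 once, updating both weight lists in place."""
    c_l = winner(v_l, lattice_w)                          # step 2
    pull(lattice_w, c_l, v_l, eta_l, sigma_l, dist)       # step 3
    v_s = saccade(v_l, c_l, saccade_w, dist, sigma_s)     # step 4
    if r_fovea > norm(v_s):                               # step 5
        return "fovealised"
    c_s = winner(v_s, lattice_w)                          # step 6
    v_s2 = saccade(v_s, c_s, saccade_w, dist, sigma_s)    # step 7
    if norm(v_s) > norm(v_s2):                            # step 8
        # The target is computed before any saccade weight is modified.
        target = (saccade_w[c_l][0] + saccade_w[c_s][0],
                  saccade_w[c_l][1] + saccade_w[c_s][1])
        pull(saccade_w, c_l, target, eta_s, sigma_s, dist)
        return "learned"
    return "unchanged"
```
</pre>
<div>
The returned string only reports which branch was taken; a full simulation
would call <code>training_step</code> once per stimulus presentation with the
current values of the learning rates and neighbourhood widths.
</div>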

<div>
The terms
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <msubsup>
    <mi>h</mi>
    <mrow>
      <mi>i</mi>
      <mo>,</mo>
      <mi>u</mi>
    </mrow>
    <mi>L</mi>
  </msubsup>
</math>
and
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <msubsup>
    <mi>h</mi>
    <mrow>
      <mi>i</mi>
      <mo>,</mo>
      <mi>u</mi>
    </mrow>
    <mi>S</mi>
  </msubsup>
</math>
are Gaussian functions of the magnitude of the distance
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
    <mo fence='true'>&#x2225;</mo>
    <mi>i</mi>
    <mo>-</mo>
    <mi>u</mi>
    <mo fence='true'>&#x2225;</mo>
  </mrow>
</math>
contingent on
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <msup>
    <mi>&sigma;</mi>
    <mi>L</mi>
  </msup>
</math>
and
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <msup>
    <mi>&sigma;</mi>
    <mi>S</mi>
  </msup>
</math>
, respectively, which in turn decrease over time, like the learning rates
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <msup>
    <mi>&eta;</mi>
    <mi>L</mi>
  </msup>
</math>
and
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <msup>
    <mi>&eta;</mi>
    <mi>S</mi>
  </msup>
</math>
, according to a standard exponential decay. The parameters in this model were
chosen to be
</div>
<ul>
  <li><math xmlns='http://www.w3.org/1998/Math/MathML'>
      <mrow>
        <msup>
          <mi>&eta;</mi>
          <mi>L</mi>
        </msup>
        <mo>(</mo>
        <mi>t</mi>
        <mo>)</mo>
      </mrow>
      <mo>=</mo>
      <mrow>
        <mn>0.3</mn>
        <mo>&sdot;</mo>
        <mi>exp</mi>
        <mo>(</mo>
        <mn>-0.0002</mn>
		<mi>t</mi>
        <mo>)</mo>
      </mrow>
    </math>
  </li>
  <li><math xmlns='http://www.w3.org/1998/Math/MathML'>
      <mrow>
        <msup>
          <mi>&sigma;</mi>
          <mi>L</mi>
        </msup>
        <mo>(</mo>
        <mi>t</mi>
        <mo>)</mo>
      </mrow>
      <mo>=</mo>
      <mrow>
        <mn>10</mn>
        <mo>&sdot;</mo>
        <mi>exp</mi>
        <mo>(</mo>
        <mn>-0.0003</mn>
		<mi>t</mi>
        <mo>)</mo>
      </mrow>
    </math>
  </li>
  <li><math xmlns='http://www.w3.org/1998/Math/MathML'>
      <mrow>
        <msup>
          <mi>&sigma;</mi>
          <mi>S</mi>
        </msup>
        <mo>(</mo>
        <mi>t</mi>
        <mo>)</mo>
      </mrow>
      <mo>=</mo>
      <mrow>
        <mn>3</mn>
        <mo>&sdot;</mo>
        <mi>exp</mi>
        <mo>(</mo>
        <mn>-0.0003</mn>
		<mi>t</mi>
        <mo>)</mo>
      </mrow>
    </math>
  </li>
</ul>
<div>
where
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mi>t</mi>
</math>
is an integer value indexing the time, or stimulus presentation. The actual
determination of the distance between two neurons
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mi>i</mi>
</math>
and 
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mi>u</mi>
</math>
is a complex matter since there are a number of ways in which this can be
calculated. Ritter et al describe the situation as follows:
</div>

<blockquote>
  <div class='quote'>
  “The precise form of the distance measure between two neural units in the
  lattice is inconsequential for the organizational process to converge;
  however, sometimes a certain metric may fit a problem better than other
  distance measures. ...we used the ‘Manhattan’ rather than the Euclidean
  metric. ... it is the distances in the lattice which determine the spatial
  interaction between the neurons themselves and, thereby, determine the
  distance-dependent adaptation steps in the model. ... (I)n the vicinity of
  the fovea ... receptors that are directly opposite to each other lie close
  together but have to learn saccades that differ as much as saccades that
  belong to receptive fields directly opposite ... Therefore, it makes sense
  to use the ‘Manhattan’ metric which yields, for the foveal and peripheral
  pair, the same lattice distance between the diametrically opposite neural
  units”.
  </div>
  <div class='caption'>
  <a href='#RMS1992'>Ritter et al [1992]</a>: 155
  </div>
</blockquote>

<div>
I would like to take issue with this explanation. By defining a distance
metric, one is also defining the architecture of the formal neuron lattice. If
the Manhattan metric is used, the lattice will be seen to have the form of a
cylinder rather than a ring, since the distance between two circumferential
units at foveal-receptive positions and peripheral-receptive positions is the
same. In practice, this means that it is unlikely that the lattice weights
will ever come to represent the input space effectively: the geometry of the
input space and the lattice architecture is such that unless the first few
learning steps cause one ring edge (foveal or peripheral) to become completely
encircled by the other, the lattice weights will come to represent the input
space in the form of a windsock, with one edge representing one half of the
visual field and the other the other half. The same problem occurs with the
straightforward Euclidean metric (counting only single units). Therefore, a
distance metric must be found that preserves the actual geometric structure of
the lattice. For the saccades, admittedly, the Manhattan metric may be
beneficial for the reasons Ritter et al give. However, unless they can explain
why different neighbourhood metrics should apply to groups of neurons that
appear quite homogeneous at the neurophysiological level, their justification
must be rejected as an ad hoc solution to the problem. How can
the system “know” that some neurons will come to represent a particular part
of the input space? If, for instance, some part of the visual field contained
significantly fewer objects of interest requiring fovealisation,
the network should come to represent that asymmetry naturally. In that case
(the “foveal” inner ring of the lattice being off-centre from the fovea),
neurons on the <i>same</i> side of the ring would have to learn very different
saccade vectors. In a biological implementation, the agent might have to cope
with this very same problem, in that after the lattice has formed, afferent
connections from the retina or from the executive interest-generating system
may be lesioned, providing an asymmetric stimulus probability distribution
which must nevertheless be properly represented in order for sensible
functioning to occur. Ritter et al’s model, in this case, will always be much
more severely impaired than one which uses a straightforward metric
representative of the true form of the lattice.
</div>
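<div>
The cylinder-like behaviour of the Manhattan metric can be illustrated with a
short Python sketch. It assumes that units are indexed by a (radial,
circumferential) pair and that the circumferential index wraps around the
ring; the function name <code>manhattan_distance</code> and the zero-based
indexing are hypothetical.
</div>
<pre>
```python
def manhattan_distance(unit_a, unit_b, n_circumferential):
    """Manhattan lattice distance between two (radial, circumferential) index pairs.

    The circumferential index is assumed to wrap around the ring.
    """
    radial_a, circ_a = unit_a
    radial_b, circ_b = unit_b
    circ_gap = abs(circ_a - circ_b)
    circ_gap = min(circ_gap, n_circumferential - circ_gap)
    return abs(radial_a - radial_b) + circ_gap


# Diametrically opposite units on the innermost and outermost of 20 rings
# of 30 units each.
inner = manhattan_distance((0, 0), (0, 15), 30)
outer = manhattan_distance((19, 0), (19, 15), 30)
```
</pre>
<div>
Both <code>inner</code> and <code>outer</code> evaluate to 15: under this
metric the foveal and peripheral rings have the same circumference, which is
the geometry of a cylinder rather than of a ring.
</div>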

<div>
In this model, I have chosen a suitable metric for the distance between units
in the lattice, namely
</div>

<div class='equation'>
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
    <mo fence='true'>&#x2225;</mo>
    <mi>i</mi>
    <mo>-</mo>
    <mi>u</mi>
    <mo fence='true'>&#x2225;</mo>
    <mo>=</mo>
    <msqrt>
      <msup>
        <mrow>
          <mfenced open='[' close=']'>
            <mrow>
              <mfenced>
                <mrow>
                  <msub>
                    <mi>r</mi>
                    <mi>i</mi>
                  </msub>
                  <mi>sin</mi><mo>&af;</mo>
                  <mfenced>
                    <mfrac>
                      <mrow>
                        <mn>2</mn>
                        <mi>&pi;</mi>
                        <msub>
                          <mi>c</mi>
                          <mi>i</mi>
                        </msub>
                      </mrow>
                      <mi>C</mi>
                    </mfrac>
                  </mfenced>
                </mrow>
              </mfenced>
              <mo>-</mo>
              <mfenced>
                <mrow>
                  <msub>
                    <mi>r</mi>
                    <mi>u</mi>
                  </msub>
                  <mi>sin</mi><mo>&af;</mo>
                  <mfenced>
                    <mfrac>
                      <mrow>
                        <mn>2</mn>
                        <mi>&pi;</mi>
                        <msub>
                          <mi>c</mi>
                          <mi>u</mi>
                        </msub>
                      </mrow>
                      <mi>C</mi>
                    </mfrac>
                  </mfenced>
                </mrow>
              </mfenced>
            </mrow>
          </mfenced>
        </mrow>
        <mn>2</mn>
      </msup>
      <mo>+</mo>
      <msup>
        <mrow>
          <mfenced open='[' close=']'>
            <mrow>
              <mfenced>
                <mrow>
                  <msub>
                    <mi>r</mi>
                    <mi>i</mi>
                  </msub>
                  <mi>cos</mi><mo>&af;</mo>
                  <mfenced>
                    <mfrac>
                      <mrow>
                        <mn>2</mn>
                        <mi>&pi;</mi>
                        <msub>
                          <mi>c</mi>
                          <mi>i</mi>
                        </msub>
                      </mrow>
                      <mi>C</mi>
                    </mfrac>
                  </mfenced>
                </mrow>
              </mfenced>
              <mo>-</mo>
              <mfenced>
                <mrow>
                  <msub>
                    <mi>r</mi>
                    <mi>u</mi>
                  </msub>
                  <mi>cos</mi><mo>&af;</mo>
                  <mfenced>
                    <mfrac>
                      <mrow>
                        <mn>2</mn>
                        <mi>&pi;</mi>
                        <msub>
                          <mi>c</mi>
                          <mi>u</mi>
                        </msub>
                      </mrow>
                      <mi>C</mi>
                    </mfrac>
                  </mfenced>
                </mrow>
              </mfenced>
            </mrow>
          </mfenced>
        </mrow>
        <mn>2</mn>
      </msup>
    </msqrt>
  </mrow>
</math>
</div>

<div>
where <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mi>r</mi>
</math>
is the radial component of a unit (or how far it is from the centre of the
lattice), <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mi>c</mi>
</math>
is the circumferential component of a unit (some proportion of
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mi>C</mi>
</math>
), and <math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mi>C</mi>
</math>
is the total number of circumferential units (30 in this model). This metric
represents the Euclidean distance between the positions of the units in the
lattice, assuming that they are arranged evenly in the ring. In this way, we
preserve the geometric structure of the lattice.
</div>
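<div>
As a hedged illustration, the Euclidean ring metric described above and the
lattice neighbourhood term built on it might be written in Python as follows.
Each unit is placed at radius r and angle 2 pi c / C. The function names are
hypothetical, the Gaussian form exp(-d^2 / (2 sigma^2)) is an assumption, and
the decay of the lattice neighbourhood width is taken from the parameter list
given earlier.
</div>
<pre>
```python
import math


def ring_distance(unit_i, unit_u, n_circumferential):
    """Euclidean distance between units placed evenly on concentric rings.

    Each unit is an (r, c) pair: r is its radial component (distance from the
    lattice centre) and c its circumferential index in range(n_circumferential).
    """
    r_i, c_i = unit_i
    r_u, c_u = unit_u
    angle_i = 2.0 * math.pi * c_i / n_circumferential
    angle_u = 2.0 * math.pi * c_u / n_circumferential
    dx = r_i * math.sin(angle_i) - r_u * math.sin(angle_u)
    dy = r_i * math.cos(angle_i) - r_u * math.cos(angle_u)
    return math.hypot(dx, dy)


def sigma_lattice(t):
    """Lattice neighbourhood width at presentation t: 10 exp(-0.0003 t)."""
    return 10.0 * math.exp(-0.0003 * t)


def neighbourhood(unit_i, unit_u, t, n_circumferential=30):
    """Gaussian lattice neighbourhood term; its exact form is an assumption."""
    d = ring_distance(unit_i, unit_u, n_circumferential)
    return math.exp(-(d * d) / (2.0 * sigma_lattice(t) ** 2))
```
</pre>
<div>
If the radial component is assumed to run from 1 for the innermost ring to 20
for the outermost, <code>ring_distance((1, 0), (1, 15), 30)</code> is
approximately 2, whereas <code>ring_distance((20, 0), (20, 15), 30)</code> is
approximately 40, so diametrically opposite units are far closer together at
the fovea than at the periphery.
</div>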

<h3 id='implementation'>Implementation</h3>

<div>
The above model was implemented and run for 16,000 training presentations on
an 80486DX2-66 processor. In the simulation, the two main windows represent the
states of the lattice and saccade weights. The outer ring circumscribes the
limit of the visual field. Each neuron’s position is represented in this space
by the retinal location of the centre of its receptive field. In the lattice
window, each neuron is connected by lines to its immediate neighbours. In the
saccade window, each line shows the saccade vector. Since this model uses a
group of neighbouring neurons to dictate the eventual position of the image
after the saccade, the actual position of the image may not be precisely where
the saccade vector appears to lead: the saccade weights of the units
surrounding the centre of excitation also contribute to the movement. To
make the weight space a little more comprehensible, I have used a greyscale
function to display the lattice connections and saccade vectors proportionally
to the radial component of their neural units in the lattice. In the
parameters window, the values of the parameters are plotted over a 20,000-step
time window. Additionally, an error measure was calculated every 200 timesteps
and plotted in white in this window. This measure is simply defined as
</div>

<div class='equation'>
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <mrow>
    <mi>E</mi>
    <mo>=</mo>
    <mfrac>
      <mrow>
        <munder>
		  <mo>&sum;</mo>
		  <mi>i</mi>
		</munder>
        <mo fence='true'>&#x2225;</mo>
        <msubsup>
          <mi mathvariant='bold'>w</mi>
          <mi>i</mi>
          <mi>L</mi>
        </msubsup>
        <mo>+</mo>
        <msubsup>
          <mi mathvariant='bold'>w</mi>
          <mi>i</mi>
          <mi>S</mi>
        </msubsup>
        <mo fence='true'>&#x2225;</mo>
      </mrow>
      <mi>N</mi>
    </mfrac>
  </mrow>
</math>
</div>

<div>
where
<math xmlns='http://www.w3.org/1998/Math/MathML'><mi>N</mi></math>
is the number of neurons in the lattice. The measure is expressed as a
proportion of the radius of the visual field.
</div>
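
<div>
The error measure can be sketched in a few lines of Python. This is only an
illustration of the formula above, not the code used for the simulation: the
function name, the array layout (one row per neuron) and the three-neuron
example values are all assumptions made here.
</div>

```python
import numpy as np

def saccade_error(w_lattice, w_saccade):
    """Mean Euclidean distance between lattice and saccade weight vectors.

    Both arguments are arrays of shape (N, 2), one row per neuron, with
    coordinates assumed to be in units of the visual-field radius.
    """
    w_lattice = np.asarray(w_lattice, dtype=float)
    w_saccade = np.asarray(w_saccade, dtype=float)
    diffs = w_lattice - w_saccade            # w_i^L - w_i^S for each neuron i
    norms = np.linalg.norm(diffs, axis=1)    # Euclidean norm of each difference
    return norms.sum() / len(norms)          # sum over i, divided by N


# Hypothetical three-neuron example (values chosen for easy arithmetic).
w_l = np.array([[0.3, 0.4], [0.0, 0.0], [1.0, 0.0]])
w_s = np.array([[0.0, 0.0], [0.0, 0.0], [0.4, 0.8]])
error = saccade_error(w_l, w_s)
```

<div>
In this example the three per-neuron distances are 0.5, 0 and 1, so the
measure evaluates to 0.5.
</div>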

<div>
Figure 2 shows the state of the network before any training has taken place.
The magnitude of the randomised saccade vectors was set to a small proportion
of the radius of the visual field. This helps the network to settle down
quickly, and was also the method used by Ritter et al, but is not strictly
necessary, as convergence of the saccade weights will still occur given enough
time.
</div>
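
<div>
One way to produce such an initial state is sketched below. The function
name, the uniform distribution and the value of <i>scale</i> (5% of the
radius) are assumptions chosen for illustration; the text above only says that
the initial magnitudes were a small proportion of the radius.
</div>

```python
import numpy as np

def init_saccade_weights(n_neurons, field_radius=1.0, scale=0.05, seed=0):
    """Random 2-D saccade vectors no longer than scale * field_radius."""
    rng = np.random.default_rng(seed)
    angles = rng.uniform(0.0, 2.0 * np.pi, size=n_neurons)
    lengths = rng.uniform(0.0, scale * field_radius, size=n_neurons)
    # Convert polar coordinates to Cartesian, one row per neuron.
    return np.column_stack((lengths * np.cos(angles), lengths * np.sin(angles)))


weights = init_saccade_weights(100)
```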

<div class='figure'>
<img src='s0.png' alt='state 0' />
<br />
<small>Figure 2: Lattice and saccade weight space before any
stimuli</small>
</div>

<div>
Figure 3 shows the state of the network after 4,000 timesteps. At this point
we can see that the lattice has formed a more or less isomorphic cobweb shape,
and the saccades have all become directed towards the fovea. The error measure
display shows, however, that the network’s performance on saccade learning
over time is nonmonotonic. This problem occurs because the lattice weights, in
the early stages, move around the input space following the stimulus vectors
to a great degree (
<math xmlns='http://www.w3.org/1998/Math/MathML'>
  <msup>
    <mi>&eta;</mi>
	<mi>L</mi>
  </msup>
</math> 
is relatively high).
This means that even if a saccade led directly into the fovea at one timestep,
the centre of the receptive field of that neuron might change in the next
timestep, causing the saccade to miss the fovea. It is therefore important for
the lattice to become relatively stable with respect to the saccades at an
early stage. This will be discussed further later.
</div>

<div class='figure'>
<img src='s1.png' alt='state 1' />
<br />
<small>Figure 3: Lattice and saccade weight space after 4,000 stimulus
presentations</small>
</div>

<div>
Figure 4 shows the state of the network after 8,000 timesteps. At this point,
the error measure is decreasing monotonically, as the lattice weights are
now much more stable with respect to the saccades.
</div>

<div class='figure'>
<img src='s2.png' alt='state 2' />
<br />
<small>Figure 4: Lattice and saccade weight space after 8,000 stimulus
presentations</small>
</div>

<div>
Figure 5 shows the state of the network after 16,000 timesteps. The total
error is now very small, so it would seem that all saccades lead directly
into the fovea. However, because a number of neighbouring neurons contribute
to each saccade, some saccades may still miss the fovea by an amount
proportional to the radial component of their location. This effect, given
that the neighbourhood function
<math xmlns='http://www.w3.org/1998/Math/MathML'>
 <msup>
  <mi>h</mi>
  <mi>S</mi>
 </msup>
</math>
of the saccades decreases monotonically to 0 in the limit, will eventually be
corrected. I examine the implications of this effect in the discussion that
follows.
</div>
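
<div>
The following sketch illustrates how a neighbourhood-weighted saccade
approaches the winning neuron’s own saccade vector as the neighbourhood
shrinks. It assumes a Gaussian neighbourhood over lattice distance and a
normalised weighted average; both are assumptions for illustration and may
differ from the exact combination rule of the model. The function name and
the five-neuron chain are hypothetical.
</div>

```python
import numpy as np

def effective_saccade(w_saccade, grid_pos, winner, width):
    """Gaussian-weighted average of saccade vectors around the winner."""
    # Squared lattice distance of every neuron from the winning neuron.
    d2 = np.sum((grid_pos - grid_pos[winner]) ** 2, axis=1)
    h = np.exp(-d2 / (2.0 * width ** 2))     # assumed Gaussian neighbourhood
    return (h[:, None] * w_saccade).sum(axis=0) / h.sum()


# Hypothetical 1-D chain of five neurons; saccade length grows with eccentricity.
grid_pos = np.arange(5, dtype=float)[:, None]
w_saccade = np.column_stack((-0.2 * np.arange(5), np.zeros(5)))

wide = effective_saccade(w_saccade, grid_pos, winner=4, width=1.0)
narrow = effective_saccade(w_saccade, grid_pos, winner=4, width=0.05)
```

<div>
With the narrow neighbourhood the result is essentially the winner’s own
vector. With the wide neighbourhood the outermost neuron’s saccade is pulled
towards the shorter vectors of its inner neighbours, so in this toy example it
is shorter than the winner’s own vector.
</div>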

<div class='figure'>
<img src='s3.png' alt='state 3' />
<br />
<small>Figure 5: Lattice and saccade weight space after 16,000 stimulus
presentations</small>
</div>

<h3>Discussion</h3>

<div>
Implementing a saccade strategy in which a number of neurons cooperate to
determine the eventual position of the image leads to some interesting
findings. The purpose of this model, as described above, is to endow the
simpler saccade-learning model of 
<a href='#RMS1992'>Ritter et al [1992]</a> with this particular feature.
However, the phenomenology of saccadic learning in humans, as documented by 
<a href='#BF1969'>Becker and Fuchs [1969]</a>, shows another interesting
feature: in humans, saccades almost never lead directly into the fovea but
usually land close by, whereupon a corrective saccade takes over and
fovealises the image. Ritter et al note this succinctly:
</div>

<blockquote>
  <div class='quote'>
  “Compared to these observations, our model learns its saccadic eye
  motions much too ‘well’ because at the end of the learning process all of
  our saccades precisely lead in the fovea.”
  </div>
  <div class='caption'>
  <a href='#RMS1992'>Ritter et al [1992]</a>: 160
  </div>
</blockquote>

<div>
They further hypothesise that the consistent error in saccadic learning in
humans may be related to the fact that humans are mobile agents that plan
ahead to be able to track relatively mobile objects. This sounds quite
plausible. On the other hand, let us examine the model above. When the
neighbourhood function
<math xmlns='http://www.w3.org/1998/Math/MathML'>
 <msup>
  <mi>h</mi>
  <mi>S</mi>
 </msup>
</math>
of the saccade weights is still
significant, any saccade still has an error proportional to the difference
between the saccade weight of the centre of excitation in the lattice and the
combined weights of other neurons in its vicinity. Due to the geometry of the
lattice once the lattice weights have been learned, any particular saccade
will tend to undershoot the centre of the visual field to some extent. This is
in accordance with the observations made on human subjects by 
<a href='#BF1969'>Becker and Fuchs [1969]</a>.
</div>

<div>
Although this model has used a monotonically decreasing neighbourhood and
learning rate for the saccades, let us now conceive of a model in which only
the neighbourhood and learning rates for the lattice weights decrease, or in
which the decay rates for
<math xmlns='http://www.w3.org/1998/Math/MathML'>
 <msup>
  <mi>h</mi>
  <mi>S</mi>
 </msup>
</math>
and
<math xmlns='http://www.w3.org/1998/Math/MathML'>
 <msup>
  <mi>&eta;</mi>
  <mi>S</mi>
 </msup>
</math>
over time are much smaller than for the
lattice weights. In such a scenario, the corrective power of the above model
is much reduced, and the saccades will consistently undershoot the fovea by
some small margin.
</div>
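
<div>
The scenario can be made concrete with a small sketch of two decay schedules.
The exponential form of the schedule and every numerical value except the
16,000 presentations are assumptions introduced here, not parameters of the
model.
</div>

```python
def decayed(initial, final, t, t_max):
    """Exponential interpolation from `initial` at t=0 to `final` at t=t_max."""
    return initial * (final / initial) ** (t / t_max)


T_MAX = 16000  # number of training presentations reported above

# Hypothetical start and end values, chosen only to contrast the two cases.
eta_lattice_end = decayed(1.0, 0.01, T_MAX, T_MAX)    # lattice rate decays strongly
h_saccade_mid = decayed(2.0, 1.5, T_MAX // 2, T_MAX)  # saccade neighbourhood, halfway
h_saccade_end = decayed(2.0, 1.5, T_MAX, T_MAX)       # and at the end of training
```

<div>
Under these assumed values the saccade neighbourhood is still wide when
training ends, so neighbouring neurons continue to contribute to every
saccade.
</div>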

<div>
The other outstanding issue is the early freezing of the receptive
fields in the lattice with respect to the saccades. The introduction of
contextual information from neurons in the vicinity of the centre of
excitation for a particular stimulus can be extremely important in the initial
ordering of the saccades. If no contextual information is present, it would
appear to be unjustified to introduce into the algorithm a learning rule that
updates neighbouring weights for a particular saccade, since those weights did
not contribute to the saccade in the first place.
Additionally, in a physical implementation of the model, there is always the
risk that some saccade weight fails to respond to learning commands, or that
interference from some external source affects the plasticity of the
connection. In such a situation, if a group of neurons is responsible for the
saccade, the network can learn to overcome the effect of the recalcitrant
connection to some extent. This is not possible in Ritter et al’s model.
</div>

<div>
In conclusion, we can see that this model of the oculomotor system offers
numerous benefits in terms of neurobiological plausibility and graceful
degradation, and hopefully serves to elucidate yet more clearly the issues
surrounding simple visual reflexes in humans and other mammals. It is also yet
another example of how a very simple learning principle, embodied in this
model by the process of corrective saccades, can come to be a powerful
organisational tool within a neural architecture.
</div>

<hr />

<h3>References</h3>

<div>
<small>
<a id='BF1969'>Becker, W, &amp; Fuchs, A F [1969]</a>
‘Further Properties of the Human Saccadic System: Eye Movements and Correction
Saccades with and without Visual Fixation Points’, <i>Vision Research</i> 9:
1247-1258<br />
<a id='K1982'>Korn, A [1982]</a> <i>Bildverarbeitung durch das
visuelle System. Fachberichte Messen, Steuern, Regeln</i> 8,
Springer-Verlag<br />
<a id='RMS1992'>Ritter, H, Martinetz, T, &amp; Schulten, K
[1992]</a> <i>Neural Computation and Self-Organizing Maps</i>,
Addison-Wesley<br />
<a id='SN1987'>Sparks, D L, &amp; Nelson, J S [1987]</a>
‘Sensory and Motor Maps in the Mammalian Superior Colliculus’, <i>Trends in
Neurosciences</i> 10: 312-317<br />
</small>
</div>
<hr />

<div>
 <a href='../index.html'>
  <img class='homeref' src='../dog.home.png' alt='Home' />
 </a>
</div>

</body>
</html>
