Welcome to The Neuromorphic Engineer

Neuromorphic robot vision


Kazuhiro Shimonomura and Tetsuya Yagi

1 December 2005

Mixed analog-digital architecture is used to perform real-time binocular disparity computation.

Robotic vision is the most fascinating and feasible application of neuromorphic engineering, since processing images in real time with low power consumption is the field's most critical requirement. Conventional machine-vision systems have been implemented using CMOS (complementary metal-oxide semiconductor) imagers or CCD (charge-coupled device) cameras interfaced to digital-processing systems that run serial algorithms. Such systems often consume too much power, are too large, and are too computationally costly.1

Though neuromorphic technology has advantages in these areas, it also has disadvantages: current implementations, for example, are less programmable than digital-processing architectures. In addition, digital image processing has a long history, and highly developed hardware and software for pattern recognition are readily available. We therefore think it practical, at least at the current stage of progress in neuromorphic engineering, to combine neuromorphic sensors with conventional digital technology to implement, for robot vision, the computational essence of what the brain does. On this basis, we designed a neuromorphic vision system consisting of analog VLSI neuromorphic chips and field-programmable gate array (FPGA) circuits.


Figure 1. Block diagram of the mixed analog-digital system and the analog multi-chip circuit for an orientation-selective response.


Figure 2. Photograph of the binocular vision system.


Figure 3. Real-time disparity computation with the binocular vision system. (A) Sketch of the experimental setup. (B) Neural images of complex cells tuned to near, fixation, and far disparities.

Figure 1 shows a block diagram of the system, which consists of silicon retinas, 'simple-cell' chips (named after the simple cells in the V1 area of the brain), and FPGA circuits. The silicon retina is implemented with active pixel sensors (conventional, sampled photosensors)2 and has a concentric center-surround, Laplacian-of-Gaussian-like receptive field. Its output image is transferred serially to the simple-cell chips. These chips then aggregate analog pixel outputs from the silicon retina to generate an orientation-selective response similar to the simple-cell response in the primary visual cortex.3 The architecture mimics the feed-forward model proposed by Hubel and Wiesel4 and efficiently computes a two-dimensional Gabor-like receptive field from the concentric center-surround receptive fields.
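This feed-forward construction can be sketched numerically: a center-surround (difference-of-Gaussians) output stands in for the silicon retina, and its responses are summed at several positions along the preferred orientation to yield a Gabor-like, orientation-selective response. The function names, filter parameters, and tap count below are illustrative assumptions, not taken from the chip design.

```python
import numpy as np

def gaussian_kernel(sigma):
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()

def blur(img, sigma):
    """Separable Gaussian blur: convolve columns, then rows."""
    g = gaussian_kernel(sigma)
    out = np.apply_along_axis(np.convolve, 0, img, g, mode="same")
    return np.apply_along_axis(np.convolve, 1, out, g, mode="same")

def center_surround(img, sigma_c=1.0, sigma_s=2.0):
    """Concentric center-surround (difference-of-Gaussians) field,
    standing in for the silicon retina's Laplacian-of-Gaussian-like output."""
    return blur(img, sigma_c) - blur(img, sigma_s)

def simple_cell(img, theta, n_taps=5, spacing=2.0):
    """Feed-forward Hubel-Wiesel model: sum center-surround outputs at
    several positions along the preferred orientation theta, producing a
    Gabor-like, orientation-selective response (parameters illustrative)."""
    cs = center_surround(img)
    dy = int(round(np.sin(theta) * spacing))
    dx = int(round(np.cos(theta) * spacing))
    resp = np.zeros_like(cs)
    for k in range(-(n_taps // 2), n_taps // 2 + 1):
        resp += np.roll(cs, (k * dy, k * dx), axis=(0, 1))
    return resp

# A vertical bar stimulus: the vertically tuned unit should respond
# more strongly than the horizontally tuned one.
img = np.zeros((32, 32))
img[:, 14:18] = 1.0
r_vert = simple_cell(img, np.pi / 2)
r_horz = simple_cell(img, 0.0)
```

For the vertical bar, the summation is coherent along the bar for the vertically tuned unit and smeared across it for the horizontally tuned one, so the former's peak response is larger, mirroring simple-cell orientation selectivity.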

The signal transfer from the silicon retina to the simple-cell chip is performed using analog voltages, aided by analog memories embedded in each pixel of the simple-cell chip. The output image of the simple-cell chip is then converted into a digital signal and fed into the FPGA circuits, where the image is further processed in parallel (not serially) with programmable logic. An example of this real-time computation is shown in Figure 3.

The system emulates the responses of complex cells tuned to particular binocular disparities, based on the disparity-energy model.5 Here, the emulation was carried out with the binocular platform shown in Figure 2. In the experiment, a hand moved from far to near, crossing the fixation point (see Figure 3A). Figure 3B shows the disparity energy computed by the FPGA circuits. Three complex-cell layers, tuned to near, fixation-point, and far disparities respectively, were computed in parallel. As Figure 3B shows, the maximum response appears in the corresponding disparity zone as the hand approaches the silicon retina. The neural image of the disparity-energy model is thus visualized in real time.
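The disparity-energy computation can likewise be sketched in a few lines. In this one-dimensional toy, quadrature Gabor pairs stand in for the simple-cell chip outputs, and the right-eye receptive field is position-shifted by the cell's preferred disparity, following the disparity-energy model of Ohzawa et al.5 The parameters and stimulus are illustrative assumptions, not the system's FPGA implementation.

```python
import numpy as np

def gabor_pair(x, sigma=4.0, freq=0.25):
    """Quadrature pair of 1-D Gabor receptive fields (even and odd phase)."""
    env = np.exp(-x**2 / (2.0 * sigma**2))
    arg = 2.0 * np.pi * freq * x
    return env * np.cos(arg), env * np.sin(arg)

def disparity_energy(left, right, d):
    """Disparity-energy response of a complex cell tuned to disparity d:
    the right-eye receptive field is shifted by d, the binocular
    simple-cell sums are squared and added."""
    x = np.arange(len(left), dtype=float) - len(left) // 2
    ge_l, go_l = gabor_pair(x)
    ge_r, go_r = gabor_pair(x - d)   # shift encodes the preferred disparity
    s_even = left @ ge_l + right @ ge_r
    s_odd = left @ go_l + right @ go_r
    return s_even**2 + s_odd**2

# A thin bar imaged with a true disparity of 3 pixels between the eyes.
n = 64
left = np.zeros(n)
left[32] = 1.0
right = np.zeros(n)
right[35] = 1.0
energies = {d: disparity_energy(left, right, d) for d in (-3, 0, 3)}
```

With this stimulus, the cell tuned to the true 3-pixel disparity yields the largest energy of the three, which is the selectivity visualized across the three complex-cell layers in Figure 3B.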

Simple cells are known to exhibit more-or-less linear receptive fields.5,6 From this point of view, it makes sense to emulate the simple cell with analog chips, without resorting to a spike representation. Beyond the simple cell, however, computation in the brain deviates significantly from such linear representations. We can therefore use digital technology to compute the nonlinear properties of the complex-cell receptive field, making the system compatible with conventional machine-vision techniques. Another possible area to explore is computation using the spike representation. Together, both of these approaches will further robot-vision research; which one we use for a given application will depend on the type of visual-cortex computation we need to implement.




Authors

Kazuhiro Shimonomura
Department of Electronic Engineering, Osaka University
http://brain.ele.eng.osaka-u.ac.jp/indexe.html

Tetsuya Yagi
Department of Electronic Engineering, Osaka University


References
  1. G. Indiveri and R. Douglas, Neuromorphic vision sensors, Science 288, pp. 1189-1190, 2000.

  2. S. Kameda and T. Yagi, An analog VLSI chip emulating sustained and transient response channels of the vertebrate retina, IEEE Trans. on Neural Networks 14 (5), pp. 1405-1412, 2003.

  3. K. Shimonomura and T. Yagi, A multi-chip aVLSI system emulating orientation selectivity of primary visual cortical cell, IEEE Trans. on Neural Networks 16 (4), pp. 972-979, 2005.

  4. D. Hubel and T. Wiesel, Receptive fields, binocular interaction and functional architecture in the cat's visual cortex, J. Physiol. (London) 160, pp. 106-154, 1962.

  5. I. Ohzawa, G. DeAngelis and R. Freeman, Stereoscopic depth discrimination in the visual cortex: Neurons ideally suited as disparity detectors, Science 249, pp. 1037-1041, 1990.

  6. J. A. Movshon, I. D. Thompson and D. J. Tolhurst, Spatial summation in the receptive fields of simple cells in the cat's striate cortex, J. Physiol. (London) 283, pp. 53-77, 1978.


 
DOI:  10.2417/1200512.0026



