Computational and Theoretical Neuroscience
Krishnamoorthy V. Iyer
Department of Electrical Engineering, IIT Bombay
August 22, 2011
1 Introduction
Brain as a Computer
(Human) Brain: Neuroscience Basics 101
Outline of Lectures 1 and 2
2 Computer Vision and (Visual) Computational Neuroscience
Computer Vision and Visual Neuroscience Basics
Primary Visual Cortex and Low-Level Psychophysical Visual Phenomena
Organization of Part II
Basics of Retina and V1 Physiology and Architecture
Visual Field Representation in V1 - the Complex Log Map
Computation in Primary Visual Cortex
How V1 may do stereopsis
The (Human) Brain
Remember this is a presentation, NOT a paper
Keep the number of references to a minimum: not more than ten references over the 2 talks, an average of 5 per talk
Sejnowski and Churchland; Schwartz; Schwartz and Yeshurun; David Marr; Laurence Abbott and Peter Dayan; William Bialek's Spikes; one or two papers by Bialek and co-workers; maybe one or two websites
This also reduces the amount of work
Theme of this lecture: Brain as a “Computer”
Question: Is the brain a "computer" in the sense that a Pentium or a Macintosh is? Obviously not!
Calling the brain a "computer" is a metaphor ...
Like describing an electron as "both" a "particle" and a "wave" - the metaphor refers to the use of the mathematical techniques of wave equations, as well as of properties associated with particles, such as position, for instance
What the brain as a "computer" metaphor refers to:
The brain as an information processor
How neural architecture and dynamics represent information (neural code) and process information (neural computation)
Mathematical description, i.e. building mathematical models of neural architecture, dynamics, development, and neural representations and computation
What is a computational explanation of a physical event?
Refers to the information content of the physical signals, and how the information is used to accomplish a task
Not all physical events have information content: their description in terms of causation and the laws of physics suffices to give an "understanding" of the phenomenon
Some do: when we type numbers into a calculator and receive an answer
These require an explanation that describes the computation, and not merely one at the level of dynamics
The Human Brain: Major Regions
Before studying the brain as a "computer", we need to know something about the brain
Much of the brain is divided into two hemispheres, a left and a right
Connecting link is a bundle of nerves: corpus callosum
The (Human) Brain: Major Regions Contd
Cortex and Thalamus
(Cerebral) Cortex:
In popular parlance, the cerebrum “is” the brain.
Cortex is Latin for “bark” or “outer rind”
Seat of all higher mental processes:
Sensory information processing
Movement
Language (speech and comprehension)
Reason and empathy
Hugely developed in mammals.
Thalamus: Cortex's "Mini-Me": 1-1 correspondence with cortical regions
Gateway: 90 percent of information to cortex (except smell) is routed through it
Major Regions Contd
Hippocampus
Hippocampus: (Episodic) Memory: Where were you and what were you doing when the World Trade Center was destroyed?
Long-Term Memory: Autobiographical details
Not crucial for short-term memory
Patient HM
Patient Clive Wearing
"Fifty First Dates" starring Drew Barrymore
"Ghajini" starring Aamir Khan
Does not deal with muscle memory, for ex: riding a bicycle, dancing, playing an instrument
Other sub-cortical structures
1 Hypothalamus: Basic biological needs/drives
2 Basal Ganglia: Motivation aka "The Will"; affected in Parkinson's disease
3 Cerebellum: Fine motor control: Playing the violin (but learning how to play involves the motor cortex)
4 Amygdala: Experience of fear: Famous case of a woman patient who literally did not experience fear as a consequence of damage to the amygdala (Damasio)
Focus of this talk: (Primary Visual) Cortex
Historically, neuroscientists have considered two kinds of divisions of cortical regions
1 Anatomical:
2 Functional:
Cortex
Anatomical Divisions
Note the colors are purely for visual appeal!
The cortical surface is a dull grey (hence, “grey matter”)
Anatomical divisions include:
1 Occipital: (almost exclusively) Vision
2 Temporal: Vision, (also Hearing i.e. Audition)
3 Parietal: Body Sense (somatosensory) and multisensory modality integration (including Vision)
4 Frontal: Reason and Empathy (Psychopaths are suspected to have problems with neural circuitry in this region).
Cortex
Functional Divisions
Functional divisions include:
1 Visual Cortex: Vision - more than 50% of the cortex in humans and other primates is devoted exclusively to vision
2 Auditory Cortex: Hearing
3 Somatosensory Cortex: Body sense and touch; the Phantom Limb phenomenon
4 Motor Cortex: (Learning to) Dance, (Learning to) Play an Instrument
Cortex and Thalamus Relation
Mini-Me
Recurring Theme
Do anatomical divisions correspond to functional divisions?
Brain as a “Computer”
Some Misconceptions
The architecture of the brain has very little in common with the von Neumann architecture
The circuitry consists of multiple (and overlapping) spatial scales of organization
Likewise, the dynamics consists of multiple (and overlapping) temporal scales of activity
Brain as a “Computer”
Some Misconceptions Contd
No CPU, and hence no homunculus
The frontal cortex may be said to come closest to the notion of a CPU
No "homunculus": hence the naive theories of visual perception as a TV screen in the brain must be discarded
Representation of information in the brain becomes a fundamental problem
The left hemifield of vision projects to the right cerebral cortex, and the right hemifield to the left
The hippocampus comes closest to the notion of a hard drive
So what does neural architecture look like?
Levels of Organization
correspond closely to Spatial Scales
1 (Large) Molecules: Ion Channels (Proteins). Scale: 10−4 micrometer
2 Synapses: Scale: 1 micrometer
3 Single Neuron: Scale: 100 micrometers (the neuron is the primary brain cell involved in signaling)
4 Networks (aka Microcircuitry): Scale: 1 millimeter
5 Maps: Scale: 1 cm. Histologically well-defined areas within a brain structure. Ex: Striate Cortex
6 Systems: Major anatomical regions. Ex: Cortex, Hippocampus
7 Central Nervous System (CNS): Scale: 1 meter
Focus on levels 3, 4 and 5; the map level, i.e. ((Primary) Visual) Cortex, is level 5
Other kinds of spatial organization
1 Laminar i.e. layered structure:
All regions of cortex, except the region devoted to smell, have 6 layers.
There is cross-connectivity b/w the layers as well as within the layers
2 Topographic organization and self-similarity in the map from retina to cortex: concatenated complex log map
3 Columnar organization
Temporal scales
Events of interest have time scales ranging from:
1 10−3 seconds. Ex: generation of an action potential
2 10 years. Ex: developmental changes associated with the life of the organism
3 and many time scales in-between
Outline of this lecture
1 Part I: What is computational and theoretical neuroscience?
How can computer scientists contribute to computational and theoretical neuroscience?
2 Part II: (Primary) Visual Cortex: Hero: Eric Schwartz
Visual Cortex is the part of the brain devoted to visual information processing
Attention: CS475, CS675, CS663
Outline of the next lecture
1 Part III: Single Neuron: Star: William Bialek
A principle in much of Bialek's work is that neural systems (and biological systems more generally) may have reached optimum performance limits under the constraints of evolutionary history, noise, and energy consumption
Attention: CS435 and CS709
2 Part IV: Networks of Neurons: Laurence Abbott, Nancy Kopell
3 Part V: Other neuroscientists whose work I would encourage you to explore
4 Part VI: Conclusion and Discussion: CS students interested in Neuroscience
Computational and Theoretical neuroscience: What it is NOT
1 “Branch” of Biology, but overlaps with it
2 “Branch” of Neurology: Study of diseases of the brain: properly a branch of medicine.
Sridevi Vedula Sarma, PhD (EECS, MIT, LIDS), now at Johns Hopkins
Nandini Chatterjee-Singh, PhD (Physics, University of Pune), now at NBRC, Gurgaon
3 "Branch" of Neurosurgery: but has an important supporting role to play
Computational and Theoretical neuroscience: What it is NOT Continued
1 Artificial Neural Networks. Some examples: McCulloch-Pitts neuron, Rosenblatt's Perceptron, Hopfield networks, Boltzmann machines, backpropagation
2 Artificial Intelligence. Goals somewhat similar: how to perform computational tasks such as object recognition following a "rules-based" approach
3 But in CTNS the emphasis is on figuring out how the brain does it
Potential issue with the goals of computational neuroscience
1 Maybe trying to figure out how the brain does a task is like trying to figure out how birds fly w/o knowing the principles of aerodynamics?
2 Solution: C and T neuroscientists often maintain close connections with these fields
Computational and Theoretical Neuroscience: What it is
The brain as an information processor, taking into account biological imperatives and history: purpose, development, evolutionary history
physical constraints: energy consumption, noise in sensors and network elements, processing speed requirements
Interesting possibility: Evolutionary pressures may have caused neural systems to have achieved some kinds of optimizations. (William Bialek). (At least local, if not global, optima in the landscape of evolutionary possibilities - my take).
As calculus is integral to Physics, and algorithms to Computer Science, so the mathematical techniques of computational and theoretical neuroscience are integral to the study of the brain
Scientists contributing to Computational and Theoretical Neuroscience include
1 Physicists: Both theorists and experimentalists
2 Engineers:
Electrical Engineers, especially signal processing, communication, control, computing, and VLSI, fields of EE that deal with information processing
Note that EE types in other areas, and Chemical Engineers and Mechanical Engineers, can also play a role, with some retraining
3 Computer Scientists:
4 Mathematicians: Mathematicians interested in studying the brain are almost by definition applied mathematicians
5 A small but growing number of scientists from traditionally non-mathematical scientific disciplines such as biology
How CS types can contribute to computational and theoretical neuroscience
Computer and Machine Vision meets Biological Vision: (Artificial Intelligence meets (Visual) Neuroscience)
Ex: Eric Schwartz’s group
What are the algorithms the brain uses to do visual processing?
Machine Learning: Analysis of large data sets. Curse of dimensionality
Theoretical Computer Science: Computational Learning Theory, Computational Complexity, Vapnik-Chervonenkis (VC) Dimension
Attention: CS435, CS475, CS675, CS663, CS709
Brief note about the term “Computational Neuroscience”
Eric Schwartz introduced the term "Computational Neuroscience"
Eric Schwartz describes his areas of interest as Computational Neuroscience and Computer Vision
Brain as a “Computer”: Two themes
Representation of information and Transformation of Information aka Computation
1 Theme I: Representation of Information:
1 Map level: Mathematical techniques used: complex analysis (more specifically, numerical conformal mapping) and computational geometry
2 Single neuron level: Mathematical techniques used: information theory, machine learning, and possibly theory of computation
2 Theme II: Transformation of Information: Computation
1 Single Neuron Level: theory of computation, statistical/computational learning theory
2 Network Level: same as above, also graph theory and the theory of …
(Computer/Machine) Vision
Background for Part II
The study of computer/machine vision has traditionally been divided into:
"Low-level" Vision: Estimating the scene from the image
Image: Raw pixel values
Scene: Interpreting the image to obtain some (comparatively basic) information. Example: edge detection
"High-level" Vision:
Object recognition and classification
Visual cognition and volition: extraction of shape properties and spatial relations while performing tasks such as manipulating objects and planning movements
Lecture Theme: How the Brain does Vision
Visual Cortex and Visual Field Representation
The thalamus (specifically its visual relay, the LGN, or lateral geniculate nucleus) plays the gatekeeper's role
Visual Field Representation
Each visual hemifield projects onto the nasal ('nose') hemiretina of the eye on the same side and onto the temporal ('temples') hemiretina of the eye on the opposite side
Left visual hemifield projects onto:
the nasal hemiretina of the left eye
the temporal hemiretina of the right eye
Take home message: Each visual hemifield is represented by neurons in the visual cortex of the opposite hemisphere
Cortical Hemisphere Input
Each cortical hemisphere gets input from:
one (the opposite) hemifield
both eyes
This is important for the functional (computational) role of ocular dominance columns (see below)
Visual Field Representation
We will come back to the important issue of representation of (visual) information at a later stage
Prerequisites:
Receptive Fields
Retinotopy
Cortical magnification
But before we study these matters, we try to obtain a broad overview of vision as performed by the brain
Vision and Visual Cortical Processing
"Low-level" vision - "early" stages of visual cortex
"High-level" vision - "later" stages of visual cortex
Distinction not sharp (large number of "top-down" and "bottom-up" connections in the brain)
"Top-down" refers to connections from "higher" to "lower" cortical areas, or from cortex to subcortical structures. Ex: V2 to V1, cortex to thalamus
"Bottom-up" refers to connections from "lower" to "higher" areas. Ex: thalamus to V1, V1 to V2
Lecture Theme: How the Brain does Vision
Visual Cortex and Visual Information Processing
Low-level Vision: Primary Visual Cortex
Edge detection: What is an edge in an image?
Contour detection
Illusory contours
Motion detection/estimation
High-level Vision:
Object recognition and classification: Inferotemporal (IT) Cortex ("what" aka "perception" stream of vision)
Spatial perception, navigation and attention: (Posterior) Parietal Cortex ("where" or "how" aka "action" stream of vision)
"What" and "Where"/"How" Streams
aka “Perception” and “Action” streams
Purple: “What” stream, processed in inferotemporal (IT) cortex.
Object recognition
Green: "Where" or "How" stream, processed in posterior parietal cortex
Computer Vision and (Visual) Computational Neuroscience Phenomena
Primary Visual Cortex (V1) and Low-Level Vision
Task | Image | Scene
edge detection | pixel values | edge
contour detection | pixel values | contour
illusory contours | pixel values | illusory contour
shape from shading | pixel values (luminance information) | shape
figure-ground segmentation | pixel values | assignment of a contour to one of two abutting regions
stereopsis | one image frame from each of two retinae | est. rel. dist. to objects based on lateral displacements b/w superimposed images
motion estimation | two succ. image frames | extracted motion info
Computer Vision and (Visual) Computational Neuroscience Phenomena
Task | Visual Cortex Area | Cell Type
edge detection | V1 | Simple
contour detection | V1 | Complex
illusory contours | V2 |
shape from shading | |
figure-ground segmentation | |
stereopsis | V1 | Monocular and Binocular / Hypercolumn
motion estimation | V1 | Complex
Computer Vision and (Visual) Computational Neuroscience Phenomena
Kanizsa Triangle
Illusory Contours
Believed to take place due to neurons in V2
Computer Vision and (Visual) Computational Neuroscience Phenomena
Task | Visual Cortex Area | Structures
stereopsis | V1 | Monocular + Binocular Cells / Hypercolumn
image representation | V1 | complex log map
Evidence from psychophysics, imaging studies and neurophysiology
Theoretical and modeling studies indicate that these processes take place in the primary visual cortex (aka V1) and V2
Organization of Part II
Architecture of V1
Neuronal Tuning and Receptive Fields
Orientation Columns
Ocular Dominance Columns
Hypercolumn
Ice-Cube Model
"Pinwheels"
Topographic map (Complex log map) from retina to V1
Organization of Part II Contd
Visual information representation:
Retinotopy and Topographic Organization
Complex Log Mapping between retina and visual cortex
Application to Machine Vision: Space-Variant Vision
Psychophysics: How does the brain perceive depth?
Stereopsis
Panum's Area
Hypercolumn's role in Stereopsis
Psychophysics: How does the brain perceive edges?
What is an edge in an image?
Different kinds of edges in an image
Edge-detection algorithms
Neuronal Tuning I
Neuronal tuning curve: graph of the average firing rate of the neuron as a function of the stimulus parameters relevant to that class of neurons
A typical tuning curve looks like a half-wave rectified cosine function
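The half-wave rectified cosine shape is easy to sketch; the peak rate r_max = 50 spikes/s and the preferred orientation below are illustrative values, not measured parameters:

```python
import numpy as np

def tuning_curve(theta, theta_pref=0.0, r_max=50.0):
    """Firing rate (spikes/s) vs stimulus orientation (radians):
    a half-wave rectified cosine centred on the preferred orientation."""
    return r_max * np.maximum(0.0, np.cos(theta - theta_pref))

# Peak rate at the preferred orientation; zero over the anti-preferred half.
rates = tuning_curve(np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False))
```

The rate peaks at r_max for the preferred stimulus and is clipped to zero for stimuli more than 90 degrees away.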
Neuronal Tuning II
Interpretation: Stimuli that cause the neuron to fire at the highest possible rate are considered to be most important
If neurons encode information in their average firing rate, then this is likely to be true
However, this is open to question (as we will see in the next lecture)
Receptive Fields (RFs) in Visual Processing
Visual receptive fields refer to the region of (visual) space in which an appropriate stimulus will cause the neuron to respond
What counts as "appropriate" varies b/w regions: from retina to V1 to V2 to V4 and IT
We consider RFs in retina and V1
In "higher" visual cortical regions, RF sizes increase and the appropriate stimuli become more complex
In IT cortex, ∃ cells that respond when a person recognizes a face - called face-recognition cells
Receptive Fields in Retina
RFs in retina come in two types:
On-Center / Off-Surround
Off-Center / On-Surround
[Figure: the two center-surround receptive field types of retinal cells]
The spatial structure of retinal (ganglion) cell RFs is well captured by a difference of Gaussians (a narrow center Gaussian minus a broader surround Gaussian)
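A sketch of such a difference-of-Gaussians (DoG) receptive field (the sigma values are illustrative), showing the center-surround antagonism: a strong response to a small central spot, and almost none to diffuse illumination:

```python
import numpy as np

def dog_kernel(size=21, sigma_c=1.0, sigma_s=3.0):
    """On-center/off-surround RF as a difference of two normalised Gaussians."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    center = np.exp(-r2 / (2 * sigma_c ** 2)) / (2 * np.pi * sigma_c ** 2)
    surround = np.exp(-r2 / (2 * sigma_s ** 2)) / (2 * np.pi * sigma_s ** 2)
    return center - surround

rf = dog_kernel()
spot = np.zeros_like(rf)
spot[9:12, 9:12] = 1.0                      # small bright spot on the RF center
diffuse = np.ones_like(rf)                  # uniform (diffuse) illumination
spot_resp = float(np.sum(rf * spot))        # strong positive response
diffuse_resp = float(np.sum(rf * diffuse))  # center and surround nearly cancel
```

Because the center and surround Gaussians carry (almost) equal weight, diffuse light drives the cell only weakly, which connects to the energy-efficiency argument on the next slide.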
Receptive Fields in Retina Contd
Why two different kinds? Won't a single kind suffice?
The answer lies in energy efficiency
The brain is an extremely energy-hungry organ:
brains typically weigh about 2% of body weight
consume about 20% of total body oxygen
utilise about 25% of body glucose
With a single cell type, signalling contrast swings in both directions about zero would require a high spontaneous firing rate
This would cause energy consumption to rise
Receptive Fields in V1
Orientation Selectivity
Hubel and Wiesel discovered the nature of RFs in V1 (1981 Nobel Prize in Physiology or Medicine)
Retinal cells respond to light or dark "spots"
Individual V1 Cells are tuned to “edges” of a particular orientation.
Simple Cells RFs
[Figure: simple cell RFs at orientation preferences of 0, 90, 180 and 360 deg; cells with other orientation preferences not shown]
Simple Cells Contd
Every individual cell has an orientation preference (neuronal tuning)
The population as a whole has cells whose preferences range from 0 to 360 degrees
Distinct excitatory and inhibitory regions
Linear summation (superposition) property
The regions are antagonistic - if the light is diffuse, the cell does not respond
Different simple cells respond to different orientations
Receptive field sizes are much smaller than those of complex cells (see below)
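A standard idealization of such an RF (a common model, not one the slides commit to) is a Gabor patch: an oriented sinusoid under a Gaussian envelope, whose alternating lobes play the role of the excitatory and inhibitory subregions. A minimal sketch with illustrative parameters:

```python
import numpy as np

def gabor(size=21, theta=0.0, freq=0.25, sigma=3.0):
    """Oriented sinusoid under a Gaussian envelope: alternating
    excitatory/inhibitory subregions, like a simple cell RF."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    xr = xx * np.cos(theta) + yy * np.sin(theta)   # coordinate along the preferred axis
    env = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return env * np.cos(2 * np.pi * freq * xr)

ax = np.arange(21) - 10
xx, yy = np.meshgrid(ax, ax)
rf = gabor(theta=0.0)
resp_pref = float(np.sum(rf * np.cos(2 * np.pi * 0.25 * xx)))  # grating at preferred orientation
resp_orth = float(np.sum(rf * np.cos(2 * np.pi * 0.25 * yy)))  # same grating rotated 90 deg
```

The linear (superposition) response to a grating at the preferred orientation is large, while the response to the orthogonal grating is essentially zero - orientation selectivity from linear summation over antagonistic subregions.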
Complex Cells
Similar to simple cells: complex cells also have orientation preferences
Complex cells have larger receptive fields, hence some spatial invariance
May act as movement detectors
Also, complex cells receive inputs from many simple cells and may act to detect contours in an image, enabling figure-ground segmentation at early stages of visual processing
I encourage you to explore the work of:
Stephen Grossberg and collaborators
Zhaoping Li
Hypercomplex Cells
Take Home Message: (Visual) Neuroscience inspiring Computer Vision
∃ cells in V1 that are tuned to distinct orientations
The population as a whole has cells responsive to every orientation from 0 to 360 degrees
For each orientation, ∃ cells with different RF sizes
These may detect edges of the same orientation but at different scales
This inspired the Jones and Malik multi-orientation multi-scale (MOMS) filters used to do stereo (more on this later)
Other computational theories that have been proposed for the role of simple and complex cells
(Visual) Texture analysis: Process by which visual system defines regions that differ in the statistical properties of spatial structure
Matte or gloss finish
Wavy or straight or curly hair
Structure from Shading: how information about the curvature of surfaces can be extracted from changes in luminance due to depth structure in the image
Orientation Columns
Cells tuned to a particular orientation are arranged to form of columns, called orientation columns
Cells tuned to nearby orientations are arranged next to each other
∃ cells responsive to all orientations
Ocular Dominance Columns
Role in Stereopsis
Some cells in V1 respond mainly to i/p from the left eye, others to i/p from the right eye
A few are driven by both: the so-called binocular cells
These cells did not seem to be tuned for binocular disparity
Many of these studies use somewhat ad hoc methods of classification, so terms such as monocular and binocular must be treated with caution
Monocular cells are arranged to form columns, called ocular dominance columns
Known to play a role in binocular vision and stereopsis.
Relationship b/w Ocular Dominance and Orientation Columns
Hypercolumn
Ocular dominance and orientation columns run almost orthogonal to each other
Two adjacent ocular dominance columns (each containing cells responsive to all orientations) form a hypercolumn
A hypercolumn is defined as "a unit containing a full set of values for any given set of RF parameters"
Hypercolumn
Theme: Relationship b/w Anatomical and Functional Modules
In human V1, a hypercolumn is about 2mm, forms a basic anatomical module
Yeshurun and Schwartz show that it corresponds to a functional, i.e. psychophysically measurable, module:
a biologically plausible algorithm instantiated on a hypercolumn that solves stereopsis
consistent with psychophysical data
Ice-Cube Model of V1
[Figure: Ice-cube model of V1 - one hypercolumn repeated many times; L and R are the ocular dominance columns due to the left and right eyes]
Short digression: Pinwheels in V1
Applying topological reasoning, Schwartz and co-workers showed that the ice-cube model must be modified
∃ singularities in the map of orientations: orientation vortices aka
“pinwheels”
Introduce a figure showing pinwheels of orientation selectivity
Lecture 2 begins here
Story so far ...
Ice-Cube model
Qualitatively, each cortical hemisphere receives input from the opposite hemifield only, but from both eyes
Next Step: Quantifying and mathematizing this description of visual field representation in the visual cortex
Visual field representation in V1
Retinotopy and topographic organization
Retinotopy: Spatial organization of neural responses to visual stimuli
Retinotopic maps: Special case of topographic organization
Topographic organization: Projection of a sensory surface onto structures in the central nervous system (CNS).
Examples:
Somatosensory Map: Body surface (skin) to the somatosensory cortex
Retinotopic Map: Retina to V1
Neighboring points in the sensor (skin or eye) activate (usually) neighboring regions in the (somatosensory or visual) cortex
Digression: Topographic organization of touch:
Somatosensory map
Cortical hemisphere surface
Body regional surfaces mapped as above
Note the large amount of space devoted to hands, face, lips, and tongue
non-Uniform sampling and non-uniform representation
Intuitively obvious: Lips (as all of us are presumably aware) and finger tips (as Braille readers know) are more sensitive surfaces than (say) the back.
Demonstrated by experimental studies.
But why?
Reason: More sensors per unit area in the fingers than in the back
Consequently, non-uniform:
Sampling at the sensor surface
Representation in the cortex
Eric Schwartz
Eric Schwartz introduced the term ”Computational Neuroscience“
Eric Schwartz and co-workers' mathematical models of cortical neuroanatomy are among the most successful quantitative models in Neuroscience
His oeuvre demonstrates an intense preoccupation with structure-function (computation) relationships in Neuroscience - in other words, "how the brain does a task"
Topographic organization in vision
Complex log map from retina to V1
Retinotopic map: Mathematical transformation is a complex logarithmic map
Mathematical description of retinotopic map was discovered byEric L. Schwartz
Complex log map from retina to V1 Part II
w = log(z + a), where
w represents position on the cortical surface
z is a complex variable representing azimuthal angle θ and eccentricity ε on the retina:
z = ε e^(iθ)
[Figure: retinal coordinates - F is the fixation point, ε the eccentricity, θ the azimuthal angle]
a is an experimentally obtained parameter, a measure of foveal size
Cortical magnification factor
(Magnitude of the) derivative of w w.r.t. z is called the (cortical) magnification factor
Mathematically: |dw/dz| = 1/|z + a|
Centre (fovea): |z| << a, so |dw/dz| ≈ 1/a, constant (map approx. linear)
Periphery: |z| >> a, so |dw/dz| ≈ 1/ε (map approx. logarithmic)
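The map and its magnification factor are straightforward to compute numerically; a = 0.7 (degrees) below is an illustrative choice for the foveal-size parameter, not a fitted value:

```python
import numpy as np

def retina_to_cortex(ecc, theta, a=0.7):
    """Complex log map w = log(z + a), with z = ecc * exp(i * theta).
    a (degrees) is an illustrative foveal-size parameter."""
    return np.log(ecc * np.exp(1j * theta) + a)

def magnification(ecc, a=0.7):
    """Cortical magnification factor |dw/dz| = 1/|z + a|, evaluated on
    the horizontal meridian (theta = 0)."""
    return 1.0 / abs(ecc + a)

# Fovea (ecc << a): magnification ~ 1/a, roughly constant (linear map).
# Periphery (ecc >> a): magnification ~ 1/ecc (logarithmic map).
```

Evaluating magnification at a grid of eccentricities reproduces the two regimes on the slide: constant near the fovea, falling off as 1/ε in the periphery.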
Physical meaning and biological significance of the cortical magnification factor
Physical meaning:
As eccentricity ε increases, magnification decreases
Foveal magnification (and thus sampling) is greater than in the periphery
Consequently, foveal representation is (much) greater than peripheral representation in cortex
Biological significance:
The number of neurons 'responsible' for processing a stimulus of a given size, as a function of location in the visual field, decreases in the peripheral regions
Complex log map
Limitations and Extensions
Excellent fit for the foveal region across many species
Not a good fit for the peripheral retina
More sophisticated models (numerical conformal mapping, dipole map, wedge-dipole map) have been developed by Schwartz and co-workers for:
peripheral retinal regions
other visual cortical regions (V2, V3)
Visual field representation in cortex: Practical implications
Theme: Neuroscience's impact on Computer Vision and Robotics I
Non-uniform sampling (the somatosensory map also shows non-uniform sampling and representation):
Small portion of visual field sampled at high resolution
Coarser sampling at periphery
Non-uniform representation
Consequence: Far more processing resources, i.e. neurons in cortex, are devoted to objects at the center of gaze
Theme: Neuroscience’s impact on Computer Vision and Robotics II
Eric Schwartz pointed out that a correspondence could be made b/w:
Cortical RF size variation | Multiple resolution
Surface of cortex | "Horizontal"
Single point in visual field | "Vertical"
∃ cells in hypercolumns that respond to the same RF location, but have different RF sizes
These have inspired "multiscale" or "pyramid" approaches to computer vision
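A minimal sketch of such a pyramid, assuming a simple 1-2-1 binomial blur (any low-pass filter would do) and built with NumPy:

```python
import numpy as np

def blur_121(img):
    """Separable 1-2-1 binomial blur with edge replication."""
    p = np.pad(img, 1, mode="edge")
    h = (p[:, :-2] + 2 * p[:, 1:-1] + p[:, 2:]) / 4.0
    return (h[:-2, :] + 2 * h[1:-1, :] + h[2:, :]) / 4.0

def gaussian_pyramid(img, levels=3):
    """Repeatedly blur and downsample by 2: each level analyses the
    same field of view at a coarser scale, like larger-RF cells."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(blur_121(pyr[-1])[::2, ::2])
    return pyr

img = np.arange(256.0).reshape(16, 16)
pyr = gaussian_pyramid(img, levels=3)   # shapes: 16x16, 8x8, 4x4
```

Each pyramid level is the analogue of a population of cells with a common RF location but progressively larger RF sizes.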
Theme: Vision Neuroscience’s impact on Computer Vision and Robotics
Summary:
Non-uniform sampling and representation, and multi-scale resolution aka "pyramids", have inspired space-variant computer vision architectures, with savings in computational requirements, and thence economy
Information representation and Information Transformation aka Computation
Complex log map: Information representation (at the level of maps)
Now: Information transformation aka computation in the visual cortex
Short digression: Nature versus Nurture debate
Nature versus nurture debate:
Respective roles played by the environment versus innate development rules (genetics and the womb/egg)
Many animals, especially reptiles, are born with astonishing survival skills
Newborn mammals are largely helpless and usually spend their childhood learning
Humans have a very long period of childhood even compared with other mammals
Learning period typically lasts until mid-adolescence at least
Nature versus Nurture II
Representational Learning
Retinotopic mapping: Information representation in V1
Question: Are these representations learned (due to nurture, i.e. environmentally determined) or innate (due to nature, i.e. developmentally determined)?
Topographic mappings are "largely" "innate"
Learning does play a role here, ∵ if one eye is damaged around birth, the ocular dominance columns due to that eye will not form
For learning about (non-innate) neural representations, a good starting point is the classic textbook "Theoretical Neuroscience" (Dayan and Abbott)
CS213 in your Brain!!!
Attention: CS213 Data Structures and Algorithms
CS students: Symbiotic relationship b/w data structures and algorithms
Good DSs enable elegant algorithmic solutions to a desired computation
Historically significant example follows
A historically significant Data Structure
Arithmetic w/ Hindu vs. Roman numerals
The Data Structure used by the Hindus involved:
place-value notation
base-10 representation
the zero as placeholder
This DS enabled simplified algorithms for arithmetic operations such as addition (+), subtraction (-), multiplication (*) and division (/)
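As a toy illustration (mine, not from the talk) of how the place-value data structure simplifies the algorithm: addition becomes a single right-to-left pass with a carry, regardless of the size of the numbers:

```python
def add_place_value(a, b, base=10):
    """Schoolbook addition on digit lists, least-significant digit first.
    The place-value DS reduces addition to one pass with a carry -
    contrast with the ad hoc rules needed for Roman numerals."""
    out, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = carry + (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
        out.append(s % base)
        carry = s // base
    if carry:
        out.append(carry)
    return out

digits = add_place_value([6, 7, 4], [8, 5])   # 476 + 58 = 534 -> [4, 3, 5]
```

The same loop works in any base; only the `base` parameter changes, which is the sense in which the data structure carries the algorithmic power.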
What function/computation could the ocular dominance columns serve?
The slide title is an example of a broader question regarding the visual cortex as a whole:

 | Comp. Sc | Brain
info. representation | DS | OcuDom columns in V1
info. transformation (computation) | algorithm | windowed cepstral filter (suggested)
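A 1-D toy sketch of the cepstral idea (the actual Yeshurun-Schwartz proposal operates on 2-D windows across a hypercolumn; the signals and sizes below are illustrative): concatenating the left- and right-eye windows, as a hypercolumn juxtaposes its two ocular dominance columns, turns the disparity into an "echo", which shows up as a peak in the cepstrum near lag n + d:

```python
import numpy as np

def cepstral_disparity(left, right, max_d=8):
    """Concatenate the two eyes' windows and locate the cepstral peak.
    The shifted copy acts as an 'echo' at lag n + d (mirrored at n - d)."""
    n = len(left)
    s = np.concatenate([left, right])
    power = np.abs(np.fft.fft(s)) ** 2
    ceps = np.abs(np.fft.ifft(np.log(power + 1e-9)))
    window = ceps[n - max_d : n + max_d + 1].copy()
    window[max_d] = 0.0        # ignore the lag-n component from the join itself
    return abs(int(np.argmax(window)) - max_d)

# Toy stereo pair: the right-eye window is the left one shifted by 4 samples.
rng = np.random.default_rng(1)
scene = rng.standard_normal(80)
left, right = scene[8:72], scene[4:68]
d_hat = cepstral_disparity(left, right)
```

The appeal of the scheme is that the anatomical layout (the DS) does half the work: the concatenation the algorithm needs is exactly what adjacent ocular dominance columns provide.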
Why is the world perceived as 3-D?
Why do we not perceive the world as 2 2-D images?
Images projected onto the retinas are 2-D (i.e. flat, lacking "thickness")
But we do not (visually) perceive the world as two nearly identical (but slightly laterally displaced) flat images
So where does the perception of "depth" come from?
3-D movies and 3-D glasses
3-D movies: Chota Chetan, Avatar
3-D glasses fuse the two images into one to give depth perception
In day-to-day life, the brain automatically fuses the 2 2-D images on the retinas, so we perceive the world as one 3-D scene
How does the brain do it?
Depth Perception
1 Depth perception refers to the perception of "solidity"
2 Brain uses many different cues to perceive depth
3 Cues may be monocular (single eye) or binocular (both eyes)
4 Monocular cue: geometric information such as occlusion used to compute depth
Occlusion refers to the hiding of one surface by another due to the opacity of the intervening surface.
Provides information about relative depth
5 Here we consider only one depth cue, a binocular cue called stereopsis
Stereopsis
1 "Stereo": "solid" or "three-dimensional"
2 "Opsis": view or sight
3 Stereopsis means "three-dimensional vision"
4 Slightly misnamed phenomenon
5 ∵ Stereopsis is only one of many ways humans perceive depth
6 Stereopsis: Perception of depth arising from the slightly different projections of the world to form two slightly different retinal images
7 Only images themselves are used to perceive depth (see Random dot stereograms)
Binocular Disparity
Difference in image location of an object as seen by left and right eyes Arises due to the separation b/w the two eyes
[Figure: fixation geometry - the two eyes viewing fixation point P, with a closer point C and a further point F]
C closer points, crossed disparity
P point of fixation, no disparity
F further points, uncrossed disparity
Computing disparity
Retinal pixels do not come with "labels" indicating corresponding pixels in the images
∴ binocular disparity b/w images must be computed from the images themselves - the first step in any stereo algorithm
Algorithm will have to figure out corresponding points b/w the images.
Then disparity can be easily computed
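The two steps above - find corresponding points, then read off their displacement - can be sketched with the simplest possible scheme, 1-D block matching by sum of squared differences. This is purely illustrative (a toy 1-D "image" and hypothetical helper), not a model of what the brain does.

```python
import numpy as np

# Toy correspondence problem: for a window in the left image, find the
# horizontal shift that best matches the right image (minimum SSD).

rng = np.random.default_rng(0)
left = rng.standard_normal(200)          # 1-D stand-in for an image row
true_disparity = 7
right = np.roll(left, true_disparity)    # right image = shifted left image

def best_disparity(left, right, center, half_window=10, max_shift=20):
    win = left[center - half_window:center + half_window]
    ssd = [np.sum((right[center - half_window + d:center + half_window + d] - win) ** 2)
           for d in range(max_shift)]
    return int(np.argmin(ssd))           # shift with the best match

print(best_disparity(left, right, center=100))  # → 7
```

Once the best-matching shift is found, the disparity at that point is just that shift, as the slide says.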
Binocular disparity, ”filling-in“, and stereopsis
Input Data (aka "Image"): Pair of "almost identical" images, one laterally displaced from the other.
Intermediate step: Compute disparity i.e. lateral displacement b/w the images
Output: Stereopsis or a perception of depth
Final Step: following computation of disparity, "fill in" smooth surfaces.
We will consider an algorithm for signaling disparity that the brain may perform.
Why do we not perceive the world as 2 2-D images?
First (incomplete) solution
Because we focus our eyes on a point.
Shortcoming of this answer: circle (sphere) of fixation (horopter)
Horopter
The two eyes and the point of fixation determine a circle called the horopter
Horopter: Significance
Points on the horopter cast images at exactly corresponding pts in the two retinae
These can be fused into one image w/o difficulty once the optical paths from the two eyes converge in V1
Points to the front and back cause images on the retinae that are laterally displaced
Consequence: Except for a sphere of infinitesimal thickness, (almost) all points in visual field should give rise to double images.
Yet we do not perceive the world that way! Why?
Answer: Cortex performs a computation that fuses the two images into one percept
Where does fusion take place?
Fusion can occur only if the optical info from the two eyes' images converges.
In the retina and (visual) thalamus, the paths are separate
Convergence happens in area V1
What comes next ...
Human Stereo: Perceptual Aspects
Stereopsis: Formal statement of the computational problem (w/o ref to how the brain does stereo vision)
Random dot stereograms and their significance
Overview of suggested algorithms(w/o ref to how the brain does stereo vision)
How the brain (V1) does stereopsis
Perceptual Aspects of Human Stereo Vision I
Panum’s Fusion Area
Perceptual Aspects of Human Stereo Vision II
Robustness of Stereo Perception
Human stereo vision is robust to many kinds of image manipulations:
image degradations (including Gaussian blur, random noise, and changes in image intensity),
rotations (up to 6 degrees),
and differential expansions (up to 15%)
A complete psychophysical theory of stereoscopic depth perception should explain these perceptual data re Panum’s area and robustness
Stereopsis: The Computational Problem
Stereopsis includes
Matching corresponding image points in the two eyes (so-called Correspondence Problem)
measuring their disparity,
from this info. recovering the 3-D structure of the objects seen by the viewer (stage of “filling-in”)
Human stereopsis is also robust to minor image manipulations of one or the other image, as seen above
Stereopsis
The Correspondence Problem
Matching corresponding points could be done either:
after recognizing objects - easy and unambiguous computation
before recognizing objects - using only disparity information
Physiological evidence indicates that binocular cells lie in V1 (much before object recognition takes place in IT cortex)
Random dot stereograms aka Magic-Eye stereograms
Magic Eye stereogram: p. 215 of Vision: From Photons to Perception
Random dot stereograms
Significance
Showed brain can compute stereoscopic depth w/o:
Objects
Perspective
Cues available to either eye alone i.e. monocular cues such as occlusion
Radha Krishna image: Not a pure random dot stereogram
Same Principle
We can only form the percept after fusing the two images
Stereopsis: The Computational Problem
Suggested Algorithms
Before studying how the brain does stereopsis (we don’t really know) ...
We overview various proposed computational/algorithmic solutions and their limitations:
from a computational perspective
in terms of biological implementability i.e. whether the brain could actually use the proposed solution
Stereo algorithms
Categories
1 Pixel-based schemes
2 Edge-based schemes
3 Area-based schemes
4 Filtering algorithms
Stereo algorithms
Categories II
Szeliski and Zabih have proposed another classification:
1 Correlation-style stereo algorithms:
Example of an (unconventional) correlation-based algorithm that is biologically plausible
2 Global methods
Use a well-chosen energy function and obtain the depth map that minimizes it
Minimization can be done by means of simulated annealing, graph cuts, or mean field methods
One global method that does not use energy minimization is due to Zitnick and Kanade
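The "global method" idea can be sketched on a toy 1-D scanline: pick one disparity label per pixel minimizing a data (matching) cost plus a smoothness cost. This is not any published algorithm; the costs are random stand-ins, and the minimizer is exhaustive dynamic programming on a chain rather than simulated annealing or graph cuts.

```python
import numpy as np

# Toy energy minimization for a scanline depth map:
# E(labels) = sum_i data_cost[i][l_i] + lam * sum_i |l_i - l_{i+1}|

n_pixels, labels, lam = 6, [0, 1, 2], 1.0
rng = np.random.default_rng(1)
data_cost = rng.random((n_pixels, len(labels)))   # stand-in matching costs

# Viterbi-style DP over the chain:
# best[l] = min energy of a labeling of pixels 0..i ending in label l
best = data_cost[0].copy()
back = []
for i in range(1, n_pixels):
    # trans[l, m] = best[m] + smoothness cost of going from m to l
    trans = best[None, :] + lam * np.abs(np.subtract.outer(labels, labels))
    back.append(np.argmin(trans, axis=1))
    best = data_cost[i] + np.min(trans, axis=1)

# Backtrack the globally minimizing labeling
labeling = [int(np.argmin(best))]
for bp in reversed(back):
    labeling.append(int(bp[labeling[-1]]))
labeling.reverse()
print(labeling, float(np.min(best)))
```

On a chain the exact minimum is cheap to find; the iterative methods named above are needed because real depth maps are 2-D grids, where exact minimization is hard.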
Stereo Algorithms
Categories III
Stereo algorithms can also be classified based on whether algorithm used is:
1 sequential (uses iterations or relaxation, such as simulated annealing):
Difficulty: the speed w/ which humans do stereo is too fast for iterative schemes
2 parallel
We will consider a windowed (noncanonical) correlation-based one-shot parallel algorithm called the windowed cepstrum, due to Yeshurun and Schwartz
Algorithms for Stereo Vision I
Pixel-Based Schemes
Algorithms for computing stereo can be categorised into:
Pixel-based algorithms:
Example: 1st Marr-Poggio algorithm (1979) Match individual pixels in left and right images.
Computational difficulty:
Correspondence problem: which features in one (retinal) image correspond to which features in the other
NOT robust to minor changes other than simple shifting (unlike human stereo vision)
Biologically implausible:
Only in the retina are RFs "pixel-like".
But stereopsis requires convergence of info. from both eyes; this cannot take place in the retina, ∵ the two eyes' images are still separate there
Algorithms for Stereo
Edge-Based Schemes
Edge-Based Algorithms: Marr and Poggio also presented an edge-based algorithm for stereo.
The algo attempts to match edges in the two images rather than pixels
Computational Issues:
Correspondence problem not as bad, ∵ (usually) far fewer edges than pixels in an image.
OTOH, the algo requires pre-processing to detect edges (itself a non-trivial problem)
Limitation: Resulting depth information is sparse, ∴ available at only a few locations.
A further step is needed to interpolate depth across surfaces in a scene
Biological Plausibility:
More biologically plausible, ∵ it's known that binocular processing begins early, in V1
Algorithms for Stereo Vision
Area and Filtering-Based Schemes
Area-based algorithms:
Correlation b/w brightness patterns in the local neighborhood of a pixel w/ brightness patterns in the local nhd. of the other image.
Simplest is correlation
Filtering Algorithms: Jones and Malik (1992)
The algo matches local regions around the point.
Uses biologically inspired linear spatial filters that differ in their orientation and size tuning, called multi-oriented multi-scale (MOMS) filters.
Inspiration from biology: ∃ cells in a hypercolumn whose RFs are centered on the same retinal position, each cell tuned to a different orientation and scale
Jones and Malik’s motivation:
Some good algorithm for computing stereo matchings
Biological inspiration was the starting point, not the criterion for judging soundness
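The "same position, many orientations and scales" idea can be sketched with a bank of Gabor-like kernels. The actual MOMS filters of Jones and Malik differ in detail; `gabor_kernel` and all parameter values here are illustrative assumptions, not their code.

```python
import numpy as np

# Sketch of a multi-oriented multi-scale filter bank: Gabor-like kernels
# at one retinal position, varying in orientation (theta) and scale (sigma).

def gabor_kernel(size, sigma, wavelength, theta):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))  # Gaussian window
    return envelope * np.cos(2 * np.pi * xr / wavelength) # oriented carrier

bank = [gabor_kernel(15, sigma=s, wavelength=2 * s, theta=t)
        for s in (2.0, 4.0)                        # two scales
        for t in (0.0, np.pi / 4, np.pi / 2)]      # three orientations
print(len(bank), bank[0].shape)  # → 6 (15, 15)
```

Convolving both images with such a bank and comparing the response vectors at each position is the flavor of matching the slide describes, using filter responses rather than raw pixels.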
Lecture theme: Algorithmic strategy that takes biological constraints into account - "how does the brain do it"
A windowed (unconventional) correlation-based scheme developed by Yeshurun and Schwartz
The cross-correlation (not the usual one) used is called the cepstrum
How V1 does stereopsis
Yeshurun and Schwartz, in a series of papers, indicated:
Possible functional (computational) role for the ocular dominance columns by showing how:
OcuDom columns enable a data structure allowing a
One-shot (i.e. no iterations required) algorithm called the cepstral filter to
Provide a robust solution to stereo matching that is
Consistent with psychophysical data on stereopsis
Data Structure
Ocular Dominance Columns
Visual (hemi)-field representation:
Right cortical hemisphere gets input from left visual hemifield from both eyes (and vice-versa)
Interlaced image:
Cut the two images into vertical strips and interleave them: left-image strips (A, B) alternate with right-image strips (C, D), giving the interlaced image A C B D
Power Spectrum or Power Spectral Density (PSD)
Power spectrum or Power Spectral Density (PSD) of a signal: the signal power as a function of frequency
Measures the power content of the component frequencies of the signal
Equals the Fourier transform of the autocorrelation function
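The definition above - PSD as the Fourier transform of the autocorrelation (the Wiener-Khinchin theorem) - can be checked numerically. This sketch uses circular autocorrelation, for which the identity is exact on a finite discrete signal.

```python
import numpy as np

# Check: FT of the (circular) autocorrelation equals |FFT(x)|^2.

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
N = len(x)

# Circular autocorrelation r[k] = sum_n x[n] * x[(n+k) mod N]
r = np.array([np.sum(x * np.roll(x, -k)) for k in range(N)])

psd_from_autocorr = np.fft.fft(r).real   # FT of the autocorrelation
psd_direct = np.abs(np.fft.fft(x)) ** 2  # power of each component frequency

print(np.allclose(psd_from_autocorr, psd_direct))  # → True
```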
Algorithm supported by the data structure
Cepstrum
Cepstrum: the power spectrum of the log of the power spectrum
The windowed cepstrum can be applied to the binocular interlaced image
Computational Issues:
Easy to compute the windowed cepstrum given the above DS
Need to estimate the power spectral density and apply a log
Biological plausibility: Cortical neurons can:
Act as medium-bandwidth power spectral filters
Perform logarithmic computation and multiplication
Cepstrum calculation I
Suppose the left and right eye images are identical i.e. no disparity
Suppose the width of a column is D
Image s(x,y) and the identical image repeated and abutted, viz., s(x−D,y)
The interlaced image, composed of a single column pair, is given by
f(x,y) = s(x,y) ⊗ {δ(x,y) + δ(x−D,y)}
where ⊗ represents convolution
Cepstrum calculation II
Fourier transform of the interlaced image is:
F(u,v) = S(u,v) · {1 + e^(−2πiDu)}
Taking the log: log F(u,v) = log S(u,v) + log(1 + e^(−2πiDu))
Fourier transform of the 2nd term on the RHS has a prominent peak at disparity D (can be shown)
Peak position can be recovered by a peak detection algorithm
Spatial position of the peak in the cepstrum is a direct measure of the "disparity" of the left and right images
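The calculation above can be sketched numerically in 1-D: a signal plus a copy shifted by D (a toy stand-in for the interlaced column pair) yields a cepstrum with a peak at quefrency D. The signal, lag, and peak-search window are illustrative choices, not values from Yeshurun and Schwartz.

```python
import numpy as np

# 1-D cepstral disparity sketch: an echo at lag D produces a cepstrum
# peak at position D.

rng = np.random.default_rng(0)
N, D = 256, 20
s = rng.standard_normal(N)
f = s + np.roll(s, D)                 # signal and its copy shifted by D

power = np.abs(np.fft.fft(f)) ** 2    # power spectrum of interlaced signal
cepstrum = np.abs(np.fft.fft(np.log(power + 1e-12))) ** 2

# Peak position (ignoring the low-quefrency region) recovers the disparity
peak = int(np.argmax(cepstrum[4:N // 2])) + 4
print(peak)  # → 20
```

This is the "one-shot" character of the algorithm: a single transform and a peak detection, with no iteration.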
Requirements for a good theory of human stereopsis
The cepstrum is robust to the image manipulations that human stereo is robust to
We now turn to how this algorithm explains the psychophysical data on how Panum's area changes with eccentricity
Perceptual data re Panum’s area explained by the above algorithm
DS used for computation of stereopsis: OcuDom columns
Combined width of a pair of OcuDom columns (i.e. a hypercolumn) is 2 mm, almost constant throughout V1
Recall: ∆w/∆z ≈ 1/(z+a)
In words, the ratio of change in cortical position to visual angle subtended is the magnification factor
Perceptual data re Panum’s area explained by the above algorithm II
∴ ∆z ≈ ∆w · (z+a)
If ∆w is the hypercolumn width, then ∆z is the visual angle it subtends
∴ angle subtended is linearly proportional to the inverse magnification factor (z+a)
The magnification factor 1/(z+a) is inversely proportional to eccentricity ε
Perceptual data re Panum’s area explained by the above algorithm III
∴ the angle subtended by a hypercolumn scales linearly w/ eccentricity ε
Known from psychophysical experiments: Panum's area (range of fusion) also scales linearly w/ eccentricity
Data consistent w/ the hypercolumn as a functional (computational) module to compute stereo fusion
We conclude that range of fusion extends over a single hypercolumn, regardless of position in the visual field
Edges
What are edges? Answer: Discontinuous changes in image luminance
Lecture Theme: Algorithms the brain employs to do edge detection
RF scatter enhances boundary contour representation in V1
Give a brief summary of Bomberger and Schwartz’s paper
Eric L. Schwartz’s question
What are the computational functions of the visual cortex?
To Computer Science students of Computer Vision interested in how the brain does vision, Eric Schwartz’s web-site:
http://cns.bu.edu/∼eric/
Plus a beautiful display of the complex log map!
Potpourri: Suggested theoretical principles for the computational functions of cortex
Points to ponder, and names to Google
Bayesian Inference: David Mumford
Liquid State Machines: Wolfgang Maass, in collaboration with Henry Markram
No. of "top-down" connections is approx. ten times the number of "bottom-up" connections. What is the computational role played by these? After all, facial recognition takes place in a time too short for top-down connections to play a role.
Potpourri II: Suggested theoretical principles for the computational functions of cortex
Points to ponder, and names to Google
Schwartz and collaborators have taken one approach to the relationship between architecture and function, a continuum approach
Laurence Abbott and Peter Dayan have attempted to describe network connectivity and dynamics, treating the neurons as "discrete" units
"What" and "where" pathways
Laminar Computation: Stephen Grossberg
Terrence Sejnowski
Eric Schwartz
Neuroscience’s Enrico Fermi
Recurring theme in his work: Biological plausibilityof suggested algorithms for a computational task(s)
IOW, how the brain does it
Eric Schwartz: Department of Cognitive and Neural Systems, Boston University
Professor of Cognitive and Neural Systems Professor of Electrical and Computer Engineering Professor of Anatomy and Neurobiology,
PhD Experimental High Energy Physics, Columbia
His work demonstrates how top-quality experimental science and data analysis can suggest mathematically formulated theoretical work
Eric Schwartz
Experimental (wet-biology) work in Neurophysiology
Mathematical techniques from following subjects used in his work:
Computer Vision (algorithms for space-variant vision)
Image Processing
Signal Processing (application of the cepstrum)
Complex Analysis (numerical conformal mapping)
Graph Theory (graph partitioning)
Non-trivial contributions to machine vision and robotics
Built a self-navigating robot
Eric Lee Schwartz’s webpage
Home Page: http://cns.bu.edu/∼eric/
Wikipedia entry: Eric L Schwartz
Part III: Single Neuron Computation
How a single neuron looks to a:
Biologist - Basic terminology
Electrical Engineer or a Computer Scientist - McCulloch-Pitts neuron
Rosenblatt’s Perceptron Spiking Neural Networks
Biophysicist: Hodgkin-Huxley model - describes biophysical mechanisms of action potential generation
Theoretical Bio(logical) Physicist: William Bialek and co-workers
One of the world's top experts in single neuron computation is Bartlett Mel - I encourage you to explore his work
Single Neuron: Biology basics I
Cell body - contains most of the cell's volume and mass, and the nucleus
Single Neuron
Basic Terminology
Soma aka Cell body: "The" neuron.
Membrane Potential: Difference in voltage between the neuron’s interior and exterior
Dendrites: Input terminals: How a neuron receives information from other neurons
Axon Hillock: Output terminal: where the output is generated
Axon: Neuron messages other neurons via this "wire"
Single Neuron
Basic Terminology
Action Potentialaka Spike:
Role: A neuron’s way of signaling other neurons
Mechanism: Biophysical mechanisms cause a positive feedback process that makes the membrane potential rise dramatically at the axon hillock; this voltage change is transmitted down the axon.
Single Neuron: Biology basics II
Pyramidal Cell
Primary excitatory neurons in the cortex, hippocampus, amygdala
So-called because the soma or cell body is "triangular" or "pyramid"-shaped
Single Neuron: Modeled as an Artificial Neuron (circa 1950s and 1960s)
Threshold Logic Unit, developed by McCulloch and Pitts, and called the McCulloch-Pitts neuron
Based on the neurophysiology of the 1950s
o/p = φ(Σ_{j=0}^{m} w_j x_j)
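The threshold logic unit above can be sketched directly; here φ is a hard threshold, and the weights (illustrative choices) are set so the unit computes logical AND, with x_0 = 1 acting as a bias input.

```python
# McCulloch-Pitts threshold logic unit: o/p = phi(sum_j w_j * x_j),
# where phi is a hard threshold.

def mcculloch_pitts(weights, inputs):
    """Hard-threshold unit: fires (1) iff the weighted sum is >= 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= 0 else 0

# Logical AND with a bias input x_0 = 1: fires only if both inputs are 1
and_weights = [-1.5, 1.0, 1.0]   # w_0 (bias), w_1, w_2
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, mcculloch_pitts(and_weights, [1, x1, x2]))
# → 0 0 0 / 0 1 0 / 1 0 0 / 1 1 1
```

Changing only the bias weight (e.g. w_0 = -0.5) turns the same unit into logical OR, which is the sense in which the weights, not the unit, encode the computation.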