How can we enable computers to see the world as we see it?




When we watch this video, we see leaves whose individual appearances are constantly changing. This percept is the result of a mysterious process in the brain that somehow identifies the borders of each individual leaf and keeps track of them as their appearances change. Each of these tasks is extremely difficult to implement on a computer. Using algorithms inspired by a new mathematical understanding of the neurobiology of vision, our computers can now see the leaves the way we do.





Pathway for object representation in the brain





Neuroscientists have discovered that completely different brain regions are used to create physical models of objects in images (e.g., where are all the leaves, and how are they changing?) and to classify them (e.g., are these leaves or flowers?). The former is accomplished by retinotopic cortex, the latter by inferotemporal cortex. Our algorithms simulate the processing in retinotopic cortex, which is responsible for initially generating the percept of objects as discrete units in space. "Deep networks" simulate the processing in inferotemporal cortex, which is specialized for image classification. By combining the two modules, we can build systems that see the world as we do.
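The division of labor described above can be sketched as a two-stage pipeline: a segmentation stage that determines *where* the discrete objects are, followed by a classification stage that determines *what* each one is. The sketch below is purely illustrative and assumes toy stand-ins for both modules (a connected-component segmenter in place of the retinotopic model, and a size-based rule in place of a deep network); none of these names or algorithms come from the text.

```python
def segment(image):
    """Group nonzero pixels into connected regions (4-connectivity).

    Toy stand-in for the retinotopic module: it reports *where* the
    objects are, returning one set of (row, col) coordinates per object.
    """
    h, w = len(image), len(image[0])
    seen, regions = set(), []
    for r in range(h):
        for c in range(w):
            if image[r][c] and (r, c) not in seen:
                # Flood-fill one connected region.
                stack, region = [(r, c)], set()
                seen.add((r, c))
                while stack:
                    y, x = stack.pop()
                    region.add((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and image[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                regions.append(region)
    return regions


def classify(region):
    """Toy stand-in for the inferotemporal module: it reports *what*
    each object is. A size threshold replaces a deep network here."""
    return "leaf" if len(region) >= 3 else "speck"


def perceive(image):
    """Combine the two modules: segment first, then classify each unit."""
    return [(region, classify(region)) for region in segment(image)]


# Example: a 3x4 binary image containing one large blob and one lone pixel.
image = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]
for region, label in perceive(image):
    print(len(region), label)
```

The key design point is that the two stages are independent: the segmenter never needs to know what the objects are, and the classifier never needs to find them, mirroring the separation between retinotopic and inferotemporal processing described above.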