Human sensory systems excel at recognizing objects and words even when objects are presented in unusual orientations or words are spoken in unfamiliar voices. Computational models known as deep neural networks can be trained to do the same thing: correctly identifying an image of a dog regardless of its fur color, or a spoken word regardless of the pitch of the speaker’s voice.
However, a recent study by MIT neuroscientists reveals that these models often respond in the same way to stimuli that bear no resemblance to the intended target. When the neural networks were used to generate an image or a sound that elicited the same internal response as a particular natural input, such as a picture of a bear, most of the generated stimuli were unrecognizable to human observers. This suggests that the models develop their own idiosyncratic “invariances”: they respond identically to stimuli with vastly different physical characteristics.
Jenelle Feather, PhD ’22, now a research fellow at the Flatiron Institute Center for Computational Neuroscience, emphasizes the significance of these results. Collaborators on the paper include Guillaume Leclerc, an MIT graduate student, and Aleksander Mądry, the Cadence Design Systems Professor of Computing at MIT.
In recent years, researchers have developed deep neural networks capable of analyzing millions of inputs, whether they are sounds or images, and learning common features that enable them to classify words or objects with human-level accuracy. These models are currently regarded as leading representations of biological sensory systems.
The research aimed to determine whether deep neural networks trained for classification tasks develop invariances similar to those of human perception. To address this question, the researchers used the models themselves to generate stimuli that elicit the same internal response as a given natural stimulus. These synthesized stimuli are referred to as “model metamers,” reviving a concept from classical perception research, in which physically different but indistinguishable stimuli are used to diagnose the invariances of a perceptual system.
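In broad strokes, metamer synthesis means running gradient descent on an input until the model’s response matches the response to a reference stimulus. The sketch below illustrates that idea on a deliberately tiny model: a single linear layer with a softplus nonlinearity (standing in for ReLU so gradients stay smooth). The architecture, dimensions, and optimizer settings here are illustrative assumptions, not the study’s actual setup, which optimized inputs to large trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "sensory model": one linear layer plus a softplus nonlinearity.
# Mapping 8 input dimensions down to 4 discards information, so many
# different inputs produce the same activations -- the model's invariances.
# (Assumption: softplus in place of ReLU, a toy stand-in for a deep network.)
W = rng.normal(size=(4, 8)) / np.sqrt(8)

def model(x):
    return np.log1p(np.exp(W @ x))  # softplus activations

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A "natural" stimulus and the model response we want to reproduce.
x_ref = rng.normal(size=8)
a_ref = model(x_ref)

# Metamer synthesis: start from noise and run gradient descent on the
# squared activation mismatch  L(x) = ||model(x) - a_ref||^2.
x = rng.normal(size=8)
lr = 0.1
for _ in range(100_000):
    residual = model(x) - a_ref
    # Chain rule: dL/dx = W^T (2 * softplus'(Wx) * residual).
    x -= lr * (W.T @ (2.0 * sigmoid(W @ x) * residual))

# model(x) now matches a_ref even though x differs from x_ref:
# x is a "metamer" of x_ref for this model.
activation_gap = np.max(np.abs(model(x) - a_ref))
stimulus_gap = np.linalg.norm(x - x_ref)
```

Because the gradient always lies in the row space of `W`, the optimization never changes the component of `x` in the model’s null space, so the synthesized input stays a genuinely different stimulus while the activations converge.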
The results were surprising. Most of the images and sounds generated this way bore no resemblance to the original examples: the images appeared as chaotic jumbles of pixels, and the sounds were unintelligible. When human observers were shown these stimuli, they rarely placed them in the same category as the original target examples.
The same effect was observed across various vision and auditory models, with each model developing its unique invariances.
The researchers also discovered that they could enhance the recognizability of a model’s metamers to humans using a technique called adversarial training. This method was originally developed to address another limitation of object recognition models, which is their susceptibility to misrecognizing images when tiny, almost imperceptible changes are introduced.
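Adversarial training works by perturbing each training input in the direction that most increases the loss, then training on the perturbed examples. As a minimal sketch (not the study’s setup), here is fast-gradient-sign (FGSM-style) adversarial training of a logistic-regression classifier on toy 2-D data; the data distribution, perturbation budget `eps`, and learning rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two well-separated Gaussian classes in 2-D.
n = 200
x0 = rng.normal(loc=-2.0, scale=1.0, size=(n, 2))
x1 = rng.normal(loc=+2.0, scale=1.0, size=(n, 2))
X = np.vstack([x0, x1])
y = np.concatenate([np.zeros(n), np.ones(n)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(2)
b = 0.0
eps, lr = 0.5, 0.1  # illustrative perturbation budget and step size

for _ in range(500):
    # FGSM step: for logistic loss, the input gradient is (p - y) * w,
    # so the worst-case perturbation is eps * sign of that gradient.
    p = sigmoid(X @ w + b)
    X_adv = X + eps * np.sign((p - y)[:, None] * w[None, :])
    # Standard gradient step, but computed on the perturbed inputs.
    p_adv = sigmoid(X_adv @ w + b)
    g = p_adv - y
    w -= lr * (X_adv.T @ g) / len(y)
    b -= lr * g.mean()

# Accuracy on clean inputs, and on freshly perturbed inputs.
clean_acc = np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1))
p = sigmoid(X @ w + b)
X_test_adv = X + eps * np.sign((p - y)[:, None] * w[None, :])
robust_acc = np.mean((sigmoid(X_test_adv @ w + b) > 0.5) == (y == 1))
```

Training on worst-case perturbed inputs forces the decision boundary to keep a margin of at least `eps` from the data, which is what makes the resulting model less susceptible to imperceptible changes.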
The analysis of metamers produced by computational models holds promise as a tool for assessing the degree to which a computational model replicates the underlying organization of human sensory perception systems.