Object recognition is a fundamental human ability: it lets us pick up on visual cues in our environment and react to them accordingly. Artificial intelligence models have become much better at object recognition in recent years, yet clear differences remain between how humans and AI process what they see. Two recent studies highlight these differences by exploring object and shape recognition in depth.
The first article, “Beyond the Contour: How 3D Cues Enhance Object Recognition in Humans and Neural Networks” by Mikayla Cutler et al., examines how three-dimensional cues, such as shading and shadows, help observers recognize objects. The researchers created a dataset rich in these 3D cues and tested how humans and artificial neural networks (ANNs) made use of them. Both humans and ANNs benefited from the cues, but in different ways: humans used the cues to mentally construct the object's three-dimensional shape, while ANNs treated the cues as a statistical pattern and relied on pattern recognition learned during training to classify the images. This suggests that ANNs do not reason about shape the way humans do; they draw on regularities in the data they were trained on in order to reach their conclusions.
These findings align with another paper, “Deep Learning Models Fail to Capture the Configural Nature of Human Shape Perception” by Nicholas Baker and James Elder. In this study, the researchers showed that humans and deep convolutional neural networks (DCNNs) process shapes in different ways. Human participants and DCNNs were shown animal silhouettes that were either “whole” (the original silhouette), “fragmented” (the top half flipped and displaced away from the bottom), or “frankensteined” (the top half flipped and then re-aligned with the bottom to form a single connected shape). Humans had a significantly harder time identifying both the “fragmented” and “frankensteined” shapes, whereas the DCNNs only struggled with the “fragmented” images. This suggests that humans and DCNNs perceive visual input in fundamentally different ways: humans grasp a holistic, configural impression of the shape as a whole, while DCNNs lean on local features, matching the lines and curves of an image against patterns learned during training.
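The three stimulus manipulations described above can be sketched as a simple image transformation. The following Python/NumPy snippet is an illustrative reconstruction, not the authors' actual stimulus-generation code; the function name, the left-right mirroring of the top half, and the size of the gap in the fragmented version are assumptions for demonstration.

```python
import numpy as np

def make_variants(silhouette):
    """Given a binary silhouette (H x W array), build three stimulus types
    loosely following Baker & Elder (2022): whole, fragmented, frankensteined.
    This construction is an illustrative guess, not the published method."""
    h, w = silhouette.shape
    top, bottom = silhouette[: h // 2], silhouette[h // 2 :]
    flipped_top = top[:, ::-1]  # mirror the top half left-right

    # "Whole": the original, unmodified silhouette
    whole = silhouette.copy()

    # "Frankensteined": flipped top re-attached to the bottom as one shape
    frankenstein = np.vstack([flipped_top, bottom])

    # "Fragmented": flipped top displaced away from the bottom by a blank gap
    gap = np.zeros((h // 4, w), dtype=silhouette.dtype)
    fragmented = np.vstack([flipped_top, gap, bottom])

    return whole, fragmented, frankenstein
```

For example, an 8x6 silhouette yields an 8x6 whole image, an 8x6 frankensteined image (same pixels, rearranged top), and a taller 10x6 fragmented image with a two-row gap separating the halves.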
In conclusion, these studies show how humans and AI differ when it comes to object recognition. While AI may appear to be on par with the human brain, it still lacks the nuance and holistic understanding of a real human observer.
References
Baker, N., & Elder, J. H. (2022). Deep learning models fail to capture the configural nature of human shape perception. iScience, 25(9), 104913. https://doi.org/10.1016/j.isci.2022.104913
Cutler, M., Baumel, L. D., Tocco, J. A., Friebel, W., Thiruvathukal, G. K., & Baker, N. (2025). Beyond the contour: How 3D cues enhance object recognition in humans and neural networks. Journal of Vision.