Friday, October 11, 2024

Reviewing How Humans and AI Are Shaping the Future of Visual Recognition

    Analyzing the evolution of society throughout the centuries can make an individual reflect in awe. Little more than a century ago, electricity was still a novelty; in modern society, technology is profoundly embedded in humans' day-to-day lives. Technology is truly the lifeline of society, and we can safely assume it will not cease to exist. In fact, it has only continued to evolve through trail-blazing research. One of the most interesting technological advancements to become apparent in recent years is the development and use of Artificial Intelligence, referred to as AI. If you’ve seen The Matrix, directed by Larry and Andy Wachowski, you know where AI and our destiny are headed... I’m kidding. All jokes aside, Artificial Intelligence has become a tool used across a large spectrum of activities: it develops daily routines and study schedules for students, aids in diagnosing patients by analyzing medical records to produce personalized treatment recommendations, and enhances robotic capabilities in manufacturing.

    Despite the evolution of this high-tech blend of computer programming and human intelligence, I believe it is important to note that this technology has its limitations, just as any man-made development does. AI is trained on materials created by humans, with human biases, and unlike humans it is unable to distinguish between biased and unbiased material when constructing its outputs and responses. Considering this, we become cognizant that the capabilities of AI depend largely on the information humans input into it. The materials we provide to train AI enhance its abilities and knowledge base, and so modify its overall skill set. To view this from a different angle, we can imagine AI as soil and the material we input into AI as a seed. When we tend to the repeated action of watering this seed, which in real terms means repeated training within the AI system, the seedling eventually grows and becomes mostly self-sufficient once it has passed a certain stage of growth.

    This very concept has been explored by researchers from the UCLA Samueli School of Engineering, and information on the experiment can be found in the article New AI computer vision system mimics how humans visualize and identify objects. The article concludes that a conventional AI system cannot create a full picture of an object after seeing only a partial portion of it, a result of the system not being designed to learn on its own; instead, the computer must be trained on what to learn. A resolution to this issue is a new method developed by the engineers and discussed in the Proceedings of the National Academy of Sciences, which is referenced in the UCLA article commented on here. The method is based on a breakdown into three main components. First, the system separates an image into smaller sections, referred to as viewlets. Second, the computer learns how these viewlets piece together to form the object in question. Third, the computer analyzes other objects in the scene and determines whether the surrounding objects are relevant to identifying the object in question.
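To make the three components a bit more concrete, here is a minimal Python sketch of the pipeline as I understand it from the article. This is purely illustrative, not the UCLA team's actual implementation: the patch size, the grid-adjacency representation of "how viewlets piece together," and the simple context-weighting function are all my own assumptions.

```python
import numpy as np

def extract_viewlets(image, size=16):
    """Component 1: break an image into small patches ('viewlets'),
    keeping each patch's grid position."""
    h, w = image.shape[:2]
    viewlets = []
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            viewlets.append(((y // size, x // size), image[y:y + size, x:x + size]))
    return viewlets

def learn_composition(viewlets):
    """Component 2: record which viewlets sit next to which, so the
    system can learn how the parts piece together into a whole object."""
    positions = {pos for pos, _ in viewlets}
    adjacency = {}
    for (row, col) in positions:
        adjacency[(row, col)] = [
            (row + dr, col + dc)
            for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0))
            if (row + dr, col + dc) in positions
        ]
    return adjacency

def score_with_context(object_score, context_scores):
    """Component 3: adjust the object hypothesis based on how relevant
    the surrounding objects are (a simple weighted average here)."""
    if not context_scores:
        return object_score
    return 0.7 * object_score + 0.3 * float(np.mean(context_scores))

# Hypothetical usage on a stand-in image:
image = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
graph = learn_composition(extract_viewlets(image))
```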
Through this three-component system, the engineers placed the AI into an internet replica to simulate the very environment humans live in. The system was tested with a plethora of images, about 9,000 visuals, shown from different perspectives such as bird's-eye views, obscured views, and varied scales, each depicting people and other objects. This allowed the computer system to develop detailed models of the human body without relying on external assistance or previously labeled images. Providing the images in such abundance, across different visuals, images, and videos, served as a tutorial for the computer system. As stated before, AI's limitations are rooted in the input of its human developers; by implementing this technique, the system was able to learn and expand its understanding of what to do and how to do it. As Vwani Roychowdhury states in the UCLA article, “Starting as infants, we learn what something is because we see many examples of it, in many contexts. The contextual learning is a key feature of our brains, and it helps us build robust models of objects that are part of an integrated worldview where everything is functionally connected.” I believe Roychowdhury's comment beautifully encapsulates how presenting the images and videos provided a learning experience, a how-to tutorial, for the computer system, enabling it to “learn like humans.” The conclusions of this experiment indicate that AI is becoming more proficient at processing and developing a full picture of an object while being presented with only part of the image. Interestingly enough, this crosses over with the abilities of humans, as further indicated in the research findings of Nicholas Baker, Patrick Garrigan, and Philip J. Kellman.
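The "many examples, in many contexts" idea can also be sketched in code. Below is a hedged illustration of one way a system might learn structure from a large, unlabeled image collection: quantize patches coarsely and count how often pairs of patch signatures appear side by side across the whole collection. The signature scheme and the pairwise counting are my own simplifying assumptions, offered only to show the flavor of contextual, label-free learning, not the actual UCLA system.

```python
from collections import Counter
import numpy as np

def patch_signature(patch, levels=4):
    """Quantize a patch coarsely so that visually similar patches
    end up sharing the same signature."""
    q = (patch.astype(np.float64) / 256.0 * levels).astype(np.uint8)
    return q.tobytes()

def accumulate_cooccurrence(images, size=16):
    """Count how often pairs of patch signatures appear side by side
    across an unlabeled collection. Pairings that recur across many
    images and viewpoints become the backbone of the learned model."""
    counts = Counter()
    for image in images:
        h, w = image.shape[:2]
        for y in range(0, h - size + 1, size):
            for x in range(0, w - 2 * size + 1, size):
                left = patch_signature(image[y:y + size, x:x + size])
                right = patch_signature(image[y:y + size, x + size:x + 2 * size])
                counts[(left, right)] += 1
    return counts

# Hypothetical usage on a stand-in collection of unlabeled images:
images = [np.random.randint(0, 256, (64, 64), dtype=np.uint8) for _ in range(10)]
stats = accumulate_cooccurrence(images)
```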

    When I read and attended Nicholas Baker's presentation of the project Constant Curvature Segments as Building Blocks of 2D Shape Representation, I learned the conclusions drawn from the research. It was determined that humans are able to piece together images even when they are not shown the entirety of the image. To resolve the problem of incomplete information, the brain uses constant curvature segments: our visual system segments shapes into regions of constant curvature, which enables the brain to both process and reconstruct an entire shape or image despite the missing portions (a rough sketch of this idea appears at the end of this post). This research highlights the efficiency and adaptability of the human visual system, whose understanding holds up under different transformations of images and shapes, including scaling, rotation, and partial occlusion.

    The human study conducted by Baker and his colleagues parallels the experiment executed on AI by the UCLA researchers and engineers. Both studies highlight an adapted form of learning that evolved to meet the test at hand. In each study, the system being observed used its adapted form of learning, developed through training founded on repetition and variety, to understand and piece together the entirety of an image or shape despite the limited information provided. This emphasizes how AI can learn when placed in the same environment as a human, through the use of an internet replica. Recognizing how closely a man-made technology is approaching human capabilities marks only the beginning of how far this technology can truly go, and it leaves an infinite number of questions to be explored in determining the rate at which AI can learn in comparison to human beings. Just in case, The Matrix can be found on Netflix.
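For the curious, here is the promised sketch: a small, illustrative Python example of segmenting a closed 2D contour into runs of roughly constant curvature, in the spirit of the building blocks Baker and colleagues describe. The discrete turning-angle estimate of curvature and the greedy tolerance threshold are my own simplifications, not the method used in their study.

```python
import numpy as np

def turning_angles(points):
    """Approximate curvature at each vertex of a closed polyline by the
    signed turning angle between successive edges."""
    v1 = points - np.roll(points, 1, axis=0)   # incoming edge at each vertex
    v2 = np.roll(points, -1, axis=0) - points  # outgoing edge at each vertex
    ang1 = np.arctan2(v1[:, 1], v1[:, 0])
    ang2 = np.arctan2(v2[:, 1], v2[:, 0])
    return np.angle(np.exp(1j * (ang2 - ang1)))  # wrap to (-pi, pi]

def constant_curvature_segments(points, tol=0.1):
    """Greedily group consecutive vertices whose turning angles stay
    within `tol` of the current segment's running mean."""
    kappa = turning_angles(points)
    segments, current = [], [0]
    for i in range(1, len(points)):
        if abs(kappa[i] - np.mean(kappa[current])) <= tol:
            current.append(i)
        else:
            segments.append(current)
            current = [i]
    segments.append(current)
    return segments

# A circle sampled at 60 points bends at a constant rate, so it should
# come out as a single constant curvature segment.
theta = np.linspace(0, 2 * np.pi, 60, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
print(len(constant_curvature_segments(circle)))  # expect 1
```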


References:  

New AI computer vision system mimics how humans visualize and identify objects, UCLA Samueli School of Engineering research article.

Baker, N., Garrigan, P., & Kellman, P. J. Constant Curvature Segments as Building Blocks of 2D Shape Representation (presentation discussed above).



Written and Researched By:

Alisha Arreola  

Molecular & Cellular Neuroscience and Bioinformatics Student | Loyola University Chicago  

Co-President, Executive Board Member | Heart for the Unhoused Chicago  


