Computer Vision for Image Understanding: A Comprehensive Review
Abstract:
Computer Vision has its own Turing test: Can a machine describe the contents of an image or a video in the way a human being would do? In this paper, the progress of Deep Learning for image recognition is analyzed in order to know the answer to this question. In recent years, Deep Learning has increased considerably the precision rate of many tasks related to computer vision. Many datasets of labeled images are now available online, which leads to pre-trained models for many computer vision applications. In this work, we gather information of the latest techniques to perform image understanding and description. As a conclusion we obtained that the combination of Natural Language Processing (using Recurrent Neural Networks and Long Short-Term Memory) plus Image Understanding (using Convolutional Neural Networks) could bring new types of powerful and useful applications in which the computer will be able to answer questions about the content of images and videos. In order to build datasets of labeled images, we need a lot of work and most of the datasets are built using crowd work. These new applications have the potential to increase the human machine interaction to new levels of usability and user’s satisfaction.
Año de publicación:
2020
Keywords:
- Computer Vision
- Scene recognition
- cnn
- deep learning
- Object classification
- Image understanding
Fuente:

Tipo de documento:
Conference Object
Estado:
Acceso restringido
Áreas de conocimiento:
- Visión por computadora
- Ciencias de la computación
Áreas temáticas:
- Métodos informáticos especiales
- Funcionamiento de bibliotecas y archivos
- Ciencias de la computación