Showing 10 of 10 results
Publisher
Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022 (3)
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2)
Proceedings of the International Conference on Document Analysis and Recognition, ICDAR (2)
Pattern Recognition Letters (1)
Proceedings - 2021 IEEE Winter Conference on Applications of Computer Vision, WACV 2021 (1)
Subject areas
Special computer methods (7)
Operations of libraries and archives (3)
Printing and related activities (2)
Arts (1)
Library and information sciences (1)
Field of knowledge
Computer science (9)
Computer vision (4)
Machine learning (3)
Artificial intelligence (2)
Data mining (1)
Sustainable Development Goals
SDG 9: Industry, innovation and infrastructure (10)
SDG 4: Quality education (9)
SDG 8: Decent work and economic growth (1)
Source
Scopus (10)
ICDAR 2019 competition on scene text visual question answering
Conference Object
Abstract: This paper presents final results of ICDAR 2019 Scene Text Visual Question Answering competition (ST...
Keywords: Scene text, Scene understanding, Vision and language, Visual question answering
Authors: Furkan Biten A., Jawahar C.V., Karatzas D., Lluís Álvarez Gómez, Mafla A., Mathew M., Rusiñol M., Tito R., Valveny E.
Sources: scopus

MUST-VQA: MUltilingual Scene-Text VQA
Conference Object
Abstract: In this paper, we present a framework for Multilingual Scene Text Visual Question Answering that dea...
Keywords: Multilingual models, Power of language models, Scene text, Translation robustness, Visual question answering, Zero-shot transfer
Authors: Furkan Biten A., Karatzas D., Lluís Álvarez Gómez, Mafla A., Vivoli E.
Sources: scopus

Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning
Conference Object
Abstract: Explaining an image with missing or non-existent objects is known as object bias (hallucination) in...
Keywords: Vision and Languages
Authors: Furkan Biten A., Karatzas D., Lluís Álvarez Gómez
Sources: scopus

Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching
Conference Object
Abstract: The task of image-text matching aims to map representations from different modalities into a common...
Keywords: Vision and Languages
Authors: Furkan Biten A., Karatzas D., Lluís Álvarez Gómez, Mafla A.
Sources: scopus

One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition
Conference Object
Abstract: Low resource Handwritten Text Recognition (HTR) is a hard problem due to the scarce annotated data a...
Keywords: Document Analysis
Authors: Dey S., Fornes A., Furkan Biten A., Karatzas D., Kessentini Y., Llados J., Lluís Álvarez Gómez, Souibgui M.A.
Sources: scopus

Scene text visual question answering
Conference Object
Abstract: Current visual question answering datasets do not consider the rich semantic information conveyed by...
Keywords: none listed
Authors: Furkan Biten A., Jawahar C.V., Karatzas D., Lluís Álvarez Gómez, Mafla A., Rusiñol M., Tito R., Valveny E.
Sources: scopus

Selective style transfer for text
Conference Object
Abstract: This paper explores the possibilities of image style transfer applied to text maintaining the origin...
Keywords: data augmentation, Scene text detection, Style transfer, Text style transfer
Authors: Furkan Biten A., Gibert J., Gómez R., Karatzas D., Lluís Álvarez Gómez, Rusiñol M.
Sources: scopus

Multi-modal reasoning graph for scene-text based fine-grained image classification and retrieval
Conference Object
Abstract: Scene text instances found in natural images carry explicit semantic information that can provide im...
Keywords: none listed
Authors: Dey S., Furkan Biten A., Karatzas D., Lluís Álvarez Gómez, Mafla A.
Sources: scopus

Multimodal grid features and cell pointers for scene text visual question answering
Article
Abstract: This paper presents a new model for the task of scene text visual question answering. In this task q...
Keywords: 41A05, 41A10, 65D05, 65D17, deep learning, MSC, Multi-modal learning, Scene text, Visual question answering
Authors: Furkan Biten A., Karatzas D., Lluís Álvarez Gómez, Mafla A., Rusiñol M., Tito R., Valveny E.
Sources: scopus

OCR-IDL: OCR Annotations for Industry Document Library Dataset
Conference Object
Abstract: Pretraining has proven successful in Document Intelligence tasks where deluge of documents are used...
Keywords: none listed
Authors: Furkan Biten A., Karatzas D., Lluís Álvarez Gómez, Tito R., Valveny E.
Sources: scopus