Real-time Lexicon-free Scene Text Retrieval


Abstract:

In this work, we address the task of scene text retrieval: given a text query, the system returns all images containing the queried text. The proposed model uses a single shot CNN architecture that pbkp_redicts bounding boxes and builds a compact representation of spotted words. In this way, this problem can be modeled as a nearest neighbor search of the textual representation of a query over the outputs of the CNN collected from the totality of an image database. Our experiments demonstrate that the proposed model outperforms previous state-of-the-art, while offering a significant increase in processing speed and unmatched expressiveness with samples never seen at training time. Several experiments to assess the generalization capability of the model are conducted in a multilingual dataset, as well as an application of real-time text spotting in videos.

Año de publicación:

2021

Keywords:

  • Word spotting
  • convolutional neural networks
  • Region proposal networks
  • Image retrieval
  • Scene text recognition
  • Scene text detection
  • PHOC

Fuente:

scopusscopus

Tipo de documento:

Article

Estado:

Acceso restringido

Áreas de conocimiento:

  • Visión por computadora
  • Ciencias de la computación

Áreas temáticas:

  • Programación informática, programas, datos, seguridad
  • Métodos informáticos especiales
  • Biblioteconomía y Documentación informatica