
Integrating low-resolution depth maps to high-resolution images in the development of a book reader design for persons with visual impairment and blindness

Abstract:

The objective of this study is to provide a new design approach to a fully automated book reader for individuals with visual impairment and blindness that is portable and cost-effective. This approach relies on the geometry of the design setup and provides the mathematical foundation for integrating a 3-D surface map from a low-resolution time-of-flight (ToF) device with a high-resolution image as a means to enhance the reading accuracy of images warped by the page curvature of bound books and magazines. The merits of this low-cost but effective automated book reader design include: (1) a seamless registration process of the two imaging modalities so that the low-resolution (160 × 120 pixels) depth information, acquired by an Argos3D-P100 camera, accurately covers the entire book spread as captured by the high-resolution image (3072 × 2304 pixels), acquired by a Canon G6 camera; (2) a mathematical framework for overcoming the curvature of open bound books, a process we refer to as the dewarping of the book spread images; and (3) an image correction performance comparison between the uniform and full height maps to determine which map provides the highest optical character recognition (OCR) reading accuracy. The design concept could also be applied to address the challenging process of book digitization. This method is dependent on the geometry of the book reader setup for acquiring a 3-D map that yields high reading accuracy once appropriately fused with the high-resolution image. The experiments were performed on a testing dataset consisting of 200 pages with their corresponding computed and co-registered height maps, which are made available to the research community for their own testing (cate-book3dmaps.fiu.edu). Improvements in reading accuracy due to the correction steps were quantified by feeding the corrected images to an OCR engine and tabulating the number of misrecognized characters.
Furthermore, the resilience of the book reader was tested by introducing a rotational misalignment to the book spreads and comparing the resulting OCR accuracy to that obtained with the standard alignment. The standard alignment yielded an average reading accuracy of 95.55% with the uniform height map (i.e., the height values of the central row of the 3-D map are replicated to approximate all other rows) and 96.11% with the full height map (i.e., each row has its own height values as obtained from the depth information from the 3-D camera). When the rotational misalignments were introduced, the average accuracies were 90.63% and 94.75% for the same respective height maps, demonstrating the added resilience of the full height map method.
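The distinction between the two height maps can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `uniform_height_map`, the toy 4 × 5 grid, and the choice of row index `len(rows) // 2` as the "central row" are assumptions made purely for illustration. The sketch only shows the replication step described above, in which the central row of the full height map is copied to every other row.

```python
def uniform_height_map(full_map):
    """Build a uniform height map by replicating the central row.

    `full_map` is a 2-D grid (list of rows) of height values. The central
    row is taken as index len(full_map) // 2, which is an assumption; the
    source does not say how the central row is chosen.
    """
    if not full_map or not full_map[0]:
        raise ValueError("height map must be a non-empty 2-D grid")
    center = full_map[len(full_map) // 2]
    # Copy the row for each output row so the rows do not alias one another.
    return [list(center) for _ in full_map]


# Toy 4 x 5 height map in arbitrary units (hypothetical values), standing in
# for the much larger depth grid. Heights dip toward the spine in the middle
# column and drift slightly from row to row.
full_map = [
    [3.0, 2.0, 0.5, 2.1, 3.1],
    [3.1, 2.1, 0.6, 2.2, 3.2],
    [3.2, 2.2, 0.7, 2.3, 3.3],
    [3.3, 2.3, 0.8, 2.4, 3.4],
]

uniform_map = uniform_height_map(full_map)
```

In this toy case every row of `uniform_map` equals the third row of `full_map`, so any row-to-row variation in page height is discarded. That kind of variation arises, for example, when the book spread is rotated, which is consistent with the larger accuracy drop reported for the uniform map under rotational misalignment.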