Development of a robust camera-based text recognition model for the visual impaired.

Nwokoma, Francisca Onyinyechi

dc.contributor.author	Nwokoma, Francisca Onyinyechi
dc.date.accessioned	2025-10-31T11:26:49Z
dc.date.available	2025-10-31T11:26:49Z
dc.date.issued	2022-12
dc.description	Doctoral thesis on "robust camera-based text recognition model". It contains diagrams, pictures and tables.
dc.description.abstract	Efforts to bridge the digital divide in a world of fast-growing Information and Communication Technology should not be restricted to a few domains but should extend to everyone. To date, screen readers for the visually impaired still perform below expectation, and their applications are domain dependent. Research has shown that Visually Impaired Persons (VIPs) tend to be deprived of certain job opportunities because of their visual impairment, and as a result unemployment rates among the visually impaired are increasingly alarming irrespective of their intellectual ability. Therefore, to improve the text recognition capabilities of OCR and to bring the visually impaired community into employment settings, a robust camera-based text recognition model is proposed that enables a blind person to access documents and scene images for effective work collaboration. The system was designed to start automatically once the user's machine is turned on. To realize this concept, a deep learning approach was used: the CRAFT (Character Region Awareness for Text Detection) architecture, which is suitable for detecting curved text, was deployed for text detection, and a CRNN (Convolutional Recurrent Neural Network), which combines a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network) and CTC (Connectionist Temporal Classification) loss, was deployed for character recognition. The recognition model was trained on the Synth90k synthetic text dataset provided by the Visual Geometry Group (VGG) and achieved a recognition accuracy of 98%. The system was implemented using Python Natural Language Processing libraries. Finally, the recognized text is communicated to the VIP in audio format.
dc.identifier.citation	Nwokoma, F. O. (2022). Development of a robust camera-based text recognition model for the visual impaired. (Unpublished Doctoral Thesis), Federal University of Technology, Owerri.
dc.identifier.uri	https://repository.futo.edu.ng/handle/20.500.14562/2250
dc.language.iso	en
dc.publisher	Federal University of Technology, Owerri.
dc.rights	Attribution-NonCommercial-ShareAlike 4.0 International	en
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.subject	Artificial intelligence
dc.subject	machine learning
dc.subject	computer vision
dc.subject	speech processing
dc.subject	visual impaired
dc.subject	department of computer science
dc.title	Development of a robust camera-based text recognition model for the visual impaired.
dc.type	Doctoral Thesis
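
The abstract names a CRNN trained with CTC loss as the recognition stage. As a rough illustration of how CTC output is usually turned into text, the sketch below shows greedy (best-path) decoding: take the top class per frame, merge consecutive repeats, then drop the blank symbol. The alphabet, the blank index, and the function names are assumptions made for this example; it is not taken from the thesis and does not reproduce its implementation.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol
ALPHABET = "-abcdefghijklmnopqrstuvwxyz"  # hypothetical; "-" at index 0 stands for blank


def ctc_greedy_decode(frame_scores):
    """Collapse per-frame class scores into a string (best-path decoding)."""
    # Step 1: pick the highest-scoring class index for every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    # Step 2: merge runs of identical consecutive indices into one.
    collapsed = [idx for idx, _ in groupby(best_path)]
    # Step 3: remove blanks and map the remaining indices to characters.
    return "".join(ALPHABET[idx] for idx in collapsed if idx != BLANK)


def one_hot(index, size=len(ALPHABET)):
    """Build a toy score vector that puts all mass on one class."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def frames_for(path):
    """Turn a frame-by-frame label string such as 'cc-aa-t' into toy scores."""
    return [one_hot(ALPHABET.index(ch)) for ch in path]


print(ctc_greedy_decode(frames_for("cc-aa-t")))  # prints "cat"
```

A blank between two identical letters is what lets a doubled letter survive the merge step: the frame sequence `hel-lo` decodes to `hello`, while `hello` with no blank decodes to `helo`.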

Files

Original bundle

Name: Nwokoma,F.O._Development_2022.pdf
Size: 2.86 MB
Format: Adobe Portable Document Format

Collections

Doctoral