Development of a robust camera-based text recognition model for the visual impaired.

dc.contributor.author: Nwokoma, Francisca Onyinyechi
dc.date.accessioned: 2025-10-31T11:26:49Z
dc.date.available: 2025-10-31T11:26:49Z
dc.date.issued: 2022-12
dc.description: Doctoral thesis on a "robust camera-based text recognition model". It contains diagrams, pictures and tables.
dc.description.abstract: The effort to bridge the digital divide in a world of fast-growing Information and Communication Technology should not be restricted to some domains but should extend to everyone. To date, screen readers for the visually impaired still perform below expectation, and their applications are domain dependent. Research has shown that Visually Impaired Persons (VIPs) tend to be deprived of certain job opportunities because of their visual impairment, and unemployment rates among the visually impaired are increasingly alarming irrespective of their intellectual ability. Therefore, to improve the text recognition capabilities of OCR and to bring the visually impaired community into employment settings, a robust camera-based text recognition model is proposed that enables a blind person to access documents and scene images for effective work collaboration. The system was designed to start once the user's machine is turned on. To realize this concept, a deep learning approach was used: the CRAFT (Character-Region Awareness for Text Detection) architecture, which is suitable for detecting curved text, was deployed for text detection, and a CRNN (Convolutional Recurrent Neural Network), which combines a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network) and CTC (Connectionist Temporal Classification) loss, was deployed for character recognition. The recognition model was trained on the Synth90k synthetic text dataset provided by the Visual Geometry Group (VGG) and achieved a recognition accuracy of 98%. The system was implemented using Python natural language processing libraries. Finally, the recognized text is communicated to the VIP in audio format.
dc.identifier.citation: Nwokoma, F. O. (2022). Development of a robust camera-based text recognition model for the visual impaired. (Unpublished Doctoral Thesis), Federal University of Technology, Owerri.
dc.identifier.uri: https://repository.futo.edu.ng/handle/20.500.14562/2250
dc.language.iso: en
dc.publisher: Federal University of Technology, Owerri.
dc.rights: Attribution-NonCommercial-ShareAlike 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.subject: Artificial intelligence
dc.subject: machine learning
dc.subject: computer vision
dc.subject: speech processing
dc.subject: visual impaired
dc.subject: department of computer science
dc.title: Development of a robust camera-based text recognition model for the visual impaired.
dc.type: Doctoral Thesis
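
The abstract names CTC (Connectionist Temporal Classification) as the loss used by the CRNN recognizer. As a rough illustration of how the per-frame outputs of a recognizer trained this way can be turned into a text string, the following is a minimal sketch of greedy CTC decoding. It is not taken from the thesis: the alphabet, the blank index, and the names `ctc_greedy_decode` and `one_hot` are assumptions made for this example only.

```python
from itertools import groupby

# Hypothetical label set: index 0 is the CTC blank, shown here as "-".
ALPHABET = "-abcdefghijklmnopqrstuvwxyz0123456789"
BLANK = 0


def ctc_greedy_decode(frame_scores, alphabet=ALPHABET, blank=BLANK):
    """Decode a sequence of per-frame score vectors into text.

    frame_scores: list of lists, one score per alphabet entry for each frame.
    """
    # 1. Pick the highest-scoring label in every frame (the "best path").
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Collapse runs of the same label into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Drop blanks; a blank between two equal labels keeps both of them.
    return "".join(alphabet[label] for label in collapsed if label != blank)


def one_hot(index, size=len(ALPHABET)):
    """Build a toy score vector that puts all weight on one label."""
    return [1.0 if i == index else 0.0 for i in range(size)]


if __name__ == "__main__":
    # Toy frames spelling "hh-el-lo"; the blank separates the two l's.
    frames = [one_hot(ALPHABET.index(ch)) for ch in "hh-el-lo"]
    print(ctc_greedy_decode(frames))
```

In this sketch, the blank between the two `l` frames is what allows a doubled letter to survive the collapsing step. A real recognizer would supply softmax outputs in place of the toy one-hot vectors, and could use beam search instead of the greedy best path.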

Files

Original bundle
Name: Nwokoma,F.O._Development_2022.pdf
Size: 2.86 MB
Format: Adobe Portable Document Format