Devanagari character recognition in the wild

Academic year: 2022

DEVANAGARI CHARACTER RECOGNITION IN THE WILD

by

O. V. RAMANA MURTHY

Submitted

in fulfillment of the requirements of the degree of DOCTOR OF PHILOSOPHY

to the

DEPARTMENT OF ELECTRICAL ENGINEERING INDIAN INSTITUTE OF TECHNOLOGY DELHI

December, 2011

Certificate

This is to certify that the thesis entitled "Devanagari Character Recognition in the Wild" being submitted by O. V. Ramana Murthy for the award of the degree of Doctor of Philosophy to the Indian Institute of Technology Delhi is a record of the bona fide research work he has carried out under my supervision. The results contained in this thesis have not been submitted to any other University or Institute for the award of a degree or diploma.

(Dr. M. Hanmandlu), Professor, Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi 110 016

ACKNOWLEDGEMENTS

First of all, I would like to thank the Supreme Lord who has given me the strength, courage and inspiration to complete this Thesis work. It is because of His blessings and His pure representative's intervention in my life that I am able to submit my research work.

I would like to express my sincere gratitude to my supervisor, Prof. M. Hanmandlu, who guided my research project with great care. There were many ups and downs during the research period, and his willingness to help me both professionally and personally will always be appreciated. I also express my heartfelt gratitude to Mrs. M. Hanmandlu, who allowed me to meet him for discussions even at odd hours.

I am also very grateful to Prof. P. V. Krishnan. He has been a great source of inspiration to me to learn many higher principles and values in life. My attitude towards my Thesis and life has improved a lot.

My friends Sujoy Roy, Vipin Narang, Partheepan, Sudheer and many more were a great source of inspiration for me. I can never forget the moral and emotional support they have extended to me during my ups and downs in my thesis completion.

I am very much indebted to my parents and my sisters who always encouraged me and gave me all kinds of support to pursue my research work peacefully.

Ramana Murthy

Abstract

Character recognition is the automatic recognition of text in written form, intended to make the processing of that text faster and better. Text also appears all around us in street signs, shop names, product advertisements, posters on streets and so on; this is called text in the wild. Text in the wild is prone to multiple sources of noise, and the unconstrained settings in which it is captured make recognition extremely difficult. The problem becomes more complicated when text recognition methodologies are developed for different scripts, because there is no general solution and recognizing text in some scripts can be tougher than in others. This thesis proposes methods for recognizing Devanagari characters in images captured in the wild.

A detailed study of Devanagari character recognition using state-of-the-art character recognition and object recognition tools is made to bring out the issues in applying such tools directly to this challenging problem. A methodology for segmenting Devanagari words acquired from natural scenes into characters is presented. The results are encouraging, although they are obtained on a small database.

A part-based model approach is presented for recognizing Devanagari characters in the wild. This method particularly alleviates the problem of obtaining sufficient training samples from real images for learning a model. Using a training database of machine-printed characters, the proposed approach gives performance competitive with that obtained from state-of-the-art approaches.

Existing classifiers such as Nearest Neighbor and SVM are computationally constrained by the large number of comparisons they require. To address this issue, a Fuzzy Model based classifier is developed.

Table of Contents

Certificate

Acknowledgements

Abstract

Table of Contents

List of Figures

List of Tables

Chapter 1 Introduction

1.1 Motivation for recognizing text in the wild

1.2 Description of Devanagari script

1.3 The problem of recognizing Devanagari characters

1.4 Organization of the Thesis

Chapter 2 Literature survey

2.1 Text extraction from the images in the wild

2.2 Word recognition from the images in the wild

2.3 Character recognition in the wild

2.4 Survey in Indian languages

Chapter 3 Evaluation of Existing Approaches on Devanagari Characters

3.1 Databases

3.2 Preprocessing

3.3 Features

3.3.1 Global features

3.3.2 Local features

3.4 Results of experimentation on different databases

3.4.1 Performance on DSMP-8K and DSHnd-30K

3.4.2 Performance on transformed dataset DSMP-48K

3.4.3 Performance on text from the wild

3.4.4 Performance on DSIW-3K

3.5 Conclusions

Chapter 4 Segmentation of the words into Devanagari characters

4.1 Overall framework

4.2 Stages of segmentation

4.3 Results of segmentation

4.3.1 Partially segmented results

4.3.2 Poorly segmented results

4.3.3 Execution time

4.4 Conclusions

Chapter 5 Recognizing Devanagari Characters in the Wild

5.1 Overall Framework

5.2 Part-based models

5.2.1 Extraction of distinct parts

5.2.2 Classification of test characters using the part-based models

5.3 Results and Analysis

5.3.1 Effect of considering the synthetic training data as training data

5.3.2 Comparison with state-of-the-art results

5.3.3 Effect of choice of parameters

5.3.4 Effect of using Grassman distance

5.3.5 Effect of combined score

5.3.6 Execution time

5.4 Conclusions

Chapter 6 Fuzzy Models for Character Recognition

6.1 Overall framework

6.2 Formulation of the Fuzzy Model

6.2.1 Estimation of parameters

6.3 Fuzzy Model

6.4 Interactive Fuzzy Model

6.4.1 Fuzzy measure theory

6.4.2 Estimation of Parameters

6.5 Results and Discussion

6.6 Conclusions

Chapter 7 Conclusion and Future Work

7.1 Conclusions

7.2 Contribution of the thesis

7.3 Suggestions for future work

References

Appendix
