OCR System
Presented By:- Vijay Apurva (9910103462), 4th year, CSE
Guided By:- Mr. Ankur Kulhari
ABSTRACT:-
The current capacity of optical character recognition technology to translate paper documents quickly and accurately into machine-readable form increases the opportunities for document searching and storing, as well as for automated document processing. Translating large collections of image-based electronic documents into structured electronic documents quickly is still a problem. The large number of processing units available in Grid environments, together with free optical character recognition tools, can be exploited to produce a fast translation.
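The idea of spreading a large collection over many processing units can be sketched on a single machine. This is only a minimal illustration under stated assumptions: local worker threads stand in for Grid nodes, and `recognize_page` is a hypothetical stub, not a real OCR engine call.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_page(page_image):
    """Hypothetical stand-in for running an OCR tool on one page image."""
    # A real system would call an OCR engine here; this stub only tags the input.
    return f"text of {page_image}"


def recognize_collection(page_images, workers=4):
    """Recognize all pages concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(recognize_page, page_images))


pages = ["page-001.png", "page-002.png", "page-003.png"]
texts = recognize_collection(pages)
```

Because each page is recognized independently, the work divides cleanly among however many workers are available.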
CONTENTS :-
• What is OCR?
• When and Why OCR?
• Existing System.
• Proposed System.
• Architecture of OCR.
• Algorithms of OCR.
• Modules of OCR.
• Design of OCR.
• Design of Screen shots for OCR.
• Conclusion.
WHAT IS OCR? :-
OCR stands for Optical Character Recognition. It is a system that scans printed, typewritten or handwritten text (numerals, letters or symbols) and converts the scanned image into a computer-processable format, either plain text or a word document.
• The converted documents can later be edited, used or reused in other documents.
WHEN AND WHY OCR? :-
• OCR is used when manually recreating a paper document in electronic form would take more time.
• The converted text files take less space than the original image file and can be indexed. Hence OCR benefits users who have to convert large amounts of paperwork into electronic form.
EXISTING SYSTEM:-
Today there is a growing demand from users to convert printed documents into electronic documents in order to keep their data secure. Hence the basic OCR system was invented to convert the data available on paper into computer-processable documents, so that the documents can be edited and reused.
PROPOSED SYSTEM:-
Our proposed system is OCR ON A GRID INFRASTRUCTURE, a character recognition system that supports recognition of the characters of multiple languages. We call this feature the grid infrastructure; it eliminates the problem of heterogeneous character recognition. In this context, grid infrastructure means infrastructure that supports a specific group of languages. Thus OCR on a grid infrastructure is multilingual.
ARCHITECTURE :-
• The architecture of the optical character recognition system on a grid infrastructure consists of three main components. They are:-
• Scanner
• OCR Hardware or Software
• Output Interface
[Figure: OCR system architecture]
Scanner: Document, Illuminator, Detector
  | Document image
OCR Hardware or Software: Document Analysis, Character Recognition, Contextual Processing
  | Recognition Results
Output Interface
  | To application user
TYPES OF TRAINING:-
A neural network system can be trained in two major ways. They are:-
• Supervised Training
• Unsupervised Training
FLOWCHART FOR UNSUPERVISED LEARNING:-
KOHONEN NETWORK:-
The Kohonen network is presented with data, but the correct
output that corresponds to that data is not specified. Using the
Kohonen network this data can be classified into groups.
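A minimal sketch of how such a network can group unlabeled data is given below. It assumes a winner-take-all layer with normalized inputs and one common form of the Kohonen weight update; the starting weights, sample points and function names are illustrative assumptions, not taken from the presented system.

```python
import math


def normalize(vec):
    """Scale a non-zero vector to unit length."""
    length = math.sqrt(sum(v * v for v in vec))
    return [v / length for v in vec]


def winner(weights, sample):
    """Index of the output neuron whose weights best match the sample."""
    x = normalize(sample)
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return scores.index(max(scores))


def train(weights, samples, learn_rate=0.3, epochs=20):
    """Move only the winning neuron's weights toward each sample."""
    for _ in range(epochs):
        for sample in samples:
            x = normalize(sample)
            k = winner(weights, sample)
            moved = [w + learn_rate * (xi - w) for w, xi in zip(weights[k], x)]
            weights[k] = normalize(moved)
    return weights


# Two unlabeled clusters; no "correct output" is ever supplied.
samples = [(0.9, 0.1), (1.0, 0.2), (0.8, 0.1),
           (0.1, 0.9), (0.2, 1.0), (0.1, 0.8)]
weights = train([[0.6, 0.8], [0.8, 0.6]], samples)
```

After training, points from the same cluster activate the same output neuron, so the neuron index acts as the group label.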
FLOWCHART FOR KOHONEN TRAINING:-
ALGORITHMS OF OCR:-
TRAINING ALGORITHM:-
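One of the most common learning algorithms is Hebb's Rule, which was developed to assist with unsupervised training.
• Hebb's rule is expressed as:
Δw_ij = µ · a_i · a_j
where w_ij is the weight between neurons i and j, µ is the learning rate, and a_i and a_j are the activations of the two neurons. The factor (d - a), the difference between desired and actual output, belongs to the supervised delta rule, not to Hebb's rule.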
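As a rough illustration, the classical Hebbian update Δw_ij = µ · a_i · a_j (the plain form, with no error term) can be written as follows. The function name, the learning rate value and the choice to skip self-connections are assumptions made for this sketch.

```python
def hebb_update(weights, activations, mu=0.1):
    """Return new weights after one Hebbian step: w_ij += mu * a_i * a_j."""
    n = len(activations)
    updated = [row[:] for row in weights]
    for i in range(n):
        for j in range(n):
            if i != j:  # assumption: no neuron is connected to itself
                updated[i][j] += mu * activations[i] * activations[j]
    return updated


w = [[0.0, 0.0], [0.0, 0.0]]
w = hebb_update(w, [1.0, -1.0], mu=0.5)
```

Neurons that are active together have their connection strengthened; here the two activations have opposite signs, so the weight between them decreases.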
MODULES :-
The modules identified in the Optical Character Recognition system are as follows:-
• Document Processing
• Neural Network System Training
• Document Recognition
• Document Editing and
• Document Searching
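The Document Searching module depends on recognized text being indexable. One possible approach, sketched here as an assumption rather than as the system's actual design, is an inverted index that maps each word to the documents containing it. The function names and sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            word = token.strip(".,;:!?")
            if word:
                index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)


documents = {
    1: "Optical character recognition",
    2: "Character sets and fonts.",
    3: "Grid computing",
}
index = build_index(documents)
```

A query such as `search(index, "character fonts")` then returns only the documents that contain both words.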
DESIGN OF OCR :-
The design of our OCR system is best explained by the following diagram:-
[Diagram: Scan, Store, Recognize, Editing and Searching operations, connected to "Document and users" and a Database]
OVERALL USECASE DIAGRAM:-
[Use case diagram]
Actors: end-user, end-user1, end-user2, administrator
Use cases: scan documents, store documents, Document processing, Document recognition, Document editing, Document modification, Document deletion, Trains the system
Relationships: two <<includes>> links
OVERALL CLASS DIAGRAM:-
Document
  Attributes: docid : integer; docname : String; docsize : integer; doctype : String
  Operations: getDocumentDetails(), scanDocument(), covertToImage(), storeImage()
Editor
  Operations: cut(), copy(), paste(), new(), open(), find()
HelpFrame
HEntry
  Operations: hLineClear(), vLineClear(), findBounds()
TrainingSet
  Attributes: inputCount : int; outputcount : int; trainingSetCount : int
  Operations: setInputCount(), setOutputCount(), setTrainingSetCount(), setClassify()
MainScreen
  Operations: editor(), helpFrame(), printedFrame(), handWrittenFrame()
Entry
  Attributes: recog : int; downSampleLeft : int; downSampleRight : int; downSampleTop : int; downSampleBottom : int
  Operations: hLineClear(), hLineClearWithin(), vLineClear(), vLineClearWithin()
PrintedFrame
  Operations: open_action(), train_action(), topen_action(), recogniseAll_action()
KohenNetwork
  Attributes: LearnMethod : int = 1; LearnRate : double = 0.3; quitError : double
  Operations: copyWeights(), clearWeights(), winner(), normalizeInput()
Associations between the classes are marked with multiplicities 1..* and 1.
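The class diagram names operations such as hLineClear(), vLineClear() and findBounds() and the downSample fields, but does not show their bodies. The sketch below is one plausible reading, offered as an assumption: the image is a grid of 0/1 pixels, a line is "clear" when it holds no ink, the bounds are the tightest box around the ink, and downsampling reduces that box to a small fixed grid suitable as neural network input.

```python
def h_line_clear(image, y):
    """True if row y contains no ink pixels."""
    return not any(image[y])


def v_line_clear(image, x):
    """True if column x contains no ink pixels."""
    return not any(row[x] for row in image)


def find_bounds(image):
    """Tightest (left, top, right, bottom) box; assumes some ink exists."""
    height, width = len(image), len(image[0])
    top = next(y for y in range(height) if not h_line_clear(image, y))
    bottom = next(y for y in reversed(range(height)) if not h_line_clear(image, y))
    left = next(x for x in range(width) if not v_line_clear(image, x))
    right = next(x for x in reversed(range(width)) if not v_line_clear(image, x))
    return left, top, right, bottom


def cell_span(i, cells, size, offset):
    """Pixel range covered by cell i when `size` pixels are split into `cells`."""
    start = offset + i * size // cells
    end = offset + (i + 1) * size // cells
    return range(start, max(end, start + 1))


def downsample(image, cols, rows):
    """Reduce the inked region to a rows x cols grid of 0/1 cells."""
    left, top, right, bottom = find_bounds(image)
    width, height = right - left + 1, bottom - top + 1
    return [
        [
            int(any(image[y][x]
                    for y in cell_span(r, rows, height, top)
                    for x in cell_span(c, cols, width, left)))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
bounds = find_bounds(image)
grid = downsample(image, cols=2, rows=4)
```

Cropping to the bounds first makes the result independent of where on the page the character was drawn.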
DESIGN OF SCREEN SHOTS FOR OCR:-
The screenshots that describe the operations carried out by our system are as follows :-
• Main Screen
• Hand Written Recognition Screen
• Scanned Document Recognition Screen
• Training Screen
• Recognition Screen
• Editor Screen
CONCLUSION:-
The Grid infrastructure used to implement the Optical Character Recognition system can efficiently speed up the translation of image-based documents into structured documents that are easy to discover, search and process.
Automated data entry by OCR is one of the most attractive labor-reducing technologies.
The system recognizes characters in new fonts easily and quickly.
Documents can be edited more conveniently, and the edited information can be reused as and when required.
Extending the software beyond editing and searching is a topic for future work.
• Training and recognition can be made faster, and the system more user-friendly.
• Many applications exist where it would be desirable to read handwritten entries. Reading handwriting is a very difficult task, given the diversity of ordinary penmanship. However, progress is being made.
  • 1. OCR System Presented By:- Vijay apurva(9910103462), From 4th year,CSEGuided By:- Mr. Ankur kulhari
  • 2. The current capacity to translate paper documents quickly and accurately into machine readable form using optical character recognition technology augments the opportunities in document searching and storing, as well as the automated document processing. A fast response in translating large collections of image- based electronic documents into structured electronic documents is still a problem. The availability of a large number of processing units in Grid environments and of free optical character recognition tools can be exploited to produce a fast translation. ABSTRACT:-
  • 3. CONTENTS :-  What is OCR?  When and Why OCR?  Existing System.  Proposed System.  Architecture of OCR.  Algorithms of OCR.  Modules of OCR.  Design of OCR.  Design of Screen shots for OCR.  Conclusion.
  • 4. WHAT IS OCR?:- OCR stands for Optical Character Recognition. It is a system that scans printed, typewritten or handwritten text (numerals, letters or symbols) and converts the scanned image into a computer-processable format, either as plain text or as a word document. The converted documents can later be edited, used or reused in other documents. Thus the documents become editable.
  • 5. WHEN AND WHY OCR?:- OCR is used when recreating a paper document in electronic form by hand would take more time. The converted text files take less space than the original image file and can be indexed. Hence OCR is an advantage to users who have to convert large amounts of paperwork into electronic form.
  • 6. EXISTING SYSTEM:- In today's world there is a growing demand from users to convert printed documents into electronic documents in order to keep their data secure. Hence the basic OCR system was invented to convert the data available on paper into computer-processable documents, so that the documents can be edited and reused.
  • 7. PROPOSED SYSTEM:- Our proposed system is OCR ON A GRID INFRASTRUCTURE, a character recognition system that supports recognition of the characters of multiple languages. This feature is what we call grid infrastructure, and it eliminates the problem of heterogeneous character recognition. In this context, grid infrastructure means the infrastructure that supports a specific set of languages. Thus OCR on a grid infrastructure is multilingual.
  • 8. ARCHITECTURE:- The architecture of the optical character recognition system on a grid infrastructure consists of three main components: Scanner, OCR Hardware or Software, and Output Interface.
  • 9. [Architecture diagram. Scanner: Document, Illuminator, Detector. OCR Hardware or Software: Document Analysis, Character Recognition, Contextual Processing. Output Interface. Labeled flows: Document image, Recognition Results, To application user.]
  • 10. TYPES OF TRAINING:- There are two major types of training with which a neural network system can be trained: Supervised Training and Unsupervised Training.
  • 12. KOHONEN NETWORK:- The Kohonen network is presented with data, but the correct output that corresponds to that data is not specified. Using the Kohonen network this data can be classified into groups.
  • 13. FLOWCHART FOR KOHONEN TRAINING:-
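The flowchart itself is not part of this transcript. As a rough illustration of unsupervised Kohonen-style training, the sketch below uses a winner-take-all update: each sample is normalized, the output neuron with the best-matching weight vector wins, and only that neuron's weights move toward the sample. The function names, the plain-list data layout and the fixed epoch count are assumptions made for this example; the learning rate of 0.3 matches the LearnRate value shown in the class diagram, but the slides do not show the authors' actual implementation.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def winner(weights, sample):
    """Index of the output neuron whose weight vector best matches the sample."""
    scores = [sum(w * x for w, x in zip(row, sample)) for row in weights]
    return scores.index(max(scores))

def train_kohonen(samples, weights, learn_rate=0.3, epochs=20):
    """Winner-take-all training: only the winning neuron's weights are updated."""
    samples = [normalize(s) for s in samples]
    weights = [normalize(row) for row in weights]
    for _ in range(epochs):
        for sample in samples:
            k = winner(weights, sample)
            # Move the winner a fraction of the way toward the sample.
            moved = [w + learn_rate * (x - w) for w, x in zip(weights[k], sample)]
            weights[k] = normalize(moved)
    return weights

# Two loose groups of 2-D points; no labels are supplied.
samples = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
trained = train_kohonen(samples, weights=[[1.0, 0.0], [0.0, 1.0]])
groups = [winner(trained, normalize(s)) for s in samples]
```

After training, `groups` assigns each sample to the output neuron that represents its group, which is the classification-without-labels behaviour described on the Kohonen network slide. Initial weights are passed in explicitly here to keep the example deterministic; random initialization is also common.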
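  • 14. ALGORITHMS OF OCR:- TRAINING ALGORITHM:- One of the most common learning algorithms is called Hebb's Rule. This rule was developed to assist with unsupervised training. Hebb's rule is expressed as: ΔW_ij = µ · a_i · a_j, where W_ij is the weight of the connection between neurons i and j, µ is the learning rate, and a_i and a_j are the activations of the two neurons. The error factor (d − a), desired output minus actual output, belongs to the supervised delta rule and is not part of Hebb's rule, because unsupervised training has no desired output d.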
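A minimal sketch of one classical Hebbian update step, ΔW_ij = µ · a_i · a_j, in which a weight grows when the two neurons it connects are active together. It deliberately leaves out any (d − a) error factor, since that term needs a desired output and unsupervised training supplies none. The function name and the nested-list weight matrix are assumptions for illustration only.

```python
def hebb_update(weights, activations, learn_rate=0.3):
    """Return new weights after one Hebbian step: dW[i][j] = mu * a[i] * a[j]."""
    n = len(activations)
    return [
        [weights[i][j] + learn_rate * activations[i] * activations[j]
         for j in range(n)]
        for i in range(n)
    ]

# Two neurons with opposite activations: their shared weight decreases,
# while each neuron's self-weight increases.
start = [[0.0, 0.0], [0.0, 0.0]]
updated = hebb_update(start, [1.0, -1.0], learn_rate=0.5)
```

The function returns a new matrix instead of modifying the input, so successive steps can be compared.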
  • 15. MODULES:- The modules identified in the Optical Character Recognition system are as follows: Document Processing, Neural Network System Training, Document Recognition, Document Editing and Document Searching.
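The slides do not describe how the Document Searching module works. One common way to search recognized text, assumed here purely for illustration, is an inverted index that maps each word to the documents containing it. The function names and the dictionary-of-strings input are hypothetical.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Recognized text keyed by a document id (example data).
docs = {1: "Optical character recognition", 2: "Character sets and fonts"}
index = build_index(docs)
hits = search(index, "optical character")
```

This also illustrates the earlier point that converted text can be indexed, whereas the original scanned image cannot be searched by word.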
  • 16. DESIGN OF OCR:- The design of our OCR system is best explained with the following diagram. [Diagram labels: Scan, Store, Recognize, Editing, Searching; Document and users; Database.]
  • 17. OVERALL USECASE DIAGRAM:- [Use case diagram. Actors: end-user, end-user1, end-user2, administrator. Use cases: scan documents, store documents, Document processing, Document recognition, Document editing, Document modification, Document deletion, Trains the system. Two <<includes>> relationships are shown.]
  • 18. OVERALL CLASS DIAGRAM:- [Class diagram.] Document: docid : integer, docname : String, docsize : integer, doctype : String; getDocumentDetails(), scanDocument(), covertToImage(), storeImage(). Editor: cut(), copy(), paste(), new(), open(), find(). HelpFrame. HEntry: hLineClear(), vLineClear(), findBounds(). TrainingSet: inputCount : int, outputcount : int, trainingSetCount : int; setInputCount(), setOutputCount(), setTrainingSetCount(), setClassify(). MainScreen: editor(), helpFrame(), printedFrame(), handWrittenFrame(). Entry: recog : int, downSampleLeft : int, downSampleRight : int, downSampleTop : int, downSampleBottom : int; hLineClear(), hLineClearWithin(), vLineClear(), vLineClearWithin(). PrintedFrame: open_action(), train_action(), topen_action(), recogniseAll_action(). KohenNetwork: LearnMethod = 1 : int, LearnRate = 0.3 : double, quitError : double; copyWeights(), clearWeights(), winner(), normalizeInput(). Associations between the classes carry multiplicities of 1..* to 1 and 1..* to 1..*.
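Names such as findBounds(), hLineClear() and the downSample fields in the class diagram suggest that a character image is cropped to its inked area and reduced to a small fixed grid before it is passed to the network. The slides do not show that code, so the sketch below is only one plausible reading: the function names, the 0/1 list-of-rows image format and the 7 x 5 default grid are all assumptions.

```python
def find_bounds(image):
    """Return (top, bottom, left, right) of the inked area, or None if blank.

    `image` is a list of rows; each row is a list of 0 (blank) or 1 (ink).
    """
    rows = [r for r, row in enumerate(image) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    return rows[0], rows[-1], cols[0], cols[-1]

def downsample(image, grid_rows=7, grid_cols=5):
    """Crop to the inked area and reduce it to a grid_rows x grid_cols grid.

    A grid cell is 1 if any pixel in the region it covers is inked.
    """
    bounds = find_bounds(image)
    if bounds is None:
        return [[0] * grid_cols for _ in range(grid_rows)]
    top, bottom, left, right = bounds
    height, width = bottom - top + 1, right - left + 1
    grid = []
    for gr in range(grid_rows):
        r0 = top + gr * height // grid_rows
        r1 = max(r0 + 1, top + (gr + 1) * height // grid_rows)
        row = []
        for gc in range(grid_cols):
            c0 = left + gc * width // grid_cols
            c1 = max(c0 + 1, left + (gc + 1) * width // grid_cols)
            inked = any(image[r][c] for r in range(r0, r1) for c in range(c0, c1))
            row.append(int(inked))
        grid.append(row)
    return grid

# A hollow square drawn inside a 6 x 6 image with a blank border.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
bounds = find_bounds(image)
small = downsample(image, grid_rows=4, grid_cols=4)
```

Flattening the resulting grid row by row gives a fixed-length input vector regardless of how large the character was drawn or scanned.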
  • 19. DESIGN OF SCREENSHOTS FOR OCR:- The screenshots that describe the operations carried out by our system are as follows: Main Screen, Hand Written Recognition Screen, Scanned Document Recognition Screen, Training Screen, Recognition Screen, Editor Screen.
  • 20-32. [Screenshot slides; no text.]
  • 33. CONCLUSION:- The Grid infrastructure used in the implementation of the Optical Character Recognition system can be used efficiently to speed up the translation of image-based documents into structured documents that are easy to discover, search and process. Automated data entry by OCR is one of the most attractive labor-reducing technologies. The system recognizes new font characters easily and quickly. The information in the documents can be edited more conveniently, and the edited information can be reused as and when required. Extension to software other than editing and searching is a topic for future work.
  • 34. • Training and recognition speeds can be increased further, and the system can be made more user-friendly. • Many applications exist where it would be desirable to read handwritten entries. Reading handwriting is a very difficult task, considering the diversity that exists in ordinary penmanship. However, progress is being made.