Title: |
Machine recognition of Arabic text |
Document type: |
printed text |
Authors: |
Habib Goraine, Author; M. J. Usher, Thesis supervisor |
Publisher: |
University of Reading |
Year of publication: |
1991 |
Extent: |
164 leaves |
Physical details: |
ill. |
Format: |
30 cm. |
General note: |
Doctoral thesis: Mathematics: England, University of Reading: 1991
Bibliography: [8] leaves |
Languages: |
English (eng) |
Keywords: |
Arabic characters
Recognition
Preprocessing stage
Stroke separation
Feature extraction |
Decimal index: |
D001291 |
Abstract: |
Arabic characters are always in cursive script.
The language comprises 28 main characters and is written from right to left.
The form of a character is dependent on its position in a word, and dots are important in distinguishing between different characters.
In this research a syntactical method has been developed for the recognition of typewritten Arabic words at character level.
The work can be divided into two parts.
In the first part, isolated typewritten Arabic words were entered into an IBM PC via a camera and digitizer.
A preprocessing stage was applied to each isolated word, and thinning, stroke separation and feature extraction were performed.
Following this, strokes were classified into eleven primitives using an eight-direction code.
The recognition process involves primary and secondary classification.
The primary classifier uses a decision tree to give a decision on the character, or to indicate the presence of several characters or part of a character.
The secondary classifier combines strokes into characters and solves ambiguities between pairs and triplets of characters.
Finally, a postprocessing stage in the form of a dictionary is used to check the spelling of the word.
In the second part, printed texts were input to the computer's memory through a scanner in the form of a binary image.
A preprocessing stage was then performed where lines and words were separated.
Finally, the preprocessing and recognition techniques were performed on each isolated word. |