Handwritten Documents Text Line Segmentation based on Information Energy
Keywords: text line segmentation, text recognition, information energy, OCR
The first step in the text recognition process is text line segmentation. Only after text lines are correctly identified can the process proceed to the recognition of individual characters. This paper proposes a line segmentation algorithm that computes an information content level, called energy, for each pixel of the image and uses it to drive a seam carving procedure. The algorithm identifies text lines that follow the text more accurately, at the expected cost of additional computational overhead.
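The general idea of combining a per-pixel energy map with seam carving can be illustrated with a minimal sketch. The abstract does not define the information energy itself, so the sketch substitutes a simple stand-in (the number of ink pixels in a 3x3 neighbourhood). The function names `ink_density_energy` and `min_horizontal_seam` and the toy `page` image are hypothetical, chosen only for illustration. A low-energy seam running from left to right then marks a possible boundary between two text lines.

```python
def ink_density_energy(img):
    """Stand-in energy (an assumption, not the paper's definition):
    number of ink pixels (value 1) in the 3x3 neighbourhood of each pixel."""
    h, w = len(img), len(img[0])
    energy = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            energy[y][x] = sum(
                img[r][c]
                for r in range(max(y - 1, 0), min(y + 2, h))
                for c in range(max(x - 1, 0), min(x + 2, w))
            )
    return energy


def min_horizontal_seam(energy):
    """Return one row index per column for a lowest-energy left-to-right seam."""
    h, w = len(energy), len(energy[0])
    # cost[y][x]: cheapest total energy of any seam ending at pixel (y, x).
    cost = [[energy[y][0]] + [0] * (w - 1) for y in range(h)]
    back = [[0] * w for _ in range(h)]
    for x in range(1, w):
        for y in range(h):
            # A seam may move at most one row up or down per column.
            prev = min(range(max(y - 1, 0), min(y + 2, h)),
                       key=lambda r: cost[r][x - 1])
            cost[y][x] = energy[y][x] + cost[prev][x - 1]
            back[y][x] = prev
    # Start from the cheapest end point and follow the back-pointers.
    y = min(range(h), key=lambda r: cost[r][w - 1])
    seam = [0] * w
    for x in range(w - 1, -1, -1):
        seam[x] = y
        y = back[y][x]
    return seam


# Toy binary image: two fully inked "text lines" (rows 1 and 5) with a gap.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
seam = min_horizontal_seam(ink_density_energy(page))
print(seam)  # [3, 3, 3, 3, 3, 3]
```

In this toy case the seam stays on the middle row of the gap, the only row whose neighbourhood contains no ink. The dynamic program visits every pixel, which hints at the computational overhead mentioned above.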
ONLINE OPEN ACCESS: Access to the full text of each article and each issue is allowed for free under the Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
You are free to:
-Share: copy and redistribute the material in any medium or format;
-Adapt: remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
DISCLAIMER: The author(s) of each article appearing in International Journal of Computers Communications & Control is/are solely responsible for the content thereof; the publication of an article shall not constitute or be deemed to constitute any representation by the Editors or Agora University Press that the data presented therein are original, correct or sufficient to support the conclusions reached or that the experiment design or methodology is adequate.