Enhancing OCR Accuracy with Super Resolution – NLPIR Natural Language Processing and Information Retrieval Sharing Platform


Enhancing OCR Accuracy with Super Resolution



This semester, our lab, the Web Search Mining and Security Lab, plans to hold an academic seminar every Monday. At each session, a keynote speaker will present papers related to his or her research.


This week’s seminar is organized as follows:

  1. The seminar starts at 1 p.m. on Monday at Zhongguancun Technology Park, Building 5, 1306.
  2. The lecturer is Gang Wang, who will present the paper "Enhancing OCR Accuracy with Super Resolution".
  3. Yaofei Yang will give an introduction to deep learning.
  4. The seminar will be hosted by Qinghong Jiang.
  5. The paper for this seminar is attached; please download it in advance.

Everyone interested in this topic is welcome to join us. The abstract of this week's paper follows.

Enhancing OCR Accuracy with Super Resolution

Ankit Lat, C. V. Jawahar


The accuracy of OCR is often marred by the poor quality of the input document images. Generally, this performance degradation is attributed to the resolution and quality of scanning. This calls for special efforts to improve the quality of document images before passing them to the OCR engine. One compelling option is to super-resolve these low-resolution document images before passing them to the OCR engine.

In this work, we address this problem by super-resolving document images using a Generative Adversarial Network (GAN). We propose a super-resolution-based preprocessing step that can enhance the accuracy of OCR engines (including commercial ones). Our method is especially suited for printed document images. We validate its utility on a wide variety of document images (where fonts, styles, and languages vary) without any preprocessing step to adapt across situations. Our experiments show an improvement of up to 21% in OCR accuracy on test images scanned at low resolution. One immediate application is enhancing the recognition of historic documents that have been scanned at low resolutions.
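
For readers new to the idea, the preprocessing step described in the abstract can be pictured as a thin wrapper placed in front of any OCR engine. The following is only a minimal sketch under stated assumptions, not the authors' method: `upscale_bilinear` is a hypothetical stand-in that uses plain bilinear interpolation where the paper uses a trained GAN, the `ocr` argument is any caller-supplied function, and images are represented as nested lists of grayscale values purely to keep the example self-contained.

```python
from typing import Callable, List, Optional

Image = List[List[float]]  # grayscale pixels, row-major (assumed representation)


def upscale_bilinear(img: Image, factor: int) -> Image:
    """Hypothetical stand-in super-resolver: bilinear interpolation, NOT a GAN."""
    h, w = len(img), len(img[0])
    out: Image = []
    for y in range(h * factor):
        # Map the output row back to a fractional source row, clamped to the image.
        sy = min(y / factor, h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(w * factor):
            sx = min(x / factor, w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Blend the four neighbouring source pixels.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def recognize(
    img: Image,
    ocr: Callable[[Image], str],
    super_resolve: Optional[Callable[[Image], Image]] = None,
) -> str:
    """Run OCR, optionally super-resolving the image first."""
    if super_resolve is not None:
        img = super_resolve(img)
    return ocr(img)
```

Because the OCR engine is passed in as a function, the same wrapper could sit in front of an open-source or a commercial engine, and a trained super-resolution model could replace the interpolation stand-in without changing the calling code.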


About the Author: nlpvv