Automatically Rotating Scanned Text Images with Tesseract OCR

Problem

If you've ever had a batch of scanned images with text, you know how tedious it can be to manually rotate each one to the correct orientation. This process can be especially frustrating when dealing with a large number of images. Wouldn't it be great if there were a way to automatically rotate these images so that the text is always upright and readable?

Solution

To solve this problem, I developed a simple script that automatically detects the correct orientation of text in scanned images using Optical Character Recognition (OCR) and dictionary matching. Here's how it works:

  1. OCR Parsing with Tesseract: I used Tesseract, a popular open-source OCR engine, to extract text from the images. Tesseract is powerful and versatile, making it an excellent choice for this task.

  2. Dictionary Matching: I created a short list of words that occur frequently in the scanned text. This list acts as a reference for deciding whether an orientation is correct. My example includes only six words, but you can easily expand the list to improve accuracy.

  3. Rotation and Validation: The script first runs OCR on the image as it is, then on copies rotated by 90°, 270°, and 180°, in that order. After each pass it compares the recognized text against the dictionary. Because OCR is not 100% accurate, the matching tolerates minor deviations (up to one edit per matched word). The first orientation that yields more than two dictionary matches is accepted as upright, and the original image is overwritten with that rotation.
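
The search order in step 3 can be sketched on its own, separate from the OCR details. This is only an illustration of the control flow: the names first_good_rotation and stub_check are made up for this example, and the checker is a stand-in for "rotate with convert, run tesseract, match against the dictionary".

```shell
#!/bin/sh
# Sketch of the search order only. The checker command is a hypothetical
# stand-in for the real rotate + OCR + dictionary check; it receives the
# angle and must exit 0 when the text looks upright at that angle.

first_good_rotation() {
  check_cmd=$1
  for angle in 0 90 270 180; do
    if "$check_cmd" "$angle"; then
      echo "$angle"
      return 0
    fi
  done
  return 1   # no orientation passed the check
}

# Stub checker for demonstration: pretends only 270 degrees is readable.
stub_check() {
  [ "$1" -eq 270 ]
}

first_good_rotation stub_check   # prints 270
```

Stopping at the first passing angle keeps the number of OCR runs low, at the cost of never comparing match counts between orientations.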

How to Use

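To get started, copy the three files listed below into your ~/bin directory, make them executable, and make sure ~/bin is on your PATH. Save the Perl script as ~/bin/recognize_good_rotation.pl, because tesseract_rotate calls it by that path. Then change to the directory that contains your images and run the tesseract_rotate_all script. It processes every file in the directory and overwrites each image with the orientation in which the text was recognized, so work on copies if you need to keep the originals.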

Dependencies

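Before running the scripts, you need Tesseract with the language data for your documents (the scripts pass -l slk, i.e. Slovak; adjust this and the word list to your text), ImageMagick for the convert command, and the Perl module re::engine::TRE, a regular expression engine that handles approximate matching, which is essential for dealing with OCR inaccuracies.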
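
A quick way to confirm that everything is in place is a small check script. The sketch below is not part of the original files; the function name missing_tools is made up, and the language check assumes that your Tesseract build supports the --list-langs option and prints one language code per line.

```shell
#!/bin/sh
# Print the name of every command from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools tesseract convert perl)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names end up on one line
  echo "missing commands:" $missing
fi

# The Perl module is checked separately: loading it fails if it is not installed.
if command -v perl >/dev/null 2>&1 && ! perl -Mre::engine::TRE -e 1 2>/dev/null; then
  echo "missing Perl module: re::engine::TRE"
fi

# Assumption: --list-langs prints one language code per line.
if command -v tesseract >/dev/null 2>&1 && ! tesseract --list-langs 2>&1 | grep -qx slk; then
  echo "tesseract language data 'slk' not found"
fi
```

If the script prints nothing, all three commands, the Perl module, and the slk language data were found.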

recognize_good_rotation

       


#use lib "/root/perl5/lib";
# @author Miroslav Bodis 2014

use strict; 
use warnings;

my $file = shift;
my $rotation = shift;
my $debug_mode = shift;

my $find = 0;

if (!defined $debug_mode){
 $debug_mode = 0; 
 # $debug_mode = 1; # TODO use for more details 
}

if ($debug_mode == 1){ 
 print "file:" . $file."\n";
 print $rotation . "\n";
 print "debug mode: " . $debug_mode."\n";
}

my @recognize_words = ('then', 'change', 'when', 'over', 'suddenly', 'another');

open(my $fh, "<", $file) or die "cannot open file '$file': $!";

while(<$fh>)  {
 chomp;
 
 my $line = $_;
 {
  use re::engine::TRE max_cost => 1;

  foreach (@recognize_words) {

   if ($line =~ /$_/i) {
    
    $find += 1;

    if ($debug_mode == 1){ 
     print "match word: " . $_ . "\n";
    }    
   }
  }
 }
}

close $fh;

if ($find > 2){
 exit 0;
}

exit 1;
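
To judge whether a word list suits your documents before processing a whole batch, it helps to count hits in one known-good OCR output. The sketch below mirrors the counting logic of the script above (one hit per word per matching line) but uses plain grep, so it matches exactly and may report fewer hits than the TRE-based script on noisy text. The function name count_dict_hits and the sample sentences are made up for illustration.

```shell
#!/bin/sh
# Count dictionary hits in a text file: one hit per word per matching line.
# Exact, case-insensitive matching only (no approximate matching).
# Usage: count_dict_hits FILE WORD...
count_dict_hits() {
  file=$1
  shift
  [ -r "$file" ] || return 2
  total=0
  for word in "$@"; do
    # grep -c prints the number of matching lines; -i ignores case,
    # -F treats the word as a literal string. grep exits 1 on zero
    # matches, hence the "|| true".
    n=$(grep -c -i -F -- "$word" "$file" || true)
    total=$((total + n))
  done
  echo "$total"
}

# Demonstration with a made-up three-line sample.
sample=$(mktemp)
printf '%s\n' 'Then the wind began to change.' \
              'Suddenly another door opened.' \
              'nothing here' > "$sample"
count_dict_hits "$sample" then change when over suddenly another   # prints 4
rm -f "$sample"
```

A count above 2 on upright text, and at or below 2 on the same page rotated, suggests the list works with the threshold used in the Perl script.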




tesseract_rotate

       

#! /bin/sh

# @author Miroslav Bodis 2014


# exactly one argument is required: the image to rotate
if [ -z "$1" ]
then
 echo "
 @author Miroslav Bodis 2014
 
 #   script to autorotate an image of printed text
 # - tries all 4 rotations of the input image (0, 90, 270, 180)
 # - runs tesseract on the current rotation
 # - uses your dictionary to find words in the OCR text (with some tolerance - TRE max_cost => 1)
 #   - see log for results
 #   - TODO: copy script to your bin folder e.g.: \"~/bin/recognize_good_rotation.pl\"
 # \$1 -> \"input_image\"" 
 exit 1
fi;


help_rotated_img="rotation_help.jpg"
help_ocr_out="output_ocr"
help_ocr_out_txt="$help_ocr_out.txt"
find=1


# 0 - rotation
echo "image $1 try rotation 0"
tesseract -l slk "$1" "$help_ocr_out"
perl ~/bin/recognize_good_rotation.pl "$help_ocr_out_txt" 'rotation 0'
find=$?


# 90 - rotation clockwise
if [ $find -eq 1 ] 
then

 echo "image $1 try rotation 90"
 convert "$1" -rotate 90 -quality 100 "$help_rotated_img"
 tesseract -l slk "$help_rotated_img" "$help_ocr_out"
 perl ~/bin/recognize_good_rotation.pl "$help_ocr_out_txt" 'rotation 90'
 
 find=$?
fi;


# 270 - rotation clockwise
if [ $find -eq 1 ] 
then
 
 echo "image $1 try rotation 270"
 convert "$1" -rotate 270 -quality 100 "$help_rotated_img"
 tesseract -l slk "$help_rotated_img" "$help_ocr_out"
 perl ~/bin/recognize_good_rotation.pl "$help_ocr_out_txt" 'rotation 270'

 find=$?
fi;

# 180 - rotation clockwise
if [ $find -eq 1 ] 
then
 
 echo "image $1 try rotation 180"
 convert "$1" -rotate 180 -quality 100 "$help_rotated_img"
 tesseract -l slk "$help_rotated_img" "$help_ocr_out"
 perl ~/bin/recognize_good_rotation.pl "$help_ocr_out_txt" 'rotation 180'

 find=$? 
fi;


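if [ $find -ne 0 ]
then
 echo ">>>>>>>>>>>>>>>>> image $1 NOT ROTATED, please update dictionary ! <<<<<<<<<<<<<<<<"
elif [ -f "$help_rotated_img" ]
then
 echo "image $1 ROTATED"
 # replace the original with the correctly rotated copy
 cp "$help_rotated_img" "$1"
else
 # rotation 0 already matched, so no rotated copy was created
 echo "image $1 already upright"
fi;

# -f: do not complain if a temporary file was never created
rm -f "$help_rotated_img" "$help_ocr_out_txt"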


tesseract_rotate_all

       

#! /bin/sh

# @author Miroslav Bodis 2014
# move to current folder with pictures and run "tesseract_rotate_all"

for f in ./*
do 
 # skip directories and anything else that is not a regular file
 [ -f "$f" ] || continue
 echo "--- --- --- --- START FILE $f" 
 tesseract_rotate "$f"
done

echo "finished"
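
The loop above hands every file in the directory to tesseract_rotate. Since tesseract_rotate always writes its rotated copy as rotation_help.jpg and copies that over the input, it may be safer to restrict the batch to JPEG files. The sketch below shows one way to filter by file name; the helper name is_image_name and the extension list are assumptions for illustration, and the echo is a placeholder for the real call.

```shell
#!/bin/sh
# Return success if the file name ends in a JPEG extension.
# The extension list is an assumption; adjust it to your scans.
is_image_name() {
  case $1 in
    *.jpg|*.jpeg|*.JPG|*.JPEG) return 0 ;;
    *) return 1 ;;
  esac
}

for f in ./*; do
  [ -f "$f" ] || continue          # skip directories and an unmatched glob
  is_image_name "$f" || continue   # skip everything that is not a JPEG
  echo "would process: $f"         # replace with: tesseract_rotate "$f"
done
```

Running it with the echo in place first shows which files would be touched before anything is overwritten.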

