Sunday, March 27, 2022

[FIXED] Not getting correct results from pytesseract

Labels: image, ocr, python, python-tesseract

Issue

I am trying to run OCR in Python with pytesseract, but the output is not correct. Here is the code. I also tried the original image and a grayscale version, but neither gave a usable result.

from PIL import Image
import pytesseract

def convert_to_monochrome(image):
    pixels = image.load()
    for i in range(image.size[0]): # for every pixel:
        for j in range(image.size[1]):
            r, g, b = pixels[i, j]
            if r > 200 and g > 200 and b > 200:
                pixels[i, j] = (255, 255, 255)
            else:
                pixels[i, j] = (0, 0, 0)
    return image

def interpret_chips(image):
    #image = image.resize((image.size[0] * 10, image.size[1] * 10), Image.ANTIALIAS)
    #image = image.convert("LA")
    #image.show()
    _image = convert_to_monochrome(image)
    _image.show()
    _image.save("chips.jpg")
    config = "--psm 7 -c tessedit_char_whitelist=0123456789KMT"
    rank_string = pytesseract.image_to_string(_image, config=config)  # expensive
    return _image, rank_string

for i in range(1, 6):
    print(i)
    img = Image.open("temp/sample" + str(i) + ".jpg")
    img, text = interpret_chips(img)
    print(text)
    img.save("temp/monochrome" + str(i) + ".jpg")

Thanks for your help

I am attaching some of the original images that give wrong results. The preprocessed images are the output of the convert_to_monochrome function defined above; please have a look. The text can look like 4, 400, 4000, 459K, 29M, etc. The results I get are far from the actual text.
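Since the expected text has a fixed shape (digits with an optional K, M or T suffix), the OCR output can be validated and converted to a number after recognition. The following is only a sketch: parse_amount is a hypothetical helper, and the multipliers for K, M and T are assumptions, because the post does not say what the suffixes stand for.

```python
import re

# Assumed meanings of the whitelisted suffixes; the post does not define them.
SUFFIX_MULTIPLIERS = {"K": 10**3, "M": 10**6, "T": 10**12}

# One or more digits followed by at most one suffix letter.
_AMOUNT_RE = re.compile(r"(\d+)([KMT]?)")


def parse_amount(text):
    """Convert OCR output such as '459K' to an int, or return None if it does not fit."""
    cleaned = "".join(text.split())  # drop spaces and the trailing newline
    match = _AMOUNT_RE.fullmatch(cleaned)
    if match is None:
        return None
    digits, suffix = match.groups()
    return int(digits) * SUFFIX_MULTIPLIERS.get(suffix, 1)
```

A None result is a cheap signal that the recognition went wrong for that image, for example when the string contains characters outside the expected pattern.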

Solution

The problem is that Tesseract expects dark text on a light background, and your preprocessed image is the opposite: light text on a dark background. Inverting the preprocessed image fixes this. The code below worked for me. It swaps the two pixel values in convert_to_monochrome and also uses --psm 6 instead of --psm 7:

from PIL import Image
import pytesseract


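def convert_to_monochrome(image):
    # Make sure every pixel is an (r, g, b) tuple, even for grayscale or RGBA input
    image = image.convert("RGB")
    pixels = image.load()
    for i in range(image.size[0]):  # for every column
        for j in range(image.size[1]):  # for every row
            r, g, b = pixels[i, j]
            if r > 200 and g > 200 and b > 200:
                # near-white pixels (the text) become black
                pixels[i, j] = (0, 0, 0)
            else:
                # everything else becomes a white background
                pixels[i, j] = (255, 255, 255)
    return image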


def interpret_chips(image):
    #image = image.resize((image.size[0] * 10, image.size[1] * 10), Image.ANTIALIAS)
    #image = image.convert("LA")
    #image.show()
    _image = convert_to_monochrome(image)
    _image.show()
    _image.save("chips.jpg")
    config = "--psm 6 -c tessedit_char_whitelist=0123456789KMT"
    rank_string = pytesseract.image_to_string(_image, config=config)  # expensive
    return _image, rank_string


img = Image.open("orig.jpg")
img, text = interpret_chips(img)
print(text)

orig.jpg:

text is 23.000,
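The same inverted thresholding can also be written without the per-pixel Python loop. This is a sketch rather than the answer's own code: dark_text_on_light is a hypothetical name, the default threshold of 200 is copied from the code above, and the result is a single-channel "L" image instead of RGB.

```python
from PIL import Image, ImageChops


def dark_text_on_light(image, threshold=200):
    """Return an "L" image where pixels with R, G and B all above threshold are black."""
    r, g, b = image.convert("RGB").split()
    # Per-pixel minimum of the three channels: it exceeds the threshold
    # only when every channel does, matching the r/g/b check above.
    darkest = ImageChops.darker(ImageChops.darker(r, g), b)
    # Near-white pixels (the text) -> 0 (black); everything else -> 255 (white).
    return darkest.point(lambda v: 0 if v > threshold else 255)
```

The returned image can be passed to pytesseract.image_to_string with the same config string as in the answer.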

Answered By - Ravi Maurya

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
