Issue
I am trying to do OCR in Python but I am not getting the correct output. Here is the code. I tried the original image and a grayscale version as well, but neither gives a usable result.
from PIL import Image
import pytesseract


def convert_to_monochrome(image):
    pixels = image.load()
    for i in range(image.size[0]):  # for every pixel:
        for j in range(image.size[1]):
            r, g, b = pixels[i, j]
            if r > 200 and g > 200 and b > 200:
                pixels[i, j] = (255, 255, 255)
            else:
                pixels[i, j] = (0, 0, 0)
    return image


def interpret_chips(image):
    #image = image.resize((image.size[0] * 10, image.size[1] * 10), Image.ANTIALIAS)
    #image = image.convert("LA")
    #image.show()
    _image = convert_to_monochrome(image)
    _image.show()
    _image.save("chips.jpg")
    config = "--psm 7 -c tessedit_char_whitelist=0123456789KMT"
    rank_string = pytesseract.image_to_string(_image, config=config)  # expensive
    return _image, rank_string


for i in range(1, 6):
    print(i)
    img = Image.open("temp/sample" + str(i) + ".jpg")
    img, text = interpret_chips(img)
    print(text)
    img.save("temp/monochrome" + str(i) + ".jpg")
Thanks for your help
I am attaching some of the original images that give wrong results. The preprocessed images are obtained by applying the convert_to_monochrome function defined above; please have a look. The text can look like 4, 400, 4000, 459K, 29M, etc. I am getting very strange results.
Solution
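The problem is that Tesseract expects an image with dark text on a light background. The preprocessed image in your case is just the opposite, so you can simply invert it. In the code below, the two assignments in convert_to_monochrome are swapped so that bright pixels become black and everything else becomes white. This worked for me: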
from PIL import Image
import pytesseract


def convert_to_monochrome(image):
    # Modifies the image in place and returns it.
    pixels = image.load()
    for i in range(image.size[0]):  # for every pixel:
        for j in range(image.size[1]):
            r, g, b = pixels[i, j]
            if r > 200 and g > 200 and b > 200:
                pixels[i, j] = (0, 0, 0)  # bright (text) pixels become black
            else:
                pixels[i, j] = (255, 255, 255)  # everything else becomes white
    return image


def interpret_chips(image):
    #image = image.resize((image.size[0] * 10, image.size[1] * 10), Image.ANTIALIAS)
    #image = image.convert("LA")
    #image.show()
    _image = convert_to_monochrome(image)
    _image.show()
    _image.save("chips.jpg")
    # --psm 6: assume a single uniform block of text
    config = "--psm 6 -c tessedit_char_whitelist=0123456789KMT"
    rank_string = pytesseract.image_to_string(_image, config=config)  # expensive
    return _image, rank_string


img = Image.open("orig.jpg")
img, text = interpret_chips(img)
print(text)
The value of text is 23.000,
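For larger images, the per-pixel loop above can be slow. The same "bright pixels become black, everything else becomes white" rule can probably be expressed with Pillow's channel operations instead. This is only a sketch, not the code from the answer: the function name to_dark_on_light and the threshold parameter are made up for illustration, and it returns a new single-channel ("L") image rather than modifying the input in place.

```python
from PIL import Image, ImageChops


def to_dark_on_light(image, threshold=200):
    """Return a new "L" image: bright pixels black, everything else white.

    Hypothetical helper; mirrors the r > 200 and g > 200 and b > 200 test
    from the loop-based version.
    """
    r, g, b = image.convert("RGB").split()
    # Per pixel, take the smallest of the three channels. It is above the
    # threshold only when all three channels are above it.
    darkest = ImageChops.darker(ImageChops.darker(r, g), b)
    # Bright pixels (the text) map to 0 (black), all others to 255 (white).
    return darkest.point(lambda v: 0 if v > threshold else 255)
```

The returned image could then be passed to pytesseract.image_to_string in place of _image. Whether it improves recognition on your samples is something you would need to check.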
Answered By - Ravi Maurya