Issue
I'm using Pytesseract to get digits from this image (from Clash of Clans) but I seem to be doing something wrong :(
import cv2
import pytesseract
image = cv2.imread('x.png',0)
thresh = cv2.threshold(image, 150, 255, cv2.THRESH_BINARY_INV)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
result = 255 - close
data = pytesseract.image_to_string(result, lang='eng',config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.imshow('close', close)
cv2.waitKey()
Output:
Uke awl BOB C
Image:
Close:
Solution
I see no point in dealing with the whole screenshot if you only care about the score. It is located in a portion of the screen that is known in advance, so we can easily crop the image to just that area. Let's assume the x.png your code reads is the first screenshot you posted; then the rectangle at 107x18 of size 230x25 px is what I'd be interested in dealing with, so I will crop the screenshot to that area:
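As a minimal sketch of just that step (variable names are my own, the coordinates are the ones given above), the crop is plain NumPy slicing on the grayscale image:
import cv2
# Load the screenshot in grayscale, as in the question's code.
image = cv2.imread('x.png', 0)
# Region of interest: top-left corner at (107, 18), 230 px wide, 25 px tall.
x1, y1, w, h = 107, 18, 230, 25
# NumPy slicing is rows (y) first, then columns (x).
cropped = image[y1:y1 + h, x1:x1 + w]
cv2.imshow('cropped', cropped)
cv2.waitKey()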
The next step is thresholding, and I think your threshold of 150 is too low, as it still leaves some unwanted pixels:
so I bumped it to 165, which gave me a clear picture:
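Continuing the sketch above (same cropped image), the only change relative to the question's code is the threshold value:
# cv2.THRESH_BINARY_INV maps pixels brighter than 165 to 0 and the rest to 255.
thresh = cv2.threshold(cropped, 165, 255, cv2.THRESH_BINARY_INV)[1]
cv2.imshow('thresh', thresh)
cv2.waitKey()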
When I OCR it, the result is as expected:
893
I tried adjusting the image cropping to make Tesseract recognize it as a single line of text, but that led to inaccurate results, so I gave up on that.
I am also not sure whether it actually recognized two lines or whether Tesseract always appends \n to each returned text line. If so, then combined with the newline print() adds by default, it would produce the output I am seeing, but I did not investigate further and simply corrected it in result post-processing:
print(data.split()[0].strip())
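If you do want to check whether a trailing newline is actually there, a quick debugging sketch is to print the raw representation of the returned string:
# repr() makes any trailing '\n' characters visible in the output.
print(repr(data))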
Note: Some posts suggest that Page Segmentation Mode 11 (--psm 11) could be more effective for the kind of content we're dealing with, which is sparse and lacks a standard reading order. This mode doesn't make assumptions about text layout, giving Tesseract greater flexibility in detecting characters or numbers that are spaced out, which can make it more effective than modes designed for structured text. However, I haven't noticed any significant difference in the results when using mode 11, though it's worth mentioning that these observations are based on a single test image, which is not sufficient for drawing definitive conclusions. Keep this in mind in case you encounter incorrect results with different score values.
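If you want to try it, only the config string changes; for example, using the same preprocessed close image as in the full code below:
# Page segmentation mode 11: sparse text, no assumptions about layout.
data = pytesseract.image_to_string(close, lang='eng', config='--psm 11')
print(data)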
Full code:
import cv2
import pytesseract
# Load the screenshot in grayscale.
image = cv2.imread('x.png', 0)
# Crop to the score area: top-left corner at (107, 18), 230x25 px.
x1, y1 = 107, 18
w, h = 230, 25
cropped = image[y1:(y1+h), x1:(x1+w)]
# Threshold at 165 and invert, then close small gaps with a 2x2 kernel.
thresh = cv2.threshold(cropped, 165, 255, cv2.THRESH_BINARY_INV)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
# OCR the cleaned-up crop as a single uniform block of text (--psm 6).
data = pytesseract.image_to_string(close, lang='eng', config='--psm 6')
print(data)
Answered By - Marcin Orlowski