Issue
I am trying to download a captcha image with Selenium, but the image I download is different from the one shown in the browser. If I download the image again without changing anything in the browser, I get yet another one.
Any thoughts?
from selenium import webdriver
from selenium.webdriver.common.by import By
import urllib.request  # "import urllib" alone does not guarantee urllib.request is loaded
driver = webdriver.Firefox()
driver.get("http://sistemas.cvm.gov.br/?fundosreg")
# Change frame.
driver.switch_to.frame("Main")
# Download image/captcha.
img = driver.find_element(By.XPATH, ".//*[@id='trRandom3']/td[2]/img")
src = img.get_attribute('src')
urllib.request.urlretrieve(src, "captcha.jpeg")
Solution
The image's src URL returns a new random captcha every time it is requested, so fetching it with urllib gives you a different image from the one the browser is displaying.
Instead of downloading the file from the image's src, you can take a screenshot of the page and crop it to the captcha, which gives you exactly the image shown in the browser. This requires Pillow (pip install Pillow); use it in the way described in this answer:
from PIL import Image
from selenium import webdriver
from selenium.webdriver.common.by import By

def get_captcha(driver, element, path):
    # Position and size of the captcha element, relative to its frame.
    location = element.location
    size = element.size
    # Save a screenshot of the entire page.
    driver.save_screenshot(path)
    # Open the screenshot with PIL.
    image = Image.open(path)
    # The + 140 shifts the crop box down; presumably it compensates for the
    # frame's vertical position on this particular page.
    left = location['x']
    top = location['y'] + 140
    right = location['x'] + size['width']
    bottom = location['y'] + size['height'] + 140
    image = image.crop((left, top, right, bottom))  # crop to the captcha
    # The screenshot is a PNG and may carry an alpha channel, which JPEG cannot store.
    image.convert('RGB').save(path, 'jpeg')  # save the cropped image

driver = webdriver.Firefox()
driver.get("http://sistemas.cvm.gov.br/?fundosreg")
# Change frame.
driver.switch_to.frame("Main")
# Capture the image/captcha.
img = driver.find_element(By.XPATH, ".//*[@id='trRandom3']/td[2]/img")
get_captcha(driver, img, "captcha.jpeg")
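The crop in get_captcha combines the element's location and size with a hard-coded vertical shift of 140. As a rough sketch of that arithmetic on its own, the helper below (crop_box and its frame_offset parameter are hypothetical names, not part of Selenium or Pillow) builds the (left, top, right, bottom) tuple that Image.crop expects. It assumes the shift stands for the frame's position on the page, which the answer does not state explicitly.

```python
def crop_box(location, size, frame_offset=(0, 0)):
    """Return (left, top, right, bottom) for Image.crop.

    location and size are dicts shaped like Selenium's element.location
    and element.size. frame_offset is an assumed (dx, dy) shift for the
    frame that contains the element.
    """
    dx, dy = frame_offset
    left = location['x'] + dx
    top = location['y'] + dy
    right = left + size['width']
    bottom = top + size['height']
    return (left, top, right, bottom)


# Made-up numbers; the 140 mirrors the vertical shift used in get_captcha.
print(crop_box({'x': 10, 'y': 20}, {'width': 180, 'height': 60}, frame_offset=(0, 140)))
# prints (10, 160, 190, 220)
```

If the cropped image comes out shifted on another page or window size, the offset is the value to adjust.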
(Note that I've changed the code a little bit so that it works in your case.)
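A possible alternative that avoids cropping: Selenium's Python bindings expose a screenshot_as_png property on elements, which returns the rendered element as PNG bytes. The sketch below is not part of the original answer; save_element_png is a hypothetical helper name, and whether the result matches the captcha exactly depends on the browser driver in use.

```python
def save_element_png(element, path):
    """Write the element's rendered pixels to path as a PNG file.

    element is expected to be a Selenium WebElement (anything exposing a
    screenshot_as_png bytes property works). Returns the bytes written.
    """
    png_bytes = element.screenshot_as_png
    with open(path, "wb") as fh:
        fh.write(png_bytes)
    return len(png_bytes)


# Intended usage with the driver and XPath from the code above (not run here):
# img = driver.find_element(By.XPATH, ".//*[@id='trRandom3']/td[2]/img")
# save_element_png(img, "captcha.png")
```

Because the pixels come from what the browser has already rendered, the src URL is not requested again, so no new captcha is generated.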
Answered By - Remi Guan