Issue
I am trying to log into a website. When I print g.text, I do not get the page I expect; instead I get a Cloudflare page that says 'Checking your browser before accessing'.
import requests
import time
s = requests.Session()
s.get('https://www.off---white.com/en/GB/')
headers = {'Referer': 'https://www.off---white.com/en/GB/login'}
payload = {
    'utf8': '✓',
    'authenticity_token': '',
    'spree_user[email]': '[email protected]',
    'spree_user[password]': 'PASSWORD',
    'spree_user[remember_me]': '0',
    'commit': 'Login'
}
r = s.post('https://www.off---white.com/en/GB/login', data=payload, headers=headers)
print(r.status_code)
g = s.get('https://www.off---white.com/en/GB/account')
print(g.status_code)
print(g.text)
Why is this occurring when I have set the session?
Solution
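This happens because the site sits behind Cloudflare's anti-bot page, also known as "I'm Under Attack Mode" (IUAM). Setting up a session does not help here, because the interstitial is served before your requests ever reach the site itself.
Getting past this check on your own is difficult, because Cloudflare changes its techniques periodically. Currently, it checks whether the client supports JavaScript, which can be spoofed.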
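Before changing anything, it can help to confirm that the response really is the Cloudflare interstitial and not a failed login. The following is a minimal sketch: the helper name looks_like_cloudflare_challenge is hypothetical, and the marker string is simply the text quoted in the question, so it may not match other versions of the challenge page.

```python
# Marker text taken from the interstitial quoted in the question.
# Assumption: other versions of the challenge page may word this differently.
CHALLENGE_MARKER = "Checking your browser before accessing"


def looks_like_cloudflare_challenge(body):
    """Return True if the response body contains the challenge marker text."""
    # Compare case-insensitively so small markup differences do not matter.
    return CHALLENGE_MARKER.lower() in body.lower()
```

With the code from the question, looks_like_cloudflare_challenge(g.text) would tell you whether the account request was stopped by the challenge page.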
I would recommend using the cfscrape module for bypassing this.
To install it, use pip install cfscrape. You'll also need to install Node.js.
You can pass a requests session into create_scraper() like so:
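import cfscrape
import requests
session = requests.Session()
session.headers = ...
# The returned scraper is used like a requests session: scraper.get(), scraper.post()
scraper = cfscrape.create_scraper(sess=session)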
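To tie this back to the question, here is a rough sketch of the same login flow written so that the session is passed in. The function name fetch_account_page and its parameters are hypothetical, not part of cfscrape or requests; the URL layout (login and account under the base URL) is copied from the question. It only helps as long as cfscrape can solve the challenge the site currently serves.

```python
def fetch_account_page(session, base_url, payload):
    """Log in with the given session and return the account page response.

    `session` is any object with requests-style get() and post() methods,
    for example the scraper returned by cfscrape.create_scraper().
    """
    login_url = base_url + "login"
    # Initial GET lets the session collect cookies before the login POST.
    session.get(base_url)
    login = session.post(login_url, data=payload, headers={"Referer": login_url})
    # Stop early on an HTTP error instead of fetching the account page anyway.
    login.raise_for_status()
    return session.get(base_url + "account")


# Usage (needs cfscrape, Node.js and network access):
#   import cfscrape
#   scraper = cfscrape.create_scraper()
#   g = fetch_account_page(scraper, 'https://www.off---white.com/en/GB/', payload)
#   print(g.status_code)
```

Passing the session in keeps the flow identical for a plain requests.Session and for the cfscrape scraper, so the only change to the original script is which object makes the requests.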
Answered By - Jeremiah