Issue
Here is the code:
from requests import get
from bs4 import BeautifulSoup
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36'}
url = 'https://business.inquirer.net/category/latest-stories/page/10'
response = get(url)
print(response.text[:500])
html_soup = BeautifulSoup(response.text, 'html.parser')
type(html_soup)
And this is the result I got:
<html>
<head><title>403 Forbidden</title></head>
<body bgcolor="white">
<center><h1>403 Forbidden</h1></center>
<hr><center>nginx</center>
</body>
</html>
I have read that adding a header should fix this error, so I tried the header I copied from DevTools while inspecting the site, but it doesn't solve my problem. Please help.
Solution
Many sites block requests that don't send browser-like headers. You already define a headers dict in your code, but you never pass it to get(), so the request goes out without it. Pass it via the headers keyword argument:
response = get(url, headers=headers)
See the requests docs for more info: http://docs.python-requests.org/en/master/user/quickstart/
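To make the fix concrete, here is a small sketch using the headers dict and URL from the question. It builds the request without actually sending it (via requests' PreparedRequest), just to show that the User-Agent is attached once headers= is passed; sending it for real would be response = requests.get(url, headers=headers).

```python
import requests

# Headers and URL copied from the question.
headers = {
    'User-Agent': ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                   'AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/68.0.3440.106 Safari/537.36')
}
url = 'https://business.inquirer.net/category/latest-stories/page/10'

# Prepare the request without sending it, to inspect what would go out.
prepared = requests.Request('GET', url, headers=headers).prepare()

# The User-Agent is now part of the outgoing request.
print(prepared.headers['User-Agent'])

# To actually fetch and parse the page, you would do:
# response = requests.get(url, headers=headers)
# html_soup = BeautifulSoup(response.text, 'html.parser')
```

Note that a 403 can persist even with a browser User-Agent if the site blocks scrapers by other means, so treat this as the first thing to check, not a guaranteed fix.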
Answered By - Uralan