Issue
I have a URL list of 5 URLs within a .txt file named as URLlist.txt.
https://www.w3schools.com/php/php_syntax.asp
https://www.w3schools.com/php/php_comments.asp
https://www.w3schools.com/php/php_variables.asp
https://www.w3schools.com/php/php_echo_print.asp
https://www.w3schools.com/php/php_datatypes.asp
I need to parse all the HTML content within the 5 URLs one by one for further processing.
My current code to parse an individual URL -
import requests from bs4
import BeautifulSoup as bs #HTML parsing using beatuifulsoup
r = requests.get("https://www.w3schools.com/whatis/whatis_jquery.asp")
soup = bs(r.content)
print(soup.prettify())
Solution
Create a list of your links
with open('test.txt', 'r') as f:
urls = [line.strip() for line in f]
Then u can loop your parse
for url in urls:
r = requests.get(url)
...
Answered By - Sharkerz
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.