Issue
I am trying to extract some data from a website, specifically from one of its scripts. I can fetch the whole page, but I am stuck on how to get the data from inside the script tags. Here is how I am building the soup:
import requests
from bs4 import BeautifulSoup

searchf = input("Enter the product you want: ")
# Search results page for the entered product.
url = f"https://www.daraz.com.np/catalog/?q={searchf}&_keyori=ss&from=input&spm=a2a0e.searchlist.search.go.12ec45adZnn1MP"
r = requests.get(url)
contents = r.content
# Parse the raw HTML and print it in readable form.
soup = BeautifulSoup(contents, 'html.parser')
print(soup.prettify())
This is the page that I want to scrape, and I only want to get the names of the products listed there: https://www.daraz.com.np/catalog/?q=cars&_keyori=ss&from=input&spm=a2a0e.searchlist.search.go.3b834ce4JPBUHm
Solution
The data is also loaded dynamically by JavaScript. Open the browser's developer tools and go to Network tab > XHR/Fetch > Headers to see the API URL; the Preview tab then shows the returned data. The total item count is 4080 and each page contains 40 items. I've implemented pagination through the query-string parameters, which are sent along with the API URL as params.
Code:
import requests

# Browser-like User-Agent sent with every request.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36'
}
api_url = 'https://www.daraz.com.np/catalog/'
# Query-string parameters sent along with the API URL.
params = {
    'keyori': 'ss',
    'ajax': 'true',
    'from': 'input',
    #'page': '2',
    'q': 'cars',
    'spm': 'a2a0e.searchlist.search.go.3b834ce4JPBUHm'
}
# 4080 items in total, 40 items per page.
for num in range(0, 4080, 40):
    params['totalResults'] = num
    print(num)
    resp = requests.get(api_url, params=params, headers=headers).json()
    # The product entries are under mods -> listItems in the JSON response.
    items = resp['mods']['listItems']
    for item in items:
        name = item['name']
        print(name)
Output:
Hot wheels 3 Pack Cars
XimiVogue Pull-Back-And-Go Alloy Suv Car Toy With Sound (Bmw Mini)
Kids Spider Stunt Flip Car
Funny Diy Electric Thomas Train Truck Set Toy Set Orange
Monster Trucks Inertia Car Toys For Kids
Remote Control Transformer Car
Remote Control Emulation Car
Spiderman Friction Stunt Car
Remote Control Car
360 Degree Spinning Rolling stunt Farm Truck Super Tipping Car for Kids
Monster Trucks Friction Powered Inertia Cars Toys For Kids (Color May Vary)
Gallant Toy Bus-Set Of 2
Remote Control Transformer Car
19026B Express Train Set - Battery Operated Black Train Toys for Kids Aged 3+ Years
Friction Powered Monster Truck Toy (Multicolor) Pack Of 1
Toy Train
Model Concept R/C Car
Friction Powered 3-Pack Mini Push and Go Car Truck Jam Playset
Makes Your Kids Happy - Remote Control Vehicle Car For Kids
2.4Ghz 1/18 Rc Rock Crawler Vehicle Buggy Car 4 Wd Shaft Drive High Speed Remote Control 4X4 Monster Off Road Truck
XimiVogue Pull-Back-And-Go Alloy Suv Car Toy With Sound
Kids Spider Stunt Flip Car Small Size
Black ForMula Remote Control Car For Kids
... so on (total items 4080)
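The page arithmetic and the name extraction above can also be factored into small helpers. This is only a sketch under stated assumptions: the function names `page_count`, `build_params`, and `extract_names` are hypothetical, and using `page` as the pagination parameter is a guess based on the commented-out `'page'` entry in the params above, not something confirmed here.

```python
import math


def page_count(total_items, page_size):
    """Number of pages needed to cover total_items at page_size items per page."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_items / page_size)


def build_params(query, page):
    """Query-string parameters for one results page.

    Assumption: the site honors a 'page' parameter; this is not verified.
    """
    return {"ajax": "true", "from": "input", "q": query, "page": page}


def extract_names(payload):
    """Collect product names from a decoded JSON response.

    Missing 'mods' or 'listItems' keys yield an empty list instead of a KeyError.
    """
    items = payload.get("mods", {}).get("listItems") or []
    return [item["name"] for item in items if "name" in item]


# 4080 items at 40 per page.
print(page_count(4080, 40))

# Made-up sample payload with the same shape as resp['mods']['listItems'].
sample = {"mods": {"listItems": [{"name": "Toy Train"}, {"price": "10"}]}}
print(extract_names(sample))
```

Each page could then be fetched with `requests.get(api_url, params=build_params('cars', page), headers=headers).json()` and passed to `extract_names`, so a response without `listItems` ends that page quietly rather than raising.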
Answered By - F.Hoque