Issue
I'm trying to scrape the prices of all available listings for an item so I can compute the average price. I've tried the code below, but it only outputs the first value in the list, and it creates the CSV with the header but no data.
from urllib.request import Request, urlopen
from csv import writer
import requests
from bs4 import BeautifulSoup

# Open URL
link3 = "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=p2334524.m570.l2632&_nkw=naruto+shippuden+ultimate+ninja+storm+4+ps4&_sacat=139973&LH_TitleDesc=0&rt=nc&_odkw=Naruto+Shippuden%3A+Ultimate+Ninja+Storm+4&_osacat=0&LH_BIN=1&LH_PrefLoc=1"
req = Request(link3, headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()

# Create CSV file with headers
with open('yellowPage.csv', 'w', encoding='utf8', newline='') as f:
    thewriter = writer(f)
    header = ['link', 'prices', 'avg']
    thewriter.writerow(header)

    # Loop over the listings to scrape data
    with requests.Session() as c:
        soup = BeautifulSoup(webpage, 'html5lib')
        lists = soup.find_all('li', class_='s-item s-item__pl-on-bottom')
        prices = []
        for list in lists:
            prices.append(float(list.find('span', class_="s-item__price").text.replace('£', '').replace(',', '').replace('$', '')))
        avg = sum(prices) / len(prices)
        print(avg)
        print(prices)
        print(len(prices))
        info = [link3, prices, avg]
        thewriter.writerow(info)
I need help identifying the best way to get every item's price from all of the available pages, and to send the scraped data to a CSV file.
Solution
This should do what you want. I found the last page number, i.e. 9, and then scraped each page until the last page was scraped.
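As a side note, the hardcoded last-page href below only works for this exact query. A more general sketch (my own addition, with hypothetical pagination markup; the real page structure may differ) would read the `_pgn` query parameter out of every pagination link and take the largest value:

```python
from urllib.parse import urlparse, parse_qs
from bs4 import BeautifulSoup

# Hypothetical pagination markup standing in for a real results page
html = """
<a href="https://www.ebay.co.uk/sch/i.html?_nkw=mario&_pgn=2">2</a>
<a href="https://www.ebay.co.uk/sch/i.html?_nkw=mario&_pgn=9&rt=nc">9</a>
"""
soup = BeautifulSoup(html, 'html.parser')

# collect every _pgn value found in a link and keep the largest
page_numbers = []
for a in soup.find_all('a', href=True):
    qs = parse_qs(urlparse(a['href']).query)
    if '_pgn' in qs:
        page_numbers.append(int(qs['_pgn'][0]))
end_page = max(page_numbers)
print(end_page)  # 9
```

This avoids pinning the scraper to one search term and one known page count.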
There is, however, an issue with gathering all of the products: there are 9 pages and each page displays 60 products (by default), but I was only able to get 265 prices. The discrepancy is likely caused by the product li tags having different class attributes. For example, some of them only had the classes s-item s-item__pl-on-bottom and not s-item--watch-at-corner.
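One way around that mismatch (a sketch of my own, not part of the original answer): pass a single class name such as s-item to class_, since BeautifulSoup matches it against each element's full class list, so both li variants are caught regardless of extra classes:

```python
from bs4 import BeautifulSoup

# Hypothetical markup mimicking the two li variants seen on the results page
html = """
<ul>
  <li class="s-item s-item__pl-on-bottom s-item--watch-at-corner">
    <span class="s-item__price">£12.99</span>
  </li>
  <li class="s-item s-item__pl-on-bottom">
    <span class="s-item__price">£9.50</span>
  </li>
</ul>
"""
soup = BeautifulSoup(html, 'html.parser')

# class_ with a single class name matches any element whose class list
# contains it, so both li variants are found
items = soup.find_all('li', class_='s-item')
prices = [float(i.find('span', class_='s-item__price').text.lstrip('£')) for i in items]
print(prices)  # [12.99, 9.5]
```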
import requests
from bs4 import BeautifulSoup

# get the html of the first page to find the total number of pages
page = requests.get('https://www.ebay.co.uk/sch/i.html?_from=R40&_nkw=mario&_sacat=0&LH_TitleDesc=0&_pgn=1').text
soup = BeautifulSoup(page, 'html.parser')

# find the last page number
end_page = soup.find('a', href='https://www.ebay.co.uk/sch/i.html?_from=R40&_nkw=mario&_sacat=0&LH_TitleDesc=0&_pgn=9&rt=nc').text

prices = []
page_num = 0

# get the html of each page until the last page is reached
while page_num < int(end_page):
    page_num += 1
    page = requests.get(f'https://www.ebay.co.uk/sch/i.html?_from=R40&_nkw=mario&_sacat=0&LH_TitleDesc=0&_pgn={page_num}').text
    soup = BeautifulSoup(page, 'html.parser')
    # list of all li tags on the page
    items = soup.find_all('li', class_="s-item s-item__pl-on-bottom s-item--watch-at-corner")
    # iterate over the page's li tags and append each product price to the list
    for item in items:
        prices.append(float(item.find('span', class_="s-item__price").text.replace('£', '').replace(',', '')))

# average price of the scraped product prices
print(sum(prices) / len(prices))
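Two caveats worth hedging: the bare float(...) conversion will raise on listings whose price renders as a range (for example "£9.99 to £14.99") or as non-numeric text, and the original question also asked for the results to land in a CSV file. A small sketch (the parse_price helper and the choice to average a range are my own, not from the answer above) covers both:

```python
import csv
import re

def parse_price(text):
    """Extract numeric values from a price string; average an 'X to Y' range.

    Returns None when no number can be found.
    """
    numbers = [float(n.replace(',', '')) for n in re.findall(r'\d[\d,]*\.?\d*', text)]
    if not numbers:
        return None
    return sum(numbers) / len(numbers)

# Example strings of the kinds a results page can produce
raw = ['£12.99', '£1,050.00', '£9.99 to £14.99', 'Free postage']
prices = [p for p in (parse_price(t) for t in raw) if p is not None]
avg = sum(prices) / len(prices)

# Write one row per price plus the overall average, as the question intended
with open('prices.csv', 'w', encoding='utf8', newline='') as f:
    w = csv.writer(f)
    w.writerow(['price', 'avg'])
    for p in prices:
        w.writerow([p, avg])
```

Skipping unparseable prices (rather than crashing) also explains gaps like the 265-of-540 count without losing the rest of the run.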
Answered By - Übermensch