Issue
I want to scrape the original and the discount price from this link:
https://www2.hm.com/hu_hu/productpage.0903062001.html
Both the span and the del tags have odd, auto-generated class names, but I was able to find the discount price in the Scrapy shell with the following:
response.css('span.price-value::text').get()
However, I have had no luck with the original price, which is inside a del tag:
<del class="BodyText-module--general__32l6J ProductPrice-module--priceValueOriginal__3U3Cz">6 995 Ft</del>
I tried both XPath and CSS selectors, but Scrapy could not find this tag.
Solution
Both the original and the discount price are embedded as JSON data in the page source itself, so they can be pulled out of that text with a regular expression:
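import re
# .get() returns the first matching text node as a string, which re.findall requires
page_source_data = response.xpath('//div[@class="tealiumProductviewtag productview parbase"]//text()').get()
# Capture whatever sits between the square brackets; strip any surrounding quotes afterwards
re.findall(r'product_original_price : \[(.*?)\],', page_source_data)
re.findall(r'product_list_price : \[(.*?)\],', page_source_data)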
Each call returns a list of matches; the first element of each list is the corresponding price.
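As a rough sketch, the two regex lookups can be wrapped in one small helper. The function name extract_prices and the sample string are hypothetical, the sample only imitates the kind of text the tag might contain, and treating product_list_price as the discounted price is an assumption based on the answer above. In a spider, the input would be the string extracted from the response rather than a hard-coded sample.

```python
import re


def extract_prices(page_source_data):
    """Return (original, discounted) price strings, or None where a key is missing."""

    def first_value(key):
        # Tolerate variable spacing around the colon; capture the bracket contents
        match = re.search(rf'{key}\s*:\s*\[(.*?)\]', page_source_data)
        if match is None:
            return None
        # Drop whitespace and any surrounding single or double quotes
        return match.group(1).strip().strip('"\'')

    return first_value('product_original_price'), first_value('product_list_price')


# Hypothetical sample text, not copied from the real page
sample = 'product_original_price : ["6995.00"], product_list_price : ["4995.00"],'
print(extract_prices(sample))  # ('6995.00', '4995.00')
```

Returning None for a missing key keeps the caller from hitting an IndexError when a product is not on sale or the page layout changes.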
Answered By - Self