Issue
I'm trying to access the information on prices and products of this webpage: https://www.easy.com.ar/banos-y-cocinas
All the data is inside the script tag.
However, there's one big issue: using either BeautifulSoup or Selenium, only the first such tag (the one outside the HTML body) is retrieved. I've tried several approaches, but haven't made any progress.
What is more, no tag inside the body is retrieved. Here is a sample using Selenium:
prices = driver.find_element_by_xpath("//div[@class='flex flex-column min-vh-100 w-100']")
And this is what that statement gets:
If you can help me with this, I'd greatly appreciate it.
I tried to retrieve the data from a JSON tag, but haven't been able to with either Selenium or BeautifulSoup.
Solution
Your code asks for a single element, so it returns only the first element that matches the selector, and that first match is not in the body section.
To get the scripts from the body, use a more specific selector such as .flex [type*=application].
Then retrieve the list of all matching elements instead of only the first one:
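import json
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# driver and wait were not defined in the original snippet; Chrome and a 10-second timeout are assumptions
driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
link = "https://www.easy.com.ar/banos-y-cocinas"
driver.get(link)
# Wait for every matching script element inside .flex, not just the first one
json_els = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.flex [type*=application]')))
# Parse the JSON text of each script element
result = [json.loads(element.get_property('innerText')) for element in json_els]
driver.quit()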
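As a rough offline sketch of the same idea (collect every matching script element, then parse each one as JSON), the following uses only Python's standard library on an HTML string. It assumes you already have the rendered HTML, for example from driver.page_source. The SAMPLE_HTML content and the names JsonScriptCollector and extract_json_scripts are made up for illustration, and unlike the .flex selector above it does not restrict matches to descendants of .flex.

```python
import json
from html.parser import HTMLParser


class JsonScriptCollector(HTMLParser):
    """Collect the text of every <script> whose type contains 'application'."""

    def __init__(self):
        super().__init__()
        self._inside = False  # True while inside a matching <script>
        self._buffer = []     # text chunks of the current script
        self.payloads = []    # raw text of each matching script

    def handle_starttag(self, tag, attrs):
        script_type = dict(attrs).get("type") or ""
        if tag == "script" and "application" in script_type:
            self._inside = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._inside:
            self.payloads.append("".join(self._buffer))
            self._inside = False


def extract_json_scripts(html):
    """Return the parsed JSON of all matching scripts, skipping invalid ones."""
    collector = JsonScriptCollector()
    collector.feed(html)
    collector.close()
    parsed = []
    for text in collector.payloads:
        try:
            parsed.append(json.loads(text))
        except json.JSONDecodeError:
            continue  # matching type but not valid JSON, ignore it
    return parsed


# Hypothetical sample standing in for driver.page_source
SAMPLE_HTML = """
<html>
<head><script type="application/ld+json">{"@type": "WebSite", "name": "demo"}</script></head>
<body>
<div class="flex">
<script type="application/ld+json">{"@type": "Product", "name": "Sink", "price": 100}</script>
<script type="text/javascript">var x = 1;</script>
</div>
</body>
</html>
"""

items = extract_json_scripts(SAMPLE_HTML)
```

To try it against the real page, pass driver.page_source to extract_json_scripts once the page has finished loading.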
Answered By - Yaroslavm