Issue
Is it possible to click every matching link, or open each one in a new tab, so that I can scrape my forum? I filter the forum page by URL, grabbing every link whose href contains viewthread. When I then try to click those links, the script just ends with no errors. Can someone explain what is going wrong? I am very new to web scraping.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
webdriver_service = Service('C:\webdrivers\chromedriver.exe')
driver = webdriver.Chrome(service=webdriver_service, options=options)
url = "https://navalcommand.enjin.com/forum/viewforum/2989694/m/11178354/page/1"
driver.get(url)
wait = WebDriverWait(driver, 100)
elems = driver.find_elements(By.XPATH, "//table[@class='structure small-cells']//a[@href]")
for elem in elems:
    if "viewthread" in elem.get_attribute('href'):
        print(elem.get_attribute("href"))
links = driver.find_elements(By.XPATH, "//table[@class='structure small-cells']//a[@href]")
for link in links:
    if "veiwthread" in link.get_attribute("href"):
        wait = WebDriverWait(driver, 10)
        wait.until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, "//table[@class='structure small-cells']//a[@href]']")))
        print(driver.page_source)
        link = driver.find_element(By.XPATH, ".//a[@href]")
        link.click()
Solution
This would be my approach:
from selenium import webdriver
from selenium.webdriver.common import window
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
options = webdriver.ChromeOptions()
options.add_experimental_option("detach", True)  # keep the browser open after the script ends
options.add_argument("start-maximized")
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
wait = WebDriverWait(driver, 100)  # must be created after the driver; available for explicit waits
driver.get("https://navalcommand.enjin.com/forum/viewforum/2989694/m/11178354/page/1")
# collect every link inside the forum table
elems = driver.find_elements(By.XPATH, "//table[@class='structure small-cells']//a[@href]")
# keep only the href strings of the thread links
links = []
for ele in elems:
    if "viewthread" in ele.get_attribute("href"):
        links.append(ele.get_attribute("href"))
# open each thread in its own tab
for link in links:
    driver.switch_to.new_window(window.WindowTypes.TAB)
    driver.get(link)
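Opening one tab per thread and leaving them all open can get heavy on a forum with many threads. Here is a minimal sketch of one alternative, assuming you only need each thread's HTML and are happy to close a tab once it has been read. scrape_in_tabs is a hypothetical helper name, not part of Selenium, and it expects a Selenium 4 driver:

```python
def scrape_in_tabs(driver, links):
    """Open each link in its own tab, read the page source, then close the tab.

    `driver` is assumed to be a Selenium 4 WebDriver and `links` a list of
    URL strings. The helper name and the returned dict are illustrative only.
    """
    original = driver.current_window_handle  # remember the forum listing tab
    pages = {}
    for link in links:
        # Selenium 4: opens a new tab and moves the driver's focus to it
        driver.switch_to.new_window("tab")
        try:
            driver.get(link)
            pages[link] = driver.page_source
        finally:
            driver.close()                     # close the thread tab
            driver.switch_to.window(original)  # go back to the listing tab
    return pages
```

It could be called as pages = scrape_in_tabs(driver, links) once the list of hrefs has been collected.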
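Note that elems is a list of Selenium WebElement objects, while what we need is each element's href string. Plain strings stay usable after the driver switches to a new tab, whereas a WebElement reference is tied to the page it was found on. Also, the second loop in the question checks for "veiwthread" (a typo for "viewthread"), so that condition never matches and the code guarded by it never runs, which would explain why the script ends without an error.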
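The href filtering can also be pulled into a small function that works on plain strings, which makes it easy to check without a browser. This is only a sketch: thread_links is a hypothetical helper name, and the de-duplication is an assumption that the forum table may link to the same thread more than once.

```python
def thread_links(hrefs, marker="viewthread"):
    """Return the unique hrefs that contain `marker`, in first-seen order."""
    seen = set()
    result = []
    for href in hrefs:
        # get_attribute("href") can return None, so skip missing values
        if href and marker in href and href not in seen:
            seen.add(href)
            result.append(href)
    return result
```

With the Selenium code above it could be used as links = thread_links(e.get_attribute("href") for e in elems).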
Answered By - Carapace