Issue
I am scraping all values from here. While I get the desired output, there is a caveat: when I inspect the elements in the table tags, on one day I get 145 rows, on another day 143 rows, and on yet another around 140 rows. Simply put, I want to make the logic robust so that it runs fine regardless of the last row index in the tag on any given day (e.g. 145, 150, 132, ...).
Here's a piece of code for the same:
table_row = []
for i in range(1, 142):
    temp = browser.find_element_by_xpath('//*[@id="companies-table-deal-announced"]/tbody/tr[' + str(i) + ']').find_elements_by_tag_name('td')
    table_row.append(list(map(lambda x: x.text, temp)))
print(table_row)
df = pd.DataFrame(table_row,
                  columns=['SPAC', 'Target', 'Ticker', 'Announced', 'Deadline', 'TEV ($M)', 'TEV/IPO', 'Sector',
                           'Geography', 'Premium', 'Common', 'Warrant'])
One way I can think of is to use len() in the for loop. Is there another, more optimal way to do it? Please let me know, thanks!
Solution
Try this once:
driver.implicitly_wait(10)
driver.get("https://www.spacresearch.com/symbol?s=live-deal&sector=&geography=")

# Collect every <tr> in the table body, however many there are that day.
table = driver.find_elements_by_xpath("//table[@id='companies-table-deal-announced']//tbody//tr")
for i, tab in enumerate(table, start=1):
    datalist = []
    data = tab.find_elements_by_tag_name("td")
    for d in data:
        datalist.append(d.get_attribute("innerText"))
    print("{}: {}".format(i, datalist))
Output:
1: ['Ace Global', 'DDC Enterprise', 'ACBA', '8/25/2021', '4/8/2022', '300']
2: ['NextGen Acquisition II', 'Virgin Orbit', 'NGCA', '8/23/2021', '3/25/2023', '3,218']
3: ['Aldel Financial', 'Hagerty', 'ADF', '8/18/2021', '4/13/2023', '3,134']
...
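Because the loop above iterates over whatever rows find_elements returns, the collected lists can be handed straight to pd.DataFrame with no hardcoded row count. A minimal sketch of that last step, using two sample rows from the output above (the live table has more columns than shown here):

```python
import pandas as pd

# Rows as collected per <tr> by the loop above (sample values from the
# printed output; column list truncated to match).
table_row = [
    ['Ace Global', 'DDC Enterprise', 'ACBA', '8/25/2021', '4/8/2022', '300'],
    ['NextGen Acquisition II', 'Virgin Orbit', 'NGCA', '8/23/2021', '3/25/2023', '3,218'],
]

df = pd.DataFrame(table_row,
                  columns=['SPAC', 'Target', 'Ticker', 'Announced', 'Deadline', 'TEV ($M)'])
print(df.shape)  # one DataFrame row per <tr> found, whatever the count that day
```

The DataFrame simply takes the shape of whatever was scraped, which is exactly the behavior the question asks for.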
Answered By - pmadhu