Issue
I'm scraping a webpage that contains some tables. I want to build two lists, but the site uses the class 'txt' for two data types. I need to extract those data types separately, so I'm trying to "filter" the first type, extract it, and then do the same for the other type.
I wrote this code:
import requests
from bs4 import BeautifulSoup

r = requests.get(url, headers=header)
soup = BeautifulSoup(r.content, 'html.parser')
page = soup.find('div', class_='content')
labels = page.findAll('td', class_='label')
Output:
[<td class="label w15"><span class="help tips" title="Code">?</span><span class="txt">Paper</span></td>,
<td class="label destaque w2"><span class="help tips" title="Last value">?</span><span class="txt">Value</span></td>]
I need the text inside each <span class="txt"> element, e.g. Paper.
When I try this:
myfilter = labels.findAll('span', class_='txt')
I get this error:
AttributeError: ResultSet object has no attribute 'findAll'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?
Why? How can I do this?
Solution
As the error message says, a ResultSet is a list of elements, not a single element, so it has no findAll() method of its own. You need to loop over the results and search inside each one.
myfilter = []
for label in labels:
    myfilter.extend(label.find_all('span', class_='txt'))
Answered By - Barmar
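As a supplement to the answer, here is a minimal self-contained sketch of the same loop. It assumes the page looks roughly like the output shown in the question; the `html` string is made up from that output, not taken from the real site. It also pulls the text out of each span with get_text(), and shows a CSS selector via select() as one possible alternative to the nested loop.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, reconstructed from the output shown in the question.
html = """
<div class="content"><table><tr>
<td class="label w15"><span class="help tips" title="Code">?</span><span class="txt">Paper</span></td>
<td class="label destaque w2"><span class="help tips" title="Last value">?</span><span class="txt">Value</span></td>
</tr></table></div>
"""

soup = BeautifulSoup(html, "html.parser")
page = soup.find("div", class_="content")
labels = page.find_all("td", class_="label")  # a ResultSet (list of Tags)

# Loop over the ResultSet and search inside each individual Tag.
texts = []
for label in labels:
    for span in label.find_all("span", class_="txt"):
        texts.append(span.get_text(strip=True))

# Alternative: one CSS selector that matches span.txt only inside td.label.
selected = [span.get_text(strip=True) for span in page.select("td.label span.txt")]

print(texts)
print(selected)
```

Because both lookups are scoped to td.label cells, spans with class 'txt' that sit in other kinds of cells are left out, which is one way to keep the two data types apart.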