Issue
I am trying to scrape a Basketball Reference page to pull out the referees assigned to certain games and export them later on. To test this on one game I tried the code below (and some variations) but received an error.
import requests
from bs4 import BeautifulSoup

data = requests.get(f"https://www.basketball-reference.com/wnba/boxscores/201506050CON.html")
soup = BeautifulSoup(data.text)
refs = soup.find(string = "Officials: ").next_sibling
print(refs)
AttributeError Traceback (most recent call last)
Cell In[30], line 3
1 #data = requests.get(f"https://www.basketball-reference.com/wnba/boxscores/201506050CON.html")
2 soup = BeautifulSoup(data.text)
----> 3 refs = soup.find(string = "Officials: ").next_sibling
AttributeError: 'NoneType' object has no attribute 'next_sibling'
Solution
I had a look at the HTML on the page you referenced, and the relevant piece looks like this:
<div><strong>Officials:&nbsp;</strong>Daryl Humphrey, Don Hudson, Michael Price</div>
The &nbsp; in there is a non-breaking space, not a regular space, so your find expression returns None. It needs to be:
soup.find(string = "Officials:\xa0")
However, that finds the text node itself, whereas what you want is the parent of the text, i.e. the <strong> tag, and then the next_sibling of that parent tag, for example:
soup.find(string = "Officials:\xa0").parent.next_sibling
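To try this without hitting the site, here is a minimal sketch that runs against an inline copy of the fragment quoted above. It assumes the live page still has that same structure, which may not hold. The helper name find_officials is hypothetical, and matching with a regular expression instead of the exact "Officials:\xa0" string is a swapped-in choice so the lookup does not depend on the kind of whitespace after the colon.

```python
import re

from bs4 import BeautifulSoup

# Inline copy of the fragment quoted in the answer (assumed to match the live page).
# The parser decodes &nbsp; into the character "\xa0".
html = (
    "<div><strong>Officials:&nbsp;</strong>"
    "Daryl Humphrey, Don Hudson, Michael Price</div>"
)
soup = BeautifulSoup(html, "html.parser")

# A search with a regular space finds nothing, which is what caused the AttributeError.
plain_space_match = soup.find(string="Officials: ")

# The exact match with a non-breaking space finds the text node inside <strong>.
nbsp_match = soup.find(string="Officials:\xa0")


def find_officials(soup):
    """Hypothetical helper: return the referee names as a list, or [] if absent."""
    # The regex matches the label regardless of the whitespace that follows it.
    label = soup.find(string=re.compile(r"^Officials:"))
    if label is None:
        return []
    # label is the text node; its parent is <strong>, whose next sibling holds the names.
    names = label.parent.next_sibling
    if names is None:
        return []
    return [name.strip() for name in str(names).split(",") if name.strip()]


officials = find_officials(soup)
print(officials)
```

Returning an empty list when the label is missing avoids the 'NoneType' error on pages that have no officials line, which may be useful when looping over many games.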
Answered By - Rusticus