Issue
I have a list of <li> elements, each containing an <a> tag with an href URL and a <span> holding the URL's title. I am trying to get the URL by the span's title text. This is my example:
<li><a href="http://someurl"><span>Title of URL</span></a></li>
This is my last attempt:
soup.select_one('span:-soup-contains("Title of URL:")').find_previous_sibling(text=True)
But that won't work, since the span is inside the <a> tag rather than a sibling of it.
I've tried countless other variations that I have since deleted.
If anyone can help I'd be grateful.
Solution
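Select the <a> itself. The :-soup-contains pseudo-class also matches text found in an element's descendants, so the <a> matches through the <span> it wraps: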
from bs4 import BeautifulSoup
html_text = """\
<li><a href="http://someurl"><span>Title of URL</span></a></li>"""
soup = BeautifulSoup(html_text, "html.parser")
url = soup.select_one('a:-soup-contains("Title of URL")')["href"]
print(url)
Prints:
http://someurl
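As a variation, if you would rather start from the <span> (as in the attempt in the question), you can walk up to the enclosing <a> with find_parent instead of looking for a sibling. The following is only a sketch: the helper name url_for_title and the second list item are made up for illustration, and it assumes titles contain no double quotes, since the title is interpolated directly into the selector.

```python
from bs4 import BeautifulSoup

html_text = """\
<ul>
<li><a href="http://someurl"><span>Title of URL</span></a></li>
<li><a href="http://otherurl"><span>Another title</span></a></li>
</ul>"""

soup = BeautifulSoup(html_text, "html.parser")


def url_for_title(soup, title):
    """Return the href of the <a> wrapping the first <span> whose text contains `title`.

    Hypothetical helper; returns None when no span or enclosing link is found.
    """
    # Assumes `title` has no double quotes, because it is placed inside a quoted selector.
    span = soup.select_one(f'span:-soup-contains("{title}")')
    if span is None:
        return None
    # The span is nested in the link, so go up the tree rather than sideways.
    link = span.find_parent("a")
    return link.get("href") if link is not None else None


print(url_for_title(soup, "Title of URL"))
print(url_for_title(soup, "Missing title"))
```

Note that :-soup-contains does a case-sensitive substring match, so a short title can match more than one item; select_one returns only the first match in document order.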
Answered By - Andrej Kesely