Issue
Could anyone help me extract both the 'name' and the 'description' from the following HTML?
<div class="bubble-description">
<p><b>name</b><br/>
description
</p>
</div>
I know how to extract the name, but extracting the description is more cumbersome. Here is my code:
import requests
from bs4 import BeautifulSoup

# url holds the address of the page being scraped
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
content = soup.select('div.bubble-description')
name = [x.select('p')[0].contents[0].text for x in content]
Any help is really appreciated.
Solution
Try:
from bs4 import BeautifulSoup
html_doc = """
<div class="bubble-description">
<p>
<b>name</b>
<br/>
description
</p>
</div>
"""
soup = BeautifulSoup(html_doc, "html.parser")
bubble = soup.select_one(".bubble-description")
name = bubble.b.text.strip()
# string=True is the current spelling of the deprecated text=True argument
desc = bubble.br.find_next(string=True).strip()
print(name)
print(desc)
Prints:
name
description
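The question collects every div.bubble-description on the page, while the snippet above handles a single one. A minimal sketch of the same idea applied in a loop follows; it is not part of the original answer, and the two-bubble sample markup, the results list, and the skip-on-missing-tag behaviour are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample with two bubbles; the real page markup may differ.
html_doc = """
<div class="bubble-description">
<p><b>first name</b><br/>
first description
</p>
</div>
<div class="bubble-description">
<p><b>second name</b><br/>
second description
</p>
</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

results = []
for bubble in soup.select("div.bubble-description"):
    name_tag = bubble.b
    br_tag = bubble.br
    # Skip bubbles that lack the expected <b> or <br/> tags (an assumed policy).
    if name_tag is None or br_tag is None:
        continue
    name = name_tag.get_text(strip=True)
    # The description is taken as the first text node after the <br/> tag.
    description = br_tag.find_next(string=True)
    results.append((name, description.strip() if description else ""))

print(results)
```

Collecting (name, description) tuples keeps each pair together, which may be easier to work with than two parallel lists.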
Or use a unique separator, e.g. |:
desc = soup.select_one(".bubble-description")
n, d = desc.get_text(strip=True, separator="|").split("|")
print(n)
print(d)
Prints:
name
description
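The separator approach relies on | never appearing in the text and on the block containing exactly two text nodes. If either might not hold, a possible alternative is to iterate over stripped_strings, which yields each non-empty text node with surrounding whitespace removed. The sketch below is not from the original answer; the sample markup with a pipe and a second description line is invented to show the edge case, and joining the remaining parts with a space is an assumed choice.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the description contains a pipe and spans two lines.
html_doc = """
<div class="bubble-description">
<p><b>name</b><br/>
first line | with a pipe<br/>
second line
</p>
</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")
bubble = soup.select_one(".bubble-description")

# stripped_strings skips whitespace-only nodes and trims the rest.
parts = list(bubble.stripped_strings)

# Treat the first string as the name and everything after it as the description.
name = parts[0]
description = " ".join(parts[1:])

print(name)
print(description)
```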
Answered By - Andrej Kesely