Issue
Hello everyone. I have pulled the information I want using BeautifulSoup,
but I can't seem to get it printed out correctly to send to pandas
and Excel.
html_f ='''
<li class="list-group-item">
<div>
<div class="tyler-toggle-controller open">
<p class="text-primary">
07/01/2022 Date
<span class="caret"> </span>
</p>
</div>
<div class="tyler-toggle-container row-buff" style="display: block; overflow: hidden;">
<p class="col-sm-12 col-md-12">
<span class="text-muted">Comment</span><br>
[1] Comments
</p>
</div>
</div>
</li>'''
My code used to pull the data I want:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_f, 'html.parser')
for child in soup.findAll('li', class_='list-group-item')[0]:
    print(child.text)
Here is the info it pulls, but it prints with lots of extra blank lines and spacing:
07/01/2022 Date
Comment
[1] Comments
Ideally, I only need the top portion (date and File Date) printed out, but at the very least I need help getting it into a list format like:
07/01/2022 Date
Comment
[1] Comments
Solution
Here is my attempt:
doc='''
<li class="list-group-item">
<div>
<div class="tyler-toggle-controller open">
<p class="text-primary">
07/01/2022 Date
<span class="caret">
</span>
</p>
</div>
<div class="tyler-toggle-container row-buff" style="display: block; overflow: hidden;">
<p class="col-sm-12 col-md-12">
<span class="text-muted">
Comment
</span>
<br/>
[1] Comments
</p>
</div>
</div>
</li>
'''
from bs4 import BeautifulSoup
soup = BeautifulSoup(doc, 'html.parser')
text=[' '.join(child.get_text(strip=True).split(' ')).replace(' DateComment[1]',',') for child in soup.find_all('li',class_='list-group-item')]
print(text)
Output:
['07/01/2022, Comments']
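As an alternative that does not depend on replacing the literal ' DateComment[1]' substring, here is a minimal sketch using BeautifulSoup's stripped_strings generator, which yields each non-empty text node with its surrounding whitespace removed. The variable names (html, item, parts, date_line, date_only) are illustrative, and the sketch assumes the page has the same structure as the sample HTML in the question.

```python
from bs4 import BeautifulSoup

# Same sample markup as in the question
html = '''
<li class="list-group-item">
<div>
<div class="tyler-toggle-controller open">
<p class="text-primary">
07/01/2022 Date
<span class="caret"> </span>
</p>
</div>
<div class="tyler-toggle-container row-buff" style="display: block; overflow: hidden;">
<p class="col-sm-12 col-md-12">
<span class="text-muted">Comment</span><br/>
[1] Comments
</p>
</div>
</div>
</li>'''

soup = BeautifulSoup(html, 'html.parser')
item = soup.find('li', class_='list-group-item')

# One list entry per non-empty text node, whitespace already stripped
parts = list(item.stripped_strings)
print(parts)  # ['07/01/2022 Date', 'Comment', '[1] Comments']

# Only the top portion: the <p class="text-primary"> header line
date_line = item.find('p', class_='text-primary').get_text(strip=True)
print(date_line)  # 07/01/2022 Date

# Just the date, assuming it is always the first whitespace-separated token
date_only = date_line.split()[0]
print(date_only)  # 07/01/2022
```

This gives the list format asked for in the question, and selecting the header paragraph by its class avoids having to cut the comment text away afterwards.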
You can also try one of these variants:
# As a single string
text = ' '.join([' '.join(child.get_text(strip=True).split(' ')).replace(' DateComment[1]', ',') for child in soup.find_all('li', class_='list-group-item')]).strip()
# Or keep the list, which has one entry per <li>, and index into it
text = [' '.join(child.get_text(strip=True).split(' ')).replace(' DateComment[1]', ',') for child in soup.find_all('li', class_='list-group-item')]
final_text = text[0]  # '07/01/2022, Comments'
final_list = text[0].split(', ')  # ['07/01/2022', 'Comments'], if you want a list
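Since the goal in the question is to send the data to pandas and Excel, here is a rough sketch of collecting one row per list item into a DataFrame. The column names 'date' and 'comment' and the output filename are hypothetical, and the sketch assumes every item has the same two paragraphs as the sample HTML.

```python
import pandas as pd
from bs4 import BeautifulSoup

html = '''
<li class="list-group-item">
<div>
<div class="tyler-toggle-controller open">
<p class="text-primary">
07/01/2022 Date
<span class="caret"> </span>
</p>
</div>
<div class="tyler-toggle-container row-buff" style="display: block; overflow: hidden;">
<p class="col-sm-12 col-md-12">
<span class="text-muted">Comment</span><br/>
[1] Comments
</p>
</div>
</div>
</li>'''

soup = BeautifulSoup(html, 'html.parser')

rows = []
for item in soup.find_all('li', class_='list-group-item'):
    header = item.find('p', class_='text-primary')  # date line
    body = item.find('p', class_='col-sm-12')       # comment paragraph
    rows.append({
        # first token of the header line is the date
        'date': header.get_text(strip=True).split()[0],
        # last text node of the body is the comment text itself
        'comment': list(body.stripped_strings)[-1],
    })

df = pd.DataFrame(rows)
print(df)

# Writing to Excel needs an engine such as openpyxl installed;
# 'output.xlsx' is a placeholder filename.
# df.to_excel('output.xlsx', index=False)
```

Building a list of dicts first keeps the parsing separate from pandas, so the same rows can be inspected or filtered before they are written out.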
Answered By - F.Hoque