Issue
import requests
from bs4 import BeautifulSoup
URL = "https://www.hockey-reference.com/leagues/NHL_2021_games.html"
page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")
results = soup.find(id="all_games")
table = soup.find('div', attrs = {'id':'div_games'})
print(table.prettify())
Solution
Select the table element itself (id "games") rather than its wrapping div (id "div_games") to print the table:
table = soup.find('table', attrs = {'id':'games'})
print(table.prettify())
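A minimal offline sketch of the difference between the two selectors. The HTML string is a hypothetical, simplified stand-in for the page's markup, not a copy of it, and the `rows` extraction is only one possible way to read the cells:

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup: a wrapper div that contains the table.
html = """
<div id="all_games">
  <div id="div_games">
    <table id="games">
      <tr><th>Date</th><th>Visitor</th></tr>
      <tr><td>2021-01-13</td><td>St. Louis Blues</td></tr>
    </table>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

div = soup.find("div", attrs={"id": "div_games"})    # the wrapper
table = soup.find("table", attrs={"id": "games"})    # the table itself

# find() returns None when nothing matches, so check before using the result.
if table is None:
    raise SystemExit("table with id='games' not found")

# Collect the text of every header and data cell, row by row.
rows = [
    [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    for tr in table.find_all("tr")
]
print(div.name, table.name)
print(rows)
```

Checking for None first avoids an AttributeError on `table.prettify()` if the id is misspelled or the page layout changes.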
Or use pandas.read_html() to get the table and transform it into a DataFrame:
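import pandas as pd
# read_html returns a list of DataFrames; attrs limits the match to the table with id="games"
df = pd.read_html('https://www.hockey-reference.com/leagues/NHL_2021_games.html', attrs={'id':'games'})[0]
print(df.iloc[:,:5])  # first five columns only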
Output:
Date | Visitor | G | Home | G.1 |
---|---|---|---|---|
2021-01-13 | St. Louis Blues | 4 | Colorado Avalanche | 1 |
2021-01-13 | Vancouver Canucks | 5 | Edmonton Oilers | 3 |
2021-01-13 | Pittsburgh Penguins | 3 | Philadelphia Flyers | 6 |
2021-01-13 | Chicago Blackhawks | 1 | Tampa Bay Lightning | 5 |
2021-01-13 | Montreal Canadiens | 4 | Toronto Maple Leafs | 5 |
... | ... | ... | ... | ... |
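The second goals column shows up as G.1 because the table has two columns named G. A small sketch of tidying that up, assuming you want more descriptive names: `VisitorGoals` and `HomeGoals` are hypothetical labels, and the two rows are copied from the output above so the example runs without a network request.

```python
import pandas as pd

# Two rows copied from the output above; in practice df would come from pd.read_html(...).
df = pd.DataFrame(
    [
        ["2021-01-13", "St. Louis Blues", 4, "Colorado Avalanche", 1],
        ["2021-01-13", "Vancouver Canucks", 5, "Edmonton Oilers", 3],
    ],
    columns=["Date", "Visitor", "G", "Home", "G.1"],
)

# Hypothetical, more descriptive names for the two goal columns.
df = df.rename(columns={"G": "VisitorGoals", "G.1": "HomeGoals"})

# Parse the date strings into proper datetime values.
df["Date"] = pd.to_datetime(df["Date"])

print(df.columns.tolist())
```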
Answered By - HedgeHog