Issue
Let's say I have the following dataframe.
import pandas as pd
data = {
'home': ['team1', 'team2', 'team3', 'team2'],
'away': ['team2', 'team3', 'team1', 'team1']
}
df = pd.DataFrame(data)
How can I count the number of times each element (team) appears across both columns? The expected result is:
team1 3
team2 3
team3 2
Solution
You can concatenate the two columns into a single Series and call the .value_counts method on it:
out = pd.concat([df['home'], df['away']]).value_counts()
Output:
team1 3
team2 3
team3 2
dtype: int64
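If there are more than two columns to count over, the same idea can be written with stack, which folds the selected columns into one long Series. This is only a sketch under the assumption that the frame looks like the one in the question; the sort_index call is my addition so that tied counts come out in a predictable order.

```python
import pandas as pd

df = pd.DataFrame({
    'home': ['team1', 'team2', 'team3', 'team2'],
    'away': ['team2', 'team3', 'team1', 'team1'],
})

# stack() turns the selected columns into one long Series,
# so value_counts() sees every appearance regardless of column.
counts = df[['home', 'away']].stack().value_counts()

# sort_index() orders the result by team name, which keeps the
# output stable when two teams have the same count.
counts = counts.sort_index()

print(counts.to_dict())  # {'team1': 3, 'team2': 3, 'team3': 2}
```

Selecting the columns explicitly keeps any unrelated columns (scores, dates and so on) out of the count.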
You can also get the underlying NumPy array, flatten it, find the unique values and their counts, and wrap the result in a dictionary (this skips most of the pandas overhead, so it is typically the faster of the two on a small frame like this one):
import numpy as np
out = dict(np.array(np.unique(df.values.flatten(), return_counts=True)).T)
Output:
{'team1': 3, 'team2': 3, 'team3': 2}
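Which approach is faster depends on the size of the frame and on the library versions, so it is worth measuring on your own data. Below is a rough sketch for doing that; the helper names with_concat and with_unique are made up for this example, and no timings are quoted because they vary from machine to machine.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'home': ['team1', 'team2', 'team3', 'team2'],
    'away': ['team2', 'team3', 'team1', 'team1'],
})

def with_concat(frame):
    # pandas route: join the two columns, then count.
    return pd.concat([frame['home'], frame['away']]).value_counts().to_dict()

def with_unique(frame):
    # NumPy route: flatten the values, then count the unique entries.
    values, counts = np.unique(frame.to_numpy().ravel(), return_counts=True)
    return dict(zip(values, counts.tolist()))

# Both routes should agree before their speed is compared.
assert with_concat(df) == with_unique(df)

for fn in (with_concat, with_unique):
    seconds = timeit.timeit(lambda: fn(df), number=1000)
    print(f"{fn.__name__}: {seconds:.4f} s per 1000 calls")
```

Try it on a frame of realistic size as well, since a result on four rows says little about thousands of rows.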
Answered By - enke