Issue
I am trying to make a contingency table using pd.crosstab
from my local dataframe. Imagine we asked 3 people in 2 separate groups the question of whether they like ice cream or not, and here is the result in a dataframe:
group1 | group2
------------------
yes | no
no | maybe
yes | no
And i would like the contingency table to look like this:
| group1 | group2
----------------------------
yes | 2 | 0
no | 1 | 2
maybe | 0 | 1
I have played around with pandas and evidently referenced many different resources, including the docs and other posts, but couldn't figure this out. Does anyone have any ideas? Thanks!
Solution
Pandas has a crosstab function that solve this; first you have to melt the dataframe:
box = df.melt()
pd.crosstab(box.value, box.variable)
variable group1 group2
value
maybe 0 1
no 1 2
yes 2 0
For performance, it is possible that groupby will be faster, even if it involves a few more steps:
box.groupby(["variable", "value"]).size().unstack("variable", fill_value=0)
Answered By - sammywemmy
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.