Monday, April 18, 2022

[FIXED] how to get quartiles and classify a value according to this quartile range

Issue

I have this df:

import pandas as pd

d = pd.DataFrame({'Name':['Andres','Lars','Paul','Mike'],
                  'target':['A','A','B','C'],
                  'number':[10,12.3,11,6]})

And I want to classify each number into a quartile. I am doing this:

(d.groupby(['Name','target','number'])['number']
 .quantile([0.25,0.5,0.75,1]).unstack()
 .reset_index()
 .rename(columns={0.25:"1Q",0.5:"2Q",0.75:"3Q",1:"4Q"})
)

But the 4 quartiles come out equal in every row, because the code above computes them per group: grouping by Name, target and number leaves only one number in each group, so all of its quartiles are that number.
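To make that concrete, here is a minimal sketch (the names sizes and single are only illustrative) showing that every group holds exactly one row, and that the quantiles of a single value are all that same value:

```python
import pandas as pd

d = pd.DataFrame({'Name': ['Andres', 'Lars', 'Paul', 'Mike'],
                  'target': ['A', 'A', 'B', 'C'],
                  'number': [10, 12.3, 11, 6]})

# Grouping by every column leaves exactly one row in each group.
sizes = d.groupby(['Name', 'target', 'number']).size()
print(sizes.tolist())

# The quantile of a single value is that value, whatever q is.
single = pd.Series([10.0]).quantile([0.25, 0.5, 0.75, 1])
print(single.tolist())
```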

If I run this instead:

d['number'].quantile([0.25,0.5,0.75,1])

Then I have the 4 quartiles I am looking for:

0.25     9.000
0.50    10.500
0.75    11.325
1.00    12.300

What I need as output (showing only the first 2 rows):

   Name    target   number     1Q      2Q      3Q        4Q     Rank
0   Andres  A       10.0       9.0    10.5    11.325    12.30     1
1   Lars    A       12.3       9.0    10.5    11.325    12.30     4

You can see that all quartiles are computed from all the values in the number column. Besides that, there is now a column named Rank that classifies the number according to its quartile, e.g. in the first row 10 is within the 1st quartile.

Solution

Here's one way that builds on the quantiles you've created: turn them into a DataFrame and join it to d. It also assigns the "Rank" column using the rank method:

out = (d.join(d['number'].quantile([0.25,0.5,0.75,1])
              .set_axis([f'{i}Q' for i in range(1,5)], axis=0)
              .to_frame().T
              .pipe(lambda x: x.loc[x.index.repeat(len(d))])
              .reset_index(drop=True))
       .assign(Rank=d['number'].rank(method='dense')))

Output:

     Name target  number   1Q    2Q      3Q    4Q  Rank
0  Andres      A    10.0  9.0  10.5  11.325  12.3   2.0
1    Lars      A    12.3  9.0  10.5  11.325  12.3   4.0
2    Paul      B    11.0  9.0  10.5  11.325  12.3   3.0
3    Mike      C     6.0  9.0  10.5  11.325  12.3   1.0
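Note that rank(method='dense') gives the ordinal position of each value, which matches the quartile here only because there are four distinct numbers. If the goal is the quartile a value falls into regardless of how many rows there are, a possible alternative (not part of the answer above, just a sketch under that assumption) is pd.qcut, which bins each value by the column's own quartile edges. The 1Q..4Q columns are built here with assign purely for brevity:

```python
import pandas as pd

d = pd.DataFrame({'Name': ['Andres', 'Lars', 'Paul', 'Mike'],
                  'target': ['A', 'A', 'B', 'C'],
                  'number': [10, 12.3, 11, 6]})

# Quartile cut points over the whole column (not per group).
q = d['number'].quantile([0.25, 0.5, 0.75, 1])

# Broadcast each cut point to every row as a constant column 1Q..4Q.
out = d.assign(**{f'{i}Q': v for i, v in enumerate(q, start=1)})

# pd.qcut splits the column at its own quartiles; labels 1..4 name the bins.
out['Rank'] = pd.qcut(d['number'], q=4, labels=[1, 2, 3, 4]).astype(int)

print(out)
```

With this data, 10 lies between 9.0 and 10.5, so it lands in bin 2, the same value the dense rank produces. On a longer column the two approaches would diverge, since the dense rank keeps counting past 4.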

Answered By - enke

This answer was collected from Stack Overflow and tested by PythonFixing community admins; it is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
