Issue
I am using Python 3 (Spyder), and I have a table of type "pandas.core.frame.DataFrame". I want to z-score normalize the values in that table (from each value, subtract the mean of its row and divide by the sd of its row), so that each row has mean=0 and sd=1. I have tried two approaches.
First approach
from scipy.stats import zscore
zetascore_table=zscore(table,axis=1)
Second approach
rows=table.index.values
columns=table.columns
import numpy as np
for i in range(len(rows)):
    for j in range(len(columns)):
        table.loc[rows[i],columns[j]]=(table.loc[rows[i],columns[j]] - np.mean(table.loc[rows[i],]))/np.std(table.loc[rows[i],])
table
Both approaches seem to work, but when I check the mean and sd of each row, they are not 0 and 1 as they are supposed to be, but other float values. I don't know what the problem could be.
Thanks in advance for your help!
Solution
The code below calculates a z-score for each value in one column of a pandas DataFrame and stores the result in a new column (here called 'num_1_zscore').
from scipy.stats import zscore
import pandas as pd
# Create a sample df
df = pd.DataFrame({'num_1': [1,2,3,4,5,6,7,8,9,3,4,6,5,7,3,2,9]})
# Calculate the zscores and drop zscores into new column
df['num_1_zscore'] = zscore(df['num_1'])
print(df)  # display(df) also works in an IPython/Jupyter session
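The question asks for row-wise normalization of a whole table, so here is a minimal sketch of that case, not part of the original answer. The sample `table` and the names `row_mean`, `row_sd`, and `z_table` are made up for illustration. Two things are worth keeping in mind. First, the loop in the second approach overwrites cells of a row while still computing that row's mean and sd from it, so later cells are normalized with statistics of partly modified data; computing the statistics once up front avoids this. Second, `scipy.stats.zscore` and `np.std` use the population sd (`ddof=0`) by default, while pandas `.std()` uses the sample sd (`ddof=1`), so checking the result with pandas defaults will not give exactly 1.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the question's `table`
table = pd.DataFrame(
    {"a": [1.0, 4.0, 2.0], "b": [2.0, 6.0, 9.0], "c": [6.0, 5.0, 4.0]},
    index=["r1", "r2", "r3"],
)

# Compute the row statistics once, before anything is modified
row_mean = table.mean(axis=1)
row_sd = table.std(axis=1, ddof=0)  # population sd, same convention as scipy's zscore default

# Subtract each row's mean and divide by its sd; axis=0 aligns on the row index
z_table = table.sub(row_mean, axis=0).div(row_sd, axis=0)

# Check the result using the same ddof that was used for normalizing
print(z_table.mean(axis=1))         # approximately 0 for every row
print(z_table.std(axis=1, ddof=0))  # approximately 1 for every row

# With the pandas default (ddof=1) the sd is sqrt(n / (n - 1)) instead of 1
print(z_table.std(axis=1))
```

The row means will typically print as tiny floats such as 1e-17 rather than exactly 0, which is ordinary floating-point rounding. If the sample sd is what you want, using `ddof=1` consistently for both the normalization and the check should give 1 as well.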
Answered By - BGG16