Issue
This follows on from the solved question "Remove outlier with Python" (thanks again, @mozway and jezrael!).
I would like to plot the outlier removal. What I want: a scatter plot consisting of 7 subplots, one per data column, with the Time column on the x-axis and the respective data column on the y-axis. The removed values should be highlighted. How can I do this?
I thought about plotting the data before and after the removal and combining both in a single plot.
I have two plotting approaches so far:
Ni60 = data[['60Ni']]
Ni61 = data[['61Ni']]
Ni62 = data[['62Ni']]
Cu63 = data[['63Cu']]
Ni64 = data[['64Ni']]
Cu65 = data[['65Cu']]
Zn66 = data[['66Zn']]
Time = data[['Time']]
plt.rcParams['figure.dpi'] = 100  # set before creating the figure so it takes effect
fig, ax = plt.subplots(2, 2, sharex = True, figsize = (13, 8))
fig.suptitle(basename)
ax[0, 0].scatter(Time, Ni60)
ax[0, 1].scatter(Time, Ni61)
ax[1, 0].scatter(Time, Cu63)
ax[1, 1].scatter(Time, Cu65)
for axis in ax.flat: axis.set_xlim(0, 32)
ax[0, 0].set_ylim(0, 0.02)
ax[1, 0].set_ylim(0, 0.02)
ax[0, 1].set_ylim(0, 0.002)
ax[1, 1].set_ylim(0, 0.02)
ax[0, 0].set_xlabel('Time (s)')
ax[0, 1].set_xlabel('Time (s)')
ax[1, 0].set_xlabel('Time (s)')
ax[1, 1].set_xlabel('Time (s)')
ax[0, 0].set_ylabel('Spannung (V)')
ax[0, 1].set_ylabel('Spannung (V)')
ax[1, 0].set_ylabel('Spannung (V)')
ax[1, 1].set_ylabel('Spannung (V)')
ax[0, 0].set_title('$^{60}$Ni', color = 'b')
ax[0, 1].set_title('$^{61}$Ni', color = 'b')
ax[1, 0].set_title('$^{63}$Cu', color = 'b')
ax[1, 1].set_title('$^{65}$Cu', color = 'b')
fig.savefig(outfile + "blank.png", dpi=150)
and
f = sns.relplot(data=data.melt(id_vars='Time', value_name='Spannung (V)'), x='Time', y='Spannung (V)', col='variable', col_wrap=2, kind='line', marker='o')
f.fig.savefig("out.png")
But these only show the data either before or after the outlier removal, not both. I would like to plot the retained data in blue and the outliers in red.
The outliers are removed with:
import numpy as np
from scipy import stats
cols = list(df.drop(columns='Time').columns)
# or
# cols = ['60Ni', '61Ni', '62Ni', '63Cu', '64Ni', '65Cu', '66Zn']
df[cols] = df[cols].where(np.abs(stats.zscore(df[cols])) < 2)
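To see what the where(...) call above does, here is a minimal sketch on made-up numbers (the values are invented for illustration, not taken from the real measurements). It computes the z-score by hand with the population standard deviation (ddof=0), which is the default that scipy.stats.zscore uses, so it should flag the same values. The names is_outlier and cleaned are only for this example.

```python
import numpy as np
import pandas as pd

# Toy data (made up for illustration): one obvious spike in each column.
df = pd.DataFrame({
    "Time": np.arange(10, dtype=float),
    "60Ni": [0.010, 0.011, 0.009, 0.010, 0.050, 0.010, 0.011, 0.009, 0.010, 0.010],
    "63Cu": [0.005, 0.005, 0.006, 0.005, 0.005, 0.005, 0.030, 0.005, 0.006, 0.005],
})
cols = [c for c in df.columns if c != "Time"]

# Column-wise z-score with the population standard deviation (ddof=0),
# matching the default behaviour of scipy.stats.zscore.
z = (df[cols] - df[cols].mean()) / df[cols].std(ddof=0)

is_outlier = z.abs() >= 2               # True where a value is flagged
cleaned = df[cols].where(~is_outlier)   # flagged values become NaN

print(is_outlier.sum())                 # number of flagged values per column
```

Keeping the boolean mask around (instead of overwriting df straight away) is what makes it possible to colour the removed points later.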
Solution
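Although it is not the most logical method, you can first draw the data from before the outlier exclusion in red and then draw the cleaned data in blue on top. The blue points cover every retained value, while the outliers are NaN in the cleaned data, so nothing is drawn over them and they stay red.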
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy import stats

cols = list(df.drop(columns='Time').columns)
# or
# cols = ['60Ni', '61Ni', '62Ni', '63Cu', '64Ni', '65Cu', '66Zn']

# dfo: copy of the data with outliers (|z-score| >= 2) replaced by NaN
dfo = pd.DataFrame({'Time': df['Time']})
dfo[cols] = df[cols].mask(np.abs(stats.zscore(df[cols])) >= 2)

fig, axes = plt.subplots(2, 2, sharex=True, figsize=(12, 8), dpi=100)
fig.suptitle('Outlier graph')
# zip stops at the shorter sequence, so this 2x2 grid shows the first four columns
for c, ax in zip(cols, axes.flatten()):
    ax.scatter(df['Time'], df[c], color='r')    # all points, including outliers
    ax.scatter(dfo['Time'], dfo[c], color='b')  # retained points drawn on top
    ax.set(xlim=(0, 32), ylim=(0, 0.01))
    ax.set_xlabel('Time (s)')
    ax.set_ylabel('Spannung (V)')
    # '60Ni' -> '$^{60}$Ni' so the mass number is typeset as a superscript
    ax.set_title('$^{{{}}}${}'.format(c[:2], c[2:]), color='b')
#fig.savefig(outfile + "blank.png", dpi=150)
plt.show()
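Since overplotting is described above as not the most logical method, here is a hedged sketch of an alternative: build a boolean outlier mask once and scatter the kept and removed points separately. It also uses a 4x2 grid so that all seven columns get a panel. The function name plot_outliers, its parameters, and the random stand-in data are assumptions made for this example; the z-score is computed by hand with ddof=0, the default of scipy.stats.zscore.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_outliers(df, cols, threshold=2.0, ncols=2):
    """Scatter each column against Time; kept points blue, flagged points red.

    Hypothetical helper, not part of the original answer.
    """
    z = (df[cols] - df[cols].mean()) / df[cols].std(ddof=0)
    is_outlier = z.abs() >= threshold

    nrows = -(-len(cols) // ncols)  # ceiling division
    fig, axes = plt.subplots(nrows, ncols, sharex=True,
                             figsize=(12, 3 * nrows), squeeze=False)
    flat = axes.flatten()
    for ax, c in zip(flat, cols):
        keep = ~is_outlier[c]
        ax.scatter(df.loc[keep, "Time"], df.loc[keep, c], color="b", label="kept")
        ax.scatter(df.loc[~keep, "Time"], df.loc[~keep, c], color="r", label="removed")
        ax.set_xlabel("Time (s)")
        ax.set_ylabel("Spannung (V)")  # label kept from the question (German for voltage)
        # Split e.g. '60Ni' into mass number and element for a superscript title.
        ax.set_title("$^{{{}}}${}".format(c[:2], c[2:]))
    for ax in flat[len(cols):]:
        ax.set_visible(False)  # hide unused panels (7 columns on a 4x2 grid)
    flat[0].legend()
    return fig, is_outlier


# Synthetic stand-in data (random values, not the asker's measurements).
rng = np.random.default_rng(0)
cols = ["60Ni", "61Ni", "62Ni", "63Cu", "64Ni", "65Cu", "66Zn"]
df = pd.DataFrame({"Time": np.linspace(0, 32, 100)})
for c in cols:
    df[c] = rng.normal(0.005, 0.0005, size=len(df))
df.loc[10, "60Ni"] = 0.02  # inject one obvious spike

fig, is_outlier = plot_outliers(df, cols)
```

Because the two groups are drawn from disjoint subsets, the result does not depend on drawing order, and the legend can name them directly.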
To read in the data files in a loop and save one graph per file:
for c in cols:
    data = pd.read_csv(c+'.csv', ...)
    for ax in axes.flatten():
        ...
    fig.savefig(c+'_blank.png',dpi=150)
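The loop outlined above could be filled in roughly as follows. This is only a sketch under assumptions: each CSV is assumed to have a Time column plus one column per isotope, and the helper name save_outlier_plot, the "_outliers.png" suffix, and the demo file names run1/run2 are all made up for illustration. The demo writes random data to a temporary directory so that it runs on its own.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def save_outlier_plot(csv_path, out_dir, threshold=2.0):
    """Read one CSV with a 'Time' column and save a red/blue scatter figure.

    Hypothetical helper; the file layout is an assumption.
    """
    df = pd.read_csv(csv_path)
    cols = [c for c in df.columns if c != "Time"]
    z = (df[cols] - df[cols].mean()) / df[cols].std(ddof=0)
    is_outlier = z.abs() >= threshold

    fig, axes = plt.subplots(len(cols), 1, sharex=True,
                             figsize=(8, 2.5 * len(cols)), squeeze=False)
    for ax, c in zip(axes.flatten(), cols):
        # One colour per point: red for flagged values, blue otherwise.
        colors = ["r" if flag else "b" for flag in is_outlier[c]]
        ax.scatter(df["Time"], df[c], c=colors)
        ax.set_ylabel(c)
    axes.flatten()[-1].set_xlabel("Time (s)")

    out_path = Path(out_dir) / (Path(csv_path).stem + "_outliers.png")
    fig.savefig(out_path, dpi=150)
    plt.close(fig)  # free memory when looping over many files
    return out_path


# Demo with synthetic files in a temporary directory (names are made up).
rng = np.random.default_rng(1)
with tempfile.TemporaryDirectory() as tmp:
    for name in ["run1", "run2"]:
        demo = pd.DataFrame({
            "Time": np.linspace(0, 32, 50),
            "60Ni": rng.normal(0.01, 0.001, 50),
            "63Cu": rng.normal(0.005, 0.0005, 50),
        })
        demo.to_csv(Path(tmp) / f"{name}.csv", index=False)

    saved = [save_outlier_plot(p, tmp) for p in sorted(Path(tmp).glob("*.csv"))]
    saved_names = [p.name for p in saved]
    all_exist = all(p.exists() for p in saved)
```

Creating a fresh figure per file and closing it afterwards avoids points from one file leaking into the next file's image.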
Answered By - r-beginners