Monday, September 12, 2022

[FIXED] Matplotlib: Why does interpolated points fall outside the plotted line?

September 12, 2022 matplotlib, numpy, pandas, python No comments

Issue

I have recreated a common geoscientific plot using Matplotlib. It shows the grain size distribution of a soil sample and is used for soil classification.

Basically, a soil sample is placed in a stack of sieves, which is then shaked for a certain amount of time, and the remaining weight of each grain fraction is then plotted onto the diagram (see attached image below).

An important use for this type of diagram, is to determine two parameters known as D60 and D10, which is the grain size at 60 and 10 percent passing, respectively (see orange dots in diagram). I have interpolated these values with a function using np.interp, but oddly enough these points fall outside of the line plotted by Matplotlib. Can anyone give me a hint where I'm going wrong with this? They should intersect the line where y = 10 and y = 60 exactly.

The data looks like this:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FormatStrFormatter

d = {
    'x': [0.063, 0.125, 0.250, 0.500, 1.000, 2.000, 4.000, 8.000],
    'y': [5.9, 26.0, 59.0, 87.0, 95.0, 97.0, 97.0, 100.0]
}

df = pd.DataFrame(d)

df
    x   y
0   0.063   5.9
1   0.125   26.0
2   0.250   59.0
3   0.500   87.0
4   1.000   95.0
5   2.000   97.0
6   4.000   97.0
7   8.000   100.0

The function for interpolating values looks like this (I have tried a similar approach using Scipy with the same results):

def interpolate(xval, df, xcol, ycol):
    return np.interp([xval], df[ycol], df[xcol])

The code for creating the plot itself looks like this:

fig, ax = plt.subplots(figsize=(10,5))

ax.scatter(df['x'], df['y']) #Show datapoints

# Beginning of table
cell_text = [
    ['Clay','FSi','MSi','CSi', 'FSa', 'MSa', 'CSa', 'FGr', 'MGr', 'CGr', 'Co']
]

table = ax.table(
    cellText=cell_text,
    colWidths=[0.06, 0.1, 0.1, 0.1,0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.04],
    cellLoc = 'center',
    rowLoc = 'center',
    loc='top')

h = table.get_celld()[(0,0)].get_height()
w = table.get_celld()[(0,0)].get_width()

header = [table.add_cell(-1,pos, 0.1, h, loc="center", facecolor="none") for pos in [i for i in range(1,10)]]

table.auto_set_font_size(False)
table.set_fontsize(12)
table.scale(1, 1.5)

for i in [0,3,6]:
    header[i].visible_edges = 'TBL'
for i in [1,4,7]:
    header[i].visible_edges = 'TB'
for i in [2,5,8]:
    header[i].visible_edges = 'TBR'

header[1].get_text().set_text('Silt')
header[4].get_text().set_text('Sand')
header[7].get_text().set_text('Gravel')
# End of table


plt.grid(b=True, which='major', color='k', linestyle='--', alpha=0.5)
plt.grid(b=True, which='minor', color='k', linestyle='--', alpha=0.5)

ax.set_yticks(np.arange(0, 110, 10))
ax.set_xscale('log')
ax.xaxis.set_major_formatter(FormatStrFormatter('%g'))
ax.axis(xmin=0.001,xmax=100, ymin=0, ymax=100)

#Interpolate D10 and D60
x2 = np.concatenate((interpolate(10, df, 'x', 'y'), interpolate(60, df, 'x', 'y')))
y2 = np.array([10,60])

#Plot D10 and D60
ax.scatter(x2, y2)

#Plot the line
ax.plot(df['x'], df['y'])

ax.set_xlabel('Grain size (mm)'), ax.set_ylabel('Percent passing (%)')

Can anyone help me figure out why the orange dots fall slightly outside of the lines, what am I doing wrong? Thanks!

Solution

The problem is that you are using linear interpolation to find the points, while the plot has straight lines on a log scale. This can be accomplished via interpolation in log space:

def interpolate(yval, df, xcol, ycol):
    return np.exp(np.interp([yval], df[ycol], np.log(df[xcol])))

If you furthermore write np.array(yval) instead of [yval], vector x2 can be simplified. Providing a z-order of 3 draws the new dots on top of the line. Optionally some text could be added:

def interpolate(yval, df, xcol, ycol):
    return np.exp(np.interp(np.array(yval), df[ycol], np.log(df[xcol])))

y2 = [10, 60]
x2 = interpolate(y2, df, 'x', 'y')
ax.scatter(x2, y2, zorder=3, color='crimson')
for x, y in zip(x2, y2):
    ax.text(x, y, f' D{y}={x:.4f}', color='crimson', ha='left', va='top', size=12)

Answered By - JohanC

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Monday, September 12, 2022

[FIXED] Matplotlib: Why does interpolated points fall outside the plotted line?

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels