Issue
In scikit-learn you can compute the area under the ROC curve for a binary classifier with
roc_auc_score( Y, clf.predict_proba(X)[:,1] )
I am only interested in the part of the curve where the false positive rate is less than 0.1.
Given such a threshold false positive rate, how can I compute the AUC only for the part of the curve up to the threshold?
Here is an example with several ROC curves, for illustration (figure not included):
The scikit-learn docs show how to use roc_curve:
>>> import numpy as np
>>> from sklearn import metrics
>>> y = np.array([1, 1, 2, 2])
>>> scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)
>>> fpr
array([ 0. , 0.5, 0.5, 1. ])
>>> tpr
array([ 0.5, 0.5, 1. , 1. ])
>>> thresholds
array([ 0.8 , 0.4 , 0.35, 0.1 ])
Is there a simple way to go from this to the partial AUC?
It seems the only problem is how to compute the tpr value at fpr = 0.1 as roc_curve doesn't necessarily give you that.
Solution
scikit-learn's roc_auc_score() now accepts a max_fpr argument. In your case you can set max_fpr=0.1, and the function will return the standardized partial AUC over the range [0, max_fpr] for you. See https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html
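For illustration, here is a minimal sketch using the toy data from the question. It assumes a scikit-learn version in which roc_auc_score supports max_fpr. One thing to be aware of: with max_fpr set, the value returned is the standardized partial AUC (rescaled so that 0.5 is chance level and 1.0 is perfect), not the raw area under the curve up to that FPR. If you want the raw area, one way is to cut the curve from roc_curve at max_fpr yourself, interpolating the TPR linearly at the cut point. The variable names below (raw_pauc, rescaled, and so on) are just for this example.

```python
import numpy as np
from sklearn import metrics

# Toy data from the question; label 2 is the positive class.
y = np.array([1, 1, 2, 2])
scores = np.array([0.1, 0.4, 0.35, 0.8])
max_fpr = 0.1  # assumed to satisfy 0 < max_fpr < 1

# Convert to 0/1 labels so the positive class is unambiguous.
y_true = (y == 2).astype(int)

# Built-in route: standardized partial AUC over [0, max_fpr].
standardized = metrics.roc_auc_score(y_true, scores, max_fpr=max_fpr)

# Manual route: raw area under the ROC curve for FPR in [0, max_fpr].
fpr, tpr, _ = metrics.roc_curve(y_true, scores)

# Index of the first ROC point whose FPR is strictly greater than max_fpr.
stop = np.searchsorted(fpr, max_fpr, side="right")

# Linearly interpolate the TPR at max_fpr between the two neighbouring points.
tpr_at_max = np.interp(max_fpr, fpr[stop - 1:stop + 1], tpr[stop - 1:stop + 1])

# Truncate the curve at max_fpr and integrate with the trapezoidal rule.
fpr_cut = np.append(fpr[:stop], max_fpr)
tpr_cut = np.append(tpr[:stop], tpr_at_max)
raw_pauc = metrics.auc(fpr_cut, tpr_cut)

# Rescale the raw area so that chance level maps to 0.5 and a perfect
# classifier maps to 1.0; this should match the built-in result.
min_area = 0.5 * max_fpr ** 2
max_area = max_fpr
rescaled = 0.5 * (1 + (raw_pauc - min_area) / (max_area - min_area))

print("standardized partial AUC:", standardized)
print("raw partial AUC:", raw_pauc)
print("raw partial AUC rescaled by hand:", rescaled)
```

The raw value can never exceed max_fpr, so it is hard to compare across different cut-offs; the standardized value is usually the easier one to report.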
Answered By - Cherry Wu