What is metrics.roc_curve and metrics.auc measuring when I'm comparing binary data with probability estimates?
I was working on a challenge, and I was excited because the metrics.auc score for my predicted values compared to my test values was very high. This was for a binary selection process.
However, when I looked more closely, the predicted values output by my logistic regression were actually probabilities, not binary values.
So I rounded them, as the challenge requires binary predictions. When I rounded them, the AUC score dropped drastically.
My understanding of the AUC score and the ROC curve is that they compare false positives, false negatives, and so on, so I don't even know how it came up with an actual value for these probabilistic predictions.
What was it computing before, and why was it so high?
Tags: logistic, python, auc
asked 2 hours ago by Brian Rushton
1 Answer
When you round the predicted probabilities up or down, you are essentially using 0.5 as the threshold for your classification. An ROC curve does this not for one threshold but for every possible threshold: the true positive rate is plotted against the false positive rate at each threshold, and the AUC is the area under (the integral of) that curve. Once you round, only a single threshold is left, so the curve collapses to one operating point joined to (0, 0) and (1, 1) by straight lines, which is why the score changed.
If the challenge requires binary predictions, it is unlikely to use AUC as its performance measure.
answered 1 hour ago by lnathan (edited 1 hour ago)
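To make the threshold sweep in the answer concrete, here is a minimal pure-Python sketch. It is not scikit-learn's implementation: the helper names `roc_points` and `trapezoid_auc` and the six-row toy data are made up for illustration. In practice, `sklearn.metrics.roc_curve` followed by `sklearn.metrics.auc` plays the same role.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) pairs, one per distinct threshold, starting at (0, 0)."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    # Sweep thresholds from the highest score down.
    # A case is predicted positive when its score is >= the threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the (fpr, tpr) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy data: true labels and predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1]
probs = [0.10, 0.30, 0.35, 0.40, 0.45, 0.90]

# "Rounding" is the same as applying the single threshold 0.5.
rounded = [1 if p >= 0.5 else 0 for p in probs]

auc_from_probs = trapezoid_auc(roc_points(y_true, probs))
auc_from_rounded = trapezoid_auc(roc_points(y_true, rounded))

print("AUC from probabilities:", auc_from_probs)
print("AUC from rounded predictions:", auc_from_rounded)
```

On this toy data the area is 8/9 (about 0.889) for the probabilities and 2/3 (about 0.667) after rounding. The probabilities give six distinct thresholds to sweep over, while the rounded predictions leave only one operating point between (0, 0) and (1, 1).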
Congrats on hitting 1000!
– Matthew Drury, 1 hour ago
The $c$-index (concordance probability; area under ROC curve) is a decent pure measure of predictive discrimination when computed on the continuous probabilities and the binary outcomes. But proper accuracy scoring rules in this case are the Brier score and the pseudo $R^2$, which are more sensitive because they give more credit to extreme probabilities that are "right".
– Frank Harrell, 1 hour ago
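As a rough illustration of the comment above, the following pure-Python sketch computes the concordance probability and the Brier score for two hypothetical sets of predictions that rank the cases identically. The function names and the data are assumptions made for this example; scikit-learn offers `roc_auc_score` and `brier_score_loss` for the same quantities.

```python
def concordance(y_true, scores):
    """Share of (positive, negative) pairs where the positive case scores higher.

    Ties count as half. This equals the area under the ROC curve.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    total = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos) * len(neg))


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


# Hypothetical data: both models rank every positive above every negative,
# but one of them commits to much more extreme probabilities.
y_true = [0, 0, 1, 1]
timid = [0.45, 0.48, 0.52, 0.55]
confident = [0.05, 0.10, 0.90, 0.95]

for name, probs in [("timid", timid), ("confident", confident)]:
    print(name, "c-index:", concordance(y_true, probs),
          "Brier:", brier_score(y_true, probs))
```

Both sets of predictions have a c-index of 1.0 because only the ordering matters for discrimination, while the Brier score is far lower (better) for the confident, correct probabilities.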