Why do most published works in medical imaging try to reduce false positives?
In medical image processing, most published works try to reduce the false positive rate (FPR), while in reality false negatives are more dangerous than false positives. What is the rationale behind this?
Tags: image-classification, image-recognition
– SoK
5 Answers
You know the story of the boy who cried wolf, right?
It's the same idea. After a classifier gives false alarms (cries wolf) enough times, the medical staff will turn it off or ignore it.
"Oh, this again! NOPE!"
At least with the bioengineering group I've worked with, the emphasis is on reducing FPR specifically because the goal is to make a tool that will alert physicians to potential pathology, and they've told us that they will ignore a product that cries wolf too much.
For a product that aids physicians, we have to appeal to their psychology, despite the legitimate argument that missing the wolf on the farm is worse than crying wolf.
Edit: decreasing false positives also has a legitimate argument of its own. If your computer keeps crying wolf, with only the occasional alarm being a true positive (even though it catches most of the true positives), it's effectively just saying that someone might be sick. They're already in the hospital; the physician already knows that the patient might be sick.
– Dave
TL;DR: diseases are rare, so the absolute number of false positives is a lot more than that of false negatives.
Let's assume that our system has false positive and false negative rates of 1% each (pretty good!), and that we're detecting the presence of new cancers this year: 439.2 per 100,000 people, or roughly 0.5% of the population. [source]
- No cancer, no detection: 99.5% x 99% = 98.5% (98.505%)
- No cancer, detection: 99.5% x 1% = 1.0% (0.995%)
- Cancer, detection: 0.5% x 99% = 0.5% (0.495%)
- Cancer, no detection: 0.5% x 1% = 0.005%
So we can see that we have a problem: for everyone who has cancer, two people who didn't have cancer wind up with invasive surgery, chemotherapy or radiotherapy.
For every person who fails to have a present cancer detected, two hundred people receive actively harmful treatment they didn't need and can't really afford.
– Dragon
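The arithmetic above can be reproduced with a minimal sketch; the prevalence, FPR and FNR are the figures quoted in this answer, and the function and variable names are illustrative only:

```python
# Confusion-matrix breakdown for a rare disease, using the figures quoted above:
# ~0.5% prevalence, 1% false positive rate, 1% false negative rate.

def outcome_fractions(prevalence, fpr, fnr):
    """Return the fraction of the screened population in each outcome cell."""
    healthy = 1.0 - prevalence
    return {
        "no cancer, no detection (TN)": healthy * (1.0 - fpr),
        "no cancer, detection (FP)": healthy * fpr,
        "cancer, detection (TP)": prevalence * (1.0 - fnr),
        "cancer, no detection (FN)": prevalence * fnr,
    }

cells = outcome_fractions(prevalence=0.005, fpr=0.01, fnr=0.01)
for name, frac in cells.items():
    print(f"{name}: {frac:.3%}")

# Roughly 2 false positives per true positive ...
print("FP per TP:", cells["no cancer, detection (FP)"] / cells["cancer, detection (TP)"])
# ... and roughly 200 false positives per false negative.
print("FP per FN:", cells["no cancer, detection (FP)"] / cells["cancer, no detection (FN)"])
```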
For many screening applications the incidence (number of newly diagnosed cases per 100,000 population) is actually even lower: the 0.5 % is the total cancer incidence, whereas screening programs target specific types of cancer.
– cbeleites
@cbeleites, to take a concrete example, pancreatic adenocarcinoma is nearly always fatal because it's asymptomatic until it reaches an advanced stage. If you were to apply a screening test with a 1% false positive/1% false negative rate to the entire population of the United States, you'd identify about three million cases, of which only 46,000 actually have the cancer, giving a positive predictive value of only 1.5%.
– Mark
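The positive predictive value quoted in the comment above can be checked with the same kind of back-of-the-envelope arithmetic; the US population figure of roughly 330 million is an assumption made here for illustration, while the ~46,000 annual cases come from the comment itself:

```python
# Rough PPV check for population-wide screening of pancreatic adenocarcinoma
# with a 1% FPR / 1% FNR test. The 330 million population is an assumption;
# the ~46,000 annual cases figure is the one quoted in the comment.

population = 330_000_000
cases = 46_000
fpr = fnr = 0.01

true_pos = cases * (1 - fnr)            # cancers the test catches
false_pos = (population - cases) * fpr  # healthy people flagged anyway
ppv = true_pos / (true_pos + false_pos)

print(f"flagged positive: {true_pos + false_pos:,.0f}")  # ~3.3 million people
print(f"PPV: {ppv:.1%}")                                  # ~1.4%, i.e. on the order of 1.5%
```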
Summary: the question probably* isn't whether one false negative is worse than one false positive, it's probably* more like whether 500 false positives are acceptable to get down to one false negative.
* depends on the application
Let me expand a bit on @Dragon's answer:
Screening means that we're looking for disease among a seemingly healthy population. As @Dragon explained, for these applications we need an extremely low FPR (i.e., a high specificity), otherwise we'll end up with many more false positives than true positives; that is, the positive predictive value (the fraction of truly diseased among all who are diagnosed positive) would be unacceptably low.
Sensitivity (TPR) and Specificity (TNR) are easy to measure for a diagnostic system: take a number of truly (non)diseased cases and measure the fraction of correctly detected ones.
OTOH, from both the doctors' and the patients' point of view, the predictive values are more to the point. They are the "inverse" of sensitivity and specificity and tell you, among all positive (negative) predictions, what fraction is correct. In other words: after the test says "disease", what is the probability that the patient actually has the disease?
As @Dragon showed you, the incidence (or prevalence, depending on what test we're talking about) plays a crucial role here. Incidence is low in all kinds of screening/early cancer diagnosis applications.
To illustrate this, ovarian cancer screening for post-menopausal women has a prevalence of 0.04 % in the general population and 0.5 % in high-risk women with a family history and/or known mutations of the tumor suppressor genes BRCA1 and BRCA2 [Buchen, L. Cancer: Missing the mark. Nature, 2011, 471, 428-432]. So the question is typically not whether one false negative is worse than one false positive: even with 99 % specificity (1 % FPR) and 95 % sensitivity (numbers taken from the paper cited above), you get roughly 500 false positives for each false negative.
As a side note, also keep in mind that early cancer diagnosis in itself is no magic cure for cancer. E.g. for breast cancer screening mammography, only 3 - 13 % of the true positive patients actually benefit from the screening.
So we also need to keep an eye on the number of false positives for each benefitting patient. E.g. for mammography, taken together with these numbers, a rough guesstimate is that we have somewhere in the range of 400 - 1800 false positives per benefitting true positive (39 - 49 year old group).

With hundreds of false positives per false negative (and maybe hundreds or even thousands of false positives per patient benefitting from the screening), the situation isn't as clear as "is one missed cancer worse than one false-positive cancer diagnosis": false positives do have an impact, ranging from psychological and psychosomatic (worrying that you have cancer in itself isn't healthy) to the physical risks of follow-up diagnostics such as biopsy (which is a small surgery, and as such comes with its own risks).

Even if the impact of one false positive is small, the corresponding risks may add up substantially when hundreds of false positives have to be weighed against each benefit. Suggested reading: Gerd Gigerenzer, Risk Savvy: How to Make Good Decisions (2014).
Still, what PPV and NPV are needed to make a diagnostic test useful is highly dependent on the application.
As explained, in screening for early cancer detection the focus is usually on PPV, i.e. making sure the false positives do not cause too much harm; false negatives are more tolerable there, since finding a sizeable fraction (even if not all) of the early cancer patients is already an improvement over the status quo without screening.
OTOH, an HIV test on blood donations focuses first on NPV (i.e. making sure the blood is HIV-free). Still, in a second (and third) step, false positives are then reduced by applying further tests before worrying people with (false) positive HIV test results.

Last but not least, there are also medical testing applications where the incidences or prevalences aren't as extreme as they usually are when screening not-particularly-high-risk populations, e.g. some differential diagnoses.
– cbeleites
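The "roughly 500 false positives per false negative" figure follows directly from the numbers above; a minimal sketch of that calculation, using the 0.04 % prevalence, 99 % specificity and 95 % sensitivity quoted in this answer (per 100,000 screened women, names chosen for illustration):

```python
# False positives per false negative for ovarian cancer screening,
# using the figures quoted above: prevalence 0.04%, specificity 99%, sensitivity 95%.

n = 100_000                # screened women
prevalence = 0.0004
specificity = 0.99         # TNR, so FPR = 1%
sensitivity = 0.95         # TPR, so FNR = 5%

diseased = n * prevalence                 # 40 women with the cancer
healthy = n - diseased                    # 99,960 without

false_pos = healthy * (1 - specificity)   # ~1,000
false_neg = diseased * (1 - sensitivity)  # 2
true_pos = diseased * sensitivity         # 38

print("false positives per false negative:", false_pos / false_neg)  # ~500
print(f"PPV: {true_pos / (true_pos + false_pos):.1%}")                # ~3.7%
```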
The false positive rate (FPR) is also known as the false alarm rate (FAR). A large false positive rate leads to poor performance of a medical image detection system. A false positive is when you receive a positive result for a test when you should have received a negative result: for example, a pregnancy test that comes back positive when in fact the person isn't pregnant.
– EricAtHaufe
This is not answering the question. OP is not asking what false positive means, but why it's deemed more important than false negative.
– Llewellyn
Clinician's time is precious
Within the field of medicine, clinicians often have a wide variety of illnesses to try to detect and diagnose, and this is a time-consuming process. A tool that produces false positives (even at a low rate) is less useful because it's not possible to trust its diagnosis, meaning every time it makes that diagnosis, the result needs to be checked. Think of it like the WebMD of software: everything is a sign of cancer!
A tool that produces some false negatives but whose positive calls are always correct is far more useful, as a clinician doesn't need to waste time double-checking or second-guessing the diagnosis. If it marks someone as having a specific illness, job done. If it doesn't, the people who aren't highlighted as being ill will receive additional tests anyway.
It's better to have a tool that can accurately identify even a single trait of an illness, than a tool that maybe fudges multiple traits.
– SSight3