Why do most published works in medical imaging try to reduce false positives?


In medical image processing, most published works try to reduce the false positive rate (FPR), even though in reality false negatives are more dangerous than false positives. What is the rationale behind this?










Tags: image-classification, image-recognition






asked 22 hours ago by SoK
5 Answers







You know the story of the boy who cried wolf, right?

It's the same idea. After a classifier gives false alarms (cries wolf) too many times, the medical staff will turn it off or ignore it.

"Oh, this again! NOPE!"

At least with the bioengineering group I've worked with, the emphasis is on reducing FPR specifically because the goal is to make a tool that will alert physicians to potential pathology, and they've told us that they will ignore a product that cries wolf too much.

For a product that aids physicians, we have to appeal to their psychology, despite the legitimate argument that missing the wolf on the farm is worse than crying wolf.

Edit: There is also a legitimate argument for decreasing false positives. If your computer keeps crying wolf while getting the occasional true positive (and catching most of the true positives), it's effectively just saying that someone might be sick. They're already in the hospital, and the physician already knows that the patient might be sick.






– Dave

TL;DR: diseases are rare, so the absolute number of false positives is far larger than the number of false negatives.

Let's assume that our system has equal false positive and false negative rates of 1% (pretty good!), and that we're detecting the presence of new cancers this year: 439.2 per 100,000 people, or about 0.5% of the population. [source]

• No cancer, no detection: 99.5% x 99% = 98.5% (98.505%)
• No cancer, detection: 99.5% x 1% = 1.0% (0.995%)
• Cancer, detection: 0.5% x 99% = 0.5% (0.495%)
• Cancer, no detection: 0.5% x 1% = 0.005%

So we can see that we have a problem: for everyone who has cancer, two people who didn't have cancer wind up with invasive surgery, chemotherapy or radiotherapy.

For every person who fails to have a present cancer detected, two hundred people receive actively harmful treatment they didn't need and can't really afford.






– Dragon (new contributor)












• For many screening applications the incidence (number of newly diagnosed cases per 100,000 population) is actually even lower: the 0.5 % figure is total cancer incidence, whereas screening programs target specific types of cancer. – cbeleites, 11 hours ago






• @cbeleites, to take a concrete example, pancreatic adenocarcinoma is nearly always fatal because it's asymptomatic until it reaches an advanced stage. If you were to apply a screening test with a 1% false positive / 1% false negative rate to the entire population of the United States, you'd identify about three million cases, of which only 46,000 actually have the cancer, giving a positive predictive value of only 1.5%. – Mark, 9 hours ago
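To make the arithmetic above easy to check, here is a minimal Python sketch that reproduces the numbers from this answer and from the comments; the 330-million US population and the 46,000 annual pancreatic-cancer cases are assumed round figures used only for the comment's example.

```python
# Minimal sketch: outcome fractions for a screened population, given the
# disease prevalence and the test's false positive / false negative rates.

def confusion_fractions(prevalence, fpr, fnr):
    """Return (TN, FP, TP, FN) as fractions of the screened population."""
    tn = (1 - prevalence) * (1 - fpr)   # no cancer, no detection
    fp = (1 - prevalence) * fpr         # no cancer, detection (false alarm)
    tp = prevalence * (1 - fnr)         # cancer, detection
    fn = prevalence * fnr               # cancer, no detection (missed case)
    return tn, fp, tp, fn

# The answer's example: 0.5% prevalence of new cancers, 1% FPR, 1% FNR.
tn, fp, tp, fn = confusion_fractions(prevalence=0.005, fpr=0.01, fnr=0.01)
print(f"TN={tn:.3%}  FP={fp:.3%}  TP={tp:.3%}  FN={fn:.3%}")
print(f"false positives per detected cancer: {fp / tp:.1f}")   # ~2
print(f"false positives per missed cancer:   {fp / fn:.0f}")   # ~200

# Mark's comment: pancreatic adenocarcinoma screening of the whole US
# population. Population (~330 million) and annual case count (~46,000)
# are assumed figures; the comment's numbers come out in the same ballpark.
population, cases = 330e6, 46_000
tn, fp, tp, fn = confusion_fractions(prevalence=cases / population, fpr=0.01, fnr=0.01)
flagged = (fp + tp) * population
print(f"people flagged: {flagged:,.0f}  PPV: {tp / (tp + fp):.1%}")  # ~3.3 million, ~1.4%
```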



















Summary: the question probably* isn't whether one false negative is worse than one false positive; it's probably* more like whether 500 false positives are acceptable to get down to one false negative.

* depends on the application

Let me expand a bit on @Dragon's answer:

• Screening means that we're looking for disease among a seemingly healthy population. As @Dragon explained, for this we need an extremely low FPR (i.e. high specificity), otherwise we'll end up with many more false positives than true positives. I.e., the positive predictive value (fraction of truly diseased among all diagnosed positive) would be unacceptably low.

• Sensitivity (TPR) and specificity (TNR) are easy to measure for a diagnostic system: take a number of truly (non)diseased cases and measure the fraction of correctly detected ones.

• OTOH, both from the doctors' and the patients' point of view, the predictive values are more to the point. They are the "inverse" of sensitivity and specificity and tell you, among all positive (negative) predictions, what fraction is correct. In other words, after the test says "disease", what is the probability that the patient actually does have the disease?

• As @Dragon showed you, the incidence (or prevalence, depending on what test we're talking about) plays a crucial role here. Incidence is low in all kinds of screening/early cancer diagnosis applications.

  To illustrate this, ovarian cancer screening for post-menopausal women has a prevalence of 0.04 % in the general population and 0.5 % in high-risk women with family history and/or known mutations of the tumor suppressor genes BRCA1 and 2 [Buchen, L. Cancer: Missing the mark. Nature, 2011, 471, 428-432].

• So the question is typically not whether one false negative is worse than one false positive: even 99 % specificity (1 % FPR) and 95 % sensitivity (numbers taken from the paper cited above) mean roughly 500 false positives for each false negative (a short numeric sketch of this calculation follows this answer).

• As a side note, also keep in mind that early cancer diagnosis in itself is no magic cure for cancer. E.g. for breast cancer screening by mammography, only 3 - 13 % of the true positive patients actually benefit from the screening.

  So we also need to keep an eye on the number of false positives for each benefitting patient. E.g. for mammography, putting these numbers together, a rough guesstimate is that we have somewhere in the range of 400 - 1800 false positives per benefitting true positive (39 - 49 year old group).

• With hundreds of false positives per false negative (and also maybe hundreds or even thousands of false positives per patient benefitting from the screening) the situation isn't as clear as "is one missed cancer worse than one false positive cancer diagnosis": false positives do have an impact, ranging from psychological and psycho-somatic (worrying that you have cancer in itself isn't healthy) to the physical risks of follow-up diagnostics such as biopsy (which is a small surgery, and as such comes with its own risks).

  Even if the impact of one false positive is small, the corresponding risks may add up substantially if hundreds of false positives have to be considered.

  Suggested reading: Gerd Gigerenzer: Risk Savvy: How to Make Good Decisions (2014).

• Still, what PPV and NPV are needed to make a diagnostic test useful is highly dependent on the application.

  As explained, in screening for early cancer detection the focus is usually on PPV, i.e. making sure you do not cause too much harm by false positives: finding a sizeable fraction (even if not all) of the early cancer patients is already an improvement over the status quo without screening.

  OTOH, HIV testing of blood donations focuses first on NPV (i.e. making sure the blood is HIV-free). Still, in a 2nd (and 3rd) step, false positives are then reduced by applying further tests before worrying people with (false) positive HIV test results.

• Last but not least, there are also medical testing applications where the incidences or prevalences aren't as extreme as they usually are in screening of not-particularly-high-risk populations, e.g. some differential diagnoses.






– cbeleites
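A minimal sketch of the calculation behind the 500 : 1 figure in the answer above, using the quoted screening numbers (0.04 % / 0.5 % prevalence, 95 % sensitivity, 99 % specificity); the function and variable names are just illustrative.

```python
# Sketch: positive/negative predictive values and the ratio of false positives
# to false negatives, for the ovarian-cancer screening numbers quoted above.

def screening_stats(prevalence, sensitivity, specificity):
    tp = prevalence * sensitivity              # diseased, correctly detected
    fn = prevalence * (1 - sensitivity)        # diseased, missed
    fp = (1 - prevalence) * (1 - specificity)  # healthy, false alarm
    tn = (1 - prevalence) * specificity        # healthy, correctly cleared
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv, fp / fn

for label, prev in [("general population", 0.0004), ("high-risk group", 0.005)]:
    ppv, npv, fp_per_fn = screening_stats(prev, sensitivity=0.95, specificity=0.99)
    print(f"{label}: PPV = {ppv:.1%}, NPV = {npv:.3%}, "
          f"false positives per false negative = {fp_per_fn:.0f}")
# general population: PPV ~3.7%, roughly 500 false positives per false negative
# high-risk group:    PPV ~32%,  roughly 40 false positives per false negative
```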

The false positive rate (FPR) is also known as the false alarm rate (FAR). A large false positive rate can lead to poor performance of a medical image detection system. A false positive is where you receive a positive result for a test when you should have received a negative result. For example, a pregnancy test comes back positive when in fact the person isn't pregnant.






– EricAtHaufe (new contributor)
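For concreteness, here is how the false positive rate (and the complementary false negative rate) would be computed from raw confusion-matrix counts; the counts below are invented purely for illustration.

```python
# FPR = FP / (FP + TN): the fraction of truly negative cases the test flags
# as positive (the "false alarm rate"). Counts are made-up example numbers.
fp, tn = 30, 970   # 1,000 healthy people, 30 wrongly flagged as positive
fn, tp = 2, 8      # 10 diseased people, 2 missed

fpr = fp / (fp + tn)   # false positive rate (false alarm rate)
fnr = fn / (fn + tp)   # false negative rate (miss rate)
print(f"FPR = {fpr:.1%}, FNR = {fnr:.1%}")   # FPR = 3.0%, FNR = 20.0%
```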








• This is not answering the question. The OP is not asking what a false positive means, but why it's deemed more important than a false negative. – Llewellyn, 12 hours ago



















Clinician's time is precious

From within the field of medicine, clinicians often have a wide variety of illnesses to try to detect and diagnose, and this is a time-consuming process. A tool that produces false positives (even at a low rate) is less useful because it's not possible to trust its diagnosis, meaning every time it makes that diagnosis, it needs to be checked. Think of it like the WebMD of software: everything is a sign of cancer!

A tool that may produce false negatives, but whose positive calls are always correct, is far more useful, as a clinician doesn't need to waste time double-checking or second-guessing the diagnosis. If it marks someone as being ill with a specific diagnosis, job done. If it doesn't, the people who aren't highlighted as being ill will receive additional tests anyway.

It's better to have a tool that can accurately identify even a single trait of an illness than a tool that maybe fudges multiple traits.






– SSight3 (new contributor)













                Your Answer








                StackExchange.ready(function()
                var channelOptions =
                tags: "".split(" "),
                id: "557"
                ;
                initTagRenderer("".split(" "), "".split(" "), channelOptions);

                StackExchange.using("externalEditor", function()
                // Have to fire editor after snippets, if snippets enabled
                if (StackExchange.settings.snippets.snippetsEnabled)
                StackExchange.using("snippets", function()
                createEditor();
                );

                else
                createEditor();

                );

                function createEditor()
                StackExchange.prepareEditor(
                heartbeatType: 'answer',
                autoActivateHeartbeat: false,
                convertImagesToLinks: false,
                noModals: true,
                showLowRepImageUploadWarning: true,
                reputationToPostImages: null,
                bindNavPrevention: true,
                postfix: "",
                imageUploader:
                brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
                contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
                allowUrls: true
                ,
                onDemand: true,
                discardSelector: ".discard-answer"
                ,immediatelyShowMarkdownHelp:true
                );



                );













                draft saved

                draft discarded


















                StackExchange.ready(
                function ()
                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f52445%2fwhy-do-most-published-works-in-medical-imaging-try-to-reduce-false-positives%23new-answer', 'question_page');

                );

                Post as a guest















                Required, but never shown

























                5 Answers
                5






                active

                oldest

                votes








                5 Answers
                5






                active

                oldest

                votes









                active

                oldest

                votes






                active

                oldest

                votes









                10












                $begingroup$

                You know the story of the boy who cried wolf, right?



                It's the same idea. After some classifier gives false alarms (cries wolf) so many times, the medical staff will turn it off or ignore it.



                "Oh, this again! NOPE!"



                At least with the bioengineering group I've worked with, the emphasis is on reducing FPR specifically because the goal is to make a tool that will alert physicians to potential pathology, and they've told us that they will ignore a product that cries wolf too much.



                For a product that aids physicians, we have to appeal to their psychology, despite the legitimate argument that missing the wolf on the farm is worse than crying wolf.



                Edit: Decreasing false positives also has a legitimate argument. If your computer keeps crying wolf while getting the occasional true positive (and catching most of the true positives), it's effectively saying that someone might be sick. They're in the hospital. The physician knows that the patient might be sick.






                share|improve this answer











                $endgroup$

















                  10












                  $begingroup$

                  You know the story of the boy who cried wolf, right?



                  It's the same idea. After some classifier gives false alarms (cries wolf) so many times, the medical staff will turn it off or ignore it.



                  "Oh, this again! NOPE!"



                  At least with the bioengineering group I've worked with, the emphasis is on reducing FPR specifically because the goal is to make a tool that will alert physicians to potential pathology, and they've told us that they will ignore a product that cries wolf too much.



                  For a product that aids physicians, we have to appeal to their psychology, despite the legitimate argument that missing the wolf on the farm is worse than crying wolf.



                  Edit: Decreasing false positives also has a legitimate argument. If your computer keeps crying wolf while getting the occasional true positive (and catching most of the true positives), it's effectively saying that someone might be sick. They're in the hospital. The physician knows that the patient might be sick.






                  share|improve this answer











                  $endgroup$















                    10












                    10








                    10





                    $begingroup$

                    You know the story of the boy who cried wolf, right?



                    It's the same idea. After some classifier gives false alarms (cries wolf) so many times, the medical staff will turn it off or ignore it.



                    "Oh, this again! NOPE!"



                    At least with the bioengineering group I've worked with, the emphasis is on reducing FPR specifically because the goal is to make a tool that will alert physicians to potential pathology, and they've told us that they will ignore a product that cries wolf too much.



                    For a product that aids physicians, we have to appeal to their psychology, despite the legitimate argument that missing the wolf on the farm is worse than crying wolf.



                    Edit: Decreasing false positives also has a legitimate argument. If your computer keeps crying wolf while getting the occasional true positive (and catching most of the true positives), it's effectively saying that someone might be sick. They're in the hospital. The physician knows that the patient might be sick.






                    share|improve this answer











                    $endgroup$



                    You know the story of the boy who cried wolf, right?



                    It's the same idea. After some classifier gives false alarms (cries wolf) so many times, the medical staff will turn it off or ignore it.



                    "Oh, this again! NOPE!"



                    At least with the bioengineering group I've worked with, the emphasis is on reducing FPR specifically because the goal is to make a tool that will alert physicians to potential pathology, and they've told us that they will ignore a product that cries wolf too much.



                    For a product that aids physicians, we have to appeal to their psychology, despite the legitimate argument that missing the wolf on the farm is worse than crying wolf.



                    Edit: Decreasing false positives also has a legitimate argument. If your computer keeps crying wolf while getting the occasional true positive (and catching most of the true positives), it's effectively saying that someone might be sick. They're in the hospital. The physician knows that the patient might be sick.







                    share|improve this answer














                    share|improve this answer



                    share|improve this answer








                    edited 16 hours ago

























                    answered 17 hours ago









                    DaveDave

                    1115




                    1115





















                        10












                        $begingroup$

                        TL;DR: diseases are rare, so the absolute number of false positives is a lot more than that of false negatives.



                        Let's assume that our system has the same false positive and false negative rate of 1% (pretty good!), and that we're detecting the presence of new cancers this year: 439.2 / 100,000 people, or 0.5% of the population. [source]



                        • No cancer, no detection: 99.5% x 99% = 98.5% (98.505%)

                        • No cancer, detection: 99.5% x 1% = 1.0% (0.995%)

                        • Cancer, detection: 0.5% x 99% = 0.5% (0.495%)

                        • Cancer, no detection: 0.5% x 1% = 0.005%

                        So we can see that we have a problem: for everyone who has cancer, two people who didn't have cancer wind up with invasive surgery, chemotherapy or radiotherapy.



                        For every person who fails to have a present cancer detected, two hundred people receive actively harmful treatment they didn't need and can't really afford.






                        share|improve this answer








                        New contributor



                        Dragon is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                        Check out our Code of Conduct.





                        $endgroup$












                        • $begingroup$
                          For many screening applications the incidence (no of newly diagnosed disease per 100000 population) is acually even lower: the 0.5 % is total cancer incidence whereas screening programs target specific types of cancer.
                          $endgroup$
                          – cbeleites
                          11 hours ago






                        • 2




                          $begingroup$
                          @cbeleites, to take a concrete example, pancreatic adenocarcinoma is nearly always fatal because it's asymptomatic until it reaches an advanced stage. If you were to apply a screening test with a 1% false positive/1% false negative rate to the entire population of the United States, you'd identify about three million cases, of which only 46,000 actually have the cancer, giving a positive predictive value of only 1.5%.
                          $endgroup$
                          – Mark
                          9 hours ago















                        10












                        $begingroup$

                        TL;DR: diseases are rare, so the absolute number of false positives is a lot more than that of false negatives.



                        Let's assume that our system has the same false positive and false negative rate of 1% (pretty good!), and that we're detecting the presence of new cancers this year: 439.2 / 100,000 people, or 0.5% of the population. [source]



                        • No cancer, no detection: 99.5% x 99% = 98.5% (98.505%)

                        • No cancer, detection: 99.5% x 1% = 1.0% (0.995%)

                        • Cancer, detection: 0.5% x 99% = 0.5% (0.495%)

                        • Cancer, no detection: 0.5% x 1% = 0.005%

                        So we can see that we have a problem: for everyone who has cancer, two people who didn't have cancer wind up with invasive surgery, chemotherapy or radiotherapy.



                        For every person who fails to have a present cancer detected, two hundred people receive actively harmful treatment they didn't need and can't really afford.






                        share|improve this answer








                        New contributor



                        Dragon is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                        Check out our Code of Conduct.





                        $endgroup$












                        • $begingroup$
                          For many screening applications the incidence (no of newly diagnosed disease per 100000 population) is acually even lower: the 0.5 % is total cancer incidence whereas screening programs target specific types of cancer.
                          $endgroup$
                          – cbeleites
                          11 hours ago






                        • 2




                          $begingroup$
                          @cbeleites, to take a concrete example, pancreatic adenocarcinoma is nearly always fatal because it's asymptomatic until it reaches an advanced stage. If you were to apply a screening test with a 1% false positive/1% false negative rate to the entire population of the United States, you'd identify about three million cases, of which only 46,000 actually have the cancer, giving a positive predictive value of only 1.5%.
                          $endgroup$
                          – Mark
                          9 hours ago













                        10












                        10








                        10





                        $begingroup$

                        TL;DR: diseases are rare, so the absolute number of false positives is a lot more than that of false negatives.



                        Let's assume that our system has the same false positive and false negative rate of 1% (pretty good!), and that we're detecting the presence of new cancers this year: 439.2 / 100,000 people, or 0.5% of the population. [source]



                        • No cancer, no detection: 99.5% x 99% = 98.5% (98.505%)

                        • No cancer, detection: 99.5% x 1% = 1.0% (0.995%)

                        • Cancer, detection: 0.5% x 99% = 0.5% (0.495%)

                        • Cancer, no detection: 0.5% x 1% = 0.005%

                        So we can see that we have a problem: for everyone who has cancer, two people who didn't have cancer wind up with invasive surgery, chemotherapy or radiotherapy.



                        For every person who fails to have a present cancer detected, two hundred people receive actively harmful treatment they didn't need and can't really afford.






                        share|improve this answer








                        New contributor



                        Dragon is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                        Check out our Code of Conduct.





                        $endgroup$



                        TL;DR: diseases are rare, so the absolute number of false positives is a lot more than that of false negatives.



                        Let's assume that our system has the same false positive and false negative rate of 1% (pretty good!), and that we're detecting the presence of new cancers this year: 439.2 / 100,000 people, or 0.5% of the population. [source]



                        • No cancer, no detection: 99.5% x 99% = 98.5% (98.505%)

                        • No cancer, detection: 99.5% x 1% = 1.0% (0.995%)

                        • Cancer, detection: 0.5% x 99% = 0.5% (0.495%)

                        • Cancer, no detection: 0.5% x 1% = 0.005%

                        So we can see that we have a problem: for everyone who has cancer, two people who didn't have cancer wind up with invasive surgery, chemotherapy or radiotherapy.



                        For every person who fails to have a present cancer detected, two hundred people receive actively harmful treatment they didn't need and can't really afford.







                        share|improve this answer








                        New contributor



                        Dragon is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                        Check out our Code of Conduct.








                        share|improve this answer



                        share|improve this answer






                        New contributor



                        Dragon is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                        Check out our Code of Conduct.








                        answered 12 hours ago









                        DragonDragon

                        2012




                        2012




                        New contributor



                        Dragon is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                        Check out our Code of Conduct.




                        New contributor




                        Dragon is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                        Check out our Code of Conduct.













                        • $begingroup$
                          For many screening applications the incidence (no of newly diagnosed disease per 100000 population) is acually even lower: the 0.5 % is total cancer incidence whereas screening programs target specific types of cancer.
                          $endgroup$
                          – cbeleites
                          11 hours ago






                        • 2




                          $begingroup$
                          @cbeleites, to take a concrete example, pancreatic adenocarcinoma is nearly always fatal because it's asymptomatic until it reaches an advanced stage. If you were to apply a screening test with a 1% false positive/1% false negative rate to the entire population of the United States, you'd identify about three million cases, of which only 46,000 actually have the cancer, giving a positive predictive value of only 1.5%.
                          $endgroup$
                          – Mark
                          9 hours ago
















                        • $begingroup$
                          For many screening applications the incidence (no of newly diagnosed disease per 100000 population) is acually even lower: the 0.5 % is total cancer incidence whereas screening programs target specific types of cancer.
                          $endgroup$
                          – cbeleites
                          11 hours ago






                        • 2




                          $begingroup$
                          @cbeleites, to take a concrete example, pancreatic adenocarcinoma is nearly always fatal because it's asymptomatic until it reaches an advanced stage. If you were to apply a screening test with a 1% false positive/1% false negative rate to the entire population of the United States, you'd identify about three million cases, of which only 46,000 actually have the cancer, giving a positive predictive value of only 1.5%.
                          $endgroup$
                          – Mark
                          9 hours ago















                        $begingroup$
                        For many screening applications the incidence (no of newly diagnosed disease per 100000 population) is acually even lower: the 0.5 % is total cancer incidence whereas screening programs target specific types of cancer.
                        $endgroup$
                        – cbeleites
                        11 hours ago




                        $begingroup$
                        For many screening applications the incidence (no of newly diagnosed disease per 100000 population) is acually even lower: the 0.5 % is total cancer incidence whereas screening programs target specific types of cancer.
                        $endgroup$
                        – cbeleites
                        11 hours ago




                        2




                        2




                        $begingroup$
                        @cbeleites, to take a concrete example, pancreatic adenocarcinoma is nearly always fatal because it's asymptomatic until it reaches an advanced stage. If you were to apply a screening test with a 1% false positive/1% false negative rate to the entire population of the United States, you'd identify about three million cases, of which only 46,000 actually have the cancer, giving a positive predictive value of only 1.5%.
                        $endgroup$
                        – Mark
                        9 hours ago




                        $begingroup$
                        @cbeleites, to take a concrete example, pancreatic adenocarcinoma is nearly always fatal because it's asymptomatic until it reaches an advanced stage. If you were to apply a screening test with a 1% false positive/1% false negative rate to the entire population of the United States, you'd identify about three million cases, of which only 46,000 actually have the cancer, giving a positive predictive value of only 1.5%.
                        $endgroup$
                        – Mark
                        9 hours ago











                        1












                        $begingroup$

                        Summary: the question probably* isn't whether one false negative is worse than one false positive, it's probably* more like whether 500 false positives are acceptable to get down to one false negative.



                        * depends on the application




                        Let me expand a bit on @Dragon's answer:



                        • Screening means that we're looking for disease among a seemingly healthy population. As @Dragon explained, for these we need extremely low FPR (or high Sensitivity), otherwise we'll end up with many more false positives than true positives. I.e., the Positive Predictive Value (# truly diseased among all diagnosed positive) would be inacceptably low.


                        • Sensitivity (TPR) and Specificity (TNR) are easy to measure for a diagnostic system: take a number of truly (non)diseased cases and measure the fraction of correctly detected ones.


                        • OTOH, both from doctors' and patients' point of view, the predicitive values are more to the point. They are the "inverse" to Sensitivity and specificity and tell you among all positive (negative) predictions, what fraction is correct. In other words, after the test said "disease" what is the probability that the patient actually does have the disease.


                        • As @Dragon showed you, the incidence (or prevalence, depending on what test we're talking about) plays a crucial role here. Incidence is low in all kinds of screening/early cancer diagnosis applications.

                          To illustrate this, ovarian cancer screening for post-menopausal women has a prevalence of 0.04 % in the general population and 0.5 % in high-risk women with family history and/or known mutations of tumor suppressor genes BRCA1 and 2 [Buchen, L. Cancer: Missing the mark. Nature, 2011, 471, 428-432]


                        • So the question is typically not whether one false negative is worse than one false positive, but even 99 % specificity (1 % FPR) and 95 % sensitivity (numbers taken from the paper linked above) then means roughly 500 false positives for each false negative.


                        • As a side note, also keep in mind that early cancer diagnosis in itself is no magic cure for cancer. E.g. for breast cancer screening mammography, only 3 - 13 % of the true positive patients actually benefit from the screening.

                          So we also need to keep an eye on the number of false positives for each benefitting patient. E.g. for mammography, together with these numbers, a rough guesstimate it that we have somewhere in the range of 400 - 1800 false positives per benefitting true positive (39 - 49 year old group).



                        • With hundreds of false positives per false negative (and also maybe hundreds or even thousands of false positives per patient benefitting from the screening) the situation isn't as clear as "is one missed cancer worse than one false positive cancer diagnosis": false positives do have an impact, ranging from psychological and psycho-somatic (worrying that you have cancer in itself isn't healthy) to physical risks of follow-up diagnoses such as biopsy (which is a small surgery, and as such comes with its own risks).

                          Even if the impact of one false positive is small, the corresponding risks may add up substantially if hundreds of false positives have to be considered.



                          Suggested reading: Gerd Gigerenzer: Risk Savvy: How to Make Good Decisions (2014).



                        • Still, what PPV and NPV are needed to make a diagnostic test useful is highly dependend on the application.

                          As explained, in screening for early cancer detection the focus is usually on PPV, i.e. making sure you do not cause too much harm by false negatives: finding a sizeable fraction (even if not all) of the early cancer patients is already an improvement over the status quo without screening.

                          OTOH, HIV test in blood donations focuses first on NPV (i.e. making sure the blood is HIV-free). Still, in a 2nd (and 3rd) step, false positives are then reduced by applying further tests before worrying people with (false) positive HIV test results.


                        • Last but not least, there are also medical testing applications where the incidences or prevalences aren't as extreme as they usually are in screening of not-particularly-high-risk populations, e.g. some differential diagnoses.






                        share|improve this answer











                        $endgroup$

















                          1












                          $begingroup$

                          Summary: the question probably* isn't whether one false negative is worse than one false positive, it's probably* more like whether 500 false positives are acceptable to get down to one false negative.



                          * depends on the application




                          Let me expand a bit on @Dragon's answer:



                          • Screening means that we're looking for disease among a seemingly healthy population. As @Dragon explained, for these we need extremely low FPR (or high Sensitivity), otherwise we'll end up with many more false positives than true positives. I.e., the Positive Predictive Value (# truly diseased among all diagnosed positive) would be inacceptably low.


                          • Sensitivity (TPR) and Specificity (TNR) are easy to measure for a diagnostic system: take a number of truly (non)diseased cases and measure the fraction of correctly detected ones.


                          • OTOH, both from doctors' and patients' point of view, the predicitive values are more to the point. They are the "inverse" to Sensitivity and specificity and tell you among all positive (negative) predictions, what fraction is correct. In other words, after the test said "disease" what is the probability that the patient actually does have the disease.


                          • As @Dragon showed you, the incidence (or prevalence, depending on what test we're talking about) plays a crucial role here. Incidence is low in all kinds of screening/early cancer diagnosis applications.

                            To illustrate this, ovarian cancer screening for post-menopausal women has a prevalence of 0.04 % in the general population and 0.5 % in high-risk women with family history and/or known mutations of tumor suppressor genes BRCA1 and 2 [Buchen, L. Cancer: Missing the mark. Nature, 2011, 471, 428-432]


                          • So the question is typically not whether one false negative is worse than one false positive, but even 99 % specificity (1 % FPR) and 95 % sensitivity (numbers taken from the paper linked above) then means roughly 500 false positives for each false negative.


                          • As a side note, also keep in mind that early cancer diagnosis in itself is no magic cure for cancer. E.g. for breast cancer screening mammography, only 3 - 13 % of the true positive patients actually benefit from the screening.

                            So we also need to keep an eye on the number of false positives for each benefitting patient. E.g. for mammography, together with these numbers, a rough guesstimate it that we have somewhere in the range of 400 - 1800 false positives per benefitting true positive (39 - 49 year old group).



                          • With hundreds of false positives per false negative (and also maybe hundreds or even thousands of false positives per patient benefitting from the screening) the situation isn't as clear as "is one missed cancer worse than one false positive cancer diagnosis": false positives do have an impact, ranging from psychological and psycho-somatic (worrying that you have cancer in itself isn't healthy) to physical risks of follow-up diagnoses such as biopsy (which is a small surgery, and as such comes with its own risks).

                            Even if the impact of one false positive is small, the corresponding risks may add up substantially if hundreds of false positives have to be considered.



                            Suggested reading: Gerd Gigerenzer: Risk Savvy: How to Make Good Decisions (2014).



                          • Still, what PPV and NPV are needed to make a diagnostic test useful is highly dependend on the application.

                            As explained, in screening for early cancer detection the focus is usually on PPV, i.e. making sure you do not cause too much harm by false negatives: finding a sizeable fraction (even if not all) of the early cancer patients is already an improvement over the status quo without screening.

                            OTOH, HIV test in blood donations focuses first on NPV (i.e. making sure the blood is HIV-free). Still, in a 2nd (and 3rd) step, false positives are then reduced by applying further tests before worrying people with (false) positive HIV test results.


                          • Last but not least, there are also medical testing applications where the incidences or prevalences aren't as extreme as they usually are in screening of not-particularly-high-risk populations, e.g. some differential diagnoses.






                          share|improve this answer











                          $endgroup$















                            1












                            1








                            1





                            $begingroup$

                            Summary: the question probably* isn't whether one false negative is worse than one false positive, it's probably* more like whether 500 false positives are acceptable to get down to one false negative.



                            * depends on the application




                            Let me expand a bit on @Dragon's answer:



                            • Screening means that we're looking for disease among a seemingly healthy population. As @Dragon explained, for these we need extremely low FPR (or high Sensitivity), otherwise we'll end up with many more false positives than true positives. I.e., the Positive Predictive Value (# truly diseased among all diagnosed positive) would be inacceptably low.


                            • Sensitivity (TPR) and Specificity (TNR) are easy to measure for a diagnostic system: take a number of truly (non)diseased cases and measure the fraction of correctly detected ones.


                            • OTOH, both from doctors' and patients' point of view, the predicitive values are more to the point. They are the "inverse" to Sensitivity and specificity and tell you among all positive (negative) predictions, what fraction is correct. In other words, after the test said "disease" what is the probability that the patient actually does have the disease.


                            • As @Dragon showed you, the incidence (or prevalence, depending on what test we're talking about) plays a crucial role here. Incidence is low in all kinds of screening/early cancer diagnosis applications.

                              To illustrate this, ovarian cancer screening for post-menopausal women has a prevalence of 0.04 % in the general population and 0.5 % in high-risk women with family history and/or known mutations of tumor suppressor genes BRCA1 and 2 [Buchen, L. Cancer: Missing the mark. Nature, 2011, 471, 428-432]


                            • So the question is typically not whether one false negative is worse than one false positive, but even 99 % specificity (1 % FPR) and 95 % sensitivity (numbers taken from the paper linked above) then means roughly 500 false positives for each false negative.


                            • As a side note, also keep in mind that early cancer diagnosis in itself is no magic cure for cancer. E.g. for breast cancer screening mammography, only 3 - 13 % of the true positive patients actually benefit from the screening.

                              So we also need to keep an eye on the number of false positives for each benefitting patient. E.g. for mammography, together with these numbers, a rough guesstimate it that we have somewhere in the range of 400 - 1800 false positives per benefitting true positive (39 - 49 year old group).



                            • With hundreds of false positives per false negative (and also maybe hundreds or even thousands of false positives per patient benefitting from the screening) the situation isn't as clear as "is one missed cancer worse than one false positive cancer diagnosis": false positives do have an impact, ranging from psychological and psycho-somatic (worrying that you have cancer in itself isn't healthy) to physical risks of follow-up diagnoses such as biopsy (which is a small surgery, and as such comes with its own risks).

                              Even if the impact of one false positive is small, the corresponding risks may add up substantially if hundreds of false positives have to be considered.



                              Suggested reading: Gerd Gigerenzer: Risk Savvy: How to Make Good Decisions (2014).



                            • Still, what PPV and NPV are needed to make a diagnostic test useful is highly dependend on the application.

                              As explained, in screening for early cancer detection the focus is usually on PPV, i.e. making sure you do not cause too much harm by false negatives: finding a sizeable fraction (even if not all) of the early cancer patients is already an improvement over the status quo without screening.

                              OTOH, HIV test in blood donations focuses first on NPV (i.e. making sure the blood is HIV-free). Still, in a 2nd (and 3rd) step, false positives are then reduced by applying further tests before worrying people with (false) positive HIV test results.


                            • Last but not least, there are also medical testing applications where the incidences or prevalences aren't as extreme as they usually are in screening of not-particularly-high-risk populations, e.g. some differential diagnoses.






                            share|improve this answer











                            $endgroup$



                            Summary: the question probably* isn't whether one false negative is worse than one false positive, it's probably* more like whether 500 false positives are acceptable to get down to one false negative.



                            * depends on the application




                            Let me expand a bit on @Dragon's answer:



                            • Screening means that we're looking for disease among a seemingly healthy population. As @Dragon explained, for these we need extremely low FPR (or high Sensitivity), otherwise we'll end up with many more false positives than true positives. I.e., the Positive Predictive Value (# truly diseased among all diagnosed positive) would be inacceptably low.


                            • Sensitivity (TPR) and Specificity (TNR) are easy to measure for a diagnostic system: take a number of truly (non)diseased cases and measure the fraction of correctly detected ones.


                            • OTOH, both from doctors' and patients' point of view, the predicitive values are more to the point. They are the "inverse" to Sensitivity and specificity and tell you among all positive (negative) predictions, what fraction is correct. In other words, after the test said "disease" what is the probability that the patient actually does have the disease.


                            • As @Dragon showed you, the incidence (or prevalence, depending on what test we're talking about) plays a crucial role here. Incidence is low in all kinds of screening/early cancer diagnosis applications.

                              To illustrate this, ovarian cancer screening for post-menopausal women has a prevalence of 0.04 % in the general population and 0.5 % in high-risk women with family history and/or known mutations of tumor suppressor genes BRCA1 and 2 [Buchen, L. Cancer: Missing the mark. Nature, 2011, 471, 428-432]


                            • So the question is typically not whether one false negative is worse than one false positive, but even 99 % specificity (1 % FPR) and 95 % sensitivity (numbers taken from the paper linked above) then means roughly 500 false positives for each false negative.


                            • As a side note, also keep in mind that early cancer diagnosis in itself is no magic cure for cancer. E.g. for breast cancer screening mammography, only 3 - 13 % of the true positive patients actually benefit from the screening.

                              So we also need to keep an eye on the number of false positives for each benefitting patient. E.g. for mammography, together with these numbers, a rough guesstimate it that we have somewhere in the range of 400 - 1800 false positives per benefitting true positive (39 - 49 year old group).



                            • With hundreds of false positives per false negative (and also maybe hundreds or even thousands of false positives per patient benefitting from the screening) the situation isn't as clear as "is one missed cancer worse than one false positive cancer diagnosis": false positives do have an impact, ranging from psychological and psycho-somatic (worrying that you have cancer in itself isn't healthy) to physical risks of follow-up diagnoses such as biopsy (which is a small surgery, and as such comes with its own risks).

                              Even if the impact of one false positive is small, the corresponding risks may add up substantially if hundreds of false positives have to be considered.



                              Suggested reading: Gerd Gigerenzer: Risk Savvy: How to Make Good Decisions (2014).



                            • Still, what PPV and NPV are needed to make a diagnostic test useful is highly dependend on the application.

                              As explained, in screening for early cancer detection the focus is usually on PPV, i.e. making sure you do not cause too much harm by false negatives: finding a sizeable fraction (even if not all) of the early cancer patients is already an improvement over the status quo without screening.

                              OTOH, HIV test in blood donations focuses first on NPV (i.e. making sure the blood is HIV-free). Still, in a 2nd (and 3rd) step, false positives are then reduced by applying further tests before worrying people with (false) positive HIV test results.


                            • Last but not least, there are also medical testing applications where the incidences or prevalences aren't as extreme as they usually are in screening of not-particularly-high-risk populations, e.g. some differential diagnoses.







                            share|improve this answer














                            share|improve this answer



                            share|improve this answer








                            edited 10 hours ago

























                            answered 10 hours ago









                            cbeleitescbeleites

                            30016




                            30016





















                                0












                                $begingroup$

                                False Positive Rate (FPR) also known as false alarm rate (FAR); A large False Positive Rate can produce a poor performance of the Medical Image Detection System. A false positive is where you receive a positive result for a test, when you should have received a negative results. For example A pregnancy test is positive, when in fact the person isn't pregnant.






                                share|improve this answer








                                New contributor



                                EricAtHaufe is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                Check out our Code of Conduct.





                                $endgroup$








                                • 1




                                  $begingroup$
                                  This is not answering the question. OP is not asking what false positive means, but why it's deemed more important than false negative.
                                  $endgroup$
                                  – Llewellyn
                                  12 hours ago
















                                Clinician's time is precious



                                From within the field of medicine, clinicians often have a wide variety of illnesses to try to detect and diagnose, and this is a time-consuming process. A tool that produces false positives (even at a low rate) is less useful, because its diagnoses can't be trusted: every time it flags something, the flag has to be checked. Think of it as the WebMD of software - everything is a sign of cancer!



                                A tool that may produce false negatives but whose positive calls can always be trusted is far more useful, as a clinician doesn't need to waste time double-checking or second-guessing the diagnosis. If it marks someone as having a specific illness, job done. If it doesn't, the people who aren't flagged as ill will receive additional tests anyway.



                                It's better to have a tool that can reliably identify even a single trait of an illness than a tool that unreliably flags many.
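
                                One way to obtain this kind of behaviour in practice is to place the decision threshold on a classifier's output score high enough that positive calls can be trusted, accepting the loss in recall; a minimal sketch with hypothetical scores and labels (not from any real system):

                                    def precision_recall_at(scores, labels, threshold):
                                        """Precision and recall when flagging every case with score >= threshold."""
                                        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
                                        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
                                        fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
                                        precision = tp / (tp + fp) if (tp + fp) else float("nan")
                                        recall = tp / (tp + fn) if (tp + fn) else float("nan")
                                        return precision, recall

                                    scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20]
                                    labels = [1,    1,    0,    1,    0,    1,    0,    0   ]

                                    for t in (0.5, 0.85):
                                        p, r = precision_recall_at(scores, labels, t)
                                        print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
                                    # At the lax threshold some flags are wrong; at the strict one every flag
                                    # is correct, but some ill patients are missed (they fall back to the
                                    # routine tests they would have received anyway).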






                                answered 5 hours ago by SSight3 (new contributor)




























