Evidence categories in systematic assessment of cancer overdiagnosis
  1. Anton Barchuk¹,²,
  2. Niko K Nordlund¹,
  3. Alex L E Halme¹,
  4. Kari A O Tikkinen³,⁴
  1. ¹Faculty of Medicine, University of Helsinki, Helsinki, Finland
  2. ²Department of Medical Informatics, Erasmus MC, Rotterdam, The Netherlands
  3. ³Department of Urology, Helsinki University Hospital, Helsinki, Finland
  4. ⁴Department of Surgery, Päijät-Häme Central Hospital, Lahti, Finland
  Correspondence to Dr Anton Barchuk; anton.barchuk{at}helsinki.fi

Abstract

The phenomenon of cancer overdiagnosis, the diagnosis of a malignant tumour that, without detection, would never lead to adverse health effects, has been reported for several cancer types in different populations. There has been an increase in studies focused on overdiagnosis, creating an opportunity to synthesise evidence on specific cancer types. However, studies that systematically assess evidence across different research domains remain scarce, with most of them relying on data from studies that already mentioned overdiagnosis as a potential concern. In this review, we consider several evidence categories that are used to systematically assess the presence and magnitude of overdiagnosis, including (1) data from cancer surveillance, (2) studies exploring the ‘true’ prevalence of cancer in the population, (3) studies that explore the use of diagnostics and its effect on incidence and mortality and (4) studies that explore changes and progress in cancer management and its effect on cancer mortality. This article highlights the strengths and weaknesses of different evidence categories, provides examples of studies on different cancer types and discusses how these categories can help synthesise evidence on cancer overdiagnosis.

  • Overdiagnosis
  • Neoplasms
  • Early Diagnosis

Data availability statement

Data sharing not applicable as no datasets generated and/or analysed for this study.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

WHAT IS ALREADY KNOWN ON THIS TOPIC

  • Cancer overdiagnosis, the detection of tumours that would not cause harm if left untreated, is a recognised issue in oncology, particularly with the increased use of advanced diagnostic technologies. However, systematic reviews on this topic are uncommon, and the categorisation of studies and standard approaches to data extraction are still limited.

WHAT THIS STUDY ADDS

  • We propose several evidence categories that can be used to synthesise evidence on cancer overdiagnosis and aid in detecting it across different populations. The article provides examples of studies, highlighting their strengths and limitations, and offers a structured approach to analysing the data, allowing for a more comprehensive understanding of overdiagnosis patterns.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

  • Systematic evidence synthesis using the proposed evidence categories can improve the identification of cancer overdiagnosis and clarify its public health impact. Integrating data from various research domains can lead to a more balanced approach to overdiagnosis assessment, prioritising patient outcomes and optimising resource allocation within healthcare systems.

Introduction

Cancer overdiagnosis occurs when a malignant tumour is found that would never have led to any harm if left undetected. This has been reported for several cancer types, such as thyroid, prostate and breast cancers, and is more common in populations with better access to diagnostics.1 In recent years, there has been an increase in studies focused on overdiagnosis, creating opportunities to synthesise evidence on specific cancer types. For instance, a recent scoping review on overdiagnosis in malignant melanoma assessed available evidence from studies in different populations and settings to draw conclusions about the presence and magnitude of this phenomenon.2 The growing awareness of the potential harms associated with cancer overdiagnosis is also reshaping how future healthcare providers are trained.3 This shift in teaching is particularly significant for those who once believed that earlier detection was unequivocally beneficial. Table 1 provides explanations of key concepts relevant to overdiagnosis studies.

Table 1

Terms and concepts used in cancer overdiagnosis research (in alphabetical order)

Studies have typically assessed cancer overdiagnosis by analysing cancer surveillance data to identify specific epidemiological patterns (‘signatures’ or ‘fingerprints’) that may explain trends.4–6 Researchers have sometimes interpreted the same trends as supporting or opposing cancer screening.7 8 However, studies that systematically assess evidence from different research domains are still rare, with most relying on data from studies that already flag overdiagnosis as a concern. Overdiagnosis research, therefore, needs a systematic approach to evaluate the role of emerging cancer diagnostics, which increasingly affect the epidemiology of many cancer types.

The individuals harmed by unnecessary tests are often not the same as those who benefit from earlier detection. The balance of benefits and harms of cancer diagnostics should be assessed at the population level through randomised trials. However, cancer screening trials are often limited to high-resource settings and may lack the required follow-up to establish trustworthy evidence on overdiagnosis. In addition, due to the extended follow-up required, some trials may become outdated, making their overdiagnosis estimates applicable only to selected settings, time periods and populations.9–11

In this paper, we examine a range of evidence categories that can be relevant in the context of cancer overdiagnosis studies: (1) cancer surveillance data, (2) studies assessing the true prevalence of cancer in the population, (3) research on diagnostic utilisation and its impact on cancer incidence and mortality and (4) studies investigating changes and progress in cancer management and its effect on cancer mortality. We explore why and how these specific categories should be considered in overdiagnosis research, discuss practical applications for systematically synthesising evidence on overdiagnosis and highlight their potential limitations.

Cancer surveillance

Data from population-based cancer surveillance are among the most commonly used domains in studies that address cancer overdiagnosis.4 5 12 13 Epidemiological measures used to assess changes in cancer trends focus on cancer incidence and mortality in the general population. The risk of death in patients with cancer is also assessed through survival analysis, referred to as cancer survival. Studies often involve linear modelling, where rates are analysed as a function of time, and information on the relative change of the rates is captured over a specified period. For example, in a recent study, long-term annual trends of mortality and incidence rates of thyroid cancer from 43 countries were compared, and the number of overdiagnosed cases was estimated.14 Trend analyses can be supplemented by breakpoint and age–period–cohort analyses. In the age–period–cohort analysis of kidney cancer incidence rates from 1978 to 2007 in 16 populations, overdiagnosis was hypothesised as one of the reasons for changes.15 Stratification by stage can help identify stage-specific changes and determine whether shifts in incidence are attributable to early-stage disease.16 Increased early-stage cancer incidence accompanied by unchanged late-stage disease incidence might be an indicator of overdiagnosis. This approach may have limitations, as the changes in the incidence of late-stage disease over time may be affected by changes in risk factor prevalence and diagnostic accuracy. Comparisons within and outside the age groups targeted by a screening intervention, and the addition of appropriate control groups to the comparison, might help overcome these difficulties;17 still, information about screening participation is not always available. Individual-level studies using national databases that capture individual characteristics, screening participation and relevant outcomes may help refine this approach.
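The trend modelling described above can be illustrated with a minimal sketch. It assumes a log-linear model, ln(rate) = a + b × year, from which the annual percent change is (exp(b) − 1) × 100. This is a common convention in cancer trend analysis but is not necessarily the model used in the cited studies. The function name and the rates are hypothetical.

```python
import math


def annual_percent_change(years, rates):
    """Estimate annual percent change (APC) from an ordinary least-squares
    fit of ln(rate) on calendar year. Assumes all rates are positive."""
    n = len(years)
    log_rates = [math.log(r) for r in rates]
    mean_x = sum(years) / n
    mean_y = sum(log_rates) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, log_rates))
    slope = sxy / sxx  # change in ln(rate) per calendar year
    return (math.exp(slope) - 1.0) * 100.0


# Hypothetical incidence rates per 100 000 growing by exactly 5% per year.
years = list(range(2000, 2011))
rates = [10.0 * 1.05 ** (y - 2000) for y in years]

apc = annual_percent_change(years, rates)
print(f"APC: {apc:.2f}% per year")
```

Real registry series rarely follow a single slope, which is why breakpoint analyses are used to split the period into segments with different rates of change.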

Cancer surveillance helps detect meaningful changes in cancer trends. It can also raise suspicion about underlying causes of cancer. Surveillance can be considered as an instrument for overdiagnosis research based on the reasonable assumption that increased diagnostic activity leads to higher cancer incidence. Generally, a substantial increase in incidence without similar changes in mortality, accompanied by an increase in survival, is considered a sign of diagnostic effects and overdiagnosis.4 5 A well-known example is the increase in thyroid cancer incidence in South Korea, which is, by consensus, explained by the use of ultrasound in regular check-ups.12 Cancer surveillance data have also been used to assess the magnitude of overdiagnosis in prostate and thyroid cancer when baseline historical trends were compared with those observed in the period of the increased diagnostic activity.6 14
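The comparison of observed rates with a historical baseline can be sketched as a simple excess-case calculation. The example below makes the strong assumption that the baseline rate would have stayed constant without increased diagnostic activity. The cited studies used their own, more elaborate methods. All names and numbers are hypothetical.

```python
def excess_cases(baseline_rate, observed_rates, person_years):
    """Sum of observed minus expected cases over the study period.

    Rates are per 100 000 person-years. The expected rate is assumed
    to stay flat at `baseline_rate` (an illustrative simplification).
    """
    total = 0.0
    for rate, py in zip(observed_rates, person_years):
        total += (rate - baseline_rate) * py / 100_000
    return total


def observed_cases(observed_rates, person_years):
    """Total number of registered cases implied by the observed rates."""
    return sum(rate * py / 100_000 for rate, py in zip(observed_rates, person_years))


# Hypothetical population of 1 million followed over three calendar years.
baseline = 5.0
rates = [6.0, 8.0, 10.0]
pyears = [1_000_000, 1_000_000, 1_000_000]

excess = excess_cases(baseline, rates, pyears)
share = excess / observed_cases(rates, pyears)
print(f"Excess cases: {excess:.0f} ({share:.1%} of registered cases)")
```

An excess computed this way is only an upper bound on overdiagnosis if the true underlying incidence has not changed. A rising risk factor prevalence would also produce excess cases.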

The methods and criteria to quantify overdiagnosis using surveillance data are not, however, standardised. There is no consensus on how large changes in cancer rates are considered substantial. As periods used to identify meaningful changes are often arbitrary, studies may not be comparable. In addition, stage-specific information is often unavailable or incomplete in population-based cancer registries that consolidate and produce cancer incidence data.

Cancer surveillance studies typically have an ecological study design without data on exposures. This limits the opportunities for causal inferences. Cancer incidence rates reflect individuals diagnosed each year. In contrast, mortality and survival rates typically reflect individuals diagnosed in previous years or decades. The gap between diagnosis and death depends on many factors: the lead time of a given cancer, the characteristics of the tumour and the effectiveness of management, including diagnostics and treatments.18

Although cancer surveillance studies are useful for detecting trends, they have highlighted the need for reproducible algorithms and clear quantitative criteria to standardise efforts to quantify overdiagnosis. Pooling the results of studies that use different methods to analyse the same population-based datasets is not an optimal approach to generating reliable evidence. Systematic synthesis in this evidence category is likely most useful when data from different regions acquired through similar methodologies are considered.

Changes in diagnostic practices and effect of diagnostic activities

Diagnostic utilisation studies

The surge in thyroid cancer incidence, often linked to opportunistic ultrasonography screening, exemplifies how diagnostic volume changes can lead to overdiagnosis, as seen in South Korea and globally.12 13 Years before overdiagnosis became a recognised issue, a French study emphasised the need to account for shifts in diagnostic, medical, surgical and pathological practices when assessing thyroid cancer incidence,19 validating earlier concerns.20 Indeed, the thyroid cancer ‘diagnostic epidemic’ had been anticipated, with the potential reservoir of tumours noted already in 1985.21 Despite this, many individuals and healthcare systems have faced unnecessary diagnoses.14

Concerning diagnostic utilisation, some cancers have been included in or considered for population-based screening programmes, which are organised to detect cancer in asymptomatic populations. Examples of such cancers include breast, cervical and colorectal cancer.1 Several countries have considered screening programmes for other cancer types, for example, thyroid cancer, melanoma and stomach cancer. Regardless of whether a country has an official national programme, the rapid advancement of technologies and lack of evidence-based decision-making have led to the widespread use of diagnostic tests that have never been tested in randomised trials as screening methods.22

Cancer screening registries can provide individual-level data on diagnostic tests for cancers included in or considered for population-based screening programmes. Such data are, however, available in only a few regions,23 and these registers typically do not capture information on opportunistic testing. These registries are crucial for identifying the participation rates and evaluating test characteristics, such as positivity and positive predictive value.24 Even in countries with established screening programmes and similar disease incidence rates, these characteristics vary and, sometimes, point to potential overdiagnosis.25
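The test characteristics mentioned above can be computed directly from registry counts. The sketch below uses the usual definitions: positivity is the share of screened individuals with a positive test, and positive predictive value is the share of positive tests with confirmed cancer. The function name and counts are hypothetical.

```python
def screening_test_summary(n_screened, n_positive, n_cancers_among_positive):
    """Return (positivity, positive predictive value) from screening counts."""
    if n_screened <= 0 or n_positive <= 0:
        raise ValueError("counts of screened and positive individuals must be positive")
    positivity = n_positive / n_screened
    ppv = n_cancers_among_positive / n_positive
    return positivity, ppv


# Hypothetical screening round: 10 000 screened, 500 positive, 50 confirmed cancers.
positivity, ppv = screening_test_summary(10_000, 500, 50)
print(f"Positivity: {positivity:.1%}, PPV: {ppv:.1%}")
```

Comparing these two measures between programmes with similar background incidence is one way to spot differences in diagnostic thresholds.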

For opportunistic screening programmes and modalities such as ultrasound, MRI, prostate-specific antigen (PSA) and other tests, it is much more challenging to differentiate between screening and diagnostic use in symptomatic patients or follow-up contexts. Some studies report changes in biopsy utilisation that could be used as a proxy for primary diagnostic activities.26 27

Unfortunately, individual-level studies that capture diagnostic procedures and outcomes (lesions detected) are rare outside screening trials and programmes. Studies based on individual data from multiple sources could bridge this gap, allowing large-scale characterisation that includes diagnostic tests, outcomes and reasons for referral.28

Cancer screening individual-level studies

Cancer screening research serves as the foundation for many publications that address cancer overdiagnosis, with results often drawn from individual-level studies. Comparison of exposed groups (individuals who were offered or had certain screening tests) and unexposed groups can help identify the difference in incidence and mortality, forming the basis for evaluating the balance between the benefits and harms of certain diagnostic interventions or combinations of interventions.29

Many screening trials are conducted in actual practice settings and are affected by group contamination and non-compliance. In the screening arm, some participants may not adhere to the protocol and forgo screening, while in the control group, some may undergo screening outside the trial. For example, a high frequency of screening in the control arm was one of the factors that may have contributed to the lack of observed prostate cancer mortality reduction in the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial.30 While contamination can underestimate the mortality reduction, it may have an even more pronounced absolute effect on excess incidence. In a Finnish randomised prostate cancer screening trial, adjustment for contamination nearly doubled both mortality reduction and excess incidence. Without correction, there were 8 fewer deaths per 10 000 participants in the screening arm, increasing to 14 after correction. Incidence increased from 165 to 324 excess cases per 10 000 participants when contamination was accounted for.31
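The figures quoted above can be combined into a rough harm-to-benefit ratio: excess cases per death averted, before and after adjustment for contamination. This ratio is our own arithmetic illustration based on the four numbers in the text. It is not a result reported by the trial, and excess incidence is only a proxy for overdiagnosis.

```python
# Figures quoted in the text (per 10 000 trial participants).
deaths_averted_unadjusted = 8
deaths_averted_adjusted = 14
excess_cases_unadjusted = 165
excess_cases_adjusted = 324


def excess_cases_per_death_averted(excess_cases, deaths_averted):
    """Illustrative ratio of excess diagnoses to deaths averted."""
    return excess_cases / deaths_averted


ratio_unadjusted = excess_cases_per_death_averted(
    excess_cases_unadjusted, deaths_averted_unadjusted
)
ratio_adjusted = excess_cases_per_death_averted(
    excess_cases_adjusted, deaths_averted_adjusted
)

# Fold change produced by the contamination adjustment.
mortality_fold = deaths_averted_adjusted / deaths_averted_unadjusted
incidence_fold = excess_cases_adjusted / excess_cases_unadjusted

print(f"Excess cases per death averted: {ratio_unadjusted:.1f} -> {ratio_adjusted:.1f}")
print(f"Fold change: mortality {mortality_fold:.2f}, incidence {incidence_fold:.2f}")
```

In this arithmetic, the adjustment raises excess incidence proportionally more than mortality reduction, which is consistent with the point that contamination affects both sides of the balance.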

In addition, screening trials have limited ability to identify the risk of uncontrolled use of the same diagnostics outside of research settings,32 and those who defend the beneficial effects of screening on cancer mortality based on trial results are often faced simultaneously with the need to combat opportunistic screening activities.

In addition, individual-level screening studies are often conducted in higher-income countries, which have access to quality control and a more balanced approach to healthcare utilisation. At the same time, possible overdiagnosis can escalate when the same diagnostic technologies are introduced without proper quality control in lower-resource settings.22

The prevalence of subclinical cancer

While overdiagnosis is related to the properties and dissemination of diagnostic tests, it is also influenced by tumour characteristics. The discussion about the natural history of cancer, or the course of a disease uninterrupted by treatment, when applied to cancer screening and earlier detection studies, led to several important practical conclusions. First, it was suggested that not all asymptomatic cancers and precancers are destined to become invasive and symptomatic malignancies,33 and there may be a substantial reservoir of potentially detectable subclinical disease.20 Second, it was suggested that including these cancer cases in statistics could bias cancer survival as a measure of cancer control progress.34 35 While the second conclusion was more related to the interpretation of surveillance studies, the first requires empirical evidence and data to suggest the extent of such a reservoir.

Two important study types can be used to quantify the detectable subclinical cancer in the population: autopsy studies and cross-sectional diagnostic surveys. Autopsy studies are a unique source of information on the prevalence of asymptomatic tumours in people whose cause of death was not related to that specific cancer. However, the complexity of these studies and the decreasing proportion of autopsies performed worldwide make this type of study rarer today. Systematic reviews and individual-level autopsy studies are available for prostate,36 lung,37 colorectal,38 thyroid,39 breast40 and kidney41 cancers. A higher prevalence of a cancer at autopsy suggests a larger subclinical reservoir and, therefore, a higher likelihood of overdiagnosis.

Cross-sectional diagnostic surveys and the first rounds of screening trials can also provide information on the prevalence of asymptomatic tumours in the population, especially with recent advances in diagnostics that offer a resolution to capture smaller lesions.42 43 Cross-sectional studies have several important limitations: they are not always representative, and the confirmation of malignancy in these studies requires a biopsy, an intervention potentially leading to adverse effects. Also, longer-term follow-up is needed to prove that small asymptomatic tumours represent true overdiagnosis. Recent lung and prostate screening trials have shown a substantial number of lesions detected by modern imaging but have also introduced solutions to identify detected nodule features and volume doubling time to avoid unnecessary biopsy, highlighting how understanding the natural history of diseases can help shape screening programmes.44
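Volume doubling time, mentioned above, is commonly derived from two volume measurements under an assumption of exponential growth: VDT = t × ln 2 / ln(V2 / V1), where t is the interval between scans. The sketch below implements that standard formula. The function name and the measurements are hypothetical, and the thresholds that trigger a biopsy are protocol-specific and not given here.

```python
import math


def volume_doubling_time(volume_first, volume_second, interval_days):
    """Volume doubling time in days, assuming exponential growth.

    Defined only for a growing lesion (volume_second > volume_first > 0).
    """
    if volume_first <= 0 or volume_second <= volume_first:
        raise ValueError("doubling time requires a positive, increasing volume")
    if interval_days <= 0:
        raise ValueError("interval between measurements must be positive")
    return interval_days * math.log(2) / math.log(volume_second / volume_first)


# Hypothetical nodule: 100 mm^3 at baseline and 200 mm^3 after 90 days.
vdt = volume_doubling_time(100.0, 200.0, 90)
print(f"Volume doubling time: {vdt:.0f} days")
```

A long doubling time points to slow growth, which is the kind of natural history information that can help avoid invasive work-up of lesions unlikely to progress.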

Lastly, the prevalence of any disease depends on healing rates, which can be spontaneous. This phenomenon was described for several malignant tumours (eg, melanoma, kidney cancer), but the evidence is scarce, and rates of spontaneous regression are unclear.45 46 Theoretically, increased diagnostic activities are likely to cause overdiagnosis in tumours where spontaneous regression is more common, and this phenomenon should not be dismissed.

Changes in treatment

When discussing the benefits of earlier cancer diagnosis, the ultimate goal is to reduce mortality. Overdiagnosis is an unavoidable consequence of earlier cancer detection and must be weighed against possible benefits. As mortality reduction and overdiagnosis affect individuals differently, the balance of benefits and harms is difficult to measure and subjective. This highlights the need for informed and shared decision-making. Mortality rates for many cancers have declined in high-income countries47 coinciding with both improved diagnostics and new treatments. Distinguishing these effects is essential for evidence-based, sustainable screening policies.

Although treatment progress reduces mortality for certain types of cancer,5 this process is often slow in practice. The introduction of new drugs can affect mortality years later. Assessing the results of cancer treatment of advanced disease can help identify overall progress in cancer management. For example, in advanced breast cancer, survival after metastatic recurrence increased from 1.9 years in 2000 to 3.2 years in 2019.48 These changes should be interpreted cautiously, as stage migration and better follow-up resulting from significant diagnostic advancements may inflate survival.

Another way that treatment information can contribute to overdiagnosis research is through the implementation of watchful waiting and active surveillance strategies, indicating that certain cancer cases may not require treatment and should not have been detected in the first place. Predicting the malignant potential of detected tumours is difficult, so introducing additional markers and diagnostic strategies can help identify non-progressive tumours before any invasive intervention begins. Watchful waiting strategies have been introduced for thyroid49 and prostate cancer.50 The ongoing discussion on the treatment of precancers in the cervix uteri51 and breast52 not only highlights possible adverse events linked to overdiagnosis but also paves the way for reevaluating disease categories in the future.

A somewhat underused source of data is studies that describe patients who refused conventional treatment. These studies often focus on the effect of complementary or alternative treatments, which are seen as a threat in light of the progress of conventional treatments. However, they also show potential for tumour progression when detected relatively early. For example, in a study from the National Cancer Database in the USA, a substantial proportion of patients with prostate (86%), breast (58%) and even lung (20%) cancer diagnosed between 2004 and 2013 were alive after 5 years without any specific treatment.53

Sometimes, the effects of both diagnostic and treatment interventions are assessed in simulation exercises. For example, the combined effect of screening and treatment was shown to be associated with the reduction of breast cancer mortality in US women.54 However, while simulation studies are useful, it is still difficult to prove that all assumptions hold and results have external validity without data from trials.

Conclusion

This article outlines evidence categories that can support a systematic, proactive approach to identifying and synthesising evidence on overdiagnosis. We discuss several interrelated evidence categories that can help evaluate the balance of benefits and harms of diagnostics and study cancer overdiagnosis. Additionally, we outline the strengths and weaknesses of each category, with examples of studies provided in table 2. Figure 1 schematically illustrates possible relationships between the evidence categories described.

Table 2

Evidence categories, studies and measures that inform overdiagnosis research

Figure 1

The role of diagnostic practices, screening and treatment in the pathway from risk factors to cancer incidence and mortality (diagram illustrates the flow from risk factors to cancer mortality: risk factors influence the true incidence and prevalence of cancer, which, along with diagnostic and screening practices, determines the registered cancer incidence; treatment impacts cancer mortality rates and diagnostic practices (screening) might also affect mortality).

While conducting systematic reviews, we suggest researchers formulate search strategies to cover a broad range of relevant evidence categories. Research papers relevant to synthesising evidence on overdiagnosis may not explicitly use terms like ‘overdiagnosis’ or ‘overdetection’ and thus may be inappropriately dismissed. To address this, we urge researchers to consider broadening their search criteria and scope, expanding it beyond surveillance studies and screening trials to include studies on the natural history of cancer, research exploring reservoirs of subclinical undetected cancers, and studies that cover diagnostics utilisation and treatment effects. Including diverse study types may provide a more comprehensive understanding of overdiagnosis and its relationship to mortality reduction.

There are also several ways researchers can use evidence categories to design future screening trials. First, trials should assess overdiagnosis and mortality reduction as critical outcomes of potential screening interventions, aiming to minimise overdiagnosis while maximising mortality reduction. Second, when strong evidence for cancer overdiagnosis exists, deimplementation studies—traditionally focused on treatment and less frequently on diagnostics or screening55—should be designed to reduce the harms of unnecessary diagnostics across various healthcare settings.

This review aims to encourage a broader, inclusive discussion that can lead to a more structured appraisal of the available evidence in overdiagnosis research. Overdiagnosis is a controversial topic where heated arguments are common. Without a consensual approach to the available evidence, it is prone to speculation. As diagnostic tools become more sensitive, overdiagnosis is likely to increase across more cancer types. A systematic approach to evidence on overdiagnosis can help detect those changes early enough to prevent adverse effects and maximise the benefits of cancer diagnostics.

Ethics statements

Patient consent for publication

Ethics approval

Not applicable.

References

Footnotes

  • X @AntonBarchuk, @halmealex, @KariTikkinen

  • Contributors AB, NKN, ALEH and KAOT planned and designed the outline of the submitted manuscript. AB drafted the first version of the manuscript. NKN, ALEH and KAOT critically reviewed and edited the manuscript. All authors approved the manuscript. AB acted as guarantor.

  • Funding KAOT reports funding from the Research Council of Finland (353026), Helsinki University Hospital State Research Funding (TYH2023236), Sigrid Jusélius Foundation and Vyborg Tuberculosis Foundation.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.