
The Levels of Evidence and their role in Evidence-Based Medicine

Patricia B. Burns

1 Research Associate, Section of Plastic Surgery, Department of Surgery, The University of Michigan Health System

Rod J. Rohrich

2 Professor of Surgery, Department of Plastic Surgery, University of Texas Southwestern Medical Center

Kevin C. Chung

3 Professor of Surgery, Section of Plastic Surgery, Department of Surgery, The University of Michigan Health System

As the name suggests, evidence-based medicine (EBM) is about finding evidence and using that evidence to make clinical decisions. A cornerstone of EBM is the hierarchical system of classifying evidence. This hierarchy is known as the levels of evidence. Physicians are encouraged to find the highest level of evidence to answer clinical questions. Several papers published in plastic surgery journals concerning EBM topics have touched on this subject. 1 – 6 Specifically, previous papers have discussed the lack of higher level evidence in Plastic and Reconstructive Surgery (PRS) and the need to improve the evidence published in the journal. Before that can be accomplished, it is important to understand the history behind the levels and how they should be interpreted. This paper will focus on the origin of levels of evidence, their relevance to the EBM movement, and the implications for the field of plastic surgery as well as the everyday practice of plastic surgery.

History of Levels of Evidence

The levels of evidence were originally described in a report by the Canadian Task Force on the Periodic Health Examination in 1979. 7 The report’s purpose was to develop recommendations on the periodic health exam and base those recommendations on evidence in the medical literature. The authors developed a system of rating evidence ( Table 1 ) when determining the effectiveness of a particular intervention. The evidence was taken into account when grading recommendations. For example, a Grade A recommendation was given if there was good evidence to support a recommendation that a condition be included in the periodic health exam. The levels of evidence were further described and expanded by Sackett 8 in an article on levels of evidence for antithrombotic agents in 1989 ( Table 2 ). Both systems place randomized controlled trials (RCTs) at the highest level and case series or expert opinions at the lowest level. The hierarchies rank studies according to the probability of bias. RCTs are given the highest level because they are designed to be unbiased and have less risk of systematic errors. For example, by randomly allocating subjects to two or more treatment groups, these types of studies also randomize confounding factors that may bias results. A case series or expert opinion is often biased by the author’s experience or opinions, and there is no control of confounding factors.

Table 1. Canadian Task Force on the Periodic Health Examination’s Levels of Evidence *

Table 2. Levels of Evidence from Sackett *

Modification of levels

Since the introduction of levels of evidence, several other organizations and journals have adopted variations of the classification system. Diverse specialties often ask different questions, and it was recognized that the type and level of evidence needed to be modified accordingly. Research questions are divided into the following categories: treatment, prognosis, diagnosis, and economic/decision analysis. For example, Table 3 shows the levels of evidence developed by the American Society of Plastic Surgeons (ASPS) for prognosis 9 and Table 4 shows the levels developed by the Centre for Evidence Based Medicine (CEBM) for treatment. 10 The two tables highlight the types of studies that are appropriate for the question (prognosis versus treatment) and how quality of data is taken into account when assigning a level. For example, RCTs are not appropriate when looking at the prognosis of a disease. The question in this instance is: “What will happen if we do nothing at all?” Because a prognosis question does not involve comparing treatments, the highest evidence would come from a cohort study or a systematic review of cohort studies. The levels of evidence also take into account the quality of the data. For example, in the chart from CEBM, poorly designed RCTs have the same level of evidence as a cohort study.

Table 3. Levels of Evidence for Prognostic Studies *

Table 4. Levels of Evidence for Therapeutic Studies *

The grading system that provides strength of recommendations based on evidence has also changed over time. Table 5 shows the Grade Practice Recommendations developed by ASPS. The grading system is an important component of evidence-based medicine and assists in clinical decision making. For example, a strong recommendation is given when there is level I evidence and consistent evidence from level II, III, and IV studies available. The grading system does not downgrade lower level evidence when deciding recommendations if the results are consistent.

Table 5. Grade Practice Recommendations *

Interpretation of levels

Many journals assign a level to the papers they publish, and authors often assign a level when submitting an abstract to conference proceedings. This allows the reader to know the level of evidence of the research, but the designated level of evidence does not always guarantee the quality of the research. It is important that readers not assume that level 1 evidence is always the best choice or appropriate for the research question. This concept will be very important for all of us to understand as we evolve into the field of EBM in plastic surgery. By design, our surgical specialty will always have important articles that may have a lower level of evidence because of the innovation and technique articles that are needed to move the specialty forward.

Although RCTs are often assigned the highest level of evidence, not all RCTs are conducted properly and the results should be carefully scrutinized. Sackett 8 stressed the importance of estimating types of errors and the power of studies when interpreting results from RCTs. For example, a poorly conducted RCT may report a negative result due to low power when in fact a real difference exists between treatment groups. Scales such as the Jadad scale have been developed to judge the quality of RCTs. 11 Although physicians may not have the time or inclination to use a scale to assess quality, there are some basic items that should be taken into account. Items used for assessing RCTs include randomization, blinding, a description of the randomization and blinding process, a description of the number of subjects who withdrew or dropped out of the study, the confidence intervals around study estimates, and a description of the power analysis. For example, Bhandari et al. 12 published a paper assessing the quality of surgical RCTs. The authors evaluated the quality of RCTs reported in the Journal of Bone and Joint Surgery (JBJS) from 1988–2000. The authors identified 72 RCTs during this time period and the mean score was 68%. Papers with a score of > 75% were deemed high quality, and 60% of the papers had a score < 75%. The main reasons for the low quality scores were the lack of appropriate randomization, blinding, and a description of patient exclusion criteria. Another paper found that papers in JBJS with a level 1 rating had the same quality scores as those with a level 2 rating. 13 Therefore, one should not assume that level 1 studies have higher quality than level 2.
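The basic appraisal items listed in the paragraph above can be tallied as a simple checklist. The sketch below is a hypothetical illustration only: it is not the Jadad scale and not the instrument used by Bhandari et al. It assumes six equally weighted yes/no items taken from the list in the text, and it borrows the 75% cutoff purely as an example threshold. The names CHECKLIST_ITEMS, quality_score, and is_high_quality are invented for this example.

```python
# Hypothetical checklist: the items come from the list in the text; the equal
# weighting and the reuse of the 75% cutoff are assumptions for illustration.
CHECKLIST_ITEMS = (
    "randomized",
    "blinded",
    "randomization_and_blinding_described",
    "withdrawals_and_dropouts_described",
    "confidence_intervals_reported",
    "power_analysis_described",
)


def quality_score(report: dict) -> float:
    """Return the percentage of checklist items the trial report satisfies."""
    met = sum(1 for item in CHECKLIST_ITEMS if report.get(item, False))
    return 100.0 * met / len(CHECKLIST_ITEMS)


def is_high_quality(report: dict, cutoff: float = 75.0) -> bool:
    """Flag a report as high quality when its score exceeds the cutoff."""
    return quality_score(report) > cutoff


# Example: a trial that reports everything except a power analysis.
example = {item: True for item in CHECKLIST_ITEMS}
example["power_analysis_described"] = False
print(round(quality_score(example), 1), is_high_quality(example))
```

A published scale weights and words its items differently, so a tally like this is best read as a reminder of what to look for in a trial report, not as a validated score.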

Resources for surgeons appraising levels of evidence include the users’ guides published in the Canadian Journal of Surgery 14 , 15 and the Journal of Bone and Joint Surgery. 16 Similar papers that are not specific to surgery have been published in the Journal of the American Medical Association (JAMA). 17 , 18

Plastic surgery and EBM

The field of plastic surgery has been slow to adopt evidence-based medicine. This was demonstrated in a paper examining the level of evidence of papers published in PRS. 19 The authors assigned levels of evidence to papers published in PRS over a 20-year period. The majority of studies (93% in 1983) were level 4 or 5, which denotes case series and case reports. Although the results are disappointing, there was some improvement over time. By 2003 there were more level 1 studies (1.5%) and fewer level 4 and 5 studies (87%). A recent analysis looked at the number of level 1 studies in 5 different plastic surgery journals from 1978–2009. The authors defined level 1 studies as RCTs and meta-analyses and restricted their search to these studies. The number of level 1 studies increased from 1 in 1978 to 32 by 2009. 20 From these results, we see that the field of plastic surgery is improving the level of evidence but still has a way to go, especially in improving the quality of studies published. For example, approximately a third of the studies involved double blinding, but the majority did not randomize subjects, describe the randomization process, or perform a power analysis. Power analysis is another area of concern in plastic surgery. A review of the plastic surgery literature found that the majority of published studies have inadequate power to detect moderate to large differences between treatment groups. 21 No matter what the level of evidence for a study, if it is underpowered, the interpretation of results is questionable.
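To make the concern about underpowered studies concrete, the following is a minimal sketch of an approximate power calculation for a two-group comparison of means. It is not taken from the cited review. It assumes a two-sided test at alpha = 0.05, equal group sizes, and a normal approximation, and it uses a standardized effect size of 0.5 as a stand-in for a "moderate" difference, which is a common convention and an assumption here. The function name approx_power is invented for this example.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample comparison of means.

    Uses the normal approximation; effect_size is the standardized
    difference (difference in means divided by the common SD).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region under the alternative.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Assumed "moderate" effect size of 0.5 with two illustrative sample sizes.
for n in (20, 64):
    print(n, round(approx_power(0.5, n), 2))
```

Under these assumptions, 64 subjects per group gives roughly 80% power, while 20 per group gives well under 50%, so a negative result from the smaller study says little about whether a real difference exists.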

Although the goal is to improve the overall level of evidence in plastic surgery, this does not mean that all lower level evidence should be discarded. Case series and case reports are important for hypothesis generation and can lead to more controlled studies. Additionally, in the face of overwhelming evidence to support a treatment, such as the use of antibiotics for wound infections, there is no need for an RCT.

Clinical examples using levels of evidence

In order to understand how the levels of evidence work and aid the reader in interpreting levels, we provide some examples from the plastic surgery literature. The examples also show the peril of medical decisions based on results from case reports.

An association was hypothesized between lymphoma and silicone breast implants based on case reports. 22 – 27 The level of evidence for case reports, depending on the scale used, is 4 or 5. These case reports were used to generate the hypothesis that a possible association existed. Because of these results, several large retrospective cohort studies from the United States, Canada, Denmark, Sweden and Finland were conducted. 28 – 32 The level of evidence for a retrospective cohort is 2. All of these studies had many years of follow-up for a large number of patients. Some of the studies found an elevated risk and others no risk for lymphoma. None of the studies reached statistical significance. Therefore, higher level evidence from cohort studies does not provide evidence of any risk of lymphoma. Finally, a systematic review was performed that combined the evidence from the retrospective cohorts. 27 The results found an overall standardized incidence ratio of 0.89 (95% CI 0.67–1.18). Because the confidence interval includes 1, the results indicate no statistically significant increase in incidence. The level of evidence for the systematic review is 1. Based on the best available evidence, there is no association between lymphoma and silicone implants. This example shows how low level evidence studies were used to generate a hypothesis, which then led to higher level evidence that disproved the hypothesis. This example also demonstrates that RCTs are not feasible for rare events such as cancer and shows the importance of observational studies for a specific study question. A case-control study is a better option and provides higher evidence for testing the prognosis of the long-term effect of silicone breast implants.
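The reasoning about the confidence interval in the lymphoma example can be sketched in a few lines. The check that the interval contains 1 follows directly from the values reported in the text. The back-calculated p-value is an added illustration that is not reported in the source: it assumes the 95% interval is symmetric on the log scale, which is a common approximation for ratio measures. The function names are invented for this example.

```python
from math import log
from statistics import NormalDist

# Values reported in the text for the pooled standardized incidence ratio.
SIR, CI_LOW, CI_HIGH = 0.89, 0.67, 1.18


def ci_includes_null(low: float, high: float, null_value: float = 1.0) -> bool:
    """True when the interval contains the no-effect value (1 for a ratio)."""
    return low <= null_value <= high


def approx_p_value(estimate: float, low: float, high: float) -> float:
    """Rough two-sided p-value, assuming a symmetric 95% CI on the log scale."""
    z95 = NormalDist().inv_cdf(0.975)
    se_log = (log(high) - log(low)) / (2 * z95)
    z = log(estimate) / se_log
    return 2 * (1 - NormalDist().cdf(abs(z)))


print(ci_includes_null(CI_LOW, CI_HIGH))
print(round(approx_p_value(SIR, CI_LOW, CI_HIGH), 2))
```

An interval that spans 1 and a p-value far above 0.05 tell the same story: the pooled data are compatible with no increase in incidence.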

Another example is the injection of epinephrine in fingers. Based on case reports prior to 1950, physicians were advised that epinephrine injection can result in finger ischemia. 33 We see in this example that level 4 or 5 evidence was accepted as fact and incorporated into medical textbooks and teaching. However, not all physicians accepted this evidence, and some performed injections of epinephrine into the fingers with no adverse effects on the hand. Obviously, it was time for higher level evidence to resolve this issue. An in-depth review of the literature from 1880 to 2000 by Denkler 33 identified 48 cases of digital infarction, of which 21 were injected with epinephrine. Further analysis found that the addition of procaine to the epinephrine injection was the cause of the ischemia. 34 The procaine used in these injections included toxic acidic batches that were recalled in 1948. In addition, several cohort studies found no complications from the use of epinephrine in the fingers and hand. 35 , 36 , 37 The results from these cohort studies increased the level of evidence. Based on the best available evidence from these studies, the hypothesis that epinephrine injection will harm fingers was rejected. This example highlights the biases inherent in case reports. It also shows the risk when spurious evidence is handed down and integrated into medical teaching.

Obtaining the best evidence

We have established the need for RCTs to improve evidence in plastic surgery but have also acknowledged the difficulties, particularly with randomization and blinding. Although RCTs may not be appropriate for many surgical questions, well designed and conducted cohort or case-control studies could boost the level of evidence. Many of the current studies tend to be descriptive and lack a control group. The way forward seems clear. Plastic surgery researchers need to consider utilizing a cohort or case-control design whenever an RCT is not possible. If designed properly, the level of evidence for observational studies can approach or surpass that of an RCT. In some instances, observational studies and RCTs have found similar results. 38 If enough cohort or case-control studies become available, the prospect of systematic reviews of these studies increases, which in turn will raise overall evidence levels in plastic surgery.

The levels of evidence are an important component of EBM. Understanding the levels and why they are assigned to publications and abstracts helps the reader to prioritize information. This is not to say that all level 4 evidence should be ignored and all level 1 evidence accepted as fact. The levels of evidence provide a guide and the reader needs to be cautious when interpreting these results.

Acknowledgments

Supported in part by a Midcareer Investigator Award in Patient-Oriented Research (K24 AR053120) from the National Institute of Arthritis and Musculoskeletal and Skin Diseases (to Dr. Kevin C. Chung).

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
