Case Study Evaluation Approach

A case study evaluation approach can be an incredibly powerful tool for monitoring and evaluating complex programs and policies. By identifying common themes and patterns, this approach allows us to better understand the successes and challenges faced by the program. In this article, we’ll explore the benefits of using a case study evaluation approach in the monitoring and evaluation of projects, programs, and public policies.

Table of Contents

Introduction to Case Study Evaluation Approach
The Advantages of a Case Study Evaluation Approach
Types of Case Studies
Potential Challenges with a Case Study Evaluation Approach
Guiding Principles for Successful Implementation of a Case Study Evaluation Approach
Benefits of Incorporating the Case Study Evaluation Approach in the Monitoring and Evaluation of Projects and Programs

A case study evaluation approach is a great way to gain an in-depth understanding of a particular issue or situation. This type of approach allows the researcher to observe, analyze, and assess the effects of a particular situation on individuals or groups.

An individual, a location, or a project may serve as the focal point of a case study’s attention. Quantitative and qualitative data are frequently used in conjunction with one another.

It also allows the researcher to gain insights into how people react to external influences. By using a case study evaluation approach, researchers can gain insights into how certain factors such as policy change or a new technology have impacted individuals and communities. The data gathered through this approach can be used to formulate effective strategies for responding to changes and challenges. Ultimately, this monitoring and evaluation approach helps organizations make better decisions about the implementation of their plans.

This approach can be used to assess the effectiveness of a policy, program, or initiative by considering specific elements such as implementation processes, outcomes, and impact. A case study evaluation approach can provide an in-depth understanding of the effectiveness of a program by closely examining the processes involved in its implementation. This includes understanding the context, stakeholders, and resources to gain insight into how well a program is functioning or has been executed. Evaluating these elements helps to identify areas for improvement and suggest potential solutions. The findings can then be used to inform decisions about policies, programs, and initiatives for improved outcomes.

It is also useful for determining whether approaches that worked elsewhere could be applied to similar situations. By researching and analyzing the successes of previous cases, evaluators can identify policies, programs, or initiatives that could be adapted to comparable contexts in order to achieve similar or improved outcomes.

A case study evaluation approach offers the advantage of providing in-depth insight into a particular program or policy. This can be accomplished by analyzing data and observations collected from a range of stakeholders such as program participants, service providers, and community members. The monitoring and evaluation approach is used to assess the impact of programs and inform the decision-making process to ensure successful implementation. The case study monitoring and evaluation approach can help identify any underlying issues that need to be addressed in order to improve program effectiveness. It also provides a reality check on how successful programs are actually working, allowing organizations to make adjustments as needed. Overall, a case study monitoring and evaluation approach helps to ensure that policies and programs are achieving their objectives while providing valuable insight into how they are performing overall.

By taking a qualitative approach to data collection and analysis, case study evaluations are able to capture nuances in the context of a particular program or policy that can be overlooked when relying solely on quantitative methods. Using this approach, insights can be gleaned from looking at the individual experiences and perspectives of actors involved, providing a more detailed understanding of the impact of the program or policy than is possible with other evaluation methodologies. As such, case study monitoring evaluation is an invaluable tool in assessing the effectiveness of a particular initiative, enabling more informed decision-making as well as more effective implementation of programs and policies.

Furthermore, this approach is an effective way to uncover experiential information that can inform the ongoing improvement of policy and programming over time. By analyzing the data gathered through this systematic approach, stakeholders can gain deeper insight into how best to make meaningful and long-term changes in their respective organizations.

Case studies come in a variety of forms, each of which can be put to a unique set of evaluation tasks. Evaluators have come to a consensus on describing six distinct sorts of case studies, which are as follows: illustrative, exploratory, critical instance, program implementation, program effects, and cumulative.

Illustrative Case Study

An illustrative case study is a type of case study that is used to provide a detailed and descriptive account of a particular event, situation, or phenomenon. It is often used in research to provide a clear understanding of a complex issue, and to illustrate the practical application of theories or concepts.

An illustrative case study typically uses qualitative data, such as interviews, surveys, or observations, to provide a detailed account of the unit being studied. The case study may also include quantitative data, such as statistics or numerical measurements, to provide additional context or to support the qualitative data.

The goal of an illustrative case study is to provide a rich and detailed description of the unit being studied, and to use this information to illustrate broader themes or concepts. For example, an illustrative case study of a successful community development project may be used to illustrate the importance of community engagement and collaboration in achieving development goals.

One of the strengths of an illustrative case study is its ability to provide a detailed and nuanced understanding of a particular issue or phenomenon. By focusing on a single case, the researcher is able to provide a detailed and in-depth analysis that may not be possible through other research methods.

However, one limitation of an illustrative case study is that the findings may not be generalizable to other contexts or populations. Because the case study focuses on a single unit, it may not be representative of other similar units or situations.

A well-executed case study can shed light on wider research topics or concepts through its thorough and descriptive analysis of a specific event or phenomenon.

Exploratory Case Study

An exploratory case study is a type of case study that is used to investigate a new or previously unexplored phenomenon or issue. It is often used in research when the topic is relatively unknown or when there is little existing literature on the topic.

Exploratory case studies are typically qualitative in nature and use a variety of methods to collect data, such as interviews, observations, and document analysis. The focus of the study is to gather as much information as possible about the phenomenon being studied and to identify new and emerging themes or patterns.

The goal of an exploratory case study is to provide a foundation for further research and to generate hypotheses about the phenomenon being studied. By exploring the topic in-depth, the researcher can identify new areas of research and generate new questions to guide future research.

One of the strengths of an exploratory case study is its ability to provide a rich and detailed understanding of a new or emerging phenomenon. By using a variety of data collection methods, the researcher can gather a broad range of data and perspectives to gain a more comprehensive understanding of the phenomenon being studied.

However, one limitation of an exploratory case study is that the findings may not be generalizable to other contexts or populations. Because the study is focused on a new or previously unexplored phenomenon, the findings may not be applicable to other situations or populations.

Exploratory case studies are an effective research strategy for learning about novel occurrences, developing research hypotheses, and gaining a deep familiarity with a topic of study.

Critical Instance Case Study

A critical instance case study is a type of case study that focuses on a specific event or situation that is critical to understanding a broader issue or phenomenon. The goal of a critical instance case study is to analyze the event in depth and to draw conclusions about the broader issue or phenomenon based on the analysis.

A critical instance case study typically uses qualitative data, such as interviews, observations, or document analysis, to provide a detailed and nuanced understanding of the event being studied. The data are analyzed using various methods, such as content analysis or thematic analysis, to identify patterns and themes that emerge from the data.

The critical instance case study is often used in research when a particular event or situation is critical to understanding a broader issue or phenomenon. For example, a critical instance case study of a successful disaster response effort may be used to identify key factors that contributed to the success of the response, and to draw conclusions about effective disaster response strategies more broadly.

One of the strengths of a critical instance case study is its ability to provide a detailed and in-depth analysis of a particular event or situation. By focusing on a critical instance, the researcher is able to provide a rich and nuanced understanding of the event, and to draw conclusions about broader issues or phenomena based on the analysis.

However, one limitation of a critical instance case study is that the findings may not be generalizable to other contexts or populations. Because the case study focuses on a specific event or situation, the findings may not be applicable to other similar events or situations.

A critical instance case study is a valuable research method that can provide a detailed and nuanced understanding of a particular event or situation and can be used to draw conclusions about broader issues or phenomena based on the analysis.

Program Implementation Case Study

A program implementation case study is a type of case study that focuses on the implementation of a particular program or intervention. The goal of the case study is to provide a detailed and comprehensive account of the program implementation process, and to identify factors that contributed to the success or failure of the program.

Program implementation case studies typically use qualitative data, such as interviews, observations, and document analysis, to provide a detailed and nuanced understanding of the program implementation process. The data are analyzed using various methods, such as content analysis or thematic analysis, to identify patterns and themes that emerge from the data.

The program implementation case study is often used in research to evaluate the effectiveness of a particular program or intervention, and to identify strategies for improving program implementation in the future. For example, a program implementation case study of a school-based health program may be used to identify key factors that contributed to the success or failure of the program, and to make recommendations for improving program implementation in similar settings.

One of the strengths of a program implementation case study is its ability to provide a detailed and comprehensive account of the program implementation process. By using qualitative data, the researcher is able to capture the complexity and nuance of the implementation process, and to identify factors that may not be captured by quantitative data alone.

However, one limitation of a program implementation case study is that the findings may not be generalizable to other contexts or populations. Because the case study focuses on a specific program or intervention, the findings may not be applicable to other programs or interventions in different settings.

A program implementation case study is an effective research tool that can illuminate the intricacies of the implementation process and point the way towards future enhancements.

Program Effects Case Study

A program effects case study is a research method that evaluates the effectiveness of a particular program or intervention by examining its outcomes or effects. The purpose of this type of case study is to provide a detailed and comprehensive account of the program’s impact on its intended participants or target population.

A program effects case study typically employs both quantitative and qualitative data collection methods, such as surveys, interviews, and observations, to evaluate the program’s impact on the target population. The data is then analyzed using statistical and thematic analysis to identify patterns and themes that emerge from the data.

The program effects case study is often used to evaluate the success of a program and identify areas for improvement. For example, a program effects case study of a community-based HIV prevention program may evaluate the program’s effectiveness in reducing HIV transmission rates among high-risk populations and identify factors that contributed to the program’s success.

One of the strengths of a program effects case study is its ability to provide a detailed and nuanced understanding of a program’s impact on its intended participants or target population. By using both quantitative and qualitative data, the researcher can capture both the objective and subjective outcomes of the program and identify factors that may have contributed to the outcomes.

However, a limitation of the program effects case study is that it may not be generalizable to other populations or contexts. Since the case study focuses on a particular program and population, the findings may not be applicable to other programs or populations in different settings.

A program effects case study is a good way to do research because it can give a detailed look at how a program affects the people it is meant for. This kind of case study can be used to figure out what needs to be changed and how to make programs that work better.

Cumulative Case Study

A cumulative case study is a type of case study that involves the collection and analysis of multiple cases to draw broader conclusions. Unlike a single-case study, which focuses on one specific case, a cumulative case study combines multiple cases to provide a more comprehensive understanding of a phenomenon.

The purpose of a cumulative case study is to build up a body of evidence through the examination of multiple cases. The cases are typically selected to represent a range of variations or perspectives on the phenomenon of interest. Data is collected from each case using a range of methods, such as interviews, surveys, and observations.

The data is then analyzed across cases to identify common themes, patterns, and trends. The analysis may involve both qualitative and quantitative methods, such as thematic analysis and statistical analysis.
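As a rough illustration of the cross-case step described above, the following Python sketch tallies how many cases each coded theme appears in and keeps the themes shared by most of them. The case names, theme labels, and the `min_share` threshold are all hypothetical, invented for this example. Real thematic analysis relies on interpretive coding that a simple tally does not capture.

```python
from collections import Counter

# Hypothetical coded data: each case maps to the set of themes an
# evaluator assigned to it during qualitative coding (illustrative only).
coded_cases = {
    "case_a": {"community engagement", "stable funding", "staff turnover"},
    "case_b": {"community engagement", "stable funding"},
    "case_c": {"community engagement", "local leadership"},
    "case_d": {"stable funding", "local leadership", "community engagement"},
}


def common_themes(cases, min_share=0.75):
    """Return themes present in at least `min_share` of cases, with counts."""
    counts = Counter(theme for themes in cases.values() for theme in themes)
    threshold = min_share * len(cases)
    return {theme: n for theme, n in counts.items() if n >= threshold}


shared = common_themes(coded_cases)
# Print the most widely shared themes first, ties broken alphabetically.
for theme, n in sorted(shared.items(), key=lambda item: (-item[1], item[0])):
    print(f"{theme}: {n} of {len(coded_cases)} cases")
```

Storing each case's themes as a set means a theme is counted once per case, so the tally reflects how widely a theme is shared across cases rather than how often it was mentioned within one.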

The cumulative case study is often used in research to develop and test theories about a phenomenon. For example, a cumulative case study of successful community-based health programs may be used to identify common factors that contribute to program success, and to develop a theory about effective community-based health program design.

One of the strengths of the cumulative case study is its ability to draw on a range of cases to build a more comprehensive understanding of a phenomenon. By examining multiple cases, the researcher can identify patterns and trends that may not be evident in a single case study. This allows for a more nuanced understanding of the phenomenon and helps to develop more robust theories.

However, one limitation of the cumulative case study is that it can be time-consuming and resource-intensive to collect and analyze data from multiple cases. Additionally, the selection of cases may introduce bias if the cases are not representative of the population of interest.

In summary, a cumulative case study is a valuable research method that can provide a more comprehensive understanding of a phenomenon by examining multiple cases. This type of case study is particularly useful for developing and testing theories and identifying common themes and patterns across cases.

When conducting a case study evaluation, one of the main challenges is the need to establish a contextually relevant research design that accounts for the unique factors of the case being studied. This requires close monitoring of the case, its environment, and relevant stakeholders. In addition, the researcher must build a framework for the collection and analysis of data that is able to draw meaningful conclusions and provide valid insights into the dynamics of the case. Ultimately, an effective case study evaluation approach will allow researchers to form an accurate understanding of their research subject.

Additionally, depending on the size and scope of the case, there may be concerns regarding the availability of resources and personnel that could be allocated to data collection and analysis. To address these issues, a case study monitoring evaluation approach can be adopted, which would involve a mix of different methods such as interviews, surveys, focus groups and document reviews. Such an approach could provide valuable insights into the effectiveness and implementation of the case in question. Additionally, this type of evaluation can be tailored to the specific needs of the case study to ensure that all relevant data is collected and respected.

When dealing with highly sensitive or confidential subject matter within a case study, researchers must take extra measures to prevent bias during data collection and to protect participant anonymity, while still collecting valid data that yield reliable results.

Moreover, when conducting a case study evaluation it is important to consider the potential implications of the data gathered. By taking extra measures to prevent bias and protect participant anonymity, researchers can ensure reliable results while also collecting valid data. Maintaining confidentiality and deploying ethical research practices are essential when conducting a case study to ensure an unbiased and accurate monitoring evaluation.

When planning and implementing a case study evaluation approach, it is important to ensure the guiding principles of research quality, data collection, and analysis are met. To ensure these principles are upheld, it is essential to develop a comprehensive monitoring and evaluation plan. This plan should clearly outline the steps to be taken during the data collection and analysis process. Furthermore, the plan should provide detailed descriptions of the project objectives, target population, key indicators, and timeline. It is also important to include metrics or benchmarks to monitor progress and identify any potential areas for improvement. By implementing such an approach, it will be possible to ensure that the case study evaluation approach yields valid and reliable results.
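The plan elements listed above (objectives, target population, key indicators, timeline, and benchmarks) can be sketched as a small data structure. The Python example below is a minimal illustration under stated assumptions: the class names, field names, sample indicators, and the rule that progress should keep pace linearly with elapsed time are all hypothetical and are not prescribed by any particular monitoring and evaluation framework.

```python
from dataclasses import dataclass, field


@dataclass
class Indicator:
    name: str
    baseline: float
    target: float  # benchmark to reach by the end of the timeline

    def progress(self, current: float) -> float:
        """Share of the baseline-to-target distance covered so far."""
        span = self.target - self.baseline
        if span == 0:
            return 1.0
        return (current - self.baseline) / span


@dataclass
class EvaluationPlan:
    objectives: list[str]
    target_population: str
    timeline_months: int
    indicators: list[Indicator] = field(default_factory=list)

    def behind_schedule(self, current_values, months_elapsed):
        """Names of indicators whose progress trails the elapsed share of time.

        Assumes progress should be roughly linear over the timeline,
        which is a simplification made for this sketch.
        """
        expected = months_elapsed / self.timeline_months
        return [
            ind.name
            for ind in self.indicators
            if ind.progress(current_values[ind.name]) < expected
        ]


# Hypothetical plan and mid-point readings, for illustration only.
plan = EvaluationPlan(
    objectives=["Improve school attendance"],
    target_population="Primary school pupils in the study district",
    timeline_months=12,
    indicators=[
        Indicator("attendance rate (%)", baseline=60, target=80),
        Indicator("households interviewed", baseline=0, target=200),
    ],
)
current = {"attendance rate (%)": 72, "households interviewed": 80}
print(plan.behind_schedule(current, months_elapsed=6))
```

Keeping the benchmarks next to the indicators they belong to makes it straightforward to flag areas that may need attention at each review point.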

To ensure successful implementation, it is essential to establish a reliable data collection process that documents the scope of the study, the participants involved, and the methods used to collect data. It is equally important to have a clear understanding of what will be examined through the evaluation process and how the results will be used. Ultimately, effective planning is key to ensuring that the evaluation process yields meaningful insights.

Benefits of Incorporating the Case Study Evaluation Approach in the Monitoring and Evaluation of Projects and Programmes

Using a case study approach in monitoring and evaluation allows for a more detailed and in-depth exploration of the project’s success, helping to identify key areas of improvement and successes that may have been overlooked through traditional evaluation. Through this case study method, specific data can be collected and analyzed to identify trends and different perspectives that can support the evaluation process. This data can allow stakeholders to gain a better understanding of the project’s successes and failures, helping them make informed decisions on how to strengthen current activities or shape future initiatives. From a monitoring and evaluation standpoint, this approach can make the assessment of the project’s effectiveness more accurate.

This can provide valuable insights into what works, and what doesn’t, when it comes to implementing projects and programs, aiding decision-makers in making future plans that better meet their objectives. However, a case study is only one way of assessing the success of a project or program. It offers useful insight into which initiatives may be successful, but other research methods, such as surveys and interviews, can also help to evaluate a project or program further.

In conclusion, a case study evaluation approach can be incredibly useful in monitoring and evaluating complex programs and policies. By exploring key themes, patterns and relationships, organizations can gain a detailed understanding of the successes, challenges and limitations of their program or policy. This understanding can then be used to inform decision-making and improve outcomes for those involved. With its ability to provide an in-depth understanding of a program or policy, the case study evaluation approach has become an invaluable tool for monitoring and evaluation professionals.

15.7 Evaluation: Presentation and Analysis of Case Study

Learning Outcomes

By the end of this section, you will be able to:

Revise writing to follow the genre conventions of case studies.
Evaluate the effectiveness and quality of a case study report.

Case studies follow a structure of background and context, methods, findings, and analysis. Body paragraphs should have main points and concrete details. In addition, case studies are written in formal language with precise wording and with a specific purpose and audience (generally other professionals in the field) in mind. Case studies also adhere to the conventions of the discipline’s formatting guide (APA Documentation and Format in this study). Compare your case study with the following rubric as a final check.

Critical Language Awareness	Clarity and Coherence	Rhetorical Choices
The text always adheres to the “Editing Focus” of this chapter: words often confused, as discussed in Section 15.6. The text also shows ample evidence of the writer’s intent to consciously meet or challenge conventional expectations in rhetorically effective ways.	Paragraphs are unified under a single, clear topic. Abundant background and supporting details provide a sense of completeness. Evidence of qualitative and quantitative data collection is clear. Transitions and subheads connect ideas and sections, thus establishing coherence throughout. Applicable visuals clarify abstract ideas.	The writer clearly and consistently recognizes and works within the limits and purpose of the case study. The writer engages the audience by inviting them to contribute to the research and suggests ways for doing so. The implications, relevance, and consequences of the research are explained. The study shows mature command of language and consistent objectivity. Quotations from participant(s) are accurate and relevant.
The text usually adheres to the “Editing Focus” of this chapter: words often confused, as discussed in Section 15.6. The text also shows some evidence of the writer’s intent to consciously meet or challenge conventional expectations in rhetorically effective ways.	Paragraphs usually are unified under a single, clear topic. Background and supporting details provide a sense of completeness. Evidence of qualitative and quantitative data collection is clear. Transitions and subheads connect ideas and sections, thus establishing coherence. Applicable visuals clarify abstract ideas.	The writer usually recognizes and works within the limits and purpose of the case study. The writer engages the audience by inviting them to contribute to the research and usually suggests ways for doing so. The implications, relevance, and consequences of the research are explained. The study shows command of language and objectivity. Quotations from participant(s) are usually accurate and relevant.
The text generally adheres to the “Editing Focus” of this chapter: words often confused, as discussed in Section 15.6. The text also shows limited evidence of the writer’s intent to consciously meet or challenge conventional expectations in rhetorically effective ways.	Paragraphs generally are unified under a single, clear topic. Background and supporting details provide a sense of completeness. Some evidence of qualitative and quantitative data collection is clear. Some transitions and subheads connect ideas and sections, generally establishing coherence. Visuals may clarify abstract ideas or may seem irrelevant.	The writer generally recognizes and works within the limits and purpose of the case study. The writer sometimes engages the audience by inviting them to contribute to the research but may not suggest ways for doing so. The implications, relevance, and consequences of the research are explained, if not fully. The study shows some command of language and objectivity. Quotations from participant(s) are generally accurate, if not always relevant.
The text occasionally adheres to the “Editing Focus” of this chapter: words often confused, as discussed in Section 15.6. The text also shows emerging evidence of the writer’s intent to consciously meet or challenge conventional expectations in rhetorically effective ways.	Paragraphs sometimes are unified under a single, clear topic. Background and supporting details are insufficient to provide a sense of completeness. There is little evidence of qualitative or quantitative data collection. Some transitions and subheads connect ideas and sections, but coherence may be lacking. Visuals are either missing or irrelevant.	The writer occasionally recognizes and works within the limits and purpose of the case study. The writer rarely engages the audience by inviting them to contribute to the research or suggests ways for doing so. The implications, relevance, and consequences of the research are haphazardly explained, if at all. The study shows little command of language or objectivity. Quotations from participant(s) are questionable and often irrelevant.
The text does not adhere to the “Editing Focus” of this chapter: words often confused, as discussed in Section 15.6. The text also shows little to no evidence of the writer’s intent to consciously meet or challenge conventional expectations in rhetorically effective ways.	Paragraphs are not unified under a single, clear topic. Background and supporting details are insufficient to provide a sense of completeness. There is little evidence of qualitative or quantitative data collection. Transitions and subheads are missing or inappropriate to provide coherence. Visuals are either missing or irrelevant.	The writer does not recognize or work within the limits and purpose of the case study. The writer does not engage the audience by inviting them to contribute to the research. The implications, relevance, and consequences of the research are haphazardly explained, if at all. The study shows little command of language or objectivity. Quotations, if any, from participant(s) are questionable and often irrelevant.

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute OpenStax.

Access for free at https://openstax.org/books/writing-guide/pages/1-unit-introduction

Authors: Michelle Bachelor Robinson, Maria Jerskey, featuring Toby Fulwiler
Publisher/website: OpenStax
Book title: Writing Guide with Handbook
Publication date: Dec 21, 2021
Location: Houston, Texas
Book URL: https://openstax.org/books/writing-guide/pages/1-unit-introduction
Section URL: https://openstax.org/books/writing-guide/pages/15-7-evaluation-presentation-and-analysis-of-case-study

© Dec 19, 2023 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License. The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.

What Is a Case Study? | Definition, Examples & Methods

Published on May 8, 2019 by Shona McCombes. Revised on November 20, 2023.

A case study is a detailed study of a specific subject, such as a person, group, place, event, organization, or phenomenon. Case studies are commonly used in social, educational, clinical, and business research.

A case study research design usually involves qualitative methods, but quantitative methods are sometimes also used. Case studies are good for describing, comparing, evaluating and understanding different aspects of a research problem.

When to do a case study
Step 1: Select a case
Step 2: Build a theoretical framework
Step 3: Collect your data
Step 4: Describe and analyze the case
Other interesting articles

A case study is an appropriate research design when you want to gain concrete, contextual, in-depth knowledge about a specific real-world subject. It allows you to explore the key characteristics, meanings, and implications of the case.

Case studies are often a good choice in a thesis or dissertation. They keep your project focused and manageable when you don’t have the time or resources to do large-scale research.

You might use just one complex case study where you explore a single subject in depth, or conduct multiple case studies to compare and illuminate different aspects of your research problem.

Case study examples
Research question	Case study
What are the ecological effects of wolf reintroduction?	Case study of wolf reintroduction in Yellowstone National Park
How do populist politicians use narratives about history to gain support?	Case studies of Hungarian prime minister Viktor Orbán and US president Donald Trump
How can teachers implement active learning strategies in mixed-level classrooms?	Case study of a local school that promotes active learning
What are the main advantages and disadvantages of wind farms for rural communities?	Case studies of three rural wind farm development projects in different parts of the country
How are viral marketing strategies changing the relationship between companies and consumers?	Case study of the iPhone X marketing campaign
How do experiences of work in the gig economy differ by gender, race and age?	Case studies of Deliveroo and Uber drivers in London

Once you have developed your problem statement and research questions, you should be ready to choose the specific case that you want to focus on. A good case study should have the potential to:

Provide new or unexpected insights into the subject
Challenge or complicate existing assumptions and theories
Propose practical courses of action to resolve a problem
Open up new directions for future research

Tip: If your research is more practical in nature and aims to simultaneously investigate an issue as you solve it, consider conducting action research instead.

Unlike quantitative or experimental research, a strong case study does not require a random or representative sample. In fact, case studies often deliberately focus on unusual, neglected, or outlying cases which may shed new light on the research problem.

Example of an outlying case study: In the 1960s, the town of Roseto, Pennsylvania was discovered to have extremely low rates of heart disease compared to the US average. It became an important case study for understanding previously neglected causes of heart disease.

However, you can also choose a more common or representative case to exemplify a particular category, experience or phenomenon.

Example of a representative case study: In the 1920s, two sociologists used Muncie, Indiana as a case study of a typical American city that supposedly exemplified the changing culture of the US at the time.

While case studies focus more on concrete details than general theories, they should usually have some connection with theory in the field. This way the case study is not just an isolated description, but is integrated into existing knowledge about the topic. It might aim to:

Exemplify a theory by showing how it explains the case under investigation
Expand on a theory by uncovering new concepts and ideas that need to be incorporated
Challenge a theory by exploring an outlier case that doesn’t fit with established assumptions

To ensure that your analysis of the case has a solid academic grounding, you should conduct a literature review of sources related to the topic and develop a theoretical framework. This means identifying key concepts and theories to guide your analysis and interpretation.

There are many different research methods you can use to collect data on your subject. Case studies tend to focus on qualitative data using methods such as interviews, observations, and analysis of primary and secondary sources (e.g., newspaper articles, photographs, official records). Sometimes a case study will also collect quantitative data.

Example of a mixed methods case study: For a case study of a wind farm development in a rural area, you could collect quantitative data on employment rates and business revenue, collect qualitative data on local people’s perceptions and experiences, and analyze local and national media coverage of the development.

The aim is to gain as thorough an understanding as possible of the case and its context.

In writing up the case study, you need to bring together all the relevant aspects to give as complete a picture as possible of the subject.

How you report your findings depends on the type of research you are doing. Some case studies are structured like a standard scientific paper or thesis, with separate sections or chapters for the methods, results and discussion.

Others are written in a more narrative style, aiming to explore the case from various angles and analyze its meanings and implications (for example, by using textual analysis or discourse analysis).

In all cases, though, make sure to give contextual details about the case, connect it back to the literature and theory, and discuss how it fits into wider patterns or debates.

Cite this Scribbr article

McCombes, S. (2023, November 20). What Is a Case Study? | Definition, Examples & Methods. Scribbr. Retrieved August 14, 2024, from https://www.scribbr.com/methodology/case-study/

Open access
Published: 10 November 2020

Case study research for better evaluations of complex interventions: rationale and challenges

Sara Paparini (ORCID: orcid.org/0000-0002-1909-2481), Judith Green, Chrysanthi Papoutsi, Jamie Murdoch, Mark Petticrew, Trish Greenhalgh, Benjamin Hanckel & Sara Shaw

BMC Medicine volume 18, Article number: 301 (2020)

The need for better methods for evaluation in health research has been widely recognised. The ‘complexity turn’ has drawn attention to the limitations of relying on causal inference from randomised controlled trials alone for understanding whether, and under which conditions, interventions in complex systems improve health services or the public health, and what mechanisms might link interventions and outcomes. We argue that case study research—currently denigrated as poor evidence—is an under-utilised resource for not only providing evidence about context and transferability, but also for helping strengthen causal inferences when pathways between intervention and effects are likely to be non-linear.

Case study research, as an overall approach, is based on in-depth explorations of complex phenomena in their natural, or real-life, settings. Empirical case studies typically enable dynamic understanding of complex challenges and provide evidence about causal mechanisms and the necessary and sufficient conditions (contexts) for intervention implementation and effects. This is essential evidence not just for researchers concerned about internal and external validity, but also research users in policy and practice who need to know what the likely effects of complex programmes or interventions will be in their settings. The health sciences have much to learn from scholarship on case study methodology in the social sciences. However, there are multiple challenges in fully exploiting the potential learning from case study research. First are misconceptions that case study research can only provide exploratory or descriptive evidence. Second, there is little consensus about what a case study is, and considerable diversity in how empirical case studies are conducted and reported. Finally, as case study researchers typically (and appropriately) focus on thick description (that captures contextual detail), it can be challenging to identify the key messages related to intervention evaluation from case study reports.

Whilst the diversity of published case studies in health services and public health research is rich and productive, we recommend further clarity and specific methodological guidance for those reporting case study research for evaluation audiences.

The need for methodological development to address the most urgent challenges in health research has been well-documented. Many of the most pressing questions for public health research, where the focus is on system-level determinants [ 1 , 2 ], and for health services research, where provisions typically vary across sites and are provided through interlocking networks of services [ 3 ], require methodological approaches that can attend to complexity. The need for methodological advance has arisen, in part, as a result of the diminishing returns from randomised controlled trials (RCTs) where they have been used to answer questions about the effects of interventions in complex systems [ 4 , 5 , 6 ]. In conditions of complexity, there is limited value in maintaining the current orientation to experimental trial designs in the health sciences as providing ‘gold standard’ evidence of effect.

There are increasing calls for methodological pluralism [ 7 , 8 ], with the recognition that complex intervention and context are not easily or usefully separated (as is often the situation when using trial design), and that system interruptions may have effects that are not reducible to linear causal pathways between intervention and outcome. These calls are reflected in a shifting and contested discourse of trial design, seen with the emergence of realist [ 9 ], adaptive and hybrid (types 1, 2 and 3) [ 10 , 11 ] trials that blend studies of effectiveness with a close consideration of the contexts of implementation. Similarly, process evaluation has now become a core component of complex healthcare intervention trials, reflected in MRC guidance on how to explore implementation, causal mechanisms and context [ 12 ].

Evidence about the context of an intervention is crucial for questions of external validity. As Woolcock [ 4 ] notes, even if RCT designs are accepted as robust for maximising internal validity, questions of transferability (how well the intervention works in different contexts) and generalisability (how well the intervention can be scaled up) remain unanswered [ 5 , 13 ]. For research evidence to have impact on policy and systems organisation, and thus to improve population and patient health, there is an urgent need for better methods for strengthening external validity, including a better understanding of the relationship between intervention and context [ 14 ].

Policymakers, healthcare commissioners and other research users require credible evidence of relevance to their settings and populations [ 15 ], to perform what Rosengarten and Savransky [ 16 ] call ‘careful abstraction’ to the locales that matter for them. They also require robust evidence for understanding complex causal pathways. Case study research, currently under-utilised in public health and health services evaluation, can offer considerable potential for strengthening faith in both external and internal validity. For example, in an empirical case study of how the policy of free bus travel had specific health effects in London, UK, a quasi-experimental evaluation (led by JG) identified how important aspects of context (a good public transport system) and intervention (that it was universal) were necessary conditions for the observed effects, thus providing useful, actionable evidence for decision-makers in other contexts [ 17 ].

The overall approach of case study research is based on the in-depth exploration of complex phenomena in their natural, or ‘real-life’, settings. Empirical case studies typically enable dynamic understanding of complex challenges rather than restricting the focus on narrow problem delineations and simple fixes. Case study research is a diverse and somewhat contested field, with multiple definitions and perspectives grounded in different ways of viewing the world, and involving different combinations of methods. In this paper, we raise awareness of such plurality and highlight the contribution that case study research can make to the evaluation of complex system-level interventions. We review some of the challenges in exploiting the current evidence base from empirical case studies and conclude by recommending that further guidance and minimum reporting criteria for evaluation using case studies, appropriate for audiences in the health sciences, can enhance the take-up of evidence from case study research.

Case study research offers evidence about context, causal inference in complex systems and implementation

Well-conducted and described empirical case studies provide evidence on context, complexity and mechanisms for understanding how, where and why interventions have their observed effects. Recognition of the importance of context for understanding the relationships between interventions and outcomes is hardly new. In 1943, Canguilhem berated an over-reliance on experimental designs for determining universal physiological laws: ‘As if one could determine a phenomenon’s essence apart from its conditions! As if conditions were a mask or frame which changed neither the face nor the picture!’ ([ 18 ] p126). More recently, a concern with context has been expressed in health systems and public health research as part of what has been called the ‘complexity turn’ [ 1 ]: a recognition that many of the most enduring challenges for developing an evidence base require a consideration of system-level effects [ 1 ] and the conceptualisation of interventions as interruptions in systems [ 19 ].

The case study approach is widely recognised as offering an invaluable resource for understanding the dynamic and evolving influence of context on complex, system-level interventions [ 20 , 21 , 22 , 23 ]. Empirically, case studies can directly inform assessments of where, when, how and for whom interventions might be successfully implemented, by helping to specify the necessary and sufficient conditions under which interventions might have effects and to consolidate learning on how interdependencies, emergence and unpredictability can be managed to achieve and sustain desired effects. Case study research has the potential to address four objectives for improving research and reporting of context recently set out by guidance on taking account of context in population health research [ 24 ], that is to (1) improve the appropriateness of intervention development for specific contexts, (2) improve understanding of ‘how’ interventions work, (3) better understand how and why impacts vary across contexts and (4) ensure reports of intervention studies are most useful for decision-makers and researchers.

However, evaluations of complex healthcare interventions have arguably not exploited the full potential of case study research and can learn much from other disciplines. For evaluative research, exploratory case studies have had a traditional role of providing data on ‘process’, or initial ‘hypothesis-generating’ scoping, but might also have an increasing salience for explanatory aims. Across the social and political sciences, different kinds of case studies are undertaken to meet diverse aims (description, exploration or explanation) and across different scales (from small N qualitative studies that aim to elucidate processes, or provide thick description, to more systematic techniques designed for medium-to-large N cases).

Case studies with explanatory aims vary in terms of their positioning within mixed-methods projects, with designs including (but not restricted to) (1) single N of 1 studies of interventions in specific contexts, where the overall design is a case study that may incorporate one or more (randomised or not) comparisons over time and between variables within the case; (2) a series of cases conducted or synthesised to provide explanation from variations between cases; and (3) case studies of particular settings within RCT or quasi-experimental designs to explore variation in effects or implementation.

Detailed qualitative research (typically done as ‘case studies’ within process evaluations) provides evidence for the plausibility of mechanisms [ 25 ], offering theoretical generalisations for how interventions may function under different conditions. Although RCT designs reduce many threats to internal validity, the mechanisms of effect remain opaque, particularly when the causal pathways between ‘intervention’ and ‘effect’ are long and potentially non-linear: case study research has a more fundamental role here, in providing detailed observational evidence for causal claims [ 26 ] as well as producing a rich, nuanced picture of tensions and multiple perspectives [ 8 ].

Longitudinal or cross-case analysis may be best suited for evidence generation in system-level evaluative research. Turner [ 27 ], for instance, reflecting on the complex processes in major system change, has argued for the need for methods that integrate learning across cases, to develop theoretical knowledge that would enable inferences beyond the single case, and to develop generalisable theory about organisational and structural change in health systems. Qualitative Comparative Analysis (QCA) [ 28 ] is one such formal method for deriving causal claims, using set theory mathematics to integrate data from empirical case studies to answer questions about the configurations of causal pathways linking conditions to outcomes [ 29 , 30 ].
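The configurational logic behind QCA can be illustrated with a minimal sketch. It is not the full QCA procedure (which also involves calibration and Boolean minimisation); it only builds a crisp-set truth table from a handful of cases. The case names, condition names and values below are invented for illustration and do not come from any study cited here.

```python
from collections import defaultdict

# Hypothetical, made-up cases: 1 = condition or outcome present, 0 = absent.
# These values are illustrative assumptions, not data from the article.
cases = [
    {"name": "Site A", "good_transport": 1, "universal_offer": 1, "outcome": 1},
    {"name": "Site B", "good_transport": 1, "universal_offer": 1, "outcome": 1},
    {"name": "Site C", "good_transport": 1, "universal_offer": 0, "outcome": 0},
    {"name": "Site D", "good_transport": 0, "universal_offer": 1, "outcome": 0},
    {"name": "Site E", "good_transport": 0, "universal_offer": 0, "outcome": 0},
]
conditions = ["good_transport", "universal_offer"]


def truth_table(cases, conditions):
    """Group cases by their configuration of conditions.

    For each configuration, report how many cases share it and the share of
    those cases in which the outcome is present (crisp-set consistency).
    """
    rows = defaultdict(list)
    for case in cases:
        config = tuple(case[c] for c in conditions)
        rows[config].append(case["outcome"])
    return {
        config: {"n": len(outcomes), "consistency": sum(outcomes) / len(outcomes)}
        for config, outcomes in rows.items()
    }


table = truth_table(cases, conditions)
for config, row in sorted(table.items(), reverse=True):
    print(dict(zip(conditions, config)), row)
```

In this toy data set, only the configuration in which both conditions are present is consistently linked to the outcome. That is the kind of pattern a QCA analyst would then examine against case knowledge before making any causal claim.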

Nonetheless, the single N case study, too, provides opportunities for theoretical development [ 31 ], and theoretical generalisation or analytical refinement [ 32 ]. How ‘the case’ and ‘context’ are conceptualised is crucial here. Findings from the single case may seem to be confined to its intrinsic particularities in a specific and distinct context [ 33 ]. However, if such context is viewed as exemplifying wider social and political forces, the single case can be ‘telling’, rather than ‘typical’, and offer insight into a wider issue [ 34 ]. Internal comparisons within the case can offer rich possibilities for logical inferences about causation [ 17 ]. Further, case studies of any size can be used for theory testing through refutation [ 22 ]. The potential lies, then, in utilising the strengths and plurality of case study to support theory-driven research within different methodological paradigms.

Evaluation research in health has much to learn from a range of social sciences where case study methodology has been used to develop various kinds of causal inference. For instance, Gerring [ 35 ] expands on the within-case variations utilised to make causal claims. For Gerring [ 35 ], case studies come into their own with regard to invariant or strong causal claims (such as X is a necessary and/or sufficient condition for Y) rather than for probabilistic causal claims. For the latter (where experimental methods might have an advantage in estimating effect sizes), case studies offer evidence on mechanisms: from observations of X affecting Y, from process tracing or from pattern matching. Case studies also support the study of emergent causation, that is, the multiple interacting properties that account for particular and unexpected outcomes in complex systems, such as in healthcare [ 8 ].
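The distinction between necessary and sufficient conditions can be made concrete with a short sketch. The function names and the observations are hypothetical, and the checks are deterministic set relations rather than anything Gerring prescribes: X is treated as sufficient for Y if Y occurs whenever X does, and as necessary for Y if X is present whenever Y occurs.

```python
def is_sufficient(observations, x, y):
    """True if every observation with X present also has Y present."""
    # all() over an empty selection is vacuously True, so a condition that
    # is never observed would trivially pass; real analyses need enough cases.
    return all(obs[y] for obs in observations if obs[x])


def is_necessary(observations, x, y):
    """True if every observation with Y present also has X present."""
    return all(obs[x] for obs in observations if obs[y])


# Hypothetical within-case observations (illustrative only).
observations = [
    {"X": True, "Y": True},
    {"X": True, "Y": False},
    {"X": False, "Y": False},
]

print("X sufficient for Y:", is_sufficient(observations, "X", "Y"))
print("X necessary for Y:", is_necessary(observations, "X", "Y"))
```

A single contradicting observation is enough to refute an invariant claim of this kind, which is one reason case studies of any size can be used for theory testing through refutation.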

Finally, efficacy (or beliefs about efficacy) is not the only contributor to intervention uptake, with a range of organisational and policy contingencies affecting whether an intervention is likely to be rolled out in practice. Case study research is, therefore, invaluable for learning about contextual contingencies and identifying the conditions necessary for interventions to become normalised (i.e. implemented routinely) in practice [ 36 ].

The challenges in exploiting evidence from case study research

At present, there are significant challenges in exploiting the benefits of case study research in evaluative health research, which relate to status, definition and reporting. Case study research has been marginalised at the bottom of an evidence hierarchy, seen to offer little by way of explanatory power, if nonetheless useful for adding descriptive data on process or providing useful illustrations for policymakers [ 37 ]. This is an opportune moment to revisit this low status. As health researchers are increasingly charged with evaluating ‘natural experiments’—the use of face masks in the response to the COVID-19 pandemic being a recent example [ 38 ]—rather than interventions that take place in settings that can be controlled, research approaches using methods to strengthen causal inference that does not require randomisation become more relevant.

A second challenge for improving the use of case study evidence in evaluative health research is that, as we have seen, what is meant by ‘case study’ varies widely, not only across but also within disciplines. There is indeed little consensus amongst methodologists as to how to define ‘a case study’. Definitions focus, variously, on small sample size or lack of control over the intervention (e.g. [ 39 ] p194), on in-depth study and context [ 40 , 41 ], on the logic of inference used [ 35 ] or on distinct research strategies which incorporate a number of methods to address questions of ‘how’ and ‘why’ [ 42 ]. Moreover, definitions developed for specific disciplines do not capture the range of ways in which case study research is carried out across disciplines. Multiple definitions of case study reflect the richness and diversity of the approach. However, evidence suggests that a lack of consensus across methodologists results in some of the limitations of published reports of empirical case studies [ 43 , 44 ]. Hyett and colleagues [ 43 ], for instance, reviewing reports in qualitative journals, found little match between methodological definitions of case study research and how authors used the term.

This raises the third challenge we identify: case study reports are typically not written in ways that are accessible or useful for the evaluation research community and policymakers. Case studies may not appear in journals widely read by those in the health sciences, either because space constraints preclude the reporting of rich, thick descriptions, or because of the reported lack of willingness of some biomedical journals to publish research that uses qualitative methods [ 45 ], signalling the persistence of the aforementioned evidence hierarchy. Where they do, however, the term ‘case study’ is used to indicate, interchangeably, a qualitative study, an N of 1 sample, or a multi-method, in-depth analysis of one example from a population of phenomena. Definitions of what constitutes the ‘case’ are frequently lacking, and the term often appears to be used as a synonym for the settings in which the research is conducted. Despite offering insights for evaluation, the primary aims may not have been evaluative, so the implications may not be explicitly drawn out. Indeed, some case study reports might properly be aiming for thick description without necessarily seeking to inform about context or causality.

Acknowledging plurality and developing guidance

We recognise that definitional and methodological plurality is not only inevitable, but also a necessary and creative reflection of the very different epistemological and disciplinary origins of health researchers, and the aims they have in doing and reporting case study research. Indeed, to provide some clarity, Thomas [ 46 ] has suggested a typology of subject/purpose/approach/process for classifying aims (e.g. evaluative or exploratory), sample rationale and selection and methods for data generation of case studies. We also recognise that the diversity of methods used in case study research, and the necessary focus on narrative reporting, does not lend itself to straightforward development of formal quality or reporting criteria.

Existing checklists for reporting case study research from the social sciences—for example Lincoln and Guba’s [ 47 ] and Stake’s [ 33 ]—are primarily orientated to the quality of narrative produced, and the extent to which they encapsulate thick description, rather than the more pragmatic issues of implications for intervention effects. Those designed for clinical settings, such as the CARE (CAse REports) guidelines, provide specific reporting guidelines for medical case reports about single, or small groups of patients [ 48 ], not for case study research.

The Design of Case Study Research in Health Care (DESCARTE) model [ 44 ] suggests a series of questions to be asked of a case study researcher (including clarity about the philosophy underpinning their research), study design (with a focus on case definition) and analysis (to improve process). The model resembles toolkits for enhancing the quality and robustness of qualitative and mixed-methods research reporting, and it is usefully open-ended and non-prescriptive. However, even if it does include some reflections on context, the model does not fully address aspects of context, logic and causal inference that are perhaps most relevant for evaluative research in health.

Hence, for evaluative research where the aim is to report empirical findings in ways that are intended to be pragmatically useful for health policy and practice, this may be an opportune time to consider how to best navigate plurality around what is (minimally) important to report when publishing empirical case studies, especially with regards to the complex relationships between context and interventions, information that case study research is well placed to provide.

The conventional scientific quest for certainty, predictability and linear causality (maximised in RCT designs) has to be augmented by the study of uncertainty, unpredictability and emergent causality [ 8 ] in complex systems. This will require methodological pluralism, and openness to broadening the evidence base to better understand both causality in and the transferability of system change intervention [ 14 , 20 , 23 , 25 ]. Case study research evidence is essential, yet is currently under-exploited in the health sciences. If evaluative health research is to move beyond the current impasse on methods for understanding interventions as interruptions in complex systems, we need to consider in more detail how researchers can conduct and report empirical case studies which do aim to elucidate the contextual factors which interact with interventions to produce particular effects. To this end, supported by the UK’s Medical Research Council, we are embracing the challenge to develop guidance for case study researchers studying complex interventions. Following a meta-narrative review of the literature, we are planning a Delphi study to inform guidance that will, at minimum, cover the value of case study research for evaluating the interrelationship between context and complex system-level interventions; for situating and defining ‘the case’, and generalising from case studies; as well as provide specific guidance on conducting, analysing and reporting case study research. Our hope is that such guidance can support researchers evaluating interventions in complex systems to better exploit the diversity and richness of case study research.

Availability of data and materials

Not applicable (article based on existing available academic publications)

Abbreviations

QCA: Qualitative comparative analysis

QED: Quasi-experimental design

RCT: Randomised controlled trial

Diez Roux AV. Complex systems thinking and current impasses in health disparities research. Am J Public Health. 2011;101(9):1627–34.

Ogilvie D, Mitchell R, Mutrie N, Petticrew M, Platt S. Evaluating health effects of transport interventions: methodologic case study. Am J Prev Med. 2006;31:118–26.

Walshe C. The evaluation of complex interventions in palliative care: an exploration of the potential of case study research strategies. Palliat Med. 2011;25(8):774–81.

Woolcock M. Using case studies to explore the external validity of ‘complex’ development interventions. Evaluation. 2013;19:229–48.

Cartwright N. Are RCTs the gold standard? BioSocieties. 2007;2(1):11–20.

Deaton A, Cartwright N. Understanding and misunderstanding randomized controlled trials. Soc Sci Med. 2018;210:2–21.

Salway S, Green J. Towards a critical complex systems approach to public health. Crit Public Health. 2017;27(5):523–4.

Greenhalgh T, Papoutsi C. Studying complexity in health services research: desperately seeking an overdue paradigm shift. BMC Med. 2018;16(1):95.

Bonell C, Warren E, Fletcher A. Realist trials and the testing of context-mechanism-outcome configurations: a response to Van Belle et al. Trials. 2016;17:478.

Pallmann P, Bedding AW, Choodari-Oskooei B. Adaptive designs in clinical trials: why use them, and how to run and report them. BMC Med. 2018;16:29.

Curran G, Bauer M, Mittman B, Pyne J, Stetler C. Effectiveness-implementation hybrid designs: combining elements of clinical effectiveness and implementation research to enhance public health impact. Med Care. 2012;50(3):217–26. https://doi.org/10.1097/MLR.0b013e3182408812 .

Moore GF, Audrey S, Barker M, Bond L, Bonell C, Hardeman W, et al. Process evaluation of complex interventions: Medical Research Council guidance. BMJ. 2015 [cited 2020 Jun 27];350. Available from: https://www.bmj.com/content/350/bmj.h1258 .

Evans RE, Craig P, Hoddinott P, Littlecott H, Moore L, Murphy S, et al. When and how do ‘effective’ interventions need to be adapted and/or re-evaluated in new contexts? The need for guidance. J Epidemiol Community Health. 2019;73(6):481–2.

Shoveller J. A critical examination of representations of context within research on population health interventions. Crit Public Health. 2016;26(5):487–500.

Treweek S, Zwarenstein M. Making trials matter: pragmatic and explanatory trials and the problem of applicability. Trials. 2009;10(1):37.

Rosengarten M, Savransky M. A careful biomedicine? Generalization and abstraction in RCTs. Crit Public Health. 2019;29(2):181–91.

Green J, Roberts H, Petticrew M, Steinbach R, Goodman A, Jones A, et al. Integrating quasi-experimental and inductive designs in evaluation: a case study of the impact of free bus travel on public health. Evaluation. 2015;21(4):391–406.

Canguilhem G. The normal and the pathological. New York: Zone Books; 1991. (1949).

Hawe P, Shiell A, Riley T. Theorising interventions as events in systems. Am J Community Psychol. 2009;43:267–76.

King G, Keohane RO, Verba S. Designing social inquiry: scientific inference in qualitative research: Princeton University Press; 1994.

Greenhalgh T, Robert G, Macfarlane F, Bate P, Kyriakidou O. Diffusion of innovations in service organizations: systematic review and recommendations. Milbank Q. 2004;82(4):581–629.

Yin R. Enhancing the quality of case studies in health services research. Health Serv Res. 1999;34(5 Pt 2):1209.

Raine R, Fitzpatrick R, Barratt H, Bevan G, Black N, Boaden R, et al. Challenges, solutions and future directions in the evaluation of service innovations in health care and public health. Health Serv Deliv Res. 2016 [cited 2020 Jun 30];4(16). Available from: https://www.journalslibrary.nihr.ac.uk/hsdr/hsdr04160#/abstract .

Craig P, Di Ruggiero E, Frohlich KL, E M, White M, Group CCGA. Taking account of context in population health intervention research: guidance for producers, users and funders of research. NIHR Evaluation, Trials and Studies Coordinating Centre; 2018.

Grant RL, Hood R. Complex systems, explanation and policy: implications of the crisis of replication for public health research. Crit Public Health. 2017;27(5):525–32.

Mahoney J. Strategies of causal inference in small-N analysis. Sociol Methods Res. 2000;4:387–424.

Turner S. Major system change: a management and organisational research perspective. In: Raine R, Fitzpatrick R, Barratt H, Bevan G, Black N, Boaden R, et al. Challenges, solutions and future directions in the evaluation of service innovations in health care and public health. Health Serv Deliv Res. 2016;4(16). https://doi.org/10.3310/hsdr04160.

Ragin CC. Using qualitative comparative analysis to study causal complexity. Health Serv Res. 1999;34(5 Pt 2):1225.

Hanckel B, Petticrew M, Thomas J, Green J. Protocol for a systematic review of the use of qualitative comparative analysis for evaluative questions in public health research. Syst Rev. 2019;8(1):252.

Schneider CQ, Wagemann C. Set-theoretic methods for the social sciences: a guide to qualitative comparative analysis: Cambridge University Press; 2012. 369 p.

Flyvbjerg B. Five misunderstandings about case-study research. Qual Inq. 2006;12:219–45.

Tsoukas H. Craving for generality and small-N studies: a Wittgensteinian approach towards the epistemology of the particular in organization and management studies. Sage Handb Organ Res Methods. 2009:285–301.

Stake RE. The art of case study research. London: Sage Publications Ltd; 1995.

Mitchell JC. Typicality and the case study. Ethnographic research: A guide to general conduct. Vol. 238241. 1984.

Gerring J. What is a case study and what is it good for? Am Polit Sci Rev. 2004;98(2):341–54.

May C, Mort M, Williams T, F M, Gask L. Health technology assessment in its local contexts: studies of telehealthcare. Soc Sci Med 2003;57:697–710.

McGill E. Trading quality for relevance: non-health decision-makers’ use of evidence on the social determinants of health. BMJ Open. 2015;5(4):007053.

Greenhalgh T. We can’t be 100% sure face masks work – but that shouldn’t stop us wearing them | Trish Greenhalgh. The Guardian. 2020 [cited 2020 Jun 27]; Available from: https://www.theguardian.com/commentisfree/2020/jun/05/face-masks-coronavirus .

Hammersley M. So, what are case studies? In: What’s wrong with ethnography? New York: Routledge; 1992.

Crowe S, Cresswell K, Robertson A, Huby G, Avery A, Sheikh A. The case study approach. BMC Med Res Methodol. 2011;11(1):100.

Luck L, Jackson D, Usher K. Case study: a bridge across the paradigms. Nurs Inq. 2006;13(2):103–9.

Yin RK. Case study research and applications: design and methods: Sage; 2017.

Hyett N, A K, Dickson-Swift V. Methodology or method? A critical review of qualitative case study reports. Int J Qual Stud Health Well-Being. 2014;9:23606.

Carolan CM, Forbat L, Smith A. Developing the DESCARTE model: the design of case study research in health care. Qual Health Res. 2016;26(5):626–39.

Greenhalgh T, Annandale E, Ashcroft R, Barlow J, Black N, Bleakley A, et al. An open letter to the BMJ editors on qualitative research. BMJ. 2016;352.

Thomas G. A typology for the case study in social science following a review of definition, discourse, and structure. Qual Inq. 2011;17(6):511–21.

Lincoln YS, Guba EG. Judging the quality of case study reports. Int J Qual Stud Educ. 1990;3(1):53–9.

Riley DS, Barber MS, Kienle GS, Aronson JK, Schoen-Angerer T, Tugwell P, et al. CARE guidelines for case reports: explanation and elaboration document. J Clin Epidemiol. 2017;89:218–35.

Acknowledgements

Not applicable

This work was funded by the Medical Research Council - MRC Award MR/S014632/1 HCS: Case study, Context and Complex interventions (TRIPLE C). SP was additionally funded by the University of Oxford's Higher Education Innovation Fund (HEIF).

Author information

Authors and affiliations.

Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK

Sara Paparini, Chrysanthi Papoutsi, Trish Greenhalgh & Sara Shaw

Wellcome Centre for Cultures & Environments of Health, University of Exeter, Exeter, UK

Judith Green

School of Health Sciences, University of East Anglia, Norwich, UK

Jamie Murdoch

Public Health, Environments and Society, London School of Hygiene & Tropical Medicine, London, UK

Mark Petticrew

Institute for Culture and Society, Western Sydney University, Penrith, Australia

Benjamin Hanckel

Contributions

JG, MP, SP, JM, TG, CP and SS drafted the initial paper; all authors contributed to the drafting of the final version, and read and approved the final manuscript.

Corresponding author

Correspondence to Sara Paparini.

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Paparini, S., Green, J., Papoutsi, C. et al. Case study research for better evaluations of complex interventions: rationale and challenges. BMC Med 18, 301 (2020). https://doi.org/10.1186/s12916-020-01777-6

Received: 03 July 2020

Accepted: 07 September 2020

Published: 10 November 2020

DOI: https://doi.org/10.1186/s12916-020-01777-6

Qualitative
Case studies
Mixed-method
Public health
Health services research
Interventions

BMC Medicine

ISSN: 1741-7015

Qualitative Research: Case study evaluation

Justin Keen, research fellow, health economics research group (a)
Tim Packwood (a)
(a) Brunel University, Uxbridge, Middlesex UB8 3PH
Correspondence to: Dr Keen.

Case study evaluations, using one or more qualitative methods, have been used to investigate important practical and policy questions in health care. This paper describes the features of a well designed case study and gives examples showing how qualitative methods are used in evaluations of health services and health policy.

This is the last in a series of seven articles describing non-quantitative techniques and showing their value in health research

Introduction

The medical approach to understanding disease has traditionally drawn heavily on qualitative data, and in particular on case studies to illustrate important or interesting phenomena. The tradition continues today, not least in regular case reports in this and other medical journals. Moreover, much of the everyday work of doctors and other health professionals still involves decisions that are qualitative rather than quantitative in nature.

This paper discusses the use of qualitative research methods, not in clinical care but in case study evaluations of health service interventions. It is useful for doctors to understand the principles guiding the design and conduct of these evaluations, because they are frequently used by both researchers and inspectorial agencies (such as the Audit Commission in the United Kingdom and the Office of Technology Assessment in the United States) to investigate the work of doctors and other health professionals.

We briefly discuss the circumstances in which case study research can usefully be undertaken in health service settings and the ways in which qualitative methods are used within case studies. Examples show how qualitative methods are applied, both in purely qualitative studies and alongside quantitative methods.

Case study evaluations

Doctors often find themselves asking important practical questions, such as should we be involved in the management of hospitals and, if so, how? how will new government policies affect the lives of our patients? and how can we cope with changes …

Case Study – Methods, Examples and Guide

A case study is a research method that involves an in-depth examination and analysis of a particular phenomenon or case, such as an individual, organization, community, event, or situation.

It is a qualitative research approach that aims to provide a detailed and comprehensive understanding of the case being studied. Case studies typically involve multiple sources of data, including interviews, observations, documents, and artifacts, which are analyzed using various techniques, such as content analysis, thematic analysis, and grounded theory. The findings of a case study are often used to develop theories, inform policy or practice, or generate new research questions.

Types of Case Study

The main types of case study are as follows:

Single-Case Study

A single-case study is an in-depth analysis of a single case. This type of case study is useful when the researcher wants to understand a specific phenomenon in detail.

For example, a researcher might conduct a single-case study of a particular individual to understand their experiences with a health condition, or of a specific organization to explore its management practices. The researcher collects data from multiple sources, such as interviews, observations, and documents, and analyzes the data using techniques such as content analysis or thematic analysis. The findings of a single-case study are often used to generate new research questions, develop theories, or inform policy or practice.

Multiple-Case Study

A multiple-case study involves the analysis of several cases that are similar in nature. This type of case study is useful when the researcher wants to identify similarities and differences between the cases.

For example, a researcher might conduct a multiple-case study of several companies to explore the factors that contribute to their success or failure. The researcher collects data from each case, compares and contrasts the findings, and analyzes the data using techniques such as comparative analysis or pattern-matching. The findings of a multiple-case study can be used to develop theories, inform policy or practice, or generate new research questions.

Exploratory Case Study

An exploratory case study is used to explore a new or understudied phenomenon. This type of case study is useful when the researcher wants to generate hypotheses or theories about the phenomenon.

For example, a researcher might conduct an exploratory case study of a new technology to understand its potential impact on society. The researcher collects data from multiple sources, such as interviews, observations, and documents, and analyzes the data using techniques such as grounded theory or content analysis. The findings of an exploratory case study can be used to generate new research questions, develop theories, or inform policy or practice.

Descriptive Case Study

A descriptive case study is used to describe a particular phenomenon in detail. This type of case study is useful when the researcher wants to provide a comprehensive account of the phenomenon.

For example, a researcher might conduct a descriptive case study of a particular community to understand its social and economic characteristics. The researcher collects data from multiple sources, such as interviews, observations, and documents, and analyzes the data using techniques such as content analysis or thematic analysis. The findings of a descriptive case study can be used to inform policy or practice or generate new research questions.

Instrumental Case Study

An instrumental case study examines a particular case mainly to gain insight into a broader issue or phenomenon; the case itself is of secondary interest. This type of case study is useful when the researcher wants to use the case as a means of understanding something beyond it.

For example, a researcher might conduct an instrumental case study of a particular policy to understand how such policies contribute to a broader goal, such as reducing poverty. The researcher collects data from multiple sources, such as interviews, observations, and documents, and analyzes the data using techniques such as content analysis or thematic analysis. The findings of an instrumental case study can be used to inform policy or practice or generate new research questions.

Case Study Data Collection Methods

Here are some common data collection methods for case studies:

Interviews

Interviews involve asking questions to individuals who have knowledge or experience relevant to the case study. Interviews can be structured (where the same questions are asked to all participants) or unstructured (where the interviewer follows up on the responses with further questions). Interviews can be conducted in person, over the phone, or through video conferencing.

Observations

Observations involve watching and recording the behavior and activities of individuals or groups relevant to the case study. Observations can be participant (where the researcher actively participates in the activities) or non-participant (where the researcher observes from a distance). Observations can be recorded using notes, audio or video recordings, or photographs.

Documents

Documents can be used as a source of information for case studies. Documents can include reports, memos, emails, letters, and other written materials related to the case study. Documents can be collected from the case study participants or from public sources.

Surveys

Surveys involve asking a set of questions to a sample of individuals relevant to the case study. Surveys can be administered in person, over the phone, through mail or email, or online. Surveys can be used to gather information on attitudes, opinions, or behaviors related to the case study.

Artifacts

Artifacts are physical objects relevant to the case study. Artifacts can include tools, equipment, products, or other objects that provide insights into the case study phenomenon.

How to conduct Case Study Research

Conducting case study research involves several steps that need to be followed to ensure the quality and rigor of the study. Here are the steps:

Define the research questions: The first step in conducting case study research is to define the research questions. The research questions should be specific, measurable, and relevant to the case study phenomenon under investigation.
Select the case: The next step is to select the case or cases to be studied. The case should be relevant to the research questions and should provide rich and diverse data that can be used to answer the research questions.
Collect data: Data can be collected using various methods, such as interviews, observations, documents, surveys, and artifacts. The data collection method should be selected based on the research questions and the nature of the case study phenomenon.
Analyze the data: The data collected from the case study should be analyzed using various techniques, such as content analysis, thematic analysis, or grounded theory. The analysis should be guided by the research questions and should aim to provide insights and conclusions relevant to the research questions.
Draw conclusions: The conclusions drawn from the case study should be based on the data analysis and should be relevant to the research questions. The conclusions should be supported by evidence and should be clearly stated.
Validate the findings: The findings of the case study should be validated by reviewing the data and the analysis with participants or other experts in the field. This helps to ensure the validity and reliability of the findings.
Write the report: The final step is to write the report of the case study research. The report should provide a clear description of the case study phenomenon, the research questions, the data collection methods, the data analysis, the findings, and the conclusions. The report should be written in a clear and concise manner and should follow the guidelines for academic writing.

Examples of Case Study

Here are some examples of case study research:

The Hawthorne Studies: Conducted between 1924 and 1932, the Hawthorne Studies were a series of studies carried out by Elton Mayo and his colleagues to examine the impact of the work environment on employee productivity. The studies took place at the Hawthorne Works plant of the Western Electric Company in Cicero, Illinois, near Chicago, and included interviews, observations, and experiments.
The Stanford Prison Experiment: Conducted in 1971 by Philip Zimbardo, the Stanford Prison Experiment is frequently used as a case study of the psychological effects of power and authority. The study involved simulating a prison environment and assigning participants to the role of guards or prisoners. The study was controversial due to the ethical issues it raised.
The Challenger Disaster: The Space Shuttle Challenger explosion in 1986 has been widely studied as a case to examine its causes. Such studies draw on interviews, observations, and analysis of data to identify the technical, organizational, and cultural factors that contributed to the disaster.
The Enron Scandal: The Enron Corporation's bankruptcy in 2001 has been widely studied as a case to examine its causes. Such studies draw on interviews, analysis of financial data, and review of documents to identify the accounting practices, corporate culture, and ethical issues that led to the company's downfall.
The Fukushima Nuclear Disaster: The nuclear accident at the Fukushima Daiichi Nuclear Power Plant in Japan in 2011 has been widely studied as a case to examine its causes. Such studies draw on interviews, analysis of data, and review of documents to identify the technical, organizational, and cultural factors that contributed to the disaster.

Application of Case Study

Case studies have a wide range of applications across various fields and industries. Here are some examples:

Business and Management

Case studies are widely used in business and management to examine real-life situations and develop problem-solving skills. Case studies can help students and professionals to develop a deep understanding of business concepts, theories, and best practices.

Healthcare

Case studies are used in healthcare to examine patient care, treatment options, and outcomes. Case studies can help healthcare professionals to develop critical thinking skills, diagnose complex medical conditions, and develop effective treatment plans.

Education

Case studies are used in education to examine teaching and learning practices. Case studies can help educators to develop effective teaching strategies, evaluate student progress, and identify areas for improvement.

Social Sciences

Case studies are widely used in social sciences to examine human behavior, social phenomena, and cultural practices. Case studies can help researchers to develop theories, test hypotheses, and gain insights into complex social issues.

Law and Ethics

Case studies are used in law and ethics to examine legal and ethical dilemmas. Case studies can help lawyers, policymakers, and ethical professionals to develop critical thinking skills, analyze complex cases, and make informed decisions.

Purpose of Case Study

The purpose of a case study is to provide a detailed analysis of a specific phenomenon, issue, or problem in its real-life context. A case study is a qualitative research method that involves the in-depth exploration and analysis of a particular case, which can be an individual, group, organization, event, or community.

The primary purpose of a case study is to generate a comprehensive and nuanced understanding of the case, including its history, context, and dynamics. Case studies can help researchers to identify and examine the underlying factors, processes, and mechanisms that contribute to the case and its outcomes. This can help to develop a more accurate and detailed understanding of the case, which can inform future research, practice, or policy.

Case studies can also serve other purposes, including:

Illustrating a theory or concept: Case studies can be used to illustrate and explain theoretical concepts and frameworks, providing concrete examples of how they can be applied in real-life situations.
Developing hypotheses: Case studies can help to generate hypotheses about the causal relationships between different factors and outcomes, which can be tested through further research.
Providing insight into complex issues: Case studies can provide insights into complex and multifaceted issues, which may be difficult to understand through other research methods.
Informing practice or policy: Case studies can be used to inform practice or policy by identifying best practices, lessons learned, or areas for improvement.

Advantages of Case Study Research

There are several advantages of case study research, including:

In-depth exploration: Case study research allows for a detailed exploration and analysis of a specific phenomenon, issue, or problem in its real-life context. This can provide a comprehensive understanding of the case and its dynamics, which may not be possible through other research methods.
Rich data: Case study research can generate rich and detailed data, including qualitative data such as interviews, observations, and documents. This can provide a nuanced understanding of the case and its complexity.
Holistic perspective: Case study research allows for a holistic perspective of the case, taking into account the various factors, processes, and mechanisms that contribute to the case and its outcomes. This can help to develop a more accurate and comprehensive understanding of the case.
Theory development: Case study research can help to develop and refine theories and concepts by providing empirical evidence and concrete examples of how they can be applied in real-life situations.
Practical application: Case study research can inform practice or policy by identifying best practices, lessons learned, or areas for improvement.
Contextualization: Case study research takes into account the specific context in which the case is situated, which can help to understand how the case is influenced by the social, cultural, and historical factors of its environment.

Limitations of Case Study Research

There are several limitations of case study research, including:

Limited generalizability: Case studies are typically focused on a single case or a small number of cases, which limits the generalizability of the findings. The unique characteristics of the case may not be applicable to other contexts or populations, which may limit the external validity of the research.
Biased sampling: Case studies may rely on purposive or convenience sampling, which can introduce bias into the sample selection process. This may limit the representativeness of the sample and the generalizability of the findings.
Subjectivity: Case studies rely on the interpretation of the researcher, which can introduce subjectivity into the analysis. The researcher’s own biases, assumptions, and perspectives may influence the findings, which may limit the objectivity of the research.
Limited control: Case studies are typically conducted in naturalistic settings, which limits the control that the researcher has over the environment and the variables being studied. This may limit the ability to establish causal relationships between variables.
Time-consuming: Case studies can be time-consuming to conduct, as they typically involve a detailed exploration and analysis of a specific case. This may limit the feasibility of conducting multiple case studies or conducting case studies in a timely manner.
Resource-intensive: Case studies may require significant resources, including time, funding, and expertise. This may limit the ability of researchers to conduct case studies in resource-constrained settings.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

Roberta Heale (1), Alison Twycross (2)
1 School of Nursing, Laurentian University, Sudbury, Ontario, Canada
2 School of Health and Social Care, London South Bank University, London, UK
Correspondence to Dr Roberta Heale, School of Nursing, Laurentian University, Sudbury, ON P3E2C6, Canada; rheale{at}laurentian.ca

https://doi.org/10.1136/eb-2017-102845

What is it?

Case study is a research methodology, typically seen in social and life sciences. There is no one definition of case study research. 1 However, very simply… ‘a case study can be defined as an intensive study about a person, a group of people or a unit, which is aimed to generalize over several units’. 1 A case study has also been described as an intensive, systematic investigation of a single individual, group, community or some other unit in which the researcher examines in-depth data relating to several variables. 2

Often there are several similar cases to consider such as educational or social service programmes that are delivered from a number of locations. Although similar, they are complex and have unique features. In these circumstances, the evaluation of several, similar cases will provide a better answer to a research question than if only one case is examined, hence the multiple-case study. Stake asserts that the cases are grouped and viewed as one entity, called the quintain . 6 ‘We study what is similar and different about the cases to understand the quintain better’. 6

The steps when using case study methodology are the same as for other types of research. 6 The first step is defining the single case or identifying a group of similar cases that can then be incorporated into a multiple-case study. A search to determine what is known about the case(s) is typically conducted. This may include a review of the literature, grey literature, media, reports and more, which serves to establish a basic understanding of the cases and informs the development of research questions. Data in case studies are often, but not exclusively, qualitative in nature. In multiple-case studies, analysis within cases and across cases is conducted. Themes arise from the analyses and assertions about the cases as a whole, or the quintain, emerge. 6
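The within-case and cross-case analysis described above can be illustrated with a minimal Python sketch. It assumes that qualitative coding has already produced (case, theme) pairs; the case names, theme labels and both helper functions are hypothetical and are not taken from the cited studies. The sketch only tabulates coded themes and is no substitute for interpretive analysis.

```python
from collections import Counter

# Hypothetical coded excerpts: (case_id, theme) pairs produced by qualitative coding.
coded_excerpts = [
    ("case_A", "staffing"),
    ("case_A", "funding"),
    ("case_A", "staffing"),
    ("case_B", "staffing"),
    ("case_B", "patient_access"),
    ("case_C", "staffing"),
    ("case_C", "funding"),
]


def within_case_counts(excerpts):
    """Within-case step: count how often each theme was coded inside each case."""
    counts = {}
    for case_id, theme in excerpts:
        counts.setdefault(case_id, Counter())[theme] += 1
    return counts


def cross_case_themes(counts):
    """Cross-case step: map each theme to the sorted list of cases where it appears."""
    themes = {}
    for case_id, theme_counts in counts.items():
        for theme in theme_counts:
            themes.setdefault(theme, []).append(case_id)
    return {theme: sorted(cases) for theme, cases in themes.items()}


per_case = within_case_counts(coded_excerpts)
across = cross_case_themes(per_case)

# Themes present in every case are candidates for assertions about the cases as a whole.
shared = sorted(t for t, cases in across.items() if len(cases) == len(per_case))

print(per_case)
print(across)
print(shared)
```

Keeping the per-case counts separate from the cross-case mapping mirrors the two levels of analysis: each case is first understood on its own terms, and only then are similarities and differences compared across cases.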

Benefits and limitations of case studies

If a researcher wants to study a specific phenomenon arising from a particular entity, then a single-case study is warranted and will allow for an in-depth understanding of the single phenomenon and, as discussed above, would involve collecting several different types of data. This is illustrated in example 1 below.

Using a multiple-case research study allows for a more in-depth understanding of the cases as a unit, through comparison of similarities and differences of the individual cases embedded within the quintain. Evidence arising from multiple-case studies is often stronger and more reliable than from single-case research. Multiple-case studies allow for more comprehensive exploration of research questions and theory development. 6

Despite the advantages of case studies, there are limitations. The sheer volume of data is difficult to organise and data analysis and integration strategies need to be carefully thought through. There is also sometimes a temptation to veer away from the research focus. 2 Reporting of findings from multiple-case research studies is also challenging at times, 1 particularly in relation to the word limits for some journal papers.

Examples of case studies

Example 1: nurses’ paediatric pain management practices.

One of the authors of this paper (AT) has used a case study approach to explore nurses’ paediatric pain management practices. This involved collecting several datasets:

Observational data to gain a picture about actual pain management practices.

Questionnaire data about nurses’ knowledge about paediatric pain management practices and how well they felt they managed pain in children.

Questionnaire data about how critical nurses perceived pain management tasks to be.

These datasets were analysed separately and then compared 7–9 and demonstrated that nurses’ level of theoretical knowledge did not impact on the quality of their pain management practices. 7 Nor did individual nurses’ perceptions of how critical a task was affect the likelihood of them carrying out this task in practice. 8 There was also a difference in self-reported and observed practices 9; actual (observed) practices did not conform to best practice guidelines, whereas self-reported practices tended to.

Example 2: quality of care for complex patients at Nurse Practitioner-Led Clinics (NPLCs)

The other author of this paper (RH) has conducted a multiple-case study to determine the quality of care for patients with complex clinical presentations in NPLCs in Ontario, Canada. 10 Five NPLCs served as individual cases that, together, represented the quintain. Three types of data were collected including:

Review of documentation related to the NPLC model (media, annual reports, research articles, grey literature and regulatory legislation).

Interviews with nurse practitioners (NPs) practising at the five NPLCs to determine their perceptions of the impact of the NPLC model on the quality of care provided to patients with multimorbidity.

Chart audits conducted at the five NPLCs to determine the extent to which evidence-based guidelines were followed for patients with diabetes and at least one other chronic condition.

The three sources of data collected from the five NPLCs were analysed and themes arose related to the quality of care for complex patients at NPLCs. The multiple-case study confirmed that nurse practitioners are the primary care providers at the NPLCs, and this positively impacts the quality of care for patients with multimorbidity. Healthcare policy, such as lack of an increase in salary for NPs for 10 years, has resulted in issues in recruitment and retention of NPs at NPLCs. This, along with insufficient resources in the communities where NPLCs are located and high patient vulnerability at NPLCs, has a negative impact on the quality of care. 10

These examples illustrate how collecting data about a single case or multiple cases helps us to better understand the phenomenon in question. Case study methodology serves to provide a framework for evaluation and analysis of complex issues. It shines a light on the holistic nature of nursing practice and offers a perspective that informs improved patient care.

Gustafsson J
Calanzaro M
Sandelowski M

Competing interests None declared.

Provenance and peer review Commissioned; internally peer reviewed.


Case study evaluation

Affiliation.

1 Brunel University, Uxbridge, Middlesex.
PMID: 7640596
PMCID: PMC2550500
DOI: 10.1136/bmj.311.7002.444


Published on 14.8.2024 in Vol 8 (2024)

A Chatbot (Juno) Prototype to Deploy a Behavioral Activation Intervention to Pregnant Women: Qualitative Evaluation Using a Multiple Case Study


Original Paper

Elisa Mancinelli 1, 2 , BSc, MSc ;
Simone Magnolini 3 , PhD ;
Silvia Gabrielli 2 , PhD ;
Silvia Salcuni 1 , PhD

1 Department of Developmental and Socialization Psychology, University of Padova, Padova, Italy

2 Digital Health Lab, Centre for Digital Health and Wellbeing, Fondazione Bruno Kessler, Povo, Trento, Italy

3 Intelligent Digital Agents, Centre for Digital Health and Wellbeing, Fondazione Bruno Kessler, Povo, Trento, Italy

Corresponding Author:

Elisa Mancinelli, BSc, MSc

Department of Developmental and Socialization Psychology, University of Padova

Via Venezia 8

Padova, 35131

Phone: 39 3342799698

Email: [email protected]

Background: Despite the increasing focus on perinatal care, preventive digital interventions are still scarce. Furthermore, the literature suggests that the design and development of these interventions are mainly conducted through a top-down approach that takes only limited account of direct end user perspectives.

Objective: Building from a previous co-design study, this study aimed to qualitatively evaluate pregnant women’s experiences with a chatbot (Juno) prototype designed to deploy a preventive behavioral activation intervention.

Methods: Using a multiple–case study design, the research aims to uncover similarities and differences in participants’ perceptions of the chatbot while also exploring women’s desires for improvement and technological advancements in chatbot-based interventions in perinatal mental health. Five pregnant women interacted weekly with the chatbot, operationalized in Telegram, following a 6-week intervention. Self-report questionnaires were administered at baseline and postintervention time points. About 10-14 days after concluding interactions with Juno, women participated in a semistructured interview focused on (1) their personal experience with Juno, (2) user experience and user engagement, and (3) their opinions on future technological advancements. Interview transcripts, comprising 15 questions, were qualitatively evaluated and compared. Finally, a text-mining analysis of transcripts was performed.

Results: Similarities and differences emerged in women’s experiences with Juno: they appreciated its esthetics but highlighted technical issues and wanted clearer guidance. They found the content useful and pertinent to pregnancy but differed on when they deemed it most helpful. Women expressed interest in receiving increasingly personalized responses and in future integration with existing health care systems for better support. Accordingly, they generally viewed Juno as an effective momentary support but emphasized the need for human interaction in mental health care, particularly if increasingly personalized. Further concerns included overreliance on chatbots when seeking psychological support and the importance of clearly educating users on the chatbot’s limitations.

Conclusions: Overall, the results highlighted both the positive aspects and the shortcomings of the chatbot-based intervention, providing insight into its refinement and future developments. However, women stressed the need to balance technological support with human interactions, particularly when the intervention extends beyond a preventive mental health context, to favor greater and more reliable monitoring.

Introduction

User-Centered Design of Digital Mental Health Interventions

eHealth is a burgeoning field that integrates medical informatics, public health, and business. It encompasses delivering health services and information through the internet and digital technologies. In this domain, e-mental health specifically focuses on leveraging technologies, such as smartphone apps, websites, chatbots, and virtual reality, to enhance and support mental health care [ 1 - 3 ]. e-Mental health holds many advantages, including the increased scalability of mental services, in terms of screening, prevention, and treatment, leading to reduced costs for the broader health care system [ 4 - 6 ]. However, while the potential benefits of digital technology can be considerable, their actual implementation and use, especially within the field of e-mental health, often fall short. The journey from preuse considerations to initial adoption and, crucially, sustained use poses challenges that need careful navigation and understanding. In this regard, a recent review [ 7 ] exploring design methods and approaches for digital tools in mental health emphasized that human-centered design methods, thus those focusing on user experience (UX) rather than just engineering design, are not fully integrated into the field. The reported design approaches are predominantly external, lacking the perspective of the end users for whom the tool is intended. Indeed, when developing digital solutions, it is essential to consider 4 key components: the design issue and solution, the context in which the design occurs, the dynamics and organization of the design activity, and the actors contributing to the design [ 8 - 10 ]. Within the context of e-mental health intervention, the above altogether emphasizes the significance of co-design, a collaborative process strongly involving targeted end users to contribute to all stages of e-mental health intervention development. 
This inclusive approach encompasses needs assessment, content development, pilot-testing, and finally, dissemination [ 11 ]. The Obesity-Related Behavioral Intervention Trials (ORBIT) model [ 12 ] is instrumental to this end. The ORBIT model, which uses a user-centered design, provides a methodological framework encompassing a pliable and iterative progressive procedure, predefined clinically significant milestones for advancement, and the option to revert to a prior phase of refinement in case of suboptimal outcomes. Its primary emphasis is on pre-efficacy development and testing, yet not failing to incorporate subsequent research phases to illustrate that treatment optimization is viable even for interventions that have attained the efficacy or effectiveness stage [ 12 ].

e-Mental Health in Perinatal Care: A Focus on Prevention Interventions

The World Health Organization (WHO) [ 13 ] has consistently emphasized the significance of identifying and preventing risks, with the WHO and the United Nations Population Fund acknowledging maternal mental health as a pivotal factor in accomplishing the Millennium Development Goals [ 14 ]. The transition to motherhood involves various intrapersonal and interpersonal changes and challenges that can have negative effects on women’s mental health, increasing the risk of developing peripartum depression [ 15 - 17 ]. However, despite the negative repercussions this has for the women, the child, and the whole family [ 18 ], as well as the broader society [ 19 - 21 ], it often goes untreated. There are various reasons for this. On the one hand, few women proactively seek professional assistance for their mental health problems, mainly due to factors such as lack of mental health literacy; stigma; and practical barriers like childcare, professional, and financial constraints [ 22 ]. On the other hand, women face limited access to specialized perinatal mental health services, which is attributed to the capacity constraints of existing services and long waiting times for those in need of support [ 23 , 24 ]. Therefore, many women never receive any support or treatment. Indeed, this situation has sparked interest in the potential of e-mental health. It can circumvent some of the aforementioned barriers, ultimately facilitating a more widespread help-seeking process; this has led to the creation and dissemination of scalable and more far-reaching tools to support the well-being and mental health of perinatal women [ 25 , 26 ]. In this context, the stepped-care model is noteworthy, as its intentions are focused on promoting the dissemination of mental health programs by facilitating coordination between primary and secondary mental health services [ 27 ], and this coordination can be facilitated through e-mental health. 
This would ultimately align with the evidence that engaging in help-seeking behaviors increases the likelihood of perinatal women seeking further assistance for their depression symptoms [ 28 ]. In this regard, structured, evidence-based interventions such as behavioral activation (BA) might be particularly suitable. BA is a behavioral intervention designed to alleviate symptoms of depression [ 29 - 32 ] by offering individuals practical strategies to improve their adjustment and well-being and supporting participation in enjoyable and positive activities while reducing engagement in behaviors that worsen depressive symptoms [ 29 , 33 ]. As such, these interventions hold great potential as initial broad-case preventive work. However, when specifically focusing on peripartum depression, there appears to be a deficiency in digital prevention and treatment programs at large [ 34 ], and of BA interventions as well [ 35 ], in addressing depression symptoms during pregnancy compared with the postpartum period, thus underscoring the necessity to boost the development and evaluation of primary mental health services.

This Study: Within the Iterative Design Phase

This study arises from the results obtained by a previous exploratory co-design study [ 36 ] investigating the feasibility of an internet-based BA intervention for pregnant women showing subclinical symptoms of depression. As such, it constitutes the second phase of investigation within the “design phase” foreseen by the above-reported ORBIT model [ 12 ]. This prior exploratory study not only aimed to assess the initial feasibility of the intervention but also sought to gather valuable feedback directly from pregnant women. This then guided the adjustment of the intervention’s content and structure while promoting the use of a different digital solution. More specifically, the study aimed to compare a guided and unguided version of the digital intervention, with the guided group involving psychologists who engaged in weekly text message conversations with women to support them in the intervention content revision. In this respect, data suggested that the guided group showed greater adherence and were more willing overall to finish the intervention than the unguided group. Building on this and in line with the existing literature [ 37 , 38 ] highlighting the potential benefit of including chatbots within psychological interventions by fostering intervention adherence through increased engagement and involvement, a new structuring of the BA intervention as a chatbot-delivered intervention was prototyped. Chatbots are artificial intelligence–enabled engagement technologies, falling under the category of technologies that enable interaction with patients through natural language processing by engaging in limited text conversations intending to support subsequent behavior-change tasks [ 39 ]. It is crucial to emphasize that in this context, chatbots are conceptualized as tools suitable for educational purposes, facilitating the acquisition of specific evidence-based techniques or skills [ 40 ] resulting in suitability for application in preventive contexts.

Mindful of the above, this study aims to qualitatively evaluate, through a multiple case study, pregnant women’s experience and perception of a chatbot prototype to deploy a BA preventive support tool and intervention. In this regard, incorporating a dedicated prototype evaluation during co-design can streamline the process of conducting rigorous evaluations in real-world settings during the subsequent evaluative phases, which may involve activities such as pilot-testing and subsequent randomized controlled trials [ 41 ]. Furthermore, women’s desire for improvement and technological advancements of chatbot-based technology in the field of perinatal mental health was also investigated. As such, this study bounds the design and evaluation of the chatbot and prevention intervention it deploys within the ORBIT methodological framework [ 12 ], in favor of a thorough and meticulous evaluation of the intervention design phase regarding both definition and refinement. In line with this, a multiple–case study design is used as it permits the conduct of a comparative analysis of cases, aiming to identify both similarities and differences among them and, thus, in the perception of the chatbot and the content it deploys. In addition, this approach seeks to unveil patterns and themes that arise from the cross-case analysis. By evaluating the phenomenon of interest across different contexts, a multicase study might enhance the validity of findings by investigating in depth how the phenomenon may vary or remain consistent under various circumstances [ 42 ].

Ethical Considerations

Ethical considerations adhered to the guidelines outlined in the Declaration of Helsinki [ 43 ] and European data protection laws (EU GDPR 679/2016). Approval for the study was obtained from the Ethical Committee of the Psychology Department at the University of Padova (approval 5434/2023). Participants provided their informed consent to participation and data publication for scientific reasons.

Participants and Enrolment Procedure

Women aged >18 years and between the 12th and 30th week of gestation could take part in this study. Exclusion criteria were the following: clinically significant depression symptoms (Patient Health Questionnaire-9 [PHQ-9] [ 44 ] score≥15), suicidal ideation (PHQ-9 item 9), present or past history of psychiatric disorders, and experiencing an artificially induced pregnancy. To allow participation, a Google Form link containing the baseline questionnaires was shared through social media platforms (ie, Facebook and Instagram) in pregnancy-related national groups and pages. After the inclusion and exclusion criteria evaluation, women were provided with the information needed to start the interaction with the chatbot in Telegram and sent a copy of the informed consent they had agreed to, which was included in the web-based questionnaire. To uphold confidentiality, each participant was assigned a unique alphanumeric code. Women were granted the autonomy to withdraw from participation at any point without the obligation to provide reasons and without facing any adverse consequence. Furthermore, they were clearly informed that the software (ie, Telegram and the chatbot) did not constitute a medical device, as its use does not extend to the diagnosis, prevention, monitoring, prediction, prognosis, treatment, or alleviation of diseases. It was, instead, clarified that the developed support intervention and related software were exclusively intended for research purposes and used solely for the collection, storage, and transmission of data and the administration of questionnaires.
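The inclusion and exclusion criteria above can be expressed as a short screening check. The following Python sketch is illustrative only and is not the procedure the authors used: the `Screening` record and the `is_eligible` function are hypothetical names, the gestational week bounds are assumed to be inclusive, and any nonzero answer on PHQ-9 item 9 is assumed to count as suicidal ideation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Screening:
    """Hypothetical baseline record for one respondent."""
    age: int
    gestational_week: int
    phq9_items: List[int]        # nine PHQ-9 item scores, each 0-3
    psychiatric_history: bool    # present or past psychiatric disorder
    induced_pregnancy: bool      # artificially induced pregnancy


def is_eligible(s: Screening) -> bool:
    """Apply the inclusion and exclusion criteria stated in the text."""
    if len(s.phq9_items) != 9 or any(not 0 <= x <= 3 for x in s.phq9_items):
        raise ValueError("PHQ-9 needs nine item scores between 0 and 3")
    # Inclusion: aged >18 years, 12th to 30th week (bounds assumed inclusive).
    included = s.age > 18 and 12 <= s.gestational_week <= 30
    # Exclusion: PHQ-9 total >= 15, ideation on item 9 (assumed: any score > 0),
    # psychiatric history, or artificially induced pregnancy.
    excluded = (
        sum(s.phq9_items) >= 15
        or s.phq9_items[8] > 0
        or s.psychiatric_history
        or s.induced_pregnancy
    )
    return included and not excluded
```

For example, a 30-year-old respondent at week 20 with a PHQ-9 total of 8 and no ideation would pass this check, whereas the same respondent with a total of 16 would not.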

A total of 12 women completed the baseline questionnaire. Among them, 2 dropped out after the first interaction (week 1), 2 after completing the interaction in week 2, and 2 following the third interaction (week 3). One participant withdrew after completing the interaction in week 4. Among those who dropped out in the early weeks, 5 reported medical conditions: Crohn disease, risk of miscarriage associated with a shortened cervix and hypertonic pelvic floor, gestational diabetes, hypothyroidism, and fibroma. Ultimately, 5 participants were included in the multiple–case study evaluation, with none reporting any medical conditions. Of them, 4 (80%) participants reached the postintervention questionnaire evaluation, while 1 (20%) had to interrupt the interaction after week 4; however, she agreed to participate in the final semistructured interview. Given that this study aimed to qualitatively evaluate the perception and experience with a chatbot prototype, the decision was made to include this participant despite not finishing the study since she nonetheless was able to engage with the chatbot for more than half of the anticipated interactions.

The Intervention Content and Structuring

This study aligns with the iterative process outlined in the ORBIT model [ 12 ] for intervention design and evaluation. Specifically, it falls within the refined subphase of the initial design phase, in which practical aspects such as mode and agent of delivery, as well as the frequency and duration of contact, are evaluated to identify the most efficient ways to achieve clinical targets. Parallel to this, and in reference to the Digital Product Lifecycle, we wish to emphasize that this project is at the beginning stages of the product life cycle, thus moving back and forth between the “definition phase” (in which the product or intervention concepts and related digital requirements are defined) and the “design phase” (which involves prototyping and pilot-testing the product) [ 45 , 46 ].

Accordingly, this study focuses on evaluating a revised version of an intervention based on an evidence-based BA intervention protocol (behavioral activation treatment for depression-revised) [ 47 ]. This revised intervention represents a second evaluation that builds upon exploratory testing conducted in a preceding co-design study [ 36 ]; as such, thorough information on the intervention content and rationale can be found in this prior study paper. Building on the results of that prior study, the intervention in this study was organized into 6 weekly sessions ( Figure 1 A), omitting the 3 additional ones previously included. The intervention content was streamlined by eliminating separate in-between–session homework. Instead, the essential components of the homework were incorporated within the main sessions or interactions as on-the-moment exercises strategically designed to promote the original intent of the homework. In this context, it is noteworthy that while the original protocol may not explicitly encompass a comprehensive functional analysis, several treatment components seamlessly aligned within such a framework and were further enhanced in the modified version of the intervention. This alignment is underscored by BA’s dual objectives of pinpointing factors that sustain or reinforce depressive behaviors (both positive and negative reinforcement) and identifying positive reinforcers that can support healthy behavioral patterns ( Figure 1 A). This process thus forms the basis for understanding the functional aspects of behavior, laying the groundwork for targeted strategies that can aid the person in autonomously addressing and modifying the maladaptive behavioral patterns effectively.

Moreover, there was a modification in the mode of delivering the intervention. Specifically, the content, previously structured as an e-learning course, was facilitated through a rule-based chatbot named Juno, operationalized within the Telegram platform with the sole purpose of delivering the intervention. As such, information was delivered through text messages, complemented by explanatory videos and images using the Telegram interface. The text messages were sent by Juno, which adhered to a preestablished protocol that had to be followed sequentially, enabling structured dialogues in which women engaged primarily by selecting the predefined buttons to navigate the conversation. Due to the rule-based nature of Juno, individualized feedback was not provided. In this regard, Figure 1 B depicts a simulation of the interaction between Juno and the user, showing how Juno responds and guides the user during the on-the-moment exercises (the reported example is an exercise conducted during week 2 reported in Figure 1 A “The bidirectional link between behavior and emotions”).

Moreover, by using the Telegram interface, participants could access multimedia resources such as videos and images in the multimedia section of the app. In addition, they could scroll back through the chat history to review past topics, although there was not a specific page summarizing the weekly intervention topics. This allowed participants to revisit and reinforce previous discussions as needed while maintaining the logical sequencing of the intervention. Conversations were structured to last around 10 minutes per session.

The Implementation of the Chatbot Juno in Telegram

The chatbot Juno was developed drawing inspiration from the methodology used in designing Motibot, a chatbot dedicated to providing psychosocial support to adults with diabetes mellitus [ 48 ]. Leveraging the capabilities of the Rasa open-source platform [ 49 ], which has been explicitly tailored for chatbot development and training, became viable owing to the domain-agnostic nature of Motibot’s core structure, which provided a remarkably flexible foundation. The Rasa platform seamlessly integrates advanced machine learning techniques and harnesses pretrained embeddings from language models. This integration empowers the construction of a chatbot finely tuned to a specific language. The synergy of machine learning techniques with crafted rules ensures a chatbot that is not only dynamic but also highly responsive. Within Juno, the pivotal role played by natural language understanding [ 50 ] became evident in interpreting user messages while considering the conversational history. A carefully defined set of variables facilitated a smooth transition between turns in the dialogue.

For instance, named entity recognition, a specific natural language understanding task, was used to interpret the intent “say your name” and identify the entity “user’s name.” Juno used Telegram as its user interface, offering numerous advantages to users while streamlining the development process. In addition, Telegram’s built-in support for interactive tools, including buttons, links, and images, enhanced the overall UX. In this regard, enhancing UX involves using personalized interaction time frames. Juno, as part of its intervention process, prompted users at the end of the initial day to specify when they preferred follow-up contacts. This proactive approach assisted users in scheduling their intervention; Juno uses Rasa’s reminder interface to accomplish this task. However, potential server malfunctions can affect this tool. To mitigate such issues, Juno allows users to initiate the interaction (eg, by writing the message “Can we start?”) if the reminder date passes without any notification. Although it is a workaround for a possible interaction problem, this approach should still maintain a positive UX. Furthermore, it should be acknowledged that, in this initial phase of development, the chatbot could occasionally overlook an appointment.

Moreover, in line with what was reported above, it is noteworthy that Juno follows an expert-written structured script to maintain focus on the intervention content and avoid deviating from the intended topics. Users can provide input by selecting predefined buttons or providing written responses, but they do not receive personalized feedback based on their input. If users attempt to engage with Juno outside the scope of the intervention, Juno informs them that it cannot respond to such queries and returns to the predefined interaction by starting from where they had left off.
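The scripted, sequential behavior described above can be illustrated with a minimal Python sketch. This is not Juno’s implementation and it does not use the Rasa API: `Step`, `ScriptedBot`, and the message texts are hypothetical, and out-of-scope detection is simplified to checking whether a reply matches one of the predefined buttons, whereas the actual chatbot relies on natural language understanding.

```python
from dataclasses import dataclass, field
from typing import List

OUT_OF_SCOPE = "Sorry, I cannot respond to that."  # hypothetical wording
FINISHED = "This session is complete."             # hypothetical wording


@dataclass
class Step:
    message: str
    # Predefined buttons; an empty list means any written reply is accepted.
    buttons: List[str] = field(default_factory=list)


@dataclass
class ScriptedBot:
    script: List[Step]
    position: int = 0  # index of the step currently awaiting a reply

    def prompt(self) -> str:
        """Return the message of the current step, or the closing message."""
        if self.position >= len(self.script):
            return FINISHED
        return self.script[self.position].message

    def reply(self, user_input: str) -> str:
        """Advance through the script in order; no personalized feedback."""
        if self.position >= len(self.script):
            return FINISHED
        step = self.script[self.position]
        if step.buttons and user_input not in step.buttons:
            # Out-of-scope input: decline, then resume from the same step.
            return f"{OUT_OF_SCOPE} {step.message}"
        self.position += 1
        return self.prompt()
```

In this sketch the bot never branches on the content of a written reply, which mirrors the absence of individualized feedback; an unexpected message leaves `position` unchanged, so the conversation resumes where the user left off.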

With regard to data storage, no further development was required, as the native support of Rasa for storing interactions in a MongoDB database (ie, a universal time stamp) ensures both consistency and the archiving of users’ data (ie, log-in information, the time spent by each user interacting with Juno, etc).

Measurement Instruments

During the baseline assessment, women were asked the following demographic information: age, gestational week, if the pregnancy was physiological or induced through medically assisted techniques, marital status, educational level, category of occupation, living location, past and present psychiatric history, and presence of any medical condition (both pregnancy-related and not). Moreover, during the baseline assessment, participants completed questionnaires assessing psychological symptoms, levels of BA, and perceived environmental reward. The same questionnaires were administered at the end of the sixth week of interactions, facilitated by Juno in the Telegram Chat, for a postintervention evaluation; the UX and user engagement (UE) measures were included in the postintervention assessment.

Psychological Symptoms

Depression symptoms were evaluated through 2 unidimensional self-report tools: the PHQ-9 [ 44 , 51 ] and the Edinburgh Postnatal Depression Scale [ 52 , 53 ]. The PHQ-9 assesses the severity of depression symptoms over the past 2 weeks through 9 items measured on a 4-point Likert scale (0=“not at all”; 3=“almost every day”). Items align with the diagnostic criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition [ 54 ]. A score of ≤9 indicates mild or no depression symptoms, between 10 and 14 indicates moderate symptoms, and ≥15 indicates severe symptoms. Item 9 specifically assesses suicidal ideation. The Edinburgh Postnatal Depression Scale also assesses the severity of depression symptoms, but over the previous week and more specifically in association with the perinatal period. It comprises 10 items measured on a 4-point Likert scale (0=“no, not at all”; 3=“yes, always”), with item 10 assessing suicidal ideation. Scores range between 0 and 30, and a score of ≥13 suggests probable depression. Anxiety symptoms were measured through the Generalized Anxiety Disorder-7 [ 51 , 55 ], a unidimensional self-report tool gauging the severity of these symptoms over the past 2 weeks through 7 items measured on a 4-point Likert scale (0=“never”; 3=“almost every day”). Scores range between 0 and 21; a score between 0 and 4 suggests minimal anxiety, between 5 and 9 mild anxiety symptoms, between 10 and 14 moderate anxiety symptoms, and ≥15 severe anxiety symptoms. Finally, stress symptoms were assessed through the Perceived Stress Scale-10 [ 56 , 57 ], a unidimensional self-report tool assessing stress symptoms over the past month using 10 items measured on a 4-point Likert scale (0=“never”; 3=“quite often”). Scores range between 10 and 40, with scores ranging from 0 to 13 suggesting lower stress levels, between 14 and 26 moderate stress levels, and ≥27 high perceived stress levels.
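The PHQ-9 and Generalized Anxiety Disorder-7 cutoffs reported above can be turned into a small lookup, shown here as a hedged Python sketch. The names `BANDS`, `MAX_SCORE`, `severity`, and `epds_probable_depression` are hypothetical; the band labels and cutoffs are taken from the text, and the PHQ-9 maximum of 27 follows from 9 items scored 0 to 3.

```python
from bisect import bisect_right

# Lower bounds of each band above the first, with the matching labels.
BANDS = {
    "PHQ-9": ([10, 15], ["mild or none", "moderate", "severe"]),
    "GAD-7": ([5, 10, 15], ["minimal", "mild", "moderate", "severe"]),
}
MAX_SCORE = {"PHQ-9": 27, "GAD-7": 21}


def severity(scale: str, total: int) -> str:
    """Map a total score to the severity band stated in the text."""
    if not 0 <= total <= MAX_SCORE[scale]:
        raise ValueError(f"{scale} total must be between 0 and {MAX_SCORE[scale]}")
    cutoffs, labels = BANDS[scale]
    # bisect_right counts how many cutoffs are <= total, which is the band index.
    return labels[bisect_right(cutoffs, total)]


def epds_probable_depression(total: int) -> bool:
    """Edinburgh Postnatal Depression Scale: a score of >= 13 suggests probable depression."""
    if not 0 <= total <= 30:
        raise ValueError("EPDS total must be between 0 and 30")
    return total >= 13
```

Keeping the cutoffs in a single table makes the boundaries easy to audit against the published thresholds, for example that a PHQ-9 total of 14 is still “moderate” while 15 is “severe.”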

BA Measures

The BA for Depression Scale-Short Form [ 58 ] was used to measure changes in avoidance and activation during BA interventions for depression over the past week. It is a self-report tool featuring 9 items measured on a 7-point Likert scale (0=“not at all”; 6=“completely”), providing scores for BA, behavioral avoidance, and a total score ranging from 0 to 54. The Environmental Reward Observation Scale [ 59 ], a unidimensional self-report tool, was also used; it measures the level of environmental reward perceived in recent months through 10 items rated on a 4-point Likert scale (1=“strongly disagree”; 4=“strongly agree”). Scores range from 10 to 40.
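The total score ranges reported for these two scales follow directly from the number of items and the response anchors. As a brief worked check in Python (the helper name `score_range` is hypothetical, and the sketch assumes the total is a simple sum of item scores):

```python
from typing import Tuple


def score_range(n_items: int, low: int, high: int) -> Tuple[int, int]:
    """Theoretical minimum and maximum of a summed Likert total."""
    return n_items * low, n_items * high


# BA for Depression Scale-Short Form: 9 items scored 0 to 6.
badssf_range = score_range(9, 0, 6)   # (0, 54)
# Environmental Reward Observation Scale: 10 items scored 1 to 4.
eros_range = score_range(10, 1, 4)    # (10, 40)
```

Both results agree with the ranges given in the text.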

The UX was evaluated through the Mobile Application Rating Scale [ 60 ], a self-report tool evaluating the quality of an app and its features. Comprising 23 items scored on a 5-point Likert scale (1=“poor”; 5=“excellent”), it assesses 4 dimensions of objective quality: engagement, functionality, esthetics, and information, along with a subjective quality scale. Only subscales related to “information,” “subjective app quality,” and “app-specific” (function) were considered for this study, totaling 17 items. UE was instead evaluated through the User Engagement Scale-Short Form [ 61 ], a short self-report tool assessing UE with a digital solution. With 12 items based on a 5-point Likert scale (1=“strongly disagree”; 5=“strongly agree”), it encompasses factors such as focused attention, perceived usability, esthetic attractiveness, and reward. Higher scores index a more positive evaluation.

Semistructured Interviews

Semistructured interviews, conducted by the first author between October and December 2023, featured 15 main questions tailored to the study. The interview comprised 3 main blocks of questions and related probing questions: one focused on women’s personal experience with the intervention content (4 questions), another focused on their experience with the chatbot and the overall platforms (5 questions), and the last one asked for their opinions on future technological advancements (6 questions). Before the last block of questions, participants were provided with the definitions of digital intervention and technological advancement within the context of chatbot technologies, as reported in Multimedia Appendix 1. The interviews were conducted approximately 10 to 14 days after the participants had finished their interactions with the chatbot Juno, either by phone call or through Google Meet, based on the participants’ preference. With the participants’ consent, the interviews were audio recorded for transcription and evaluation.

Data Analysis

All the analyses were computed with RStudio (RStudio IDE). Participants’ questionnaire scores were assessed, and score differences in psychological symptoms and activation levels, between preintervention and postintervention time points, were calculated by subtracting the postintervention scores from the baseline ones. Relying on the qualitative meaning of response points (particularly for the psychological symptoms, measured on a 4-point Likert scale), differences between time points were commented on when they differed by a minimum of +3 or –3 score points.
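The difference scores and the 3-point criterion described above can be sketched as follows. This is an illustrative Python example, not the R code used for the analyses; the function name `change_scores` and the example values are made up. The sign convention follows the literal wording (baseline minus postintervention), so the operands should be swapped if the opposite convention is intended. The threshold check uses the absolute value and therefore does not depend on that choice.

```python
def change_scores(baseline: dict, post: dict, threshold: int = 3) -> dict:
    """Return {measure: (difference, is_notable)} for measures present at both time points."""
    changes = {}
    for measure, pre in baseline.items():
        if measure not in post:
            continue  # no postintervention measurement available for this measure
        diff = pre - post[measure]  # baseline minus postintervention
        changes[measure] = (diff, abs(diff) >= threshold)
    return changes


# Made-up scores for illustration only (not study data).
baseline = {"PHQ-9": 7, "GAD-7": 5, "PSS-10": 15}
post = {"PHQ-9": 3, "GAD-7": 6, "PSS-10": 13}
for measure, (diff, notable) in change_scores(baseline, post).items():
    print(measure, diff, "notable" if notable else "below threshold")
```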

The semistructured interviews were evaluated in 2 different but complementary manners. First, a purely qualitative descriptive evaluation was conducted by extracting and evaluating the key points reported by each case in the related interview transcript for each question. Subsequently, a text-mining analysis was performed using the R package quanteda [ 62 ]. To this end, (1) the transcripts, written in Italian, were tokenized using the dedicated quanteda function, converting uppercase letters into lowercase letters and removing numbers, punctuation, and stop words. Subsequently, (2) user responses were subdivided and grouped into separate .txt files based on the question they referred to. Finally, (3) recurrent words (ie, word stems) and their bigrams (ie, pairs of reoccurring word stems) were extracted; the former were considered recurrent if they appeared at least 3 times, and the latter if they appeared at least 2 times across interviews. The 3-occurrence threshold was defined in line with past research [ 48 ]. In particular, it was chosen based on the assumption that a stem meeting this criterion is expected to belong to the 5% most recurrent ones. This criterion resulted in the extraction of between 2.4% and 9.2% of the most recurrent stems (average 5.5%) for the different questions, reasonably complying with the assumed 5% threshold. In addition, for a given question, the average occurrence of stems was 1.3; thus, a 3-occurrence threshold was equivalent to requiring a stem to recur more than twice as often as the average.
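The three steps above can be illustrated with a short sketch. The study used the R package quanteda; the example below is a plain Python stand-in using only the standard library, so it does not reproduce quanteda's behavior. The stop-word list and the example answers are hypothetical, no stemming is applied (whole words stand in for stems), and pairing adjacent words after stop-word removal is an assumption about how the bigrams were formed.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (the study used Italian transcripts).
STOP_WORDS = {"the", "was", "during", "for", "and", "it", "to", "after"}


def tokenize(text: str) -> list[str]:
    """Lowercase, keep letter-only tokens (drops numbers and punctuation), remove stop words."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def recurrent_terms(answers: list[str], min_term: int = 3, min_bigram: int = 2):
    """Count words and adjacent word pairs across answers and keep the recurrent ones."""
    term_counts, bigram_counts = Counter(), Counter()
    for answer in answers:
        tokens = tokenize(answer)
        term_counts.update(tokens)
        bigram_counts.update(zip(tokens, tokens[1:]))
    terms = {t: c for t, c in term_counts.items() if c >= min_term}
    bigrams = {b: c for b, c in bigram_counts.items() if c >= min_bigram}
    return terms, bigrams


# Made-up answers to a single question, for illustration only.
answers = [
    "The content was useful during the pregnancy period.",
    "Useful content for the pregnancy period and after.",
    "It was useful to ask for help.",
]
terms, bigrams = recurrent_terms(answers)
print(terms)    # {'useful': 3}
print(bigrams)  # {('pregnancy', 'period'): 2}
```

With the reported average of 1.3 occurrences per stem, twice the average is 2.6, so a threshold of 3 occurrences sits just above it.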

Cases Presentation

Table 1 shows the participants’ demographic information. Figure 2 shows their scores regarding psychological symptoms, activity level, and environmental reward at baseline ( Figure 2 A) and the postintervention time point ( Figure 2 B), further plotting the difference between the 2 time points ( Figure 2 C). Of the 5 cases, participant E completed the interactions with Juno only up to (and including) week 4 because of a technical issue with the server provider of the chatbot (the update of the server’s public certificate compromised the connection between the Rasa server and Telegram, and despite efforts within the support time frame, communication could not be restored). As such, her postintervention evaluation measurements are not available. It should also be noted that because of technical issues linked to the timing of the interactions, participant B skipped the interaction of week 2, participant A skipped the interaction of week 3 ( Figure 1 ), and participant D skipped the interactions of weeks 3 and 5. Furthermore, all participants had to autonomously prompt the interaction with Juno at least once.

Participant	Age (y)	Living area	Education level	Occupation	Marital status	Gestational week
A	29	North Italy	Master’s degree	Freelance worker	Married	18
B	34	Central Italy	PhD	Freelance worker	Married	12
C	31	North Italy	Bachelor’s degree	Employee	Cohabitant	12
D	33	North Italy	PhD	Researcher	Married	25
E	40	North Italy	PhD	Freelance worker	Cohabitant	12

Differences and Similarities Across Cases: Questionnaire Scores

Regarding the trend of change between the 2 time points, women showed comparable levels of psychological symptoms, BA, and environmental reward at baseline, whereas these levels seem to differ slightly at the postintervention time point. More specifically, participant C stands out as the only one showing a reduction in all psychological symptom variables, with changes ranging from 3 to 4 score points. However, her levels of BA and environmental reward appear unchanged. By contrast, participant A seems to exhibit a peak in the reduction of stress symptoms and an increase in BA. Interestingly, participant B demonstrated a trend of increase in anxiety symptoms, alongside a trend of reduction in depression symptoms and a notable peak in increased BA.

In contrast, participant D appears to demonstrate a negative peak in BA (ie, a decrease), while the other dimensions seem unchanged. Nevertheless, it should be stressed that at either time point, none of the participants reported clinically relevant depression, anxiety, or stress symptoms. Finally, Figure 3 plots the participants’ evaluation of UX and UE. Taken together, participant B provided the lowest UX and UE scores in all dimensions, while participant D provided the highest. More specifically, all showed a quite high appreciation for the esthetic of the interactions (mean 4, SD 0) and a modest to high perceived usability of the chat (mean 3.67, SD 0.82), although this seems particularly true for participant D and less so for participant B.

Furthermore, participant B reported a particularly low sense of absorption during the interaction, which was also quite low for participant C. This sense of absorption was instead moderate for participants A and D (mean 2.42, SD 1.17). These 2, together with participant C, also reported a moderate to quite high sense of reward from the interactions (mean 3.33, SD 0.72), which was instead lower for participant B. A comparable pattern emerged regarding UX-information (mean 4.08, SD 0.63); unlike participant B, for whom the information was of modest quality, participants A, C, and D rated the information as high quality in terms of credible sources, quantity, and clarity. An almost equal score distribution emerged for the app-specific function (ie, the app operation in terms of easy learning, logical flow, and gesture interaction design; mean 3.38, SD 0.75) and subjective quality (ie, the actual willingness to use the app; mean 2.69, SD 1.01), with the latter being considerably lower.

Multimedia Appendix 1 shows the participants’ specific scores reported in Figures 2 and 3.

Differences and Similarities Across Cases: Qualitative Evaluation of the Semistructured Interviews’ Answers and Text-Mining Results

A summary of the key concepts that emerged from the answers provided during the semistructured interviews is reported in Tables 2 - 4 , separately for each case. In this regard, it is worth noting the answers provided for q00 regarding the motivation for participation; only participant B reported a more personal motive linked to a desire to enrich her pregnancy experience. In contrast, participants C, D, and E were driven by curiosity and a personal propensity to help with research (participant E revealed during the semistructured interview that she is a perinatal psychologist and that she participated in the study because she was curious to experience firsthand the potential of digital tools in this context). Finally, participant A reported that her curiosity was sparked by seeing one of the institutions involved in this study.

Interview questions		Participant A	Participant B	Participant C	Participant D	Participant E
q00	What motivated your participation in the study?
q01	How have the technical issues encountered made you feel?
q02	Can you briefly list which were the aspects that you liked the most and those that you liked the least of the intervention? Please, provide reasons for them.
q03	How would you define the content of the interactions concerning the period of pregnancy?
q04	Do you think that the contents you viewed and what can be learned from them can be useful to you in the future, during the postpartum, and even afterward? Why is that?
q05	How would you define or have you perceived the length of the intervention?

Interview questions		Participant A	Participant B	Participant C	Participant D	Participant E
q06	Overall, what do you think about the interactions with the chatbot Juno? Probing questions: 1. How did you feel during the interactions? 2. What did you think of the esthetic of the material?
q07	What do you think about using chat interactions with a chatbot to communicate content related to psychological well-being such as that Juno sent you?
q08	If you could change or suggest changes, what would you change in your interactions with Juno? That is, beyond the content of the messages, what would you remove from or add to the way Juno interacts?					d/n
q09	What do you think about the use of Telegram as an app through which to communicate with Juno, or anyway, with a chatbot?

a d/n: did not know what to answer.

Interview questions		Participant A	Participant B	Participant C	Participant D	Participant E
q10	Based on your pregnancy experience, if you could imagine an ideal app that would provide you with psychological support, what would it be like? Probing questions: 1. How would you like the information to be structured and provided? 2. Concerning the ease of use and the clarity of the commands, how important do you think they are? What would make it clearer or easier to use? 3. What kind of content would you like to see?		section on how to manage technical problems
q11	What technological aspects would you add to a tool like Juno? Probing questions: 1. What do you think about voice commands and voice responses from Juno? 2. Would you like to customize the look of the chatbot? If so, in what terms?
q12	Is there something that worries you about using technologies like chatbots and smartphone apps as tools to provide psychological support?
q13	What do you think might be the pros and cons of using technology over human support in providing psychological support during pregnancy? probing question: 1. Do you think there are personal or social situations in which one can be more suitable than the other?
q14	In your opinion, what could be done or created to manage the challenges and risks that you have mentioned to support the reliance on and the use of these technological tools?
q15	What do you think about the idea of integrating this type of tool within the health system and/or routine care with your gynecologist to promote the psycho-physical well-being of pregnant women?
q16	Is there anything else you would like to add?	—	—	—

a FAQ: frequently asked question.

b Not applicable.

Focusing on the text-mining analysis performed, the interview length ranged between 27.41 and 60.02 (mean 42.3, SD 12.63) minutes. After deleting the stop words, transcripts included a mean of 1732.8 (SD 795.78) words per participant. Overall, the aggregated text-mining results are shown in Figures 4 - 6 . The size of the nodes (ie, word stems) illustrates the proportion of concept occurrences across transcripts for a specific question, with all displayed stems appearing at least 3 times. Word stems connected by arrows represent bigrams that occurred at least twice. The word stems were translated after analysis for inclusion in the plots; therefore, the direction of the arrows reflects the Italian syntax.
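The node-and-arrow representation described above can be thought of as a small directed graph. The following Python sketch shows one way to derive such a structure from stem and pair counts; it is not the plotting code used for the figures. The function name `build_graph` is hypothetical, and two choices are assumptions: node size is taken as each stem's share of all retained stem occurrences, and arrows are kept only when both endpoints are retained stems.

```python
def build_graph(stem_counts: dict, pair_counts: dict, min_stem: int = 3, min_pair: int = 2):
    """Return (nodes, edges): node -> relative size, and (source, target, count) arrows."""
    kept = {s: c for s, c in stem_counts.items() if c >= min_stem}
    total = sum(kept.values())
    # Assumption: node size is the stem's share of all retained occurrences.
    nodes = {s: c / total for s, c in kept.items()}
    # Assumption: only arrows between two retained stems are drawn.
    edges = [
        (a, b, n)
        for (a, b), n in pair_counts.items()
        if n >= min_pair and a in nodes and b in nodes
    ]
    return nodes, edges


# Made-up counts for illustration only (not study data).
stem_counts = {"period": 3, "pregnancy": 3, "help": 1}
pair_counts = {("period", "pregnancy"): 2, ("ask", "help"): 2}
nodes, edges = build_graph(stem_counts, pair_counts)
print(nodes)  # {'period': 0.5, 'pregnancy': 0.5}
print(edges)  # [('period', 'pregnancy', 2)]
```

The arrow direction is taken from the order of the pair, which is why translated stems can appear in an order that follows the source language rather than English.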

Personal Experience With the Chatbot Juno

As for the aggregated text-mining results, the transcripts of the interviews regarding participants’ personal experience with Juno included, after deleting the stop words, a mean of 67.56 (SD 28.39) words per question. Results are summarized for each question in Figure 4 , while Table 2 summarizes, separately, the participants’ answers to each question. In this respect, all participants reported negative feelings regarding the technical problems encountered, except participant C, who felt indifferent to them. It is noteworthy that, while all the others skipped at least 1 interaction, participant C was the only one who did not skip any; she further reported that she “knew that this is a research project,” thereby highlighting that she had expected some issues to occur. In line with this, the text-mining results highlighted feelings of displeasure, ultimately making the experience less impactful (made it→less→impactful; q01).

Nonetheless, the experience (in terms of the content of the interactions) was liked and found very interesting, in particular the content of the exercises/questions (q02). However, participant A specified that she would have preferred the latter to be proposed in a more structured manner, with the possibility of continuing to practice them in between the interactions to favor a sense of continuity. Furthermore, participant D felt that some of the questions (part of the exercises) were redundant. Participants B, C, and D instead stressed their appreciation for how the broader content was delivered in terms of videos and images, while participant E specifically appreciated how the messages were phrased. The overall content of the interactions, particularly the initial psychoeducation, was perceived as pertinent and adequate. Consistently, the text-mining results highlighted that the interactions’ content was perceived as useful, allowing participants to take a moment to pay attention to themselves (q03). In this regard, they all reported that the content was pertinent to the pregnancy period (period→pregnancy) but could also be useful during the postpartum and in the future in general, supporting them in asking for help (ask→help; q03 and q04) and, more broadly, favoring a self-awareness that can be applied across life situations in favor of well-being. However, focusing on the subjective answers, while participant B felt that the content was suited for the beginning of the second trimester, participant C felt that such a period was already too late and that the support provided by the chatbot was better suited for the emotional tumult of the first trimester.

Finally, all women felt that the 6-week length of the intervention was adequate, although, given the length of the pregnancy period, they could have followed it for even longer (q05). This latter aspect was stressed by all those who had skipped at least 1 week of interaction but not by participant C, who had followed all 6.

UX With the Chatbot Juno in Telegram

The transcripts of the interviews regarding participants’ UX with the chatbot Juno in Telegram included, after deleting the stop words, a mean of 69.65 (SD 29.95) words per question. Results are summarized for each question in Figure 5 , while Table 3 reports participants’ answers. With regard to women’s UX in interacting with Juno, as previously outlined, experiences were quite different, although the technical problems with the chatbot Juno emerged as a matter requiring particular attention (q08). In this regard, participant B pointed out the importance of providing clearer guidance, ideally beforehand, on how to autonomously deal with technical issues to help avoid feelings of confusion. Nevertheless, all women showed appreciation for the esthetic of the material (esthetic→material), describing it as cute. Furthermore, their accounts of the UX mostly focused on the way Juno answered their inputs, highlighting the relevance of this aspect and expressing a wish for increased personalization of the answers (q06). Despite this, participant A reported that the way the messages were phrased and the overall interaction flow made her at times “forget that there was not a person on the other side.” This differs from the perception of participant B, who considered the messages to be a bit sterile. Between these 2 poles lie the perceptions of participants D and E, the former describing the messages as “sufficiently spontaneous and realistic” and the latter stressing that, although she felt properly guided by Juno, clearly perceiving that Juno was virtually created made her feel reassured. Consistently, when asked about their opinion on using a chatbot as a means to deliver psychological content (q07), participant A reported that the interactions’ limits (in terms of chatbot freedom) were both a limit and a strength.
Nonetheless, overall, women felt that it could be an effective medium that can provide a kind of momentary containment (type→containment) and that it might work as a cue to subsequently reach for in-person support. Indeed, they felt that beyond its application in preventive contexts, a psychologist is needed (go→psychologist→instead; q07), and even in the context of this study, participant B felt the need for human contact, at least by telephone call.

Finally, focusing on the app itself, women all agreed on the convenience ( convenient→app ) of Telegram as an interface, allowing them to avoid downloading another app and describing it as an optimal channel that they already knew and that is easy to use (q09).

Opinions on Future Technological Advancement

The transcripts of the interviews’ answers regarding the participants’ desired technological advancement included, after deleting the stop words, a mean of 159.49 (SD 53.01) words per question. Results are summarized for each question in Figure 6 ; Table 4 reports the participants’ answers. Overall, when asked about opinions on future technical advancements, women’s answers were quite cohesive. In line with this, when asked how they would imagine an ideal app in the context of perinatal care (q10), the greater focus was on the information content (content→information) related to what happens during pregnancy and in the different trimesters (happens→trimester), including psychologically (well-being→psychological). It was also focused on the possibility of freely searching for this information and reading about it (go→search).

Furthermore, they reported interest in having a chat with a chatbot within the app, mainly to ask questions related to their personal experience (linked to→experience→personal). In this regard, focusing specifically on the potential technological advancements that could be foreseen for chatbots like Juno (q11), women showed a lack of interest in including voice commands in terms of sending and receiving audio messages (command→vocal→no) and did not show a particular interest in personalizing the chatbot appearance (personalization→appearance→chatbot), albeit recognizing that others might. The sole exception was participant E; she reported voice commands as the first thing she would have liked to add, perceiving them as a way to optimize time. Personalization was again a prominent aspect among all women, who stressed their desire to receive more personalized (personal) answers. However, albeit desired, such increased personalization and freedom of the chatbot also emerged as women’s main concern regarding the application of these tools in the mental health context (q12). As such, women reported the need to maintain human→monitoring. Indeed, worries were expressed regarding the kind of information the chatbot might give if unsupervised.

Furthermore, they expressed worries related to increased freedom and resemblance to human interactions, with the idea that this might lead to an overreliance on these tools. Indeed, they stressed the risk of these tools substituting interactions (substitute→interaction) with professionals and psychologists (support→psychologist→risk→more→freedom→chatbot), which was not desired. In line with this, participants believed that although a main advantage of these technological→tools is that they can be valuable in supporting psychological→well-being in preventive contexts (preventive→terms) or in satisfying specific needs without waiting to make an appointment, they cannot equal a therapeutic→intervention delivered in person, particularly during pregnancy (q13). To deal with the concerns and risks reported, women agreed on the importance of underlining and reminding users of what to expect from these tools (meaning→tools) and clearly stating their limits, thereby distinguishing the kind of support that can be received from a physical person versus a digital tool (digital→person). This would then also work as a disclaimer, thus preventing them from feeling disappointed when perceiving the limits of these tools (q14). In line with the above, women expressed a strong desire for an app that could be integrated within the health (care)→system, perceiving it as something that could create a shared space that facilitates interactions with gynecologists, thereby allowing the latter to account for women’s psychological well-being together with the medical aspects.

Principal Findings

This study aimed to use a multiple–case study design to evaluate and compare pregnant women’s experience and perception of Juno, a chatbot prototype designed to deliver a BA preventive support intervention; their opinions regarding desired improvements and technological advancements were also investigated. The insights gained from this study are valuable and in line with previous studies emphasizing the importance of evaluating prototypes during the design stages of a digital tool, and of a chatbot in particular [ 41 , 63 ]. Within this context, the adoption of a multiple–case study design [ 42 ] allowed us to gather valuable in-depth information on the similarities and differences in pregnant women’s perceptions, opinions, and desires while also evaluating the technical issues encountered and their impact on women’s experience [ 63 ].

Regarding implementation, operationalizing Juno in Telegram allowed women to benefit from the lack of installation requirements and to experience an interface within a familiar environment; this is an advancement from the platform used in the previous study [ 36 ]. The materials’ esthetic and the intervention content at large were again appreciated, and the content, in particular, was described as sound and useful. Women expressed specific appreciation for the exercises proposed by Juno as part of the BA intervention, stating that they favored self-reflection. Unlike in the previous study [ 36 ], most women would have liked the intervention to be longer. This might be linked to the weeks of interaction skipped, since only participant C, who had completed all 6 interactions, would not have lengthened the intervention. Another explanation could instead be linked to the change of platform and, even more, to the new structuring of the exercises. Compared with the previous structuring [ 36 ], the exercises were changed to be as short, simple, and effortless as possible; as such, they were turned into on-the-moment reasoning exercises guided by Juno rather than in-between–session homework. Altogether, this seems to have been appreciated by women, except for participant A, who instead stressed that she would have preferred to have the possibility to continue practicing them autonomously through practical exercises in between the interactions. The desire for continuity became evident throughout her interview, suggesting that the digital tool was perceived as a companion to turn to when extra support was needed, providing a personal space to freely take care of herself. Participant E also highlighted this, considering it as something with added value, especially during the postpartum period, helping her take care of herself to then potentially better care for the newborn.

Mindful of the above, it is pivotal to remember that women in this study can be deemed “healthy,” as none of them presented medical conditions, and all psychological symptom variables were below the clinical thresholds. Nevertheless, it is worth noting that participant C, who besides having completed all 6 interactions also reported the highest symptom scores at baseline, showed the greatest and most consistent trend of reduction in symptom scores. However, despite some pretest-posttest changes in symptom scores, none of the participants ever reached clinical relevance, so these changes should be regarded as normal fluctuations in the state of well-being and ill-being occurring during pregnancy [ 64 ]. As such, the positive feedback on the meaning of the content and the exercises proposed is important since, within a preventive context, the goal is not symptom reduction or resolution but increasing awareness of psychosocial functioning and of the intricate relationship between emotions and behaviors. This would ultimately allow the development of personal resources that can be applied across life situations, thus fostering adaptation capacities at large. Women here appeared to have perceived these benefits, acknowledging that the intervention content was relevant to the pregnancy period and could potentially be helpful during the postpartum and in the future. However, it is worth highlighting a difference in its perceived usefulness as a function of the pregnancy period during which they thought that the intervention should be delivered. Participant B considered it suitable for the early pregnancy period, around the beginning of the second trimester, while participant C suggested it was more beneficial for the emotional tumult of the first trimester. It is noteworthy that both women followed the intervention during the same gestational week.
Such individual differences in the perception of need are nonetheless important in terms of the motivation to follow the preventive intervention and of the foreseen impact of the information received.

Notwithstanding these individual differences, consistent patterns across women’s feedback were identified, particularly when asking them what their ideal digital tool for psychological support during pregnancy should look like. The first thing they stressed concerned the content; within a preventive context, all women felt the need for more holistic information in which the medical and the psychological aspects are integrated, helping them understand how the 2 influence each other to then receive guidance in understanding what is “normal” and what is not. They expressed a preference for this information to be readily available, giving them the freedom to access it as they preferred. This could indeed support their empowerment [ 26 ] and aligns with the evidence highlighting that engaging in help-seeking–related behaviors increases the likelihood of perinatal women seeking further assistance in the future if needed [ 28 ]. As for receiving support for their own subjective experience, women pointed to chatbots, seeing them as having the potential to provide a 24/7 means to answer their pregnancy-related questions. This was viewed in the context of containing worries without the need to wait or continually seek assistance for potentially smaller concerns. However, all women consistently emphasized that chatbots should not be intended to substitute in-person support or human relationships more broadly. They highlighted the importance of the perception of contact and closeness for pregnant women. Furthermore, although recognizing the potential benefits in preventive contexts, women stressed that in situations of greater need and increased psychological symptoms, chatbots should always be accompanied by human monitoring.

Keeping this in mind, it is important to reflect on the technical problems encountered with the chatbot Juno. Except for the sole woman who did not skip any interactions (participant C), all women reported their dissatisfaction with the technical problems encountered. Their reactions varied from feeling that the intervention content became less impactful to experiencing frustration, disappointment, confusion, and a perception of loss of control. These aspects are particularly significant in a mental health context, even if preventive, since such perceptions might reduce the willingness to follow the intervention by hindering women’s sense of agency. Notably, participant B, who participated with the desire to enrich her pregnancy experience, reported higher and more personal expectations toward the intervention. Consistently, her scores on the UX and UE questionnaires and the results from the semistructured interview suggest that she felt the most negative about the technical problems encountered. In contrast, participant C, who stated that she knew that interactions with Juno were “part of a research project,” acknowledged the technical problems but remained indifferent. However, it is crucial to consider the reported feelings of confusion and loss of control. While desiring a more personalized chatbot to receive answers that are more in tune with their individual needs, participants themselves emphasized the importance of clear explanations and reminders about the chatbot’s capabilities and limitations as it becomes more sophisticated and autonomous. This is essential to prevent overreliance on the chatbot and to avoid potential disappointment and iatrogenic effects that could decrease the likelihood of seeking help in the future.
When discussing the use of chatbots like Juno as a means to deliver psychological content, participants acknowledged their effectiveness in providing momentary containment and serving as an initial step before seeking further support. However, they also underscored the need for human psychologists or professionals in preventive contexts. Participant B, in particular, expressed a desire for some “human” contact, even by telephone, while interacting with Juno. This resonates with a compelling argument made by Sedlakova and Trachsel [ 40 ]; they conducted an epistemic analysis of chatbots’ adoption within mental health or therapeutic contexts, prompting the need to carefully reason about how chatbots can be perceived. As such, in line with women’s desire for increased chatbot personalization and their worries linked to its potential increased freedom, the authors [ 40 ] suggested balancing the number of humanlike characteristics and features of chatbots and confining their application to specific functions.

Focusing on the broader real-life application of apps and chatbots in contexts such as the health care system, beyond being highly desired, these tools were seen as a potential bridge between women and clinical professionals. Moreover, in line with the above, women reported that having such tools would make them feel that their psychological well-being was accounted for together with their physical well-being, since the former is felt to be neglected. Another aspect that emerged is that they could favor self-monitoring and monitoring by the clinician; existing literature does indicate that tools of this nature are acceptable to perinatal women as a means of monitoring mood symptoms [ 65 ]. In this regard, the Interactive Centre of Perinatal Excellence developed by the Australian Centre of Perinatal Excellence [ 66 , 67 ] is noteworthy. It is an interactive digital screening app integrated into the health care system and designed to facilitate screening for perinatal depression and anxiety symptoms. It provides women with feedback on the screening results while generating related reports for the clinician. It can support the prevention of perinatal mental health disorders by empowering women, streamlining the screening process, and saving time and resources for both women and clinicians. In such a context, the inclusion of a tool like Juno within an app to “educate” and guide women through their pregnancy and postpartum while allowing for symptom monitoring might hold great potential; what if, within such an app, the preventive BA intervention delivered through Juno was proposed to women showing mild and/or moderate depression symptoms?

The literature highlights consistent prevalence metrics of depression symptoms throughout the whole pregnancy period (worldwide, 20.7%; Europe, 17.9% [ 68 ]; Italy, 6%-22% [ 69 - 72 ]), pointing to them as among the main predictors of postpartum depression [ 16 , 73 ], with repercussions on the quality of life of women [ 74 ] as well as on the child’s development and well-being [ 75 - 77 ]. These metrics highlight the necessity of collaborative efforts in designing and implementing tailored programs, particularly in primary prevention (to prevent symptoms before they start) and secondary prevention (targeting individuals at risk or with subclinical symptoms) [ 78 ]. This is especially crucial given the unique characteristics of peripartum depression, namely its direct association with the challenges and bodily changes inherent to the perinatal period [ 79 ], which stresses the need for tailored intervention programs. In this regard, taken together, our results suggest that Juno holds potential for apps in a preventive context, which is of value considering the paucity of preventive perinatal tools [ 34 ]. However, it has also emerged that, within this context, a tool like Juno is not deemed sufficient. In this regard, the comments made by participant A are emblematic; she felt that what Juno could give within the broader perinatal period was like “a drop in the ocean.” This is further exemplified by the dropout of all women with medical conditions, which suggests that, in its current form, the intervention deployed through Juno would have limited application. As such, the data indicate that its real-life adoption might be limited unless it is embedded within a broader context that better conveys its value while accounting for women’s differing needs, which influence the type and amount of use they would make of it.
Furthermore, beyond ensuring better functioning of the tool itself, a thorough action plan for problem resolution should be defined and provided to women (in both research and real-life contexts). However, the evaluation of these issues resonates with literature emphasizing the advantages of incorporating dedicated prototyping and implementation phases during the co-design of digital tools [ 41 ].

Study Limitations and Future Directions

Although generalizability of the results cannot be expected a priori in this study design, women’s high educational level and residency in northern Italy still represent a limitation. Perinatal depression symptoms tend to be higher among women with lower educational levels [ 80 ], suggesting a potential bias in the sample. In addition, these sample characteristics may reduce the variability of the analyzed cases, further limiting the generalizability of the findings. A further limitation concerns data collection, as it relied on self-reports and semistructured interviews, which are vulnerable to social desirability bias. Moreover, women’s experience with depression symptoms and their use of e-mental health tools were not measured in this study, which represents a further limitation, as assessing these matters could provide valuable insights into women’s perceptions and potential use of chatbots during pregnancy. These dimensions warrant consideration in future studies to better understand the factors influencing women’s engagement with digital interventions during this critical period.

Regarding the technical problems encountered, with Juno being at the beginning of the product life cycle [ 45 , 46 ], they represent a limitation of the study; however, their management by the research team was invaluable in providing insights into the software used and the potential of Rasa. This understanding contributes to a more flexible problem-solving approach for addressing current and potential future issues. Proactively addressing such problems helps users maintain a sense of control and proficiency with the tool. In addition, these issues offer important information on how problem resolution, or lack thereof, impacts the overall UX. In this regard, it is noteworthy that the users did not express dissatisfaction with the simplicity of the solution, which primarily operated on rule-based mechanisms. This emphasizes the significance of incorporating user-centered design principles in developing natural language processing solutions that effectively meet end users’ needs and expectations.

Conclusions

In line with this, good practices can be outlined to construct appropriate validity and mitigate any negative effects on the user, thus ensuring ethical standards [ 81 - 86 ]: (1) ground intervention content in evidence-based data pertinent to the perinatal literature; (2) seek input from both end users and clinical professionals to evaluate needs and gather feedback on intervention content and e-mental health tool usability; (3) conduct feasibility and pilot-testing to ensure acceptability, feasibility, efficacy, and effectiveness, along with evaluating e-mental health tool use (both frequency and duration); (4) use adequate measures and evaluate appropriate outcomes to assess intervention success; (5) ensure that end users are provided with a clear informed consent regarding intervention purpose, content, and e-mental health tool capacities, risks, and limits; (6) incorporate safety measures, including clear procedures for managing situations of need and heightened distress, such as providing crisis support services and establishing connections with reference clinicians or public health services, while also monitoring end user mental health; (7) continuously monitor intervention progress to refine effectiveness and minimize potential negative effects; and (8) ensure clinical professionals are properly guided and informed with up-to-date evidence on available e-mental health interventions, their effectiveness, suitability, and safety. Aligning with this, to ensure a consistently high-quality technical solution for end users, substantial investments in assistance and infrastructure are imperative. The insights from this study underscore the importance of prioritizing UX and technical reliability to enhance the effectiveness and adoption of preventive perinatal tools like Juno in real-world contexts. 
Although Juno already aligns with ethical standards 1 to 5, the results of this study indicate that the tool’s capacities, risks, and limitations need to be reported more thoroughly (point 5). In addition, safety measures were limited to self-reported depression, anxiety, and stress levels, with no specific process for monitoring intervention progress (point 6). Therefore, future developments of Juno should incorporate comprehensive safety measures and test their feasibility and acceptability. This includes integrating technological requirements to establish a more specific procedure for monitoring intervention progress (point 7).

Conflicts of Interest

None declared.

Definitions of digital intervention and technological advancement provided to participants during the semistructured interviews and the participants’ scores.

References

1. Riper H, Andersson G, Christensen H, Cuijpers P, Lange A, Eysenbach G. Theme issue on e-mental health: a growing field in internet research. J Med Internet Res. Dec 19, 2010;12(5):e74.
2. Torous J, Bucci S, Bell IH, Kessing LV, Faurholt-Jepsen M, Whelan P, et al. The growing field of digital psychiatry: current evidence and the future of apps, social media, chatbots, and virtual reality. World Psychiatry. Oct 2021;20(3):318-335.
3. mHealth: new horizons for health through mobile technologies: second global survey on eHealth. World Health Organization. 2011. URL: https://iris.who.int/handle/10665/44607 [accessed 2024-04-29]
4. Fairburn CG, Patel V. The impact of digital technology on psychological treatments and their dissemination. Behav Res Ther. Jan 2017;88:19-25.
5. Kim J, Marcusson-Clavertz D, Yoshiuchi K, Smyth JM. Potential benefits of integrating ecological momentary assessment data into mHealth care systems. Biopsychosoc Med. 2019;13:19.
6. Wang Q, Su M, Zhang M, Li R. Integrating digital technologies and public health to fight COVID-19 pandemic: key technologies, applications, challenges and outlook of digital healthcare. Int J Environ Res Public Health. Jun 04, 2021;18(11):6053.
7. Vial S, Boudhraâ S, Dumont M. Human-centered design approaches in digital mental health interventions: exploratory mapping review. JMIR Ment Health. Jun 07, 2022;9(6):e35591.
8. Dorst K, Dijkhuis J. Comparing paradigms for describing design activity. Des Stud. Apr 1995;16(2):261-274.
9. Dorst K. Design beyond design. She Ji J Des Econ Innov. Feb 2019;5(2):117-127.
10. Dorst K. Design research: a revolution-waiting-to-happen. Des Stud. Jan 2008;29(1):4-11.
11. Boyd H, McKernon S, Mullin B, Old A. Improving healthcare through the use of co-design. N Z Med J. Jun 29, 2012;125(1357):76-87.
12. Czajkowski SM, Powell LH, Adler N, Naar-King S, Reynolds KD, Hunter CM, et al. From ideas to efficacy: the ORBIT model for developing behavioral treatments for chronic diseases. Health Psychol. Oct 2015;34(10):971-982.
13. WHO recommendations on maternal and newborn care for a positive postnatal experience. World Health Organization. URL: https://www.who.int/publications/i/item/9789240045989 [accessed 2024-04-29]
14. The millennium development goals report 2013. WHO-UNFPA. URL: https://www.unfpa.org/publications/millennium-development-goals-report-2013 [accessed 2024-04-29]
15. Sánchez-Polán M, Franco E, Silva-José C, Gil-Ares J, Pérez-Tejero J, Barakat R, et al. Exercise during pregnancy and prenatal depression: a systematic review and meta-analysis. Front Physiol. 2021;12:640024.
16. Hutchens BF, Kearney J. Risk factors for postpartum depression: an umbrella review. J Midwifery Womens Health. Jan 22, 2020;65(1):96-108.
17. Walker AL, de Rooij SR, Dimitrova MV, Witteveen AB, Verhoeven CJ, de Jonge A, et al. Psychosocial and peripartum determinants of postpartum depression: findings from a prospective population-based cohort. The ABCD study. Compr Psychiatry. Jul 2021;108:152239.
18. Howard LM, Khalifeh H. Perinatal mental health: a review of progress and challenges. World Psychiatry. Oct 2020;19(3):313-327.
19. Clark M. Maternal depression costs society billions each year, new model finds. Center For Children and Families. 2019. URL: https://ccf.georgetown.edu/2019/05/31/maternal-depression-costs-society-billions-each-year-new-model-finds/ [accessed 2024-04-29]
20. Pollack LM, Chen J, Cox S, Luo F, Robbins CL, Tevendale HD, et al. Healthcare utilization and costs associated with perinatal depression among Medicaid enrollees. Am J Prev Med. Jun 2022;62(6):e333-e341.
21. Rokicki S, McGovern M, Von Jaglinsky A, Reichman NE. Depression in the postpartum year and life course economic trajectories. Am J Prev Med. Feb 2022;62(2):165-173.
22. Button S, Thornton A, Lee S, Shakespeare J, Ayers S. Seeking help for perinatal psychological distress: a meta-synthesis of women’s experiences. Br J Gen Pract. Aug 28, 2017;67(663):e692-e699.
23. Fonseca A, Ganho-Ávila A, Lambregtse-van den Berg M, Lupattelli A, Rodriguez-Muñoz MF, Ferreira P, et al. Emerging issues and questions on peripartum depression prevention, diagnosis and treatment: a consensus report from the cost action riseup-PPD. J Affect Disord. Sep 01, 2020;274:167-173.
24. Fonseca A, Silva S, Canavarro MC. Depression literacy and awareness of psychopathological symptoms during the perinatal period. J Obstet Gynecol Neonatal Nurs. Mar 2017;46(2):197-208.
25. Hussain-Shamsy N, Shah A, Vigod SN, Zaheer J, Seto E. Mobile health for perinatal depression and anxiety: scoping review. J Med Internet Res. Apr 13, 2020;22(4):e17011.
26. van den Heuvel JF, Groenhof TK, Veerbeek JH, van Solinge WW, Lely AT, Franx A, et al. eHealth as the next-generation perinatal care: an overview of the literature. J Med Internet Res. Jun 05, 2018;20(6):e202.
27. Richards DA. Stepped care: a method to deliver increased access to psychological therapies. Can J Psychiatry. Apr 01, 2012;57(4):210-215.
28. Fonseca A, Gorayeb R, Canavarro MC. Women’s help-seeking behaviours for depressive symptoms during the perinatal period: socio-demographic and clinical correlates and perceived barriers to seeking professional help. Midwifery. Dec 2015;31(12):1177-1185.
29. Dimidjian S, Barrera M, Martell C, Muñoz RF, Lewinsohn PM. The origins and current status of behavioral activation treatments for depression. Annu Rev Clin Psychol. 2011;7:1-38.
30. Dimidjian S, Goodman SH, Sherwood NE, Simon GE, Ludman E, Gallop R, et al. A pragmatic randomized clinical trial of behavioral activation for depressed pregnant women. J Consult Clin Psychol. Jan 2017;85(1):26-36.
31. Jacobson NS, Martell CR, Dimidjian S. Behavioral activation treatment for depression: returning to contextual roots. Clin Psychol Sci Pract. Feb 2001;8(3):255-270.
32. Soucy Chartier I, Provencher MD. Behavioural activation for depression: efficacy, effectiveness and dissemination. J Affect Disord. Mar 05, 2013;145(3):292-299.
33. Kanter JW, Manos RC, Bowe WM, Baruch DE, Busch AM, Rusch LC. What is behavioral activation? a review of the empirical literature. Clin Psychol Rev. Aug 2010;30(6):608-620.
34. Mancinelli E, Bassi G, Gabrielli S, Salcuni S. The efficacy of digital cognitive-behavioral interventions in supporting the psychological adjustment and sleep quality of pregnant women with sub-clinical symptoms: a systematic review and meta-analysis. Int J Environ Res Public Health. Aug 03, 2022;19(15):9549.
35. Mancinelli E, Dell'Arciprete G, Pattarozzi D, Gabrielli S, Salcuni S. Digital behavioral activation interventions during the perinatal period: scoping review. JMIR Pediatr Parent. Feb 28, 2023;6:e40937.
36. Mancinelli E, Gabrielli S, Salcuni S. A digital behavioral activation intervention (JuNEX) for pregnant women with sub-clinical depression symptoms: an explorative co-design study. JMIR Hum Factors. May 16, 2024;11:e50098.
37. Provoost S, Lau HM, Ruwaard J, Riper H. Embodied conversational agents in clinical psychology: a scoping review. J Med Internet Res. May 09, 2017;19(5):e151.
38. Gaffney H, Mansell W, Tai S. Conversational agents in the treatment of mental health problems: mixed-method systematic review. JMIR Ment Health. Oct 18, 2019;6(10):e14166.
39. Kellogg KC, Sadeh-Sharvit S. Pragmatic AI-augmentation in mental healthcare: key technologies, potential benefits, and real-world challenges and solutions for frontline clinicians. Front Psychiatry. Sep 6, 2022;13:990370.
40. Sedlakova J, Trachsel M. Conversational artificial intelligence in psychotherapy: a new therapeutic tool or agent? Am J Bioeth. May 2023;23(5):4-13.
41. Noorbergen TJ, Adam MT, Teubner T, Collins CE. Using co-design in mobile health system development: a qualitative study with experts in co-design and mobile health system development. JMIR Mhealth Uhealth. Nov 10, 2021;9(11):e27896.
42. Baxter P, Jack S. Qualitative case study methodology: study design and implementation for novice researchers. Qual Rep. Jan 14, 2015;13(4):544-559.
43. World Medical Association. Dichiarazione di helsinki della world medical association principi etici per la ricerca biomedica che coinvolge gli esseri umani. Evidence. 1964;5(10):e1000059.
44. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. Sep 2001;16(9):606-613.
45. Ji X, Abdoli S. Challenges and opportunities in product life cycle management in the context of industry 4.0. Procedia CIRP. 2023;119:29-34.
46. Kiritsis D, Bufardi A, Xirouchakis P. Research issues on product lifecycle management and information tracking using smart embedded systems. Adv Eng Inform. Jul 2003;17(3-4):189-202.
47. Lejuez CW, Hopko DR, Acierno R, Daughters SB, Pagoto SL. Ten year revision of the brief behavioral activation treatment for depression: revised treatment manual. Behav Modif. Mar 2011;35(2):111-161.
48. Bassi G, Giuliano C, Perinelli A, Forti S, Gabrielli S, Salcuni S. A virtual coach (Motibot) for supporting healthy coping strategies among adults with diabetes: proof-of-concept study. JMIR Hum Factors. Jan 21, 2022;9(1):e32211.
49. The future of customer experience. RASA Technologies. URL: https://rasa.com/ [accessed 2023-11-30]
50. Dinesh PM, Sujitha V, Salma C, Srijayapriya B. A review on natural language processing: back to basics. In: Raj JS, Iliyasu AM, Bestak R, Baig ZA, editors. Lecture Notes on Data Engineering and Communications Technologies. Singapore, Singapore. Springer; 2021:655-661.
51. Shevlin M, Butter S, McBride O, Murphy J, Gibson-Miller J, Hartman TK, et al. Measurement invariance of the patient health questionnaire (phq-9) and generalized anxiety disorder scale (GAD-7) across four European countries during the COVID-19 pandemic. BMC Psychiatry. Mar 01, 2022;22(1):154.
52. Benvenuti P, Ferrara M, Niccolai C, Valoriani V, Cox JL. The Edinburgh postnatal depression scale: validation for an Italian sample. J Affect Disord. May 1999;53(2):137-141.
53. Cox JL, Holden JM, Sagovsky R. Detection of postnatal depression. Development of the 10-item Edinburgh postnatal depression scale. Br J Psychiatry. Jun 2, 1987;150(06):782-786.
54. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 4th edition. New York, NY. American Psychiatric Press; 1994.
55. Spitzer RL, Kroenke K, Williams JB, Löwe B. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med. May 22, 2006;166(10):1092-1097.
56. Cohen S, Kamarck T, Mermelstein R. A global measure of perceived stress. J Health Soc Behav. Dec 1983;24(4):385.
57. Mondo M, Sechi C, Cabras C. Psychometric evaluation of three versions of the Italian Perceived Stress Scale. Curr Psychol. Jan 08, 2019;40(4):1884-1892.
58. Manos RC, Kanter JW, Luo W. The behavioral activation for depression scale-short form: development and validation. Behav Ther. Dec 2011;42(4):726-739.
59. Armento ME, Hopko DR. The Environmental Reward Observation Scale (EROS): development, validity, and reliability. Behav Ther. Jun 2007;38(2):107-119.
60. Domnich A, Arata L, Amicizia D, Signori A, Patrick B, Stoyanov S, et al. Development and validation of the Italian version of the mobile application rating scale and its generalisability to apps targeting primary prevention. BMC Med Inform Decis Mak. Jul 07, 2016;16(1):83.
61. O’Brien HL, Cairns P, Hall M. A practical approach to measuring user engagement with the refined user engagement scale (UES) and new UES short form. Int J Hum Comput Interact. Apr 2018;112:28-39.
62. Benoit K, Watanabe K, Wang H, Nulty P, Obeng A, Müller S, et al. quanteda: an R package for the quantitative analysis of textual data. J Open Source Softw. 2020;3(30):774.
63. Yan SJ, Wadley G, Vo T, Xiao Y, Truong H. Design and evaluation of a prototype ‘hibaby’ for pregnancy assistance. In: Proceedings of the 2022 International Conference on Human Machine Interaction. 2022. Presented at: ICHMI '22; May 6-8, 2022:53-59; Beijing, China. URL: https://dl.acm.org/doi/abs/10.1145/3560470.3560477
64. Bjelica A, Cetkovic N, Trninic-Pjevic A, Mladenovic-Segedi L. The phenomenon of pregnancy — a psychological view. Ginekol Pol. Feb 28, 2018;89(2):102-106.
65. Varma DS, Mualem M, Goodin A, Gurka KK, Wen TS, Gurka MJ, et al. Acceptability of an mHealth app for monitoring perinatal and postpartum mental health: qualitative study with women and providers. JMIR Form Res. Jun 07, 2023;7:e44500.
66. Blackmore R, Boyle JA, Gray KM, Willey S, Highet N, Gibson-Helm M. Introducing and integrating perinatal mental health screening: development of an equity-informed evidence-based approach. Health Expect. Oct 24, 2022;25(5):2287-2298.
67. ICOPE: perinatal mental health digital screening. Center of Perinatal Excellence. URL: https://www.cope.org.au/health-professionals/icope-digital-screening/ [accessed 2024-04-29]
68. Yin X, Sun N, Jiang N, Xu X, Gan Y, Zhang J, et al. Prevalence and associated factors of antenatal depression: systematic reviews and meta-analyses. Clin Psychol Rev. Feb 2021;83:101932.
69. Agostini F, Neri E, Salvatori P, Dellabartola S, Bozicevic L, Monti F. Antenatal depressive symptoms associated with specific life events and sources of social support among Italian women. Matern Child Health J. May 11, 2015;19(5):1131-1141.
70. Cena L, Mirabella F, Palumbo G, Gigantesco A, Trainini A, Stefana A. Prevalence of maternal antenatal and postnatal depression and their association with sociodemographic and socioeconomic factors: a multicentre study in Italy. J Affect Disord. Jan 15, 2021;279:217-221.
71. Corbani IE, Rucci P, Iapichino E, Quartieri Bollani M, Cauli G, Ceruti MR, et al. Comparing the prevalence and the risk profile for antenatal depressive symptoms across cultures. Int J Soc Psychiatry. Nov 14, 2017;63(7):622-631.
72. Giardinelli L, Innocenti A, Benni L, Stefanini MC, Lino G, Lunardi C, et al. Depression and anxiety in perinatal period: prevalence and risk factors in an Italian sample. Arch Womens Ment Health. Feb 29, 2012;15(1):21-30.
73. Caparros-Gonzalez RA, Romero-Gonzalez B, Strivens-Vilchez H, Gonzalez-Perez R, Martinez-Augustin O, Peralta-Ramirez MI. Hair cortisol levels, psychological stress and psychopathological symptoms as predictors of postpartum depression. PLoS One. Aug 28, 2017;12(8):e0182817.
74. Lagadec N, Steinecker M, Kapassi A, Magnier AM, Chastang J, Robert S, et al. Factors influencing the quality of life of pregnant women: a systematic review. BMC Pregnancy Childbirth. Nov 23, 2018;18(1):455.
75. Betts KS, Williams GM, Najman JM, Alati R. The relationship between maternal depressive, anxious, and stress symptoms during pregnancy and adult offspring behavioral and emotional problems. Depress Anxiety. Feb 2015;32(2):82-90.
76. Field T, Diego M, Hernandez-Reif M, Figueiredo B, Deeds O, Ascencio A, et al. Comorbid depression and anxiety effects on pregnancy and neonatal outcome. Infant Behav Dev. Feb 2010;33(1):23-29.
77. Talge NM, Neal C, Glover V, Early Stress, Translational Research and Prevention Science Network: Fetal and Neonatal Experience on Child and Adolescent Mental Health. Antenatal maternal stress and long-term effects on child neurodevelopment: how and why? J Child Psychol Psychiatry. Mar 2007;48(3-4):245-261.
78. Baker J. Three levels of health promotion/disease prevention. In: Baker J, editor. Contemporary Health Issues. Davis, CA. LibreTexts; 2020.
79. Batt MM, Duffy KA, Novick AM, Metcalf CA, Epperson CN. Is postpartum depression different from depression occurring outside of the perinatal period? a review of the evidence. Focus (Am Psychiatr Publ). Apr 2020;18(2):106-119.
80. Papadopoulou SK, Pavlidou E, Dakanalis A, Antasouras G, Vorvolakos T, Mentzelou M, et al. Postpartum depression is associated with maternal sociodemographic and anthropometric characteristics, perinatal outcomes, breastfeeding practices, and mediterranean diet adherence. Nutrients. Sep 04, 2023;15(17):3853.
81. Bächle TC, Wernick A. The Futures of eHealth, Social, Ethical and Legal Challenges. Berlin, Germany. Alexander von Humboldt Institute for Internet and Society; 2019.
82. Balcombe L. AI chatbots in digital mental health. Informatics. Oct 27, 2023;10(4):82.
83. Jokinen A, Stolt M, Suhonen R. Ethical issues related to eHealth: an integrative review. Nurs Ethics. Mar 2021;28(2):253-271.
84. Maeckelberghe E, Zdunek K, Marceglia S, Farsides B, Rigby M. The ethical challenges of personalized digital health. Front Med (Lausanne). Jun 19, 2023;10:1123863.
85. Wykes T, Lipshitz J, Schueller SM. Towards the design of ethical standards related to digital mental health and all its applications. Curr Treat Options Psych. Jul 5, 2019;6(3):232-242.
86. Wykes T, Schueller S. Why reviewing apps is not enough: transparency for trust (T4T) principles of responsible health app marketplaces. J Med Internet Res. May 02, 2019;21(5):e12390.

Abbreviations

BA: behavioral activation
ORBIT: Obesity-Related Behavioral Intervention Trials
PHQ-9: Patient Health Questionnaire-9
UE: user engagement
UX: user experience
WHO: World Health Organization

Edited by A Mavragani; submitted 21.03.24; peer-reviewed by AR Yameogo, C-M Huang; comments to author 24.04.24; revised version received 14.05.24; accepted 17.06.24; published 14.08.24.

©Elisa Mancinelli, Simone Magnolini, Silvia Gabrielli, Silvia Salcuni. Originally published in JMIR Formative Research (https://formative.jmir.org), 14.08.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.

Performance management decision-making model: case study on foreign language learning curriculums.

1. Introduction

2. Literature Review

2.1. Questionnaire and PEM

2.2. Index Setting and Statistical Testing

3. Research Methods

4. An Applied Example

5. Conclusions

Author Contributions

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Cheng, S.W. Practical implementation of the process capability indices. Qual. Eng. 1994 , 7 , 239–259. [ Google Scholar ] [ CrossRef ]
Xu, Y.; Zhang, X.; Meng, P. A novel intelligent deep learning-based uncertainty-guided network training in market price. IEEE Trans. Ind. Inform. 2022 , 18 , 5705–5711. [ Google Scholar ] [ CrossRef ]
Chen, H.Y.; Lin, K.P. Fuzzy supplier selection model based on lifetime performance index. Expert Syst. Appl. 2022 , 208 , 118135. [ Google Scholar ] [ CrossRef ]
Durmuş, V. Does the healthcare decentralization provide better public health security capacity and health services satisfaction? An analysis of OECD countries. J. Health Organ. Manag. 2024 , 38 , 209–226. [ Google Scholar ] [ CrossRef ]
Syafrudin, M.; Alfian, G.; Fitriyani, N.L.; Rhee, J. Performance analysis of IoT-based sensor, big data processing, and machine learning model for real-time monitoring system in automotive manufacturing. Sensors 2018 , 18 , 2946. [ Google Scholar ] [ CrossRef ]
Gopalakrishnan, S.; Kumaran, M.S. Iiot framework based ml model to improve automobile industry product. Intell. Autom. Soft Comput. 2022 , 31 , 1435–1449. [ Google Scholar ] [ CrossRef ]


Dimension	Item j				Zone
Dimension 1: teaching preparation	1				Z
	2				Z
	⁝	⁝	⁝	⁝	⁝

⁝	⁝	⁝	⁝	⁝	⁝
⁝	j				Z
⁝	⁝	⁝	⁝	⁝	⁝
Dimension 5: coursework and evaluation	⁝	⁝	⁝	⁝	⁝

Dimension	Item
Dimension 1	1	0.18	0.16	0.20
	2	0.06	0.04	0.08
	3	0.12	0.10	0.14
	4	0.19	0.18	0.20
Dimension 2	5	0.09	0.07	0.11
	6	0.37	0.35	0.39
	7	−0.38	−0.39	−0.37
	8	0.23	0.21	0.25
	9	−0.01	−0.03	0.01
Dimension 3	10	0.02	0.00	0.04
Dimension 3	11	0.05	0.03	0.07
Dimension 4	12	−0.45	−0.46	−0.44
	13	0.09	0.07	0.11
	14	0.21	0.19	0.23
Dimension 5	15	0.01	−0.01	0.03
Dimension 5	16	−0.02	−0.04	0.00

Chen, K.-S.; Yu, C.-M.; Yu, C.-H.; Chen, Y.-P. Performance Management Decision-Making Model: Case Study on Foreign Language Learning Curriculums. Information 2024 , 15 , 481. https://doi.org/10.3390/info15080481


ORIGINAL RESEARCH article

Evaluation of the relationship between dietary acid load and cardiovascular risk factors in patients with type 2 diabetes: a case–control study.

1 Faculty of Health Science, Department of Nutrition and Dietetics, Atılım University, Ankara, Türkiye
2 Faculty of Health Science, Department of Nutrition and Dietetics, Başkent University, Ankara, Türkiye

Background: Diets high in dietary acid load are thought to be associated with metabolic diseases. However, the number of studies examining the relationship between dietary acid load and metabolic diseases in Turkey is insufficient. The aim of this study was to investigate the relationship between cardiovascular disease risk factors and dietary acid load in individuals with type 2 diabetes.

Materials and methods: In this case–control study, 51 participants aged 30–65 years with type 2 diabetes and 59 participants in the control group were included. Blood pressure and biochemical findings were measured. Anthropometric measurements and body composition measurements were made. Dietary intake was assessed using a 3-day (1 day on weekends, 2 days on weekdays) food consumption record. Dietary acid load scores, including potential renal acid load (PRAL) and net endogenous acid production (NEAP), were calculated based on dietary intake. NEAP and PRAL scores were categorized as low and high according to the median value. Smoking status, body mass index (BMI), systolic blood pressure (SBP), diastolic blood pressure (DBP), total cholesterol (TC), triglyceride (TG), high density lipoprotein cholesterol (HDL-C), low density lipoprotein cholesterol (LDL-C), waist-hip ratio (WHR), waist-to-height ratio (WtHR), hemoglobin and fat mass (%) were evaluated as cardiovascular risk factors.

Results: The cut-off values of PRAL and NEAP were 3.61 and 44.78 mEq/d, respectively. After adjustment for various covariates, a significant positive association between PRAL and TG levels was observed in the diabetic group [odds ratio (OR), 5.98; 95% CI, 1.45–24.67; p = 0.013]. In contrast, a negative association was found between PRAL and SBP in the control group [odds ratio (OR), 0.21; 95% CI, 0.05–0.83; p = 0.026]. However, these associations were not observed for NEAP values in either group.

Conclusions: A higher PRAL value was consistently associated with higher TG level, but not with other cardiovascular risk factors. More longitudinal and interventional studies are needed to better establish a causal effect between dietary acid load and cardiovascular risk factors in individuals with diabetes.

Introduction

Diabetes is considered one of the most significant health problems of today. According to the Diabetes Atlas, there were 537 million adult diabetes patients worldwide in 2021, and it is estimated that this number will reach 783 million by 2045. In addition, Turkey has the highest diabetes population among European countries ( 1 ). Diabetes is a global health problem with a rapidly increasing prevalence. It is a chronic disease characterized by high blood glucose levels and abnormalities of carbohydrate, fat and protein metabolism ( 2 ). Chronic hyperglycemia caused by diabetes can cause microvascular and macrovascular complications in the long term.

Diabetes is an important risk factor for cardiovascular disease (CVD) ( 3 ). CVD risk factors such as obesity, hypertension, and dyslipidemia are common in patients with diabetes, especially those with type 2 diabetes ( 4 ). It is important to determine the risk of CVD, including diabetes, in the adult age group because CVD and diabetes generally affect each other, many risk factors are common, and more than one risk factor occurs together in individuals during adulthood ( 5 ). The management of modifiable CVD risk factors such as hyperglycemia, dyslipidemia, obesity, unhealthy diet, and physical inactivity is critically important to minimize the risk of macrovascular complications of diabetes ( 3 ).

Altering dietary habits is an important strategy in managing and preventing CVD risk factors. Food intake can affect the body’s acid–base balance through the intake of acid or base precursors. Sulfur amino acids, which are the main determinants of acid load in the diet, are found in high amounts in foods of animal origin such as meat, eggs, fish and cheese. In contrast, potassium and magnesium in plant foods, and calcium in both plant foods and dairy products, are determinants of alkaline load. A diet high in animal products and other acid-producing foods can lead to an acid load that cannot be compensated by fruit and vegetable consumption, which can result in diet-induced metabolic acidosis ( 6 ). Recent studies have focused on the association between dietary acid load and health-related outcomes, including cardiometabolic risk factors and diabetes ( 6 – 11 ). Even a small reduction in diet-induced metabolic acidosis is thought to improve insulin sensitivity, so reducing the acid load of the diet may be effective in reducing insulin resistance ( 7 ). In one study, the dietary acid loads of 125 newly diagnosed diabetic individuals were found to be similar to those of the control group. Other studies show that increased dietary acid load may be positively associated with later insulin resistance and may increase the risk of diabetes ( 6 , 11 – 13 ). A Korean study reported that dietary acid load was positively associated with the future development of insulin resistance ( 12 ), and a longitudinal study by Moghadam et al. ( 9 ) in Iran emphasized that high dietary acid–base load may be a risk factor for the development of insulin resistance and related metabolic disorders.

The acid-forming potential of foods can be calculated using potential renal acid load (PRAL) and net endogenous acid production (NEAP). PRAL, developed by Remer et al. ( 14 ), takes into account different intestinal absorption rates of nutrients, ionic balances for calcium and magnesium, and dissociation of phosphate at pH 7.4. A positive PRAL score reflects acid-forming potential, while a negative score indicates alkaline-forming potential. Frassetto and colleagues ( 15 ) proposed a computational model focusing on (total) protein and potassium, which are thought to be the main variables responsible for NEAP. These methods are used to estimate acid loads from food intake and are frequently used in epidemiological studies. Because dietary acid load is related to urinary acid load measured from 24-h urine, it provides a simple and useful tool to assess the acidity of the diet ( 14 , 15 ).

The number of studies examining the relationship between diabetes, CVD and dietary acid load is limited, and the findings of studies relating diabetes, cardiometabolic risk factors and dietary acid load are inconsistent. Therefore, the aim of this study is to examine the relationship between dietary acid load and cardiometabolic risk factors in patients with diabetes.

Materials and methods

Study design and participants

In this case–control study, participants aged 30–65 years with a diagnosis of type 2 diabetes according to the American Diabetes Association criteria, and age- and gender-matched controls, who applied to the Ankara Başkent University Hospital Endocrinology and Metabolic Diseases Outpatient Clinic between November 2019 and December 2020 were included. Diabetic patients were excluded if they had a history of any chronic disease such as CVD, cancer (including those with a history), kidney disease, gastrointestinal disorders, or liver and lung diseases; had an acute infection; were following any special diet or physical activity program; had a daily energy intake outside the 800–4,200 kcal range; or were pregnant or lactating. Patients who applied to the outpatient clinic and met these criteria were included in the study. The control group was selected from patients residing in Ankara who had had their blood glucose checked within the last 6 months and were eligible under the exclusion criteria. Control participants were excluded for a history of any chronic disease such as CVD, cancer (including those with a history), kidney disease, gastrointestinal disorders, or liver and lung diseases; acute infection; adherence to a specific lifestyle (diet and/or physical activity); use of medication that may affect weight and diet; pregnancy or lactation; or a daily energy intake outside the 800–4,200 kcal range. Urine albumin-to-creatinine ratio and estimated glomerular filtration rate (eGFR) were analyzed to assess renal function in individuals thought to be affected by dietary acid load. Participants with urine albumin-creatinine ratio > 30 mg/g and eGFR <60 mL/min/1.73 m² were also excluded from the study. eGFR was calculated using the chronic kidney disease epidemiology collaboration equation (CKD-EPI equation, http://www.nkdep.nih.gov ). Sixty people in the diabetes group and 64 people in the control group were initially enrolled. Participants with high urine albumin-creatinine levels or low eGFR levels, and those whose body composition measurement data and food consumption records could not be obtained due to the pandemic, were then excluded (9 people in the diabetes group and 5 people in the control group). Accordingly, the study was conducted with 51 people in the diabetes group and 59 people in the control group.

Biochemical parameters

All laboratory assessments were measured after a 10–12 h overnight fast. The blood pressures and biochemical findings of the patients were taken by the nurse working in the hospital. Fasting blood glucose (FBG), low-density lipoprotein (LDL-C) and high-density lipoprotein (HDL-C) cholesterol, total cholesterol (TC), triglyceride (TG), hemoglobin, serum creatinine, eGFR and urine albumin/creatinine values were collected from biochemical test values routinely obtained at Başkent University Ankara Hospital. The fasting blood glucose collected was used to confirm that individuals in the control group did not have prediabetes or diabetes. The biochemical findings of the individuals who accepted the study were obtained from the medical records. Blood pressure (mmHg) was measured from the left arm using a mercury manometer while the person was sitting and calm.

Hypertension [systolic blood pressure (SBP) ≥ 130 mm Hg and diastolic blood pressure (DBP) ≥ 85 mm Hg], blood lipids [dyslipidemia LDL-C (≥130 mg/dL), HDL-C (male <40 mg/dL, female <50 mg/dL), TG (≥150 mg/dL)], were evaluated according to the National Cholesterol Education Program, Adult Treatment Panel III diagnostic criteria ( 16 ).

Assessment of other variables

Demographic information (age, sex, marital status, smoking and education level, etc.) was collected by face-to-face interviews with the participants, anthropometric measurements were made and a 3-day food consumption record (1 day on weekends, 2 days on weekdays) was obtained.

Body weight was measured in light clothing and without shoes using the TANITA TBF-300 (TANITA Corp., Tokyo, Japan) body composition monitoring scale. Body fat mass (FM) percentage and body fat-free mass (FFM) percentage were obtained using the same device. Body height was measured using a tape measure (Seca scale; Seca Hamburg, Germany) in a standing position without shoes, with the shoulders in a normal position. Body mass index (BMI) (kg/m²) was calculated by dividing weight in kilograms by the square of height in meters. BMI was classified according to the cut-off values reported by the World Health Organization (WHO; overweight and obesity: BMI ≥25 kg/m²) ( 17 ). Waist circumference (WC) and hip circumference (HC) were measured with an accuracy of 0.1 cm using standard methods, with a tape measure applied without any pressure on the body surface. The waist-hip ratio (WHR) was calculated by dividing WC by HC, and the waist-to-height ratio (WtHR) by dividing WC by height. All measurements were obtained as described previously and taken by a trained dietician.
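
The anthropometric indices described above are simple ratios. The following Python sketch shows one way to compute them; the function names and the example participant are hypothetical and are not taken from the study data.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def waist_hip_ratio(waist_cm: float, hip_cm: float) -> float:
    """WHR: waist circumference divided by hip circumference (same units)."""
    return waist_cm / hip_cm


def waist_height_ratio(waist_cm: float, height_cm: float) -> float:
    """WtHR: waist circumference divided by height (same units)."""
    return waist_cm / height_cm


def is_overweight_or_obese(bmi_value: float) -> bool:
    """WHO cut-off quoted in the text: BMI >= 25 kg/m^2."""
    return bmi_value >= 25.0


# Hypothetical participant, for illustration only.
example_bmi = bmi(weight_kg=80.0, height_m=1.75)
print(round(example_bmi, 1), is_overweight_or_obese(example_bmi))
print(round(waist_hip_ratio(94.0, 104.0), 2), round(waist_height_ratio(94.0, 175.0), 2))
```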

Dietary assessment and definition of dietary acid load

In order to evaluate the daily energy and nutrients in the diet and to calculate the dietary acid load, 3-day 24-h food consumption records were taken from the individuals participating in the study, 2 days on weekdays and 1 on weekends. The daily diet, energy and nutrient intake from these data were analyzed using the “Computer Assisted Nutrition Program, Nutrition Information Systems Package Program (BEBIS)” developed for Turkey.

Various formulas have been used recently to estimate dietary acid load. The first is a physiologically based computational model used to estimate the PRAL of foods. This model predicts how far endogenous acid production exceeds alkali production for a given amount of nutrients ingested daily ( 14 , 15 ).

PRAL was calculated using the following algorithm:
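
The equation itself is missing from this copy of the article. The form usually attributed to Remer et al. ( 14 ), with coefficients rounded to two significant figures, is given here on the assumption that it is the one the authors applied:

```latex
\mathrm{PRAL}\ (\mathrm{mEq/d}) =
  0.49 \times \text{protein (g/d)}
  + 0.037 \times \text{phosphorus (mg/d)}
  - 0.021 \times \text{potassium (mg/d)}
  - 0.026 \times \text{magnesium (mg/d)}
  - 0.013 \times \text{calcium (mg/d)}
```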

The calculation formula of the NEAP value, which is the second model used to calculate the dietary acid load of foods, is shown below:
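
The NEAP equation is likewise missing from this copy. The commonly cited estimate of Frassetto and colleagues ( 15 ) is NEAP (mEq/d) = 54.5 × protein (g/d) / potassium (mEq/d) − 10.2. The Python sketch below assumes that form, together with the usual rounded Remer coefficients for PRAL, and converts potassium from mg to mEq by dividing by 39.1. The function names and intake values are hypothetical and are not taken from the study.

```python
# Approximate molar mass of potassium; with a valence of 1, mg / 39.1 gives mEq.
POTASSIUM_MG_PER_MEQ = 39.1


def pral(protein_g, phosphorus_mg, potassium_mg, magnesium_mg, calcium_mg):
    """Potential renal acid load (mEq/d), assumed Remer-type coefficients."""
    return (
        0.49 * protein_g
        + 0.037 * phosphorus_mg
        - 0.021 * potassium_mg
        - 0.026 * magnesium_mg
        - 0.013 * calcium_mg
    )


def neap(protein_g, potassium_mg):
    """Net endogenous acid production (mEq/d), assumed Frassetto-type estimate."""
    potassium_meq = potassium_mg / POTASSIUM_MG_PER_MEQ
    return 54.5 * protein_g / potassium_meq - 10.2


# Hypothetical mean daily intake from a 3-day food record, for illustration only.
intake = {
    "protein_g": 80.0,
    "phosphorus_mg": 1200.0,
    "potassium_mg": 2700.0,
    "magnesium_mg": 300.0,
    "calcium_mg": 800.0,
}
pral_value = pral(**intake)
neap_value = neap(intake["protein_g"], intake["potassium_mg"])
print(f"PRAL = {pral_value:.2f} mEq/d, NEAP = {neap_value:.2f} mEq/d")
```

A positive PRAL from this calculation indicates acid-forming potential and a negative value indicates alkaline-forming potential, as described in the Introduction.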

Statistical analysis

In the statistical analysis, the Shapiro–Wilk test was first used to assess whether the numerical variables conformed to a normal distribution. The independent samples t-test was used for two-group comparisons of normally distributed variables, and the Mann–Whitney U test was used for those that did not conform to a normal distribution. The relationships between grouped variables were examined by correlation analysis, taking the expected cell counts into account: the Fisher test was used when expected counts were below 5, and the Pearson chi-square test when they were greater than 5. Logistic regression analysis was then applied with the variables found to be statistically significant, to determine the factors affecting the PRAL and NEAP groups. Participants whose PRAL and NEAP values were below the median (Q2) were classified as low level, and those above as high level. Binary logistic regression was applied, taking the low/high PRAL and NEAP groups as dependent variables. Two models were created: Model 1 (unadjusted) and Model 2 (adjusted). In the adjusted model, age, sex, marital status and BMI were controlled. A p-value of <0.05 was considered statistically significant. The hypothesis tests were carried out using the IBM SPSS 26 program.
NOTE: During the regression analysis, the FFM (%) variable was found to cause a multicollinearity problem (OR > 24,000) and was excluded from all analyses.
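
The study ran its analysis in SPSS; the following Python sketch only illustrates the two steps described above under stated assumptions. It splits scores at the median and computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2×2 table, which matches what a binary logistic regression with a single binary predictor would give. All names, scores and cell counts are hypothetical, and the treatment of values exactly equal to the median is an assumption.

```python
import math
from statistics import median


def median_split(values):
    """Label each value 'high' if it lies above the sample median, else 'low'.

    Labelling values equal to the median as 'low' is an assumption made here.
    """
    cut = median(values)
    return ["high" if v > cut else "low" for v in values]


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: risk factor present, high PRAL    b: risk factor present, low PRAL
    c: risk factor absent,  high PRAL    d: risk factor absent,  low PRAL
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical PRAL scores (mEq/d) and hypothetical cell counts, for illustration only.
pral_scores = [-4.2, 1.0, 3.1, 5.6, 9.8, 12.4]
labels = median_split(pral_scores)
estimate, lower, upper = odds_ratio_with_ci(a=12, b=4, c=10, d=20)
print(labels)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The adjusted model (Model 2) cannot be reproduced this way, because it requires fitting a multivariable logistic regression with age, sex, marital status and BMI as covariates.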

Results

The findings for the cases and controls included in the study are presented in Table 1 . The average age was 51.5 years in the diabetes group and 48.6 years in the control group ( p > 0.05). Body fat (%), TG, SBP and DBP values were significantly higher in the diabetes group than in the control group ( p < 0.05). There was no significant difference between the groups in mean PRAL and NEAP values ( p > 0.05) ( Table 1 ). The average FM% of the individuals in the study was 34.9 ± 13.89 and the average hemoglobin value was 8.2 ± 1.37. The median PRAL value was 3.612 and the median NEAP value was 44.783.

Table 1 . Findings of individuals included in the research.

The characteristics of the individuals in the diabetes and control groups, compared between PRAL groups, are given in Table 2 . In the diabetes group, there is a statistically significant relationship between PRAL group and sex, TG and HDL-C ( p < 0.05). Women tended to follow a diet with a low PRAL value. Most individuals with TG values below 150 mg/dL had low PRAL values, and the majority of individuals with high HDL-C values also had low PRAL values. In the control group, PRAL value was significantly related only to SBP and HDL-C ( p < 0.05); no significant relationship was found with the other cardiovascular risk factors. Most individuals in the control group with low SBP values had higher PRAL values. Furthermore, individuals with low HDL-C values in the control group tended to have PRAL values greater than 3.612.

Table 2 . Relationships between PRAL and characteristics of individuals in the diabetes and control groups.

The characteristics of individuals in the diabetes and control groups, compared between NEAP groups, are presented in Table 3 . In both groups, there was no statistically significant relationship between NEAP value and smoking, BMI, SBP, DBP, TC, TG, LDL-C, WHR, WtHR, physical activity or hemoglobin values ( p > 0.05). However, there was a statistically significant relationship between NEAP value and sex in both groups ( p < 0.05): men tended to eat diets with a high dietary acid load, while women tended to eat diets with a lower one. In addition, in the control group, the majority of individuals with low NEAP values had high HDL-C values ( p < 0.05).

Table 3 . Relationships between NEAP and characteristics of individuals in the diabetes and control groups.

The results of the regression model in which the PRAL variable was taken as the dependent variable are given in Table 4 . In the diabetes group, TG has a significant effect on PRAL in both the unadjusted and adjusted models ( p < 0.05). Individuals with TG higher than 150 mg/dL have higher odds of a high PRAL than individuals with low TG (OR = 5.983), and the odds ratio is about 1.7 times larger in the adjusted model (OR = 10.226). HDL-C and FM (%) do not have a significant effect on PRAL in either the unadjusted or the adjusted model. In the control group, SBP has a significant effect on PRAL in both the unadjusted and adjusted models ( p < 0.05). In the unadjusted model, individuals with SBP higher than 130 mm Hg have 78.8% lower odds of a high PRAL; in the adjusted model the reduction is larger still (83.2%). HDL-C does not have a significant effect on PRAL in either model.
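
The percentages quoted for the control group are odds ratios re-expressed as a percent change in odds, (OR − 1) × 100. The short Python sketch below shows the conversion. The odds ratios 0.212 and 0.168 are inferred from the reported 78.8% and 83.2% and are not quoted from Table 4; the function name is hypothetical.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in odds implied by an odds ratio (negative means lower odds)."""
    return (odds_ratio - 1.0) * 100.0


# Odds ratios inferred from the percentages reported for SBP in the control group.
unadjusted_sbp = percent_change_in_odds(0.212)
adjusted_sbp = percent_change_in_odds(0.168)

# Ratio of the adjusted to the unadjusted odds ratio reported for TG in the diabetes group.
tg_ratio = 10.226 / 5.983

print(round(unadjusted_sbp, 1), round(adjusted_sbp, 1), round(tg_ratio, 2))
```

Note that these figures describe changes in odds, which are not the same as changes in probability unless the outcome is rare.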

Table 4 . Regression model with PRAL variable as dependent variable.

Table 5 shows the results of the regression models in which the NEAP variable was taken as the dependent variable. In the diabetes group, FM (%) did not have a significant effect on NEAP in either the unadjusted or the adjusted model ( p > 0.05). In the control group, HDL-C did not have a significant effect on NEAP in either model ( p > 0.05).

Table 5 . Regression model where the NEAP variable is taken as the dependent variable.

Discussion

This case–control study evaluated the potential associations between dietary acid load and CVD risk factors in individuals with diabetes. PRAL and NEAP were used to determine dietary acid load, and each was divided into low and high groups according to its median value. Significant associations were found between PRAL value and TG, sex and HDL-C in the diabetes group, while only sex was associated with NEAP value. After adjusting for potential confounding factors, TG in the diabetes group and SBP in the control group were associated with PRAL value, whereas no such association was found for dietary acid load as defined by NEAP. No significant association was found between the other CVD risk factors and NEAP or PRAL values. These findings suggest that individuals at high cardiovascular risk may tend to eat diets with a high dietary acid load.

Common conditions accompanying diabetes, such as hypertension, dyslipidemia, obesity and insulin resistance, form the basis of CVD ( 4 ). Previous studies have reported associations between dietary acid load and CVD risk factors ( 9 , 10 , 18 ). In a study conducted in Japan in 2008 ( 9 ), a positive association was found between PRAL values and HDL-C, but no significant relationship was found with cardiovascular risk factors such as TG, BMI and smoking. In a retrospective cross-sectional study conducted in Korea, PRAL was significantly related to LDL-C, smoking and BMI, but not to WC, total cholesterol, HDL-C or diabetes ( 18 ). A meta-analysis examining the association of PRAL and NEAP with CVD and lipids reported that six studies found a positive association between dietary acid load and lipids, while the other studies found no significant association ( 10 ). In contrast, there are also studies reporting no independent relationship between CVD risk factors and dietary acid load ( 19 , 20 ). In our study, an independent relationship was found between PRAL and HDL-C, TG and FM% in the diabetes group, whereas PRAL was associated with HDL-C in the control group ( Table 2 ). Studies have shown that protein intake and protein types can affect HDL-C levels, so the amount and type of protein may have affected both PRAL and HDL-C levels ( 21 , 22 ). Overall, the evidence on the relationship between CVD risk factors and dietary acid load appears contradictory. In this study, in the diabetes group, only TG was associated with PRAL among the CVD risk factors ( Table 4 ), and no association was observed between NEAP and the risk factors ( Table 5 ). This emphasizes the need for further longitudinal studies of dietary acid load and CVD risk factors.

In this study, a negative significant relationship was found between SBP and PRAL in the control group, and this significant relationship continued after all adjustments were made. Although the Polish study ( 20 ) and the E3N-EPIC cohort study ( 6 ) found no significant association between dietary acid load and hypertension prevalence, the Rotterdam study ( 23 ) reported that higher PRAL values were associated with blood pressure. This may be explained by the relatively low dietary acid-forming potential of individuals with diabetes in this study and other populations. Furthermore, since the use of medications that affect blood pressure and blood lipid levels of individuals with diabetes and the control group was not included as an exclusion criterion, this may constitute a potential confounding factor.

There are studies suggesting that the strength of the association between dietary acid load and both diabetes and CVD is inconsistent due to the different indices used and that gender may be a potential confounding factor in this difference. In a meta-analysis including seven prospective cohort studies, it was determined that a diet high in dietary acid load may increase the risk of diabetes, but this relationship was significant only in women. While a linear relationship was found between NEAP score and diabetes risk in women, this relationship was observed to be U-shaped in PRAL score ( 13 ). Three cohort studies conducted in diabetes showed that the association between dietary acid load score and diabetes was significant only among women ( 11 ). In a study conducted in Japan, it was observed that PRAL was associated with the risk of diabetes only in young men, but this relationship was not found between the NEAP value and diabetes. Some studies have shown that men have a higher dietary acid load and this study is similar to these findings ( 24 ). However, there are studies showing that dietary acid load values are similar in both genders ( 25 ). CVD has long been considered a condition that primarily affects men, but the actual lifetime risk of CVD appears to be similar for men and women. Moreover, a meta-analysis study found that women with diabetes have a 50% higher risk of fatal CVD compared to men with diabetes ( 26 ). Women tend to adopt a diet with a lower dietary acid load. However, considering that a diet with high dietary acid load may have an impact on the development of both CVD and diabetes, and since the prevalence of obesity is higher in women than men worldwide, gender-specific studies are required.

Due to the limited number of studies examining dietary acid load, the relationship between dietary acid load and metabolic diseases is not fully understood. It is reported that the main mechanism linking dietary acid load and metabolic disease risk is insulin resistance ( 6 ). However, high blood acidity may also predispose to various metabolic complications such as mineral excretion, increased blood pressure and higher cortisol secretion ( 27 ). Metabolic acidosis causes increased production of acid-forming metabolites in the body, which may lead to the release of plasma glucocorticoids, resulting in increased cortisol that supports visceral obesity and insulin resistance ( 28 ). Therefore, even in healthy individuals, there is a risk that very low-grade metabolic acidosis causes hyperglycemia by inducing insulin resistance ( 7 ). As dietary acid load increases, urinary citrate excretion decreases, and it is thought that low urinary citrate excretion may be associated with insulin resistance ( 7 , 29 ). Potassium and magnesium, obtained mostly from plant foods, have an important role in acid–base balance. Insufficient intake of fruits and vegetables, and therefore of potassium and magnesium, shifts the pH balance toward acidosis, which may disrupt the β-cell response and lead to insulin resistance ( 30 , 31 ). Finally, it has been reported that increased urinary excretion of minerals such as calcium and magnesium, which are necessary for the insulin response, may cause significant insulin dysfunction ( 32 ).

This study has strengths and limitations. As a first strength, it is the first study in our country to examine the relationship between CVD risk factors and dietary acid load in individuals with type 2 diabetes. Second, because nephropathy can develop in individuals with diabetes, the participants’ kidney function, which is critical in determining acid–base homeostasis, was controlled for: we attempted to reduce the impact of chronic metabolic acidosis or alkalosis by excluding individuals with chronic kidney disease, liver failure or cirrhosis, congestive heart failure or a history of CVD, and chronic obstructive pulmonary disease. The study also has some limitations. First, given the case–control design, we could not draw a causal conclusion as to whether a high dietary acid load leads to the development of cardiometabolic diseases or vice versa. Therefore, interventional studies are needed to determine whether dietary acidity has an effect on the development of cardiometabolic diseases. Second, individuals’ dietary intakes were recorded with a 3-day food consumption record. Inaccurate reporting of dietary intake, especially by obese individuals, is an important problem with diet assessment methods based on self-reports. Also, compared to direct observation of food intake, self-reporting typically results in incomplete reporting of food intake. Third, PRAL and NEAP values were estimated from self-reported 3-day dietary intake and were not evaluated objectively. Moreover, equations that estimate dietary acid load, such as PRAL and NEAP, do not take into account changes in dietary patterns over time, the actual nutritional composition of specific meals, preparation methods, or absorption of nutrients in the gastrointestinal tract. Dietary PRAL and NEAP scores are frequently used in epidemiological studies and, although they are highly correlated with measured acid load, they may be affected by inaccurate dietary reports ( 14 , 15 ).
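To make concrete what these scores depend on, the following is a minimal sketch of how PRAL and NEAP are typically computed from nutrient intakes. It assumes the commonly cited forms of the Remer and Manz equation ( 14 ) and the Frassetto equation ( 15 ); the coefficients are quoted from those widely used formulas rather than from this section, and the function and parameter names are illustrative only.

```python
# Sketch only: commonly cited PRAL and NEAP formulas (assumed, not taken
# from this section). All inputs are daily intakes.

K_MG_PER_MEQ = 39.1  # approximate mg of potassium per mEq (atomic mass, valence 1)


def pral(protein_g, phosphorus_mg, potassium_mg, magnesium_mg, calcium_mg):
    """Potential renal acid load in mEq/day (Remer and Manz form)."""
    return (
        0.49 * protein_g
        + 0.037 * phosphorus_mg
        - 0.021 * potassium_mg
        - 0.026 * magnesium_mg
        - 0.013 * calcium_mg
    )


def neap(protein_g, potassium_mg):
    """Net endogenous acid production in mEq/day (Frassetto form)."""
    if potassium_mg <= 0:
        raise ValueError("potassium intake must be positive")
    potassium_meq = potassium_mg / K_MG_PER_MEQ
    return 54.5 * protein_g / potassium_meq - 10.2
```

Because every term is a reported intake, any under-reporting in the 3-day food record propagates directly into both scores, which is the limitation described above.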

In conclusion, after correcting for possible confounding factors, we found that a higher PRAL value was associated with higher TG in individuals with diabetes, but we did not observe any association between NEAP value and risk factors. Aiming for an improvement in dietary acid–base balance may be a useful strategy for preventing cardiometabolic disorders. However, further prospective studies are needed to better observe the effects of dietary acid–base load on diabetes and cardiometabolic risk factors.

Data availability statement

The datasets presented in this article are not readily available because of privacy or ethical restrictions. Requests to access the datasets should be directed to SG, [email protected] .

Ethics statement

The studies involving humans were approved by the Baskent University Institutional Review Board and Ethics Committee. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

SG: Conceptualization, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing. MS: Conceptualization, Methodology, Supervision, Visualization, Writing – original draft, Writing – review & editing.

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This study was supported by Başkent University Research Fund.

Acknowledgments

This study was derived from the thesis completed in the PhD program of the Department of Nutrition and Dietetics, Başkent University Institute of Health Sciences. We thank the staff of the Endocrinology and Metabolic Diseases Outpatient Clinic of Ankara Başkent University Hospital in Çankaya district for their support during data collection.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

1. International Diabetes Federation (IDF). Diabetes Atlas . 10th ed. Brussels: Belgium (2021).

2. Sapra, A, and Bhandari, P. Diabetes. Available at: https://www.ncbi.nlm.nih.gov/books/NBK551501/ (Accessed December 2023).

3. Lorber, D. Importance of cardiovascular disease risk management in patients with type 2 diabetes mellitus. Diabetes Metab Syndr Obes . (2014) 7:169–83. doi: 10.2147/DMSO.S61438

4. Matheus, AS, Tannus, LR, Cobas, RA, Palma, CC, Negrato, CA, and Gomes, MB. Impact of diabetes on cardiovascular disease: an update. Int J Hypertens . (2013) 2013:653789. doi: 10.1155/2013/653789

5. Adıbelli, D, Sümen, A, and İlaslan, E. Yetişkin bireylerde kardiyovasküler hastalık ve diyabet riskinin psikolojik semptomlar ile ilişkisi. Ordu University J Nurs Stud . (2020) 3:83–92. doi: 10.38108/ouhcd.750517

6. Fagherazzi, G, Vilier, A, Bonnet, F, Lajous, M, Balkau, B, Boutron-Ruault, MC, et al. Dietary acid load and risk of type 2 diabetes: the E3N-EPIC cohort study. Diabetologia . (2014) 57:313–20. doi: 10.1007/s00125-013-3100-0

7. Souto, G, Donapetry, C, Calviño, J, and Adeva, MM. Metabolic acidosis-induced insulin resistance and cardiovascular risk. Metab Syndr Relat Disord . (2011) 9:247–53. doi: 10.1089/met.2010.0108

8. Moghadam, SK, Bahadoran, Z, Mirmiran, P, Tohidi, M, and Azizi, F. Association between dietary acid load and insulin resistance: Tehran lipid and glucose study. Prev Nutr Food Sci . (2016) 21:104–9. doi: 10.3746/pnf.2016.21.2.104

9. Murakami, K, Sasaki, S, Takahashi, Y, and Uenishi, K, the Japan Dietetic Students' Study for Nutrition and Biomarkers Group. Association between dietary acid base load and cardiometabolic risk factors in young Japanese women. Br J Nutr . (2008) 100:642–51. doi: 10.1017/S0007114508901288

10. Abbasalizad Farhangi, M, Nikniaz, L, and Nikniaz, Z. Higher dietary acid load potentially increases serum triglyceride and obesity prevalence in adults: an updated systematic review and meta-analysis. PLoS One . (2019) 14:e0216547. doi: 10.1371/journal.pone.0216547

11. Kiefte-de Jong, JC, Li, Y, Chen, M, Curhan, GC, Mattei, J, Malik, VS, et al. Diet-dependent acid load and type 2 diabetes: pooled results from three prospective cohort studies. Diabetologia . (2017) 60:270–9. doi: 10.1007/s00125-016-4153-7

12. Lee, KW, and Shin, D. Positive association between dietary acid load and future insulin resistance risk: findings from the Korean genome and epidemiology study. Nutr J . (2020) 19:137. doi: 10.1186/s12937-020-00653-6

13. Jayedi, A, and Shab-Bidar, S. Dietary acid load and risk of type 2 diabetes: a systematic review and dose–response meta-analysis of prospective observational studies. Clin Nutr ESPEN . (2018) 23:10–8. doi: 10.1016/j.clnesp.2017.12.005

14. Remer, T, and Manz, F. Estimation of the renal net acid excretion by adults consuming diets containing variable amounts of protein. Am J Clin Nutr . (1994) 59:1356–61. doi: 10.1093/ajcn/59.6.1356

15. Frassetto, LA, Todd, KM, Morris, RC, et al. Estimation of net endogenous noncarbonic acid production in humans from diet potassium and protein contents. Am J Clin Nutr . (1998) 68:576–83. doi: 10.1093/ajcn/68.3.576

16. Huang, PL. A comprehensive definition for metabolic syndrome. Dis Model Mech . (2009) 2:231–7. doi: 10.1242/dmm.001180

17. World Health Organization. Obesity: Preventing and managing the global epidemic . Geneva: Switzerland (2000).

18. Mazidi, M, Mikhailidis, DP, and Banach, M. Higher dietary acid load is associated with higher likelihood of peripheral arterial disease among American adults. J Diabetes Complicat . (2018) 32:565–9. doi: 10.1016/j.jdiacomp.2018.03.001

19. Mirmiran, P, Houshialsadat, Z, Bahadoran, Z, Khalili‑Moghadam, S, Shahrzad, MK, and Azizi, F. Dietary acid load and risk of cardiovascular disease: a prospective population-based study. BMC Cardiovasc Disord . (2021) 21:432. doi: 10.1186/s12872-021-02243-8

20. Kucharska, AM, Szostak-Węgierek, DE, Waśkiewicz, A, Piotrowski, W, Stepaniak, U, Pająk, A, et al. Dietary acid load and cardiometabolic risk in the Polish adult population. Adv Clin Exp Med . (2018) 27:1347–5. doi: 10.17219/acem/69733

21. Pasiakos, SM, Lieberman, HR, and Fulgoni, VL. Higher-protein diets are associated with higher HDL cholesterol and lower BMI and waist circumference in US adults. J Nutr . (2015) 145:605–14. doi: 10.3945/jn.114.205203

22. Lamberg-Allardt, C, Bärebring, L, Arnesen, EK, Nwaru, BI, Thorisdottir, B, Ramel, A, et al. Animal versus plant-based protein and risk of cardiovascular disease and type 2 diabetes: a systematic review of randomized controlled trials and prospective cohort studies. Food Nutr Res . (2023) 67:67. doi: 10.29219/fnr.v67.9003

23. Engberink, MF, Bakker, SJ, Brink, EJ, et al. Dietary acid load and risk of hypertension: the Rotterdam study. Am J Clin Nutr . (2012) 95:1438–44. doi: 10.3945/ajcn.111.022343

24. Akter, S, Kurotani, K, Kashino, I, Goto, A, Mizoue, T, Noda, M, et al. High dietary acid load score is associated with increased risk of type 2 diabetes in Japanese men: the Japan public health center-based prospective study. J Nutr . (2016) 146:1076–83. doi: 10.3945/jn.115.225177

25. Fereidouni, S, Hejazi, N, Homayounfar, R, and Farjam, M. Diet quality and dietary acid load in relation to cardiovascular disease mortality: results from Fasa PERSIAN cohort study. Food Sci Nutr . (2023) 11:1563–71. doi: 10.1002/fsn3.3197

26. Huxley, R, Barzi, F, and Woodward, M. Excess risk of fatal coronary heart disease associated with diabetes in men and women: meta-analysis of 37 prospective cohort studies. BMJ . (2006) 332:73–8. doi: 10.1136/bmj.38678.389583.7C

27. Carnauba, RA, Baptistella, AB, Paschoal, V, and Hübscher, G. Diet-induced low-grade metabolic acidosis and clinical outcomes: a review. Nutrients . (2017) 9:538. doi: 10.3390/nu9060538

28. Maurer, M, Riesen, W, Muser, J, Hulter, HN, and Krapf, R. Neutralization of Western diet inhibits bone resorption independently of K intake and reduces cortisol secretion in humans. Am J Physiol Renal Physiol . (2003) 284:F32–40. doi: 10.1152/ajprenal.00212.2002

29. Abate, N, Chandalia, M, Cabo-Chan, AV Jr, Moe, OW, and Sakhaee, K. The metabolic syndrome and uric acid nephrolithiasis: novel features of renal manifestation of insulin resistance. Kidney Int . (2004) 65:386–92. doi: 10.1111/j.1523-1755.2004.00386.x

30. Mandel, EI, Taylor, EN, and Curhan, GC. Dietary and lifestyle factors and medical conditions associated with urinary citrate excretion. Clin J Am Soc Nephrol . (2013) 8:901–8. doi: 10.2215/CJN.07190712

31. Rebolledo, OR, Hernandez, RE, Zanetta, AC, and Gagliardino, JJ. Insulin secretion during acid-base alterations. Am J Phys . (1978) 234:E426–9. doi: 10.1152/ajpendo.1978.234.4.E426

32. Rylander, R, Tallheden, T, and Vormann, J. Acid-base conditions regulate calcium and magnesium homeostasis. Magnes Res . (2009) 22:262–5. doi: 10.1684/mrh.2009.0182

Keywords: cardiovascular disease, dietary acid load, type 2 diabetes, potential renal acid load, net endogenous acid production

Citation: Güngör S and Saka M (2024) Evaluation of the relationship between dietary acid load and cardiovascular risk factors in patients with type 2 diabetes: a case–control study. Front. Nutr . 11:1445933. doi: 10.3389/fnut.2024.1445933

Received: 08 June 2024; Accepted: 05 August 2024; Published: 14 August 2024.

Copyright © 2024 Güngör and Saka. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Sedef Güngör, [email protected]

† ORCID: Sedef Güngör, orcid.org/0000-0002-2338-8576 Mendane Saka, orcid.org/0000-0002-5516-426X

Using case studies to do program evaluation

This paper, authored by Edith D. Balbach for the California Department of Health Services, is designed to help evaluators decide whether to use a case study evaluation approach.

It also offers guidance on how to conduct a case study evaluation.

This resource was suggested to BetterEvaluation by Benita Williams.

Using a Case Study as an Evaluation Tool
When to Use a Case Study
How to Do a Case Study
Unit Selection
Data Collection
Data Analysis and Interpretation

Balbach, E. D., & California Department of Health Services. (1999). Using case studies to do program evaluation . Retrieved from website: http://www.case.edu/affil/healthpromotion/ProgramEvaluation.pdf

The relationship between MRI-detected hip abnormalities and hip pain in hip osteoarthritis: a systematic review

Systematic Review
Open access
Published: 13 August 2024

Haonan Fang 1, Xiaoyue Zhang 1, Junjie Wang 1, Xing Xing 1, Ziyuan Shen 1,2 & Guoqi Cai 1,2

Magnetic resonance imaging (MRI) is increasingly used in the classification and evaluation of osteoarthritis (OA). Many studies have focused on knee OA, investigating the association between MRI-detected knee structural abnormalities and knee pain. Hip OA differs from knee OA in many aspects, but little is known about the role of hip structural abnormalities in hip pain. This study aimed to systematically evaluate the association of hip abnormalities on MRI, such as cartilage defects, bone marrow lesions (BMLs), osteophytes, paralabral cysts, effusion-synovitis, and subchondral cysts, with hip pain. We searched electronic databases from inception to February 2024 to identify publications that reported data on the association between MRI features in the hip joint and hip pain. The quality of the included studies was scored using the Newcastle-Ottawa Scale (NOS). The levels of evidence were evaluated according to the Cochrane Back Review Group Method Guidelines and classified into five levels: strong, moderate, limited, conflicting, and no evidence. A total of nine studies were included, comprising five cohort studies, three cross-sectional studies, and one case-control study. A moderate level of evidence suggested a positive association of the presence and change of BMLs with the severity and progression of hip pain, while evidence for the associations between other MRI features and hip pain was limited or even conflicting. Only a few studies with small to modest sample sizes evaluated the association between hip structural changes on MRI and hip pain. BMLs may contribute to the severity and progression of hip pain. Further studies are warranted to uncover the role of hip MRI abnormalities in hip pain. The protocol for the systematic review was registered with PROSPERO ( https://www.crd.york.ac.uk/PROSPERO/ , CRD42023401233).

Introduction

Osteoarthritis (OA) is a common musculoskeletal disease of the entire joint, characterized by pain and disability [ 1 ]. The hip joint is a frequently affected site of OA [ 2 ], affecting more than 240 million people in the world [ 3 ]. The pathophysiology of OA involves multiple tissues, including cartilage, bone, ligaments, synovium, and muscles [ 2 , 4 ]. Understanding the involvement of these tissues in joint symptoms is crucial for developing effective treatment strategies. Although conventional x-rays are frequently used for the diagnosis and classification of OA, soft tissues cannot be adequately evaluated using this technique. Moreover, the available evidence does not show a consistent association between radiographic features and OA pain [ 5 , 6 ]. More advanced imaging techniques, especially magnetic resonance imaging (MRI), offer much higher sensitivity in detecting early signs of joint damage, making it an invaluable tool for evaluating OA and its associated pain [ 7 , 8 ].

In contrast to the extensive body of research examining factors associated with knee pain, there have been far fewer studies investigating the source of hip pain [ 9 ]. The characteristics of hip OA differ significantly from those of knee OA in many aspects, including epidemiology, prognosis, pathophysiology, anatomical and biomechanical factors, clinical presentation, and pain management [ 10 ]. Thus, the etiology and contributing factors for hip pain can differ from those of knee pain. It has been shown that knee pain is associated with several MRI features such as bone marrow lesions (BMLs) [ 11 ], effusion/synovitis [ 12 , 13 ], meniscus tear, infrapatellar fat pad [ 14 ], osteophytes [ 15 ] and cartilage defects [ 16 ]. Clinical studies have gone further to explore the use of BMLs and effusion-synovitis as treatment targets for knee OA [ 17 , 18 , 19 , 20 ]. However, few studies have evaluated the role of MRI features in the hip in the assessment, prognosis, and treatment of hip OA. Therefore, this study aimed to systematically review studies evaluating the association between MRI abnormalities and hip pain.

Materials and methods

Protocol registration

The protocol for the systematic review was registered with PROSPERO ( https://www.crd.york.ac.uk/PROSPERO/ , CRD42023401233). This systematic review was reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist [ 21 ]. The report of this study followed the Cochrane Handbook for Systematic Reviews of Interventions [ 22 ]. This study was a systematic review and ethics committee review was not applicable.

Data source and search strategy

We searched Medline (via Ovid), Web of Science, Embase (via Ovid), and Cumulative Index to Nursing & Allied Health Literature (CINAHL) from inception to June 2024 for relevant studies evaluating the association of MRI abnormalities in the hip with hip pain. The following search terms were used: ‘hip’, ‘hip joint’, ‘pain’, ‘MRI’, and ‘osteoarthritis’. Detailed search strategies are provided in the Supplementary Methods. We also checked the citation lists of the included studies and relevant systematic reviews, as well as gray literature (e.g. conference abstracts), for additional studies.

Study selection

Two authors (HF and XZ) conducted an independent review of the titles and abstracts of all identified studies, followed by retrieving the full texts of relevant studies for further screening. The full-text reviews were performed in accordance with the selection criteria outlined in the registered protocol. Specifically, observational studies evaluating the association between MRI abnormalities (e.g. BMLs or cartilage defect) and pain in the hip joint were included. Animal studies or studies without data on MRI features and/or hip pain were excluded. There was no restriction on language.

Data extraction

Two authors (HF and XZ) independently extracted data from each included study. The extracted data included: (1) study characteristics (the first author, year of publication, place (country/territory), study design, and sample size); (2) characteristics of the study population (e.g. age, sex, OA patients or community-dwelling participants); (3) MRI features (e.g. subchondral cysts, paralabral cysts, cartilage defects, BMLs, osteophytes, and effusion/synovitis) (Table 1 ); (4) assessment of hip pain, (5) main findings for the association between MRI features and hip pain; and (6) adjusted covariates.

Assessment of study quality and credibility of evidence

Two authors (HF and XZ) independently assessed the methodological quality of the included studies using the Newcastle-Ottawa Scale (NOS) for cohort studies [ 23 ] and case-control studies [ 24 ], and an extension for cross-sectional studies [ 25 ]. Differences in scoring were resolved by discussion or by consulting the third author (GC). The possible scores of study quality ranged from 0 to 9 for cohort studies, 0–8 for cross-sectional and case-control studies, with higher scores indicating higher quality. A score of ≥ 7 was considered high study quality for cohort studies [ 26 ], cross-sectional studies [ 27 ] and case-control studies [ 28 ].
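The scoring rule above can be expressed as a small check. This is a minimal sketch that assumes only the score ranges and the ≥ 7 threshold stated in the paragraph; the names `MAX_NOS_SCORE` and `is_high_quality` are illustrative and do not come from the original study.

```python
# Score ranges and threshold as stated in the text; names are hypothetical.
MAX_NOS_SCORE = {
    "cohort": 9,
    "cross-sectional": 8,
    "case-control": 8,
}
HIGH_QUALITY_THRESHOLD = 7


def is_high_quality(design, score):
    """Return True when a NOS score meets the high-quality threshold."""
    if design not in MAX_NOS_SCORE:
        raise ValueError(f"unknown study design: {design!r}")
    max_score = MAX_NOS_SCORE[design]
    if not 0 <= score <= max_score:
        raise ValueError(f"score for a {design} study must be 0 to {max_score}")
    return score >= HIGH_QUALITY_THRESHOLD
```

Validating the score against the design-specific maximum catches data-entry errors, such as a score of 9 recorded for a cross-sectional study.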

The same two authors independently evaluated the credibility of evidence for the association between each MRI feature and hip pain on the basis of the guidelines of the Cochrane Collaboration Back Review Group [ 29 ]. The credibility of evidence was categorized into five levels based on the following criteria: (1) Strong: multiple high-quality cohort studies show generally consistent findings; (2) Moderate: one high-quality cohort study and at least two high-quality cross-sectional studies, or at least three high-quality cross-sectional studies, show generally consistent findings; (3) Limited: a single cohort study, or up to two cross-sectional studies, show less consistent findings; (4) Conflicting: no consistent findings were reported; (5) No evidence: no studies were published.
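The five levels can be read as a decision rule. The sketch below is one possible simplified reading, not the authors' procedure: it treats consistency as a yes/no judgment supplied by the reviewer, and the function and parameter names are hypothetical. The ratings reported in this review also reflect reviewer judgment that a rule like this does not capture.

```python
def evidence_level(n_studies, hq_cohort, hq_cross_sectional, consistent):
    """Classify credibility of evidence under a simplified reading of the criteria.

    n_studies: all studies addressing the MRI feature
    hq_cohort: number of high-quality cohort studies
    hq_cross_sectional: number of high-quality cross-sectional studies
    consistent: reviewer judgment that findings are generally consistent
    """
    if n_studies == 0:
        return "no evidence"
    if not consistent:
        return "conflicting"
    if hq_cohort >= 2:
        return "strong"
    if (hq_cohort == 1 and hq_cross_sectional >= 2) or hq_cross_sectional >= 3:
        return "moderate"
    return "limited"
```

Ordering the checks from "no evidence" downward means that inconsistent findings are classified as conflicting regardless of how many high-quality studies exist.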

Literature search

The flowchart of the study selection process is shown in Fig. 1 . We identified a total of 1878 potentially relevant records from electronic search. After screening the titles and abstracts, 1864 were excluded. From the remaining 14 records, we further excluded 5 irrelevant studies, leaving 9 studies in this systematic review. Among the studies included, 5 were cohort studies [ 30 , 31 , 32 , 33 , 34 ], 3 were cross-sectional studies [ 35 , 36 , 37 ], and 1 was a case-control study [ 38 ].
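The counts in this paragraph can be checked with simple arithmetic. The sketch below uses the numbers reported above; the function name `screening_flow` and the dictionary keys are illustrative.

```python
def screening_flow(identified, excluded_title_abstract, excluded_full_text):
    """Return the number of records remaining at each screening stage."""
    full_text_assessed = identified - excluded_title_abstract
    included = full_text_assessed - excluded_full_text
    if full_text_assessed < 0 or included < 0:
        raise ValueError("more records excluded than were available")
    return {
        "identified": identified,
        "full_text_assessed": full_text_assessed,
        "included": included,
    }


# Numbers reported in the text.
flow = screening_flow(1878, 1864, 5)
design_counts = {"cohort": 5, "cross-sectional": 3, "case-control": 1}

# The per-design counts should add up to the number of included studies.
assert sum(design_counts.values()) == flow["included"]
```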

Fig. 1 Flowchart of study selection

Characteristics of included studies

Table 2 shows the characteristics of included studies. Overall, the sample sizes of the studies were small to modest ( n = 19 to 237), and the follow-up time of the 5 cohort studies ranged from 1 to 2.3 years. Among the 9 included studies, 4 examined multiple MRI features [ 32 , 35 , 37 , 38 ] and 5 examined a single MRI feature [ 30 , 31 , 33 , 34 , 36 ]. Three studies were conducted in the same population [ 30 , 33 , 34 ]. Four studies used a 1.5T MRI scanner [ 30 , 33 , 34 , 36 ], four used a 3T scanner [ 31 , 32 , 35 , 38 ], and the remaining one did not report the field strength of the MRI used [ 37 ]. Most of the studies used sagittal imaging [ 30 , 32 , 33 , 34 , 37 , 38 ], with two studies using sagittal, coronal, and oblique axial imaging [ 32 , 38 ]; one study used coronal and sagittal imaging [ 35 ], and one study used coronal imaging alone [ 36 ]. The participants in the included studies were mainly middle-aged and older adults (mean age 46.5 to 66 years, 27.6–57.9% males), except for one study that examined high-impact athletes in their 20s and 30s [ 38 ].

Assessment of study quality

Five of the 9 studies (55.6%) were scored above the high-quality threshold (i.e. ≥7) according to the NOS assessment. For cross-sectional studies, only 1 of the 3 studies was scored high-quality, with the main issues being small sample size, sample representativeness, and the lack of comparison between respondents and non-respondents. Meanwhile, 4 of the 5 cohort studies were above the high-quality threshold, and only 1 study had issues with the representation and selection of exposed and non-exposed groups (Supplementary Tables 1–3).

Association between MRI features and hip pain

Subchondral cysts

One cross-sectional study [ 35 ], one case-control study [ 38 ] and one cohort study [ 32 ] evaluated the association between subchondral cysts and hip pain (Table 3 ). The credibility of the evidence was limited. The cross-sectional study showed a positive correlation between total subchondral cyst score (grade 0–2) and more severe hip pain score, assessed by the Harris Hip Score and Hip Disability and Osteoarthritis Outcome Score (HOOS) pain subscale (rank correlation coefficient = 0.37, p = 0.001) [ 35 ]. The case-control study did not observe a significant difference in subchondral cysts (grade 0–2) between symptomatic and control hips in athletes (8% vs. 7%, odds ratio (OR) = 1.29, 95% confidence interval (CI) 0.51 to 3.23) [ 38 ]. The cohort study showed a borderline significant association between baseline subchondral cyst score (grade 0–2) and change in hip pain (rank correlation coefficient = 0.30, p = 0.051) [ 32 ]. Moreover, the cohort study found a significant correlation between progression of subchondral cysts and change in HOOS symptoms other than pain (i.e. functional disability and stiffness) (rank correlation coefficient = 0.30, p = 0.03) but not hip pain score over 1.5 years (rank correlation coefficient = 0.18, p = 0.19) [ 32 ].

Paralabral cyst

One cross-sectional study [ 37 ] and one cohort study [ 32 ] evaluated the association between paralabral cyst and hip pain (Table 3 ). The credibility of the evidence was limited. The cross-sectional study found that paralabral cyst scores, based on the Hip Osteoarthritis MRI Scoring System (HOAMS), were similar in painless and painful hips (mean paralabral cyst score: 0.81 vs. 0.91, p = 0.39) [ 37 ]. Consistently, the cohort study found that neither baseline nor progression of paralabral cysts was associated with change in HOOS pain or other subscales, except that progression of paralabral cysts was associated with HOOS activity of daily living subscale (rank correlation coefficient = 0.30, p = 0.03) [ 32 ].

Effusion-synovitis

One cross-sectional study [ 37 ], one case-control study [ 38 ] and one cohort study [ 34 ] showed inconsistent findings for the association between hip effusion-synovitis and hip pain (Table 3 ). The credibility of the evidence was conflicting. The cohort study observed a significant positive correlation between presence of hip effusion-synovitis at two/three sites and presence of hip pain (PR (95% CI): 1.42 (1.05, 1.93)), although there was no significant correlation between change in effusion-synovitis size and change in hip pain [ 34 ]. By contrast, the case-control study showed an inverse correlation between effusion-synovitis and the presence of hip symptoms (OR (95% CI) 0.46 (0.26, 0.81)), before and after adjusting for age, sex, and BMI [ 38 ]. The remaining cross-sectional study reported no significant associations between joint effusion/synovitis and hip pain [ 37 ].
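For ratio measures such as the prevalence ratio (PR) and odds ratio (OR), the direction and significance of an association can be read from whether the 95% CI excludes the null value of 1. The following minimal sketch illustrates that reading with intervals reported in this review; the function name `ratio_direction` is hypothetical.

```python
def ratio_direction(lower, upper, null_value=1.0):
    """Classify a ratio estimate from the bounds of its confidence interval."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower > null_value:
        return "positive"
    if upper < null_value:
        return "inverse"
    return "not significant"


# Intervals reported in the text.
cohort_effusion = ratio_direction(1.05, 1.93)        # PR, cohort study [34]
case_control_effusion = ratio_direction(0.26, 0.81)  # OR, case-control study [38]
case_control_cysts = ratio_direction(0.51, 3.23)     # OR, subchondral cysts [38]
```

The two effusion-synovitis intervals fall on opposite sides of 1, which is why the evidence for this feature was rated as conflicting.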

Cartilage defects

One cohort study [ 33 ], one case-control study [ 38 ] and two cross-sectional studies [ 35 , 37 ] examined the association between cartilage defects and hip pain (Table 3 ). The credibility of the evidence was limited. The cohort study reported higher levels of Western Ontario and McMaster Universities Arthritis Index (WOMAC) hip pain in individuals with any type of hip cartilage defects (PR (95% CI): 1.20 (1.02, 1.35)) and secondary cartilage defects (PR (95% CI): 1.40 (1.09, 1.80)) [ 33 ]. One cross-sectional study reported a significant linear correlation between cartilage defects score and Visual Analogue Scale (VAS) hip pain ( r = 0.46, p < 0.001), although cartilage defects score was not statistically significantly different between individuals with and those without hip pain (mean cartilage defects score: 1.23 vs. 0.75, p = 0.18) [ 37 ]. Another cross-sectional study found a significant correlation between acetabular cartilage score and HOOS pain ( r = 0.25, p = 0.026), but no correlation between femoral cartilage score and HOOS pain ( r s = 0.17, p = 0.146) [ 35 ].

Osteophytes

One cross-sectional study [ 37 ] examined the relationship between MRI-detected osteophytes and hip pain (Table 3 ), showing a positive correlation between osteophyte score and VAS pain ( r = 0.5811, p < 0.0001), and there was a higher osteophyte score in the inferomedial compartment in individuals with hip pain than those without (3.0 vs. 2.0, p = 0.03) [ 37 ]. The credibility of evidence was limited.

Bone marrow lesions

Three cohort studies [ 30 , 31 , 32 ], three cross-sectional studies [ 35 , 36 , 37 ], and one case-control study [ 38 ] evaluated the association between BMLs and hip pain (Table 3 ). The credibility of evidence was moderate. All three cohort studies consistently reported a significant association between BMLs and hip pain. The first showed that change in BML size was significantly associated with change in hip pain (regression coefficient [β] (95% CI): 0.85 (0.00, 1.71)), and that the severity of hip pain was associated with each square centimeter increase in the size of acetabular BML (regression coefficient [β] (95% CI): 4.18 (1.54, 6.88)) [ 30 ]. The second cohort study found that Modified Harris Hip Score (MHHS) pain score was significantly lower in individuals with BMLs than in those without, regardless of the size of BMLs ( p < 0.05) [ 31 ], and the third cohort study indicated that baseline BML size was significantly associated with worsening of HOOS pain subscale (regression coefficient [β] (95% CI): 0.690 (0.464, 0.913)) [ 32 ]. All three cross-sectional studies reported positive correlations between BML scores and hip pain ( r = 0.29 to 0.51, p < 0.05) [ 35 , 36 , 37 ], whereas the case-control study did not observe a significant difference in BML scores between symptomatic and control hips [ 38 ].

This systematic review screened and evaluated studies that described the association between MRI-detected hip abnormalities and hip pain, and several MRI features were identified, such as osteophytes, subchondral cysts, paralabral cysts, effusion-synovitis, BMLs and cartilage defects. Overall, the number, sample size, and quality of included studies were inferior to studies focusing on the knee, and current evidence suggests that BMLs, cartilage defects, and osteophytes may be associated with the presence and severity of hip pain, while subchondral and paralabral cysts may not. Moreover, the association between effusion-synovitis and hip pain was conflicting. Considering the paucity of studies examining their association, a robust conclusion cannot be reached [ 39 ]. Thus, more studies are required to validate whether these MRI features contribute to the presence and severity of hip pain.

The credibility of evidence for the association between each of the hip MRI features and hip pain was limited or even conflicting, except that there was a moderate level of evidence for the association between BMLs and hip pain. This can be attributed to various reasons. Firstly, the limited number of included studies may have restricted the breadth and depth of the analysis, potentially leading to less robust conclusions. Secondly, some of the included studies might have exhibited lower overall quality of evidence due to factors such as small sample sizes and inadequate representativeness, impacting the reliability and validity of the findings. Moreover, our research methodology, which involved aggregating study results and applying uniform criteria, while simple, may have hindered the effective synthesis and interpretation of the data, potentially resulting in less accurate or comprehensive outcomes.

We found moderate evidence of a positive association between BMLs and hip pain. These findings are similar to other studies showing a significant association between BMLs and knee pain [ 11 , 40 ], suggesting that BMLs could be a potential cause or indicator of both knee and hip OA. This could contribute to the management of hip OA, as effectively managing the progression of BMLs may reduce knee pain in knee OA with BMLs [ 18 ]. The additional MRI features in this study, despite showing limited or conflicting evidence, play a role in semi-quantitative evaluation of hip OA [ 41 ]. These features, awaiting further study, hold promise for distinguishing hip OA subtypes and informing its diagnosis and treatment.

The strength of this study is that we systematically screened studies that evaluated the association between hip MRI abnormalities and hip pain and employed a pre-specified assessment system to qualitatively evaluate the credibility of evidence. There are several limitations in this study. First, we categorized the results of the included studies as either negative or positive based solely on statistical significance, without considering the influence of sample size on the outcomes, and may therefore have overlooked false negative findings. However, the limited number of studies prevented us from conducting a meta-analysis to pool these results. Second, we scored the methodological quality of included studies with different designs. The subjective judgment of the evaluator can affect the results of the assessment and lead to bias, although the scores were rated by different authors to reach a consensus.

In conclusion, only a few studies with small to modest sample sizes evaluated the association between hip structural changes on MRI and hip pain. BMLs may contribute to the severity and progression of hip pain. Further studies are warranted to uncover the role of hip MRI abnormalities in hip pain.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Hutton CW (1987) Generalised osteoarthritis: an evolutionary problem? Lancet (London, England) 1(8548):1463–1465. https://doi.org/10.1016/s0140-6736(87)92209-4

Lespasio MJ, Sultan AA, Piuzzi NS, Khlopas A, Husni ME, Muschler GF, Mont MA (2018) Hip osteoarthritis: a primer. Permanente J 22:17–084. https://doi.org/10.7812/tpp/17-084

Katz JN, Arant KR, Loeser RF (2021) Diagnosis and treatment of hip and knee osteoarthritis: a review. JAMA 325(6):568–578. https://doi.org/10.1001/jama.2020.22171

Loeser RF, Goldring SR, Scanzello CR, Goldring MB (2012) Osteoarthritis: a disease of the joint as an organ. Arthritis Rheum 64(6):1697–1707. https://doi.org/10.1002/art.34453

Neogi T, Felson D, Niu J, Nevitt M, Lewis CE, Aliabadi P, Sack B, Torner J, Bradley L, Zhang Y (2009) Association between radiographic features of knee osteoarthritis and pain: results from two cohort studies. BMJ (Clinical Res ed) 339:b2844. https://doi.org/10.1136/bmj.b2844

Kim C, Nevitt MC, Niu J, Clancy MM, Lane NE, Link TM, Vlad S, Tolstykh I, Jungmann PM, Felson DT, Guermazi A (2015) Association of hip pain with radiographic evidence of hip osteoarthritis: diagnostic test study. BMJ (Clinical Res ed) 351:h5983. https://doi.org/10.1136/bmj.h5983

Guermazi A, Roemer FW, Haugen IK, Crema MD, Hayashi D (2013) MRI-based semiquantitative scoring of joint pathology in osteoarthritis. Nat Rev Rheumatol 9(4):236–251. https://doi.org/10.1038/nrrheum.2012.223

Roemer FW, Guermazi A, Demehri S, Wirth W, Kijowski R (2022) Imaging in Osteoarthritis. Osteoarthr Cartil 30(7):913–934. https://doi.org/10.1016/j.joca.2021.04.018

Sandhar S, Smith TO, Toor K, Howe F, Sofat N (2020) Risk factors for pain and functional impairment in people with knee and hip osteoarthritis: a systematic review and meta-analysis. BMJ open 10(8):e038720. https://doi.org/10.1136/bmjopen-2020-038720

van der Hall M, Hinman RS, Peat G, de Zwart A, Quicke JG, Runhaar J, van der Knoop J, de Rooij M, Meulenbelt I, Vliet Vlieland T, Lems WF, Holden MA, Foster NE, Bennell KL (2022) How does hip osteoarthritis differ from knee osteoarthritis? Osteoarthr Cartil 30(1):32–41. https://doi.org/10.1016/j.joca.2021.09.010

Felson DT, Chaisson CE, Hill CL, Totterman SM, Gale ME, Skinner KM, Kazis L, Gale DR (2001) The association of bone marrow lesions with pain in knee osteoarthritis. Ann Intern Med 134(7):541–549. https://doi.org/10.7326/0003-4819-134-7-200104030-00007

Hill CL, Hunter DJ, Niu J, Clancy M, Guermazi A, Genant H, Gale D, Grainger A, Conaghan P, Felson DT (2007) Synovitis detected on magnetic resonance imaging and its relation to pain and cartilage loss in knee osteoarthritis. Ann Rheum Dis 66(12):1599–1603. https://doi.org/10.1136/ard.2006.067470

Wang X, Jin X, Han W, Cao Y, Halliday A, Blizzard L, Pan F, Antony B, Cicuttini F, Jones G, Ding C (2016) Cross-sectional and longitudinal associations between knee joint effusion synovitis and knee pain in older adults. J Rheumatol 43(1):121–130. https://doi.org/10.3899/jrheum.150355

Pan F, Han W, Wang X, Liu Z, Jin X, Antony B, Cicuttini F, Jones G, Ding C (2015) A longitudinal study of the association between infrapatellar fat pad maximal area and changes in knee symptoms and structure in older adults. Ann Rheum Dis 74(10):1818–1824. https://doi.org/10.1136/annrheumdis-2013-205108

Zhu Z, Laslett LL, Jin X, Han W, Antony B, Wang X, Lu M, Cicuttini F, Jones G, Ding C (2017) Association between MRI-detected osteophytes and changes in knee structures and pain in older adults: a cohort study. Osteoarthr Cartil 25(7):1084–1092. https://doi.org/10.1016/j.joca.2017.01.007

Everhart JS, Abouljoud MM, Flanigan DC (2019) Role of full-thickness cartilage defects in knee osteoarthritis (OA) incidence and progression: data from the OA Initiative. J Orthop Res 37(1):77–83. https://doi.org/10.1002/jor.24140

Eriksen EF (2015) Treatment of bone marrow lesions (bone marrow edema). BoneKEy Rep 4:755. https://doi.org/10.1038/bonekey.2015.124

Callaghan MJ, Parkes MJ, Hutchinson CE, Gait AD, Forsythe LM, Marjanovic EJ, Lunt M, Felson DT (2015) A randomised trial of a brace for patellofemoral osteoarthritis targeting knee pain and bone marrow lesions. Ann Rheum Dis 74(6):1164–1170. https://doi.org/10.1136/annrheumdis-2014-206376

Cai G, Aitken D, Laslett LL, Pelletier JP, Martel-Pelletier J, Hill C, March L, Wluka AE, Wang Y, Antony B, Blizzard L, Winzenberg T, Cicuttini F, Jones G (2020) Effect of Intravenous Zoledronic Acid on Tibiofemoral cartilage volume among patients with knee osteoarthritis with bone marrow lesions: a Randomized Clinical Trial. JAMA 323(15):1456–1466. https://doi.org/10.1001/jama.2020.2938

Wang Z, Jones G, Winzenberg T, Cai G, Laslett LL, Aitken D, Hopper I, Singh A, Jones R, Fripp J, Ding C, Antony B (2020) Effectiveness of Curcuma longa Extract for the treatment of symptoms and effusion-synovitis of knee osteoarthritis: a Randomized Trial. Ann Intern Med 173(11):861–869. https://doi.org/10.7326/m20-0990

Moher D, Liberati A, Tetzlaff J, Altman DG, Group P (2009) Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 6(7):e1000097. https://doi.org/10.1371/journal.pmed.1000097

Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (eds) (2023) Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). www.training.cochrane.org/handbook

Wells GA, Wells G, Shea B, Shea B, O’Connell D, Peterson J, Welch, Losos M, Tugwell P, Ga SW, Zello GA, Petersen JA (2014) The Newcastle-Ottawa Scale (NOS) for Assessing the Quality of Nonrandomised Studies in Meta-Analyses. In

Burfield M, Sayers M, Buhmann R (2023) The association between running volume and knee osteoarthritis prevalence: a systematic review and meta-analysis. Phys Therapy Sport : Official J Association Chart Physiotherapists Sports Med 61:1–10. https://doi.org/10.1016/j.ptsp.2023.02.003

Moskalewicz A, Oremus M (2020) No clear choice between Newcastle-Ottawa Scale and Appraisal Tool for cross-sectional studies to assess methodological quality in cross-sectional studies of health-related quality of life and breast cancer. J Clin Epidemiol 120:94–103. https://doi.org/10.1016/j.jclinepi.2019.12.013

Mahdi SS, Allana R, Battineni G, Khalid T, Agha D, Khawaja M, Amenta F (2022) The promise of telemedicine in Pakistan: a systematic review. Health Sci Rep 5(1):e438. https://doi.org/10.1002/hsr2.438

Neal BS, Lack SD, Lankhorst NE, Raye A, Morrissey D, van Middelkoop M (2019) Risk factors for patellofemoral pain: a systematic review and meta-analysis. Br J Sports Med 53(5):270–281. https://doi.org/10.1136/bjsports-2017-098890

Su B, Qin W, Xue F, Wei X, Guan Q, Jiang W, Wang S, Xu M, Yu S (2018) The relation of passive smoking with cervical cancer: a systematic review and meta-analysis. Medicine 97(46):e13061. https://doi.org/10.1097/md.0000000000013061

van Tulder M, Furlan A, Bombardier C, Bouter L (2003) Updated method guidelines for systematic reviews in the cochrane collaboration back review group. Spine 28(12):1290–1299. https://doi.org/10.1097/01.Brs.0000065484.95996.Af

Ahedi H, Aitken D, Blizzard L, Cicuttini F, Jones G (2014) A population-based study of the association between hip bone marrow lesions, high cartilage signal, and hip and knee pain. Clin Rheumatol 33(3):369–376. https://doi.org/10.1007/s10067-013-2394-0

Koyama T, Fukushima K, Uchida K, Ohashi Y, Uchiyama K, Takahira N, Takaso M (2022) Is bone marrow oedema in patients with labral tear an indicator of hip pain? J Orthop Surg Res 17(1):420. https://doi.org/10.1186/s13018-022-03243-w

Schwaiger BJ, Gersing AS, Lee S, Nardo L, Samaan MA, Souza RB, Link TM, Majumdar S (2016) Longitudinal assessment of MRI in hip osteoarthritis using SHOMRI and correlation with clinical progression. Semin Arthritis Rheum 45(6):648–655. https://doi.org/10.1016/j.semarthrit.2016.04.001

Ahedi HG, Aitken DA, Blizzard LC, Ding CH, Cicuttini FM, Jones G (2016) Correlates of hip cartilage defects: a cross-sectional study in older adults. J Rheumatol 43(7):1406–1412. https://doi.org/10.3899/jrheum.151001

Ahedi H, Aitken D, Blizzard L, Cicuttini F, Jones G (2020) Quantification of hip effusion-synovitis and its cross-sectional and longitudinal associations with hip pain, MRI findings and early radiographic hip OA. BMC Musculoskelet Disord 21(1):533. https://doi.org/10.1186/s12891-020-03532-7

Kumar D, Wyatt CR, Lee S, Nardo L, Link TM, Majumdar S, Souza RB (2013) Association of cartilage defects, and other MRI findings with pain and function in individuals with mild-moderate radiographic hip osteoarthritis and controls. Osteoarthr Cartil 21(11):1685–1692. https://doi.org/10.1016/j.joca.2013.08.009

Taljanovic MS, Graham AR, Benjamin JB, Gmitro AF, Krupinski EA, Schwartz SA, Hunter TB, Resnick DL (2008) Bone marrow edema pattern in advanced hip osteoarthritis: quantitative assessment with magnetic resonance imaging and correlation with clinical examination, radiographic findings, and histopathology. Skeletal Radiol 37(5):423–431. https://doi.org/10.1007/s00256-008-0446-3

Kijima H, Yamada S, Konishi N, Kubota H, Tazawa H, Tani T, Suzuki N, Kamo K, Okudera Y, Fujii M, Sasaki K, Kawano T, Iwamoto Y, Nagahata I, Miura T, Miyakoshi N, Shimada Y (2020) The differences in imaging findings between painless and painful osteoarthritis of the hip. Clin Med Insights Arthritis Musculoskelet Disorders 13:1179544120946747. https://doi.org/10.1177/1179544120946747

Heerey JJ, Srinivasan R, Agricola R, Smith A, Kemp JL, Pizzari T, King MG, Lawrenson PR, Scholes MJ, Souza RB, Link T, Majumdar S, Crossley KM (2021) Prevalence of early hip OA features on MRI in high-impact athletes. The femoroacetabular impingement and hip osteoarthritis cohort (FORCe) study. Osteoarthr Cartil 29(3):323–334. https://doi.org/10.1016/j.joca.2020.12.013

Atkins D, Best D, Briss PA, Eccles M, Falck-Ytter Y, Flottorp S, Guyatt GH, Harbour RT, Haugh MC, Henry D, Hill S, Jaeschke R, Leng G, Liberati A, Magrini N, Mason J, Middleton P, Mrukowicz J, O’Connell D, Oxman AD, Phillips B, Schünemann HJ, Edejer T, Varonen H, Vist GE, Williams JW Jr., Zaza S (2004) Grading quality of evidence and strength of recommendations. BMJ (Clinical Res ed) 328(7454):1490. https://doi.org/10.1136/bmj.328.7454.1490

Yusuf E, Kortekaas MC, Watt I, Huizinga TW, Kloppenburg M (2011) Do knee abnormalities visualised on MRI explain knee pain in knee osteoarthritis? A systematic review. Ann Rheum Dis 70(1):60–67. https://doi.org/10.1136/ard.2010.131904

Jaremko JL, Lambert RG, Zubler V, Weber U, Loeuille D, Roemer FW, Cibere J, Pianta M, Gracey D, Conaghan P, Ostergaard M, Maksymowych WP (2014) Methodologies for semiquantitative evaluation of hip osteoarthritis by magnetic resonance imaging: approaches based on the whole organ and focused on active lesions. J Rheumatol 41(2):359–369. https://doi.org/10.3899/jrheum.131082

Acknowledgements

No AI or editing software was employed for writing and editing this manuscript.

Open Access funding enabled and organized by CAUL and its Member Institutions. This work is supported by the National Natural Science Foundation of China (NSFC, 82103933) and the Scientific Research Level Upgrading Project of Anhui Medical University (2021xkjT006).

Author information

Authors and affiliations.

Department of Epidemiology and Biostatistics, School of Public Health, Anhui Medical University, Hefei, 230032, Anhui, China

Haonan Fang, Xiaoyue Zhang, Junjie Wang, Xing Xing, Ziyuan Shen & Guoqi Cai

Menzies Institute for Medical Research, University of Tasmania, Hobart, TAS, 7000, Australia

Ziyuan Shen & Guoqi Cai

Contributions

The guarantor (GC) had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. GC conceived, initiated, and supervised the project. CG, HF and XZ cleaned and analyzed the data. HF, XZ, JW, XX, ZY and GC contributed to the interpretation of the results and writing and revision of the manuscript. All authors gave final approval of the version submitted. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Corresponding author

Correspondence to Guoqi Cai.

Ethics declarations

Ethics approval.

Inapplicable.

Conflict of interest

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Supplementary Material 3

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Fang, H., Zhang, X., Wang, J. et al. The relationship between MRI-detected hip abnormalities and hip pain in hip osteoarthritis: a systematic review. Rheumatol Int (2024). https://doi.org/10.1007/s00296-024-05678-2

Received: 30 April 2024

Accepted: 03 August 2024

Published: 13 August 2024

DOI: https://doi.org/10.1007/s00296-024-05678-2

Hip osteoarthritis
Magnetic resonance imaging

COMMENTS

Case study evaluation
Case study evaluations, using one or more qualitative methods, have been used to investigate important practical and policy questions in health care. This paper describes the features of a well ...
Case study
Case study A case study focuses on a particular unit - a person, a site, a project. It often uses a combination of quantitative and qualitative data. Case studies can be particularly useful for understanding how different elements fit together and how different elements (implementation, context and other factors) have produced the observed impacts.
Case Study Method: A Step-by-Step Guide for Business Researchers
The multiple case studies used in this article as an application of step-by-step guideline are specifically designed to facilitate these business and management researchers. This article presents an easy to read, practical, experience-based, step-by-step guided path to select, conduct, and complete the qualitative case study successfully.
Case Study Evaluation Approach
A case study evaluation approach is a great way to gain an in-depth understanding of a particular issue or situation. This type of approach allows the researcher to observe, analyze, and assess the effects of a particular situation on individuals or groups. An individual, a location, or a project may serve as the focal point of a case study's ...
15.7 Evaluation: Presentation and Analysis of Case Study
Evaluate the effectiveness and quality of a case study report. Case studies follow a structure of background and context, methods, findings, and analysis. Body paragraphs should have main points and concrete details. In addition, case studies are written in formal language with precise wording and with a specific purpose and audience (generally ...
Writing a Case Study Analysis
Writing a Case Study Analysis A case study analysis requires you to investigate a business problem, examine the alternative solutions, and propose the most effective solution using supporting evidence.
What Is a Case Study?
A case study is a detailed study of a specific subject, such as a person, group, place, event, organization, or phenomenon. Case studies are commonly used in social, educational, clinical, and business research. A case study research design usually involves qualitative methods, but quantitative methods are sometimes also used.
Case Study Methodology of Qualitative Research: Key Attributes and
Abstract A case study is one of the most commonly used methodologies of social research. This article attempts to look into the various dimensions of a case study research strategy, the different epistemological strands which determine the particular case study type and approach adopted in the field, discusses the factors which can enhance the effectiveness of a case study research, and the ...
Guidance for the design of qualitative case study evaluation
Guidance for the design of qualitative case study evaluation. This guide, written by Professor Frank Vanclay of the Department of Cultural Geography, University of Groningen, provides notes on planning and implementing qualitative case study research. It outlines the use of a variety of different evaluation options that can be used in outcomes ...
How to Critically Evaluate Case Studies in Social Work
Abstract The purpose of this article is to develop guidelines to assist practitioners and researchers in evaluating and developing rigorous case studies. The main concern in evaluating a case study is to accurately assess its quality and ultimately to offer clients social work interventions informed by the best available evidence. To assess the quality of a case study, we propose criteria ...
Case study research for better evaluations of complex interventions
Whilst the diversity of published case studies in health services and public health research is rich and productive, we recommend further clarity and specific methodological guidance for those reporting case study research for evaluation audiences.
PDF Using Case Studies to do Program Evaluation
A case study evaluation for a program implemented in a turbulent environment should begin when program planning begins. A case study evaluation allows you to create a full, complex picture of what occurs in such environments. For example, ordinance work is pursued in political arenas, some of which are highly volatile.
Qualitative Research: Case study evaluation
Case study evaluations, using one or more qualitative methods, have been used to investigate important practical and policy questions in health care. This paper describes the features of a well designed case study and gives examples showing how qualitative methods are used in evaluations of health services and health policy. This is the last in a series of seven articles describing non ...
PDF How to Analyze a Case Study
How to Analyze a Case Study Adapted from Ellet, W. (2007). The case study handbook. Boston, MA: Harvard Business School. A business case simulates a real situation and has three characteristics: 1. a significant issue, 2. enough information to reach a reasonable conclusion, 3. no stated conclusion.
Case Study
A case study is a research method that involves an in-depth examination and analysis of a particular phenomenon or case, such as an individual, organization, community, event, or situation. It is a qualitative research approach that aims to provide a detailed and comprehensive understanding of the case being studied.
Case Study Evaluation
Case study evaluations, using one or more qualitative methods, have been used to investigate important practical and policy questions in health care. This paper describes the features of a well designed case study and gives examples showing how qualitative methods are used in evaluations of health services and health policy.
What is a case study?
Case study methodology serves to provide a framework for evaluation and analysis of complex issues. It shines a light on the holistic nature of nursing practice and offers a perspective that informs improved patient care.
PDF Case-based Evaluation
EVALUATION Case-based evaluations focus on the systematic generation and analysis of cases (sometimes known as case studies or stories of change). Cases may be based around any unit of analysis, such as people, communities, projects, programmes, institutions, policies or events. Most case-based evaluations include both the analyses of individual cases, and analysis across multiple cases.
PDF Case Study Evaluations.p65
Case studies are appropriate for determining the effects of programs or projects and reasons for success or failure. OED does most impact evaluation case studies for this purpose. The method is often used in combination with others, such as sample surveys, and there is a mix of qualitative and quantitative data.
Case study evaluation
Case study evaluations, using one or more qualitative methods, have been used to investigate important practical and policy questions in health care. This paper describes the features of a well designed case study and gives examples showing how qualitative methods are used in evaluations of health s …
JMIR Formative Research
Methods: Using a multiple-case study design, the research aims to uncover similarities and differences in participants' perceptions of the chatbot while also exploring women's desires for improvement and technological advancements in chatbot-based interventions in perinatal mental health. ... Qualitative Evaluation Using a Multiple Case ...
Performance Management Decision-Making Model: Case Study on Foreign
The performance evaluation matrix (PEM) is an excellent tool for evaluation and resource management decision making, and the administrator uses the satisfaction and the importance indices to establish evaluation coordinate points based on the rules of statistical testing. ... Case Study on Foreign Language Learning Curriculums" Information 15 ...
PDF Case Study Evaluations GAO/PEMD-91-10.1
Case Study Evaluations is one of a series of papers issued by the Program Evaluation and Methodology Division (PEMD). The purpose of the series is to provide GAO evaluators with guides to various aspects of audit and evaluation methodology, to illustrate applications, and to indicate where more detailed information is available.
Post-occupancy evaluation for tactical urbanism interventions through
The findings include the identification of best practices for tactical urbanism as well as limitations and areas for improvement. Lastly, a reflection on the post-occupancy status of the case studies towards the inclusion of community and stakeholders in future planning processes of the "Piazze Aperte" program.
Evaluation of the relationship between dietary acid load and
Study design and participant. In this case-control study, participants aged 30-65 years with a diagnosis of Type 2 diabetes according to the American Diabetes Association criteria and age- and gender-matched controls who applied to Ankara Başkent University Hospital Endocrinology and Metabolic Diseases Outpatient Clinic between November ...
Using case studies to do program evaluation
Using case studies to do program evaluation. PDF. 79.49 KB. This paper, authored by Edith D. Balbach for the California Department of Health Services is designed to help evaluators decide whether to use a case study evaluation approach. It also offers guidance on how to conduct a case study evaluation.
The relationship between MRI-detected hip abnormalities and ...
Assessment of study quality and credibility of evidence. Two authors (HF and XZ) independently assessed the methodological quality of the included studies using the Newcastle-Ottawa Scale (NOS) for cohort studies [] and case-control studies [], and an extension for cross-sectional studies [].Differences in scoring were resolved by discussion or by consulting the third author (GC).

Introduction to Case Study Evaluation Approach

Illustrative Case Study

Exploratory Case Study

Critical Instance Case Study

Program Implementation Case Study

Program Effects Case Study

Cumulative Case Study

Benefits of Incorporating the Case Study Evaluation Approach in the Monitoring and Evaluation of Projects and Programmes

15.7 Evaluation: Presentation and Analysis of Case Study

What Is a Case Study? | Definition, Examples & Methods

Shona McCombes

Case study research for better evaluations of complex interventions: rationale and challenges

Case study research offers evidence about context, causal inference in complex systems and implementation

The challenges in exploiting evidence from case study research

Acknowledging plurality and developing guidance

Availability of data and materials

Abbreviations

Acknowledgements

Author information

Contributions

Corresponding author

Ethics declarations

Additional information

Rights and permissions

About this article

Share this article

BMC Medicine

Qualitative Research: Case study evaluation

Introduction

Case study evaluations

Case Study – Methods, Examples and Guide

Types of Case Study

Single-Case Study

Multiple-Case Study

Descriptive Case Study

Instrumental Case Study

Case Study Data Collection Methods

Observations

How to conduct Case Study Research

Examples of Case Study

Application of Case Study

Business and Management

Social Sciences

Law and Ethics

Purpose of Case Study

Advantages of Case Study Research

Limitations of Case Study Research

About the author

Muhammad Hassan

What is it?

Benefits and limitations of case studies