Dissertations & projects: Literature-based projects


“As a general rule, the introduction is usually around 5 to 10 per cent of the word limit; each chapter around 15 to 25 per cent; and the conclusion around 5 per cent.” Bryan Greetham, How to Write Your Undergraduate Dissertation

This page gives guidance on the structure of a literature-based project: that is, a project where the data is found in existing literature rather than through primary research. Such projects may also include information from primary sources such as original documents.

How to structure a literature-based project

The structure of a literature-based dissertation is usually thematic, but check with your supervisor to make sure you are abiding by your department’s project specifications. A typical literature-based dissertation will be broken up into the following sections:

  • Abstract or summary
  • Acknowledgments
  • Contents page
  • Introduction
  • Literature review
  • Themed chapters
  • Bibliography

Use this basic structure as your document plan. Remember that you do not need to write the sections in the order in which they will finally appear.

For more advice on managing the order of your project, see our section on Project Management.   

Title page

If you use the template provided on our Formatting page, you will see that it already has a title page included. You just need to fill in the appropriate boxes by typing or choosing from the drop-down lists. The information you need to provide is:

  • Type of assignment (thesis, dissertation or independent project)
  • Partial or full fulfilment information
  • Subject area
  • Your name (and previous qualifications if applicable)
  • Month and year of submission

This may not always be required - check with your tutor.

Abstract - single page, one paragraph

  • It is independent of the rest of the report - it is a mini-report, which needs to make sense completely on its own.
  • References should not be included.
  • Nothing should appear in the abstract that is not in the rest of the report.
  • Usually between 200 and 300 words.
  • Write it as a single paragraph.

It is recommended that you write your abstract after your report.

Contents page with list of headings and page numbers

If you choose not to use the template, then you will need to go through the document after it is written and create a list showing which heading is on which page of your document.

Acknowledgements

Purpose: To thank those who were directly involved in your work.

  • Do not confuse the acknowledgements section with a dedication - this is not where you thank your friends and relatives unless they have helped you with your manuscript.
  • Acknowledgments are about courtesy, where you thank those who were directly involved in your work, or were involved in supporting your work (technicians, tutors, other students, financial support etc).
  • This section tends to be very brief, a few lines at the most. Identify those who provided you with the most support, and thank them appropriately.
  • At the very least, make sure you acknowledge your supervisor!

Introduction

Purpose: To state the research problem, provide justification for your research questions and explain your methodology and main findings.

  • Explain the problem you will be addressing, what your research questions are, and why they will help address the issue.
  • Explain your basic methodology.
  • Define the scope of the dissertation, explaining any limitations.
  • Lay out the structure of the dissertation, taking the reader through each section and providing any key definitions.
  • Very briefly describe what your main findings are - but leave the detail for the sections below.

It is good practice to come back to the introduction after you have finished writing up the rest of the document to ensure it sets the scene appropriately for subsequent sections.

Background Literature Review

This may be part of your introduction - check what your supervisor advises.

Purpose: To position your project within the wider literature and justify your research questions.

As you are undertaking a literature-based project, it can seem odd to include a separate literature review - and indeed some supervisors may suggest it is not necessary. However, most will have a section, either as a separate chapter, or as part of the introduction, that:

  • Provides a background to your study
  • Shows where your study fits within the existing literature
  • Justifies your research questions and methods (your search strategy etc).  

For more advice on writing a literature review, see the Literature Review pages in this guide.

Themed chapters

Purpose: To present the themes you have identified in your research and explain how they contribute to answering your research questions.

You will typically have 3-5 themed chapters. Each one should contain:

  • An introduction to the theme - what it means and what it incorporates.
  • How the theme was addressed within the literature - this should be analytical not just descriptive.
  • A conclusion which shows how the theme relates to the research question(s).

Ensuring your themed chapters flow

Choosing the order of your themed chapters is an important part of structuring your project. For example, if you study History and your project covers a topic that develops over a long time period, it may be best to order the chapters chronologically. Other subjects may have a natural narrative running through the themes. Think about how your reader will be able to follow your overall argument.

Although each chapter must be dedicated to a particular theme, it must link back to previous chapters and flow into the following chapter, so that the chapters do not seem unrelated to each other. There will be overlaps between themes; mention these where they occur.

Some literature-based projects will focus on primary sources. If yours does, make sure primary sources are at the core of your paragraphs and chapters, and use secondary sources to expand and explore the theme further. 

Conclusion (in a separate chapter)

Purpose: To present the conclusion that you have reached as a result of both the literature review and the analysis in your thematic chapters.

A conclusion summarises all the points you have previously made, and it should not include any evidence or topics you have not included in your introduction or main body. There should be no surprises.

It should be about 5-10% of your word limit so make sure you leave enough words to do it justice. There will be marks in the marking scheme specifically allocated to the strength of your conclusion which cannot be made up elsewhere.
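As a rough planning aid, the percentage guidelines quoted on this page can be turned into word ranges. The sketch below is a minimal illustration in Python, assuming the introduction takes 5-10%, each themed chapter 15-25% and the conclusion 5-10% of the word limit. The function name `word_budget` and the 10,000-word limit are hypothetical, and your department's specifications take priority over these rules of thumb.

```python
def word_budget(word_limit, n_chapters):
    """Return (low, high) word ranges per section from rule-of-thumb percentages."""
    # Percentages are assumptions based on the rules of thumb quoted in this guide.
    shares = {
        "introduction": (5, 10),
        "each themed chapter": (15, 25),
        "conclusion": (5, 10),
    }
    # Integer arithmetic keeps the results exact (no floating-point rounding).
    budget = {
        name: (word_limit * low // 100, word_limit * high // 100)
        for name, (low, high) in shares.items()
    }
    chapter_low, chapter_high = budget["each themed chapter"]
    budget["themed chapters (total)"] = (
        chapter_low * n_chapters,
        chapter_high * n_chapters,
    )
    return budget


if __name__ == "__main__":
    # Hypothetical project: 10,000 words with three themed chapters.
    for section, (low, high) in word_budget(10_000, 3).items():
        print(f"{section}: {low}-{high} words")
```

Treat the ranges as bounds rather than targets: with four or more chapters the upper ends add up to more than the word limit, and any separate literature review also needs its share.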

Some conclusions will also include recommendations for practice or ideas for further research. Check with your supervisor to see if they are expecting either or both of these.

Appendices

  • Questionnaires
  • Transcriptions
  • Correspondence

If you have information that you would like to include but find that it disrupts the main body of text because it is too cumbersome, or would distract from the main arguments of your dissertation, it can be included in the appendix section. Each appendix should be focused on one item.

Appendices should not include any information that is key to your topic or overall argument.

Reference list

It is good practice to develop a reference list whilst writing the project, rather than leaving it until the end. This prevents a lot of searching around trying to remember where you accessed a particular source. If using primary sources, it also allows you to monitor the balance between primary and secondary sources included in the project. Software is available to help manage your references; the university officially supports RefWorks and EndNote.

For more advice on reference management, see our Skills Guide: Referencing Software

  • Last Updated: Apr 9, 2024 3:41 PM
  • URL: https://libguides.hull.ac.uk/dissertations


How to Write a Research Proposal | Examples & Templates

Published on October 12, 2022 by Shona McCombes and Tegan George. Revised on November 21, 2023.

Structure of a research proposal

A research proposal describes what you will investigate, why it’s important, and how you will conduct your research.

The format of a research proposal varies between fields, but most proposals will contain at least these elements:

  • Introduction
  • Literature review
  • Research design
  • Reference list

While the sections may vary, the overall objective is always the same. A research proposal serves as a blueprint and guide for your research plan, helping you get organized and feel confident in the path forward you choose to take.


Academics often have to write research proposals to get funding for their projects. As a student, you might have to write a research proposal as part of a grad school application, or prior to starting your thesis or dissertation.

In addition to helping you figure out what your research can look like, a proposal can also serve to demonstrate why your project is worth pursuing to a funder, educational institution, or supervisor.

Research proposal length

The length of a research proposal can vary quite a bit. A bachelor’s or master’s thesis proposal can be just a few pages, while proposals for PhD dissertations or research funding are usually much longer and more detailed. Your supervisor can help you determine the best length for your work.

One trick to get started is to think of your proposal’s structure as a shorter version of your thesis or dissertation, only without the results, conclusion and discussion sections.


Writing a research proposal can be quite challenging, but a good starting point could be to look at some examples. We’ve included a few for you below.

  • Example research proposal #1: “A Conceptual Framework for Scheduling Constraint Management”
  • Example research proposal #2: “Medical Students as Mediators of Change in Tobacco Use”

Like your dissertation or thesis, the proposal will usually have a title page that includes:

  • The proposed title of your project
  • Your supervisor’s name
  • Your institution and department

The first part of your proposal is the initial pitch for your project. Make sure it succinctly explains what you want to do and why.

Your introduction should:

  • Introduce your topic
  • Give necessary background and context
  • Outline your problem statement and research questions

To guide your introduction, include information about:

  • Who could have an interest in the topic (e.g., scientists, policymakers)
  • How much is already known about the topic
  • What is missing from this current knowledge
  • What new insights your research will contribute
  • Why you believe this research is worth doing


As you get started, it’s important to demonstrate that you’re familiar with the most important research on your topic. A strong literature review shows your reader that your project has a solid foundation in existing knowledge or theory. It also shows that you’re not simply repeating what other people have already done or said, but rather using existing research as a jumping-off point for your own.

In this section, share exactly how your project will contribute to ongoing conversations in the field by:

  • Comparing and contrasting the main theories, methods, and debates
  • Examining the strengths and weaknesses of different approaches
  • Explaining how you will build on, challenge, or synthesize prior scholarship

Following the literature review, restate your main objectives. This brings the focus back to your own project. Next, your research design or methodology section will describe your overall approach, and the practical steps you will take to answer your research questions.

To finish your proposal on a strong note, explore the potential implications of your research for your field. Emphasize again what you aim to contribute and why it matters.

For example, your results might have implications for:

  • Improving best practices
  • Informing policymaking decisions
  • Strengthening a theory or model
  • Challenging popular or scientific beliefs
  • Creating a basis for future research

Last but not least, your research proposal must include correct citations for every source you have used, compiled in a reference list. To create citations quickly and easily, you can use our free APA citation generator.

Some institutions or funders require a detailed timeline of the project, asking you to forecast what you will do at each stage and how long it may take. A timeline is not always required, so be sure to check the requirements of your project.


If you are applying for research funding, chances are you will have to include a detailed budget. This shows your estimates of how much each part of your project will cost.

Make sure to check what type of costs the funding body will agree to cover. For each item, include:

  • Cost : exactly how much money do you need?
  • Justification : why is this cost necessary to complete the research?
  • Source : how did you calculate the amount?

To determine your budget, think about:

  • Travel costs : do you need to go somewhere to collect your data? How will you get there, and how much time will you need? What will you do there (e.g., interviews, archival research)?
  • Materials : do you need access to any tools or technologies?
  • Help : do you need to hire any research assistants for the project? What will they do, and how much will you pay them?
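The three fields listed for each budget item (cost, justification, source) map naturally onto a small record type. The following is a minimal sketch in Python, not a required format: the class name `BudgetItem`, the helper `total_cost`, and every item and amount in the example are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class BudgetItem:
    name: str
    cost: float          # exactly how much money is needed
    justification: str   # why the cost is necessary to complete the research
    source: str          # how the amount was calculated


def total_cost(items):
    """Sum the cost of all budget items."""
    return sum(item.cost for item in items)


if __name__ == "__main__":
    # Placeholder entries for illustration only.
    items = [
        BudgetItem("Travel", 300.0, "Archive visit for data collection",
                   "Two return train tickets at quoted fares"),
        BudgetItem("Research assistant", 450.0, "Transcription of interviews",
                   "30 hours at an assumed hourly rate of 15.00"),
    ]
    for item in items:
        print(f"{item.name}: {item.cost:.2f} ({item.justification}; {item.source})")
    print(f"Total: {total_cost(items):.2f}")
```

Keeping the justification and source next to each amount makes it easier to check the budget against what the funding body agrees to cover.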


Once you’ve decided on your research objectives, you need to explain them in your paper, at the end of your problem statement.

Keep your research objectives clear and concise, and use appropriate verbs to accurately convey the work that you will carry out for each one.

I will compare …

A research aim is a broad statement indicating the general purpose of your research project. It should appear in your introduction at the end of your problem statement, before your research objectives.

Research objectives are more specific than your research aim. They indicate the specific ways you’ll address the overarching aim.

A PhD, which is short for philosophiae doctor (doctor of philosophy in Latin), is the highest university degree that can be obtained. In a PhD, students spend 3–5 years writing a dissertation, which aims to make a significant, original contribution to current knowledge.

A PhD is intended to prepare students for a career as a researcher, whether that be in academia, the public sector, or the private sector.

A master’s is a 1- or 2-year graduate degree that can prepare you for a variety of careers.

All master’s involve graduate-level coursework. Some are research-intensive and intend to prepare students for further study in a PhD; these usually require their students to write a master’s thesis. Others focus on professional training for a specific career.

Critical thinking refers to the ability to evaluate information and to be aware of biases or assumptions, including your own.

Like information literacy, it involves evaluating arguments, identifying and solving problems in an objective and systematic way, and clearly communicating your ideas.

The best way to remember the difference between a research plan and a research proposal is that they have fundamentally different audiences. A research plan helps you, the researcher, organize your thoughts. On the other hand, a dissertation proposal or research proposal aims to convince others (e.g., a supervisor, a funding body, or a dissertation committee) that your research topic is relevant and worthy of being conducted.


McCombes, S. & George, T. (2023, November 21). How to Write a Research Proposal | Examples & Templates. Scribbr. Retrieved April 12, 2024, from https://www.scribbr.com/research-process/research-proposal/


What is a Literature Review? | Guide, Template, & Examples

Published on 22 February 2022 by Shona McCombes. Revised on 7 June 2022.

What is a literature review? A literature review is a survey of scholarly sources on a specific topic. It provides an overview of current knowledge, allowing you to identify relevant theories, methods, and gaps in the existing research.

There are five key steps to writing a literature review:

  • Search for relevant literature
  • Evaluate sources
  • Identify themes, debates and gaps
  • Outline the structure
  • Write your literature review

A good literature review doesn’t just summarise sources – it analyses, synthesises, and critically evaluates to give a clear picture of the state of knowledge on the subject.



When you write a dissertation or thesis, you will have to conduct a literature review to situate your research within existing knowledge. The literature review gives you a chance to:

  • Demonstrate your familiarity with the topic and scholarly context
  • Develop a theoretical framework and methodology for your research
  • Position yourself in relation to other researchers and theorists
  • Show how your dissertation addresses a gap or contributes to a debate

You might also have to write a literature review as a stand-alone assignment. In this case, the purpose is to evaluate the current state of research and demonstrate your knowledge of scholarly debates around a topic.

The content will look slightly different in each case, but the process of conducting a literature review follows the same steps. We’ve written a step-by-step guide that you can follow below.


Writing literature reviews can be quite challenging! A good starting point could be to look at some examples, depending on what kind of literature review you’d like to write.

  • Example literature review #1: “Why Do People Migrate? A Review of the Theoretical Literature” (Theoretical literature review about the development of economic migration theory from the 1950s to today.)
  • Example literature review #2: “Literature review as a research methodology: An overview and guidelines” (Methodological literature review about interdisciplinary knowledge acquisition and production.)
  • Example literature review #3: “The Use of Technology in English Language Learning: A Literature Review” (Thematic literature review about the effects of technology on language acquisition.)
  • Example literature review #4: “Learners’ Listening Comprehension Difficulties in English Language Learning: A Literature Review” (Chronological literature review about how the concept of listening skills has changed over time.)


Before you begin searching for literature, you need a clearly defined topic.

If you are writing the literature review section of a dissertation or research paper, you will search for literature related to your research objectives and questions.

If you are writing a literature review as a stand-alone assignment, you will have to choose a focus and develop a central question to direct your search. Unlike a dissertation research question, this question has to be answerable without collecting original data. You should be able to answer it based only on a review of existing publications.

Make a list of keywords

Start by creating a list of keywords related to your research topic. Include each of the key concepts or variables you’re interested in, and list any synonyms and related terms. You can add to this list if you discover new keywords in the process of your literature search.

  • Social media, Facebook, Instagram, Twitter, Snapchat, TikTok
  • Body image, self-perception, self-esteem, mental health
  • Generation Z, teenagers, adolescents, youth

Search for relevant sources

Use your keywords to begin searching for sources. Some databases to search for journals and articles include:

  • Your university’s library catalogue
  • Google Scholar
  • Project Muse (humanities and social sciences)
  • Medline (life sciences and biomedicine)
  • EconLit (economics)
  • Inspec (physics, engineering and computer science)

You can use Boolean operators (AND, OR and NOT) to help narrow down your search.
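One common pattern is to join synonyms for the same concept with OR, join the different concepts with AND, and exclude unwanted terms with NOT. The Python sketch below builds such a search string from keyword groups like those listed earlier. The function names `quote_phrase` and `build_query` are hypothetical, and the exact operator syntax is an assumption: databases differ, so check the help pages of the one you are searching.

```python
def quote_phrase(term):
    """Wrap multi-word terms in double quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def build_query(concept_groups, exclude=()):
    """Join synonyms with OR, concepts with AND, and exclusions with NOT."""
    parts = [
        "(" + " OR ".join(quote_phrase(term) for term in group) + ")"
        for group in concept_groups
    ]
    query = " AND ".join(parts)
    for term in exclude:
        query += f" NOT {quote_phrase(term)}"
    return query


if __name__ == "__main__":
    groups = [
        ["social media", "Instagram", "TikTok"],
        ["body image", "self-esteem"],
        ["teenagers", "adolescents"],
    ]
    # The excluded term is an illustrative assumption, not part of the guide.
    print(build_query(groups, exclude=["adults"]))
```

Broadening a concept group with more synonyms widens the search, while adding another concept group or an exclusion narrows it.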

Read the abstract to find out whether an article is relevant to your question. When you find a useful book or article, you can check the bibliography to find other relevant sources.

To identify the most important publications on your topic, take note of recurring citations. If the same authors, books or articles keep appearing in your reading, make sure to seek them out.

You probably won’t be able to read absolutely everything that has been written on the topic – you’ll have to evaluate which sources are most relevant to your questions.

For each publication, ask yourself:

  • What question or problem is the author addressing?
  • What are the key concepts and how are they defined?
  • What are the key theories, models and methods? Does the research use established frameworks or take an innovative approach?
  • What are the results and conclusions of the study?
  • How does the publication relate to other literature in the field? Does it confirm, add to, or challenge established knowledge?
  • How does the publication contribute to your understanding of the topic? What are its key insights and arguments?
  • What are the strengths and weaknesses of the research?

Make sure the sources you use are credible, and make sure you read any landmark studies and major theories in your field of research.

You can find out how many times an article has been cited on Google Scholar – a high citation count means the article has been influential in the field, and should certainly be included in your literature review.

The scope of your review will depend on your topic and discipline: in the sciences you usually only review recent literature, but in the humanities you might take a long historical perspective (for example, to trace how a concept has changed in meaning over time).

Remember that you can use our template to summarise and evaluate sources you’re thinking about using!

Take notes and cite your sources

As you read, you should also begin the writing process. Take notes that you can later incorporate into the text of your literature review.

It’s important to keep track of your sources with references to avoid plagiarism. It can be helpful to make an annotated bibliography, where you compile full reference information and write a paragraph of summary and analysis for each source. This helps you remember what you read and saves time later in the process.
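If you keep your notes electronically, an annotated bibliography can be as simple as one record per source. The sketch below is one possible layout in Python, assuming you also tag each source with the themes it touches. The names `Annotation` and `group_by_theme` are hypothetical, and the references shown are placeholders rather than real publications.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Annotation:
    reference: str                 # full reference information
    summary: str                   # your paragraph of summary and analysis
    themes: list[str] = field(default_factory=list)


def group_by_theme(entries):
    """Map each theme to the references of the sources tagged with it."""
    grouped = defaultdict(list)
    for entry in entries:
        for theme in entry.themes:
            grouped[theme].append(entry.reference)
    return dict(grouped)


if __name__ == "__main__":
    # Placeholder entries for illustration only.
    notes = [
        Annotation("Author A (2020)", "Survey study; strong sample.", ["body image"]),
        Annotation("Author B (2021)", "Interview study.", ["body image", "self-esteem"]),
    ]
    for theme, references in group_by_theme(notes).items():
        print(f"{theme}: {', '.join(references)}")
```

Grouping sources by theme in this way can give you a head start when you later look for themes, debates and gaps across your reading.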

You can use our free APA Reference Generator for quick, correct, consistent citations.

To begin organising your literature review’s argument and structure, you need to understand the connections and relationships between the sources you’ve read. Based on your reading and notes, you can look for:

  • Trends and patterns (in theory, method or results): do certain approaches become more or less popular over time?
  • Themes: what questions or concepts recur across the literature?
  • Debates, conflicts and contradictions: where do sources disagree?
  • Pivotal publications: are there any influential theories or studies that changed the direction of the field?
  • Gaps: what is missing from the literature? Are there weaknesses that need to be addressed?

This step will help you work out the structure of your literature review and (if applicable) show how your own research will contribute to existing knowledge.

  • Most research has focused on young women.
  • There is an increasing interest in the visual aspects of social media.
  • But there is still a lack of robust research on highly-visual platforms like Instagram and Snapchat – this is a gap that you could address in your own research.

There are various approaches to organising the body of a literature review. You should have a rough idea of your strategy before you start writing.

Depending on the length of your literature review, you can combine several of these strategies (for example, your overall structure might be thematic, but each theme is discussed chronologically).

Chronological

The simplest approach is to trace the development of the topic over time. However, if you choose this strategy, be careful to avoid simply listing and summarising sources in order.

Try to analyse patterns, turning points and key debates that have shaped the direction of the field. Give your interpretation of how and why certain developments occurred.

Thematic

If you have found some recurring central themes, you can organise your literature review into subsections that address different aspects of the topic.

For example, if you are reviewing literature about inequalities in migrant health outcomes, key themes might include healthcare policy, language barriers, cultural attitudes, legal status, and economic access.

Methodological

If you draw your sources from different disciplines or fields that use a variety of research methods, you might want to compare the results and conclusions that emerge from different approaches. For example:

  • Look at what results have emerged in qualitative versus quantitative research
  • Discuss how the topic has been approached by empirical versus theoretical scholarship
  • Divide the literature into sociological, historical, and cultural sources

Theoretical

A literature review is often the foundation for a theoretical framework. You can use it to discuss various theories, models, and definitions of key concepts.

You might argue for the relevance of a specific theoretical approach, or combine various theoretical concepts to create a framework for your research.

Like any other academic text, your literature review should have an introduction, a main body, and a conclusion. What you include in each depends on the objective of your literature review.

The introduction should clearly establish the focus and purpose of the literature review.

If you are writing the literature review as part of your dissertation or thesis, reiterate your central problem or research question and give a brief summary of the scholarly context. You can emphasise the timeliness of the topic (“many recent studies have focused on the problem of x”) or highlight a gap in the literature (“while there has been much research on x, few researchers have taken y into consideration”).

Depending on the length of your literature review, you might want to divide the body into subsections. You can use a subheading for each theme, time period, or methodological approach.

As you write, make sure to follow these tips:

  • Summarise and synthesise: give an overview of the main points of each source and combine them into a coherent whole.
  • Analyse and interpret: don’t just paraphrase other researchers – add your own interpretations, discussing the significance of findings in relation to the literature as a whole.
  • Critically evaluate: mention the strengths and weaknesses of your sources.
  • Write in well-structured paragraphs: use transitions and topic sentences to draw connections, comparisons and contrasts.

In the conclusion, you should summarise the key findings you have taken from the literature and emphasise their significance.

If the literature review is part of your dissertation or thesis, reiterate how your research addresses gaps and contributes new knowledge, or discuss how you have drawn on existing theories and methods to build a framework for your research. This can lead directly into your methodology section.

A literature review is a survey of scholarly sources (such as books, journal articles, and theses) related to a specific topic or research question.

It is often written as part of a dissertation, thesis, research paper, or proposal.

There are several reasons to conduct a literature review at the beginning of a research project:

  • To familiarise yourself with the current state of knowledge on your topic
  • To ensure that you’re not just repeating what others have already done
  • To identify gaps in knowledge and unresolved problems that your research can address
  • To develop your theoretical framework and methodology
  • To provide an overview of the key findings and debates on the topic

Writing the literature review shows your reader how your work relates to existing research and what new insights it will contribute.

The literature review usually comes near the beginning of your dissertation. After the introduction, it grounds your research in a scholarly field and leads directly to your theoretical framework or methodology.

McCombes, S. (2022, June 07). What is a Literature Review? | Guide, Template, & Examples. Scribbr. Retrieved 9 April 2024, from https://www.scribbr.co.uk/thesis-dissertation/literature-review/

Harvey Cushing/John Hay Whitney Medical Library

YSN Doctoral Programs: Steps in Conducting a Literature Review

What is a literature review?

A literature review is an integrated analysis -- not just a summary -- of scholarly writings and other relevant evidence related directly to your research question. That is, it represents a synthesis of the evidence that provides background information on your topic and shows an association between the evidence and your research question.

A literature review may be a stand-alone work or the introduction to a larger research paper, depending on the assignment. Rely heavily on the guidelines your instructor has given you.

Why is it important?

A literature review is important because it:

  • Explains the background of research on a topic.
  • Demonstrates why a topic is significant to a subject area.
  • Discovers relationships between research studies/ideas.
  • Identifies major themes, concepts, and researchers on a topic.
  • Identifies critical gaps and points of disagreement.
  • Discusses further research questions that logically come out of the previous studies.

1. Choose a topic. Define your research question.

Your literature review should be guided by your central research question.  The literature represents background and research developments related to a specific research question, interpreted and analyzed by you in a synthesized way.

  • Make sure your research question is not too broad or too narrow.  Is it manageable?
  • Begin writing down terms that are related to your question. These will be useful for searches later.
  • If you have the opportunity, discuss your topic with your professor and your classmates.

2. Decide on the scope of your review

How many studies do you need to look at? How comprehensive should it be? How many years should it cover? 

  • This may depend on your assignment.  How many sources does the assignment require?

3. Select the databases you will use to conduct your searches.

Make a list of the databases you will search. 

Where to find databases:

  • Use the tabs on this guide
  • Find other databases in the Nursing Information Resources web page
  • More on the Medical Library web page
  • ... and more on the Yale University Library web page

4. Conduct your searches to find the evidence. Keep track of your searches.

  • Use the key words in your question, as well as synonyms for those words, as terms in your search. Use the database tutorials for help.
  • Save the searches in the databases. This saves time when you want to redo, or modify, the searches. It is also helpful to use as a guide if the searches are not finding any useful results.
  • Review the abstracts of research studies carefully. This will save you time.
  • Use the bibliographies and references of research studies you find to locate others.
  • Check with your professor, or a subject expert in the field, to find out whether you are missing any key works in the field.
  • Ask your librarian for help at any time.
  • Use a citation manager, such as EndNote, as the repository for your citations. See the EndNote tutorials for help.
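
The advice above about combining key words with their synonyms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_query` and the example concepts are hypothetical, and although many databases accept Boolean operators such as AND and OR, the exact syntax varies, so check each database's tutorial before pasting a query in.

```python
def build_query(concepts):
    """Combine synonym groups into one Boolean search string.

    `concepts` is a list of lists: each inner list holds one key term
    plus its synonyms. Synonyms are joined with OR, concepts with AND.
    """
    groups = []
    for synonyms in concepts:
        # Quote multi-word phrases so they are searched as a unit.
        terms = [f'"{t}"' if " " in t else t for t in synonyms]
        groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


# Hypothetical research question: does nurse staffing affect patient falls?
concepts = [
    ["nurse staffing", "nurse-patient ratio"],
    ["falls", "accidental falls"],
]
query = build_query(concepts)
print(query)
```

Keeping the concept lists in a file alongside your saved searches makes it easier to rerun or widen a search later, for example by adding one more synonym to a group.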

5. Review the literature.

Some questions to help you analyze the research:

  • What was the research question of the study you are reviewing? What were the authors trying to discover?
  • Was the research funded by a source that could influence the findings?
  • What were the research methodologies? Analyze its literature review, the samples and variables used, the results, and the conclusions.
  • Does the research seem to be complete? Could it have been conducted more soundly? What further questions does it raise?
  • If there are conflicting studies, why do you think that is?
  • How are the authors viewed in the field? Has this study been cited? If so, how has it been analyzed?

Tips: 

  • Review the abstracts carefully.  
  • Keep careful notes so that you may track your thought processes during the research process.
  • Create a matrix of the studies for easy analysis, and synthesis, across all of the studies.
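
One possible way to keep the matrix of studies suggested in the tips above is a simple table with one row per study. The sketch below uses only Python's standard library; the column names, the helper functions `matrix_to_csv` and `group_by`, and the three placeholder entries are all assumptions for illustration, not a prescribed format.

```python
import csv
import io

# Hypothetical columns; adapt them to the questions you ask of each study.
FIELDS = ["citation", "research_question", "method", "sample",
          "key_findings", "limitations"]


def matrix_to_csv(studies):
    """Return the study matrix as CSV text, one row per study."""
    buffer = io.StringIO()
    # restval="" leaves a blank cell for anything not yet filled in.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(studies)
    return buffer.getvalue()


def group_by(studies, field):
    """Group citations by a shared value (e.g. method) to aid synthesis."""
    groups = {}
    for study in studies:
        groups.setdefault(study.get(field, ""), []).append(study["citation"])
    return groups


# Placeholder entries, not real studies.
studies = [
    {"citation": "Author A (2020)", "method": "survey", "sample": "n=120"},
    {"citation": "Author B (2021)", "method": "interview", "sample": "n=15"},
    {"citation": "Author C (2022)", "method": "survey", "sample": "n=300"},
]
print(matrix_to_csv(studies))
print(group_by(studies, "method"))
```

The CSV text can be saved and opened in a spreadsheet, and grouping by a column such as method or theme gives a quick view of which studies belong together when you start to synthesize.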
  • Last Updated: Jan 4, 2024 10:52 AM
  • URL: https://guides.library.yale.edu/YSNDoctoral

Organizing Your Social Sciences Research Paper

  • 5. The Literature Review

A literature review surveys prior research published in books, scholarly articles, and any other sources relevant to a particular issue, area of research, or theory, and by so doing, provides a description, summary, and critical evaluation of these works in relation to the research problem being investigated. Literature reviews are designed to provide an overview of sources you have used in researching a particular topic and to demonstrate to your readers how your research fits within existing scholarship about the topic.

Fink, Arlene. Conducting Research Literature Reviews: From the Internet to Paper. Fourth edition. Thousand Oaks, CA: SAGE, 2014.

Importance of a Good Literature Review

A literature review may consist of simply a summary of key sources, but in the social sciences, a literature review usually has an organizational pattern and combines both summary and synthesis, often within specific conceptual categories. A summary is a recap of the important information of the source, but a synthesis is a re-organization, or a reshuffling, of that information in a way that informs how you are planning to investigate a research problem. The analytical features of a literature review might:

  • Give a new interpretation of old material or combine new with old interpretations,
  • Trace the intellectual progression of the field, including major debates,
  • Depending on the situation, evaluate the sources and advise the reader on the most pertinent or relevant research, or
  • Usually in the conclusion of a literature review, identify where gaps exist in how a problem has been researched to date.

Given this, the purpose of a literature review is to:

  • Place each work in the context of its contribution to understanding the research problem being studied.
  • Describe the relationship of each work to the others under consideration.
  • Identify new ways to interpret prior research.
  • Reveal any gaps that exist in the literature.
  • Resolve conflicts amongst seemingly contradictory previous studies.
  • Identify areas of prior scholarship to prevent duplication of effort.
  • Point the way in fulfilling a need for additional research.
  • Locate your own research within the context of existing literature [very important].

Fink, Arlene. Conducting Research Literature Reviews: From the Internet to Paper. 2nd ed. Thousand Oaks, CA: Sage, 2005; Hart, Chris. Doing a Literature Review: Releasing the Social Science Research Imagination. Thousand Oaks, CA: Sage Publications, 1998; Jesson, Jill. Doing Your Literature Review: Traditional and Systematic Techniques. Los Angeles, CA: SAGE, 2011; Knopf, Jeffrey W. "Doing a Literature Review." PS: Political Science and Politics 39 (January 2006): 127-132; Ridley, Diana. The Literature Review: A Step-by-Step Guide for Students. 2nd ed. Los Angeles, CA: SAGE, 2012.

Types of Literature Reviews

It is important to think of knowledge in a given field as consisting of three layers. First, there are the primary studies that researchers conduct and publish. Second are the reviews of those studies that summarize and offer new interpretations built from and often extending beyond the primary studies. Third, there are the perceptions, conclusions, opinion, and interpretations that are shared informally among scholars that become part of the body of epistemological traditions within the field.

In composing a literature review, it is important to note that it is often this third layer of knowledge that is cited as "true" even though it often has only a loose relationship to the primary studies and secondary literature reviews. Given this, while literature reviews are designed to provide an overview and synthesis of pertinent sources you have explored, there are a number of approaches you could adopt depending upon the type of analysis underpinning your study.

Argumentative Review

This form examines literature selectively in order to support or refute an argument, deeply embedded assumption, or philosophical problem already established in the literature. The purpose is to develop a body of literature that establishes a contrarian viewpoint. Given the value-laden nature of some social science research [e.g., educational reform; immigration control], argumentative approaches to analyzing the literature can be a legitimate and important form of discourse. However, note that they can also introduce problems of bias when they are used to make summary claims of the sort found in systematic reviews [see below].

Integrative Review

Considered a form of research that reviews, critiques, and synthesizes representative literature on a topic in an integrated way such that new frameworks and perspectives on the topic are generated. The body of literature includes all studies that address related or identical hypotheses or research problems. A well-done integrative review meets the same standards as primary research in regard to clarity, rigor, and replication. This is the most common form of review in the social sciences.

Historical Review

Few things rest in isolation from historical precedent. Historical literature reviews focus on examining research throughout a period of time, often starting with the first time an issue, concept, theory, or phenomenon emerged in the literature, then tracing its evolution within the scholarship of a discipline. The purpose is to place research in a historical context to show familiarity with state-of-the-art developments and to identify the likely directions for future research.

Methodological Review

A review does not always focus on what someone said [findings], but on how they came to say it [method of analysis]. Reviewing methods of analysis provides a framework of understanding at different levels [i.e. those of theory, substantive fields, research approaches, and data collection and analysis techniques] and shows how researchers draw upon a wide variety of knowledge, ranging from the conceptual level to practical documents for use in fieldwork, in the areas of ontological and epistemological consideration, quantitative and qualitative integration, sampling, interviewing, data collection, and data analysis. This approach helps highlight ethical issues which you should be aware of and consider as you go through your own study.

Systematic Review

This form consists of an overview of existing evidence pertinent to a clearly formulated research question, which uses pre-specified and standardized methods to identify and critically appraise relevant research, and to collect, report, and analyze data from the studies that are included in the review. The goal is to deliberately document, critically evaluate, and summarize scientifically all of the research about a clearly defined research problem. Typically it focuses on a very specific empirical question, often posed in a cause-and-effect form, such as "To what extent does A contribute to B?" This type of literature review is primarily applied to examining prior research studies in clinical medicine and allied health fields, but it is increasingly being used in the social sciences.

Theoretical Review

The purpose of this form is to examine the corpus of theory that has accumulated in regard to an issue, concept, theory, or phenomenon. The theoretical literature review helps to establish what theories already exist, the relationships between them, and to what degree the existing theories have been investigated, and to develop new hypotheses to be tested. Often this form is used to help establish a lack of appropriate theories or reveal that current theories are inadequate for explaining new or emerging research problems. The unit of analysis can focus on a theoretical concept or a whole theory or framework.

NOTE: Most often the literature review will incorporate some combination of types. For example, a review that examines literature supporting or refuting an argument, assumption, or philosophical problem related to the research problem will also need to include writing supported by sources that establish the history of these arguments in the literature.

Baumeister, Roy F. and Mark R. Leary. "Writing Narrative Literature Reviews." Review of General Psychology 1 (September 1997): 311-320; Fink, Arlene. Conducting Research Literature Reviews: From the Internet to Paper. 2nd ed. Thousand Oaks, CA: Sage, 2005; Hart, Chris. Doing a Literature Review: Releasing the Social Science Research Imagination. Thousand Oaks, CA: Sage Publications, 1998; Kennedy, Mary M. "Defining a Literature." Educational Researcher 36 (April 2007): 139-147; Petticrew, Mark and Helen Roberts. Systematic Reviews in the Social Sciences: A Practical Guide. Malden, MA: Blackwell Publishers, 2006; Torraco, Richard. "Writing Integrative Literature Reviews: Guidelines and Examples." Human Resource Development Review 4 (September 2005): 356-367; Rocco, Tonette S. and Maria S. Plakhotnik. "Literature Reviews, Conceptual Frameworks, and Theoretical Frameworks: Terms, Functions, and Distinctions." Human Resource Development Review 8 (March 2008): 120-130; Sutton, Anthea. Systematic Approaches to a Successful Literature Review. Los Angeles, CA: Sage Publications, 2016.

Structure and Writing Style

I.  Thinking About Your Literature Review

The structure of a literature review should include the following in support of understanding the research problem:

  • An overview of the subject, issue, or theory under consideration, along with the objectives of the literature review,
  • Division of works under review into themes or categories [e.g. works that support a particular position, those against, and those offering alternative approaches entirely],
  • An explanation of how each work is similar to and how it varies from the others,
  • Conclusions as to which pieces are best argued, are most convincing, and make the greatest contribution to the understanding and development of their area of research.

The critical evaluation of each work should consider:

  • Provenance -- what are the author's credentials? Are the author's arguments supported by evidence [e.g. primary historical material, case studies, narratives, statistics, recent scientific findings]?
  • Methodology -- were the techniques used to identify, gather, and analyze the data appropriate to addressing the research problem? Was the sample size appropriate? Were the results effectively interpreted and reported?
  • Objectivity -- is the author's perspective even-handed or prejudicial? Is contrary data considered or is certain pertinent information ignored to prove the author's point?
  • Persuasiveness -- which of the author's theses are most convincing or least convincing?
  • Validity -- are the author's arguments and conclusions convincing? Does the work ultimately contribute in any significant way to an understanding of the subject?

II.  Development of the Literature Review

Four Basic Stages of Writing

1. Problem formulation -- which topic or field is being examined and what are its component issues?
2. Literature search -- finding materials relevant to the subject being explored.
3. Data evaluation -- determining which literature makes a significant contribution to the understanding of the topic.
4. Analysis and interpretation -- discussing the findings and conclusions of pertinent literature.

Consider the following issues before writing the literature review:

Clarify

If your assignment is not specific about what form your literature review should take, seek clarification from your professor by asking these questions:

1. Roughly how many sources would be appropriate to include?
2. What types of sources should I review (books, journal articles, websites; scholarly versus popular sources)?
3. Should I summarize, synthesize, or critique sources by discussing a common theme or issue?
4. Should I evaluate the sources in any way beyond evaluating how they relate to understanding the research problem?
5. Should I provide subheadings and other background information, such as definitions and/or a history?

Find Models

Use the exercise of reviewing the literature to examine how authors in your discipline or area of interest have composed their literature review sections. Read them to get a sense of the types of themes you might want to look for in your own research or to identify ways to organize your final review. The bibliography or reference section of sources you've already read, such as required readings in the course syllabus, are also excellent entry points into your own research.

Narrow the Topic

The narrower your topic, the easier it will be to limit the number of sources you need to read in order to obtain a good survey of relevant resources. Your professor will probably not expect you to read everything that's available about the topic, but you'll make the act of reviewing easier if you first limit the scope of the research problem. A good strategy is to begin by searching the USC Libraries Catalog for recent books about the topic and review the table of contents for chapters that focus on specific issues. You can also review the indexes of books to find references to specific issues that can serve as the focus of your research. For example, a book surveying the history of the Israeli-Palestinian conflict may include a chapter on the role Egypt has played in mediating the conflict, or you can look in the index for the pages where Egypt is mentioned in the text.

Consider Whether Your Sources are Current

Some disciplines require that you use information that is as current as possible. This is particularly true in disciplines in medicine and the sciences where research conducted becomes obsolete very quickly as new discoveries are made. However, when writing a review in the social sciences, a survey of the history of the literature may be required. In other words, a complete understanding of the research problem requires you to deliberately examine how knowledge and perspectives have changed over time. Sort through other current bibliographies or literature reviews in the field to get a sense of what your discipline expects. You can also use this method to explore what is considered by scholars to be a "hot topic" and what is not.

III.  Ways to Organize Your Literature Review

Chronology of Events

If your review follows the chronological method, you could write about the materials according to when they were published. This approach should only be followed if a clear path of research building on previous research can be identified and these trends follow a clear chronological order of development. An example would be a literature review that focuses on continuing research about the emergence of German economic power after the fall of the Soviet Union.

By Publication

Order your sources by publication chronology only if the order demonstrates a more important trend. For instance, you could order a review of literature on environmental studies of brown fields if the progression revealed, for example, a change in the soil collection practices of the researchers who wrote and/or conducted the studies.

Thematic [“conceptual categories”]

A thematic literature review is the most common approach to summarizing prior research in the social and behavioral sciences. Thematic reviews are organized around a topic or issue, rather than the progression of time, although the progression of time may still be incorporated into a thematic review. For example, a review of the Internet’s impact on American presidential politics could focus on the development of online political satire. While the study focuses on one topic, the Internet’s impact on American presidential politics, it would still be organized chronologically reflecting technological developments in media. The difference in this example between a "chronological" and a "thematic" approach is what is emphasized the most: themes related to the role of the Internet in presidential politics. Note that more authentic thematic reviews tend to break away from chronological order. A review organized in this manner would shift between time periods within each section according to the point being made.

Methodological

A methodological approach focuses on the methods utilized by the researcher. For the Internet in American presidential politics project, one methodological approach would be to look at cultural differences between the portrayal of American presidents on American, British, and French websites. Or the review might focus on the fundraising impact of the Internet on a particular political party. A methodological scope will influence either the types of documents in the review or the way in which these documents are discussed.

Other Sections of Your Literature Review

Once you've decided on the organizational method for your literature review, the sections you need to include in the paper should be easy to figure out because they arise from your organizational strategy. In other words, a chronological review would have subsections for each vital time period; a thematic review would have subtopics based upon factors that relate to the theme or issue. However, sometimes you may need to add additional sections that are necessary for your study, but do not fit in the organizational strategy of the body. What other sections you include in the body is up to you. However, only include what is necessary for the reader to locate your study within the larger scholarship about the research problem.

Here are examples of other sections, usually in the form of a single paragraph, you may need to include depending on the type of review you write:

  • Current Situation : Information necessary to understand the current topic or focus of the literature review.
  • Sources Used : Describes the methods and resources [e.g., databases] you used to identify the literature you reviewed.
  • History : The chronological progression of the field, the research literature, or an idea that is necessary to understand the literature review, if the body of the literature review is not already a chronology.
  • Selection Methods : Criteria you used to select (and perhaps exclude) sources in your literature review. For instance, you might explain that your review includes only peer-reviewed [i.e., scholarly] sources.
  • Standards : Description of the way in which you present your information.
  • Questions for Further Research : What questions about the field has the review sparked? How will you further your research as a result of the review?

IV.  Writing Your Literature Review

Once you've settled on how to organize your literature review, you're ready to write each section. When writing your review, keep in mind these issues.

Use Evidence

A literature review section is, in this sense, just like any other academic research paper. Your interpretation of the available sources must be backed up with evidence [citations] that demonstrates that what you are saying is valid.

Be Selective

Select only the most important points in each source to highlight in the review. The type of information you choose to mention should relate directly to the research problem, whether it is thematic, methodological, or chronological. Related items that provide additional information, but that are not key to understanding the research problem, can be included in a list of further readings.

Use Quotes Sparingly

Some short quotes are appropriate if you want to emphasize a point, or if what an author stated cannot be easily paraphrased. Sometimes you may need to quote certain terminology that was coined by the author, is not common knowledge, or taken directly from the study. Do not use extensive quotes as a substitute for using your own words in reviewing the literature.

Summarize and Synthesize

Remember to summarize and synthesize your sources within each thematic paragraph as well as throughout the review. Recapitulate important features of a research study, but then synthesize it by rephrasing the study's significance and relating it to your own work and the work of others.

Keep Your Own Voice

While the literature review presents others' ideas, your voice [the writer's] should remain front and center. For example, weave references to other sources into what you are writing but maintain your own voice by starting and ending the paragraph with your own ideas and wording.

Use Caution When Paraphrasing

When paraphrasing a source that is not your own, be sure to represent the author's information or opinions accurately and in your own words. Even when paraphrasing an author’s work, you still must provide a citation to that work.

V.  Common Mistakes to Avoid

These are the most common mistakes made in reviewing social science research literature.

  • Including sources that do not clearly relate to the research problem;
  • Not taking sufficient time to define and identify the most relevant sources to use in the literature review related to the research problem;
  • Relying exclusively on secondary analytical sources rather than including relevant primary research studies or data;
  • Uncritically accepting another researcher's findings and interpretations as valid, rather than examining critically all aspects of the research design and analysis;
  • Not describing the search procedures that were used in identifying the literature to review;
  • Reporting isolated statistical results rather than synthesizing them using chi-squared or meta-analytic methods; and,
  • Only including research that validates assumptions and not considering contrary findings and alternative interpretations found in the literature.

Cook, Kathleen E. and Elise Murowchick. “Do Literature Review Skills Transfer from One Course to Another?” Psychology Learning and Teaching 13 (March 2014): 3-11; Fink, Arlene. Conducting Research Literature Reviews: From the Internet to Paper. 2nd ed. Thousand Oaks, CA: Sage, 2005; Hart, Chris. Doing a Literature Review: Releasing the Social Science Research Imagination. Thousand Oaks, CA: Sage Publications, 1998; Jesson, Jill. Doing Your Literature Review: Traditional and Systematic Techniques. London: SAGE, 2011; Literature Review Handout. Online Writing Center. Liberty University; Literature Reviews. The Writing Center. University of North Carolina; Onwuegbuzie, Anthony J. and Rebecca Frels. Seven Steps to a Comprehensive Literature Review: A Multimodal and Cultural Approach. Los Angeles, CA: SAGE, 2016; Ridley, Diana. The Literature Review: A Step-by-Step Guide for Students. 2nd ed. Los Angeles, CA: SAGE, 2012; Randolph, Justus J. “A Guide to Writing the Dissertation Literature Review.” Practical Assessment, Research, and Evaluation. vol. 14, June 2009; Sutton, Anthea. Systematic Approaches to a Successful Literature Review. Los Angeles, CA: Sage Publications, 2016; Taylor, Dena. The Literature Review: A Few Tips On Conducting It. University College Writing Centre. University of Toronto; Writing a Literature Review. Academic Skills Centre. University of Canberra.

Writing Tip

Break Out of Your Disciplinary Box!

Thinking interdisciplinarily about a research problem can be a rewarding exercise in applying new ideas, theories, or concepts to an old problem. For example, what might cultural anthropologists say about the continuing conflict in the Middle East? In what ways might geographers view the need for better distribution of social service agencies in large cities differently from how social workers might study the issue? Studies conducted in other fields should not replace a thorough review of the core research literature in your discipline. However, particularly in the social sciences, thinking about research problems from multiple vectors is a key strategy for finding new solutions to a problem or gaining a new perspective. Consult with a librarian about identifying research databases in other disciplines; almost every field of study has at least one comprehensive database devoted to indexing its research literature.

Frodeman, Robert. The Oxford Handbook of Interdisciplinarity. New York: Oxford University Press, 2010.

Another Writing Tip

Don't Just Review for Content!

While conducting a review of the literature, maximize the time you devote to writing this part of your paper by thinking broadly about what you should be looking for and evaluating. Review not just what scholars are saying, but how they are saying it. Some questions to ask:

  • How are they organizing their ideas?
  • What methods have they used to study the problem?
  • What theories have been used to explain, predict, or understand their research problem?
  • What sources have they cited to support their conclusions?
  • How have they used non-textual elements [e.g., charts, graphs, figures, etc.] to illustrate key points?

When you begin to write your literature review section, you'll be glad you dug deeper into how the research was designed and constructed because it establishes a means for developing more substantial analysis and interpretation of the research problem.

Hart, Chris. Doing a Literature Review: Releasing the Social Science Research Imagination. Thousand Oaks, CA: Sage Publications, 1998.

Yet Another Writing Tip

When Do I Know I Can Stop Looking and Move On?

Here are several strategies you can utilize to assess whether you've thoroughly reviewed the literature:

  • Look for repeating patterns in the research findings. If the same thing is being said, just by different people, then this likely demonstrates that the research problem has hit a conceptual dead end. At this point consider: Does your study extend current research? Does it forge a new path? Or, does it merely add more of the same thing being said?
  • Look at sources the authors cite in their work. If you begin to see the same researchers cited again and again, then this is often an indication that no new ideas have been generated to address the research problem.
  • Search Google Scholar to identify who has subsequently cited leading scholars already identified in your literature review [see next sub-tab]. This is called citation tracking and there are a number of sources that can help you identify who has cited whom, particularly scholars from outside of your discipline. Here again, if the same authors are being cited again and again, this may indicate no new literature has been written on the topic.

Onwuegbuzie, Anthony J. and Rebecca Frels. Seven Steps to a Comprehensive Literature Review: A Multimodal and Cultural Approach. Los Angeles, CA: Sage, 2016; Sutton, Anthea. Systematic Approaches to a Successful Literature Review. Los Angeles, CA: Sage Publications, 2016.

  • Last Updated: Apr 11, 2024 1:27 PM
  • URL: https://libguides.usc.edu/writingguide


Toward a framework for selecting indicators of measuring sustainability and circular economy in the agri-food sector: a systematic literature review

  • LIFE CYCLE SUSTAINABILITY ASSESSMENT
  • Published: 02 March 2022


  • Cecilia Silvestri (ORCID: orcid.org/0000-0003-2528-601X) 1,
  • Luca Silvestri (ORCID: orcid.org/0000-0002-6754-899X) 2,
  • Michela Piccarozzi (ORCID: orcid.org/0000-0001-9717-9462) 1 &
  • Alessandro Ruggieri 1


A Correction to this article was published on 24 March 2022

This article has been updated

The implementation of sustainability and circular economy (CE) models in agri-food production can promote resource efficiency, reduce environmental burdens, and ensure improved and socially responsible systems. In this context, indicators for the measurement of sustainability play a crucial role. Indicators can measure CE strategies aimed at preserving functions, products, components, materials, or embodied energy. Although there is a broad literature describing sustainability and CE indicators, no study offers a comprehensive framework of indicators for measuring sustainability and CE in the agri-food sector.

Starting from this central research gap, a systematic literature review was conducted to measure sustainability in the agri-food sector and, based on these findings, to understand how indicators are used and for which specific purposes.

The analysis of the results allowed us to classify the sample of articles into three main clusters (“Assessment-LCA,” “Best practice,” and “Decision-making”) and showed increasing attention to the three pillars of sustainability (triple bottom line). In this context, an integrated approach to indicators (environmental, social, and economic) offers the best solution to ensure an easier transition to sustainability.

Conclusions

The sample analysis facilitated the identification of new categories of impact that deserve attention, such as the cooperation among stakeholders in the supply chain and eco-innovation.


Source: Authors’ elaboration. Notes: The graph shows the temporal distribution of the articles under analysis

Source: Authors’ elaboration. Notes: The graph shows the time distribution of articles from the three major journals

Source: Authors’ elaboration. Notes: The graph shows the composition of the sample according to the three clusters identified by the analysis

Source: Authors’ elaboration. Notes: The graph shows the distribution of articles over time by cluster

Source: Authors’ elaboration. Notes: The graph shows the network visualization

Source: Authors’ elaboration. Notes: The graph shows the overlay visualization

Source: Authors’ elaboration. Notes: The graph shows the classification of articles by scientific field

Source: Authors’ elaboration. Notes: Article classification based on their cluster to which they belong and scientific field

Source: Authors’ elaboration

Source: Authors’ elaboration. Notes: The graph shows the distribution of items over time based on TBL

Source: Authors’ elaboration. Notes: The graph shows the Pareto diagram highlighting the most used indicators in literature for measuring sustainability in the agri-food sector

Source: Authors’ elaboration. Notes: The graph shows the distribution over time of articles divided into conceptual and empirical

Source: Authors’ elaboration. Notes: The graph shows the classification of articles, divided into conceptual and empirical, in-depth analysis

Source: Authors’ elaboration. Notes: The graph shows the geographical distribution of the authors

Source: Authors’ elaboration. Notes: The graph shows the distribution of authors according to the continent from which they originate

Source: Authors’ elaboration. Notes: The graph shows the time distribution of publication of authors according to the continent from which they originate

Source: Authors’ elaboration. Notes: Sustainability measurement indicators and impact categories of LCA, S-LCA, and LCC tools should be integrated in order to provide stakeholders with best practices as guidelines and tools to support both decision-making and measurement, according to the circular economy approach


Change history

24 March 2022

A Correction to this paper has been published: https://doi.org/10.1007/s11367-022-02038-9


Mayring P (2004) Forum : Qualitative social research Sozialforschung 2. History of content analysis. A Companion to Qual Res 1:159–176

McKelvey B (2002) Managing coevolutionary dynamics. In: 18th EGOS Conference. Barcelona, Spain, pp 1–21

McMichael AJ, Butler CD, Folke C (2003) New visions for addressing sustainability. Science (80- ) 302:1191–1920

Mehmood A, Ahmed S, Viza E et al (2021) Drivers and barriers towards circular economy in agri-food supply chain: a review. Bus Strateg Dev 1–17. https://doi.org/10.1002/bsd2.171

Mella P, Gazzola P (2011) Sustainability and quality of life: the development model. In: Kapounek S (ed) Enterprise and competitive environment. Mendel University: Brno, Czechia. 542–551

Merli R, Preziosi M, Acampora A (2018) How do scholars approach the circular economy ? A systematic literature review. J Clean Prod 178:703–722. https://doi.org/10.1016/j.jclepro.2017.12.112

Merli R, Preziosi M, Acampora A et al (2020) Recycled fibers in reinforced concrete: a systematic literature review. J Clean Prod 248:119207. https://doi.org/10.1016/j.jclepro.2019.119207

Miglietta PP, Morrone D (2018) Managing water sustainability: virtual water flows and economic water productivity assessment of the wine trade between Italy and the Balkans. Sustain 10. https://doi.org/10.3390/su10020543

Mitchell MGE, Chan KMA, Newlands NK, Ramankutty N (2020) Spatial correlations don’t predict changes in agricultural ecosystem services: a Canada-wide case study. Front Sustain Food Syst 4:1–17. https://doi.org/10.3389/fsufs.2020.539892

Moraga G, Huysveld S, Mathieux F et al (2019) Circular economy indicators: what do they measure?. Resour Conserv Recycl 146:452–461. https://doi.org/10.1016/j.resconrec.2019.03.045

Morrissey JE, Dunphy NP (2015) Towards sustainable agri-food systems: the role of integrated sustainability and value assessment across the supply-chain. Int J Soc Ecol Sustain Dev 6:41–58. https://doi.org/10.4018/IJSESD.2015070104

Moser G (2009) Quality of life and sustainability: toward person-environment congruity. J Environ Psychol 29:351–357. https://doi.org/10.1016/j.jenvp.2009.02.002

Muijs D (2010) Doing quantitative research in education with SPSS. London

Muller MF, Esmanioto F, Huber N, Loures ER (2019) A systematic literature review of interoperability in the green Building Information Modeling lifecycle. J Clean Prod 223:397–412. https://doi.org/10.1016/j.jclepro.2019.03.114

Muradin M, Joachimiak-Lechman K, Foltynowicz Z (2018) Evaluation of eco-efficiency of two alternative agricultural biogas plants. Appl Sci 8. https://doi.org/10.3390/app8112083

Naseer MA, ur R, Ashfaq M, Hassan S, et al (2019) Critical issues at the upstream level in sustainable supply chain management of agri-food industries: evidence from Pakistan’s citrus industry. Sustain 11:1–19. https://doi.org/10.3390/su11051326

Nattassha R, Handayati Y, Simatupang TM, Siallagan M (2020) Understanding circular economy implementation in the agri-food supply chain: the case of an Indonesian organic fertiliser producer. Agric Food Secur 9:1–16. https://doi.org/10.1186/s40066-020-00264-8

Nazari-Sharabian M, Ahmad S, Karakouzian M (2018) Climate change and eutrophication: a short review. Eng Technol Appl Sci Res 8:3668–3672. https://doi.org/10.5281/zenodo.2532694

Nazir N (2017) Understanding life cycle thinking and its practical application to agri-food system. Int J Adv Sci Eng Inf Technol 7:1861–1870. https://doi.org/10.18517/ijaseit.7.5.3578

Negra C, Remans R, Attwood S et al (2020) Sustainable agri-food investments require multi-sector co-development of decision tools. Ecol Indic 110:105851. https://doi.org/10.1016/j.ecolind.2019.105851

Newsham KK, Robinson SA (2009) Responses of plants in polar regions to UVB exposure: a meta-analysis. Glob Chang Biol 15:2574–2589. https://doi.org/10.1111/j.1365-2486.2009.01944.x

Niemeijer D, de Groot RS (2008) A conceptual framework for selecting environmental indicator sets. Ecol Indic 8:14–25. https://doi.org/10.1016/j.ecolind.2006.11.012

Niero M, Kalbar PP (2019) Coupling material circularity indicators and life cycle based indicators: a proposal to advance the assessment of circular economy strategies at the product level. Resour Conserv Recycl 140:305–312. https://doi.org/10.1016/j.resconrec.2018.10.002

Nikolaou IE, Tsagarakis KP (2021) An introduction to circular economy and sustainability: some existing lessons and future directions. Sustain Prod Consum 28:600–609. https://doi.org/10.1016/j.spc.2021.06.017

Notarnicola B, Hayashi K, Curran MA, Huisingh D (2012) Progress in working towards a more sustainable agri-food industry. J Clean Prod 28:1–8. https://doi.org/10.1016/j.jclepro.2012.02.007

Notarnicola B, Tassielli G, Renzulli PA, Monforti F (2017) Energy flows and greenhouses gases of EU (European Union) national breads using an LCA (life cycle assessment) approach. J Clean Prod 140:455–469. https://doi.org/10.1016/j.jclepro.2016.05.150

Opferkuch K, Caeiro S, Salomone R, Ramos TB (2021) Circular economy in corporate sustainability reporting: a review of organisational approaches. Bus Strateg Environ 1–22. https://doi.org/10.1002/bse.2854

Padilla-Rivera A, do Carmo BBT, Arcese G, Merveille N, (2021) Social circular economy indicators: selection through fuzzy delphi method. Sustain Prod Consum 26:101–110. https://doi.org/10.1016/j.spc.2020.09.015

Pagotto M, Halog A (2016) Towards a circular economy in Australian agri-food industry: an application of input-output oriented approaches for analyzing resource efficiency and competitiveness potential. J Ind Ecol 20:1176–1186. https://doi.org/10.1111/jiec.12373

Parent G, Lavallée S (2011) LCA potentials and limits within a sustainable agri-food statutory framework. Global food insecurity. Springer, Netherlands, Dordrecht, pp 161–171

Chapter   Google Scholar  

Pattey E, Qiu G (2012) Trends in primary particulate matter emissions from Canadian agriculture. J Air Waste Manag Assoc 62:737–747. https://doi.org/10.1080/10962247.2012.672058

Pauliuk S (2018) Critical appraisal of the circular economy standard BS 8001:2017 and a dashboard of quantitative system indicators for its implementation in organizations. Resour Conserv Recycl 129:81–92. https://doi.org/10.1016/j.resconrec.2017.10.019

Peano C, Migliorini P, Sottile F (2014) A methodology for the sustainability assessment of agri-food systems: an application to the slow food presidia project. Ecol Soc 19. https://doi.org/10.5751/ES-06972-190424

Peano C, Tecco N, Dansero E et al (2015) Evaluating the sustainability in complex agri-food systems: the SAEMETH framework. Sustain 7:6721–6741. https://doi.org/10.3390/su7066721

Pearce DW, Turner RK (1990) Economics of natural resources and the environment. Harvester Wheatsheaf, Hemel Hempstead, Herts

Pelletier N (2018) Social sustainability assessment of Canadian egg production facilities: methods, analysis, and recommendations. Sustain 10:1–17. https://doi.org/10.3390/su10051601

Peña C, Civit B, Gallego-Schmid A et al (2021) Using life cycle assessment to achieve a circular economy. Int J Life Cycle Assess 26:215–220. https://doi.org/10.1007/s11367-020-01856-z

Perez Neira D (2016) Energy sustainability of Ecuadorian cacao export and its contribution to climate change. A case study through product life cycle assessment. J Clean Prod 112:2560–2568. https://doi.org/10.1016/j.jclepro.2015.11.003

Pérez-Neira D, Grollmus-Venegas A (2018) Life-cycle energy assessment and carbon footprint of peri-urban horticulture. A comparative case study of local food systems in Spain. Landsc Urban Plan 172:60–68. https://doi.org/10.1016/j.landurbplan.2018.01.001

Pérez-Pons ME, Plaza-Hernández M, Alonso RS et al (2021) Increasing profitability and monitoring environmental performance: a case study in the agri-food industry through an edge-iot platform. Sustain 13:1–16. https://doi.org/10.3390/su13010283

Petti L, Serreli M, Di Cesare S (2018) Systematic literature review in social life cycle assessment. Int J Life Cycle Assess 23:422–431. https://doi.org/10.1007/s11367-016-1135-4

Pieroni MPP, McAloone TC, Pigosso DCA (2019) Business model innovation for circular economy and sustainability: a review of approaches. J Clean Prod 215:198–216. https://doi.org/10.1016/j.jclepro.2019.01.036

Polit DF, Beck CT (2004) Nursing research: principles and methods. Lippincott Williams & Wilkins, Philadelphia, PA

Porkka M, Gerten D, Schaphoff S et al (2016) Causes and trends of water scarcity in food production. Environ Res Lett 11:015001. https://doi.org/10.1088/1748-9326/11/1/015001

Prajapati H, Kant R, Shankar R (2019) Bequeath life to death: state-of-art review on reverse logistics. J Clean Prod 211:503–520. https://doi.org/10.1016/j.jclepro.2018.11.187

Priyadarshini P, Abhilash PC (2020) Policy recommendations for enabling transition towards sustainable agriculture in India. Land Use Policy 96:104718. https://doi.org/10.1016/j.landusepol.2020.104718

Pronti A, Coccia M (2020) Multicriteria analysis of the sustainability performance between agroecological and conventional coffee farms in the East Region of Minas Gerais (Brazil). Renew Agric Food Syst. https://doi.org/10.1017/S1742170520000332

Rabadán A, González-Moreno A, Sáez-Martínez FJ (2019) Improving firms’ performance and sustainability: the case of eco-innovation in the agri-food industry. Sustain 11. https://doi.org/10.3390/su11205590

Raut RD, Luthra S, Narkhede BE et al (2019) Examining the performance oriented indicators for implementing green management practices in the Indian agro sector. J Clean Prod 215:926–943. https://doi.org/10.1016/j.jclepro.2019.01.139

Recanati F, Marveggio D, Dotelli G (2018) From beans to bar: a life cycle assessment towards sustainable chocolate supply chain. Sci Total Environ 613–614:1013–1023. https://doi.org/10.1016/j.scitotenv.2017.09.187

Redclift M (2005) Sustainable development (1987–2005): an oxymoron comes of age. Sustain Dev 13:212–227. https://doi.org/10.1002/sd.281

Rezaei M, Soheilifard F, Keshvari A (2021) Impact of agrochemical emission models on the environmental assessment of paddy rice production using life cycle assessment approach. Energy Sources. Part A Recover Util Environ Eff 1–16

Rigamonti L, Mancini E (2021) Life cycle assessment and circularity indicators. Int J Life Cycle Assess. https://doi.org/10.1007/s11367-021-01966-2

Risku-Norja H, Mäenpää I (2007) MFA model to assess economic and environmental consequences of food production and consumption. Ecol Econ 60:700–711. https://doi.org/10.1016/j.ecolecon.2006.05.001

Ritzén S, Sandström GÖ (2017) Barriers to the circular economy – integration of perspectives and domains. Procedia CIRP 64:7–12. https://doi.org/10.1016/j.procir.2017.03.005

Rockström J, Steffen W, Noone K et al (2009) A safe operating space for humanity. Nature 461:472–475. https://doi.org/10.1038/461472a

Roos Lindgreen E, Mondello G, Salomone R et al (2021) Exploring the effectiveness of grey literature indicators and life cycle assessment in assessing circular economy at the micro level: a comparative analysis. Int J Life Cycle Assess. https://doi.org/10.1007/s11367-021-01972-4

Roselli L, Casieri A, De Gennaro BC et al (2020) Environmental and economic sustainability of table grape production in Italy. Sustain 12.  https://doi.org/10.3390/su12093670

Ross RB, Pandey V, Ross KL (2015) Sustainability and strategy in U.S. agri-food firms: an assessment of current practices. Int Food Agribus Manag Rev 18:17–48

Royo P, Ferreira VJ, López-Sabirón AM, Ferreira G. (2016) Hybrid diagnosis to characterise the energy and environmental enhancement of photovoltaic modules using smart materials. Energy 101:174–189. https://doi.org/10.1016/j.energy.2016.01.101

Ruggerio CA (2021) Sustainability and sustainable development: a review of principles and definitions. Sci Total Environ 786:147481. https://doi.org/10.1016/j.scitotenv.2021.147481

Ruiz-Almeida A, Rivera-Ferre MG (2019) Internationally-based indicators to measure agri-food systems sustainability using food sovereignty as a conceptual framework. Food Secur 11:1321–1337. https://doi.org/10.1007/s12571-019-00964-5

Ryan M, Hennessy T, Buckley C et al (2016) Developing farm-level sustainability indicators for Ireland using the Teagasc National Farm Survey. Irish J Agric Food Res 55:112–125. https://doi.org/10.1515/ijafr-2016-0011

Saade MRM, Yahia A, Amor B (2020) How has LCA been applied to 3D printing ? A systematic literature review and recommendations for future studies. J Clean Prod 244:118803. https://doi.org/10.1016/j.jclepro.2019.118803

Saitone TL, Sexton RJ (2017) Agri-food supply chain: evolution and performance with conflicting consumer and societal demands. Eur Rev Agric Econ 44:634–657. https://doi.org/10.1093/erae/jbx003

Salim N, Ab Rahman MN, Abd Wahab D (2019) A systematic literature review of internal capabilities for enhancing eco-innovation performance of manufacturing firms. J Clean Prod 209:1445–1460. https://doi.org/10.1016/j.jclepro.2018.11.105

Salimi N (2021) Circular economy in agri-food systems BT - strategic decision making for sustainable management of industrial networks. In: International S (ed) Rezaei J. Publishing, Cham, pp 57–70

Salomone R, Ioppolo G (2012) Environmental impacts of olive oil production: a life cycle assessment case study in the province of Messina (Sicily). J Clean Prod 28:88–100. https://doi.org/10.1016/j.jclepro.2011.10.004

Sánchez AD, Río DMDLC, García JÁ (2017) Bibliometric analysis of publications on wine tourism in the databases Scopus and WoS. Eur Res Manag Bus Econ 23:8–15. https://doi.org/10.1016/j.iedeen.2016.02.001

Saputri VHL, Sutopo W, Hisjam M, Ma’aram A (2019) Sustainable agri-food supply chain performance measurement model for GMO and non-GMO using data envelopment analysis method. Appl Sci 9. https://doi.org/10.3390/app9061199

Sassanelli C, Rosa P, Rocca R, Terzi S (2019) Circular economy performance assessment methods : a systematic literature review. J Clean Prod 229:440–453. https://doi.org/10.1016/j.jclepro.2019.05.019

Schiefer S, Gonzalez C, Flanigan S (2015) More than just a factor in transition processes? The role of collaboration in agriculture. In: Sutherland LA, Darnhofer I, Wilson GA, Zagata L (eds) Transition pathways towards sustainability in agriculture: case studies from Europe, CPI Group. Croydon, UK, pp. 83

Seuring S, Muller M (2008) From a literature review to a conceptual framework for sustainable supply chain management. J Clean Prod 16:1699–1710. https://doi.org/10.1016/j.jclepro.2008.04.020

Silvestri C, Silvestri L, Forcina A, et al (2021) Green chemistry contribution towards more equitable global sustainability and greater circular economy: A systematic literature review. J Clean Prod 294. https://doi.org/10.1016/j.jclepro.2021.126137

Smetana S, Schmitt E, Mathys A (2019) Sustainable use of Hermetia illucens insect biomass for feed and food: attributional and consequential life cycle assessment. Resour Conserv Recycl 144:285–296. https://doi.org/10.1016/j.resconrec.2019.01.042

Sonesson U, Berlin J, Ziegler F (2010) Environmental assessment and management in the food industry: life cycle assessment and related approaches. Woodhead Publishing, Cambridge

Soussana JF (2014) Research priorities for sustainable agri-food systems and life cycle assessment. J Clean Prod 73:19–23. https://doi.org/10.1016/j.jclepro.2014.02.061

Soylu A, Oruç C, Turkay M et al (2006) Synergy analysis of collaborative supply chain management in energy systems using multi-period MILP. Eur J Oper Res 174:387–403. https://doi.org/10.1016/j.ejor.2005.02.042

Spaiser V, Ranganathan S, Swain RB, Sumpter DJ (2017) The sustainable development oxymoron: quantifying and modelling the incompatibility of sustainable development goals. Int J Sustain Dev World Ecol 24:457–470. https://doi.org/10.1080/13504509.2016.1235624

Stewart R, Niero M (2018) Circular economy in corporate sustainability strategies: a review of corporate sustainability reports in the fast-moving consumer goods sector. Bus Strateg Environ 27:1005–1022. https://doi.org/10.1002/bse.2048

Stillitano T, Spada E, Iofrida N et al (2021) Sustainable agri-food processes and circular economy pathways in a life cycle perspective: state of the art of applicative research. Sustain 13:1–29. https://doi.org/10.3390/su13052472

Stone J, Rahimifard S (2018) Resilience in agri-food supply chains: a critical analysis of the literature and synthesis of a novel framework. Supply Chain Manag 23:207–238. https://doi.org/10.1108/SCM-06-2017-0201

Strazza C, Del Borghi A, Gallo M, Del Borghi M (2011) Resource productivity enhancement as means for promoting cleaner production: analysis of co-incineration in cement plants through a life cycle approach. J Clean Prod 19:1615–1621. https://doi.org/10.1016/j.jclepro.2011.05.014

Su B, Heshmati A, Geng Y, Yu X (2013) A review of the circular economy in China: moving from rhetoric to implementation. J Clean Prod 42:215–227. https://doi.org/10.1016/j.jclepro.2012.11.020

Suárez-Eiroa B, Fernández E, Méndez-Martínez G, Soto-Oñate D (2019) Operational principles of circular economy for sustainable development: linking theory and practice. J Clean Prod 214:952–961. https://doi.org/10.1016/j.jclepro.2018.12.271

Svensson G, Wagner B (2015) Implementing and managing economic, social and environmental efforts of business sustainability. Manag Environ Qual an Int Journal 26:195–213. https://doi.org/10.1108/MEQ-09-2013-0099

Tasca AL, Nessi S, Rigamonti L (2017) Environmental sustainability of agri-food supply chains: an LCA comparison between two alternative forms of production and distribution of endive in northern Italy. J Clean Prod 140:725–741. https://doi.org/10.1016/j.jclepro.2016.06.170

Tassielli G, Notarnicola B, Renzulli PA, Arcese G (2018) Environmental life cycle assessment of fresh and processed sweet cherries in southern Italy. J Clean Prod 171:184–197. https://doi.org/10.1016/j.jclepro.2017.09.227

Teixeira R, Pax S (2011) A survey of life cycle assessment practitioners with a focus on the agri-food sector. J Ind Ecol 15:817–820. https://doi.org/10.1111/j.1530-9290.2011.00421.x

Tobergte DR, Curtis S (2013) ILCD Handbook. J Chem Info Model. https://doi.org/10.278/33030

Tortorella MM, Di Leo S, Cosmi C et al (2020) A methodological integrated approach to analyse climate change effects in agri-food sector: the TIMES water-energy-food module. Int J Environ Res Public Health 17:1–21. https://doi.org/10.3390/ijerph17217703

Tranfield D, Denyer D, Smart P (2003) Towards a methodology for developing evidenceinformed management knowledge by means of systematic review. Br J Manag 14:207–222

Trivellas P, Malindretos G, Reklitis P (2020) Implications of green logistics management on sustainable business and supply chain performance: evidence from a survey in the greek agri-food sector. Sustain 12:1–29. https://doi.org/10.3390/su122410515

Tsangas M, Gavriel I, Doula M et al (2020) Life cycle analysis in the framework of agricultural strategic development planning in the Balkan region. Sustain 12:1–15. https://doi.org/10.3390/su12051813

Ülgen VS, Björklund M, Simm N (2019) Inter-organizational supply chain interaction for sustainability : a systematic literature review.

UNEP S (2020) Guidelines for social life cycle assessment of products and organizations 2020.

UNEP/SETAC (2009) United Nations Environment Programme-society of Environmental Toxicology and Chemistry. Guidelines for social life cycle assessment of products. France

United Nations (2011) Guiding principles on business and human rights. Implementing the United Nations “protect, respect and remedy” framework

United Nations (2015) Transforming our world: the 2030 agenda for sustainable development. sustainabledevelopment.un.org

Van Asselt ED, Van Bussel LGJ, Van Der Voet H et al (2014) A protocol for evaluating the sustainability of agri-food production systems - a case study on potato production in peri-urban agriculture in the Netherlands. Ecol Indic 43:315–321. https://doi.org/10.1016/j.ecolind.2014.02.027

Van der Ploeg JD (2014) Peasant-driven agricultural growth and food sovereignty. J Peasant Stud 41:999–1030. https://doi.org/10.1080/03066150.2013.876997

van Eck NJ, Waltman L (2010) Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 84:523–538. https://doi.org/10.1007/s11192-009-0146-3

Van Eck NJ, Waltman L (2019) Manual for VOSviwer version 1.6.10. CWTS Meaningful metrics 1–53

Vasa L, Angeloska A, Trendov NM (2017) Comparative analysis of circular agriculture development in selected Western Balkan countries based on sustainable performance indicators. Econ Ann 168:44–47. https://doi.org/10.21003/ea.V168-09

Verdecho MJ, Alarcón-Valero F, Pérez-Perales D et al (2020) A methodology to select suppliers to increase sustainability within supply chains. Cent Eur J Oper Res. https://doi.org/10.1007/s10100-019-00668-3

Vergine P, Salerno C, Libutti A et al (2017) Closing the water cycle in the agro-industrial sector by reusing treated wastewater for irrigation. J Clean Prod 164:587–596. https://doi.org/10.1016/j.jclepro.2017.06.239

WCED (1987) Our common future - call for action

Webster K (2013) What might we say about a circular economy? Some temptations to avoid if possible. World Futures 69:542–554

Wheaton E, Kulshreshtha S (2013) Agriculture and climate change: implications for environmental sustainability indicators. WIT Trans Ecol Environ 175:99–110. https://doi.org/10.2495/ECO130091

Wijewickrama MKCS, Chileshe N, Rameezdeen R, Ochoa JJ (2021) Information sharing in reverse logistics supply chain of demolition waste: a systematic literature review. J Clean Prod 280:124359. https://doi.org/10.1016/j.jclepro.2020.124359

Woodhouse A, Davis J, Pénicaud C, Östergren K (2018) Sustainability checklist in support of the design of food processing. Sustain Prod Consum 16:110–120. https://doi.org/10.1016/j.spc.2018.06.008

Wu R, Yang D, Chen J (2014) Social Life Cycle Assessment Revisited Sustain 6:4200–4226. https://doi.org/10.3390/su6074200

Yadav S, Luthra S, Garg D (2021) Modelling Internet of things (IoT)-driven global sustainability in multi-tier agri-food supply chain under natural epidemic outbreaks. Environ Sci Pollut Res 16633–16654. https://doi.org/10.1007/s11356-020-11676-1

Yee FM, Shaharudin MR, Ma G et al (2021) Green purchasing capabilities and practices towards Firm’s triple bottom line in Malaysia. J Clean Prod 307:127268. https://doi.org/10.1016/j.jclepro.2021.127268

Yigitcanlar T (2010) Rethinking sustainable development: urban management, engineering, and design. IGI Global

Zamagni A, Amerighi O, Buttol P (2011) Strengths or bias in social LCA? Int J Life Cycle Assess 16:596–598. https://doi.org/10.1007/s11367-011-0309-3

Download references

Author information

Authors and affiliations

Department of Economy, Engineering, Society and Business Organization, University of “Tuscia,” Via del Paradiso 47, 01100, Viterbo, VT, Italy

Cecilia Silvestri, Michela Piccarozzi & Alessandro Ruggieri

Department of Engineering, University of Rome “Niccolò Cusano,” Via Don Carlo Gnocchi, 3, 00166, Rome, Italy

Luca Silvestri

Corresponding author

Correspondence to Cecilia Silvestri.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Communicated by Monia Niero

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: a number of ill-placed paragraph headings were removed and the source indication "Authors' elaborations" was added to Tables 1-3.

About this article

Silvestri, C., Silvestri, L., Piccarozzi, M. et al. Toward a framework for selecting indicators of measuring sustainability and circular economy in the agri-food sector: a systematic literature review. Int J Life Cycle Assess (2022). https://doi.org/10.1007/s11367-022-02032-1

Received: 15 June 2021

Accepted: 16 February 2022

Published: 02 March 2022

DOI: https://doi.org/10.1007/s11367-022-02032-1

  • Agri-food sector
  • Sustainability
  • Circular economy
  • Triple bottom line
  • Life cycle assessment

  • Open access
  • Published: 07 April 2023

Using machine learning to predict student retention from socio-demographic characteristics and app-based engagement metrics

  • Sandra C. Matz 1
  • Christina S. Bukow 2
  • Heinrich Peters 1
  • Christine Deacons 3
  • Alice Dinu 5
  • Clemens Stachl 4

Scientific Reports volume 13, Article number: 5705 (2023)

  • Human behaviour

An Author Correction to this article was published on 21 June 2023

This article has been updated

Student attrition poses a major challenge to academic institutions, funding bodies and students. With the rise of Big Data and predictive analytics, a growing body of work in higher education research has demonstrated the feasibility of predicting student dropout from readily available macro-level (e.g., socio-demographics or early performance metrics) and micro-level data (e.g., logins to learning management systems). Yet, the existing work has largely overlooked a critical meso-level element of student success known to drive retention: students’ experience at university and their social embeddedness within their cohort. In partnership with a mobile application that facilitates communication between students and universities, we collected both (1) institutional macro-level data and (2) behavioral micro and meso-level engagement data (e.g., the quantity and quality of interactions with university services and events as well as with other students) to predict dropout after the first semester. Analyzing the records of 50,095 students from four US universities and community colleges, we demonstrate that the combined macro and meso-level data can predict dropout with high levels of predictive performance (average AUC across linear and non-linear models = 78%; max AUC = 88%). Behavioral engagement variables representing students’ experience at university (e.g., network centrality, app engagement, event ratings) were found to add incremental predictive power beyond institutional variables (e.g., GPA or ethnicity). Finally, we highlight the generalizability of our results by showing that models trained on one university can predict retention at another university with reasonably high levels of predictive performance.
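To make the reported metric concrete, the following is a minimal sketch of how an AUC value such as those quoted above can be computed from predicted dropout scores. It is not the authors' pipeline: the function name `auc`, the toy labels, and the toy scores are all hypothetical and chosen purely for illustration. The sketch relies on the standard interpretation of AUC as the probability that a randomly chosen positive case (here, a student who dropped out) is scored higher than a randomly chosen negative case (a retained student).

```python
from itertools import product


def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    For every (positive, negative) pair, count 1 if the positive case has
    the higher score and 0.5 if the scores are tied, then divide by the
    number of pairs.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for pos_score, neg_score in product(positives, negatives):
        if pos_score > neg_score:
            wins += 1.0
        elif pos_score == neg_score:
            wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (not from the study): 1 = dropped out, 0 = retained.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
toy_auc = auc(toy_labels, toy_scores)  # 8 of 9 pairs are ranked correctly
```

In this toy example, eight of the nine dropout/retained pairs are ordered correctly, so the AUC is 8/9. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the 78% and 88% figures should be read.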

Introduction

In the US, only about 60% of full-time students graduate from their program 1 , 2 , with the majority of those who discontinue their studies dropping out during their first year 3 . These high attrition rates pose major challenges to students, universities, and funding bodies alike 4 , 5 .

Dropping out of university without a degree negatively impacts students’ finances and mental health. Over 65% of US undergraduate students receive student loans to help pay for college, causing them to incur heavy debts over the course of their studies 6 . According to the U.S. Department of Education, students who take out a loan but never graduate are three times more likely to default on loan repayment than students who graduate 7 . This is hardly surprising, given that students who drop out of university without a degree earn 66% less than university graduates with a bachelor's degree and are far more likely to be unemployed 2 . In addition to financial losses, the feeling of failure often adversely impacts students’ well-being and mental health 8 .

At the same time, student attrition negatively impacts universities and federal funding bodies. For universities, student attrition results in an average revenue reduction of approximately $16.5 billion per year through the loss of tuition fees 9 , 10 . Similarly, student attrition wastes valuable resources provided by state and federal governments. For example, the US Department of Education Integrated Postsecondary Education Data System (IPEDS) shows that between 2003 and 2008, state and federal governments together provided more than $9 billion in grants and subsidies to students who did not return to the institution where they were enrolled for a second year 11 .

Given the high costs of attrition, the ability to predict at-risk students – and to provide them with additional support – is critical 12 , 13 . As most dropouts occur during the first year 14 , such predictions are most valuable if they can identify at-risk students as early as possible 13 , 15 , 16 . The earlier one can identify students who might struggle, the better the chances that interventions aimed at protecting them from gradually falling behind – and eventually discontinuing their studies – will be effective 17 , 18 .

Indicators of student retention

Previous research has identified various predictors of student retention, including previous academic performance, demographic and socio-economic factors, and the social embeddedness of a student in their home institution 19 , 20 , 21 , 22 , 23 .

Prior academic performance (e.g., high school GPA, SAT and ACT scores or college GPA) has been identified as one of the most consistent predictors of student retention: Students who are more successful academically are less likely to drop out 17 , 21 , 24 , 25 , 26 , 27 , 28 , 29 . Similarly, research has highlighted the role of demographic and socio-economic variables, including age, gender, and ethnicity 12 , 19 , 25 , 27 , 30 as well as socio-economic status 31 in predicting a student’s likelihood of persisting. For example, women are more likely to continue their studies than men 12 , 30 , 32 , 33 while White and Asian students are more likely to persist than students from other ethnic groups 19 , 27 , 30 . Moreover, a student’s socio-economic status and immediate financial situation have been shown to impact retention. Students are more likely to discontinue their studies if they are first-generation students 34 , 35 , 36 or experience high levels of financial distress (e.g., due to student loans or working nearly full time to cover college expenses) 37 , 38 . In contrast, students who receive financial support that does not have to be repaid post-graduation are more likely to complete their degree 39 , 40 .

While most of the outlined predictors of student retention are relatively stable intrapersonal characteristics and oftentimes difficult or costly to change, research also points to a more malleable pillar of retention: the students’ experience at university, in particular the extent to which they are successfully integrated and socialized into the institution 16 , 22 , 41 , 42 . As Bean (2005) notes, “few would deny that the social lives of students in college and their exchanges with others inside and outside the institution are important in retention decisions” (p. 227) 41 . The extent to which a student is socially integrated and embedded into their institution has been studied in a number of ways, relating retention to the development of friendships with fellow students 43 , the student’s position in social networks 16 , 29 , the experience of social connectedness 44 and a sense of belonging 42 , 45 , 46 . Taken together, these studies suggest that interactions with peers as well as faculty and staff – for example through participation in campus activities, membership of organizations, and the pursuit of extracurricular activities – help students better integrate into university life 44 , 47 . In contrast, a lack of social integration resulting from commuting (i.e., not living on campus with other students) has been shown to negatively impact a student’s chances of completing their degree 48 , 49 , 50 , 51 . In short, the more strongly a student is embedded and feels integrated into the university community – particularly in their first year – the less likely the student is to drop out 42 , 52 .

Predicting retention using machine learning

A large proportion of research on student attrition has focused on understanding and explaining drivers of student retention. However, alongside the rise of computational methods and predictive modeling in the social sciences 53 , 54 , 55 , educational researchers and practitioners have started exploring the feasibility and value of data-driven approaches in supporting institutional decision making and educational effectiveness (for excellent overviews of the growing field see 56 , 57 ). In line with this broader trend, a growing body of work has shown the potential of predicting student dropout with the help of machine learning. In contrast to traditional inferential approaches, machine learning approaches are predominantly concerned with predictive performance (i.e., the ability to accurately forecast behavior that has not yet occurred) 54 . In the context of student retention this means: How accurately can we predict whether a student is going to complete or discontinue their studies (in the future) by analyzing their demographic and socio-economic characteristics, their past and current academic performance, as well as their current embeddedness in the university system and culture?

Echoing the National Academy of Education’s statement (2017) that “in the educational context, big data typically take the form of administrative data and learning process data, with each offering their own promise for educational research” (p.4) 58 , the vast majority of existing studies have focused on the prediction of student retention from demographic and socio-economic characteristics as well as students’ academic history and current performance 13 , 59 , 60 , 61 , 62 , 63 , 64 , 65 , 66 . In a recent study, Aulck and colleagues trained a model on the administrative data of over 66,000 first-year students enrolled in a public US university (e.g., race, gender, high school GPA, entrance exam scores and early college performance/transcript data) to predict whether they would re-enroll in the second year and eventually graduate 59 . Specifically, they used a range of linear and non-linear machine learning models (e.g., regularized logistic regression, k-nearest neighbor, random forest, support vector machine, and gradient boosted trees) to predict retention out-of-sample using standard cross-validation procedures. Their model was able to predict dropouts with an accuracy of 88% and graduation with an accuracy of 81% (where 50% is chance).

While the existing body of work provides robust evidence for the potential of predictive models for identifying at-risk students, it is based on similar sets of macro-level data (e.g., institutional data, academic performance) or micro-level data (e.g., click-stream data). Almost entirely absent from this research is data on students’ daily experience and engagement with both other students and the university itself (meso-level). Although a small number of studies have tried to capture part of this experience by inferring social networks from smart card transactions that were made by students in the same time and place 16 or from engagement metrics with an open online course 67 , none of the existing work has offered a more holistic and comprehensive view on students’ daily experience. One potential explanation for this gap is that information about students’ social interactions with classmates or their day-to-day engagement with university services and events is difficult to track. While universities often have access to demographic or socio-economic variables through their Student Information Systems (SISs), and can easily track students’ academic performance, most universities do not have an easy way of capturing students’ deeper engagement with the system.

Research overview

In this research, we partner with an educational software company – READY Education – that offers a virtual one-stop interaction platform in the form of a smartphone application to facilitate communication between students, faculty, and staff. Students receive relevant information and announcements, can manage their university activities, and interact with fellow students in various ways. For example, the app offers a social media experience like Facebook, including private messaging, groups, public walls, and friendships. In addition, it captures students’ engagement with the university by asking them to check into events (e.g., orientation, campus events, and student services) using QR code functionality and prompting them to rate their experience afterwards (see Methods for more details on the features we extracted from this data). As a result, the READY Education app allows us to observe a comprehensive set of information about students that includes both (i) institutional data (i.e., demographic and socio-economic characteristics as well as academic performance), and (ii) their idiosyncratic experience at university captured by their daily interactions with other students and the university services/events. Combining the two data sources captures a student’s profile more holistically and makes it possible to consider potential interactions between the variable sets. For example, being tightly embedded in a social support network of friends might be more important for retention among first-generation students who might not receive the same level of academic support or learn about implicit academic norms and rules from their parents.

Building on this unique dataset, we use machine learning models to predict student retention (i.e., dropout) from both institutional and behavioral engagement data. Given the desire to identify at-risk students as early as possible, we only use information gathered in the students’ first semester to predict whether the student dropped out at any point in time during their program. To thoroughly validate and scrutinize our analytical approach, generate insights for potential interventions, and probe the generalizability of our predictive models across different universities, we investigate the following three research questions:

How accurately can we predict a student's likelihood of discontinuing their studies using information from the first term of their studies (i.e., institutional data, behavioral engagement data, and a combination of both)?

Which features are the most predictive of student retention?

How well do the predictive models generalize across universities (i.e., how well can we predict student retention of students from one university if we use the model trained on data from another university and vice versa)?

Participants

We analyze de-identified data from four institutions with a total of 50,095 students (min = 476, max = 45,062). All students provided informed consent to the use of the anonymized data by READY Education and research partners. All experimental protocols were approved by the Columbia University Ethics Board, and all methods carried out were in accordance with the Board’s guidelines and regulations. The data stem from two sources: (a) institutional data and (b) behavioral engagement data. The institutional data collected by the universities contain socio-demographics (e.g., gender, ethnicity), general study information (e.g., term of admission, study program), financial information (e.g., Pell eligibility), students’ academic achievement scores (e.g., GPA, ACT) as well as the retention status. The latter indicates whether students continued or dropped out and serves as the outcome variable. As different universities collect different information about their students, the scope of institutional data varied between universities. Table 1 provides a descriptive overview of the most important sociodemographic characteristics for each of the four universities. In addition, it provides a descriptive overview of the app usage, including the average number of logs per student, the total number of sessions and logs, as well as the percentage of students in a cohort using the app (i.e., coverage). The broad coverage of students using the app, ranging between 70 and 98%, results in a largely representative sample of the student populations at the respective universities.

Notably, Universities 1–3 are traditional university campuses, while University 4 is a combination of 16 different community colleges. Given that there is considerable heterogeneity across campuses, the predictive accuracies for University 4 are a-priori expected to be lower than those observed for universities 1–3 (and partly speak to the generalizability of findings already). The decision to include University 4 as a single entity was based on the fact that separating out the 16 colleges would have resulted in an over-representation of community colleges that all share similar characteristics thereby artificially inflating the observed cross-university accuracies. Given these limitations (and the fact that the University itself collapsed the college campuses for many of their internal reports), we decided to analyze it as a single unit, acknowledging that this approach brings its own limitations.

The behavioral engagement data were generated through the app (see Table 1 for the specific data collection windows at each University). Behavioral engagement data were available in the form of time-stamped event-logs (i.e., each row in the raw data represented a registered event such as a tab clicked, a comment posted, a message sent). Each log could be assigned to a particular student via an anonymized, unique identifier. Across all four universities, the engagement data contained 7,477,630 sessions (Mean = 1,869,408, SD = 3,329,852) and 17,032,633 logs (Mean = 4,258,158, SD = 6,963,613). For a complete overview of all behavioral engagement metrics, including a description, see Table S1 in the Supplementary Materials.

Pre-processing and feature extraction

As a first step, we cleaned both the institutional and app data. For the institutional data, we excluded students who did not use the app and therefore could not be assigned a unique identifier. In addition, we excluded students without a term of admission to guarantee that we are only observing the students’ first semester. Lastly, we removed duplicate entries resulting from dual enrollment in different programs. For the app usage data, we visually inspected the variables in our data set for outliers that might stem from technical issues. We pre-processed data that reflected clicking through the app, named “clicked_[…]” and “viewed_[…]” (see Table S1 in the Supplementary Materials). A small number of observations showed unrealistically high numbers of clicks on the same tab in a very short period, which is likely a reflection of a student repeatedly clicking on a tab due to long loading time or other technical issues. To avoid oversampling these behaviors, we removed all clicks of the same type which were made by the same person less than one minute apart.
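The click de-duplication rule can be sketched as follows. This is a minimal illustration, not the authors' script: the tuple layout `(student_id, event_type, timestamp)` and the function name `drop_rapid_repeats` are hypothetical, and the sketch assumes that each click is compared against the last click of the same type that was kept.

```python
from datetime import datetime, timedelta

def drop_rapid_repeats(events, min_gap=timedelta(minutes=1)):
    """Drop an event if the same student's last kept event of the same
    type occurred less than `min_gap` earlier (assumed reading of the rule)."""
    last_kept = {}  # (student_id, event_type) -> timestamp of last kept event
    kept = []
    for student_id, event_type, ts in sorted(events, key=lambda e: e[2]):
        key = (student_id, event_type)
        if key not in last_kept or ts - last_kept[key] >= min_gap:
            kept.append((student_id, event_type, ts))
            last_kept[key] = ts
    return kept

t0 = datetime(2020, 9, 1, 10, 0, 0)
events = [
    ("s1", "clicked_events", t0),
    ("s1", "clicked_events", t0 + timedelta(seconds=5)),   # rapid repeat, dropped
    ("s1", "clicked_events", t0 + timedelta(seconds=90)),  # more than a minute later, kept
    ("s1", "viewed_wall", t0 + timedelta(seconds=10)),     # different event type, kept
    ("s2", "clicked_events", t0 + timedelta(seconds=20)),  # different student, kept
]
cleaned = drop_rapid_repeats(events)
```

Comparing against the last kept click rather than the immediately preceding one means a long burst of rapid clicks collapses to roughly one click per minute rather than to a single click.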

We extracted up to 462 features for each university across two broad categories: (i) institutional features and (ii) engagement features, using evidence from previous research as a reference point (see Table S2 in the Supplementary Materials for a comprehensive overview of all features and their availability for each of the universities). Institutional features contain students’ demographic, socio-economic and academic information. The engagement features represent the students’ behavior during their first term of studies. They can be further divided into app engagement and community engagement. The app engagement features represent the students’ behavior related to app usage, such as whether the students used the app before the start of the semester, how often they clicked on notifications or the community tabs, or whether their app use increased over the course of the semester. The community engagement features reflect social behavior and interaction with others, e.g., the number of messages sent, posts and comments made, events visited or a student’s position in the network as inferred from friendships and direct messages. Importantly, many of the features in our dataset will be inter-correlated. For example, living in college accommodation could signal higher levels of socio-economic status, but also make it more likely that students attend campus events and connect with other students living on campus. While intercorrelations among predictors are a challenge for standard inferential statistical techniques such as regression analyses, the methods we apply in this paper can account for a large number of correlated predictors.

Institutional features were directly derived from the data recorded by the institutions. As noted above, not all features were available for all universities, resulting in slightly different feature sets across universities. The engagement features were extracted from the app usage data. As we focused on an early prediction of drop-out, we restricted the data to event-logs that were recorded in the respective students' first term. Notably, the data captures students’ engagement as a time-stamped series of events, offering fine-grained insights into their daily experience. For reasons of simplicity and interpretability (see research question 2), we collapse the data into a single entry for each student. Specifically, we describe a student’s overall experience during the first semester by calculating distribution measures for each student such as the arithmetic mean, standard deviation, kurtosis, skewness, and sum values. For example, we calculate how many daily messages a particular student sent or received during their first semester, or how many campus events they attended in total. However, we also account for changes in a student’s behavior over time by calculating more complex features such as entropy (e.g., the extent to which a person has frequent contact with few people or the same degree of contact with many people) and the development of specific behaviors over time measured by the slope of regression analyses, as well as features representing the regularity of behavior (e.g., the deviation of time between sending messages). Overall, the feature set was aimed at describing a student’s overall engagement with campus resources and other students during the first semester as well as changes in engagement over time. Finally, we extracted some of the features separately for weekdays and weekends to account for differences and similarities in students’ activities during the week and the weekend. For example, little social interaction on weekdays might predict retention differently than little social interaction on the weekend.
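To make the collapsing step concrete, the following sketch computes a few of the named summary features for one student. It is an illustration under assumptions: the function names, the use of population moments for skewness and kurtosis, and entropy in bits are choices made here, not details confirmed by the paper.

```python
import math
from statistics import mean, pstdev

def shannon_entropy(counts):
    """Entropy (bits) of how a student's contacts are spread across people."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def ols_slope(values):
    """Least-squares slope of a daily series regressed on the day index
    (needs at least two days of data)."""
    n = len(values)
    x_bar, y_bar = (n - 1) / 2, mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den

def summarize(daily):
    """Collapse one student's daily counts into a flat feature dictionary."""
    mu, sd = mean(daily), pstdev(daily)
    # Standardized values; all zeros if the series is constant.
    z = [(v - mu) / sd for v in daily] if sd > 0 else [0.0] * len(daily)
    return {
        "sum": sum(daily),
        "mean": mu,
        "sd": sd,
        "skewness": mean([v ** 3 for v in z]),
        "kurtosis": mean([v ** 4 for v in z]),
        "slope": ols_slope(daily),
    }

daily_messages = [0, 1, 2, 3, 4]         # hypothetical messages sent per day
features = summarize(daily_messages)
even_contact = shannon_entropy([5, 5])   # same contact with two people
skewed_contact = shannon_entropy([9, 1]) # mostly one person
```

A positive slope indicates growing activity over the term, and lower entropy indicates contact concentrated on fewer people.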

We further cleaned the data by discarding participants for whom the retention status was missing and those in which 95% or more of the values were zero or missing. Furthermore, features were removed if they showed little or no variance across participants, which makes them essentially meaningless in a prediction task. Specifically, we excluded numerical features which showed the same values for more than 90% of observations and categorical features which showed the same value for all observations.
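The low-variance filter for numerical features can be sketched as below. The helper name `near_constant` and the toy feature columns are hypothetical; only the 90% threshold comes from the text.

```python
from collections import Counter

def near_constant(values, max_share=0.90):
    """True if the most frequent value covers more than `max_share` of rows."""
    most_common_count = Counter(values).most_common(1)[0][1]
    return most_common_count / len(values) > max_share

features = {
    "events_attended": [0] * 95 + [1, 2, 3, 4, 5],  # 95% zeros, removed
    "messages_sent": list(range(100)),              # varied, kept
}
kept = {name: col for name, col in features.items() if not near_constant(col)}
```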

In addition to these general pre-processing procedures, we integrated additional pre-processing steps into the resampling prior to training the models to avoid an overestimation of model performance 68 . To prevent problems with categorical features that occur when there are fewer levels in the test than in the training data, we first removed categories that did not occur in the training data. Second, we removed constant categorical features containing a single value only (and therefore no variation). Third, we imputed missing values using the following procedures: Categorical features were imputed with the mode. Following commonly used approaches to dealing with missing data, the imputation of numeric features varied between the learners. For the elastic net, we imputed those features with the median. For the random forest, we used twice the maximum to give missing values a distinct meaning that would allow the model to leverage this information. Lastly, we used the "Synthetic Minority Oversampling Technique" (SMOTE) to create artificial examples for the minority class in the training data 69 . The only exception was University 4 which followed a different procedure due to the large sample size and estimated computing power for implementing SMOTE. Instead of oversampling minority cases, we downsampled majority cases such that the positive and negative class were balanced. This was done to address the class imbalance caused by most students continuing their studies rather than dropping out 12 .
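The learner-specific imputation and the downsampling used for University 4 can be illustrated with a short sketch. The function names are hypothetical, and the sketch assumes that fill values are learned from the training portion of each resampling split only, which is what integrating pre-processing into the resampling implies. SMOTE itself is not reproduced here.

```python
import random
from statistics import median

def fit_fill_value(train_col, learner):
    """Learn the fill value for a numeric feature from training data only."""
    observed = [v for v in train_col if v is not None]
    if learner == "elastic_net":
        return median(observed)
    return 2 * max(observed)  # random forest: out-of-range value flags missingness

def impute(col, fill):
    """Replace missing entries (None) with the learned fill value."""
    return [fill if v is None else v for v in col]

def downsample_majority(labels, seed=0):
    """Return row indices with the majority class cut to the minority size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return sorted(minority + rng.sample(majority, len(minority)))

train_gpa = [1.0, None, 3.0, 5.0]
enet_fill = fit_fill_value(train_gpa, "elastic_net")
rf_fill = fit_fill_value(train_gpa, "random_forest")
train_imputed = impute(train_gpa, enet_fill)

labels = [1] * 3 + [0] * 17          # 1 = dropped out, 0 = continued
balanced_idx = downsample_majority(labels)
```

Applying the training-derived fill value to the test portion, rather than recomputing it there, keeps information from the test data out of the fitted pipeline.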

Predictive modeling approach

We predicted the retention status (1 = dropped out, 0 = continued) in a binary prediction task, with three sets of features: (1) institutional features, (2) engagement features, and (3) a combined set of all features. To ensure the robustness of our predictions and to identify the model which is best suited for the current prediction context 54 , we compared a linear classifier (elastic net; implemented in glmnet 4.1–4) 70 , 71 and a nonlinear classifier (random forest; implemented in randomForest 4.7–1) 72 , 73 . Both models are particularly well suited for our prediction context and are common choices in computational social science. Simple linear or logistic regression models, by contrast, are not suitable for datasets that have many inter-correlated predictors (in our case, a total of 462 predictors, many of which are highly correlated) due to a high risk of overfitting. Both the elastic net and the random forest algorithm can effectively utilize large feature sets while reducing the risk of overfitting. We evaluate the performance of our six models for each school (2 algorithms and 3 feature sets), using out-of-sample benchmark experiments that estimate predictive performance and compare it against a common non-informative baseline model. The baseline represents a null-model that does not include any features, but instead always predicts the majority class, which in our samples means “continued” 74 . Below, we provide more details about the specific algorithms (i.e., elastic net and random forest), the cross-validation procedure, and the performance metrics we used for model evaluation.
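A featureless majority-class baseline is simple enough to write out in full. The sketch below uses made-up label counts and a hypothetical helper name; it shows why such a baseline can reach high accuracy on imbalanced data while identifying no dropouts at all.

```python
from collections import Counter

def majority_baseline(train_labels):
    """Return a predictor that ignores features and always predicts the
    most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda n_rows: [majority] * n_rows

train_labels = [0] * 80 + [1] * 20   # 1 = dropped out, 0 = continued
test_labels = [0] * 16 + [1] * 4
predict = majority_baseline(train_labels)
preds = predict(len(test_labels))

accuracy = sum(p == t for p, t in zip(preds, test_labels)) / len(test_labels)
tpr = sum(p == 1 and t == 1 for p, t in zip(preds, test_labels)) / sum(test_labels)
```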

Elastic net model

The elastic net is a regularized regression approach that combines advantages of ridge regression 75 with those of the LASSO 76 and is motivated by the need to handle large feature sets. The elastic net shrinks the beta-coefficients of features that add little predictive value (e.g., intercorrelated, little variance). Additionally, the elastic net can effectively remove variables from the model by reducing the respective beta coefficients to zero 70 . Unlike classical regression models, the elastic net does not aim to optimize the sum of least squares, but includes two penalty terms (L1, L2) that incentivize the model to reduce the estimated beta value of features that do not add information to the model. Combining the L1 (the sum of absolute values of the coefficients) and L2 (the sum of the squared values of the coefficients) penalties, elastic net addresses the limitations of alternative linear models such as LASSO regression (not capable of handling multi-collinearity) and Ridge Regression (may not produce sparse-enough solutions) 70 .

Formally, following Hastie & Qian (2016), the elastic net model for binary classification problems can be written as follows 77 . Suppose the response variable takes values in G = {0,1}, and let y_i = I(g_i = 1). The model formula is written as

$$\Pr(G = 1 \mid X = x) = \frac{e^{\beta_0 + \beta^T x}}{1 + e^{\beta_0 + \beta^T x}}$$

After applying the log-odds transformation, the model formula can be written as

$$\log \frac{\Pr(G = 1 \mid X = x)}{\Pr(G = 0 \mid X = x)} = \beta_0 + \beta^T x$$

The objective function for logistic regression is the penalized negative binomial log-likelihood

$$\min_{(\beta_0, \beta) \in \mathbb{R}^{p+1}} -\left[ \frac{1}{N} \sum_{i=1}^{N} \left( y_i (\beta_0 + x_i^T \beta) - \log\left(1 + e^{\beta_0 + x_i^T \beta}\right) \right) \right] + \lambda \left[ (1 - \alpha) \|\beta\|_2^2 / 2 + \alpha \|\beta\|_1 \right]$$

where λ is the regularization parameter that controls the overall strength of the regularization, and α is the mixing parameter that controls the balance between L1 and L2 regularization, with α values closer to one resulting in sparser models (lasso regression α = 1, ridge regression α = 0). β represents the coefficients of the regression model, ||β||_1 is the L1 norm of the coefficients (the sum of absolute values of the coefficients), and ||β||_2^2 is the squared L2 norm of the coefficients (the sum of the squared values of the coefficients).
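As a rough illustration of how the two penalties enter the objective, the sketch below evaluates the penalized negative log-likelihood for given coefficients. It assumes the glmnet parameterization, in which the L2 term is halved and α = 1 corresponds to the lasso; the function name and toy data are hypothetical, and no optimization is performed.

```python
import math

def elastic_net_objective(X, y, beta0, beta, lam, alpha):
    """Penalized negative binomial log-likelihood (glmnet-style, assumed)."""
    n = len(y)
    log_lik = 0.0
    for x_i, y_i in zip(X, y):
        eta = beta0 + sum(b * v for b, v in zip(beta, x_i))  # linear predictor
        log_lik += y_i * eta - math.log(1.0 + math.exp(eta))
    l1 = sum(abs(b) for b in beta)   # sum of absolute coefficients
    l2 = sum(b * b for b in beta)    # sum of squared coefficients
    penalty = lam * ((1 - alpha) * l2 / 2 + alpha * l1)
    return -log_lik / n + penalty

X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
y = [1, 0, 1, 0]
beta = [1.0, -2.0]

unpenalized = elastic_net_objective(X, y, 0.0, beta, lam=0.0, alpha=1.0)
lasso = elastic_net_objective(X, y, 0.0, beta, lam=0.1, alpha=1.0)
ridge = elastic_net_objective(X, y, 0.0, beta, lam=0.1, alpha=0.0)
null_model = elastic_net_objective(X, y, 0.0, [0.0, 0.0], lam=0.1, alpha=0.5)
```

With all coefficients at zero the penalty vanishes and the objective reduces to log 2, the loss of predicting a probability of 0.5 for every student.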

The regularized regression approach is especially relevant for our model because many of the app-based engagement features are highly correlated (e.g., the number of clicks is related to the number of activities registered in the app). In addition, we favored the elastic net algorithm over more complex alternatives, because the regularized beta coefficients can be interpreted as feature importance, allowing insights into which predictors are most informative of college dropout 78 , 79 .

Random forest model

Random forest models are a widely used ensemble learning method that grows many bagged and decorrelated decision trees to come up with a “collective” prediction of the outcome (i.e., the outcome that is chosen by most trees in a classification problem) 72 . Individual decision trees recursively split the feature space (rules to distinguish classes) with the goal to separate the different classes of the criterion (drop out vs. remain in our case). For a detailed description of how individual decision trees operate and translate to a random forest see Pargent, Schoedel & Stachl 80 .

Unlike the elastic net, random forest models can account for nonlinear associations between features and criterion and automatically include multi-dimensional interactions between features. Each decision tree in a random forest considers a random subset of bootstrapped cases and features, thereby increasing the variance of predictions across trees and the robustness of the overall prediction. For the splitting in each node of each tree, a random subset of features (whose size is set by the mtry hyperparameter that we optimize in our models) is drawn at random from the total set. For each split, all combinations of split variables and split points are compared, with the model choosing the splits that optimize the separation between classes 72 .

The random forest algorithm can be formally described as follows (verbatim from Hastie et al., 2016, p. 588):

1. For b = 1 to B:
   (a) Draw a bootstrap sample of size N from the training data.
   (b) Grow a decision tree to the bootstrapped data, by recursively repeating the following steps for each terminal node of the tree, until the minimum node size is reached.
       i. Select m variables at random from the p variables.
       ii. Pick the best variable/split-point among the m according to the loss function (in our case Gini-impurity decrease)
       iii. Split the node into two daughter nodes.
2. Output the ensemble of trees

New predictions can then be made by generating a prediction for each tree and aggregating the results using majority vote.

The aggregation of predictions across trees in random forests improves the prediction performance compared to individual decision trees, as it can benefit from the trees’ variance and greatly reduces it to arrive at a single prediction 72 , 81 .
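The bootstrap, random feature selection, and majority vote steps can be sketched in a few lines. This is a deliberately simplified illustration and not the randomForest implementation: each "tree" is a single split (a decision stump) instead of a recursively grown tree, and all names and the toy data are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def best_stump(X, y, feature_ids):
    """Best single split among `feature_ids` by weighted Gini impurity."""
    best = None
    for j in feature_ids:
        for threshold in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= threshold]
            right = [yi for row, yi in zip(X, y) if row[j] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, threshold, majority(left), majority(right))
    return best

def fit_forest(X, y, n_trees=25, m=1, seed=0):
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of size N
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        stump = best_stump(Xb, yb, rng.sample(range(p), m))  # m of p variables
        if stump is None:  # no valid split: fall back to the majority label
            forest.append((None, None, majority(yb), majority(yb)))
        else:
            forest.append(stump[1:])
    return forest

def predict(forest, row):
    """Aggregate the individual tree predictions by majority vote."""
    votes = []
    for j, threshold, left_label, right_label in forest:
        if j is None or row[j] <= threshold:
            votes.append(left_label)
        else:
            votes.append(right_label)
    return majority(votes)

X = [[1, 2], [2, 1], [3, 3], [2, 2], [7, 8], [8, 6], [9, 9], [6, 7]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = dropped out, 0 = continued
forest = fit_forest(X, y)
low, high = predict(forest, [0, 0]), predict(forest, [10, 10])
```

Because each stump sees a different bootstrap sample and a different feature, the individual votes differ, and the majority vote averages out that variation.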

(Nested) Cross-validation: Out-of-sample model evaluation

We evaluate the performance of our predictive models using an out-of-sample validation approach. The idea behind out-of-sample validation is to increase the likelihood that a model will accurately predict student dropout on new data (e.g. new students) by using different datasets when training and evaluating the model. A commonly used, efficient technique for out-of-sample validation is to repeatedly fit (cf. training) and evaluate (cf. testing) models on non-overlapping parts of the same datasets and to combine the individual estimates across multiple iterations. This procedure – known as cross-validation – can also be used for model optimization (e.g., hyperparameter-tuning, pre-processing, variable selection), by repeatedly evaluating different settings for optimal predictive performance. When both approaches are combined, evaluation and optimization steps need to be performed in a nested fashion to ensure a strict separation of training and test data for a realistic out-of-sample performance estimation. The general idea is to emulate all modeling steps in each fold of the resampling as if it were a single in-sample model. Here, we use nested cross-validation to estimate the predictive performance of our models, to optimize model hyperparameters, and to pre-process data. We illustrate the procedure in Fig.  1 .

Figure 1. Schematic cross-validation procedure for out-of-sample predictions. The figure shows a tenfold cross-validation in the outer loop which is used to estimate the overall performance of the model by comparing the predicted outcomes for each student in the previously unseen test set with their actual outcomes. Within each of the 10 outer loops, a fivefold cross-validation in the inner loop is used to finetune model hyperparameters by evaluating different model settings.

The cross-validation procedure works as follows: Say we have a dataset with 1,000 students. In a first step, the dataset is split into ten different subsamples, each containing data from 100 students. In the first round, nine of these subsamples are used for training (i.e., fitting the model to estimate parameters, green boxes). That means, the data from the first 900 students will be included in training the model to relate the different features to the retention outcome. Once training is completed, the model’s performance can be evaluated on the data of the remaining 100 students (i.e., test dataset, blue boxes). For each student, the actual outcome (retained or discontinued, grey and black figures) is compared to the predicted outcome (retained or discontinued, grey and black figures). This comparison allows for the calculation of various performance metrics (see “ Performance metrics ” section below for more details). In contrast to the application of traditional inferential statistics, the evaluation process in predictive models separates the data used to train a model from the data used to evaluate these associations. Hence any overfitting that occurs at the training stage (e.g., using researcher degrees of freedom or due to the model learning relationships that are unique to the training data), hurts the predictive performance in the testing stage. To further increase the robustness of findings and leverage the entire dataset, this process is repeated for all 10 subsamples, such that each subsample is used nine times for training and once for testing. Finally, the obtained estimates from those ten iterations are aggregated to arrive at a cross-validated estimate of model performance. This tenfold cross validation procedure is referred to as the “outer loop”.

In addition to the outer loop, our models also contain an “inner loop”. The inner loop consists of an additional cross-validation procedure that is used to identify ideal hyperparameter settings (see “ Hyperparameter tuning ” section below). That is, in each of the ten iterations of the outer loop, the training sample is further divided into a training and test set to identify the best parameter constellations before model evaluation in the outer loop. We used fivefold cross-validation in the inner loop. All analysis scripts for the pre-processing and modeling steps are available on OSF ( https://osf.io/bhaqp/?view_only=629696d6b2854aa9834d5745425cdbbc ).
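The index bookkeeping behind the nested procedure can be sketched as follows, using the 1,000-student example from the text. The function names are hypothetical, and the sketch omits shuffling and stratification, which a real pipeline would typically add.

```python
def k_folds(indices, k):
    """Split a list of indices into k non-overlapping folds."""
    return [indices[i::k] for i in range(k)]

def nested_cv(n, outer_k=10, inner_k=5):
    """Yield (train, test, inner_splits) for each outer fold.

    Inner splits are built from the outer training indices only, so the
    outer test students are never seen during hyperparameter tuning."""
    all_idx = list(range(n))
    for test_idx in k_folds(all_idx, outer_k):
        test_set = set(test_idx)
        train_idx = [i for i in all_idx if i not in test_set]
        inner_splits = []
        for val_idx in k_folds(train_idx, inner_k):
            val_set = set(val_idx)
            fit_idx = [i for i in train_idx if i not in val_set]
            inner_splits.append((fit_idx, val_idx))
        yield train_idx, test_idx, inner_splits

splits = list(nested_cv(1000))
```

With 1,000 students, each outer iteration trains on 900 and tests on 100, and each inner iteration fits on 720 of those 900 and validates on the remaining 180.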

Performance metrics

We evaluate model performance based on four different metrics. Our main metric for model performance is AUC (area under the receiver operating characteristic curve). AUC is commonly used to assess the performance of a model over a 50%-chance baseline, and can range anywhere between 0 and 1. The AUC metric captures the area under the receiver operating characteristic (ROC) curve, which plots the true positive rate (TPR or recall; i.e. the percentage of correctly classified dropouts among all students who actually dropped out), against the false positive rate (FPR; i.e. the percentage of students erroneously classified as dropouts among all the students who actually continued). When the AUC is 0.5, the model’s predictive performance is equal to chance or a coin flip. The closer to 1, the higher the model’s predictive performance in distinguishing between students who continued and those who dropped out.

In addition, we report the F1 score, which ranges between 0 and 1 82 . The F1 score is based on the model’s positive predictive value (or precision, i.e., the percentage of correctly classified dropouts among all students predicted to have dropped out) as well as the model's TPR. A high F1 score hence indicates that there are both few false positives and few false negatives.

Given the specific context, we also report the TPR and the true negative rates (TNR, i.e. the percentage of students predicted to continue among all students who actually continued). Depending on their objective, universities might place a stronger emphasis on optimizing the TPR to make sure no student who is at risk of dropping out gets overlooked or on optimizing the TNR to save resources and assure that students do not get overly burdened. Notably, in most cases, universities are likely to strive for a balance between the two, which is reflected in our main AUC measure. All reported performance metrics represent the mean predictive performance across the 10 cross-validation folds of the outer loop 54 .
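The relationship between the F1 score, TPR and TNR can be made concrete with a small confusion-matrix calculation. The following Python sketch uses invented labels (1 = dropped out, 0 = continued) and a hypothetical helper name; it is meant only to illustrate the definitions given above.

```python
def classification_metrics(y_true, y_pred):
    """Compute TPR, TNR, precision and F1 for binary labels (1 = dropout)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn) if tp + fn else 0.0        # recall among actual dropouts
    tnr = tn / (tn + fp) if tn + fp else 0.0        # recall among actual continuers
    precision = tp / (tp + fp) if tp + fp else 0.0  # positive predictive value
    # F1 is the harmonic mean of precision and TPR.
    f1 = 2 * precision * tpr / (precision + tpr) if precision + tpr else 0.0
    return {"tpr": tpr, "tnr": tnr, "precision": precision, "f1": f1}


# Toy example: 4 actual dropouts, 6 actual continuers.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Here three of four dropouts are caught (TPR = 0.75), five of six continuers are correctly left alone (TNR of about 0.83), and the F1 score is 0.75.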

Hyperparameter tuning

We used a randomized search with 50 iterations and fivefold cross-validation for hyperparameter tuning in the inner loop of our cross-validation. The randomized search algorithm fits models with hyperparameter configurations randomly selected from a previously defined hyperparameter space and then picks the model that shows the best generalized performance averaged over the five cross-validation folds. The best hyperparameter configuration is used for training in the outer resampling loop to evaluate model performance.
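The randomized search itself is simple to sketch. In the illustrative Python below, `toy_cv_score` is a stand-in for the mean fivefold cross-validated performance of a fitted model, and the search space (including the assumed 20 input features) is hypothetical; only the 50 iterations come from the description above.

```python
import random


def random_search(sample_config, cv_score, n_iter=50, seed=0):
    """Draw n_iter random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_iter):
        config = sample_config(rng)
        score = cv_score(config)  # in practice: mean score over the inner folds
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def sample_rf_config(rng, n_features=20):
    """Hypothetical random forest search space (n_features is an assumption)."""
    return {"mtry": rng.randint(1, n_features),
            "min_node_size": rng.randint(1, 5)}


def toy_cv_score(config):
    """Stand-in objective that peaks at mtry = 5 and min_node_size = 3."""
    return 1.0 - 0.01 * abs(config["mtry"] - 5) - 0.02 * abs(config["min_node_size"] - 3)


best_config, best_score = random_search(sample_rf_config, toy_cv_score)
```

The configuration returned here would then be used to refit the model on the full outer training set before evaluation on the held-out outer fold.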

For the elastic net classifier, we tuned the regularization parameter lambda, the decision rule used to choose lambda, and the L1-ratio parameter. The search space for lambda encompassed the 100 glmnet default values 71 . The space of decision rules for lambda included lambda.min, which chooses the value of lambda that results in the minimum mean cross-validation error, and lambda.1se, which chooses the value of lambda that results in the most regularized model such that the cross-validation error remains within one standard error of the minimum. The search space for the L1-ratio parameter included the range of values from 0 (ridge) to 1 (lasso). For the random forest classifier, we tuned the number of features selected for each split within a decision tree (mtry) and the minimum node size (i.e., how many cases are required to be left in the resulting end-nodes of the tree). The search space for the number of input features per decision tree was set to a range of 1 to p, where p represents the dimensionality of the feature space. The search space for minimum node size was set to a range of 1 to 5. Additionally, for both models, we tuned the oversampling rate and the number of neighbors used to generate new samples utilized by the SMOTE algorithm. The oversampling rate was set to a range of 2 to 15 and the number of nearest neighbors was set to a range of 1 to 10.
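The role of the nearest-neighbor parameter in SMOTE can be illustrated with the core interpolation step of the algorithm: a synthetic minority case is placed at a random point on the line between an existing minority case and one of its k nearest minority neighbors. The sketch below is a simplified Python illustration with invented two-dimensional data, not the R implementation used in the analyses; `smote_samples` and its arguments are hypothetical names.

```python
import math
import random


def smote_samples(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen base point (excluding itself).
        neighbors = sorted((p for p in minority if p is not base),
                           key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Four minority-class cases at the corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote_samples(minority, n_new=8, k=2)
```

In this sketch the oversampling rate corresponds loosely to how many synthetic cases are requested relative to the original minority cases, and a larger k lets new cases be drawn toward more distant neighbors.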

RQ1: How accurately can we predict a student's likelihood of discontinuing their studies using information from the first term of their studies?

Figure  2 displays AUC scores (Y-axis) across the different universities (rows), separated by the different feature sets (colors) and predictive algorithms (X-axis labels). The figure displays the distribution of AUC accuracies across the 10 cross-validation folds, alongside their mean and standard deviation. Independent t-tests using Holm corrections for multiple comparisons indicate statistical differences in the predictive performance across the different models and feature sets within each university. Table 2 provides the predictive performance across all four metrics.
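For readers unfamiliar with the Holm procedure, the adjustment can be sketched in a few lines. The Python below is illustrative only: it operates on made-up p-values, `holm_adjust` is a hypothetical helper, and the logic follows the standard step-down definition (multiply the i-th smallest p-value by m − i + 1, enforce monotonicity, cap at 1).

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Adjusted values may never decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Toy example with four pairwise comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = holm_adjust(raw)  # approximately [0.03, 0.06, 0.06, 0.02]
```

A comparison is then declared significant when its adjusted p-value falls below the chosen alpha level.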

Figure 2. AUC performance across the four universities for different feature sets and models. Inst. = Institutional data. Engag. = Engagement data. (EN) = Elastic Net. (RF) = Random Forest.

Overall, our models showed high levels of predictive accuracies across universities, models, feature sets and performance metrics, significantly outperforming the baseline in all instances. The main performance metric AUC reached an average of 73% (where 50% is chance), with a maximum of 88% for the random forest model and the full feature set in University 1. Both institutional features and engagement features significantly contributed to predictive performance, highlighting the fact that a student’s likelihood to drop out is both a function of their more stable socio-demographic characteristics as well as their experience of campus life. In most cases, the joint model (i.e., the combination of institutional and engagement features) performed better than each of the individual models alone. Finally, the random forest models produced higher levels of predictive performance than the elastic net in most cases (average AUC elastic net = 70%, AUC random forest = 75%), suggesting that the features are likely to interact with one another in predicting student retention and might not always be linearly related to the outcome.

RQ2: Which features are the most predictive of student retention?

To provide insights into the underlying relationships between student retention and socio-demographic as well as behavioral features, we examined two indicators of feature importance that both offer unique insights. First, we calculated the zero-order correlations between the features and the outcome for each of the four universities. We chose zero-order correlations over elastic net coefficients as they represent the relationships unaltered by the model’s regularization procedure (i.e., the relationship between a feature and the outcome is shown independently of the importance of the other features in the model). To improve the robustness of our findings, we only included the variables that passed the threshold for data inclusion in our models and had less than 50% of the data imputed. The top third of Table 3 displays the 10 most important features (i.e., highest absolute correlation with retention). The sign in brackets indicates the direction of the effects, with (+) indicating a protective factor and (−) indicating a risk factor. Features that showed up in the top 10 for more than one university are highlighted in bold.
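Ranking features by their zero-order correlation with the outcome amounts to computing one Pearson correlation per feature and sorting by absolute value. The Python sketch below uses invented data and hypothetical feature names (`gpa`, `days_to_first_event`, `noise`) purely to illustrate the ranking and the sign convention.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant variable carries no linear association
    return cov / (sx * sy)


def rank_features(features, outcome, top_n=10):
    """Sort features by absolute zero-order correlation with the outcome."""
    scored = [(name, pearson_r(values, outcome)) for name, values in features.items()]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:top_n]


# Invented toy data: 1 = retained, 0 = discontinued.
retained = [1, 1, 1, 0, 0, 1, 0, 1]
features = {
    "gpa": [3.8, 3.5, 3.2, 2.1, 2.4, 3.0, 1.9, 3.6],
    "days_to_first_event": [2, 5, 3, 30, 25, 7, 40, 4],
    "noise": [1, 0, 1, 0, 1, 0, 1, 0],
}
ranking = rank_features(features, retained)
```

In this toy example the GPA feature receives a positive sign (protective) and the delay until the first event a negative sign (risk), while the unrelated feature ends up at the bottom of the ranking.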

Second, we calculated permutation variable importance scores for the elastic net and random forest models. For the elastic net model, feature importance is reported as the model coefficient after shrinking the coefficients according to their incremental predictive power. Compared to the zero-order correlation, the elastic net coefficients hence identify the features that have the strongest unique variance. For the random forest models, feature importance is reported as a model-agnostic metric that estimates the importance of a feature by observing the drop in model predictive performance when the actual association between the feature and the outcome is broken by randomly shuffling observations 72 , 83 . A feature is considered important if shuffling its values increases the model error (and therefore decreases the model’s predictive performance). In contrast to the coefficients from the elastic net model, the permutation feature importance scores are undirected and do not provide insights into the specific nature of the relationship between the feature and the outcome. However, they account for the fact that some features might not be predictive themselves but could still prove valuable in the overall model performance because they moderate the impact of other features. For example, minority or first-generation students might benefit more from being embedded in a strong social network than majority students who do not face the same barriers and are likely to have a stronger external support network. The bottom of Table 3 displays the 10 most important features in the elastic net and random forest models (i.e., highest permutation variable importance).
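The shuffling logic behind permutation importance can be sketched as follows. This is an illustrative Python example with a deliberately trivial stand-in model and invented data; it uses accuracy as the performance measure for brevity, whereas any metric (such as AUC) could be substituted, and the helper names are hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Share of rows for which the model's prediction matches the label."""
    return sum(1 for row, y in zip(rows, labels) if predict(row) == y) / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)  # break the link between this feature and the outcome
            permuted = [row[:col] + (value,) + row[col + 1:]
                        for row, value in zip(rows, shuffled)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Invented data: (gpa, irrelevant_feature); 1 = retained, 0 = discontinued.
rows = [(3.8, 1), (3.5, 0), (3.2, 1), (2.1, 0), (2.4, 1), (3.0, 0), (1.9, 1), (3.6, 0)]
labels = [1, 1, 1, 0, 0, 1, 0, 1]


def toy_model(row):
    """Stand-in for a fitted model: it only looks at the first feature."""
    return 1 if row[0] > 2.5 else 0


importances = permutation_importance(toy_model, rows, labels)
```

Because the stand-in model ignores the second feature, shuffling it changes nothing and its importance is exactly zero, whereas shuffling the first feature degrades accuracy. The scores carry no sign, which mirrors the point made above that permutation importance is undirected.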

Supporting the findings reported in RQ1, the zero-order correlations confirm that both institutional and behavioral engagement features play an important role in predicting student retention. Aligned with prior work, students’ performance (measured by GPA or ACT) repeatedly appeared as one of the most important predictors across universities and models. In addition, many of the engagement features (e.g., services attended, chat messages network centrality) are related to social activities or network features, supporting the notion that a student’s social connections and support play a critical role in student retention. In addition, the extent to which students are positively engaged with their institutions (e.g., by attending events and rating them highly) appears to play a critical role in preventing dropout.

RQ3: How well do the predictive models generalize across universities?

To test the generalizability of our models across universities, we used the predictive model trained on one university (e.g., University 1) to predict retention of the remaining three universities (e.g., Universities 2–4). Figures  3 A,B display the AUCs across all possible pairs, indicating which university was used for training (X-axis) and which was used for testing (Y-axis, see Figure S1 in the SI for graphs illustrating the findings for F1, TNR and TPR).
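The train-on-one, test-on-another protocol can be illustrated with a small sketch. The Python below uses a deliberately simple one-feature stand-in model and invented data for three hypothetical sites ("A", "B", "C"); it is meant to show the structure of the cross-site evaluation, not the models used in the study.

```python
def fit_threshold(values, labels):
    """Toy 'model': class-mean midpoint on one feature plus the direction of the effect."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
    direction = 1 if pos_mean > neg_mean else -1
    return (pos_mean + neg_mean) / 2, direction


def accuracy(model, values, labels):
    """Share of cases the fitted threshold model classifies correctly."""
    threshold, direction = model
    hits = sum(1 for v, y in zip(values, labels)
               if (1 if direction * (v - threshold) > 0 else 0) == y)
    return hits / len(labels)


def cross_site_matrix(sites):
    """Train on each site and evaluate on every other site."""
    results = {}
    for train_name, (train_x, train_y) in sites.items():
        model = fit_threshold(train_x, train_y)
        for test_name, (test_x, test_y) in sites.items():
            if test_name != train_name:
                results[(train_name, test_name)] = accuracy(model, test_x, test_y)
    return results


# Invented data: sites A and B share the same pattern, site C reverses it.
sites = {
    "A": ([3.6, 3.2, 2.0, 2.4], [1, 1, 0, 0]),
    "B": ([3.8, 3.0, 2.2, 1.8], [1, 1, 0, 0]),
    "C": ([2.0, 2.2, 3.5, 3.9], [1, 1, 0, 0]),
}
results = cross_site_matrix(sites)
```

In this toy setup, models transfer perfectly between the two similar sites and fail completely when applied to or taken from the structurally different site, an exaggerated version of the pattern one would expect when universities differ on important dimensions.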

Figure 3. Performance (average AUC) of cross-university predictions.

Overall, we observed reasonably high levels of predictive performance when applying a model trained on one university to the data of another. The average AUC observed was 63% (for both the elastic net and the random forest), with the highest predictive performance reaching 74% (trained on University 1, predicting University 2), just 1 percentage point short of the predictive performance observed for the prediction from the university’s own model (trained on University 2, predicting University 2). Contrary to the findings in RQ1, the random forest models did not perform better than the elastic net when making predictions for other universities. This suggests that the benefits afforded by the random forest models capture complex interaction patterns that are somewhat unique to each university but might not generalize well to new contexts. The main outlier in generalizability was University 4, where none of the other models reached accuracies much better than chance, and whose model produced relatively low levels of accuracies when predicting student retention across universities 1–2. This is likely a result of the fact that University 4 was qualitatively different from the other universities in several ways, including the fact that University 4 was a community college and consisted of 16 different campuses that were merged for the purpose of this analysis (see Methods for more details).

We show that student retention can be predicted from institutional data, behavioral engagement data, and their combination. Using data from over 50,000 students across four universities, our predictive models achieve out-of-sample accuracies of up to 88% (where 50% is chance). Notably, while both institutional data and behavioral engagement data significantly predict retention, the combination of the two performs best in most instances. This finding is further supported by our feature importance analyses, which suggest that both institutional and behavioral engagement features are among the most important predictors of student retention. Specifically, academic performance as measured by GPA and behavioral metrics associated with campus engagement (e.g., event attendances or ratings) or a student’s position in the network (e.g., closeness or centrality) were shown to consistently act as protective factors. Finally, we highlight the generalizability of our models across universities. Models trained on one university were able to predict student retention at another with reasonably high levels of predictive performance. As one might expect, the generalizability across universities heavily depends on the extent to which the universities are similar on important structural dimensions, with prediction accuracies dropping radically in cases where similarity is low (see low cross-generalization for University 4).

Contributions to the scientific literature

Our findings contribute to the existing literature in several ways. First, they respond to recent calls for more predictive research in psychology 54 , 55 as well as the use of Big Data analytics in education research 56 , 57 . Not only do our models consider socio-demographic characteristics that are collected by universities, but they also capture students’ daily experience and university engagement by tracking behaviors via the READY Education app. Our findings suggest that these more psychological predictors of student retention can improve the performance of predictive models above and beyond socio-demographic variables. This is consistent with previous findings suggesting that the inclusion of engagement metrics improves the performance of predictive models 16 , 84 , 85 . Overall, our models showed superior accuracies to models of former studies that were trained only on demographics and transcript records 15 , 25 or less comprehensive behavioral features 16 and provided results comparable to those reported in studies that additionally included a wide range of socio-economic variables 12 . Given that the READY Education app captures only a fraction of the students' actual experience, the high predictive accuracies make an even stronger case for the importance of student engagement in college retention.

Second, our findings provide insights into the features that are most important in predicting whether a student is going to drop out or not. By doing so they complement our predictive approach with layers of understanding that are conducive to not only validating our models but also generating insights into potential protective and risk factors. Most importantly, our findings highlight the relevance of the behavioral engagement metrics for predicting student retention. Most features identified as being important in the prediction were related to app and community engagement. In line with previous research, features indicative of early and deep social integration, such as interactions with peers and faculty or the development of friendships and social networks, were found to be highly predictive 16 , 41 . For example, it is reasonable to assume that a short time between app registration and the first visit of a campus event (one of the features identified as important) has a positive impact on retention, because campus events offer ideal opportunities for students to socialize 86 . Early participation in a campus event implies early integration and networking with others, protecting students from perceived stress 87 and providing better social and emotional support 88 . In contrast, a student who never attends an event or does so very late in the semester may be less connected to the campus life and the student community which in turn increases the likelihood of dropping out. This interpretation is strengthened by the fact that a high proportion of positive event ratings was identified as an important predictor of a student continuing their studies. Students who enjoy an event are likely to feel more comfortable, be embedded in the university life, make more connections, and build stronger connections. This might result in a virtuous cycle in which students continue attending events and over time create a strong social connection to their peers. 
As in most previous work, a high GPA score was consistently related to a higher likelihood of continuing one’s studies 21 , 24 . Although its importance varied across universities, ethnicity was also found to play a major role for retention, with consistent inequalities replicating in our predictive models 12 , 19 , 47 . For example, Black students were on average more likely to drop out, suggesting that universities should dedicate additional resources to protect this group. Importantly, all qualitative interpretations are post-hoc. While many of the findings are intuitive and align with previous research on the topic, future studies should validate our results and investigate the causality underlying the effects in experimental or longitudinal within-person designs 54 , 78 .

Finally, our findings are the first to explore the extent to which the relationships between certain socio-demographic and behavioral characteristics might be idiosyncratic and unique to a specific university. By being able to compare the models across four different universities, we were able to show that many of the insights gained from one university can be leveraged to predict student retention at another. However, our findings also point to important boundary conditions: The more dissimilar universities are in their organizational structures and student experience, the more idiosyncratic the patterns between certain socio-demographic and behavioral features with student retention will be and the harder it is to merely translate general insights to the specific university campus.

Practical contributions

Our findings also have important practical implications. In the US, student attrition results in an annual revenue loss of approximately $16.5 billion 9 , 10 and over $9 billion wasted in federal and state grants and subsidies that are awarded to students who do not finish their degree 11 . Hence, it is critical to predict potential dropouts as early and as accurately as possible to be able to offer dedicated support and allocate resources where they are needed the most. Our models rely exclusively on data collected in the first semester at university and are therefore an ideal “early warning” system for universities that want to predict whether their students will likely continue their studies or drop out at some point. Depending on the university’s resources and goals, the predictive models can be optimized for different performance measures. For example, a university might decide to focus on the true positive rate to capture as many dropouts as possible. While this would mean erroneously classifying “healthy” students as potential dropouts, universities might decide that the burden of providing “unnecessary” support to these healthy students is worth the reduced risk of missing a dropout. Importantly, our models go beyond mere socio-demographic variables and allow for a more nuanced, personal model that considers not just “who someone is” but also what their experience on campus looks like. As such, our models make it possible to acknowledge individuality rather than using over-generalized assessments of entire socio-demographic segments.

Importantly, however, it is critical to subject these models to continuous quality assurance. While predictive models could allow universities to flag at-risk students early, they could also perpetuate biases that get calcified in the predictive models themselves. For example, students who are traditionally less likely to discontinue their studies might have to pass a much higher level of dysfunctional engagement behavior before their file gets flagged as “at-risk”. Similarly, a person from a traditionally underrepresented group might receive an unnecessarily high volume of additional check-ins even though they are generally flourishing in their day-to-day experience. Given that being labeled as “at-risk” can be associated with stigma that could reinforce stigmas around historically marginalized groups, it will be critical to monitor both the performance of the model over time as well as the perception of its helpfulness among administrators, faculty, and students.

Limitations and future research

Our study has several limitations and highlights avenues for future research. First, our sample consisted of four US universities. Thus, our results are not necessarily generalizable to regions with more collectivistic cultures and other education systems, such as Asia, where the reasons for dropping out might be different 89 , 90 , or Europe, where most students work part-time jobs and live off-campus. Future research should investigate the extent to which our models can generalize to other cultural contexts and identify the features of student retention that are universally valid across contexts.

Second, our predictive models relied on app usage data. Therefore, our predictive approach could only be applied to students who decided to use the app. This selection, in and of itself, is likely to introduce a sampling bias, as students who decide to use the app might be more likely to be retained in the first place, restricting the variance in observations, and excluding students for whom app usage data was not available. However, as our findings suggest, the institutional data alone provide predictive performance independent of the app features, making this a viable alternative for students who do not use the app.

Third, our predictive models rely on cross-sectional predictions. That is, we observe a student’s behavior over the course of an entire semester and, based on the patterns observed in other students, we predict whether that student is likely to drop out or not. Future research could try to improve both the predictive performance of the model and its usefulness for applied contexts by modeling within-person trends dynamically. Given enough data, the model could observe a person’s baseline behavior and identify changes from that baseline as potentially problematic. For instance, more social contact with other students might be considered a protective factor in our cross-sectional model. However, there are substantial individual differences in how much social contact individuals seek out and enjoy 91 . Hence, sending 10 chat messages a week might be considered a lot for one person, but very little for another. Future research should hence investigate whether the behavioral engagement features allow for a more dynamic within-person model that makes it possible to take base rates into account and provide a dynamic, momentary assessment of a student’s likelihood to drop out.
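The idea of judging behavior against a person's own baseline can be illustrated with a simple within-person standardization. The Python sketch below is a hypothetical illustration of the proposal for future work, not part of the reported analyses: the weekly message counts are invented, the four-week baseline window is an arbitrary assumption, and returning 0.0 for a constant baseline is a simplification.

```python
import statistics


def deviation_from_baseline(weekly_counts, baseline_weeks=4):
    """Z-score of the latest week relative to the person's own early-semester baseline."""
    baseline = weekly_counts[:baseline_weeks]
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    if sd == 0:
        return 0.0  # simplification: no variation in the baseline to compare against
    return (weekly_counts[-1] - mean) / sd


# Both students send 10 chat messages in the latest week.
chatty_student = [40, 44, 38, 42, 10]  # usually sends about 40 per week
quiet_student = [8, 12, 9, 11, 10]     # usually sends about 10 per week

chatty_z = deviation_from_baseline(chatty_student)
quiet_z = deviation_from_baseline(quiet_student)
```

The same raw count of 10 messages is unremarkable for the quiet student (a deviation of zero) but represents a sharp drop of roughly 12 baseline standard deviations for the chatty student, which a purely cross-sectional feature would not reveal.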

Fourth, although the engagement data was captured as a longitudinal time series with time-stamped events, we collapsed the data into a single set of cross-sectional features for each student. Although some of these features capture variation in behaviors over time (e.g., entropy and linear trends), future research should try to implement more advanced machine learning models to account for this time series data directly. For example, long short-term memory models (LSTMs) 92 , a type of recurrent neural network, are capable of learning patterns in longitudinal, sequential data like ours.

Fifth, even though the current research provides initial insights into the workings of the models by highlighting the importance of certain features, the conclusions that can be drawn from these analyses are limited as the importance metrics are calculated for the overall population. Future research could aim to calculate the importance of certain features at the individual level to test whether their importance varies across certain socio-demographic features. Estimating the importance of a person’s position in the social network on an individual level, for example, would make it possible to see whether the importance is correlated with institutional data such as minority or first-generation status.

Finally, our results lay the foundation for developing interventions that foster retention through shaping students’ experience at university 93 . Interventions that have been shown to have a positive effect on retention include orientation programs and academic advising 94 , student support services like mentoring and coaching, as well as need-based grants 95 . However, to date, the first-year experience programs meant to strengthen social integration of first-year students do not seem to have yielded positive results 96 , 97 . Our findings could support the development of interventions aimed at improving and maintaining student integration on campus. On a high level, the insights into the most important features provide an empirical path for developing relevant interventions that target the most important levers of student retention. For example, the fact that the time between registration and the first event attendance has such a big impact on student retention means that universities should do everything they can to get students to attend events as early as possible. Similarly, they could develop interventions that lead to more cohesive networks among cohorts and make sure that all students connect to their community. On a deeper, more sophisticated level, new approaches to model explainability could allow universities to tailor their intervention to each student 98 , 99 . For example, explainable AI makes it possible to derive decision rules for each student, indicating which features were critical in predicting the student’s outcome. While student A might be predicted to drop out because they are disconnected from the network, student B might be predicted to drop out because they don’t access the right information on the app. Given this information, universities would be able to personalize their offerings to the specific needs of the student. While student A might be encouraged to spend more time socializing with other students, student B might be reminded to check out important course information. Hence, predictive models could not only be used to identify students at risk but also provide an automated path to offering personalized guidance and support.

For every study that is discontinued, an educational dream shatters. And every shattered dream has a negative long-term impact both on the student and the university the student attended. In this study we introduce an approach to accurately predicting student retention after the first term. Our results show that student retention can be predicted with relatively high levels of predictive performance when considering institutional data, behavioral engagement data, or a combination of the two. By combining socio-demographic characteristics with passively observed behavioral traces reflecting a student’s daily activities, our models offer a holistic picture of students' university experiences and their relation to retention. Overall, such predictive models have great potential both for the early identification of at-risk students and for enabling timely, evidence-based interventions.

Data availability

Raw data are not publicly available due to their proprietary nature and the risks associated with de-anonymization, but they are available from the corresponding author on reasonable request. The pre-processed data and all analysis code are available on OSF ( https://osf.io/bhaqp/ ) to facilitate reproducibility of our work. Data were analyzed using R, version 4.0.0 (R Core Team, 2020; see subsections for specific packages and versions used). The study’s design relies on secondary data and the analyses were not preregistered.

Change history

21 June 2023

A Correction to this paper has been published: https://doi.org/10.1038/s41598-023-36579-2

Ginder, S. A., Kelly-Reid, J. E. & Mann, F. B. Graduation Rates for Selected Cohorts, 2009–14; Outcome Measures for Cohort Year 2009–10; Student Financial Aid, Academic Year 2016–17; and Admissions in Postsecondary Institutions, Fall 2017. First Look (Provisional Data). NCES 2018–151. National Center for Education Statistics (2018).

Snyder, T. D., de Brey, C. & Dillow, S. A. Digest of Education Statistics 2017 NCES 2018-070. Natl. Cent. Educ. Stat. (2019).

NSC Research Center. Persistence & Retention – 2019. NSC Research Center https://nscresearchcenter.org/snapshotreport35-first-year-persistence-and-retention/ (2019).

Bound, J., Lovenheim, M. F. & Turner, S. Why have college completion rates declined? An analysis of changing student preparation and collegiate resources. Am. Econ. J. Appl. Econ. 2 , 129–157 (2010).

Bowen, W. G., Chingos, M. M. & McPherson, M. S. Crossing the finish line. in Crossing the Finish Line (Princeton University Press, 2009).

McFarland, J. et al. The Condition of Education 2019. NCES 2019-144. Natl. Cent. Educ. Stat. (2019).

U.S. Department of Education. Fact sheet: Focusing higher education on student success. [Fact Sheet] (2015).

Freudenberg, N. & Ruglis, J. Peer reviewed: Reframing school dropout as a public health issue. Prev. Chronic Dis. 4 , 4 (2007).

Raisman, N. The cost of college attrition at four-year colleges & universities-an analysis of 1669 US institutions. Policy Perspect. (2013).

Wellman, J., Johnson, N. & Steele, P. Measuring (and Managing) the Invisible Costs of Postsecondary Attrition. Policy brief. Delta Cost Proj. Am. Instit. Res. (2012).

Schneider, M. Finishing the first lap: The cost of first year student attrition in America’s four year colleges and universities (American Institutes for Research, 2010).

Delen, D. A comparative analysis of machine learning techniques for student retention management. Decis. Support Syst. 49 , 498–506 (2010).

Yu, R., Lee, H. & Kizilcec, R. F. Should College Dropout Prediction Models Include Protected Attributes? in Proceedings of the Eighth ACM Conference on Learning@ Scale 91–100 (2021).

Tinto, V. Reconstructing the first year of college. Plan. High. Educ. 25 , 1–6 (1996).

Ortiz-Lozano, J. M., Rua-Vieites, A., Bilbao-Calabuig, P. & Casadesús-Fa, M. University student retention: Best time and data to identify undergraduate students at risk of dropout. Innov. Educ. Teach. Int. 57 , 74–85 (2020).

Ram, S., Wang, Y., Currim, F. & Currim, S. Using big data for predicting freshmen retention. in 2015 international conference on information systems: Exploring the information frontier, ICIS 2015 (Association for Information Systems, 2015).

Levitz, R. S., Noel, L. & Richter, B. J. Strategic moves for retention success. N. Dir. High. Educ. 1999 , 31–49 (1999).

Veenstra, C. P. A strategy for improving freshman college retention. J. Qual. Particip. 31 , 19–23 (2009).

Astin, A. W. How “good” is your institution’s retention rate? Res. High. Educ. 38 , 647–658 (1997).

Coleman, J. S. Social capital in the creation of human capital. Am. J. Sociol. 94 , S95–S120 (1988).

Reason, R. D. Student variables that predict retention: Recent research and new developments. J. Stud. Aff. Res. Pract. 40 , 704–723 (2003).

Tinto, V. Dropout from higher education: A theoretical synthesis of recent research. Rev Educ Res 45 , 89–125 (1975).

Tinto, V. Completing college: Rethinking institutional action (University of Chicago Press, 2012).

Astin, A. Retaining and Satisfying Students. Educ. Rec. 68 , 36–42 (1987).

Aulck, L., Velagapudi, N., Blumenstock, J. & West, J. Predicting student dropout in higher education. arXiv preprint arXiv:1606.06364 (2016).

Bogard, M., Helbig, T., Huff, G. & James, C. A comparison of empirical models for predicting student retention (Western Kentucky University, 2011).

Murtaugh, P. A., Burns, L. D. & Schuster, J. Predicting the retention of university students. Res. High. Educ. 40 , 355–371 (1999).

Porter, K. B. Current trends in student retention: A literature review. Teach. Learn. Nurs. 3 , 3–5 (2008).

Thomas, S. L. Ties that bind: A social network approach to understanding student integration and persistence. J. High. Educ. 71 , 591–615 (2000).

Peltier, G. L., Laden, R. & Matranga, M. Student persistence in college: A review of research. J. Coll. Stud. Ret. 1 , 357–375 (2000).

Nandeshwar, A., Menzies, T. & Nelson, A. Learning patterns of university student retention. Expert Syst. Appl. 38 , 14984–14996 (2011).

Boero, G., Laureti, T. & Naylor, R. An econometric analysis of student withdrawal and progression in post-reform Italian universities. (2005).

Tinto, V. Leaving college: Rethinking the causes and cures of student attrition (ERIC, 1987).

Choy, S. Students whose parents did not go to college: Postsecondary access, persistence, and attainment. Findings from the condition of education, 2001. (2001).

Ishitani, T. T. Studying attrition and degree completion behavior among first-generation college students in the United States. J. High. Educ. 77 , 861–885 (2006).

Thayer, P. B. Retention of students from first generation and low income backgrounds. (2000).

Britt, S. L., Ammerman, D. A., Barrett, S. F. & Jones, S. Student loans, financial stress, and college student retention. J. Stud. Financ. Aid 47 , 3 (2017).

McKinney, L. & Burridge, A. B. Helping or hindering? The effects of loans on community college student persistence. Res. High Educ. 56 , 299–324 (2015).

Hochstein, S. K. & Butler, R. R. The effects of the composition of a financial aids package on student retention. J. Stud. Financ. Aid 13 , 21–26 (1983).

Singell, L. D. Jr. Come and stay a while: Does financial aid effect retention conditioned on enrollment at a large public university?. Econ. Educ. Rev. 23 , 459–471 (2004).

Bean, J. P. Nine themes of college student retention. in College student retention: Formula for student success 215–243 (2005).

Tinto, V. Through the eyes of students. J. Coll. Stud. Ret. 19 , 254–269 (2017).

Cabrera, A. F., Nora, A. & Castaneda, M. B. College persistence: Structural equations modeling test of an integrated model of student retention. J. High. Educ. 64 , 123–139 (1993).

Roberts, J. & Styron, R. Student satisfaction and persistence: Factors vital to student retention. Res. High. Educ. J. 6 , 1 (2010).

Gopalan, M. & Brady, S. T. College students’ sense of belonging: A national perspective. Educ. Res. 49 , 134–137 (2020).

Hoffman, M., Richmond, J., Morrow, J. & Salomone, K. Investigating, “sense of belonging” in first-year college students. J. Coll. Stud. Ret. 4 , 227–256 (2002).

Terenzini, P. T. & Pascarella, E. T. Toward the validation of Tinto’s model of college student attrition: A review of recent studies. Res. High Educ. 12 , 271–282 (1980).

Astin, A. W. The impact of dormitory living on students. Educational record (1973).

Astin, A. W. Student involvement: A developmental theory for higher education. J. Coll. Stud. Pers. 25 , 297–308 (1984).

Terenzini, P. T. & Pascarella, E. T. Studying college students in the 21st century: Meeting new challenges. Rev. High Ed. 21 , 151–165 (1998).

Thompson, J., Samiratedu, V. & Rafter, J. The effects of on-campus residence on first-time college students. NASPA J. 31 , 41–47 (1993).

Tinto, V. Research and practice of student retention: What next?. J. Coll. Stud. Ret. 8 , 1–19 (2006).

Lazer, D. et al. Computational social science. Science 323 , 721–723 (2009).

Yarkoni, T. & Westfall, J. Choosing prediction over explanation in psychology: Lessons from machine learning. Perspect. Psychol. Sci. 12 , 1100–1122 (2017).

Peters, H., Marrero, Z. & Gosling, S. D. The Big Data toolkit for psychologists: Data sources and methodologies. in The psychology of technology: Social science research in the age of Big Data. 87–124 (American Psychological Association, 2022). doi: https://doi.org/10.1037/0000290-004 .

Fischer, C. et al. Mining big data in education: Affordances and challenges. Rev. Res. Educ. 44 , 130–160 (2020).

Hilbert, S. et al. Machine learning for the educational sciences. Rev. Educ. 9 , e3310 (2021).

National Academy of Education. Big data in education: Balancing the benefits of educational research and student privacy . (2017).

Aulck, L., Nambi, D., Velagapudi, N., Blumenstock, J. & West, J. Mining university registrar records to predict first-year undergraduate attrition. Int. Educ. Data Min. Soc. (2019).

Beaulac, C. & Rosenthal, J. S. Predicting university students’ academic success and major using random forests. Res. High Educ. 60 , 1048–1064 (2019).

Berens, J., Schneider, K., Görtz, S., Oster, S. & Burghoff, J. Early detection of students at risk–predicting student dropouts using administrative student data and machine learning methods. Available at SSRN 3275433 (2018).

Dawson, S., Jovanovic, J., Gašević, D. & Pardo, A. From prediction to impact: Evaluation of a learning analytics retention program. in Proceedings of the seventh international learning analytics & knowledge conference 474–478 (2017).

Dekker, G. W., Pechenizkiy, M. & Vleeshouwers, J. M. Predicting students drop Out: A case study. Int. Work. Group Educ. Data Min. (2009).

del Bonifro, F., Gabbrielli, M., Lisanti, G. & Zingaro, S. P. Student dropout prediction. in International Conference on Artificial Intelligence in Education 129–140 (Springer, 2020).

Hutt, S., Gardner, M., Duckworth, A. L. & D’Mello, S. K. Evaluating fairness and generalizability in models predicting on-time graduation from college applications. Int. Educ. Data Min. Soc. (2019).

Jayaprakash, S. M., Moody, E. W., Lauría, E. J. M., Regan, J. R. & Baron, J. D. Early alert of academically at-risk students: An open source analytics initiative. J. Learn. Anal. 1 , 6–47 (2014).

Balakrishnan, G. & Coetzee, D. Predicting student retention in massive open online courses using hidden markov models. Elect. Eng. Comput. Sci. Univ. Calif. Berkeley 53 , 57–58 (2013).

Hastie, T., Tibshirani, R. & Friedman, J. The elements of statistical learning (Springer series in statistics, New York, NY, USA, 2001).

Chawla, N. V., Bowyer, K. W., Hall, L. O. & Kegelmeyer, W. P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 16 , 321–357 (2002).

Zou, H. & Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Seri. B Stat. Methodol. 67 , 301–320 (2005).

Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33 , 1 (2010).

Breiman, L. Random forests. Mach. Learn. 45 , 5–32 (2001).

Liaw, A. & Wiener, M. Classification and regression by randomForest. R News 2 , 18–22 (2002).

Pargent, F., Schoedel, R. & Stachl, C. An introduction to machine learning for psychologists in R. Psyarxiv (2022).

Hoerl, A. E. & Kennard, R. W. Ridge Regression. in Encyclopedia of Statistical Sciences vol. 8 129–136 (John Wiley & Sons, Inc., 2004).

Tibshirani, R. Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Ser. B (Methodol.) 58 , 267–288 (1996).

Hastie, T. & Qian, J. Glmnet vignette. vol. 9 1–42 https://hastie.su.domains/Papers/Glmnet_Vignette.pdf (2016).

Orrù, G., Monaro, M., Conversano, C., Gemignani, A. & Sartori, G. Machine learning in psychometrics and psychological research. Front. Psychol. 10 , 2970 (2020).

Pargent, F. & Albert-von der Gönna, J. Predictive modeling with psychological panel data. Z Psychol (2019).

Pargent, F., Schoedel, R. & Stachl, C. Best practices in supervised machine learning: A tutorial for psychologists. Doi: https://doi.org/10.31234/osf.io/89snd (2023).

Friedman, J., Hastie, T. & Tibshirani, R. The elements of statistical learning Vol. 1 (Springer series in statistics, 2001).

van Rijsbergen, C. J. Information Retrieval (Butterworths, London, 1979).

Molnar, C. Interpretable machine learning . (Lulu. com, 2020).

Aguiar, E., Ambrose, G. A., Chawla, N. V., Goodrich, V. & Brockman, J. Engagement vs performance: Using electronic portfolios to predict first semester engineering student persistence. J. Learn. Anal. 1 (2014).

Chai, K. E. K. & Gibson, D. Predicting the risk of attrition for undergraduate students with time based modelling. Int. Assoc. Dev. Inf. Soc. (2015).

Saenz, T., Marcoulides, G. A., Junn, E. & Young, R. The relationship between college experience and academic performance among minority students. Int. J. Educ. Manag (1999).

Pidgeon, A. M., Coast, G., Coast, G. & Coast, G. Psychosocial moderators of perceived stress, anxiety and depression in university students: An international study. Open J. Soc. Sci. 2 , 23 (2014).

Wilcox, P., Winn, S. & Fyvie-Gauld, M. ‘It was nothing to do with the university, it was just the people’: The role of social support in the first-year experience of higher education. Stud. High. Educ. 30 , 707–722 (2005).

Guiffrida, D. A. Toward a cultural advancement of Tinto’s theory. Rev. High Ed. 29 , 451–472 (2006).

Triandis, H. C., McCusker, C. & Hui, C. H. Multimethod probes of individualism and collectivism. J. Pers. Soc. Psychol. 59 , 1006 (1990).

Watson, D. & Clark, L. A. Extraversion and its positive emotional core. in Handbook of personality psychology 767–793 (Elsevier, 1997).

Greff, K., Srivastava, R. K., Koutník, J., Steunebrink, B. R. & Schmidhuber, J. LSTM: A search space odyssey. IEEE Trans. Neural Netw. Learn. Syst. 28 , 2222–2232 (2017).

Arnold, K. E. & Pistilli, M. D. Course signals at Purdue: Using learning analytics to increase student success. in Proceedings of the 2nd international conference on learning analytics and knowledge 267–270 (2012).

Braxton, J. M. & McClendon, S. A. The fostering of social integration and retention through institutional practice. J. Coll. Stud. Ret. 3 , 57–71 (2001).

Sneyers, E. & de Witte, K. Interventions in higher education and their effect on student success: A meta-analysis. Educ. Rev. (Birm) 70 , 208–228 (2018).

Jamelske, E. Measuring the impact of a university first-year experience program on student GPA and retention. High Educ. (Dordr) 57 , 373–391 (2009).

Purdie, J. R. & Rosser, V. J. Examining the academic performance and retention of first-year students in living-learning communities and first-year experience courses. Coll. Stud. Aff. J. 29 , 95 (2011).

Lundberg, S. M. et al. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2 , 56–67 (2020).

Ramon, Y., Farrokhnia, R. A., Matz, S. C. & Martens, D. Explainable AI for psychological profiling from behavioral data: An application to big five personality predictions from financial transaction records. Information 12 , 518 (2021).

Author information

Alice Dinu is an Independent Researcher.

Authors and Affiliations

Columbia University, New York, USA

Sandra C. Matz & Heinrich Peters

Ludwig Maximilian University of Munich, Munich, Germany

Christina S. Bukow

Ready Education, Montreal, Canada

Christine Deacons

University of St. Gallen, St. Gallen, Switzerland

Clemens Stachl

Montreal, Canada

Alice Dinu

Contributions

S.C.M., C.B., A.D., H.P., and C.S. designed the research. C.D. and A.D. provided the data. S.C.M., C.B., and H.P. analyzed the data. S.C.M. and C.B. wrote the manuscript. All authors reviewed the manuscript. Earlier versions of this research were part of C.B.'s master's thesis, which was supervised by S.C.M. and C.S.

Corresponding author

Correspondence to Sandra C. Matz.

Ethics declarations

Competing interests.

C.D. is a former employee of Ready Education. None of the other authors has a conflict of interest related to this submission.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this Article was revised: Alice Dinu was omitted from the author list in the original version of this Article. The Author Contributions section now reads: “S.C.M., C.B, A.D., H.P., and C.S. designed the research. C.D. and A.D. provided the data. S.C.M, C.B. and H.P. analyzed the data. S.C.M and C.B. wrote the manuscript. All authors reviewed the manuscript. Earlier versions of this research were part of the C.B.’s masters thesis which was supervised by S.C.M. and C.S.” Additionally, the Article contained an error in Data Availability section and the legend of Figure 2 was incomplete.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Matz, S.C., Bukow, C.S., Peters, H. et al. Using machine learning to predict student retention from socio-demographic characteristics and app-based engagement metrics. Sci Rep 13 , 5705 (2023). https://doi.org/10.1038/s41598-023-32484-w

Received: 09 August 2022

Accepted: 28 March 2023

Published: 07 April 2023

DOI: https://doi.org/10.1038/s41598-023-32484-w

J Grad Med Educ. 2016 Jul; 8(3).

The Literature Review: A Foundation for High-Quality Medical Education Research

a In the box of information resources below, entries marked "a" are subscription resources. Researchers should check with their librarian to determine their access rights.

Despite a surge in published scholarship in medical education 1 and rapid growth in journals that publish educational research, manuscript acceptance rates continue to fall. 2 Failure to conduct a thorough, accurate, and up-to-date literature review identifying an important problem and placing the study in context is consistently identified as one of the top reasons for rejection. 3,4 The purpose of this editorial is to provide a road map and practical recommendations for planning a literature review. By understanding the goals of a literature review and following a few basic processes, authors can enhance both the quality of their educational research and the likelihood of publication in the Journal of Graduate Medical Education (JGME) and in other journals.

The Literature Review Defined

In medical education, no organization has articulated a formal definition of a literature review for a research paper; thus, a literature review can take a number of forms. Depending on the type of article, target journal, and specific topic, these forms will vary in methodology, rigor, and depth. Several organizations have published guidelines for conducting an intensive literature search intended for formal systematic reviews, both broadly (eg, PRISMA) 5 and within medical education, 6 and there are excellent commentaries to guide authors of systematic reviews. 7,8

Such work is outside the scope of this article, which focuses on literature reviews to inform reports of original medical education research. We define such a literature review as a synthetic review and summary of what is known and unknown regarding the topic of a scholarly body of work, including the current work's place within the existing knowledge. While this type of literature review may not require the intensive search processes mandated by systematic reviews, it merits a thoughtful and rigorous approach.

  • A literature review forms the basis for high-quality medical education research and helps maximize relevance, originality, generalizability, and impact.
  • A literature review provides context, informs methodology, maximizes innovation, avoids duplicative research, and ensures that professional standards are met.
  • Literature reviews take time, are iterative, and should continue throughout the research process.
  • Researchers should maximize the use of human resources (librarians, colleagues), search tools (databases/search engines), and existing literature (related articles).
  • Keeping organized is critical.

Purpose and Importance of the Literature Review

An understanding of the current literature is critical for all phases of a research study. Lingard 9 recently invoked the “journal-as-conversation” metaphor as a way of understanding how one's research fits into the larger medical education conversation. As she described it: “Imagine yourself joining a conversation at a social event. After you hang about eavesdropping to get the drift of what's being said (the conversational equivalent of the literature review), you join the conversation with a contribution that signals your shared interest in the topic, your knowledge of what's already been said, and your intention.” 9

The literature review helps any researcher “join the conversation” by providing context, informing methodology, identifying innovation, minimizing duplicative research, and ensuring that professional standards are met. Understanding the current literature also promotes scholarship, as proposed by Boyer, 10 by contributing to 5 of the 6 standards by which scholarly work should be evaluated. 11 Specifically, the review helps the researcher (1) articulate clear goals, (2) show evidence of adequate preparation, (3) select appropriate methods, (4) communicate relevant results, and (5) engage in reflective critique.

Failure to conduct a high-quality literature review is associated with several problems identified in the medical education literature, including studies that are repetitive, not grounded in theory, or methodologically weak, and that fail to expand knowledge beyond a single setting. 12 Indeed, medical education scholars complain that many studies repeat work already published and contribute little new knowledge, a likely cause of which is failure to conduct a proper literature review. 3,4

Likewise, a lack of theoretical grounding or a conceptual framework makes study design and interpretation difficult. 13 When theory is used in medical education studies, it is often invoked at a superficial level. As Norman 14 noted, when theory is used appropriately, it helps articulate variables that might be linked together and why, and it allows the researcher to make hypotheses and define a study's context and scope. Ultimately, a proper literature review is a first critical step toward identifying relevant conceptual frameworks.

Another problem is that many medical education studies are methodologically weak. 12 Good research requires trained investigators who can articulate relevant research questions, operationally define variables of interest, and choose the best method for specific research questions. Conducting a proper literature review helps both novice and experienced researchers select rigorous research methodologies.

Finally, many studies in medical education are “one-offs,” that is, single studies undertaken because the opportunity presented itself locally. Such studies frequently are not oriented toward progressive knowledge building and generalization to other settings. A firm grasp of the literature can encourage a programmatic approach to research.

Approaching the Literature Review

Considering these issues, journals have a responsibility to demand from authors a thoughtful synthesis of their study's position within the field, and it is the authors' responsibility to provide such a synthesis, based on a literature review. The aforementioned purposes of the literature review mandate that the review occur throughout all phases of a study, from conception and design, to implementation and analysis, to manuscript preparation and submission.

Planning the literature review requires understanding of journal requirements, which vary greatly by journal (table 1). Authors are advised to take note of common problems with reporting results of the literature review. Table 2 lists the most common problems that we have encountered as authors, reviewers, and editors.

Table 1. Sample of Journals' Author Instructions for Literature Reviews Conducted as Part of Original Research Article (table image not reproduced here)

Table 2. Common Problem Areas for Reporting Literature Reviews in the Context of Scholarly Articles (table image not reproduced here)

Locating and Organizing the Literature

Three resources may facilitate identifying relevant literature: human resources, search tools, and related literature. As the process requires time, it is important to begin searching for literature early in the process (ie, the study design phase). Identifying and understanding relevant studies will increase the likelihood of designing a relevant, adaptable, generalizable, and novel study that is based on educational or learning theory and can maximize impact.

Human Resources

A medical librarian can help translate research interests into an effective search strategy, familiarize researchers with available information resources, offer advice on organizing information, and introduce strategies for keeping current with emerging research. Often, librarians are also aware of research across their institutions and may be able to connect researchers with similar interests. Reaching out to colleagues for suggestions may help researchers quickly locate resources that would not otherwise be on their radar.

During this process, researchers will likely identify other researchers writing on aspects of their topic. Researchers should consider searching for the publications of these relevant researchers (see table 3 for search strategies). Additionally, institutional websites may include curricula vitae of such relevant faculty with access to their entire publication record, including difficult-to-locate publications, such as book chapters, dissertations, and technical reports.

Table 3. Strategies for Finding Related Researcher Publications in Databases and Search Engines (table image not reproduced here)

Search Tools and Related Literature

Researchers will locate the majority of needed information using databases and search engines. Excellent resources are available to guide researchers in the mechanics of literature searches. 15,16

Because medical education research draws on a variety of disciplines, researchers should include search tools with coverage beyond medicine (eg, psychology, nursing, education, and anthropology) and that cover several publication types, such as reports, standards, conference abstracts, and book chapters (see the box for several information resources). Many search tools include options for viewing citations of selected articles. Examining cited references provides additional articles for review and a sense of the influence of the selected article on its field.

Box: Information Resources

  • Web of Science a
  • Education Resource Information Center (ERIC)
  • Cumulative Index of Nursing & Allied Health (CINAHL) a
  • Google Scholar

Once relevant articles are located, it is useful to mine those articles for additional citations. One strategy is to examine references of key articles, especially review articles, for relevant citations.

Getting Organized

As the aforementioned resources will likely provide a tremendous amount of information, organization is crucial. Researchers should determine which details are most important to their study (eg, participants, setting, methods, and outcomes) and generate a strategy for keeping those details organized and accessible. Increasingly, researchers utilize digital tools, such as Evernote, to capture such information, which enables accessibility across digital workspaces and search capabilities. Use of citation managers can also be helpful as they store citations and, in some cases, can generate bibliographies (table 4).

Table 4. Citation Managers (table image not reproduced here)

Knowing When to Say When

Researchers often ask how to know when they have located enough citations. Unfortunately, there is no magic or ideal number of citations to collect. One strategy for checking coverage of the literature is to inspect references of relevant articles. As researchers review references they will start noticing a repetition of the same articles with few new articles appearing. This can indicate that the researcher has covered the literature base on a particular topic.
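For researchers who keep their reading notes in a structured form, the repetition check described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method from this editorial: the function names, the 20% threshold, and the two-article window are assumptions chosen for the example, and each reference list is assumed to be a list of identifiers (eg, DOIs or short keys) recorded in the order the articles were read.

```python
def new_citation_rates(reference_lists):
    """For each article, in reading order, return the share of its
    references that did not appear in any earlier article's list."""
    seen = set()
    rates = []
    for refs in reference_lists:
        refs = set(refs)
        if not refs:
            # An article with no recorded references adds nothing new.
            rates.append(0.0)
            continue
        new = refs - seen
        rates.append(len(new) / len(refs))
        seen |= refs
    return rates


def looks_saturated(rates, threshold=0.2, window=2):
    """Hypothetical stopping rule: the last `window` articles each
    contributed at most `threshold` new references."""
    recent = rates[-window:]
    return len(recent) == window and all(r <= threshold for r in recent)


# Toy example with made-up reference keys.
reading_order = [
    ["A", "B", "C"],
    ["B", "C", "D"],
    ["A", "C", "D"],
    ["A", "B", "D"],
]
rates = new_citation_rates(reading_order)
print(rates)
print(looks_saturated(rates))
```

A falling rate of new references is a signal rather than proof of coverage; the threshold and window are judgment calls that depend on the topic and on how broadly the initial search was cast.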

Putting It All Together

In preparing to write a research paper, it is important to consider which citations to include and how they will inform the introduction and discussion sections. The “Instructions to Authors” for the targeted journal will often provide guidance on structuring the literature review (or introduction) and the number of total citations permitted for each article category. Reviewing articles of similar type published in the targeted journal can also provide guidance regarding structure and average lengths of the introduction and discussion sections.

When selecting references for the introduction, consider those that illustrate core background theoretical and methodological concepts, as well as recent relevant studies. The introduction should be brief and present references not as a laundry list or narrative of available literature, but rather as a synthesized summary to provide context for the current study and to identify the gap in the literature that the study intends to fill. For the discussion, citations should be thoughtfully selected to compare and contrast the present study's findings with the current literature and to indicate how the present study moves the field forward.

To facilitate writing a literature review, journals are increasingly providing helpful features to guide authors. For example, the resources available through JGME include several articles on writing. 17 The journal Perspectives on Medical Education recently launched “The Writer's Craft,” which is intended to help medical educators improve their writing. Additionally, many institutions have writing centers that provide web-based materials on writing a literature review, and some even have writing coaches.

The literature review is a vital part of medical education research and should occur throughout the research process to help researchers design a strong study and effectively communicate study results and importance. To achieve these goals, researchers are advised to plan and execute the literature review carefully. The guidance in this editorial provides considerations and recommendations that may improve the quality of literature reviews.

IMAGES

  1. PPT

    literature based research project

  2. Topics For Research, Research Writing, Essay Writing Help, Essay Help

    literature based research project

  3. How To Make A Literature Review For A Research Paper

    literature based research project

  4. Introduction to Literature Reviews

    literature based research project

  5. Research development based on quantitative...- Mind Map

    literature based research project

  6. Literature Review Thesis Example

    literature based research project

VIDEO

  1. Chapter two

  2. The Literature Review

  3. Research Methods

  4. Approaches , Analysis And Sources Of Literature Review ( RESEARCH METHODOLOGY AND IPR)

  5. Literature Review

  6. WRITING THE LITERATURE REVIEW #research#trending

COMMENTS

  1. Dissertations & projects: Literature-based projects

    The structure of a literature-based dissertation is usually thematic, but make sure to check with your supervisor to make sure you are abiding by your department's project specifications. A typical literature-based dissertation will be broken up into the following sections: Title. Abstract or summary. Acknowledgments.

  2. How to Write a Literature Review

    Examples of literature reviews. Step 1 - Search for relevant literature. Step 2 - Evaluate and select sources. Step 3 - Identify themes, debates, and gaps. Step 4 - Outline your literature review's structure. Step 5 - Write your literature review.

  3. Literature review as a research methodology: An ...

    As mentioned previously, there are a number of existing guidelines for literature reviews. Depending on the methodology needed to achieve the purpose of the review, all types can be helpful and appropriate to reach a specific goal (for examples, please see Table 1).These approaches can be qualitative, quantitative, or have a mixed design depending on the phase of the review.

  4. Writing a Literature Review Research Paper: A step-by-step approach

    A literature review is a surveys scholarly articles, books and other sources relevant to a particular. issue, area of research, or theory, and by so doing, providing a description, summary, and ...

  5. Writing a literature review

    Writing a literature review requires a range of skills to gather, sort, evaluate and summarise peer-reviewed published data into a relevant and informative unbiased narrative. Digital access to research papers, academic texts, review articles, reference databases and public data sets are all sources of information that are available to enrich ...

  6. How to Write a Research Proposal

    Writing a research proposal can be quite challenging, but a good starting point could be to look at some examples. We've included a few for you below. Example research proposal #1: "A Conceptual Framework for Scheduling Constraint Management" Example research proposal #2: "Medical Students as Mediators of Change in Tobacco Use" Title page

  7. What is a Literature Review?

    A literature review is a survey of scholarly sources on a specific topic. It provides an overview of current knowledge, allowing you to identify relevant theories, methods, and gaps in the existing research. There are five key steps to writing a literature review: Search for relevant literature. Evaluate sources. Identify themes, debates and gaps.

  8. Literature Based Dissertation

    Chapter 1 - Introduction. Begin your literature-based dissertation with a compelling introduction that sets the context for your research. Clearly state the purpose and significance of your study, along with your research objectives and research questions. Emphasise the gap or problem your dissertation aims to address and explain how a thorough ...

  9. (PDF) Literature Review as a Research Methodology: An overview and

    Literature reviews allow scientists to argue that they are expanding current. expertise - improving on what already exists and filling the gaps that remain. This paper demonstrates the literatu ...

  10. Methodological Approaches to Literature Review

    A literature review is defined as "a critical analysis of a segment of a published body of knowledge through summary, classification, and comparison of prior research studies, reviews of literature, and theoretical articles." (The Writing Center University of Winconsin-Madison 2022) A literature review is an integrated analysis, not just a summary of scholarly work on a specific topic.

  11. Ten Simple Rules for Writing a Literature Review

    Literature reviews are in great demand in most scientific fields. Their need stems from the ever-increasing output of scientific publications. For example, compared to 1991, in 2008 three, eight, and forty times more papers were indexed in Web of Science on malaria, obesity, and biodiversity, respectively. Given such mountains of papers, scientists cannot be expected to examine in detail every ...

  12. A systematic review on literature-based discovery workflow

    This systematic review provides a comprehensive overview of the LBD workflow by answering nine research questions related to the major components of the LBD workflow (i.e., input, process, output, and evaluation). With regards to the input component, we discuss the data types and data sources used in the literature.

  13. Reviewing literature for research: Doing it the right way

    A thorough review of the literature is not only essential for selecting research topics, but also enables the right applicability of a research project. Most importantly, a good literature search is the cornerstone of the practice of evidence-based medicine.

  14. How to write a superb literature review

    The best proposals are timely and clearly explain why readers should pay attention to the proposed topic. It is not enough for a review to be a summary of the latest growth in the literature: the ...

  15. Project-based learning: A review of the literature

    Project-based learning: A review of the literature. Dimitra Kokotsaki, Victoria Menzies, and Andy Wiggins. ... A review of research on project-based learning. California: The Autodesk Foundation. Wrigley T. (2007). Projects, stories and challenges: More open architectures for ...

  16. Steps in Conducting a Literature Review

    A literature review is an integrated analysis, not just a summary, of scholarly writings and other relevant evidence related directly to your research question. That is, it represents a synthesis of the evidence that provides background information on your topic and shows an association between the evidence and your research question.

  17. 5. The Literature Review

    A literature review may consist of simply a summary of key sources, but in the social sciences, a literature review usually has an organizational pattern and combines both summary and synthesis, often within specific conceptual categories. A summary is a recap of the important information of the source, but a synthesis is a re-organization, or a reshuffling, of that information in a way that ...

  18. Literature search for research planning and identification of research

    INTRODUCTION. A literature search is a systematic and well-organised search of already published data to identify a breadth of good-quality references on a specific topic. There are numerous reasons for conducting a literature search, including drawing information for making evidence-based guidelines, as a step in the research method, and as part of academic assessment.

  19. Guidance on Conducting a Systematic Literature Review

    Literature review is an essential feature of academic research. Fundamentally, knowledge advancement must be built on prior existing work. To push the knowledge frontier, we must know where the frontier is. By reviewing relevant literature, we understand the breadth and depth of the existing body of work and identify gaps to explore.

  20. Project-based learning: A review of the literature

    Project-based learning is a student-centred form of instruction which is based on three constructivist principles: learning is context-specific, learners are involved actively in the learning ...

  21. Toward a framework for selecting indicators of measuring ...

    Purpose. The implementation of sustainability and circular economy (CE) models in agri-food production can promote resource efficiency, reduce environmental burdens, and ensure improved and socially responsible systems. In this context, indicators for the measurement of sustainability play a crucial role. Indicators can measure CE strategies aimed to preserve functions, products, components ...

  22. Full article: Engineering education 5.0: a systematic literature review

    The remaining papers use the methods of interviews and literature research, with two mentions each. Secondly, the papers were divided into a total of two thematic categories according to the level of higher education didactic planning and their involvement at the macro, meso and(/or) micro level. ... Project-Based Learning (PBL): ...

  23. Approaching literature review for academic purposes: The Literature

    A sophisticated literature review (LR) can result in a robust dissertation/thesis by scrutinizing the main problem examined by the academic study; anticipating research hypotheses, methods and results; and maintaining the interest of the audience in how the dissertation/thesis will provide solutions for the current gaps in a particular field.

  24. Using machine learning to predict student retention from socio ...

    Student attrition poses a major challenge to academic institutions, funding bodies and students. With the rise of Big Data and predictive analytics, a growing body of work in higher education ...

  25. Research on the Correlation of Safety Risk of Railway Bridge ...

    China has emerged as a prominent global player in the field of railways, with numerous railway construction projects spanning across diverse locations. Railway bridges, as a crucial component of railway construction, warrant significant attention. Meta-analysis, a statistical method that systematically synthesizes research findings, has been utilized to summarize and compare the results of ...

  26. Literature-based discovery approaches for evidence-based healthcare: a

    Purpose. Literature-Based Discovery (LBD) is a text mining technique used to generate novel hypotheses from vast amounts of literature sources, by identifying links between concepts from disparate sources. One of the main areas where it has been predominantly applied is the healthcare domain, whereby promising results, in the form of novel ...

  27. The Literature Review: A Foundation for High-Quality Medical Education

    Purpose and Importance of the Literature Review. An understanding of the current literature is critical for all phases of a research study. Lingard 9 recently invoked the "journal-as-conversation" metaphor as a way of understanding how one's research fits into the larger medical education conversation. As she described it: "Imagine yourself joining a conversation at a social event.