[Figure excerpt: EBP process steps — 2. Searches, appraises and synthesises the literature; 3. If literature is lacking, conduct research. EBP, evidence-based practice.]
All 19 models and frameworks included a process for asking questions. Most focused on identifying problems that needed to be addressed on an organisational or hospital level. Five used the PICO (population, intervention, comparator, outcome) format to ask specific questions related to patient care. 19–25
The models and frameworks gave only basic instructions on acquiring literature, such as ‘conduct systematic search’ or ‘acquire resource’. 20 Four recommended sources of previously generated evidence, such as guidelines and systematic reviews. 6 21 22 26 Although most models and frameworks did not provide specifics, some suggested this work be done through EBP mentors/experts. 20 21 25 27 Seven models included qualitative evidence as part of the evidence to be used, 6 19 21 24 27–29 while only four considered patient preferences and values as evidence. 21 22 24 27 Six models recommended that internal data be used in acquiring information. 17 20–22 24 27
The models and frameworks varied greatly in the level of instruction provided for assessing the best evidence. All provided a general overview of assessing and grading the evidence. Four recommended this work be done by EBP mentors and experts. 20 25 27 30 Seven models developed specific tools for assessing the levels of evidence. 6 17 21 22 24 25 27
The application of evidence also varied greatly across the different models and frameworks. Seven models recommended pilot programmes to implement change. 6 21–25 31 Five recommended the use of EBP mentors and experts to assist in the implementation of evidence and quality improvement. 20 24 25 27 Thirteen models and frameworks discussed patient values and preferences, 6 17–19 21–27 31 32 but only seven incorporated this topic into the model or framework, 21–27 and only five included tools and instructions. 21–25 Twelve of the 20 models discussed using clinical skill, but specifics of how this was incorporated were lacking. 6 17–19 21–27 31
Evaluation varied among the models and frameworks, but most involved using implementation outcome measures to determine the project’s success. Five models and frameworks provided tools and in-depth instruction for evaluation. 21 22 24–26 Monash Partners Learning Health Systems provided detailed instruction on using internal institutional data to determine the success of application. 26 This framework uses internal and external data, along with evidence, in decision making as a benchmark for successful implementation.
EBP models and frameworks provide a process for transforming evidence into clinical practice and allow organisations to determine readiness and willingness for change in a complex hospital system. 12 The large number of models and frameworks, however, complicates selection by making it unclear which tool is best for a given healthcare organisation. This review examined many models and frameworks and assessed their characteristics and gaps to help healthcare organisations determine the right tool for themselves. This review identified 19 EBP models and frameworks that included the five main steps of EBP as described by Sackett. 5 The results showed that the themes of the models and frameworks are as diverse as the models and frameworks themselves. Some are well developed and widely used, with supporting validation and updates. 21 22 24 27 One such model, the Iowa EBP model, has received over 3900 requests for permission to use it and has been updated since its initial development and publication. 24 Other models provided tools and contextual instruction, such as the Johns Hopkins model, which includes a large number of supporting tools for developing PICOs, instructions for grading literature and project implementation. 17 21 22 24 27 By contrast, the ACE Star model and An Evidence Implementation Model for Public Health Systems provide only a high-level overview and general instructions compared with other models and frameworks. 19 29 33
A consistent finding in research on clinicians’ experience with EBP is the lack of expertise needed to assess the literature. 24 34 35 The models and frameworks reviewed demonstrated that the user must possess the knowledge and related skills for this step in the process. The models and frameworks varied greatly in the level of instruction for assessing the evidence. Most provided a general overview of assessing and grading the evidence, though a few recommended that this work be done by EBP mentors and experts. 20 25 27 ARCC, JBI and Johns Hopkins provided robust tools and resources that would require administrative time and financial support. 21 22 27 Some models and frameworks offered vital resources or pointed to other resources for assessing evidence, 24 but most did not. While a few used mentors and experts to assist with assessing the literature, the majority did not address this persistent issue.
Sackett’s five-step model included another important consideration when implementing EBP: patient values and preferences. One criticism of EBP is that it ignores patient values and preferences. 36 Over half of the models and frameworks reported the need to include patient values and preferences, but the tools, instruction or resources for including them were limited. The ARCC model integrates patient preferences and values, but it is up to the EBP mentor to accomplish this task. 37 There are many tools for assessing evidence, but few models and frameworks provide this level of guidance for incorporating patient preferences and values. The inclusion of patient and family values and preferences can be misunderstood, insincere and even tokenistic, but without it the chance of successful EBP implementation is reduced. 38 39
Similar to other well-designed scoping reviews, the strengths of this review include a rigorous search conducted by a skilled librarian, literature evaluation by more than one person, and the use of an established methodological framework (PRISMA-ScR). 14 15 Additionally, using the five steps of EBP as a point of alignment allows for a more comprehensive breakdown and provides established reference points for the reviewed models and frameworks. While scoping reviews have been completed on implementation science and knowledge translation models and frameworks, to our knowledge, this is the first scoping review of EBP models and frameworks. 13 14 Limitations of the study include that well-developed models and frameworks may have been excluded for not including all five steps. 40 For example, the Promoting Action on Research Implementation in Health Services (PARIHS) framework is a well-developed and validated implementation framework but does not include all five steps of an EBP model. 40 Also, some models and frameworks have been studied and validated over many years; it was beyond the scope of this review to measure their quality based on these other validation studies.
Healthcare organisations can support EBP by choosing a model or framework that best suits their environment and providing clear guidance for implementing the best evidence. Some organisations may find the best fit with the ARCC and the Clinical Scholars Model because of the emphasis on mentors or the Johns Hopkins model for its tools for grading the level of evidence. 21 25 27 In contrast, other organisations may find the Iowa model useful with its feedback loops throughout its process. 24
Another implication of this study is the opportunity to better define and develop robust tools for patient and family values and preferences within EBP models and frameworks. Patient experiences are complex and require thorough exploration so that they are not overlooked, as is often the case. 39 41 The utilisation of EBP models and frameworks provides an opportunity to explore this area and provide the resources and understanding that are often lacking. 38 Models such as the Iowa model, JBI and Johns Hopkins developed tools, albeit varying ones, to incorporate patient and family values and preferences, but the majority of the models and frameworks did not. 21 22 24 An opportunity exists to create broad tools that incorporate patient and family values and preferences into EBP to the same extent that many models and frameworks have developed tools for literature assessment and implementation. 21–25
Future research should consider appraising the quality and use of the different EBP models and frameworks to determine success. Additionally, greater clarification on what is considered patient and family values and preferences and how they can be integrated into the different models and frameworks is needed.
This scoping review of 19 models and frameworks shows considerable variation in how the EBP models and frameworks integrate the five steps of EBP. Most of the included models and frameworks provided only a narrow description of the steps needed to assess and implement EBP, while a few provided robust instruction and tools. The reviewed models and frameworks provided diverse instructions on the best way to use EBP. However, patient values and preferences need to be better integrated into EBP models. Also, the issue of the EBP expertise needed to assess evidence must be considered when selecting a model or framework.
Acknowledgments.
We thank Keri Swaggart for completing the database searches and the Medical Writing Center at Children's Mercy Kansas City for editing this manuscript.
Contributors: All authors have read and approved the final manuscript. JD conceptualised the study design, screened the articles for eligibility, extracted data from included studies and contributed to the writing and revision of the manuscript. LM-L conceptualised the study design, provided critical feedback on the manuscript and revised the manuscript. AM screened the articles for eligibility, extracted data from the studies, provided critical feedback on the manuscript and revised the manuscript. JD is the guarantor of this work.
Funding: The article processing charges related to the publication of this article were supported by The University of Kansas (KU) One University Open Access Author Fund sponsored jointly by the KU Provost, KU Vice Chancellor for Research, and KUMC Vice Chancellor for Research and managed jointly by the Libraries at the Medical Center and KU - Lawrence
Disclaimer: No funding agencies had input into the content of this manuscript.
Competing interests: None declared.
Patient and public involvement: Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review: Not commissioned; externally peer reviewed.
Supplemental material: This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
Ethics statements: Patient consent for publication: not applicable.
In the fast-growing geriatric population, we are confronted with both osteoporosis, which makes fixation of fractures more and more challenging, and several comorbidities, which are most likely to cause postoperative complications. Several models of shared care for these patients have been described, and the goal of our systematic literature search was to point out the differences between the individual models. A systematic electronic database search was performed, identifying articles that evaluate elderly hip fracture patients in a multidisciplinary approach, including at least a geriatrician and an orthopedic surgeon, and focused on in-hospital treatment. The different investigations were categorized into four groups defined by the type of intervention. The main outcome parameters were pooled across the studies and weighted by sample size. Out of 656 potentially relevant citations, 21 could be extracted and categorized into four groups. Regarding the main outcome parameters, the group with integrated care showed the lowest in-hospital mortality rate (1.14%), the lowest length of stay (7.39 days), and the lowest mean time to surgery (1.43 days). No clear statement could be made for the medical complication rates and the activities of daily living due to their inhomogeneity when comparing the models. The review of these investigations cannot tell us the best model, but there is a trend toward more recent models using an integrated approach. Integrated care summarizes all the positive features reported in the various investigations, such as integration of a geriatrician in the trauma unit, having a multidisciplinary team, prioritizing the geriatric fracture patients, and developing guidelines for the patients' treatment. Each hospital implementing a special model for geriatric hip fracture patients should collect detailed data about the patients, process of care, and outcomes to be able to participate in audit processes and avoid peerlessness.
2008, Australasian Physics & Engineering Sciences in Medicine
Christopher Khoo
This multi-author volume brings together 30 contributors under an international editorship. The four editors include three biomedical engineers (one from Queen Mary's College, University of London and two from the Eindhoven University of Technology) and the medical director of l'Arche Rehabilitation Centre in France, who is the only clinical author. The final chapter on tissue repair strategies, which is of most surgical interest, focuses on ‘biochemical stimulation of the wound bed to improve wound healing. Important new therapeutics in this category are reviewed such as: (i) exogenous application of growth factors; (ii) tissue-engineered skin grafts; and (iii) gene therapy.’ The two authors from the Eindhoven University of Technology are (according to an internet search) not medically qualified, so understandably their viewpoints do not set out the full range of current reconstructive surgical options or ‘future perspectives’ in clinical surgery. The publishers have adopte...
Indian Journal of Plastic Surgery
Karoon Agrawal
ABSTRACTPressure ulcer in an otherwise sick patient is a matter of concern for the care givers as well as the medical personnel. A lot has been done to understand the disease process. So much so that USA and European countries have established advisory panels in their respective continents. Since the establishment of these organizations, the understanding of the pressure ulcer has improved significantly. The authors feel that the well documented and well publicized definition of pressure ulcer is somewhat lacking in the correct description of the disease process. Hence, a modified definition has been presented. This disease is here to stay. In the process of managing these ulcers the basic pathology needs to be understood well. Pressure ischemia is the main reason behind the occurrence of ulceration. Different extrinsic and intrinsic factors have been described in detail with review of literature. There are a large number of risk factors causing ulceration. The risk assessment scale...
Aleksandra Kotlińska-Lemieszek
This paper presents a modern concept of conservative treatment of pressure ulcers in the moist environment. Dressing types, their characteristics and a system of “colour” wound classification are presented.
Ernane Reis
Pressure Ulcers: Etiology, Treatment and Prevention. Anu Singhal, MD, Resident, Metrohealth Medical Centre, Cleveland, OH, USA; Ernane D. Reis, MD, Assistant Professor, Department of Surgery, The Mount Sinai Medical Center, New York, NY, USA; Morris D. Kerstein, MD, Chief of Staff, V.A. Medical & Regional Office Center, Wilmington, Delaware; Professor of Surgery, Jefferson Medical College, Philadelphia, PA, USA. SKIN DISEASE. Frequently found on the sacrum, pressure ulcers develop due to prolonged periods of unrelieved pressure on soft tissues, but can occur anywhere there is pressure, including trochanters and especially heels. In the bedridden patient, constant pressure causes ischemia and necrosis of subcutaneous tissues and skin. Most patients are elderly, immobile and have neurologic impairments, often associated with inability to sense pain and discomfort and/or incontinence. Sacral ulcers can be treated with debridement, dressings and skin grafts. However, preventive efforts—including...
Systematic Reviews, volume 13, Article number: 158 (2024)
Systematically screening published literature to determine the relevant publications to synthesize in a review is a time-consuming and difficult task. Large language models (LLMs) are an emerging technology with promising capabilities for the automation of language-related tasks that may be useful for such a purpose.
LLMs were used as part of an automated system to evaluate the relevance of publications to a certain topic based on defined criteria and based on the title and abstract of each publication. A Python script was created to generate structured prompts consisting of text strings for instruction, title, abstract, and relevant criteria to be provided to an LLM. The relevance of a publication was evaluated by the LLM on a Likert scale (low relevance to high relevance). By specifying a threshold, different classifiers for inclusion/exclusion of publications could then be defined. The approach was used with four different openly available LLMs on ten published data sets of biomedical literature reviews and on a newly human-created data set for a hypothetical new systematic literature review.
The performance of the classifiers varied depending on the LLM being used and on the data set analyzed. Regarding sensitivity/specificity, the classifiers yielded 94.48%/31.78% for the FlanT5 model, 97.58%/19.12% for the OpenHermes-NeuralChat model, 81.93%/75.19% for the Mixtral model and 97.58%/38.34% for the Platypus 2 model on the ten published data sets. The same classifiers yielded 100% sensitivity at a specificity of 12.58%, 4.54%, 62.47%, and 24.74% on the newly created data set. Changing the standard settings of the approach (minor adaption of instruction prompt and/or changing the range of the Likert scale from 1–5 to 1–10) had a considerable impact on the performance.
LLMs can be used to evaluate the relevance of scientific publications to a certain review topic and classifiers based on such an approach show some promising results. To date, little is known about how well such systems would perform if used prospectively when conducting systematic literature reviews and what further implications this might have. However, it is likely that in the future researchers will increasingly use LLMs for evaluating and classifying scientific publications.
Systematic literature reviews (SLRs) summarize knowledge about a specific topic and are an essential ingredient for evidence-based medicine. Performing an SLR involves a lot of effort, as it requires researchers to identify, filter, and analyze substantial quantities of literature. Typically, the most relevant out of thousands of publications need to be identified for the topic and key information needs to be extracted for the synthesis. Some estimates indicate that systematic reviews typically take several months to complete [ 1 , 2 ], which is why the latest evidence may not always be taken into consideration.
Title and abstract screening forms a considerable part of the systematic reviewing workload. In this step, which typically follows defining a search strategy and precedes the full-text screening of a smaller number of search results, researchers determine whether a certain publication is relevant for inclusion in the systematic review based on title and abstract. Automating title and abstract screening has the potential to save time and thereby accelerate the translation of evidence into practice. It may also make the reviewing methodology more consistent and reproducible. Thus, the automation or semi-automation of this part of the reviewing workflow has been of longstanding interest [ 3 , 4 , 5 ].
Several approaches have been developed that use machine learning (ML) to automate or semi-automate screening [ 1 , 6 ]. For example, systematic review software applications such as Covidence [ 7 ] and EPPI-Reviewer [ 8 ] (which use the same algorithm) offer ML-assisted ranking algorithms that aim to show the most relevant publications for the search criteria higher in the reviewing order to speed up the manual review process. Elicit [ 9 ] is a standalone literature discovery tool that also offers an ML-assisted literature search facility. Furthermore, several dedicated tools have been developed to specifically automate title and abstract screening [ 1 , 10 ]. Examples include Rayyan [ 11 ], DistillerSR [ 12 ], Abstrackr [ 13 ], RobotAnalyst [ 14 ], and ASReview [ 5 ]. These tools typically work via different technical strategies drawn from ML and topic modeling to enable the system to learn how similar new articles are to a core set of identified ‘good’ results for the topic. These approaches have been found to lead to a considerable reduction in the time taken to complete systematic reviews [ 15 ].
Most of these systems require some sort of pre-selection or specific training for the larger corpus of publications to be analyzed (e.g., identification of some “relevant” publications by a human so that the algorithm can select similar papers) and are thus not fully automated.
Furthermore, dedicated models are required that are built for the specific purpose together with appropriate training data. Fully automated systems that achieve high levels of performance and can be flexibly applied to various topics have not yet been realized.
Large language models (LLMs) are an approach to natural language processing in which very large-scale neural networks are trained on vast amounts of textual data to generate sequences of words in response to input text. These capable models are then subject to different strategies for additional training to improve their performance on a wide range of tasks. Recent technological advancements in model size, architecture, and training strategies have led to general-purpose dialog LLMs achieving and exceeding state-of-the-art performance on many benchmark tasks including medical question answering [ 16 ] and text summarization [ 17 ].
Recent progress in the development of LLMs led to very capable models. While models developed by private companies such as GPT-3/GPT-3.5/GPT-4 from OpenAI [ 18 ] or PaLM and Gemini from Google [ 19 , 20 ] are among the most powerful LLMs currently available, openly available models are actively being developed by different stakeholders and in some cases achieve performances not far from the state of the art [ 21 ].
LLMs have shown remarkable capabilities in a variety of subjects and tasks that would require a profound understanding of text and knowledge for a human to perform. Among others, LLMs can be used for classification [ 22 ], information extraction [ 23 ], and knowledge access [ 24 ]. Furthermore, they can be flexibly adapted via prompt engineering techniques [ 25 ] and parameter settings to behave in a desired way. At the same time, considerable problems with the usage of LLMs such as “hallucinations” of models [ 26 ], inherent biases [ 27 , 28 ], and weak alignment with human evaluation [ 29 ] have been described. Therefore, even though the text output generated by LLMs is based on objective statistical calculations, the text output itself is not necessarily factual and correct and furthermore incorporates subjectivity based on the training data. This implies that an LLM-based evaluation system has some fundamental limitations a priori. However, using LLMs for evaluating scientific publications is a novel and interesting approach that may be helpful in creating fully automated and still flexible systems for screening and evaluating scientific literature.
To investigate whether and how well openly available LLMs can be used for evaluating the relevance of publications as part of an automated title and abstract screening system, we conducted a study to evaluate the performance of such an approach in the biomedical domain with modern openly available LLMs.
We designed an approach for evaluating the relevance of publications based on title and abstract using an LLM. This approach is based on the following strategy:
An instruction prompt to evaluate the relevance of a scientific publication for inclusion into an SLR is given to an LLM.
The prompt includes the title and abstract of the publication and the criteria that are considered relevant.
The prompt furthermore includes the request to return just a number as an answer, which corresponds to the relevance of the publication on a Likert scale (“not relevant” to “highly relevant”).
The prompt for each publication is created in a structured and automated way.
A numeric threshold may be defined which separates relevant publications from irrelevant publications (corresponding to the definition of a classifier).
The prompts are created in the following way:
Prompt = [Instruction] + [Title of publication] + [Abstract of publication] + [Relevant Criteria]
(“+” is not part of the final prompt but indicates the concatenation of the text strings.)
[Instruction] is the text string describing the general instruction for the LLM to evaluate the publication. The LLM is asked to evaluate the relevance of a publication for an SLR on a numeric scale (low relevance to high relevance) based on the title and abstract of the publication and based on defined relevant criteria.
[Title of publication] is the text string “Title:” together with the title of the publication.
[Abstract of publication] is the text string “, Abstract:” together with the abstract of the publication.
[Relevant Criteria] is the text that describes the criteria to evaluate the relevance of a publication. The relevant criteria are defined beforehand by the researchers depending on the topic to determine which publications are relevant. The [Relevant Criteria] text string remains unchanged for all the publications that should be checked for relevance.
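As a minimal sketch of this assembly step (in Python; the function and variable names are illustrative assumptions, not taken from the study's script), the prompt could be built as follows:

```python
def build_prompt(instruction: str, title: str, abstract: str, criteria: str) -> str:
    """Merge the four text strings into one structured prompt.

    The field labels ("Title:", ", Abstract:") follow the description above;
    everything else in this sketch is an assumption for illustration.
    """
    return f"{instruction} Title: {title}, Abstract: {abstract} {criteria}"
```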
The answer from the LLM usually consists of just a digit on a numeric scale (e.g., 1–5). However, variations are acceptable if the answer can unambiguously be assigned to one of the possible scores on the Likert scale (e.g., the answer “The relevance of the publication is 3.” can unambiguously be assigned to the score 3). This assignment of answers to a score can be automated with a string-search command, that is, a simple regular expression searching for a positive integer number, which is then extracted from the text string.
A request is sent to the LLM for each publication in the corpus. In cases for which an LLM provided an invalid (unprocessable) response for a publication, that response was excluded from the direct downstream analysis. It was determined for how many publications invalid responses were given and how many of these publications would have been relevant.
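A hedged sketch of the parsing step and the per-publication loop is shown below, reusing build_prompt from the sketch above; query_llm is a placeholder for whatever call sends a prompt to the chosen model and returns its text answer, and the publication fields are assumed names, not those of the published code:

```python
import re
from typing import Optional

def extract_score(answer: str) -> Optional[int]:
    """Extract the first positive integer from the model's answer.

    Returns None for invalid (unprocessable) responses, which are then
    excluded from the downstream analysis, as described above.
    """
    match = re.search(r"\d+", answer)
    return int(match.group()) if match else None

def screen_corpus(publications, instruction, criteria, query_llm):
    """Score every publication in the corpus and collect invalid responses."""
    scores, invalid = {}, []
    for pub in publications:  # each pub is assumed to be a dict with id, title and abstract
        prompt = build_prompt(instruction, pub["title"], pub["abstract"], criteria)
        score = extract_score(query_llm(prompt))
        if score is None:
            invalid.append(pub["id"])
        else:
            scores[pub["id"]] = score
    return scores, invalid
```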
A schematic illustration of the approach is shown in Fig. 1 . An example of a prompt is provided in Supplementary material 1: Appendix 1.
Schematic illustration of the LLM-based approach for evaluating the relevance of a scientific publication. In this example, a 1–5 scale and a 3+ classifier are used
A Python script was created to automate the process and to apply it to a data set with a collection of different publications.
With the publications being sorted into different relevance groups, a threshold can be defined, which is used by a classifier to separate relevant from irrelevant publications. For example, a 3+ classifier would classify publications with a score of ≥ 3 as relevant, and publications with a score < 3 as irrelevant.
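Such a threshold classifier reduces to a single comparison; the following illustrative snippet shows a 3+ classifier applied to a dictionary of scores (names again assumed, not from the original script):

```python
def classify(scores: dict, threshold: int = 3) -> dict:
    """Label each publication as relevant (True) if its score is >= threshold.

    threshold=3 corresponds to the 3+ classifier described in the text;
    raising or lowering it trades sensitivity against specificity.
    """
    return {pub_id: score >= threshold for pub_id, score in scores.items()}
```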
The performance of the approach was tested with different LLMs, data sets and settings as described in the following:
A variety of different models were tested. To investigate the approach with different LLMs (that are also diverse regarding design and training data), the following four models were used in the experiments:
FlanT5-XXL (FlanT5) is an LLM developed by Google Research. It is a variant of the T5 (text-to-text) model that utilizes a unified text-to-text framework, allowing it to perform a wide range of NLP tasks with the same model architecture, loss function, and hyperparameters. FlanT5 was enhanced through fine-tuning on over a thousand additional tasks and support for more languages. It is primarily used for research in various areas of natural language processing, such as reasoning and question-answering [ 30 , 31 ].
OpenHermes-2.5-neural-chat-7b-v3-1-7B (OHNC) [ 32 ] is a powerful open-source LLM, which was merged from the two models OpenHermes 2.5 Mistral 7B [ 33 ] and Neural-Chat (neural-chat-7b-v3-1) [ 34 ]. Despite having only 7 billion parameters it performs better than some larger models on various benchmarks.
Mixtral-8x7B-Instruct v0.1 (Mixtral) is a pretrained generative Sparse Mixture of Experts LLM developed by Mistral AI [ 35 , 36 ]. It was reported to outperform powerful models like gpt-3.5-turbo, Claude-2.1, Gemini Pro, and Llama 2 70B-chat on human benchmarks.
Platypus2-70B-Instruct (Platypus 2) is a powerful language model with 70 billion parameters [ 37 ]. The model itself is a merge of the models Platypus2-70B and SOLAR-0-70b-16bit (previously published as LLaMa-2-70b-instruct-v2) [ 38 ].
A list of several data sets for SLRs is provided to the public by the research group of the ASReview tool [ 39 ]. The list contains data sets on a variety of different biomedical subjects of previously published SLRs. For testing the LLM approach on an individual data set, the [Relevant Criteria] string for each data set was created based on the description in the publication of the corresponding SLR. We tested the approach on a total of ten published data sets covering different biomedical topics (Table 1 , Supplementary material 2: Appendix 2).
To test the approach also in a prospective setting on a not previously published review, we created a data set for a new, hypothetical SLR, for which title and abstract screening should be performed.
The use case was an SLR on “Clinical Decision Support System (CDSS) tools for physicians in radiation oncology”. A CDSS is an information technology system developed to support clinical decision-making. This general definition may include diagnostic tools, knowledge bases, prognostic models, or patient decision aids [ 50 ]. We decided that the hypothetical SLR should be only about software-based systems to be used by clinicians for decision-making purposes in radiation oncology. We defined the following criteria for the [Relevant Criteria] text of the provided prompt:
Only inclusion of original articles, exclusion of review articles.
Publications examining one or several clinical decision-support systems relevant to radiation therapy.
Decision-support systems are software-based.
Exclusion of systems intended for support of non-clinicians (e.g., patient decision aids).
Publications about models (e.g., prognostic models) should only be included if the model is intended to support clinical decision-making as part of a software application, which may resemble a clinical decision support system.
The following query was used for searching relevant publications on PubMed: “(clinical decision support system) AND (radiotherapy OR radiation therapy)”.
Titles and abstracts of all publications found with the query were collected. A human-based title and abstract screening was performed to obtain the ground truth data set. Two researchers (FD and NC) independently labeled the publications as relevant/not relevant based on the title and abstract and based on the [Relevant criteria] string. The task was to label those publications relevant that may be of interest and should be analyzed as full text, while all other publications should be labeled irrelevant. After labeling all publications, some of the publications were deemed relevant only by one of the two researchers. To obtain a final decision, a third researcher (PMP) independently did the labeling for the undecided cases.
The aim was to create a human-based data set purely representing the process of title and abstract screening without further information or analysis.
A manual title and abstract screening was conducted on 521 publications identified in the search, with 36 publications being identified as relevant and labeled accordingly in the data set. This data set was named “CDSS_RO”. It should be noted that this data set is qualitatively different from the 10 published data sets, as not only the publications that may be finally included in an SLR are labeled as relevant, but all publications that should be analyzed in full text based on title and abstract. The file is provided at https://github.com/med-data-tools/title-abstract-screening-ai.
Standard parameters.
The LLM-based title and abstract screening as described above requires the definition of some parameters. The standard settings for the approach were the following:
[Instruction] string: We used the following standard [Instruction] string:
“On a scale from 1 (very low probability) to X (very high probability), how would you rate the relevance of the following scientific publication to be included in a systematic literature review based on the relevant criteria and based on title and abstract?”
Range of scale: defines the range of the Likert scale mentioned in the [Instruction] string (marked as X in the standard string above). For the standard settings, a value of 5 was used.
Model parameters of the LLMs were defined in the source code. To obtain reproducible results, the parameters were set so that the models became deterministic. For example, the temperature value is a parameter that defines how much variation a model's response should have; values greater than 0 add a random element to the output, which should be avoided for the reproducibility of the LLM-based title and abstract screening.
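As an illustration of deterministic generation settings, the sketch below uses the Hugging Face transformers library with greedy decoding (do_sample=False); it is not the study's source code, a smaller FlanT5 checkpoint is substituted for the XXL variant used in the experiments, and running the full-size models requires substantial GPU resources:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Illustrative: a small FlanT5 checkpoint stands in for the much larger
# google/flan-t5-xxl model used in the study.
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def answer_prompt(prompt: str) -> str:
    """Generate a short, deterministic answer (greedy decoding, no sampling)."""
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=10, do_sample=False)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```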
The behavior of an LLM is highly dependent on the provided prompt. Adequate adaptation of the prompt may be used to improve the performance of an LLM for certain tasks [ 25 ]. To investigate what impact a slightly adapted version of the Instruction prompt would have on the results, we added the string “(Note: Give a low score if not all criteria are fulfilled. Give only a high score if all or almost all criteria are fulfilled.)” in the instruction prompt as additional instruction and examined the impact on the performance. Furthermore, the range of the scale was changed from 1–5 to 1–10 in some experiments to investigate what impact this would have on the performance.
The performance of the approach, depending on models and threshold, was determined by calculating the sensitivity (= recall), specificity, accuracy, precision, and F1-score of the system, based on the numbers of correctly and incorrectly included/excluded publications for each data set.
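For reference, these metrics follow directly from the confusion-matrix counts; a minimal sketch (not the study's own evaluation code, and assuming non-zero denominators):

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute evaluation metrics from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)              # recall: share of relevant publications included
    specificity = tn / (tn + fp)              # share of irrelevant publications excluded
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "precision": precision, "f1": f1}
```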
The LLM-based title and abstract screening was compared to another, recently published approach for fully automated title and abstract screening. This approach, developed by Natukunda et al., uses an unsupervised Latent Dirichlet Allocation-based topic model for screening [ 51 ]. Unlike the LLM-based approach, it does not require an additional [Relevant Criteria] string but instead uses defined search keywords to determine which publications are relevant. The approach was used to screen the ten published data sets as well as the CDSS_RO data set. To obtain the required keywords, we processed the text of the search terms used by splitting combined text into individual words and removing stop words, duplicates, and punctuation (as described in the original publication of Natukunda et al.).
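A possible sketch of this keyword preparation is given below; the stop-word set is an illustrative subset, and the exact tokenisation of the original publication is not reproduced:

```python
import string

STOP_WORDS = {"and", "or", "not", "the", "of", "for", "in", "a", "an"}  # illustrative subset

def search_terms_to_keywords(search_terms: str) -> list:
    """Split combined search text into words; drop stop words, duplicates and punctuation."""
    cleaned = search_terms.translate(str.maketrans("", "", string.punctuation)).lower()
    keywords = []
    for word in cleaned.split():
        if word not in STOP_WORDS and word not in keywords:
            keywords.append(word)
    return keywords

# Example with the PubMed query used for the CDSS_RO data set:
# search_terms_to_keywords("(clinical decision support system) AND (radiotherapy OR radiation therapy)")
# -> ['clinical', 'decision', 'support', 'system', 'radiotherapy', 'radiation', 'therapy']
```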
The LLM-based screening with a Likert scale of 1–5 provided clear results for evaluating the relevance of a publication in the majority of cases. Out of the total of 44,055 publications among the 10 published data sets, valid and unambiguously assignable answers were given for 44,055 publications (100%) by the FlanT5 model, for 44,052 publications (99.993%) by the OHNC model, for 44,026 publications (99.93%) by the Mixtral model and for 44,054 publications (99.998%) by the Platypus 2 model. The few publications for which an invalid answer was given were excluded from further analysis. None of the excluded publications was relevant. The distribution of scores given was different between the different models. For example, the OHNC model ranked the majority of publications with a score of 3 (47.2%) or 4 (34.2%), while the FlanT5 model ranked almost all publications with a score of either 4 (68.1%) or 2 (31.7%). For all models, the group of publications labeled as relevant in the data sets was ranked with higher scores compared to the overall group of publications (mean score of 3.89 compared to 3.38 for FlanT5, 3.86 compared to 3.14 for OHNC, 4.16 compared to 2.12 for Mixtral and 3.80 compared to 2.92 for Platypus 2). An overview is provided in Fig. 2 .
Distribution of scores given by the different models
Based on the scores given, classifiers that label publications with a score greater than or equal to “X” as relevant have higher sensitivity and lower specificity with decreasing threshold (decreasing “X”).
Classifiers with a threshold of ≥ 3 (3+ classifiers) were further analyzed, as these classifiers were considered to correctly identify the vast majority of relevant publications (high sensitivity) without including too many irrelevant publications (sufficient specificity). The 3+ classifiers had a sensitivity/specificity of 94.8%/31.8% for the FlanT5 model, of 97.6%/19.1% for the OHNC model, of 81.9%/75.2% for the Mixtral model, and of 97.2%/38.3% for the Platypus 2 model on all ten published data sets. The performance of the classifiers was quite different depending on the data set used (Fig. 3). Detailed results on the individual data sets are presented in Supplementary material 3: Appendix 3.
Sensitivity and specificity of the 3+ classifiers on different data sets using different models. Each data point represents the results of one of the data sets
The highest specificity at 100% sensitivity was seen for the Mixtral model on the data set Wolters_2018 with all 19 relevant publications being scored with 3–5, while 4410 of 5019 irrelevant publications were scored with 1 or 2 (specificity of 87.87%). The lowest sensitivity was observed with the Mixtral model on the dataset Jeyaraman_2021 with 23.96% sensitivity at 94.63% specificity.
On the newly created manually labeled data set, the 3+ classifiers had 100% sensitivity for all four models, with specificity ranging from 4.54% to 62.47%. The results of the LLM-based title and abstract screening, dependent on the threshold of the classifiers, are presented as receiver operating characteristics (ROC) curves in Fig. 4 as well as in Supplementary material 3: Appendix 3.
Receiver operating characteristics (ROC) curves of the LLM-based title and abstract screening for the different models on the CDSS_RO data set
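Such curves can be computed directly from the Likert scores and the human labels, for example with scikit-learn; the data below are made up purely for illustration:

```python
from sklearn.metrics import auc, roc_curve

# y_true: human labels (1 = relevant, 0 = irrelevant); y_score: LLM Likert scores (1-5)
y_true = [1, 0, 0, 1, 0, 0, 1, 0]   # illustrative example data
y_score = [4, 2, 3, 5, 1, 2, 3, 4]

# Each threshold returned by roc_curve corresponds to one "X+" classifier.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(f"AUC: {auc(fpr, tpr):.2f}")
```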
Several runs of the Python script with different settings (adapted [Instruction] string and/or range of scale 1–10 instead of 1–5) were performed, which led to different results. Minor adaptation of the [Instruction] string with an additional demand to focus on the mentioned criteria had a different impact on the performance of the classifiers depending on the LLM used. While the sensitivity of the 3+ classifiers remained at 100% for all four models, the specificity was lower for the OHNC model (2.89% vs. 4.54%), the Mixtral model (56.29% vs. 62.47%) and the Platypus 2 model (15.88% vs. 24.74%), while it was higher for the FlanT5 model (25.15% vs. 12.58%).
Changing the range of the scale from 1–5 to 1–10 and using a 6+ classifier instead of a 3+ classifier led to a lower sensitivity for the OHNC model (97.22% vs. 100%), while increasing the specificity (13.49% vs. 4.54%). For the other models, the sensitivity remained at 100%, with higher specificity for the Platypus 2 model (51.34% vs. 24.74%) and the FlanT5 model (50.52% vs. 12.58%). The specificity was unchanged for the Mixtral model at 62.47%, which was the highest value among all combinations at 100% sensitivity. No combination of the settings for range of scale and with/without prompt adaptation was superior across all models. An overview of the results is provided in Fig. 5.
Performance of the classifiers depending on adaptation of the prompt and on the range of scale
The screening approach developed by Natukunda et al. achieved an overall sensitivity of 52.75% at 56.39% specificity on the ten published data sets. As for the LLM-based screening, the performance of this approach was dependent on the data set analyzed. The lowest sensitivity was observed for the Jeyaraman_2021 data set (1.04%), while the highest sensitivity was observed for the Wolters_2018 data set (100%). Compared to the 3+ classifier with the Mixtral model, the LLM-based approach had higher sensitivity on 9 data sets and equal sensitivity on 1 data set, while it had higher specificity on 6 data sets and lower specificity on 4 data sets.
On the CDSS_RO data set, the approach of Natukunda et al. achieved 94.44% sensitivity (lower than all four LLMs) at 39.59% specificity (lower than the Mixtral model and higher than the FlanT5, OHNC, and Platypus 2 models). Further data on the comparison is provided in Supplementary material 4: Appendix 4.
We developed and elaborated a flexible approach to use LLMs for automated title and abstract screening that has shown some promising results on a variety of biomedical topics. Such an approach could potentially be used to automatically pre-screen the relevance of publications based on title and abstract. While the results are far from perfect, using LLMs for evaluating the relevance of publications could potentially be helpful (e.g., as a pre-processing step) when performing an SLR. Furthermore, the approach is widely applicable without the development of custom tools or training custom models.
A variety of different ML and AI tools have been developed to assist researchers in performing SLRs [ 5 , 10 , 52 , 53 ]. Fully automated systems (like the LLM-based approach presented in our study) still fail to differentiate relevant from irrelevant publications near the level of human evaluation [ 51 , 54 ].
A well-functioning fully automated title and abstract screening system that could be used on different subjects in the biomedical domain and possibly also in other scientific areas would be very valuable. While human-based screening is the current gold standard, it has considerable drawbacks. From a methodological point of view, one major problem of human-based literature evaluation, including title and abstract screening, is the subjectivity of the process [ 55 ]. Evaluating the publications (based on title and abstract) is dependent on the experience and individual judgments of the person doing the screening. To overcome this issue, SLRs of high quality require multiple independent researchers to do the evaluation, with specific criteria for inclusion/exclusion defined beforehand [ 56 ]. Nevertheless, subjectivity remains an unresolved issue, which also limits the reproducibility of results. From a practical point of view, another major problem is the considerable workload that must be performed by humans, especially if thousands of publications need to be assessed, which is multiplied by the need to have multiple reviewers and to discuss disagreements. The challenge of workload is not just a matter of inconvenience, as SLRs on subjects that require tens of thousands of publications to be searched may simply not be feasible for small research teams, or may already be outdated after the time it takes to do the screening and analyze the results.
While fully automated screening approaches may also be affected by subjectivity (since the training data of models is itself generated by processes which are affected by subjectivity), the results would at least be more reproducible, and automation can be applied at scale in order to overcome the problem of practicability.
While current fully automated systems cannot replace humans in title and abstract screening, they may nevertheless be helpful. Such systems are already being used in systematic reviews and most likely their usage will continue to grow [ 57 ].
Ideally, a fully automated system should not miss a single relevant publication (100% sensitivity) while minimizing as far as possible the number of irrelevant publications included. This would allow confident exclusion of some of the retrieved search results which is a big asset to reducing time taken in manual screening.
By creating structured prompts with clear instructions, an LLM can feasibly be used for evaluating the relevance of a scientific publication. In comparison to some other solutions, the LLM-based screening may have some advantages. On the one hand, the flexible nature of the approach allows adaptation to a specific subject. Depending on the question, different prompts for relevant criteria and instructions can be used to address the individual research question. On the other hand, the approach can create reproducible results, given a fixed model, parameters, prompting strategy, and defined threshold. At the same time, it is scalable to process large numbers of publications. As we have seen, such an approach is feasible with a performance similar to or even better than other current solutions such as the approach of Natukunda et al. However, it should be noted that the performance varied considerably depending on which of the 10 + 1 data sets were used.
While we investigated LLMs for evaluating the relevance of publications and in particular for title and abstract screening, it is being discussed how these models may be used for a variety of tasks in literature analysis [ 58 , 59 ]. For example, Wang et al. obtained promising results when investigating if ChatGPT may be used for writing Boolean Queries for SLRs [ 60 ]. Aydin et al., also using ChatGPT, employed the LLM to write an entire Literature Review about Digital Twins in Healthcare [ 61 ].
Guo et al. recently performed a study using the OpenAI API with gpt-3.5 and gpt-4 to create a classifier for clinical reviews [ 62 ]. They observed promising results when comparing the performance of the classifier against human-based screening with a sensitivity of 76% at 91% specificity on six different review papers. In contrast to our approach, they used a Boolean classifier instead of a Likert scale. Another approach was developed by Akinseloyin et al., who used ChatGPT to create a method for citation screening by ranking the relevance of publications using a question-answering framework [ 63 ].
The question may arise as to what the purpose of using a Likert scale instead of a direct binary classifier is (especially since some models only rarely use some of the score values; see, e.g., FlanT5 in Fig. 2). The rationale for using the Likert scale arose out of some preliminary, unsystematic explorations we conducted using different models and ranges of scale (including binary). We realized that using a Likert scale has some advantages, as it sorts the publications into several groups depending on the estimated relevance. This also allows flexible adjustment of the threshold (which may also be useful if the user wants to focus either on sensitivity or on specificity).
However, there seem to be several feasible approaches and frameworks to use LLMs for the screening of publications.
It should be noted that an LLM-based approach for evaluating the relevance of publications might just as well be used for a variety of different classification tasks in literature analysis. For example, one may adapt the [Instruction] prompt, asking the LLM not to evaluate the relevance of a publication on a Likert scale but to classify it into several groups such as “original article”, “trial”, “letter to the editor”, etc. From this point of view, the title and abstract screening is just a special use case of LLM-based classification.
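A purely hypothetical adapted instruction of this kind (not a prompt used in the study) might read:

```python
# Hypothetical adapted [Instruction] string for document-type classification
instruction = (
    "Based on the title and abstract, classify the following scientific publication "
    "into exactly one of these categories: original article, trial, review, "
    "letter to the editor. Answer with the category name only."
)
```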
The capabilities of LLMs and other AI models will continue to evolve, which will increase the performance of fully automated systems. As we have seen, the results are highly dependent on the LLM used for the approach. In any case, there may still be substantial room for improvement and optimization and it currently is unclear what LLM-based approach with which prompts, models, and settings yields the best results over a large variety of data sets.
Furthermore, LLMs may not only be used for the screening of titles and abstracts but for the analysis of full-text documents. The newest generation of language and multimodal models may process whole articles or potentially also image data from publications [ 64 , 65 ]. Beyond that, LLM-based evaluation of scientific data and publications may only be one of several options for AI assistance in literature analysis. Future systems may combine different ML and AI approaches for optimal automated processing of literature and scientific data.
Even though the LLM-based screening presented in our work shows some promising results, it also has some drawbacks and limitations. While the open framework with adaptable prompts makes the approach flexible, the performance of the approach is highly dependent on the used model, the input parameters/settings, and the data set analyzed. If a slightly different instruction or another scale (1–10 instead of 1–5) is used, this can have a considerable impact on the performance. The classifiers analyzed in our study failed to consistently identify relevant publications at 100% sensitivity without considerably impairing the specificity. In academic research, the bar for automated screening tools needs to be very high, as ideally not a single relevant publication should be missed. The LLM-based title and abstract screening requires the definition of clear criteria for inclusion/exclusion. For research questions with less clear relevance criteria, LLMs may not be that useful for the evaluation. This may potentially be one reason, why the performance of the approach was quite different in our study depending on the data set analyzed. Overall, there are still many open questions, and it is unclear if and how high levels of performance can be consistently guaranteed so that such a system can be relied on. It is interesting that the Mixtral model, even though it seemed to have the highest level of performance on average, performed poorly with low sensitivity on one data set (Fig. 3 ). Further research is needed to investigate the requirements for good performance of the LLMs in evaluating scientific literature.
Another limitation of the approach in its current form is a considerable demand for resources regarding calculation power and hardware equipment. Answering thousands of long text prompts with modern, multi-billion-parameter LLMs requires sufficient IT infrastructure and calculation power to perform. The issue of resource demand is especially relevant if many thousand publications are evaluated and if very complex models are used.
On a more fundamental level, there are some general issues regarding the use of LLMs for literature studies. LLMs calculate the probability for a sequence of words based on their training data, which derives from past observations and knowledge. They can thereby inherit unwanted features and biases (such as ethnic or gender biases) [ 29 , 66 ]. In a recent study by Koo et al., it was shown that the cognitive biases and preferences of LLMs are not the same as those of humans, as a low correlation between ratings given by LLMs and humans was observed [ 67 ]. The authors therefore stated that LLMs are currently not suitable as fair and reliable automatic evaluators. Considering this, using LLMs for evaluating and processing scientific publications may be seen as a problematic and questionable undertaking. However, the biases present in language models affect different tasks differently, and it remains to be seen how they might differentially affect different screening tasks in the literature review [ 28 ].
Nevertheless, it is most likely that LLMs and other AI solutions will be increasingly used in conducting and evaluating scientific research [ 68 ]. While this certainly will provide a lot of chances and opportunities, it is also potentially concerning. The amount and proportion of text being written by AI models is increasing. This includes not only public text on the Internet but also scientific literature and publications [ 69 , 70 ]. The fact that ChatGPT has been chosen as one of the top researchers of the year 2023 by Nature and has frequently been listed as co-author, shows how immediate the impact of the development has already been [ 71 ]. At the same time, most LLMs are trained on large amounts of text provided on the Internet. The idea that in the future LLMs might be used to evaluate publications written with the help of LLMs that may themselves be trained on data created by LLMs may lead to disturbing negative feedback loops which decrease the quality of the results over time [ 72 ]. Such a development could actually undermine academia and evidence-based science [ 73 ], also due to the known fact that LLMs tend to “hallucinate”, meaning that a model may generate text with illusory statements not based on correct data [ 26 ]. It is important to be aware that LLMs are not directly coupled to evidence and that there is no restriction preventing a model from generating incorrect statements. As part of a screening tool assigning just a score value to the relevance of a publication, this may be a mere factor impairing the performance of the system – yet for LLM-based analysis in general this is a major problem.
The majority of studies published so far on using LLMs for publication screening have used the currently most powerful models operated by private companies, most notably the ChatGPT models GPT-3.5 and GPT-4 developed by OpenAI [ 18 , 74 ]. Using models that are owned and controlled by private companies and that may change over time introduces additional major problems for publication screening, such as a lack of reproducibility. Therefore, after initial experiments with such models, we decided to use openly available models for our study.
Our study has some limitations. While we present a strategy for using LLMs to evaluate the relevance of publications for an SLR, our work does not provide a comprehensive analysis of all possible capabilities and limitations. Even though we achieved promising results on ten published data sets and a newly created one, the generalizability of the results may be limited, as it is not clear how the approach would perform on other subjects within the biomedical domain or in other domains. A more comprehensive understanding would require thorough testing with many more data sets on different topics, which is beyond the scope of this work. Testing the screening approach on retrospective data sets is also inherently problematic. While good performance on retrospective data should hopefully indicate good performance when the approach is used prospectively on a new topic, this does not have to be the case [ 75 ]. Indeed, naively assuming that a classifier tested on retrospective data will perform equally well on a new research question is problematic, since a new research question is by definition new and unfamiliar and therefore will not be represented in previously tested data sets.
Furthermore, models that are trained on vast amounts of scientific literature may even have been trained on some of the publications or reviews that are used in the retrospective benchmarking of an LLM-based classifier, which obviously creates a considerable bias. To objectively assess how well an LLM-based solution can evaluate scientific publications for new research questions, large, curated, and independent prospective data sets on many different topics would be needed, which will be very challenging to create. It is interesting that the LLM-based title and abstract screening in our study would also have performed well on our new hypothetical SLR on CDSS in radiation therapy, but this alone is too limited a data basis from which to draw general conclusions. Therefore, it currently cannot be reliably known in which situations such an LLM-based evaluation may succeed or fail.
Regarding the ten published data sets, the results also need to be interpreted with caution. These data sets may not truly represent the isolated task of title and abstract screening. For example, in the Appenzeller-Herzog_2020 data set, only the 26 publications that were finally included (not only after title and abstract screening but also after further analysis) were labeled as relevant [ 40 ]. While these publications ideally should be correctly identified by an AI classifier, there may be other publications in the data set that cannot be excluded solely on the basis of title and abstract. Furthermore, we had to retrospectively define the [Relevant Criteria] string based on the text in the publication of the SLR. This is obviously a suboptimal way to define inclusion and exclusion criteria, as the defined string may not completely align with the criteria intended by the researchers of the SLR.
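For illustration only, a hypothetical prompt template of this kind is sketched below; it is not the prompt used in our study (the actual sample prompt is provided in Supplementary Appendix 1), and the example [Relevant Criteria] string is invented.

RELEVANT_CRITERIA = (
    "Controlled studies on the comparative effectiveness of therapies "
    "for Wilson disease"  # example string only, not one used in the study
)

PROMPT_TEMPLATE = (
    "You are screening publications for a systematic literature review.\n"
    "Relevance criteria: {criteria}\n\n"
    "Title: {title}\n"
    "Abstract: {abstract}\n\n"
    "On a scale from 1 (clearly irrelevant) to 5 (clearly relevant), "
    "how relevant is this publication to the criteria? Answer with a single number."
)

def build_prompt(title: str, abstract: str, criteria: str = RELEVANT_CRITERIA) -> str:
    # Insert the criteria string and the title/abstract into the fixed instruction.
    return PROMPT_TEMPLATE.format(criteria=criteria, title=title, abstract=abstract)

print(build_prompt("Zinc therapy in Wilson disease", "We compared ..."))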
We also want to emphasize that the comparison with the approach of Natukunda et al. needs to be interpreted with caution, since the two approaches are not based on exactly the same prerequisites: the LLM-based approach requires a [Relevant Criteria] string, while the approach of Natukunda et al. requires defined keywords.
While overall our work shows that LLM-based title and abstract screening is possible and yields promising results on the analyzed data sets, our study cannot fully answer the question of how well LLMs would perform if used for new research. Even more importantly, we cannot answer the question of to what extent LLMs should be used for conducting literature reviews and for doing research.
Large language models can be used to evaluate the relevance of publications for SLRs. We were able to implement a flexible, cross-domain system with promising results on different biomedical subjects. With continuing progress in the fields of LLMs and AI, fully automated computer systems may assist researchers in performing SLRs and other forms of scientific knowledge synthesis. However, it remains unclear how well such systems will perform when used prospectively and what implications this will have for the conduct of SLRs.
All data generated and analyzed during this study are either included in this published article (and its supplementary information files) or publicly available on the Internet. The Python script as well as the CDSS_RO data set are available at https://github.com/med-data-tools/title-abstract-screening-ai . The ten published data sets analyzed in our study are available in the GitHub repository of the research group of the ASReview tool [ 39 ].
AI: Artificial intelligence
API: Application programming interface
CDSS: Clinical decision support system
FlanT5: FlanT5-XXL model
GPT: Generative pre-trained transformer
Mixtral: Mixtral-8x7B-Instruct-v0.1 model
ML: Machine learning
LLM: Large language model
OpenHermes: OpenHermes-2.5-neural-chat-7b-v3-1-7B model
Platypus2: Platypus2-70B-Instruct model
ROC: Receiver operating characteristic
Khalil H, Ameen D, Zarnegar A. Tools to support the automation of systematic reviews: a scoping review. J Clin Epidemiol. 2022;144:22–42.
Clark J, Scott AM, Glasziou P. Not all systematic reviews can be completed in 2 weeks—But many can be (and should be). J Clin Epidemiol. 2020;126:163.
Clark J, Glasziou P, Del Mar C, Bannach-Brown A, Stehlik P, Scott AM. A full systematic review was completed in 2 weeks using automation tools: a case study. J Clin Epidemiol. 2020;121:81–90.
Pham B, Jovanovic J, Bagheri E, Antony J, Ashoor H, Nguyen TT, et al. Text mining to support abstract screening for knowledge syntheses: a semi-automated workflow. Syst Rev. 2021;10(1):156.
van de Schoot R, de Bruin J, Schram R, Zahedi P, de Boer J, Weijdema F, et al. An open source machine learning framework for efficient and transparent systematic reviews. Nat Mach Intell. 2021;3(2):125–33.
Hamel C, Hersi M, Kelly SE, Tricco AC, Straus S, Wells G, et al. Guidance for using artificial intelligence for title and abstract screening while conducting knowledge syntheses. BMC Med Res Methodol. 2021;21(1):285.
Covidence [Internet]. [cited 2024 Jan 14]. Available from: www.covidence.org .
Machine learning functionality in EPPI-Reviewer [Internet]. [cited 2024 Jan 14]. Available from: https://eppi.ioe.ac.uk/CMS/Portals/35/machine_learning_in_eppi-reviewer_v_7_web_version.pdf .
Elicit [Internet]. [cited 2024 Jan 14]. Available from: https://elicit.org/ .
Harrison H, Griffin SJ, Kuhn I, Usher-Smith JA. Software tools to support title and abstract screening for systematic reviews in healthcare: an evaluation. BMC Med Res Methodol. 2020;20(1):7.
Rayyan [Internet]. [cited 2024 Jan 14]. Available from: https://www.rayyan.ai/ .
DistillerSR [Internet]. [cited 2024 Jan 14]. Available from: https://www.distillersr.com/products/distillersr-systematic-review-software .
Abstrackr [Internet]. [cited 2024 Jan 14]. Available from: http://abstrackr.cebm.brown.edu/account/login .
RobotAnalyst [Internet]. [cited 2024 Jan 14]. Available from: http://www.nactem.ac.uk/robotanalyst/ .
Clark J, McFarlane C, Cleo G, Ishikawa Ramos C, Marshall S. The impact of systematic review automation tools on methodological quality and time taken to complete systematic review Tasks: Case Study. JMIR Med Educ. 2021;7(2): e24418.
Ayers JW, Poliak A, Dredze M, Leas EC, Zhu Z, Kelley JB, et al. Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social media forum. JAMA Intern Med. 2023 [cited 2024 Jan 14]; Available from: https://jamanetwork.com/journals/jamainternalmedicine/fullarticle/2804309 .
Tang L, Sun Z, Idnay B, Nestor JG, Soroush A, Elias PA, et al. Evaluating large language models on medical evidence summarization [Internet]. medRxiv; 2023 [cited 2024 Jan 14]. Available from: https://doi.org/10.1101/2023.04.22.23288967 .
OpenAI: GPT3-apps [Internet]. [cited 2024 Jan 14]. Available from: https://openai.com/blog/gpt-3-apps .
Google: PaLM [Internet]. [cited 2024 Jan 14]. Available from: https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html .
Google: Gemini [Internet]. [cited 2024 Jan 14]. Available from: https://deepmind.google/technologies/gemini/#hands-on .
Zhao WX, Zhou K, Li J, Tang T, Wang X, Hou Y, et al. A Survey of Large Language Models. 2023 [cited 2024 Jan 14]; Available from: https://arxiv.org/abs/2303.18223 .
McNichols H, Zhang M, Lan A. Algebra error classification with large language models [Internet]. arXiv; 2023 [cited 2023 May 25]. Available from: http://arxiv.org/abs/2305.06163 .
Wadhwa S, Amir S, Wallace BC. Revisiting relation extraction in the era of large language models [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2305.05003 .
Trajanoska M, Stojanov R, Trajanov D. Enhancing knowledge graph construction using large language models [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2305.04676 .
Reynolds L, McDonell K. Prompt programming for large language models: beyond the few-shot paradigm [Internet]. arXiv; 2021 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2102.07350 .
Guerreiro NM, Alves D, Waldendorf J, Haddow B, Birch A, Colombo P, et al. Hallucinations in Large Multilingual Translation Models [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2303.16104 .
Zack T, Lehman E, Suzgun M, Rodriguez JA, Celi LA, Gichoya J, et al. Assessing the potential of GPT-4 to perpetuate racial and gender biases in health care: a model evaluation study. Lancet Digital Health. 2024;6(1):e12-22.
Hastings J. Preventing harm from non-conscious bias in medical generative AI. Lancet Digital Health. 2024;6(1):e2-3.
Digutsch J, Kosinski M. Overlap in meaning is a stronger predictor of semantic activation in GPT-3 than in humans. Sci Rep. 2023;13(1):5035.
Huggingface: FlanT5-XXL [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/google/flan-t5-xxl .
Chung HW, Hou L, Longpre S, Zoph B, Tay Y, Fedus W, et al. Scaling Instruction-Finetuned Language Models [Internet]. arXiv; 2022 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2210.11416 .
Huggingface: OpenHermes-2.5-neural-chat-7b-v3–1–7B [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-7b-v3-1-7B .
Huggingface: OpenHermes-2.5-Mistral-7B [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B .
Huggingface: neural-chat-7b-v3–1 [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/Intel/neural-chat-7b-v3-1 .
Huggingface: Mixtral-8x7B-Instruct-v0.1 [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1 .
Jiang AQ, Sablayrolles A, Roux A, Mensch A, Savary B, Bamford C, et al. Mixtral of Experts [Internet]. [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2401.04088 .
Huggingface: Platypus2–70B-Instruct [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/garage-bAInd/Platypus2-70B-instruct .
Huggingface: SOLAR-0–70b-16bit [Internet]. [cited 2024 Jan 14]. Available from: https://huggingface.co/upstage/SOLAR-0-70b-16bit#updates .
Systematic Review Datasets: ASReview [Internet]. [cited 2024 Jan 14]. Available from: https://github.com/asreview/systematic-review-datasets .
Appenzeller-Herzog C, Mathes T, Heeres MLS, Weiss KH, Houwen RHJ, Ewald H. Comparative effectiveness of common therapies for Wilson disease: a systematic review and meta-analysis of controlled studies. Liver Int. 2019;39(11):2136–52.
Bos D, Wolters FJ, Darweesh SKL, Vernooij MW, De Wolf F, Ikram MA, et al. Cerebral small vessel disease and the risk of dementia: a systematic review and meta-analysis of population-based evidence. Alzheimer’s & Dementia. 2018;14(11):1482–92.
Donners AAMT, Rademaker CMA, Bevers LAH, Huitema ADR, Schutgens REG, Egberts TCG, et al. Pharmacokinetics and associated efficacy of emicizumab in humans: a systematic review. Clin Pharmacokinet. 2021;60(11):1395–406.
Jeyaraman M, Muthu S, Ganie PA. Does the source of mesenchymal stem cell have an effect in the management of osteoarthritis of the knee? Meta-analysis of randomized controlled trials. CARTILAGE. 2021 Dec;13(1_suppl):1532S-1547S.
Leenaars C, Stafleu F, De Jong D, Van Berlo M, Geurts T, Coenen-de Roo T, et al. A systematic review comparing experimental design of animal and human methotrexate efficacy studies for rheumatoid arthritis: lessons for the translational value of animal studies. Animals. 2020;10(6):1047.
Meijboom RW, Gardarsdottir H, Egberts TCG, Giezen TJ. Patients retransitioning from biosimilar TNFα inhibitor to the corresponding originator after initial transitioning to the biosimilar: a systematic review. BioDrugs. 2022;36(1):27–39.
Muthu S, Ramakrishnan E. Fragility analysis of statistically significant outcomes of randomized control trials in spine surgery: a systematic review. Spine. 2021;46(3):198–208.
Oud M, Arntz A, Hermens ML, Verhoef R, Kendall T. Specialized psychotherapies for adults with borderline personality disorder: a systematic review and meta-analysis. Aust N Z J Psychiatry. 2018;52(10):949–61.
Van De Schoot R, Sijbrandij M, Depaoli S, Winter SD, Olff M, Van Loey NE. Bayesian PTSD-trajectory analysis with informed priors based on a systematic literature search and expert elicitation. Multivar Behav Res. 2018;53(2):267–91.
Wolters FJ, Segufa RA, Darweesh SKL, Bos D, Ikram MA, Sabayan B, et al. Coronary heart disease, heart failure, and the risk of dementia: A systematic review and meta-analysis. Alzheimer’s Dementia. 2018;14(11):1493–504.
Sutton RT, Pincock D, Baumgart DC, Sadowski DC, Fedorak RN, Kroeker KI. An overview of clinical decision support systems: benefits, risks, and strategies for success. npj Digit Med. 2020 Feb 6;3(1):17.
Natukunda A, Muchene LK. Unsupervised title and abstract screening for systematic review: a retrospective case-study using topic modelling methodology. Syst Rev. 2023;12(1):1.
Marshall IJ, Wallace BC. Toward systematic review automation: a practical guide to using machine learning tools in research synthesis. Syst Rev. 2019;8(1):163.
Wallace BC, Trikalinos TA, Lau J, Brodley C, Schmid CH. Semi-automated screening of biomedical citations for systematic reviews. BMC Bioinformatics. 2010;11(1):55.
Li D, Wang Z, Wang L, Sohn S, Shen F, Murad MH, et al. A text-mining framework for supporting systematic reviews. Am J Inf Manag. 2016;1(1):1–9.
de Almeida CPB, de Goulart BNG. How to avoid bias in systematic reviews of observational studies. Rev CEFAC. 2017;19(4):551–5.
Siddaway AP, Wood AM, Hedges LV. How to do a systematic review: a best practice guide for conducting and reporting narrative reviews, meta-analyses, and meta-syntheses. Annu Rev Psychol. 2019;70(1):747–70.
Santos ÁOD, Da Silva ES, Couto LM, Reis GVL, Belo VS. The use of artificial intelligence for automating or semi-automating biomedical literature analyses: a scoping review. J Biomed Inform. 2023;142: 104389.
Haman M, Školník M. Using ChatGPT to conduct a literature review. Account Res. 2023;6:1–3.
Liu R, Shah NB. ReviewerGPT? An exploratory study on using large language models for paper reviewing [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2306.00622
Wang S, Scells H, Koopman B, Zuccon G. Can ChatGPT write a good boolean query for systematic review literature search? [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2302.03495 .
Aydın Ö, Karaarslan E. OpenAI ChatGPT generated literature review: digital twin in healthcare. SSRN Journal [Internet]. 2022 [cited 2024 Jan 14]; Available from: https://www.ssrn.com/abstract=4308687 .
Guo E, Gupta M, Deng J, Park YJ, Paget M, Naugler C. Automated paper screening for clinical reviews using large language models [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2305.00844 .
Akinseloyin O, Jiang X, Palade V. A novel question-answering framework for automated citation screening using large language models [Internet]. medRxiv; 2023 [cited 2024 Jan 14]. Available from: https://doi.org/10.1101/2023.12.17.23300102 .
Koh JY, Salakhutdinov R, Fried D. Grounding language models to images for multimodal inputs and outputs. 2023 [cited 2024 Jan 14]; Available from: https://arxiv.org/abs/2301.13823 .
Wang L, Lyu C, Ji T, Zhang Z, Yu D, Shi S, et al. Document-level machine translation with large language models [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2304.02210 .
Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, et al. Language Models are Few-Shot Learners [Internet]. arXiv; 2020 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2005.14165 .
Koo R, Lee M, Raheja V, Park JI, Kim ZM, Kang D. Benchmarking cognitive biases in large language models as evaluators [Internet]. arXiv; 2023 [cited 2024 Jan 14]. Available from: http://arxiv.org/abs/2309.17012 .
Editorial —Artificial Intelligence language models in scientific writing. EPL. 2023 Jul 1;143(2):20000.
Grimaldi G, Ehrler B. AI et al.: machines are about to change scientific publishing forever. ACS Energy Lett. 2023;8(1):878–80.
Grillo R. The rising tide of artificial intelligence in scientific journals: a profound shift in research landscape. Eur J Ther. 2023;29(3):686–8.
nature: ChatGPT and science: the AI system was a force in 2023 — for good and bad [Internet]. [cited 2024 Jan 14]. Available from: https://www.nature.com/articles/d41586-023-03930-6 .
Chiang CH, Lee H yi. Can large language models be an alternative to human evaluations? 2023 [cited 2024 Jan 6]; Available from: https://arxiv.org/abs/2305.01937 .
Erler A. Publish with AUTOGEN or perish? Some pitfalls to avoid in the pursuit of academic enhancement via personalized large language models. Am J Bioeth. 2023;23(10):94–6.
OpenAI: ChatGPT [Internet]. [cited 2024 Jan 14]. Available from: https://openai.com/blog/chatgpt .
Gates A, Gates M, Sebastianski M, Guitard S, Elliott SA, Hartling L. The semi-automation of title and abstract screening: a retrospective exploration of ways to leverage Abstrackr’s relevance predictions in systematic and rapid reviews. BMC Med Res Methodol. 2020;20(1):139.
Not applicable.
Authors and affiliations.
Department of Radiation Oncology, Cantonal Hospital of St. Gallen, St. Gallen, Switzerland
Fabio Dennstädt & Paul Martin Putora
Institute for Computer Science, University of Würzburg, Würzburg, Germany
Johannes Zink
Department of Radiation Oncology, Inselspital, Bern University Hospital and University of Bern, Bern, Switzerland
Fabio Dennstädt, Paul Martin Putora & Nikola Cihoric
Institute for Implementation Science in Health Care, University of Zurich, Zurich, Switzerland
Janna Hastings
School of Medicine, University of St. Gallen, St. Gallen, Switzerland
Swiss Institute of Bioinformatics, Lausanne, Switzerland
All authors contributed to designing the concept and methodology of the presented approach of LLM-based evaluation of the relevance of a publication to an SLR. The Python script was created by FD and JZ. The experiments were conducted by FD and JH. All authors contributed to writing and revising the manuscript. All authors have read and approved the final version of the manuscript.
Correspondence to Fabio Dennstädt .
Ethics approval and consent to participate.
Consent for publication.
Competing interests.
NC is a technical lead for the SmartOncology© project and medical advisor for Wemedoo AG, Steinhausen AG, Switzerland. The authors declare that they have no other competing interests.
Publisher’s note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary material 1: Appendix 1: Sample prompt.
Supplementary material 2: Appendix 2: Relevant criteria of published datasets.
Supplementary material 3: Appendix 3: Performance of models on data sets.
Supplementary material 4: Appendix 4: Comparison with other approach.
Rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Dennstädt, F., Zink, J., Putora, P.M. et al. Title and abstract screening for literature reviews using large language models: an exploratory study in the biomedical domain. Syst Rev 13 , 158 (2024). https://doi.org/10.1186/s13643-024-02575-4
Received : 17 June 2023
Accepted : 30 May 2024
Published : 15 June 2024
DOI : https://doi.org/10.1186/s13643-024-02575-4
Global warming, caused by greenhouse gas emissions, is a major challenge for all human societies. To ensure that ambitious carbon neutrality and sustainable economic development goals are met, regional human activities and their impacts on carbon emissions must be studied. Guizhou Province is a typical karst area in China that predominantly uses fossil fuels. In this study, a backpropagation (BP) neural network and an extreme learning machine (ELM) model, which are advantageous for nonlinear processing, were used to predict carbon emissions from 2020 to 2040 in Guizhou Province. The carbon emissions were calculated using conversion and inventory compilation methods with energy consumption data, and the results showed an "S"-shaped growth trend. Twelve influencing factors were selected, and the five with the largest correlations were screened out using a grey correlation analysis method. A prediction model for carbon emissions from Guizhou Province was established. The prediction performance of a whale optimization algorithm (WOA)-ELM model was found to be higher than that of the BP neural network and ELM models. Baseline, high-speed, and low-carbon scenarios were analyzed, and the size and timing of peak carbon emissions in Guizhou Province from 2020 to 2040 were predicted using the WOA-ELM model.
Citation: Lian D, Yang SQ, Yang W, Zhang M, Ran WR (2024) Carbon peaking prediction scenarios based on different neural network models: A case study of Guizhou Province. PLoS ONE 19(6): e0296596. https://doi.org/10.1371/journal.pone.0296596
Editor: Salim Heddam, University 20 Août 1955, Skikda, Algeria
Received: December 19, 2023; Accepted: May 13, 2024; Published: June 25, 2024
Copyright: © 2024 Lian et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: This research was supported by Concealed Ore Deposit Exploration and Innovation Team of Guizhou Colleges and Universities (Guizhou Education and Cooperation Talent Team [2015]56), Provincial Key Discipline of Geological Resources and Geological Engineering of Guizhou Province (ZDXK[2018]001), Huang Danian Resources of National colleges and universities Teachers' Team of Exploration Engineering (Teacher Letter [2018] No. 1), Geological Resources and Geological Engineering Talent Base of Guizhou Province (RCJD2018-3), Key Laboratory of Karst Engineering Geology and Hidden Mineral Resources of Guizhou Province (Qianjiaohe KY [2018] No. 486Guizhou Institute of Technology Rural Revitalization Soft Science Project(2022xczx10), Education and Teaching Reform Research Project of Guizhou Institute of Technology (JGZD202107,2022TDFJG01).The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Global warming is a major issue for all countries, and one of the main strategies to mitigate it is the reduction of greenhouse gas emissions. In particular, the Fifth Assessment Report of the United Nations Intergovernmental Panel on Climate Change (IPCC) stated that climate warming is mainly caused by the burning of large amounts of fossil fuels during human activities and that alleviating global warming has become an unavoidable responsibility of all countries. The increase in global average temperature caused by excessive CO2 emissions seriously threatens the living space of human beings and sustainable development.
Global climate change is closely related to sustainable development goals worldwide, and consequently, many governments have taken measures to actively address this issue. China has put forward the ambitious goal of "achieving peak carbon by 2030 and carbon neutrality by 2060." To help achieve this, it has increased research and development of new energy technologies that can reduce the proportion of fossil energy use, thereby achieving a profound change in the energy consumption structure. To protect the ecological environment, water, wind, and tidal energy resources should be utilized alongside "pollution and carbon reduction" activities. Guidance through typical demonstrations should be improved to fully mobilize the enthusiasm of local governments, departments, industries, and enterprises and thus establish good working patterns.
In China, Guizhou Province is a major karst area whose energy consumption comes predominantly from coal, oil, and other primary energy sources. In recent years, accelerated urbanization and rapid economic growth have increased its dependence on traditional energy sources, and consequently, its total energy consumption is considered high. Furthermore, in Guizhou Province, the dual impetus of urbanization and economic development has led to increased energy consumption and, consequently, annually increasing carbon emissions. Energy saving and emission reduction strategies for Guizhou Province will ultimately affect low-carbon economic development for China as a whole. To achieve peak carbon emissions in China by 2030, efficient emission reduction policies must be implemented, and the factors affecting carbon emissions must be studied and monitored using effective scientific methods to enable accurate predictions. In this study, different neural network models were used to predict peak carbon emissions for Guizhou Province, which will help to reduce carbon emissions and meet China's "3060" goal ( Fig 1 ).
Existing research on carbon emissions has predominantly focused on national and industrial levels, while regional-level research has been limited. This study focuses on Guizhou Province, and the total carbon emissions were calculated using relevant data on energy consumption and carbon emissions. A literature review was conducted to identify the factors influencing carbon emissions, machine learning technology was used to identify highly correlated factors, and future trends for carbon emissions were predicted. The future carbon emissions were then analyzed using the predominant factors and carbon peak data. The results of this study will help guide future theoretical research on carbon emissions at the regional level ( Fig 1 ).
National carbon emission reduction work is required in all regions of a country, and regions with different levels of industrial development should implement differentiated policies. Guizhou Province is an old industrial base in China that has predominantly used fossil fuel energy, and its greenhouse gas emissions, both historically and at present, are generally high. Consequently, meeting China's carbon peak schedule will be difficult for Guizhou Province. A scientifically accurate calculation method for carbon emissions is thus required and will provide significant support for the implementation of energy conservation and emission reduction strategies. Based on an optimized neural network model, this study predicts and analyzes the peak carbon emissions of Guizhou Province over the next 20 years, which helps to identify existing problems and gaps and to guide the direction of energy development.
Research on the factors influencing carbon emissions has focused on the contributions of different factors and has predominantly selected economic indicators such as population, economy, and energy intensity to construct a relevant index system. Ang et al. [ 1 ] analyzed the changes in carbon dioxide produced per unit of electricity globally and considered the import and export of each country, the fuel structure of power generation, and emission factors as the main influencing factors; their study revealed that an improvement in power generation efficiency is the main reason for reduced CO2 emissions. Rustemoglu et al. [ 2 ] studied carbon dioxide levels in Brazil and Russia from 1992 to 2012 and identified the factors affecting carbon emissions as falling into employment, economy, and carbon emission intensity categories. The results showed that Brazil's carbon dioxide emissions were not decoupled from its economic development, while Russia's carbon dioxide emissions were greatly reduced with an increase in energy intensity. Lin et al. [ 3 ] used the input-output method to analyze the carbon emissions of the national food industry and divided industrial carbon emissions into four main factors: pollution factor, total output, energy intensity, and energy structure. Roinioti et al. [ 4 ] considered the development level of the national economy, energy consumption intensity, fuel consumption capacity, and carbon emission intensity as the main factors affecting carbon dioxide. Kim et al. [ 5 ] decomposed the factors affecting carbon emissions into production scale and production intensity and analyzed the contributions of growth in energy consumption in various sub-industries. Roman et al. [ 6 ] focused on Colombia and used the IDA-LMDI model to decompose the factors influencing CO2 emissions into five aspects, including energy intensity, wealth value, fossil fuel substitution, and renewable energy development, to explore the contributions of different CO2 increments.
A carbon emissions impact factor index system was derived in China, predominantly from the perspectives of energy structure, population growth, economics, and other factors. Ying et al. [ 7 ] considered China's steel industry as the main research object, and their results showed that energy intensity has a greater impact on carbon emissions, but the role of the consumption structure was not as expected. Dewey et al. [ 8 ] studied the influencing factors of carbon dioxide generated by indirect consumption in the daily lives of Chinese residents and found that socioeconomic level is the main driving factor affecting CO2 emissions for urban and rural residents, and that the contributions of their different structures and consumption proportions are inversely related to CO2 emissions. Jingyan et al. [ 9 ] used regression analysis to study the carbon emissions of the thermal power industry in Guangdong Province, China. Ting et al. [ 10 ] used the LMDI method to conclude that economic growth and carbon dioxide levels change in the same direction; that is, the faster the economic growth, the more obvious the effect of promoting carbon dioxide emissions, whereas the effective utilization and conversion of energy can reduce carbon dioxide emissions. Xiaoming et al. [ 11 ] used an LMDI decomposition model to study the carbon emissions of 30 provinces and cities in China from 2004 to 2014 and explored the contributions of the important factors by dividing the time period with 2009 as the demarcation point. The results showed that the growth of the gross national product had the greatest impact on national carbon emissions, while the contributions of other factors were weak. Ying et al. [ 12 ] used energy intensity, economic development, and population size as influencing factors to study carbon emissions. Xiaoyong et al. [ 13 ] used high-resolution spatial data to place carbonate chemical weathering carbon sinks, silicate chemical weathering carbon sinks, vegetation-soil ecosystem carbon sinks, and energy carbon emissions on a spatial grid. Subsequently, a carbon neutrality index model was established to reveal the contributions of terrestrial ecosystem carbon sinks to carbon neutrality, and the results were compared with those of other countries from horizontal and vertical perspectives. The results provide new ideas for the measurement of carbon neutralization capacity and important reference values and data for the systematic determination of global carbon neutralization capacity. The model developed by Xiaoyong et al. is highly recognized in academia.
Previous studies have predicted the carbon emissions of different countries or industries using logistic regression models, the STIRPAT model, and scenario analysis. Ouedraogo et al. [ 14 ] used the LEAP framework to model and project energy demand and associated emissions under alternative strategies in Africa from 2010 to 2040. Lin et al. [ 15 ] conducted a survey of China's manufacturing industry using the STIRPAT model and found that macroeconomic growth factors could determine the carbon dioxide emissions of the industry, while the effects of the fuel utilization and urbanization rates have significant regional heterogeneity. Wang et al. [ 16 ] constructed the STIRPAT model to investigate the factors influencing carbon emissions in Xinjiang from 1952 to 2012, and the results identified differences in the impacts of various factors in different historical periods. Prior to 1978, population size expansion was the main factor causing an increase in carbon emissions. From 1978 to 2000, economic growth and population size were the main factors driving increased carbon emissions, and after 2000, the main factors were increased economic development and fixed asset investment. Kachoee et al. [ 17 ] used the LEAP model to predict carbon dioxide emissions relating to Iran's power sector over the next 30 years and concluded that economic growth is the main influencing factor. Emodi et al. [ 18 ] used the LEAP model to study climate change in relation to Australia's power sector and found that reducing expenditure on environmental protection and resource conservation would produce economic benefits ( Fig 2 ). Liu [ 19 – 22 ] designed four plane-scale models of steel oblique beam structures and conducted quasi-static tests under cyclic loading, which clarified the yield mechanism, failure mode, hysteresis energy consumption, stiffness degradation, equivalent viscous damping coefficient, and lateral deformation performance of oblique beam structures and provided a technical basis for performance-based seismic design of such structures [ 23 ]. Jun-song Jia [ 24 ] took Henan Province of China as a study area and computed the ecological footprint (EF) and ecological carrying capacity (EC) for 1949–2006. Based on the computed results, the simulation process of the ARIMA model and the fitting and forecasting results were explained in detail. The final results demonstrated that the ARIMA model could be used effectively in the simulation and prediction of the EF, and the predicted EF could help decision-makers develop better planning for regional ecological balance and a sustainable future.
Neural network models are widely used in various fields [ 25 – 27 ]. Representative neural network models include BP neural networks, radial basis function networks, Hopfield models, GMDH networks, adaptive resonance theory, Boltzmann machines, and CPN models. Lapedes et al. used neural networks for economic forecasting in 1987, and Chunjuan et al. [ 27 ] applied neural networks to predict typhoons, debris flows, and geological subsidence. Fan et al. [ 28 ] established a POS-BP neural network model to predict the total carbon emissions and intensities of 30 provinces, municipalities, and autonomous regions in China. Ying et al. [ 29 ] compared the advantages of a neural network with those of other traditional prediction methods and used a BP neural network model combined with a terminal information collection system and Web Service technology to design an intelligent system for urban road-occupying parking, proving the feasibility of the management system using actual data. Xiaowei et al. [ 30 ] predicted the prices of stock investments by combining a neural network model with principal component analysis and multiple linear regression. Xiaolong et al. [ 31 ] studied the problem of gas outbursts in tunnels using a BP neural network; the results were good and verified that the predicted and real values are consistent. Xiaocheng et al. [ 32 ] also used a BP neural network to predict air pollutant concentrations. The original BP neural network was used to calculate the system error of all samples using successive iterations and batch processing, which improved the execution efficiency.
In summary, the influencing factors affecting carbon emissions are predominantly considered to be population, economy, and energy structure, and these are used to establish a carbon emission index system applied in subsequent prediction research. Carbon emission forecasting research has predominantly used traditional econometric methods such as the logistic regression model, the STIRPAT model, and scenario analysis, whereas neural network models have achieved satisfactory results when used for economic forecasting. A review of the carbon emission literature shows that most recent studies have focused on the national or industrial levels, few studies have used machine learning algorithms for peak carbon emission predictions, and the neural network models used have been relatively limited in variety. Training a single neural network prediction model is time-consuming and can easily fall into a local optimum. This study therefore combined neural network models with carbon emission research and optimized different algorithms to improve carbon emission predictions for Guizhou Province.
Carbon emissions from Guizhou Province are calculated using an inventory method based on energy consumption data for the province. Referring to the relevant literature and combining it with the actual development of Guizhou Province, 12 factors affecting carbon emissions were selected to establish a characteristic subset. By introducing the grey correlation analysis method, the indicators with a higher degree of influence were selected and applied in the follow-up prediction research. Finally, the carbon emissions from 2020 to 2040 in Guizhou Province were predicted under three different development scenarios by establishing a prediction model based on the WOA-ELM.
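A plausible form of the estimation formula, assuming the conventional standard-coal inventory calculation in which each energy consumption B_i is multiplied by a standard coal conversion coefficient θ_i and a carbon emission coefficient f_i (these coefficient symbols are assumed here; cf. Table 1), is

C = \sum_{i=1}^{n} B_i \times \theta_i \times f_i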
where B i represents the consumption of the i-th energy source and n represents the number of energy types. In this study, n = 9, covering the consumption of coal, coke, crude oil, gasoline, kerosene, diesel, natural gas, and electricity. The standard coal conversion coefficients and carbon emission coefficients for each energy source in Guizhou Province are listed in Table 1 .
From 2000 to 2012, the total carbon emissions from Guizhou Province increased ( Table 2 ), while from 2012 to 2016 they showed a decreasing trend. The decrease is related to the construction of an ecological civilization in Guizhou Province and the practice of the principle that lucid waters and lush mountains are invaluable assets. In 2020, the total carbon emissions of Guizhou Province were 22237 ( Table 2 ), approximately twice the 2002 level. With social and economic development, carbon emissions in Guizhou Province are expected to show a steady upward trend in the future. However, owing to the inhibiting effect of policies such as carbon emission reduction measures, the establishment of a carbon trading market, and an increase in the proportion of new energy applications in Guizhou Province, the growth rate of carbon emissions will gradually decrease year on year.
The literature on carbon emission influencing factors shows that most scholars select the macroeconomy, industrial structure, energy consumption, and scientific and technological development. Based on the actual social and economic development of Guizhou Province, this study comprehensively considered scientific, systematic, and authentic principles of index selection. Total population, urbanization rate, household consumption level, per capita GDP, energy intensity, carbon emission intensity, foreign direct investment, energy structure, the proportions of the primary, secondary, and tertiary industries, and total energy consumption were selected and qualitatively analyzed. A description of each factor is as follows:
The selected indicators are shown in Table 3 .
Total carbon emissions from Guizhou Province between 2000 and 2020 were calculated as basic data, and correlations with the population, economy, and energy data were determined. The data were obtained from the Statistical Yearbook of Guizhou Province 2000–2020.
To facilitate subsequent modeling and data representation, the variable names for the 12 factors affecting carbon emissions were redefined as X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, and X12. The total carbon emissions (Y) from Guizhou Province were set as the reference sequence, and the 12 influencing factors were set as the comparison sequences. The original data were normalized to eliminate dimensional influences, and the results are shown in Table 4 .
The maximum and minimum absolute differences in the matrix were calculated, and the resolution coefficient was set to 0.5 to obtain the correlation coefficient table. The correlation coefficients of each comparison sequence were averaged over all time points to obtain the correlation degrees, which were then ranked. The results are summarized in Table 5 .
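A minimal sketch of this grey relational analysis (an assumed Python implementation, not the computation code used in this study; the toy numbers are illustrative) is given below.

import numpy as np

def grey_relational_degrees(reference, factors, rho=0.5):
    """reference: array of shape (T,); factors: array of shape (k, T)."""
    reference = np.asarray(reference, dtype=float)
    factors = np.asarray(factors, dtype=float)
    # Normalize each series by its initial value to remove dimensional effects
    # (other normalizations, e.g. min-max scaling, are also common).
    ref = reference / reference[0]
    fac = factors / factors[:, :1]
    diff = np.abs(fac - ref)                # absolute differences to the reference, shape (k, T)
    d_min, d_max = diff.min(), diff.max()   # two-level minimum and maximum differences
    xi = (d_min + rho * d_max) / (diff + rho * d_max)   # correlation coefficients, rho = 0.5
    return xi.mean(axis=1)                  # grey relational degree of each factor

# Toy example with a reference sequence Y and three comparison factors.
Y = np.array([100, 110, 125, 140, 160])
X = np.array([
    [50, 56, 63, 70, 80],      # moves closely with Y -> high degree
    [10, 12, 11, 13, 12],      # weakly related
    [200, 190, 180, 175, 170]  # opposite trend
])
print(grey_relational_degrees(Y, X))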
The closer the correlation degree is to 1, the stronger the correlation with carbon emissions in Guizhou Province. The top five correlation degrees were those of X12, X1, X2, X4, and X3, corresponding to total energy consumption, total population, urbanization rate, per capita gross domestic product (GDP), and residents' consumption level. The correlation degrees of these main factors were all greater than 0.75. The correlation degrees for X7, X11, X10, X9, X8, and X5 were all between 0.5 and 0.7, indicating medium correlation. The correlation coefficient between energy intensity and carbon emissions in Guizhou Province was low, at 0.487.
The relevant data regarding energy consumption in Guizhou Province were collected, and the carbon emission data for 2000 to 2020 were determined. The results show that the total carbon emissions of Guizhou Province are closely related to economic development and relevant policies. Drawing on the existing literature and the data from Guizhou Province, this study initially set 12 indicators, including total population, urbanization rate, residents' consumption level, and per capita GDP, as the factors influencing carbon emissions and explained the reasons for their selection in detail. Total energy consumption, total population, urbanization rate, per capita GDP, and residents' consumption level all showed a high level of correlation with carbon emissions in Guizhou Province, and these five strongly correlated factors can be used as input variables in the prediction model to improve the accuracy of carbon emission prediction for Guizhou Province.
Prediction model design.
Establishment of a BP neural network model.
A BP neural network is composed of input, hidden, and output layers. When establishing a BP neural network, setting too many or too few hidden layer nodes will affect the results; too many hidden layer nodes are prone to overfitting and increase the training time, while too few hidden layer nodes reduce the accuracy of the data fitting. The number of hidden layer nodes must therefore be determined based on the data characteristics. In this study, the number of hidden layer nodes [ 38 ] was determined by trial and error. The specific settings for each layer and node are as follows:
The activation function introduces a nonlinear relationship into the neurons through mapping. To better represent the nonlinear relationship of a function, an appropriate type of activation function must be selected. The hyperbolic tangent function is a common activation function that maps values in (−∞, +∞) into (−1, 1), keeping the variable within the largest possible threshold range and thereby better preserving the nonlinear variation of the function. The transfer function of the hidden layer nodes was therefore chosen as the tangent S-type transfer function, tansig. The input and output values of the linear transfer function purelin can assume any value; to facilitate comparison with the sample values, purelin was selected as the transfer function of the output layer. The learning rate refers to the speed at which the BP neural network accumulates information over time, and different learning rate settings affect the training time and the training effect of the model. A larger learning rate makes training relatively fast; however, it causes large fluctuations in the later period, so that the model cannot converge. A smaller learning rate can make the simulation results more accurate, but it significantly increases the training time. In general studies, the learning rate γ is set between 0 and 1. In this paper, through continuous debugging and comparison of the training effect during the training process, a learning rate of γ = 0.1 was selected. The required training accuracy was 0.001, and the maximum number of training epochs was 500.
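An approximate sketch of such a BP configuration, using scikit-learn as a stand-in (the tanh activation corresponds to the tansig hidden transfer function and the default linear output to purelin; the number of hidden nodes and the toy training data are assumptions, not the study's settings), could look as follows.

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training data: five input factors per year (e.g. X12, X1, X2, X4, X3)
# and the corresponding total carbon emissions Y for 2000-2014.
rng = np.random.default_rng(0)
X_train = rng.random((15, 5))
y_train = X_train.sum(axis=1) + 0.1 * rng.random(15)

scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X_train)   # normalize inputs before training

bp_model = MLPRegressor(
    hidden_layer_sizes=(8,),   # hidden node count chosen by trial and error (assumed value)
    activation="tanh",         # analogue of the tansig transfer function
    solver="sgd",
    learning_rate_init=0.1,    # learning rate gamma = 0.1 as described above
    max_iter=500,              # maximum number of training epochs
    tol=1e-3,                  # training accuracy requirement of 0.001
    random_state=0,
)
bp_model.fit(X_scaled, y_train)
print(bp_model.predict(X_scaled[:3]))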
The algorithm can be divided into three steps. The first step determines the number of neurons in the hidden layer and randomly generates the connection weights between the input layer and the hidden layer as well as the neuron biases of the hidden layer in the network model. The second step determines the activation function of the neurons in the hidden layer by selecting an infinitely differentiable function and calculates the output matrix H of the hidden layer. The third step calculates the output layer weights analytically as β* = H⁺T, where H⁺ is the Moore–Penrose generalized inverse of H and T is the target output matrix.
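A minimal ELM sketch following these three steps (an assumed implementation, not the code used in this study; the toy data stand in for the Guizhou training set) is shown below.

import numpy as np

class ELM:
    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Infinitely differentiable activation applied to the hidden layer.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, T):
        n_features = X.shape[1]
        # Step 1: randomly generate input weights W and hidden-layer biases b.
        self.W = self.rng.normal(size=(n_features, self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Step 2: compute the hidden-layer output matrix H.
        H = self._hidden(X)
        # Step 3: solve the output weights analytically, beta = pinv(H) @ T.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Toy usage with random data in place of the 2000-2014 training samples.
rng = np.random.default_rng(1)
X = rng.random((15, 5))
T = X @ np.array([1.0, 0.5, 2.0, 0.2, 1.5])
model = ELM(n_hidden=10).fit(X, T)
print(np.abs(model.predict(X) - T).mean())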
The BP algorithm is a common learning algorithm used in various fields. However, existing problems restrict its development. During the training process, the initial weights and thresholds are randomly generated, and consequently, the generalization ability cannot be guaranteed. The WOA is then used to optimize the initial parameters of the BP neural network to obtain a more stable WOA-BP neural network.
The steps involved in the WOA optimization of the BP neural network are as follows:
Based on the whale optimization algorithm and the structure of the extreme learning machine described above, a WOA-ELM combined forecasting model was established. In the WOA, the optimal position of the humpback whale corresponds to the optimized ELM parameter values, and the WOA iterations are used to determine the optimal input weights w i and biases b i of the ELM, which improves the prediction accuracy of the model.
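A simplified sketch of the WOA search loop is given below (assumed implementation: the fitness function here is a placeholder sphere function, whereas in the WOA-ELM model it would train an ELM with the candidate input weights and biases and return the resulting prediction error).

import numpy as np

def woa_minimize(fitness, dim, n_whales=20, n_iter=100, lb=-1.0, ub=1.0, seed=0):
    rng = np.random.default_rng(seed)
    whales = rng.uniform(lb, ub, size=(n_whales, dim))
    scores = np.array([fitness(w) for w in whales])
    best = whales[scores.argmin()].copy()
    best_score = scores.min()

    for t in range(n_iter):
        a = 2 - 2 * t / n_iter                      # a decreases linearly from 2 to 0
        for i in range(n_whales):
            r1, r2 = rng.random(dim), rng.random(dim)
            A, C = 2 * a * r1 - a, 2 * r2
            if rng.random() < 0.5:
                if np.all(np.abs(A) < 1):           # encircling the best solution found so far
                    whales[i] = best - A * np.abs(C * best - whales[i])
                else:                               # exploration around a randomly chosen whale
                    rand = whales[rng.integers(n_whales)]
                    whales[i] = rand - A * np.abs(C * rand - whales[i])
            else:                                   # spiral bubble-net update
                l = rng.uniform(-1, 1, dim)
                whales[i] = np.abs(best - whales[i]) * np.exp(l) * np.cos(2 * np.pi * l) + best
            whales[i] = np.clip(whales[i], lb, ub)
            score = fitness(whales[i])
            if score < best_score:
                best, best_score = whales[i].copy(), score
    return best, best_score

# Placeholder fitness; for WOA-ELM it would return, e.g., the training RMSE of an ELM
# whose input weights and biases are taken from the candidate vector.
best, err = woa_minimize(lambda w: float(np.sum(w ** 2)), dim=10)
print(err)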
It is worth noting that in the ELM, the input data are transformed by the hidden layer and the output layer then produces the result; this is forward propagation, that is, information flows from the input layer to the output layer. In a classical BP network, backpropagation is the key optimization technique: when the output does not meet expectations, the difference between the actual and expected outputs is used to calculate how much the weights of each layer need to be adjusted, and the network is optimized accordingly. The ELM, by contrast, is a single-hidden-layer feedforward network in which the randomly generated hidden-layer weights and biases are not adjusted iteratively and the output weights are obtained analytically. This is precisely why, in this study, the WOA is used to search for suitable hidden-layer weights and biases, so that the network can better adapt to the training data.
There is no authoritative agency in China that directly provides carbon dioxide emission data; consequently, this study used a compromise method in which the emission data for each year were drawn from different database collections and averaged. The calculation of carbon emissions and the collection of relevant data vary depending on the subject of the study. Considering the characteristics of carbon emissions in Guizhou Province and the difficulty of acquiring data, this study used an estimation method.
The carbon emission prediction results in Table 7 show that the trend of the predicted carbon emission values for Guizhou Province was generally consistent with that of the real values; however, the differences between the values are large, and the prediction is not sufficiently stable. The differences between the predicted and real values in 2018, 2019, and 2020 were small, with relative errors below 2.5%, which met the expectations for prediction accuracy. However, there were large differences between the predicted and actual values in 2015, 2016, and 2017, and the expected prediction performance was not achieved. This is predominantly because the initial weights and thresholds of the BP neural network are determined randomly, making it difficult to achieve a good fit during model training; this results in large fluctuations in the prediction results and an inability to achieve a good prediction effect [ 39 – 42 ].
An extreme learning machine model was used to predict carbon emissions for Guizhou Province, using the data from 2000 to 2014 as the training set and the data from 2015 to 2020 as the test set. The training samples were prepared in the same way as for the BP neural network. The relative error was reported to two decimal places and the absolute error to one decimal place, and the errors between the actual and predicted values were compared.
The prediction results in Table 8 show that the model fits the carbon emissions of Guizhou Province well and approximates their relationship with the influencing factors. However, the predicted results are unstable. Although the predicted values for most years were close to the actual values, the carbon emissions predicted for 2017 differed significantly from the actual value. In the forecast results, the absolute error between the forecast and real values in 2019 was 8.8, and the forecast value for 2019 was the closest to the actual carbon emissions of Guizhou Province. The average relative error on the test set was 0.43%. Compared with the BP neural network, the carbon emission prediction model for Guizhou Province based on the extreme learning machine has higher accuracy and a stronger ability to approximate the nonlinear relationships in the samples, but the random initialization of the hidden-layer parameters and the resulting β also affect the accuracy of the model to a certain extent and require further optimization.
The whale optimization algorithm, which has a global search ability, was used to optimize the initial weights and thresholds of the BP neural network to improve its prediction accuracy. The training and test sets were the same as those used in the BP neural network model.
When setting the initial weights and thresholds of the neural network, a set of randomly generated initial values was selected, because there is no relevant setting principle. The BP neural network can automatically learn the mapping relationship between the input and output, generate initial parameters randomly, and continuously modify the weights and thresholds of the network through error backpropagation; however, randomly selected initial weights and thresholds are usually inversely related to the convergence speed of neural network training; that is, the larger the values, the slower the convergence. In this case, the final training results easily fall into a local optimum, and it is difficult to obtain ideal calculation and prediction results. As shown in Table 9 , the relative error of the BP neural network after optimization is significantly reduced, to no more than 1.5%, and the degree of fit to the carbon emissions of Guizhou Province is significantly higher than before optimization. The carbon emission prediction for 2017 was the closest to the actual value, with a relative error of 0.16%, and the prediction results for the other years were relatively stable. The accuracy and stability of the prediction using the WOA-BP neural network were significantly improved [ 43 , 44 ].
The training samples selected in this section were the same as those used for the extreme learning machine model setting. The input and output variable data from 2000 to 2014 were used as the training set ( Table 10 ), and the prediction years were from 2015 to 2020. The WOA-ELM model was established [ 45 – 50 ].
The results show that, after multiple training runs and WOA optimization, the ELM model fits the carbon emissions of Guizhou Province from 2015 to 2020 better. The relative error between the predicted and real values is between 0% and 0.05%. The relative and absolute errors are small for all but a few years, and the prediction accuracy is significantly higher than that of the ELM model alone. The effectiveness of the WOA-optimized ELM scheme was thus verified.
In this study, four prediction models were established and tested on the carbon emission data for Guizhou Province. To evaluate their prediction performance, three indicators were used for comparison: the mean absolute error (MAE), the mean absolute percentage error (MAPE), and the root mean square error (RMSE). The MAE reflects the average magnitude of the prediction error, the MAPE expresses that error relative to the actual values, and the RMSE reflects the deviation between the predicted and true values.
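For reference, the three indicators are standard and can be computed with a short helper such as the one below (an illustrative sketch; the function name is arbitrary and MAPE is expressed in percent).

```python
import numpy as np

def evaluation_metrics(y_true, y_pred):
    """Return MAE, MAPE (in %), and RMSE for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err / y_true)) * 100.0
    rmse = np.sqrt(np.mean(err ** 2))
    return mae, mape, rmse
```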
The results showed that both the BP neural network and the extreme learning machine model predict the carbon emission data for Guizhou Province with high accuracy (Table 11). Compared with the ELM model, however, the BP neural network model is less robust and more affected by randomness, and its convergence speed is slightly lower. Among the four prediction models, the WOA-ELM model had the highest prediction accuracy, followed by the model based on the extreme learning machine, while the BP neural network model performed worst. The comparison therefore indicates that the WOA-ELM model has the best prediction performance and reaches the expected level of accuracy, so it is used below to predict the peak carbon emissions of Guizhou Province.
https://doi.org/10.1371/journal.pone.0296596.t011
Construction of carbon emission scenarios: scenario settings.
Scenario analysis refers to the quantitative analysis of past and present situations, combined with qualitative assumptions about the factors that will shape the future, in order to infer possible future situations. The purpose of scenario construction is not to predict the future exactly; its greatest practical value lies in comprehensive analysis. Two premises apply when using scenario analysis: the impact factors must be quantifiable, and the future indicators must be predictable [33–35, 51, 52].
This section uses scenario analysis to set the impact factors of carbon emissions under different development scenarios, which facilitates the prediction of carbon emission levels in Guizhou Province from 2020 to 2040. First, the baseline, high-speed, and low-carbon scenarios are defined, corresponding to medium and high growth for the indicators with positive regression coefficients. Then, based on the strategic policies for economic and energy development in Guizhou Province, the current state of economic and social development and the trend of the energy structure were clarified, and the future values of total population, urbanization rate, residents' consumption level, total energy consumption, and per capita GDP under the different development scenarios were set in combination with the relevant policies and energy targets. Finally, the future evolution of carbon emissions in Guizhou Province was predicted (Table 12).
https://doi.org/10.1371/journal.pone.0296596.t012
Benchmark scenario. The benchmark scenario is a continuation of the existing pattern of economic and energy development in Guizhou Province; under the current development model, each factor is set to its most likely value. As a large industrial province, Guizhou has a complete industrial infrastructure that is expected to continue to thrive, and its economic and industrial structures will continue to follow the national calls for transformation and upgrading. The energy consumption structure of Guizhou Province is shaped by industrial development and will remain dominated by fossil energy consumption; however, the share of fossil energy will continue to decline as new energy technologies develop.
High-speed scenario. In the high-speed scenario, the total population, urbanization rate, per capita GDP, household consumption level, and total energy consumption are set according to a pattern of rapid development and change. With rapid population growth, accelerated urbanization, vigorous economic and social development, the rapid growth of new industries, and a dominant position for the information industry, new energy will be applied more widely across industries and energy utilization efficiency will improve significantly.
Low-carbon scenario. The total population, urbanization rate, per capita GDP, and total energy consumption develop at lower rates than in the baseline scenario.
1) Population setting. With economic and social development, the total population will continue to expand in the short term, but in the long term the population growth rate will decline. Analysis of the trend in the total population of Guizhou Province shows that it gradually decreased from 2010 to 2020 and that the natural growth rate of the population was negative. In 2020 the population of Guizhou Province was 38.57 million, while the Population Development Plan of Guizhou Province proposes a permanent population of 50 million by 2030, corresponding to an average annual growth rate of 0.44%. Based on the population development plan of Guizhou Province and the population growth of recent years, this study sets the annual rate of change at 0.7% in the baseline mode, 1% in the high-speed mode, and 0.5% in the low-carbon mode.
2) Setting the urbanization rate. Urbanization is advancing continuously in Guizhou Province, reaching a rate of 50.26% in 2018, 2.58 times the 1995 level. The average growth rate over the past five years was 1.12%, and over the past ten years 1.33%. International experience shows that Britain and the United States lead the global urbanization process at roughly 80%, while other developed countries are at roughly 70%. Compared with the average level of urbanization in China, the urbanization process in Guizhou Province is relatively fast. Drawing on the experience of developed countries, this study sets the annual rate of change to 1% in the benchmark mode, 1.25% in the high-speed mode, and 0.7% in the low-carbon mode.
3) Setting of per capita GDP. The per capita GDP of Guizhou Province grew continuously from 2000 to 2020, from about $330 per person in 2000 to about $7,000 per person in 2020. In recent years, infrastructure development has boosted economic growth in Guizhou Province, and the growth rate of per capita GDP has risen rapidly; in 2016, per capita GDP increased significantly. According to the 13th Five-Year Plan for National Economic and Social Development of Guizhou Province, the average annual growth rate of regional GDP reached 6.6%, and the scope for further declines in the per capita GDP growth rate will gradually shrink after the 13th Five-Year Plan period. This study sets the annual rate of change to 6.5% in the benchmark mode, 7% in the high-speed mode, and 6% in the low-carbon mode [36–38, 53, 54].
4) Residents' consumption levels. From 2000 to 2020, the average annual growth rate of residents' consumption levels in Guizhou Province was 8.0%. The 13th Five-Year Plan of Guizhou Province proposes releasing residents' consumption potential, creating consumption demand, and further enhancing consumption capacity. The annual rate of change is set to 8% in the baseline mode and 9% in the high-speed mode.
5) Total energy consumption. From 2000 to 2020, the total energy consumption of Guizhou Province increased slightly overall. From 2002 to 2012 it showed a rapid upward trend; after 2012 it declined, from 23,526 tonnes of standard coal/10,000 CHY in 2012 to 22,321 tonnes of standard coal/10,000 CHY in 2018. In line with the energy-saving and emission-reduction requirements of the 13th Five-Year Plan of Guizhou Province, this paper sets the annual rate of change to −1.5% in the baseline mode, −2% in the high-speed mode, and −2.5% in the low-carbon mode (Table 13).
https://doi.org/10.1371/journal.pone.0296596.t013
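To make the scenario construction concrete, the sketch below compounds each driver forward from a 2020 base year at the annual rates of change listed above. It is illustrative only: the base values are placeholders rather than the paper's data, the low-carbon rate for residents' consumption is not stated in the text and is therefore a placeholder, and compound (multiplicative) growth is an assumption.

```python
import numpy as np
import pandas as pd

DRIVERS = ["population", "urbanization_rate", "consumption_level",
           "gdp_per_capita", "energy_consumption"]

# Hypothetical 2020 base values (placeholders, not the paper's data),
# in the same order as DRIVERS.
base_2020 = np.array([3857.0, 53.0, 20000.0, 46000.0, 22000.0])

# Annual rates of change from the scenario settings above; the low-carbon
# consumption-level rate is a placeholder because the text does not state it.
rates = {
    "baseline":   np.array([0.007, 0.0100, 0.08, 0.065, -0.015]),
    "high_speed": np.array([0.010, 0.0125, 0.09, 0.070, -0.020]),
    "low_carbon": np.array([0.005, 0.0070, 0.08, 0.060, -0.025]),
}

years = np.arange(2020, 2041)

def project_drivers(base, annual_rate, years):
    """Compound each driver forward from the base year at its annual rate."""
    steps = (years - years[0])[:, None]
    return base * (1.0 + annual_rate) ** steps

scenario_inputs = {
    name: pd.DataFrame(project_drivers(base_2020, r, years),
                       index=years, columns=DRIVERS)
    for name, r in rates.items()
}
```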
The fitted WOA-ELM model was used to predict the carbon emissions of Guizhou Province from 2020 to 2040 under the three scenarios. The predicted results are listed in Table 14.
https://doi.org/10.1371/journal.pone.0296596.t014
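A small sketch of this last step, under the same caveats as above: `woa_elm_predict` and `scaler` stand in for the fitted WOA-ELM model and whatever input scaling was used, and `scenario_inputs` is the driver table from the previous sketch.

```python
import numpy as np

def peak_from_scenario(drivers, predict, scaler=None):
    """Predict yearly emissions for one scenario and return (peak year, peak value)."""
    X = scaler.transform(drivers.values) if scaler is not None else drivers.values
    emissions = np.asarray(predict(X)).ravel()   # one predicted value per year
    i = int(np.argmax(emissions))
    return int(drivers.index[i]), float(emissions[i])

# for name, df in scenario_inputs.items():
#     year, value = peak_from_scenario(df, predict=woa_elm_predict)
#     print(name, year, value)
```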
Because carbon emissions are affected differently by population, urbanization rate, residents' consumption level, per capita GDP, and total energy consumption, the timing and size of the carbon peak in Guizhou Province change with the parameter settings, and total carbon emissions change accordingly. In the baseline scenario, the carbon emissions of Guizhou Province are estimated to peak at about 260 million tonnes in 2038; in the high-speed scenario they peak at about 300 million tonnes in 2036; and in the low-carbon scenario they peak at about 210 million tonnes in 2033. The baseline scenario therefore indicates that Guizhou Province will not reach its carbon peak by 2030 as scheduled, and the peak will most likely be delayed to 2038. In the high-speed scenario, the peak occurs two years earlier than in the baseline scenario. Comparing the baseline and low-carbon scenarios, the peak year under the low-carbon scenario is five years earlier than under the baseline scenario; although this is still three years after China's 2030 peak target, it is acceptable given that the development level of Guizhou Province lags behind that of developed cities such as Beijing and Shanghai, and the peak value is about 50 million tonnes lower than in the baseline scenario. Comparing the low-carbon and high-speed scenarios, the peak in the low-carbon scenario arrives three years earlier and is about 90 million tonnes lower. Overall, the predictions show that Guizhou Province cannot achieve the 2030 carbon peak goal under the baseline or high-speed development scenarios, whereas under the low-carbon scenario the peak comes earlier and the peak value is lower.
Exploring the main factors affecting carbon emissions in Guizhou Province is crucial for China to achieve its carbon peaking and carbon neutrality goals. Accurate carbon emission predictions are also of great significance for governments, helping them formulate relevant policies and drive innovation in energy-saving and emission-reduction technologies. In this study, a candidate set of factors affecting carbon emissions was constructed from the existing literature and real-world data for Guizhou Province. Appropriate input variables were selected using grey correlation analysis; BP neural network and ELM models were then established, the WOA was used to optimize both models, and the performance of the prediction models was compared and analysed through simulation. Finally, three scenarios were established to predict the carbon emissions of Guizhou Province from 2020 to 2040. The following conclusions were drawn from the analysis.
The results show that the total carbon emissions of Guizhou Province in 2020 were approximately 22,237 ×10⁴ tonnes (about 222 million tonnes), roughly twice the total carbon emissions of the province in 2000. With continued socioeconomic development, the growth rate of total carbon emissions in Guizhou Province gradually decreases, and the overall trend follows an S-shaped curve. Combining the data for Guizhou Province with previous studies, 12 influencing factors were initially selected. According to their degrees of correlation, population and total energy consumption have the greatest impact on carbon emissions in Guizhou Province, and the total population, urbanization rate, residents' consumption level, per capita GDP, and total energy consumption were selected as the input variables of the prediction model.
BP neural network, ELM, WOA-BP, and WOA-ELM models were established to predict the carbon emissions of Guizhou Province. Comparing the mean absolute error, mean absolute percentage error, and root mean square error of the four models, the WOA-ELM model was found to be the most accurate, with an MAE of 101. The model based on the extreme learning machine had the second-highest prediction accuracy, with an MAE of 224.46, a MAPE of 0.43%, and an RMSE of 328.62, while the BP neural network model performed worst. Three scenarios were constructed: baseline, high-speed, and low-carbon. The carbon emissions of Guizhou Province over the next 20 years were then obtained by feeding the scenario parameters into the fitted model. The results show that, under the baseline scenario, the carbon peak of Guizhou Province will occur in 2038, with a peak value of 26,243.61 ×10⁴ tonnes; under the high-speed scenario, the peak occurs in 2036, with a value of 30,251.27 ×10⁴ tonnes; and under the low-carbon scenario, the peak occurs in 2033, with a value of 21,294.98 ×10⁴ tonnes. Under the baseline scenario, Guizhou Province cannot achieve China's 2030 peak target, and the low-carbon scenario comes closest to the carbon peak target of the three, indicating that external policy intervention in Guizhou Province is necessary [55, 56].
According to the results of the grey correlation analysis above, population has an important impact on carbon emissions in Guizhou Province and must be considered in the province's energy conservation and emission reduction work. The increasing demand for energy in daily life and production activities significantly drives the growth of carbon emissions. Controlling the population of Guizhou Province and encouraging citizens to choose green travel and energy-saving, environmentally friendly lifestyles will have far-reaching effects on the province's carbon emissions. The government should encourage people to save electricity, dispose of waste household appliances appropriately, and increase investment in the research and development of energy-saving alternatives. It should also enrich urban public transport, promote the construction of public transport facilities, and make the development of new energy vehicles more convenient.
Analysis of the differences in carbon emissions under the three development modes in Guizhou Province shows that the low-carbon mode reaches the carbon peak earliest, followed by the high-speed and benchmark modes, and that its peak carbon dioxide emissions are the smallest of the three. Overall, population, economic development, and energy consumption influence one another; to reach the carbon peak in Guizhou Province in a timely manner, normal economic growth should be maintained while measures are taken to control the rise in the urbanization rate, reduce energy consumption, and optimize the energy structure. For example, coal accounts for a high proportion of energy consumption in Guizhou Province, and coal combustion increases carbon dioxide emissions. Efforts should therefore be made to reduce coal consumption, increase the utilization and conversion efficiency of coal, increase investment in the research and development of new energy, broaden the popularization of new energy, improve the construction of related supporting facilities, and put the full use of new energy on the agenda. The development of water conservancy, hydropower, and photovoltaic projects should be emphasised to increase the proportion of clean energy such as hydropower.
https://doi.org/10.1371/journal.pone.0296596.s001