• How it works

researchprospect post subheader

Meta-Analysis – Guide with Definition, Steps & Examples

Published by Owen Ingram at April 26th, 2023 , Revised On April 26, 2023

“A meta-analysis is a formal, epidemiological, quantitative study design that uses statistical methods to generalise the findings of the selected independent studies. “

Meta-analysis and systematic review are the two most authentic strategies in research. When researchers start looking for the best available evidence concerning their research work, they are advised to begin from the top of the evidence pyramid. The evidence available in the form of meta-analysis or systematic reviews addressing important questions is significant in academics because it informs decision-making.

What is Meta-Analysis  

Meta-analysis estimates the absolute effect of individual independent research studies by systematically synthesising or merging the results. Meta-analysis isn’t only about achieving a wider population by combining several smaller studies. It involves systematic methods to evaluate the inconsistencies in participants, variability (also known as heterogeneity), and findings to check how sensitive their findings are to the selected systematic review protocol.   

When Should you Conduct a Meta-Analysis?

Meta-analysis has become a widely-used research method in medical sciences and other fields of work for several reasons. The technique involves summarising the results of independent systematic review studies. 

The Cochrane Handbook explains that “an important step in a systematic review is the thoughtful consideration of whether it is appropriate to combine the numerical results of all, or perhaps some, of the studies. Such a meta-analysis yields an overall statistic (together with its confidence interval) that summarizes the effectiveness of an experimental intervention compared with a comparator intervention” (section 10.2).

A researcher or a practitioner should choose meta-analysis when the following outcomes are desirable. 

For generating new hypotheses or ending controversies resulting from different research studies. Quantifying and evaluating the variable results and identifying the extent of conflict in literature through meta-analysis is possible. 

To find research gaps left unfilled and address questions not posed by individual studies. Primary research studies involve specific types of participants and interventions. A review of these studies with variable characteristics and methodologies can allow the researcher to gauge the consistency of findings across a wider range of participants and interventions. With the help of meta-analysis, the reasons for differences in the effect can also be explored. 

To provide convincing evidence. Estimating the effects with a larger sample size and interventions can provide convincing evidence. Many academic studies are based on a very small dataset, so the estimated intervention effects in isolation are not fully reliable.

Elements of a Meta-Analysis

Deeks et al. (2019), Haidilch (2010), and Grant & Booth (2009) explored the characteristics, strengths, and weaknesses of conducting the meta-analysis. They are briefly explained below. 

Characteristics: 

  • A systematic review must be completed before conducting the meta-analysis because it provides a summary of the findings of the individual studies synthesised. 
  • You can only conduct a meta-analysis by synthesising studies in a systematic review. 
  • The studies selected for statistical analysis for the purpose of meta-analysis should be similar in terms of comparison, intervention, and population. 

Strengths: 

  • A meta-analysis takes place after the systematic review. The end product is a comprehensive quantitative analysis that is complicated but reliable. 
  • It gives more value and weightage to existing studies that do not hold practical value on their own. 
  • Policy-makers and academicians cannot base their decisions on individual research studies. Meta-analysis provides them with a complex and solid analysis of evidence to make informed decisions. 

Criticisms: 

  • The meta-analysis uses studies exploring similar topics. Finding similar studies for the meta-analysis can be challenging.
  • When and if biases in the individual studies or those related to reporting and specific research methodologies are involved, the meta-analysis results could be misleading.

Steps of Conducting the Meta-Analysis 

The process of conducting the meta-analysis has remained a topic of debate among researchers and scientists. However, the following 5-step process is widely accepted. 

Step 1: Research Question

The first step in conducting clinical research involves identifying a research question and proposing a hypothesis . The potential clinical significance of the research question is then explained, and the study design and analytical plan are justified.

Step 2: Systematic Review 

The purpose of a systematic review (SR) is to address a research question by identifying all relevant studies that meet the required quality standards for inclusion. While established journals typically serve as the primary source for identified studies, it is important to also consider unpublished data to avoid publication bias or the exclusion of studies with negative results.

While some meta-analyses may limit their focus to randomized controlled trials (RCTs) for the sake of obtaining the highest quality evidence, other experimental and quasi-experimental studies may be included if they meet the specific inclusion/exclusion criteria established for the review.

Step 3: Data Extraction

After selecting studies for the meta-analysis, researchers extract summary data or outcomes, as well as sample sizes and measures of data variability for both intervention and control groups. The choice of outcome measures depends on the research question and the type of study, and may include numerical or categorical measures.

For instance, numerical means may be used to report differences in scores on a questionnaire or changes in a measurement, such as blood pressure. In contrast, risk measures like odds ratios (OR) or relative risks (RR) are typically used to report differences in the probability of belonging to one category or another, such as vaginal birth versus cesarean birth.

Step 4: Standardisation and Weighting Studies

After gathering all the required data, the fourth step involves computing suitable summary measures from each study for further examination. These measures are typically referred to as Effect Sizes and indicate the difference in average scores between the control and intervention groups. For instance, it could be the variation in blood pressure changes between study participants who used drug X and those who used a placebo.

Since the units of measurement often differ across the included studies, standardization is necessary to create comparable effect size estimates. Standardization is accomplished by determining, for each study, the average score for the intervention group, subtracting the average score for the control group, and dividing the result by the relevant measure of variability in that dataset.

In some cases, the results of certain studies must carry more significance than others. Larger studies, as measured by their sample sizes, are deemed to produce more precise estimates of effect size than smaller studies. Additionally, studies with less variability in data, such as smaller standard deviation or narrower confidence intervals, are typically regarded as higher quality in study design. A weighting statistic that aims to incorporate both of these factors, known as inverse variance, is commonly employed.

Step 5: Absolute Effect Estimation

The ultimate step in conducting a meta-analysis is to choose and utilize an appropriate model for comparing Effect Sizes among diverse studies. Two popular models for this purpose are the Fixed Effects and Random Effects models. The Fixed Effects model relies on the premise that each study is evaluating a common treatment effect, implying that all studies would have estimated the same Effect Size if sample variability were equal across all studies.

Conversely, the Random Effects model posits that the true treatment effects in individual studies may vary from each other, and endeavors to consider this additional source of interstudy variation in Effect Sizes. The existence and magnitude of this latter variability is usually evaluated within the meta-analysis through a test for ‘heterogeneity.’

Forest Plot

The results of a meta-analysis are often visually presented using a “Forest Plot”. This type of plot displays, for each study, included in the analysis, a horizontal line that indicates the standardized Effect Size estimate and 95% confidence interval for the risk ratio used. Figure A provides an example of a hypothetical Forest Plot in which drug X reduces the risk of death in all three studies.

However, the first study was larger than the other two, and as a result, the estimates for the smaller studies were not statistically significant. This is indicated by the lines emanating from their boxes, including the value of 1. The size of the boxes represents the relative weights assigned to each study by the meta-analysis. The combined estimate of the drug’s effect, represented by the diamond, provides a more precise estimate of the drug’s effect, with the diamond indicating both the combined risk ratio estimate and the 95% confidence interval limits.

odds ratio

Figure-A: Hypothetical Forest Plot

Relevance to Practice and Research 

  Evidence Based Nursing commentaries often include recently published systematic reviews and meta-analyses, as they can provide new insights and strengthen recommendations for effective healthcare practices. Additionally, they can identify gaps or limitations in current evidence and guide future research directions.

The quality of the data available for synthesis is a critical factor in the strength of conclusions drawn from meta-analyses, and this is influenced by the quality of individual studies and the systematic review itself. However, meta-analysis cannot overcome issues related to underpowered or poorly designed studies.

Therefore, clinicians may still encounter situations where the evidence is weak or uncertain, and where higher-quality research is required to improve clinical decision-making. While such findings can be frustrating, they remain important for informing practice and highlighting the need for further research to fill gaps in the evidence base.

Methods and Assumptions in Meta-Analysis 

Ensuring the credibility of findings is imperative in all types of research, including meta-analyses. To validate the outcomes of a meta-analysis, the researcher must confirm that the research techniques used were accurate in measuring the intended variables. Typically, researchers establish the validity of a meta-analysis by testing the outcomes for homogeneity or the degree of similarity between the results of the combined studies.

Homogeneity is preferred in meta-analyses as it allows the data to be combined without needing adjustments to suit the study’s requirements. To determine homogeneity, researchers assess heterogeneity, the opposite of homogeneity. Two widely used statistical methods for evaluating heterogeneity in research results are Cochran’s-Q and I-Square, also known as I-2 Index.

Difference Between Meta-Analysis and Systematic Reviews

Meta-analysis and systematic reviews are both research methods used to synthesise evidence from multiple studies on a particular topic. However, there are some key differences between the two.

Systematic reviews involve a comprehensive and structured approach to identifying, selecting, and critically appraising all available evidence relevant to a specific research question. This process involves searching multiple databases, screening the identified studies for relevance and quality, and summarizing the findings in a narrative report.

Meta-analysis, on the other hand, involves using statistical methods to combine and analyze the data from multiple studies, with the aim of producing a quantitative summary of the overall effect size. Meta-analysis requires the studies to be similar enough in terms of their design, methodology, and outcome measures to allow for meaningful comparison and analysis.

Therefore, systematic reviews are broader in scope and summarize the findings of all studies on a topic, while meta-analyses are more focused on producing a quantitative estimate of the effect size of an intervention across multiple studies that meet certain criteria. In some cases, a systematic review may be conducted without a meta-analysis if the studies are too diverse or the quality of the data is not sufficient to allow for statistical pooling.

Software Packages For Meta-Analysis

Meta-analysis can be done through software packages, including free and paid options. One of the most commonly used software packages for meta-analysis is RevMan by the Cochrane Collaboration.

Assessing the Quality of Meta-Analysis 

Assessing the quality of a meta-analysis involves evaluating the methods used to conduct the analysis and the quality of the studies included. Here are some key factors to consider:

  • Study selection: The studies included in the meta-analysis should be relevant to the research question and meet predetermined criteria for quality.
  • Search strategy: The search strategy should be comprehensive and transparent, including databases and search terms used to identify relevant studies.
  • Study quality assessment: The quality of included studies should be assessed using appropriate tools, and this assessment should be reported in the meta-analysis.
  • Data extraction: The data extraction process should be systematic and clearly reported, including any discrepancies that arose.
  • Analysis methods: The meta-analysis should use appropriate statistical methods to combine the results of the included studies, and these methods should be transparently reported.
  • Publication bias: The potential for publication bias should be assessed and reported in the meta-analysis, including any efforts to identify and include unpublished studies.
  • Interpretation of results: The results should be interpreted in the context of the study limitations and the overall quality of the evidence.
  • Sensitivity analysis: Sensitivity analysis should be conducted to evaluate the impact of study quality, inclusion criteria, and other factors on the overall results.

Overall, a high-quality meta-analysis should be transparent in its methods and clearly report the included studies’ limitations and the evidence’s overall quality.

Hire an Expert Writer

Orders completed by our expert writers are

  • Formally drafted in an academic style
  • Free Amendments and 100% Plagiarism Free – or your money back!
  • 100% Confidential and Timely Delivery!
  • Free anti-plagiarism report
  • Appreciated by thousands of clients. Check client reviews

Hire an Expert Writer

Examples of Meta-Analysis

  • STANLEY T.D. et JARRELL S.B. (1989), « Meta-regression analysis : a quantitative method of literature surveys », Journal of Economics Surveys, vol. 3, n°2, pp. 161-170.
  • DATTA D.K., PINCHES G.E. et NARAYANAN V.K. (1992), « Factors influencing wealth creation from mergers and acquisitions : a meta-analysis », Strategic Management Journal, Vol. 13, pp. 67-84.
  • GLASS G. (1983), « Synthesising empirical research : Meta-analysis » in S.A. Ward and L.J. Reed (Eds), Knowledge structure and use : Implications for synthesis and interpretation, Philadelphia : Temple University Press.
  • WOLF F.M. (1986), Meta-analysis : Quantitative methods for research synthesis, Sage University Paper n°59.
  • HUNTER J.E., SCHMIDT F.L. et JACKSON G.B. (1982), « Meta-analysis : cumulating research findings across studies », Beverly Hills, CA : Sage.

Frequently Asked Questions

What is a meta-analysis in research.

Meta-analysis is a statistical method used to combine results from multiple studies on a specific topic. By pooling data from various sources, meta-analysis can provide a more precise estimate of the effect size of a treatment or intervention and identify areas for future research.

Why is meta-analysis important?

Meta-analysis is important because it combines and summarizes results from multiple studies to provide a more precise and reliable estimate of the effect of a treatment or intervention. This helps clinicians and policymakers make evidence-based decisions and identify areas for further research.

What is an example of a meta-analysis?

A meta-analysis of studies evaluating physical exercise’s effect on depression in adults is an example. Researchers gathered data from 49 studies involving a total of 2669 participants. The studies used different types of exercise and measures of depression, which made it difficult to compare the results.

Through meta-analysis, the researchers calculated an overall effect size and determined that exercise was associated with a statistically significant reduction in depression symptoms. The study also identified that moderate-intensity aerobic exercise, performed three to five times per week, was the most effective. The meta-analysis provided a more comprehensive understanding of the impact of exercise on depression than any single study could provide.

What is the definition of meta-analysis in clinical research?

Meta-analysis in clinical research is a statistical technique that combines data from multiple independent studies on a particular topic to generate a summary or “meta” estimate of the effect of a particular intervention or exposure.

This type of analysis allows researchers to synthesise the results of multiple studies, potentially increasing the statistical power and providing more precise estimates of treatment effects. Meta-analyses are commonly used in clinical research to evaluate the effectiveness and safety of medical interventions and to inform clinical practice guidelines.

Is meta-analysis qualitative or quantitative?

Meta-analysis is a quantitative method used to combine and analyze data from multiple studies. It involves the statistical synthesis of results from individual studies to obtain a pooled estimate of the effect size of a particular intervention or treatment. Therefore, meta-analysis is considered a quantitative approach to research synthesis.

You May Also Like

Descriptive research is carried out to describe current issues, programs, and provides information about the issue through surveys and various fact-finding methods.

A hypothesis is a research question that has to be proved correct or incorrect through hypothesis testing – a scientific approach to test a hypothesis.

Textual analysis is the method of analysing and understanding the text. We need to look carefully at the text to identify the writer’s context and message.

USEFUL LINKS

LEARNING RESOURCES

researchprospect-reviews-trust-site

COMPANY DETAILS

Research-Prospect-Writing-Service

  • How It Works
  • Open access
  • Published: 01 August 2019

A step by step guide for conducting a systematic review and meta-analysis with simulation data

  • Gehad Mohamed Tawfik 1 , 2 ,
  • Kadek Agus Surya Dila 2 , 3 ,
  • Muawia Yousif Fadlelmola Mohamed 2 , 4 ,
  • Dao Ngoc Hien Tam 2 , 5 ,
  • Nguyen Dang Kien 2 , 6 ,
  • Ali Mahmoud Ahmed 2 , 7 &
  • Nguyen Tien Huy 8 , 9 , 10  

Tropical Medicine and Health volume  47 , Article number:  46 ( 2019 ) Cite this article

820k Accesses

316 Citations

94 Altmetric

Metrics details

The massive abundance of studies relating to tropical medicine and health has increased strikingly over the last few decades. In the field of tropical medicine and health, a well-conducted systematic review and meta-analysis (SR/MA) is considered a feasible solution for keeping clinicians abreast of current evidence-based medicine. Understanding of SR/MA steps is of paramount importance for its conduction. It is not easy to be done as there are obstacles that could face the researcher. To solve those hindrances, this methodology study aimed to provide a step-by-step approach mainly for beginners and junior researchers, in the field of tropical medicine and other health care fields, on how to properly conduct a SR/MA, in which all the steps here depicts our experience and expertise combined with the already well-known and accepted international guidance.

We suggest that all steps of SR/MA should be done independently by 2–3 reviewers’ discussion, to ensure data quality and accuracy.

SR/MA steps include the development of research question, forming criteria, search strategy, searching databases, protocol registration, title, abstract, full-text screening, manual searching, extracting data, quality assessment, data checking, statistical analysis, double data checking, and manuscript writing.

Introduction

The amount of studies published in the biomedical literature, especially tropical medicine and health, has increased strikingly over the last few decades. This massive abundance of literature makes clinical medicine increasingly complex, and knowledge from various researches is often needed to inform a particular clinical decision. However, available studies are often heterogeneous with regard to their design, operational quality, and subjects under study and may handle the research question in a different way, which adds to the complexity of evidence and conclusion synthesis [ 1 ].

Systematic review and meta-analyses (SR/MAs) have a high level of evidence as represented by the evidence-based pyramid. Therefore, a well-conducted SR/MA is considered a feasible solution in keeping health clinicians ahead regarding contemporary evidence-based medicine.

Differing from a systematic review, unsystematic narrative review tends to be descriptive, in which the authors select frequently articles based on their point of view which leads to its poor quality. A systematic review, on the other hand, is defined as a review using a systematic method to summarize evidence on questions with a detailed and comprehensive plan of study. Furthermore, despite the increasing guidelines for effectively conducting a systematic review, we found that basic steps often start from framing question, then identifying relevant work which consists of criteria development and search for articles, appraise the quality of included studies, summarize the evidence, and interpret the results [ 2 , 3 ]. However, those simple steps are not easy to be reached in reality. There are many troubles that a researcher could be struggled with which has no detailed indication.

Conducting a SR/MA in tropical medicine and health may be difficult especially for young researchers; therefore, understanding of its essential steps is crucial. It is not easy to be done as there are obstacles that could face the researcher. To solve those hindrances, we recommend a flow diagram (Fig. 1 ) which illustrates a detailed and step-by-step the stages for SR/MA studies. This methodology study aimed to provide a step-by-step approach mainly for beginners and junior researchers, in the field of tropical medicine and other health care fields, on how to properly and succinctly conduct a SR/MA; all the steps here depicts our experience and expertise combined with the already well known and accepted international guidance.

figure 1

Detailed flow diagram guideline for systematic review and meta-analysis steps. Note : Star icon refers to “2–3 reviewers screen independently”

Methods and results

Detailed steps for conducting any systematic review and meta-analysis.

We searched the methods reported in published SR/MA in tropical medicine and other healthcare fields besides the published guidelines like Cochrane guidelines {Higgins, 2011 #7} [ 4 ] to collect the best low-bias method for each step of SR/MA conduction steps. Furthermore, we used guidelines that we apply in studies for all SR/MA steps. We combined these methods in order to conclude and conduct a detailed flow diagram that shows the SR/MA steps how being conducted.

Any SR/MA must follow the widely accepted Preferred Reporting Items for Systematic Review and Meta-analysis statement (PRISMA checklist 2009) (Additional file 5 : Table S1) [ 5 ].

We proposed our methods according to a valid explanatory simulation example choosing the topic of “evaluating safety of Ebola vaccine,” as it is known that Ebola is a very rare tropical disease but fatal. All the explained methods feature the standards followed internationally, with our compiled experience in the conduct of SR beside it, which we think proved some validity. This is a SR under conduct by a couple of researchers teaming in a research group, moreover, as the outbreak of Ebola which took place (2013–2016) in Africa resulted in a significant mortality and morbidity. Furthermore, since there are many published and ongoing trials assessing the safety of Ebola vaccines, we thought this would provide a great opportunity to tackle this hotly debated issue. Moreover, Ebola started to fire again and new fatal outbreak appeared in the Democratic Republic of Congo since August 2018, which caused infection to more than 1000 people according to the World Health Organization, and 629 people have been killed till now. Hence, it is considered the second worst Ebola outbreak, after the first one in West Africa in 2014 , which infected more than 26,000 and killed about 11,300 people along outbreak course.

Research question and objectives

Like other study designs, the research question of SR/MA should be feasible, interesting, novel, ethical, and relevant. Therefore, a clear, logical, and well-defined research question should be formulated. Usually, two common tools are used: PICO or SPIDER. PICO (Population, Intervention, Comparison, Outcome) is used mostly in quantitative evidence synthesis. Authors demonstrated that PICO holds more sensitivity than the more specific SPIDER approach [ 6 ]. SPIDER (Sample, Phenomenon of Interest, Design, Evaluation, Research type) was proposed as a method for qualitative and mixed methods search.

We here recommend a combined approach of using either one or both the SPIDER and PICO tools to retrieve a comprehensive search depending on time and resources limitations. When we apply this to our assumed research topic, being of qualitative nature, the use of SPIDER approach is more valid.

PICO is usually used for systematic review and meta-analysis of clinical trial study. For the observational study (without intervention or comparator), in many tropical and epidemiological questions, it is usually enough to use P (Patient) and O (outcome) only to formulate a research question. We must indicate clearly the population (P), then intervention (I) or exposure. Next, it is necessary to compare (C) the indicated intervention with other interventions, i.e., placebo. Finally, we need to clarify which are our relevant outcomes.

To facilitate comprehension, we choose the Ebola virus disease (EVD) as an example. Currently, the vaccine for EVD is being developed and under phase I, II, and III clinical trials; we want to know whether this vaccine is safe and can induce sufficient immunogenicity to the subjects.

An example of a research question for SR/MA based on PICO for this issue is as follows: How is the safety and immunogenicity of Ebola vaccine in human? (P: healthy subjects (human), I: vaccination, C: placebo, O: safety or adverse effects)

Preliminary research and idea validation

We recommend a preliminary search to identify relevant articles, ensure the validity of the proposed idea, avoid duplication of previously addressed questions, and assure that we have enough articles for conducting its analysis. Moreover, themes should focus on relevant and important health-care issues, consider global needs and values, reflect the current science, and be consistent with the adopted review methods. Gaining familiarity with a deep understanding of the study field through relevant videos and discussions is of paramount importance for better retrieval of results. If we ignore this step, our study could be canceled whenever we find out a similar study published before. This means we are wasting our time to deal with a problem that has been tackled for a long time.

To do this, we can start by doing a simple search in PubMed or Google Scholar with search terms Ebola AND vaccine. While doing this step, we identify a systematic review and meta-analysis of determinant factors influencing antibody response from vaccination of Ebola vaccine in non-human primate and human [ 7 ], which is a relevant paper to read to get a deeper insight and identify gaps for better formulation of our research question or purpose. We can still conduct systematic review and meta-analysis of Ebola vaccine because we evaluate safety as a different outcome and different population (only human).

Inclusion and exclusion criteria

Eligibility criteria are based on the PICO approach, study design, and date. Exclusion criteria mostly are unrelated, duplicated, unavailable full texts, or abstract-only papers. These exclusions should be stated in advance to refrain the researcher from bias. The inclusion criteria would be articles with the target patients, investigated interventions, or the comparison between two studied interventions. Briefly, it would be articles which contain information answering our research question. But the most important is that it should be clear and sufficient information, including positive or negative, to answer the question.

For the topic we have chosen, we can make inclusion criteria: (1) any clinical trial evaluating the safety of Ebola vaccine and (2) no restriction regarding country, patient age, race, gender, publication language, and date. Exclusion criteria are as follows: (1) study of Ebola vaccine in non-human subjects or in vitro studies; (2) study with data not reliably extracted, duplicate, or overlapping data; (3) abstract-only papers as preceding papers, conference, editorial, and author response theses and books; (4) articles without available full text available; and (5) case reports, case series, and systematic review studies. The PRISMA flow diagram template that is used in SR/MA studies can be found in Fig. 2 .

figure 2

PRISMA flow diagram of studies’ screening and selection

Search strategy

A standard search strategy is used in PubMed, then later it is modified according to each specific database to get the best relevant results. The basic search strategy is built based on the research question formulation (i.e., PICO or PICOS). Search strategies are constructed to include free-text terms (e.g., in the title and abstract) and any appropriate subject indexing (e.g., MeSH) expected to retrieve eligible studies, with the help of an expert in the review topic field or an information specialist. Additionally, we advise not to use terms for the Outcomes as their inclusion might hinder the database being searched to retrieve eligible studies because the used outcome is not mentioned obviously in the articles.

The improvement of the search term is made while doing a trial search and looking for another relevant term within each concept from retrieved papers. To search for a clinical trial, we can use these descriptors in PubMed: “clinical trial”[Publication Type] OR “clinical trials as topic”[MeSH terms] OR “clinical trial”[All Fields]. After some rounds of trial and refinement of search term, we formulate the final search term for PubMed as follows: (ebola OR ebola virus OR ebola virus disease OR EVD) AND (vaccine OR vaccination OR vaccinated OR immunization) AND (“clinical trial”[Publication Type] OR “clinical trials as topic”[MeSH Terms] OR “clinical trial”[All Fields]). Because the study for this topic is limited, we do not include outcome term (safety and immunogenicity) in the search term to capture more studies.

Search databases, import all results to a library, and exporting to an excel sheet

According to the AMSTAR guidelines, at least two databases have to be searched in the SR/MA [ 8 ], but as you increase the number of searched databases, you get much yield and more accurate and comprehensive results. The ordering of the databases depends mostly on the review questions; being in a study of clinical trials, you will rely mostly on Cochrane, mRCTs, or International Clinical Trials Registry Platform (ICTRP). Here, we propose 12 databases (PubMed, Scopus, Web of Science, EMBASE, GHL, VHL, Cochrane, Google Scholar, Clinical trials.gov , mRCTs, POPLINE, and SIGLE), which help to cover almost all published articles in tropical medicine and other health-related fields. Among those databases, POPLINE focuses on reproductive health. Researchers should consider to choose relevant database according to the research topic. Some databases do not support the use of Boolean or quotation; otherwise, there are some databases that have special searching way. Therefore, we need to modify the initial search terms for each database to get appreciated results; therefore, manipulation guides for each online database searches are presented in Additional file 5 : Table S2. The detailed search strategy for each database is found in Additional file 5 : Table S3. The search term that we created in PubMed needs customization based on a specific characteristic of the database. An example for Google Scholar advanced search for our topic is as follows:

With all of the words: ebola virus

With at least one of the words: vaccine vaccination vaccinated immunization

Where my words occur: in the title of the article

With all of the words: EVD

Finally, all records are collected into one Endnote library in order to delete duplicates and then to it export into an excel sheet. Using remove duplicating function with two options is mandatory. All references which have (1) the same title and author, and published in the same year, and (2) the same title and author, and published in the same journal, would be deleted. References remaining after this step should be exported to an excel file with essential information for screening. These could be the authors’ names, publication year, journal, DOI, URL link, and abstract.

Protocol writing and registration

Protocol registration at an early stage guarantees transparency in the research process and protects from duplication problems. Besides, it is considered a documented proof of team plan of action, research question, eligibility criteria, intervention/exposure, quality assessment, and pre-analysis plan. It is recommended that researchers send it to the principal investigator (PI) to revise it, then upload it to registry sites. There are many registry sites available for SR/MA like those proposed by Cochrane and Campbell collaborations; however, we recommend registering the protocol into PROSPERO as it is easier. The layout of a protocol template, according to PROSPERO, can be found in Additional file 5 : File S1.

Title and abstract screening

Decisions to select retrieved articles for further assessment are based on eligibility criteria, to minimize the chance of including non-relevant articles. According to the Cochrane guidance, two reviewers are a must to do this step, but as for beginners and junior researchers, this might be tiresome; thus, we propose based on our experience that at least three reviewers should work independently to reduce the chance of error, particularly in teams with a large number of authors to add more scrutiny and ensure proper conduct. Mostly, the quality with three reviewers would be better than two, as two only would have different opinions from each other, so they cannot decide, while the third opinion is crucial. And here are some examples of systematic reviews which we conducted following the same strategy (by a different group of researchers in our research group) and published successfully, and they feature relevant ideas to tropical medicine and disease [ 9 , 10 , 11 ].

In this step, duplications will be removed manually whenever the reviewers find them out. When there is a doubt about an article decision, the team should be inclusive rather than exclusive, until the main leader or PI makes a decision after discussion and consensus. All excluded records should be given exclusion reasons.

Full text downloading and screening

Many search engines provide links for free to access full-text articles. In case not found, we can search in some research websites as ResearchGate, which offer an option of direct full-text request from authors. Additionally, exploring archives of wanted journals, or contacting PI to purchase it if available. Similarly, 2–3 reviewers work independently to decide about included full texts according to eligibility criteria, with reporting exclusion reasons of articles. In case any disagreement has occurred, the final decision has to be made by discussion.

Manual search

One has to exhaust all possibilities to reduce bias by performing an explicit hand-searching for retrieval of reports that may have been dropped from first search [ 12 ]. We apply five methods to make manual searching: searching references from included studies/reviews, contacting authors and experts, and looking at related articles/cited articles in PubMed and Google Scholar.

We describe here three consecutive methods to increase and refine the yield of manual searching: firstly, searching reference lists of included articles; secondly, performing what is known as citation tracking in which the reviewers track all the articles that cite each one of the included articles, and this might involve electronic searching of databases; and thirdly, similar to the citation tracking, we follow all “related to” or “similar” articles. Each of the abovementioned methods can be performed by 2–3 independent reviewers, and all the possible relevant article must undergo further scrutiny against the inclusion criteria, after following the same records yielded from electronic databases, i.e., title/abstract and full-text screening.

We propose an independent reviewing by assigning each member of the teams a “tag” and a distinct method, to compile all the results at the end for comparison of differences and discussion and to maximize the retrieval and minimize the bias. Similarly, the number of included articles has to be stated before addition to the overall included records.

Data extraction and quality assessment

This step entitles data collection from included full-texts in a structured extraction excel sheet, which is previously pilot-tested for extraction using some random studies. We recommend extracting both adjusted and non-adjusted data because it gives the most allowed confounding factor to be used in the analysis by pooling them later [ 13 ]. The process of extraction should be executed by 2–3 independent reviewers. Mostly, the sheet is classified into the study and patient characteristics, outcomes, and quality assessment (QA) tool.

Data presented in graphs should be extracted by software tools such as Web plot digitizer [ 14 ]. Most of the equations that can be used in extraction prior to analysis and estimation of standard deviation (SD) from other variables is found inside Additional file 5 : File S2 with their references as Hozo et al. [ 15 ], Xiang et al. [ 16 ], and Rijkom et al. [ 17 ]. A variety of tools are available for the QA, depending on the design: ROB-2 Cochrane tool for randomized controlled trials [ 18 ] which is presented as Additional file 1 : Figure S1 and Additional file 2 : Figure S2—from a previous published article data—[ 19 ], NIH tool for observational and cross-sectional studies [ 20 ], ROBINS-I tool for non-randomize trials [ 21 ], QUADAS-2 tool for diagnostic studies, QUIPS tool for prognostic studies, CARE tool for case reports, and ToxRtool for in vivo and in vitro studies. We recommend that 2–3 reviewers independently assess the quality of the studies and add to the data extraction form before the inclusion into the analysis to reduce the risk of bias. In the NIH tool for observational studies—cohort and cross-sectional—as in this EBOLA case, to evaluate the risk of bias, reviewers should rate each of the 14 items into dichotomous variables: yes, no, or not applicable. An overall score is calculated by adding all the items scores as yes equals one, while no and NA equals zero. A score will be given for every paper to classify them as poor, fair, or good conducted studies, where a score from 0–5 was considered poor, 6–9 as fair, and 10–14 as good.

In the EBOLA case example above, authors can extract the following information: name of authors, country of patients, year of publication, study design (case report, cohort study, or clinical trial or RCT), sample size, the infected point of time after EBOLA infection, follow-up interval after vaccination time, efficacy, safety, adverse effects after vaccinations, and QA sheet (Additional file 6 : Data S1).

Data checking

Due to the expected human error and bias, we recommend a data checking step, in which every included article is compared with its counterpart in an extraction sheet by evidence photos, to detect mistakes in data. We advise assigning articles to 2–3 independent reviewers, ideally not the ones who performed the extraction of those articles. When resources are limited, each reviewer is assigned a different article than the one he extracted in the previous stage.

Statistical analysis

Investigators use different methods for combining and summarizing findings of included studies. Before analysis, there is an important step called cleaning of data in the extraction sheet, where the analyst organizes extraction sheet data in a form that can be read by analytical software. The analysis consists of 2 types namely qualitative and quantitative analysis. Qualitative analysis mostly describes data in SR studies, while quantitative analysis consists of two main types: MA and network meta-analysis (NMA). Subgroup, sensitivity, cumulative analyses, and meta-regression are appropriate for testing whether the results are consistent or not and investigating the effect of certain confounders on the outcome and finding the best predictors. Publication bias should be assessed to investigate the presence of missing studies which can affect the summary.

To illustrate basic meta-analysis, we provide an imaginary data for the research question about Ebola vaccine safety (in terms of adverse events, 14 days after injection) and immunogenicity (Ebola virus antibodies rise in geometric mean titer, 6 months after injection). Assuming that from searching and data extraction, we decided to do an analysis to evaluate Ebola vaccine “A” safety and immunogenicity. Other Ebola vaccines were not meta-analyzed because of the limited number of studies (instead, it will be included for narrative review). The imaginary data for vaccine safety meta-analysis can be accessed in Additional file 7 : Data S2. To do the meta-analysis, we can use free software, such as RevMan [ 22 ] or R package meta [ 23 ]. In this example, we will use the R package meta. The tutorial of meta package can be accessed through “General Package for Meta-Analysis” tutorial pdf [ 23 ]. The R codes and its guidance for meta-analysis done can be found in Additional file 5 : File S3.

For the analysis, we assume that the study is heterogenous in nature; therefore, we choose a random effect model. We did an analysis on the safety of Ebola vaccine A. From the data table, we can see some adverse events occurring after intramuscular injection of vaccine A to the subject of the study. Suppose that we include six studies that fulfill our inclusion criteria. We can do a meta-analysis for each of the adverse events extracted from the studies, for example, arthralgia, from the results of random effect meta-analysis using the R meta package.

From the results shown in Additional file 3 : Figure S3, we can see that the odds ratio (OR) of arthralgia is 1.06 (0.79; 1.42), p value = 0.71, which means that there is no association between the intramuscular injection of Ebola vaccine A and arthralgia, as the OR is almost one, and besides, the P value is insignificant as it is > 0.05.

In the meta-analysis, we can also visualize the results in a forest plot. It is shown in Fig. 3 an example of a forest plot from the simulated analysis.

figure 3

Random effect model forest plot for comparison of vaccine A versus placebo

From the forest plot, we can see six studies (A to F) and their respective OR (95% CI). The green box represents the effect size (in this case, OR) of each study. The bigger the box means the study weighted more (i.e., bigger sample size). The blue diamond shape represents the pooled OR of the six studies. We can see the blue diamond cross the vertical line OR = 1, which indicates no significance for the association as the diamond almost equalized in both sides. We can confirm this also from the 95% confidence interval that includes one and the p value > 0.05.

For heterogeneity, we see that I 2 = 0%, which means no heterogeneity is detected; the study is relatively homogenous (it is rare in the real study). To evaluate publication bias related to the meta-analysis of adverse events of arthralgia, we can use the metabias function from the R meta package (Additional file 4 : Figure S4) and visualization using a funnel plot. The results of publication bias are demonstrated in Fig. 4 . We see that the p value associated with this test is 0.74, indicating symmetry of the funnel plot. We can confirm it by looking at the funnel plot.

figure 4

Publication bias funnel plot for comparison of vaccine A versus placebo

Looking at the funnel plot, the number of studies at the left and right side of the funnel plot is the same; therefore, the plot is symmetry, indicating no publication bias detected.

Sensitivity analysis is a procedure used to discover how different values of an independent variable will influence the significance of a particular dependent variable by removing one study from MA. If all included study p values are < 0.05, hence, removing any study will not change the significant association. It is only performed when there is a significant association, so if the p value of MA done is 0.7—more than one—the sensitivity analysis is not needed for this case study example. If there are 2 studies with p value > 0.05, removing any of the two studies will result in a loss of the significance.

Double data checking

For more assurance on the quality of results, the analyzed data should be rechecked from full-text data by evidence photos, to allow an obvious check for the PI of the study.

Manuscript writing, revision, and submission to a journal

Writing based on four scientific sections: introduction, methods, results, and discussion, mostly with a conclusion. Performing a characteristic table for study and patient characteristics is a mandatory step which can be found as a template in Additional file 5 : Table S3.

After finishing the manuscript writing, characteristics table, and PRISMA flow diagram, the team should send it to the PI to revise it well and reply to his comments and, finally, choose a suitable journal for the manuscript which fits with considerable impact factor and fitting field. We need to pay attention by reading the author guidelines of journals before submitting the manuscript.

The role of evidence-based medicine in biomedical research is rapidly growing. SR/MAs are also increasing in the medical literature. This paper has sought to provide a comprehensive approach to enable reviewers to produce high-quality SR/MAs. We hope that readers could gain general knowledge about how to conduct a SR/MA and have the confidence to perform one, although this kind of study requires complex steps compared to narrative reviews.

Having the basic steps for conduction of MA, there are many advanced steps that are applied for certain specific purposes. One of these steps is meta-regression which is performed to investigate the association of any confounder and the results of the MA. Furthermore, there are other types rather than the standard MA like NMA and MA. In NMA, we investigate the difference between several comparisons when there were not enough data to enable standard meta-analysis. It uses both direct and indirect comparisons to conclude what is the best between the competitors. On the other hand, mega MA or MA of patients tend to summarize the results of independent studies by using its individual subject data. As a more detailed analysis can be done, it is useful in conducting repeated measure analysis and time-to-event analysis. Moreover, it can perform analysis of variance and multiple regression analysis; however, it requires homogenous dataset and it is time-consuming in conduct [ 24 ].

Conclusions

Systematic review/meta-analysis steps include development of research question and its validation, forming criteria, search strategy, searching databases, importing all results to a library and exporting to an excel sheet, protocol writing and registration, title and abstract screening, full-text screening, manual searching, extracting data and assessing its quality, data checking, conducting statistical analysis, double data checking, manuscript writing, revising, and submitting to a journal.

Availability of data and materials

Not applicable.

Abbreviations

Network meta-analysis

Principal investigator

Population, Intervention, Comparison, Outcome

Preferred Reporting Items for Systematic Review and Meta-analysis statement

Quality assessment

Sample, Phenomenon of Interest, Design, Evaluation, Research type

Systematic review and meta-analyses

Bello A, Wiebe N, Garg A, Tonelli M. Evidence-based decision-making 2: systematic reviews and meta-analysis. Methods Mol Biol (Clifton, NJ). 2015;1281:397–416.

Article   Google Scholar  

Khan KS, Kunz R, Kleijnen J, Antes G. Five steps to conducting a systematic review. J R Soc Med. 2003;96(3):118–21.

Rys P, Wladysiuk M, Skrzekowska-Baran I, Malecki MT. Review articles, systematic reviews and meta-analyses: which can be trusted? Polskie Archiwum Medycyny Wewnetrznej. 2009;119(3):148–56.

PubMed   Google Scholar  

Higgins JPT, Green S. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. 2011.

Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ. 2009;339:b2535.

Methley AM, Campbell S, Chew-Graham C, McNally R, Cheraghi-Sohi S. PICO, PICOS and SPIDER: a comparison study of specificity and sensitivity in three search tools for qualitative systematic reviews. BMC Health Serv Res. 2014;14:579.

Gross L, Lhomme E, Pasin C, Richert L, Thiebaut R. Ebola vaccine development: systematic review of pre-clinical and clinical studies, and meta-analysis of determinants of antibody response variability after vaccination. Int J Infect Dis. 2018;74:83–96.

Article   CAS   Google Scholar  

Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, ... Henry DA. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ. 2017;358:j4008.

Giang HTN, Banno K, Minh LHN, Trinh LT, Loc LT, Eltobgy A, et al. Dengue hemophagocytic syndrome: a systematic review and meta-analysis on epidemiology, clinical signs, outcomes, and risk factors. Rev Med Virol. 2018;28(6):e2005.

Morra ME, Altibi AMA, Iqtadar S, Minh LHN, Elawady SS, Hallab A, et al. Definitions for warning signs and signs of severe dengue according to the WHO 2009 classification: systematic review of literature. Rev Med Virol. 2018;28(4):e1979.

Morra ME, Van Thanh L, Kamel MG, Ghazy AA, Altibi AMA, Dat LM, et al. Clinical outcomes of current medical approaches for Middle East respiratory syndrome: a systematic review and meta-analysis. Rev Med Virol. 2018;28(3):e1977.

Vassar M, Atakpo P, Kash MJ. Manual search approaches used by systematic reviewers in dermatology. Journal of the Medical Library Association: JMLA. 2016;104(4):302.

Naunheim MR, Remenschneider AK, Scangas GA, Bunting GW, Deschler DG. The effect of initial tracheoesophageal voice prosthesis size on postoperative complications and voice outcomes. Ann Otol Rhinol Laryngol. 2016;125(6):478–84.

Rohatgi AJaiWa. Web Plot Digitizer. ht tp. 2014;2.

Hozo SP, Djulbegovic B, Hozo I. Estimating the mean and variance from the median, range, and the size of a sample. BMC Med Res Methodol. 2005;5(1):13.

Wan X, Wang W, Liu J, Tong T. Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC Med Res Methodol. 2014;14(1):135.

Van Rijkom HM, Truin GJ, Van’t Hof MA. A meta-analysis of clinical studies on the caries-inhibiting effect of fluoride gel treatment. Carries Res. 1998;32(2):83–92.

Higgins JP, Altman DG, Gotzsche PC, Juni P, Moher D, Oxman AD, et al. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ. 2011;343:d5928.

Tawfik GM, Tieu TM, Ghozy S, Makram OM, Samuel P, Abdelaal A, et al. Speech efficacy, safety and factors affecting lifetime of voice prostheses in patients with laryngeal cancer: a systematic review and network meta-analysis of randomized controlled trials. J Clin Oncol. 2018;36(15_suppl):e18031-e.

Wannemuehler TJ, Lobo BC, Johnson JD, Deig CR, Ting JY, Gregory RL. Vibratory stimulus reduces in vitro biofilm formation on tracheoesophageal voice prostheses. Laryngoscope. 2016;126(12):2752–7.

Sterne JAC, Hernán MA, Reeves BC, Savović J, Berkman ND, Viswanathan M, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ. 2016;355.

RevMan The Cochrane Collaboration %J Copenhagen TNCCTCC. Review Manager (RevMan). 5.0. 2008.

Schwarzer GJRn. meta: An R package for meta-analysis. 2007;7(3):40-45.

Google Scholar  

Simms LLH. Meta-analysis versus mega-analysis: is there a difference? Oral budesonide for the maintenance of remission in Crohn’s disease: Faculty of Graduate Studies, University of Western Ontario; 1998.

Download references

Acknowledgements

This study was conducted (in part) at the Joint Usage/Research Center on Tropical Disease, Institute of Tropical Medicine, Nagasaki University, Japan.

Author information

Authors and affiliations.

Faculty of Medicine, Ain Shams University, Cairo, Egypt

Gehad Mohamed Tawfik

Online research Club http://www.onlineresearchclub.org/

Gehad Mohamed Tawfik, Kadek Agus Surya Dila, Muawia Yousif Fadlelmola Mohamed, Dao Ngoc Hien Tam, Nguyen Dang Kien & Ali Mahmoud Ahmed

Pratama Giri Emas Hospital, Singaraja-Amlapura street, Giri Emas village, Sawan subdistrict, Singaraja City, Buleleng, Bali, 81171, Indonesia

Kadek Agus Surya Dila

Faculty of Medicine, University of Khartoum, Khartoum, Sudan

Muawia Yousif Fadlelmola Mohamed

Nanogen Pharmaceutical Biotechnology Joint Stock Company, Ho Chi Minh City, Vietnam

Dao Ngoc Hien Tam

Department of Obstetrics and Gynecology, Thai Binh University of Medicine and Pharmacy, Thai Binh, Vietnam

Nguyen Dang Kien

Faculty of Medicine, Al-Azhar University, Cairo, Egypt

Ali Mahmoud Ahmed

Evidence Based Medicine Research Group & Faculty of Applied Sciences, Ton Duc Thang University, Ho Chi Minh City, 70000, Vietnam

Nguyen Tien Huy

Faculty of Applied Sciences, Ton Duc Thang University, Ho Chi Minh City, 70000, Vietnam

Department of Clinical Product Development, Institute of Tropical Medicine (NEKKEN), Leading Graduate School Program, and Graduate School of Biomedical Sciences, Nagasaki University, 1-12-4 Sakamoto, Nagasaki, 852-8523, Japan

You can also search for this author in PubMed   Google Scholar

Contributions

NTH and GMT were responsible for the idea and its design. The figure was done by GMT. All authors contributed to the manuscript writing and approval of the final version.

Corresponding author

Correspondence to Nguyen Tien Huy .

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1:.

Figure S1. Risk of bias assessment graph of included randomized controlled trials. (TIF 20 kb)

Additional file 2:

Figure S2. Risk of bias assessment summary. (TIF 69 kb)

Additional file 3:

Figure S3. Arthralgia results of random effect meta-analysis using R meta package. (TIF 20 kb)

Additional file 4:

Figure S4. Arthralgia linear regression test of funnel plot asymmetry using R meta package. (TIF 13 kb)

Additional file 5:

Table S1. PRISMA 2009 Checklist. Table S2. Manipulation guides for online database searches. Table S3. Detailed search strategy for twelve database searches. Table S4. Baseline characteristics of the patients in the included studies. File S1. PROSPERO protocol template file. File S2. Extraction equations that can be used prior to analysis to get missed variables. File S3. R codes and its guidance for meta-analysis done for comparison between EBOLA vaccine A and placebo. (DOCX 49 kb)

Additional file 6:

Data S1. Extraction and quality assessment data sheets for EBOLA case example. (XLSX 1368 kb)

Additional file 7:

Data S2. Imaginary data for EBOLA case example. (XLSX 10 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Cite this article.

Tawfik, G.M., Dila, K.A.S., Mohamed, M.Y.F. et al. A step by step guide for conducting a systematic review and meta-analysis with simulation data. Trop Med Health 47 , 46 (2019). https://doi.org/10.1186/s41182-019-0165-6

Download citation

Received : 30 January 2019

Accepted : 24 May 2019

Published : 01 August 2019

DOI : https://doi.org/10.1186/s41182-019-0165-6

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

Tropical Medicine and Health

ISSN: 1349-4147

  • Submission enquiries: Access here and click Contact Us
  • General enquiries: [email protected]

meta analysis research questions

Systematic Reviews and Meta Analysis

  • Getting Started
  • Guides and Standards
  • Review Protocols
  • Databases and Sources
  • Randomized Controlled Trials
  • Controlled Clinical Trials
  • Observational Designs
  • Tests of Diagnostic Accuracy
  • Software and Tools
  • Where do I get all those articles?
  • Collaborations
  • EPI 233/528
  • Countway Mediated Search
  • Risk of Bias (RoB)

Systematic review Q & A

What is a systematic review.

A systematic review is guided filtering and synthesis of all available evidence addressing a specific, focused research question, generally about a specific intervention or exposure. The use of standardized, systematic methods and pre-selected eligibility criteria reduce the risk of bias in identifying, selecting and analyzing relevant studies. A well-designed systematic review includes clear objectives, pre-selected criteria for identifying eligible studies, an explicit methodology, a thorough and reproducible search of the literature, an assessment of the validity or risk of bias of each included study, and a systematic synthesis, analysis and presentation of the findings of the included studies. A systematic review may include a meta-analysis.

For details about carrying out systematic reviews, see the Guides and Standards section of this guide.

Is my research topic appropriate for systematic review methods?

A systematic review is best deployed to test a specific hypothesis about a healthcare or public health intervention or exposure. By focusing on a single intervention or a few specific interventions for a particular condition, the investigator can ensure a manageable results set. Moreover, examining a single or small set of related interventions, exposures, or outcomes, will simplify the assessment of studies and the synthesis of the findings.

Systematic reviews are poor tools for hypothesis generation: for instance, to determine what interventions have been used to increase the awareness and acceptability of a vaccine or to investigate the ways that predictive analytics have been used in health care management. In the first case, we don't know what interventions to search for and so have to screen all the articles about awareness and acceptability. In the second, there is no agreed on set of methods that make up predictive analytics, and health care management is far too broad. The search will necessarily be incomplete, vague and very large all at the same time. In most cases, reviews without clearly and exactly specified populations, interventions, exposures, and outcomes will produce results sets that quickly outstrip the resources of a small team and offer no consistent way to assess and synthesize findings from the studies that are identified.

If not a systematic review, then what?

You might consider performing a scoping review . This framework allows iterative searching over a reduced number of data sources and no requirement to assess individual studies for risk of bias. The framework includes built-in mechanisms to adjust the analysis as the work progresses and more is learned about the topic. A scoping review won't help you limit the number of records you'll need to screen (broad questions lead to large results sets) but may give you means of dealing with a large set of results.

This tool can help you decide what kind of review is right for your question.

Can my student complete a systematic review during her summer project?

Probably not. Systematic reviews are a lot of work. Including creating the protocol, building and running a quality search, collecting all the papers, evaluating the studies that meet the inclusion criteria and extracting and analyzing the summary data, a well done review can require dozens to hundreds of hours of work that can span several months. Moreover, a systematic review requires subject expertise, statistical support and a librarian to help design and run the search. Be aware that librarians sometimes have queues for their search time. It may take several weeks to complete and run a search. Moreover, all guidelines for carrying out systematic reviews recommend that at least two subject experts screen the studies identified in the search. The first round of screening can consume 1 hour per screener for every 100-200 records. A systematic review is a labor-intensive team effort.

How can I know if my topic has been been reviewed already?

Before starting out on a systematic review, check to see if someone has done it already. In PubMed you can use the systematic review subset to limit to a broad group of papers that is enriched for systematic reviews. You can invoke the subset by selecting if from the Article Types filters to the left of your PubMed results, or you can append AND systematic[sb] to your search. For example:

"neoadjuvant chemotherapy" AND systematic[sb]

The systematic review subset is very noisy, however. To quickly focus on systematic reviews (knowing that you may be missing some), simply search for the word systematic in the title:

"neoadjuvant chemotherapy" AND systematic[ti]

Any PRISMA-compliant systematic review will be captured by this method since including the words "systematic review" in the title is a requirement of the PRISMA checklist. Cochrane systematic reviews do not include 'systematic' in the title, however. It's worth checking the Cochrane Database of Systematic Reviews independently.

You can also search for protocols that will indicate that another group has set out on a similar project. Many investigators will register their protocols in PROSPERO , a registry of review protocols. Other published protocols as well as Cochrane Review protocols appear in the Cochrane Methodology Register, a part of the Cochrane Library .

  • Next: Guides and Standards >>
  • Last Updated: Feb 26, 2024 3:17 PM
  • URL: https://guides.library.harvard.edu/meta-analysis
         


10 Shattuck St, Boston MA 02115 | (617) 432-2136

| |
Copyright © 2020 President and Fellows of Harvard College. All rights reserved.

  • Open access
  • Published: 03 March 2017

Meta-evaluation of meta-analysis: ten appraisal questions for biologists

  • Shinichi Nakagawa 1 , 2 ,
  • Daniel W. A. Noble 1 ,
  • Alistair M. Senior 3 , 4 &
  • Malgorzata Lagisz 1  

BMC Biology volume  15 , Article number:  18 ( 2017 ) Cite this article

43k Accesses

329 Citations

97 Altmetric

Metrics details

Meta-analysis is a statistical procedure for analyzing the combined data from different studies, and can be a major source of concise up-to-date information. The overall conclusions of a meta-analysis, however, depend heavily on the quality of the meta-analytic process, and an appropriate evaluation of the quality of meta-analysis (meta-evaluation) can be challenging. We outline ten questions biologists can ask to critically appraise a meta-analysis. These questions could also act as simple and accessible guidelines for the authors of meta-analyses. We focus on meta-analyses using non-human species, which we term ‘biological’ meta-analysis. Our ten questions are aimed at enabling a biologist to evaluate whether a biological meta-analysis embodies ‘mega-enlightenment’, a ‘mega-mistake’, or something in between.

Meta-analyses can be important and informative, but are they all?

Last year saw 40 years since the coining of the term ‘meta-analysis’ by Gene Glass in 1976 [ 1 , 2 ]. Meta-analyses, in which data from multiple studies are combined to evaluate an overall effect, or effect size, were first introduced to the medical and social sciences, where humans are the main species of interest [ 3 , 4 , 5 ]. Decades later, meta-analysis has infiltrated different areas of biological sciences [ 6 ], including ecology, evolutionary biology, conservation biology, and physiology. Here non-human species, or even ecosystems, are the main focus [ 7 , 8 , 9 , 10 , 11 , 12 ]. Despite this somewhat later arrival, interest in meta-analysis has been rapidly increasing in biological sciences. We have argued that the remarkable surge in interest over the last several years may indicate that meta-analysis is superseding traditional (narrative) reviews as a more objective and informative way of summarizing biological topics [ 8 ].

It is likely that the majority of us (biologists) have never conducted a meta-analysis. Chances are, however, that almost all of us have read at least one. Meta-analysis can not only provide quantitative information (such as overall effects and consistency among studies), but also qualitative information (such as dominant research trends and current knowledge gaps). In contrast to that of many medical and social scientists [ 3 , 5 ], the training of a biologist does not typically include meta-analysis [ 13 ] and, consequently, it may be difficult for a biologist to evaluate and interpret a meta-analysis. As with original research studies, the quality of meta-analyses vary immensely. For example, recent reviews have revealed that many meta-analyses in ecology and evolution miss, or perform poorly, several critical steps that are routinely implemented in the medical and social sciences [ 14 , 15 ] (but also see [ 16 , 17 ]).

The aim of this review is to provide ten appraisal questions that one should ask when reading a meta-analysis (cf., [ 18 , 19 ]), although these questions could also be used as simple and accessible guidelines for researchers conducting meta-analyses. In this review, we only deal with ‘narrow sense’ or ‘formal’ meta-analyses, where a statistical model is used to combine common effect sizes across studies, and the model takes into account sampling error, which is a function of sample size upon which each effect size is based (more details below; for discussions on the definitions of meta-analysis, see [ 15 , 20 , 21 ]). Further, our emphasis is on ‘biological’ meta-analyses, which deal with non-human species, including model organisms (nematodes, fruit flies, mice, and rats [ 22 ]) and non-model organisms, multiple species, or even entire ecosystems. For medical and social science meta-analyses concerning human subjects, large bodies of literature and excellent guidelines already exist, especially from overseeing organizations such as the Cochrane (Collaboration) and the Campbell Collaboration. We refer to the literature and the practices from these ‘experienced’ disciplines where appropriate. An overview and roadmap of this review is presented in Fig.  1 . Clearly, we cannot cover all details, but we cite key references in each section so that interested readers can follow up.

Mapping the process (on the left ) and main evaluation questions (on the right ) for meta-analysis. References to the relevant figures (Figs.  2 , 3 , 4 , 5 and 6 ) are included in the blue ovals

Q1: Is the search systematic and transparently documented?

When we read a biological meta-analysis, it used to be (and probably still is) common to see a statement like “a comprehensive search of the literature was conducted” without mention of the date and type of databases the authors searched. Documentation on keyword strings and inclusion criteria is often also very poor, making replication of search outcomes difficult or impossible. Superficial documentation also makes it hard to tell whether the search really was comprehensive, and, more importantly, systematic.

A comprehensive search attempts to identify (almost) all relevant studies/data for a given meta-analysis, and would thus not only include multiple major databases for finding published studies, but also make use of various lesser-known databases to locate reports and unpublished studies. Despite the common belief that search results should be similar among major databases, overlaps can sometimes be only moderate. For example, overlap in search results between Web of Science and Scopus (two of the most popular academic databases) is only 40–50% in many major fields [ 23 ]. As well as reading that a search is comprehensive, it is not uncommon to read that a search was systematic. A systematic search needs to follow a set of pre-determined protocols aimed at minimizing bias in the resulting data set. For example, a search of a single database, with pre-defined focal questions, search strings, and inclusion/exclusion criteria, can be considered systematic, negating some bias, though not necessarily being comprehensive. It is notable that a comprehensive search is preferable but not necessary (and often very difficult to do) whereas a systematic search is a must [ 24 ].

For most meta-analyses in medicine and social sciences, the search steps are systematic and well documented for reproducibility. This is because these studies follow a protocol named the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement [ 25 , 26 ]; note that a meta-analysis should usually be a part of a systematic review, although a systematic review may or may not include meta-analysis. The PRISMA statement facilitates transparency in reporting meta-analytic studies. Although it was developed for health sciences, we believe that the details of the four key elements of the PRISMA flow diagram (‘identification’, ‘screening’, ‘eligibility’, and ‘included’) should also be reported in a biological meta-analysis [ 8 ]. Figure  2 shows: A) the key ideas of the PRISMA statement, which the reader should compare with the content of a biological meta-analysis; and B) an example of a PRISMA diagram, which should be included as part of meta-analysis documentation. The bottom line is that one should assess whether search and screening procedures are reproducible and systematic (if not comprehensive; to minimize potential bias), given what is described in the meta-analytic paper [ 27 , 28 ].

Preferred Reporting Items for Systematic Reviews and Meta-Analyses. (PRISMA). a The main components of a systematic review or meta-analysis. The data search (identification) stage should, ideally, be preceded by the development of a detailed study protocol and its preregistration. Searching at least two literature databases, along with other sources of published and unpublished studies (using backward and forward citations, reviews, field experts, own data, grey and non-English literature) is recommended. It is also necessary to report search dates and exact keyword strings. The screening and eligibility stage should be based on a set of predefined study inclusion and exclusion criteria. Criteria might differ for the initial screening (title, abstract) compared with the full-text screening, but both need to be reported in detail. It is good practice to have at least two people involved in screening, with a plan in place for disagreement resolution and calculating disagreement rates. It is recommended that the list of studies excluded at the full-text screening stage, with reasons for their exclusion, is reported. It is also necessary to include a full list of studies included in the final dataset, with their basic characteristics. The extraction and coding (included) stage may also be performed by at least two people (as is recommended in medical meta-analysis). The authors should record the figures, tables, or text fragments within each paper from which the data were extracted, as well as report intermediate calculations, transformations, simplifications, and assumptions made during data extraction. These details make tracing mistakes easier and improve reproducibility. Documentation should include: a summary of the dataset, information on data and study details requested from authors, details of software used, and code for analyses (if applicable). b It is now becoming compulsory to present a PRISMA diagram, which records the flow of information starting from the data search and leading to the final data set. WoS Web of Science

Q2: What question and what effect size?

A meta-analysis should not just be descriptive. The best meta-analyses ask questions or test hypotheses, as is the case with original research. The meta-analytic questions and hypotheses addressed will generally determine the types of effect size statistics the authors use [ 29 , 30 , 31 , 32 ], as we explain below. Three broad groups of effect size statistics are based on are: 1) the difference between the means of two groups (for example, control versus treatment); 2) the relationship, or correlation, between two variables; and 3) the incidence of two outcomes (for example, dead or alive) in two groups (often represented in a 2 by 2 contingency table); see [ 3 , 7 ] for comprehensive lists of effect size statistics. Corresponding common effect size statistics are: 1) standardized mean difference (SMD; often referred to as d , Cohen’s d , Hedges’ d or Hedges’ g ) and the natural logarithm (log) of the response ratio (denoted as either ln R or ln RR [ 33 ]); 2) Fisher’s z -transformed correlation coefficient (often denoted as Zr ); and 3) the natural logarithm of the odds ratio (ln OR ) and relative risk (ln RR ; not to be confused with the response ratio).

We have also used and developed methods associated with less common effect size statistics such as log hazard ratio (ln HR ) for comparing survival curves [ 34 , 35 , 36 , 37 ], and also the log coefficient of variation ratio (ln CVR ) for comparing differences between the variances, rather than means, of two groups [ 38 , 39 , 40 ]. It is important to assess whether a study used an appropriate effect size statistic for the focal question. For example, when the authors are interested in the effect of a certain treatment, they should typically use SMD or response ratio, rather than Zr . Most biological meta-analyses will use one of the standardized effect sizes mentioned above. These effect sizes are referred to as standardized because they are unit-less (dimension-less), and thus are comparable across studies, even if those studies use different units for reporting (for example, size can be measured by weight [g] or length [cm]). However, unstandardized effect sizes (raw mean difference or regression coefficients) can be used, as happens in medical and social sciences, when all studies use common and directly comparable units (for example, blood pressure [mmHg]).

That being said, a biological meta-analysis will often bring together original studies of different types (such as combinations of experimental and observational studies). As a general rule, SMD is considered a better fit for experimental studies, whereas Zr is better for observational (correlational) studies. In some cases different effect sizes might be calculated for different studies in a meta-analysis and then be converted to a common type prior to analysis: for example, Zr and SMD (and also ln OR ) are inter-convertible. Thus, if we were, for example, interested in the effect of temperature on growth, we could combine results from experimental studies that compare mean growth at two temperatures (SMD) with results from observational studies that compare growth across a temperature gradient ( Zr ) in a single meta-analysis by transforming SMD from experimental studies to Zr [ 29 , 30 , 31 , 32 ].

Q3: Is non-independence taken into account?

Statistical non-independence occurs when data points (in this case, effect sizes) are somewhat related to each other. For example, multiple effect sizes may be taken from a single study, making such effect sizes correlated. Failing to account for non-independence among effect sizes (or data points) can lead to erroneous conclusions [ 14 , 42 , 43 ,, 41 – 44 ]—typically, an invalid conclusion of statistical significance (type I error; also see Q7). Many authors do not correct for non-independence (see [ 15 ]). There are two main reasons for this: the authors may be unaware of non-independence among effect sizes or they may have difficulty in appropriately accounting for the correlated structure despite being aware of the problem.

To help the reader to detect non-independence where the authors have failed to take it into account, we have illustrated four common types of dependent effect sizes in Fig.  3 , with the legend including a biological example for each type. Phylogenetic relatedness (Fig.  3d ) is unique to biological meta-analyses that include multiple species [ 14 , 42 , 45 ]. Correction for phylogenetic non-independence can now be implemented in several mainstream software packages, including metafor [ 46 ].

Common sources of non-independence in biological meta-analyses. a – d Hypothetical examples of the four most common scenarios of non-independence ( a - d ). Orange lines and arrows indicate correlations between effect sizes. Effect size estimate ( gray boxes , ‘ ES ’) is the ratio of (or difference between) the means of two groups (control versus treatment). Scenarios a , b , and d may apply to other types of effect sizes (e.g., correlation), while scenario c is unique to situations where two or more groups are compared to one control group. a Multiple effect sizes can be calculated from a single study. Effect sizes in study 3 are not independent of each other because effects (ES3 and ES4) are derived from two experiments using samples from the same population. For example, a study exposed females and males to increased temperatures, and the results are reported separately for the two sexes. b Effect sizes taken from the same study (study 3) are derived from different traits measured from the same subjects, resulting in correlations among these effect sizes. For example, body mass and body length are both indicators of body size, with studies 1 and 2 reporting just one of these measurements and study 3 reporting both for the same group of individuals. c Effect sizes can be correlated via contrast with a common ‘control’ group of individuals; for example, both effect sizes from study 3 share a common control treatment. A study may, for example, compare a balanced diet (control) with two levels of a protein-enriched diet. d In a multi-species study effect sizes can be correlated when they are based on data from organisms from the same taxonomic unit, due to evolutionary history. Effect sizes taken from studies 3 and 4 are not independent, because these studies were performed on the same species ( Sp.3 ). Additionally, all species share a phylogenetic history, and thus all effect sizes can be correlated with one another in accordance with time since evolutionary divergence between species

Where non-independence goes uncorrected because of the difficulty of appropriately accounting for the correlated structure, it is usually because the non-independence is incompatible with the two traditional meta-analytic models (the fixed-effect and the random-effects models—see Q4) that are implemented in widely used software (for example, Metawin [ 47 ]). Therefore, it was (and still is) common to see averaging of non-independent effect sizes or the selection of one among several related effect sizes. These solutions are not necessarily incorrect (see [ 48 ]), but may be limiting, and clearly lead to a loss of information [ 14 , 49 ]. The reader should be aware that it is preferable to model non-independence directly by using multilevel meta-analytic models (see Q4) if the dataset contains a sufficient number of studies (complex models usually require a large sample size) [ 14 ].

Q4: Which meta-analytic model?

There are three main kinds of meta-analytic models, which differ in their assumptions about the data being analyzed, but for all three the common and primary goal is to estimate an overall effect (but see Q5). These models are: i) fixed-effect models (also referred to as common-effect models [ 31 ]); ii) random-effects models [ 50 ]; and iii) multilevel (hierarchical) models [ 14 , 49 ]. We have depicted these three kinds of models in Fig.  4 . When assessing a meta-analysis, the reader should be aware of the different assumptions each model makes. For the fixed-effect (Fig.  4a ) and random-effects (Fig.  4b ) models, all effect sizes are assumed to be independent (that is, one effect per study, with no other sources of non-independence; see Q3). The other major assumption of a fixed-effect model is that all effect sizes share a common mean, and thus that variation among data is solely attributable to sampling error (that is, the sampling variance, v i , which is related to the sample size for each effect size; Fig.  4a ). This assumption, however, is unrealistic for most biological meta-analyses (see [ 22 ]), especially those involving multiple populations, species, and/or ecosystems [ 14 , 51 ]. The use of a fixed-effect model could be justified where the effect sizes are obtained from the same species or population (assuming one effect per study and that the effect sizes are independent of each other). Random-effects models relax the assumption that all studies are based on samples from the same underlying population, meaning that these models can be used when different studies are likely to quantify different underlying mean effects (for example, one study design yields a different effect than another), as is likely to be the case for a biological meta-analysis (Fig.  4b ). A random-effects model needs to quantify the between-study variance, τ 2 , and to estimate this variance correctly requires a sample size of perhaps over ten effect sizes. Thus, random-effects models may not be appropriate for a meta-analysis with very few effect sizes, and fixed-effect models may be appropriate in such situations (bearing in mind the aforementioned assumptions). Multilevel models relax the assumptions of independence made by fixed-effect and random-effects models; that is, for example, these models allow for multiple effect sizes to come from the same study, which may be the case if one study contains several different experimental treatments, or the same experimental treatment is applied across species within one study. The simplest multilevel model depicted in Fig.  4c includes study effects, but it is probably not difficult to imagine this multilevel approach being extended to incorporate more ‘levels’, such as species effects, as well (for more details see [ 13 , 52 , 53 ,, 14 , 41 , 45 , 49 , 51 – 54 ]; incorporating the types of non-independence described in Fig.  3b–d requires modeling of correlation and covariance matrices).

Visualizations of the three main types of meta-analytic models and their assumptions. a The fixed-effect model can be written as y i  =  b 0  +  e i , where y i is the observed effect for the i th study ( i  = 1… k ; orange circles ), b 0 is the overall effect (overall mean; thick grey line and black diamond ) for all k studies and e i is the deviation from b 0 for the i th study ( dashed orange lines ), and e i is distributed with the sampling variance ν i ( orange curves ); note that this variance is sometimes called within-study variance in the literature, but we reserve this term for the multilevel model below. b The random-effects model can be written as y i  =  b 0  +  s i  +  e i , where b 0 is the overall mean for different studies, each of which has a different study-specific mean ( green squares and green solid lines ), deviating by s i ( green dashed lines ) from b 0 , s i is distributed with a variance of τ 2 (the between-study variance; green curves ); note that this is the conventional notation for the between-study variance, but in a biological meta-analysis, it can be referred to as, say, σ 2 [study] . The other notation is as above. Displayed on the top-right is the formula for the heterogeneity statistic, I 2 for the random-effects model, where \( \overline{v} \) is a typical sampling variance (perhaps, most easily conceptualized as the average value of sampling variances, ν i ). c The simplest multilevel model can be written as y ij  =  b 0  +  s i  +  u ij  +  e ij , where u ij is the deviation from s i for j th effect size for the i th study ( blue triangles and dashed blue lines ) and is distributed with the variance of σ 2 (the within-study variance or it may be denoted as σ 2 [effect size] ; blue curves ), e ij is the deviation from u ij , and the other notations are the same as above. Each of k studies has m effect sizes ( j  = 1… m ). Displayed on the top-right is the multilevel meta-analysis formula for the heterogeneity statistic, I 2 , where both the numerator and denominator include the within-study variance, σ 2 , in addition to what appears in the formula for the random-effects model

It is important for you, as the reader, to check whether the authors, given their data, employed an appropriate model or set of models (see Q3), because results from inappropriate models could lead to erroneous conclusions. For example, applying a fixed effect model, when a random effects model is more appropriate, may lead to errors in both the estimated magnitude of the overall effect and its uncertainty [ 55 ]. As can be seen from Fig.  4 , each of the three main meta-analytical models assume that effect sizes are distributed around an overall effect ( b 0 ). The reader should also be aware that this estimated overall effect (meta-analytic mean) is most commonly presented in an accompanying forest plot(s) [ 22 , 56 , 57 ]. Figure  5a is a forest plot of the kind that is typically seen in medical and social sciences, with both overall means from the fixed-effect or the common effect meta-analysis (FEMA/CEMA) model, and the random-effects meta-analysis (REMA) model. In a multiple-species meta-analysis, you may see an elaborate forest plot such as that in Fig.  5b .

Examples of forest plots used in a biological meta-analysis to represent effect sizes and their associated precisions. a A conventional forest plot displaying the magnitude and uncertainty (95% confidence interval, CI) of each effect size in the dataset, as well as reporting the associated numerical values and a reference to the original paper. The sizes of the shapes representing point estimates are usually scaled based on their precision (1/Standard error). Diamonds at the bottom of the plot display the estimated overall mean based on both fixed-effect meta-analysis/‘common-effect’ meta-analysis ( FEMA/CEMA ) and random-effects meta-analysis ( REMA ) models. b A forest plot that has been augmented to display a phylogenetic relationship between different taxa in the analysis; the estimated d seems on average to be higher in some clades than in the others. A diamond at the bottom summarizes the aggregate mean as estimated by a multi-level meta-analysis accounting for the given phylogenetic structure. On the right is the number of effect sizes for each species ( k ), although similarly one could also display the number of individuals/sample-size ( n ), where only one effect size per species is included. c As well as displaying overall effect ( diamond ), forest plots are sometimes used to display the mean effects from different sub-groups of the data (e.g., effects separated by sex or treatment type), as estimated with data sub-setting or meta-regression, or even a slope from meta-regression (indicating how an effect changes with increasing continuous variable, e.g., dosage). d Different magnitudes of correlation coefficient ( r ), and associated 95% CIs, p values, and the sample size on which each estimate is based. The space is shaded according to effect magnitude based on established guidelines; light grey , medium grey , and dark grey correspond to small, medium, and large effects, respectively

Q5: Is the level of consistency among studies reported?

The overall effect reported by a meta-analysis cannot be properly interpreted without an analysis of the heterogeneity, or inconsistency, among effect sizes. For example, an overall mean of zero can be achieved when effect sizes are all zero (homogenous; that is, the between-study variance is 0) or when all effect sizes are very different (heterogeneous; the between study variance is >0) but centered on zero, and clearly one should draw different conclusions in each case. Rather disturbingly, we have recently found that in ecology and evolutionary biology, tests of heterogeneity and their corresponding statistics ( τ 2 , Q , and I 2 ) are only reported in about 40% of meta-analyses [ 58 ]. Cochran’s Q (often referred to as Q total or Q T ) is a test statistic for the between-study variance ( τ 2 ), which allows one to assess whether the estimated between-study variance is non-zero (in other words, whether a fixed-effect model is appropriate as this model assumes τ 2  = 0) [ 59 ]. As a test statistic, Q is often presented with a corresponding p value, which is interpreted in the conventional manner. However, if presented without the associated τ 2 , Q can be misleading because, as is the case with most statistical tests, Q is more likely to be significant when more studies are included even if τ 2 is relatively small (see also Q7); the reader should therefore check whether both statistics are presented. Having said that, the magnitude of the between-study variance ( τ 2 ) can be hard to interpret because it is dependent on the scale of the effect size. The heterogeneity statistic, I 2 , which is a type of intra-class correlation, has also been recommended as it addresses some of the issues associated with Q and τ 2 [ 60 , 61 ]. I 2 ranges from 0 to 1 (or 0 to 100%) and indicates how much of the variation in effect sizes is due to the between-study variance ( τ 2 ; Fig.  4b ) or, more generally, the proportion of variance not attributable to sampling (error) variance ( \( \overline{v} \) ; see Fig.  4b, c ; for more details and extensions, see [ 13 , 14 , 49 , 58 ]). Tentatively suggested benchmarks for I 2 are low, medium, and high heterogeneity of 25, 50, and 75% [ 61 ]. These values are often used in meta-analyses in medical and social sciences for interpreting the degree of heterogeneity [ 62 , 63 ]. However, we have shown that the average I 2 in meta-analyses in ecology and evolution may be as high as 92%, which may not be surprising as these meta-analyses are not confined to a single species (or human subjects) [ 58 ]. Accordingly, the reader should consider whether these conventional benchmarks are applicable to the biological meta-analysis under consideration. The quantification and reporting of heterogeneity statistics is essential for any meta-analysis, and you need to make sure some or combinations of these three statistics are reported in a meta-analysis before making generalisations based on the overall mean effect (except when using fixed-effect models).

Q6: Are the causes of variation among studies investigated?

After quantifying variation among effect sizes beyond sampling variation ( I 2 ), it is important to understand the factors, or moderators, that might explain this additional variation, because it can elucidate important processes mediating variation in the strength of effect. Moderators are equivalent to explanatory (independent) variables or predictors in a normal linear model [ 8 , 49 , 62 ]. For example, in a meta-analysis examining the effect of experimentally increased temperature on growth using SMD (control versus treatment comparison) studies might vary in the magnitude of temperature increase: say 10 versus 20 °C in the first study, but 12 versus 16 °C in the second. In this case, the moderator of interest is the temperature difference between control and treatment groups (10 °C for the first study and 4 °C for the second). This difference in study design may explain variation in the magnitude of the observed effect sizes (that is, the SMD of growth at the two temperatures). Models that examine the effects of moderators are referred to as meta-regressions. One important thing to note is that meta-regression is just a special type of weighted regression. Therefore, the usual standard practices for regression analysis also apply to meta-regression. This means that, as a reader, you may want to check for the inclusion of too many predictors/moderators in a single model, or ‘over-fitting’ (the rule of thumb is that the authors may need at least ten effect sizes per estimated moderator) [ 64 ], and for ‘fishing expeditions’ (also known as ‘data dredging’ or ‘ p hacking’; that is, non-hypothesis-based exploration for statistical significance [ 28 , 65 , 66 ]).

Moderators can be correlated with each other (that is, be subject to the multicollinearity problem) and this dependence, in turn, could lead authors to attribute an effect to the wrong moderator [ 67 ]. For example, in the aforementioned meta-analysis of temperature on growth, the study may claim that females grew faster than males when exposed to increased temperatures. However, if most females came from studies where higher temperature increases were used but males were usually exposed to small increases, the moderators for sex and temperature would be confounded. Accordingly, the effect may be due to the severity of the temperature change rather than a sex effect. Readers should check whether the authors have examined potential confounding effects of moderators and reported how different potential moderators are related to one another. It is also important to know the sources of the moderator data; for example, species-specific data can be obtained from sources (papers, books, databases) other than the primary studies from which effect sizes were taken (Q1). Meta-regression results can be presented in a forest plot, as in Fig.  5c (see also Q6 and Fig.  6e, f ; the standardization of moderators may often be required for analyzing moderators [ 68 ]).

Graphical assessment tools for testing for publication bias. a A funnel plot showing greater variance among effects that have larger standard errors ( SE ) and that are thus more susceptible to sampling variability. Some studies in the lower right corner of the plot, opposite to most major findings, with large SE (less likely to detect significant results) are potentially missing (not shown), suggesting publication bias. b Often funnel plots are depicted using precision (1/SE), giving a different perspective of publication bias, where studies with low precision (or large SE) are expected to show greater sampling variability compared to studies with high precision (or low SE). Note that the data in panel b are the same as in panel a , except that a trim-and-fill analysis has been performed in b . A trim-and-fill analysis estimates the number of studies missing from the meta-analysis and creates ‘mirrored’ studies on the opposite side of the funnel ( unfilled dots ) to estimate how the overall effect size estimate is impacted by these missing studies. c Radial (Galbraith) plot in which the slope should be close to zero, if little publication bias exists, indicating little asymmetry in a corresponding funnel plot (compare it with b ); radial plots are closely associated with Egger’s tests. d Cumulative meta-analysis showing how the effect size changes as the number of studies on a particular topic increases. In this situation, the addition of effect size estimates led to convergence on an overall estimate of 0.36, and the confidence intervals decrease as the precision of the estimate increases. e Bubble plot showing a temporal trend in effect size ( Zr ) across years. Here effect sizes are weighted by their precision; larger bubbles indicate more precise estimates and smaller bubbles less precise. f Bubble plot of the relationship between effect size and impact factors of journals, indicating that larger magnitudes of effect sizes (the absolute values of Zr ) tend to be published in higher impact journals

Another way of exploring heterogeneity is to run separate meta-analysis on data subsets (for example, separating effect sizes by the sex of exposed animals). This is similar to running a meta-regression with categorical moderators (often referred to as subgroup analysis), with the key difference being that the authors can obtain heterogeneity statistics (such as I 2 ) for each subset in a subset analysis [ 69 ]. It is important to note that many meta-analytic studies include more than one meta-analysis, because several different types of data are included, even though these data pertain to one topic (for example, the effect of increased temperature not only on body growth, but also on parasite load). You, as a reader, will need to evaluate whether the authors’ sub-grouping or sub-setting of their data makes sense biologically; hopefully the authors will have provided clear justification (Q1).

Q7: Are effects interpreted in terms of biological importance?

Meta-analyses should focus on biological importance (which is reflected in estimated effects and their uncertainties) rather than on p values and statistical significance, as is outlined in Fig.  5d [ 29 , 71 ,, 70 – 72 ]. It should be clear to most readers that interpreting results only in terms of statistical significance ( p values) can be misleading. For example, in terms of effects’ magnitudes and uncertainties, ES4 and ES6 in Fig.  5d are nearly identical, yet ES4 is statistically significant, while ES6 is not. Also, ES1–3 are all what people describe as ‘highly significant’, but their magnitudes of effect, and thus biological relevance, are very different. The term ‘effective thinking’ is used to refer to the philosophy of placing emphasis on the interpretation of overall effect size in terms of biological importance rather than statistical significance [ 29 ]. It is useful for the reader to know that each of ES1–3 in Fig.  5d can be classified as what Jacob Cohen proposed as small, medium, and large effects, which are r  = 0.1, 0.3, and 0.5, respectively [ 73 ]; for SMD, corresponding benchmarks are d (SMD) = 0.2, 0.5, and 0.8 [ 29 , 61 ]. Researchers may have good intuition for the biological relevance of a particular r value, but this may not be the case for SMD. Thus, it may be helpful to know that Cohen’s benchmarks for r and d are comparable. Having said that, these benchmarks, along with those for I 2 , have to be used carefully, because what constitute biologically important effect magnitudes can vary according to the biological questions and systems (for example, 1% difference in fitness would not matter in ecological time but it certainly does over evolutionary time). We stress that authors should primarily be discussing their effect sizes (point estimates) and uncertainties in terms of point estimates (confidence intervals, or credible intervals, CIs) [ 29 , 70 , 72 ]. Meta-analysts can certainly note statistical significance, which is related to CI width, but direct description of precision may be more useful. Note that effect magnitude and precision are exactly what are displayed in forest plots (Fig.  5 ).

Q8: Has publication bias been considered?

Meta-analysts have to assume that research is published regardless of statistical significance, and that authors have not selectively reported results (that is, that there is no publication bias and no reporting bias) [ 74 , 75 , 76 ]. This is unlikely. Therefore, meta-analysts should check for publication bias using statistical and graphical tools. The reader should know that the commonly used methods for assessing publication bias are funnel plots (Fig.  6a, b ), radial (Galbraith) plots (Fig.  6c ), and Egger’s (regression) tests [ 57 , 77 , 78 ]; these methods visually or statistically (Egger’s test) help to detect funnel asymmetry, which can be caused by publication bias [ 79 ]. However, you should also know that funnel asymmetry may be an artifact of too few a number of effect sizes. Further, funnel asymmetry can result from heterogeneity (non-zero between-study variance, τ 2 ) [ 77 , 80 ]. Some readily-implementable methods for correcting for publication bias also exist, such as trim-and-fill methods [ 81 , 82 ] or the use of the p curve [ 83 ]. The reader should be aware that these methods have shortcomings; for example, the trim-and-fill method can under- or overestimate an overall effect size, while the p curve probably only works when effect sizes come from tightly controlled experiments [ 83 , 84 , 85 , 86 ] (see Q9; note that ‘selection modeling’ is an alternative approach, but it is more technically difficult [ 79 ]). A less contentious topic in this area is the time-lag bias, where the magnitudes of an effect diminish over time [ 87 , 88 , 89 ]. This bias can be easily tested with a cumulative meta-analysis and visualized using a forest plot [ 90 , 91 ] (Fig.  6d ) or a bubble plot combined with meta-regression (Fig.  6e ; note that journal impact factor can also be associated with the magnitudes of effect sizes [ 92 ], Fig.  6f ).

Alarmingly, meta-reviews have found that only half of meta-analyses in ecology and evolution assessed publication bias [ 14 , 15 ]. Disappointingly, there are no perfect solutions for detecting and correcting for publication bias, because we never really know with certainty what kinds of data are actually missing (although usually statistically non-significant and small effect sizes are underrepresented in the dataset; see also Q9). Regardless, the existing tools should still be used and the presentation of results from at least two different methods is recommended.

Q9: Are results really robust and unbiased?

Although meta-analyses from the medical and social sciences are often accompanied by sensitivity analysis [ 69 , 93 ], biological meta-analyses are often devoid of such tests. Sensitivity analyses include not only running meta-analysis and meta-regression without influential effect sizes or studies (for example, many effect sizes that come from one study or one clear outlier effect size; sometimes also termed ‘subset analysis’), but also, for example, comparing meta-analytic models with and without modeling non-independence (Q3–5), or other alternative analyses [ 44 , 93 ]. Analyses related to publication bias could generally also be regarded as part of a sensitivity analysis (Q8). In addition, it is worthwhile checking if the authors discuss missing data [ 94 , 95 ] (different from publication bias; Q8). Two major cases of missing data in meta-analysis are: 1) a lack of the information required to obtain sampling variance for a portion of the dataset (for example, missing standard deviations); and 2) missing information for moderators [ 96 ] (for example, most studies report the sex of animals used but a few studies do not). For the former, the authors should run models both with and without data with sampling variance information; note that without sampling variance (that is, unweighted meta-analysis) the analysis becomes a normal linear model [ 21 ]. For both cases 1 and 2, the authors could use data imputation techniques (as of yet, this is not standard practice). Although data imputation methods are rather technical, their implementation is becoming easier [ 96 , 97 , 98 ]. Furthermore, it may often be important to consider the sample size (the number and precision of constituent effect sizes) and statistical power of a meta-analysis. One of the main reasons to conduct meta-analysis is to increase statistical power. However, where an overall effect is expected to be small (as is often the case with biological phenomena) it is possible that a meta-analysis may be underpowered [ 99 , 100 , 101 ].

Q10: Is the current state (and lack) of knowledge summarized?

In the discussion of a meta-analysis, it is reasonable to expect the authors to discuss what conventional wisdoms the meta-analysis has confirmed or refuted and what new insights the meta-analysis has revealed [ 8 , 19 , 71 , 100 ]. New insights from meta-analyses are known as ‘review-generated evidence’ (as opposed to ‘study-generated evidence’) [ 18 ] because only aggregation of studies can generate such insights. This is analogous to comparative analyses bringing biologists novel understanding of a topic which would be impossible to obtain from studying a single species in isolation [ 14 ]. Because meta-analysis brings available (published) studies together in a systematic and/or comprehensive way (but see Q1), the authors can also summarize less quantitative themes along with the meta-analytic results. For example, the authors could point out what types of primary studies are lacking (that is, identify knowledge gaps). Also, the study should provide clear future directions for the topic under investigation [ 8 , 19 , 71 , 100 ]; for example, what types of empirical work are required to push the topic forward. An obvious caveat is that the value of these new insights, knowledge gaps and future directions is contingent upon the answers to the previous nine questions (Q1–9).

Post meta-evaluation: more to think about

Given that we are advocates of meta-analysis, we are certainly biased in saying ‘meta-analyses are enlightening’. A more nuanced interpretation of what we really mean is that meta-analyses are enlightening when they are done well. Mary Smith and Gene Glass published the first research synthesis carrying the label of ‘meta-analysis’ in 1977 [ 102 ]. At the time, their study and the general concept was ridiculed with the term ‘mega-silliness’ [ 103 ] (see also [ 16 , 17 ]). Although the results of this first meta-analysis on the efficacy of psychotherapies still stand strong, it is possible that a meta-analysis contains many mistakes. In a similar vein, Robert Whittaker warned that the careless use of meta-analyses could lead to ‘mega-mistakes’, reinforcing his case by drawing upon examples from ecology [ 104 , 105 ].

Even where a meta-analysis is conducted well, a future meta-analysis can sometimes yield a completely opposing conclusion from the original (see [ 106 ] for examples from medicine and the reasons why). Thus, medical and social scientists are aware that updating meta-analyses is extremely important, especially given that time-lag bias is a common phenomenon [ 87 , 88 , 89 ]. Although updating is still rare in biological meta-analyses [ 8 ], we believe this should become part of the research culture in the biological sciences. We appreciate the view of John Ioannidis who wrote, “Eventually, all research [both primary and meta-analytic] can be seen as a large, ongoing, cumulative meta-analysis” [ 106 ] (cf. effective thinking; Fig.  6d ).

Finally, we have to note that we have just scratched the surface of the enormous subject of meta-analysis. For example, we did not cover other relevant topics such as multilevel (hierarchical) meta-analytic and meta-regression models [ 14 , 45 , 49 ], which allow more complex sources of non-independence to be modeled, as well as multivariate (multi-response) meta-analyses [ 107 ] and network meta-analyses [ 108 ]. Many of the ten appraisal questions above, however, are also relevant for these extended methods. More importantly, we believe that asking the ten questions above will readily equip biologists with the knowledge necessary to differentiate among mega-enlightenment, mega-mistakes, and something in-between.

Glass GV. Primary, secondary, and meta-analysis research. Educ Res. 1976;5:3–8.

Article   Google Scholar  

Glass GV. Meta-analysis at middle age: a personal history. Res Synth Methods. 2015;6(3):221–31.

Article   PubMed   Google Scholar  

Cooper H, Hedges LV, Valentine JC. The handbook of research synthesis and meta-analysis. New York: Russell Sage Foundation; 2009.

Google Scholar  

Hedges L, Olkin I. Statistical methods for meta-analysis. New York: Academic Press; 1985.

Egger M, Smith GD, Altman DG. Systematic reviews in health care: meta-analysis in context. 2nd ed. London: BMJ; 2001.

Book   Google Scholar  

Arnqvist G, Wooster D. Meta-analysis: synthesizing research findings in ecology and evolution. Trends Ecol Evol. 1995;10:236–40.

Article   CAS   PubMed   Google Scholar  

Koricheva J, Gurevitch J, Mengersen K. Handbook of meta-analysis in ecology and evolution. Princeton: Princeton University Press; 2013.

Nakagawa S, Poulin R. Meta-analytic insights into evolutionary ecology: an introduction and synthesis. Evolutionary Ecol. 2012;26:1085–99.

van der Worp HB, Howells DW, Sena ES, Porritt MJ, Rewell S, O'Collins V, Macleod MR. Can animal models of disease reliably inform human studies? PLoS Med. 2010;7(3), e1000245.

Article   PubMed   PubMed Central   Google Scholar  

Stewart G. Meta-analysis in applied ecology. Biol Lett. 2010;6(1):78–81.

Stewart GB, Schmid CH. Lessons from meta-analysis in ecology and evolution: the need for trans-disciplinary evidence synthesis methodologies. Res Synth Methods. 2015;6(2):109–10.

Lortie CJ, Stewart G, Rothstein H, Lau J. How to critically read ecological meta-analyses. Res Synth Methods. 2015;6(2):124–33.

Nakagawa S, Kubo T. Statistical models for meta-analysis in ecology and evolution (in Japanese). Proc Inst Stat Math. 2016;64(1):105–21.

Nakagawa S, Santos ESA. Methodological issues and advances in biological meta-analysis. Evol Ecol. 2012;26:1253–74.

Koricheva J, Gurevitch J. Uses and misuses of meta-analysis in plant ecology. J Ecol. 2014;102:828–44.

Page MJ, Moher D. Mass production of systematic reviews and meta-analyses: an exercise in mega-silliness? Milbank Q. 2016;94(5):515–9.

Ioannidis JPA. The mass production of redundant, misleading, and conflicted systematic reviews and meta-analyses. Milbank Q. 2016;94(5):485–514.

Cooper HM. Research synthesis and meta-analysis : a step-by-step approach. 4th ed. London: SAGE; 2010.

Rothstein HR, Lorite CJ, Stewart GB, Koricheva J, Gurevitch J. Quality standards for research syntheses. In: Koricheva J, Gurevitch J, Mengersen K, editors. The handbook of meta-analysis in ecology and evolution. Princeton: Princeton University Press; 2013. p. 323–38.

Vetter D, Rcker G, Storch I. Meta-analysis: a need for well-defined usage in ecology and conservation biology. Ecosphere. 2013;6:1–24.

Morrissey M. Meta-analysis of magnitudes, differences, and variation in evolutionary parameters. J Evol Biol. 2016;29(10):1882–904.

Vesterinen HM, Sena ES, Egan KJ, Hirst TC, Churolov L, Currie GL, Antonic A, Howells DW, Macleod MR. Meta-analysis of data from animal studies: a practical guide. J Neurosci Methods. 2014;221:92–102.

Mongeon P, Paul-Hus A. The journal coverage of Web of Science and Scopus: a comparative analysis. Scientometrics. 2016;106(1):213–28.

Côté IM, Jennions MD. The procedure of meta-analysis in a nutshell. In: Koricheva J, Gurevitch J, Mengersen K, editors. The handbook of meta-analysis in ecology and evolution. Princeton: Princton University Press; 2013. p. 14–24.

Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, Ioannidis JPA, Clarke M, Devereaux PJ, Kleijnen J, Moher D. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Med. 2009;6:e1000100. doi: 10.1371/journal.pmed.1000100 .

Moher D, Liberati A, Tetzlaff J, Altman DG, Group P. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Internal Med. 2009;151:264–9.

Ellison AM. Repeatability and transparency in ecological research. Ecology. 2010;91(9):2536–9.

Parker TH, Forstmeier W, Koricheva J, Fidler F, Hadfield JD, Chee YE, Kelly CD, Gurevitch J, Nakagawa S. Transparency in ecology and evolution: real problems, real solutions. Trends Ecol Evol. 2016;31(9):711–9.

Nakagawa S, Cuthill IC. Effect size, confidence interval and statistical significance: a practical guide for biologists. Biol Rev. 2007;82:591–605.

Borenstein M. Effect size for continuous data. In: Cooper H, Hedges LV, Valentine JC, editors. The handbook of research synthesis and meta-analysis. New York: Russell Sage Foundation; 2009. p. 221–35.

Borenstein M, Hedges LV, Higgens JPT, Rothstein HR. Introduction to meta-analysis. West Sussex: Wiley; 2009.

Fleiss JL, Berlin JA. Effect sizes for dichotomous data. In: Cooper H, Hedges LV, Valentine JC, editors. The handbook of research synthesis and meta-analysis. New York: Russell Sage Foundation; 2009. p. 237–53.

Hedges LV, Gurevitch J, Curtis PS. The meta-analysis of response ratios in experimental ecology. Ecology. 1999;80(4):1150–6.

Hector KL, Lagisz M, Nakagawa S. The effect of resveratrol on longevity across species: a meta-analysis. Biol Lett. 2012. doi: 10.1098/rsbl.2012.0316 .

Lagisz M, Hector KL, Nakagawa S. Life extension after heat shock exposure: Assessing meta-analytic evidence for hormesis. Ageing Res Rev. 2013;12(2):653–60.

Nakagawa S, Lagisz M, Hector KL, Spencer HG. Comparative and meta-analytic insights into life-extension via dietary restriction. Aging Cell. 2012;11:401–9.

Garratt M, Nakagawa S, Simons MJ. Comparative idiosyncrasies in life extension by reduced mTOR signalling and its distinctiveness from dietary restriction. Aging Cell. 2016;15(4):737–43.

Article   CAS   PubMed   PubMed Central   Google Scholar  

Nakagawa S, Poulin R, Mengersen K, Reinhold K, Engqvist L, Lagisz M, Senior AM. Meta-analysis of variation: ecological and evolutionary applications and beyond. Methods Ecol Evol. 2015;6(2):143–52.

Senior AM, Nakagawa S, Lihoreau M, Simpson SJ, Raubenheimer D. An overlooked consequence of dietary mixing: a varied diet reduces interindividual variance in fitness. Am Nat. 2015;186(5):649–59.

Senior AM, Gosby AK, Lu J, Simpson SJ, Raubenheimer D. Meta-analysis of variance: an illustration comparing the effects of two dietary interventions on variability in weight. Evol Med Public Health. 2016;2016(1):244–55.

Mengersen K, Jennions MD, Schmid CH. Statistical models for the meta-analysis of non-independent data. In: Koricheva J, Gurevitch J, Mengersen K, editors. The handbook of meta-analysis in ecology and evolution. Princeton: Princeton University Press; 2013. p. 255–83.

Lajeunesse MJ. Meta-analysis and the comparative phylogenetic method. Am Nat. 2009;174(3):369–81.

PubMed   Google Scholar  

Chamberlain SA, Hovick SM, Dibble CJ, Rasmussen NL, Van Allen BG, Maitner BS. Does phylogeny matter? Assessing the impact of phylogenetic information in ecological meta-analysis. Ecol Lett. 2012;15:627–36.

Noble DWA, Lagisz M, O'Dea RE, Nakagawa S. Non-independence and sensitivity analyses in ecological and evolutionary meta-analyses. Mol Ecol. 2017; in press. doi: 10.1111/mec.14031 .

Hadfield J, Nakagawa S. General quantitative genetic methods for comparative biology: phylogenies, taxonomies and multi-trait models for continuous and categorical characters. J Evol Biol. 2010;23:494–508.

Viechtbauer W. Conducting meta-analyses in R with the metafor package. J Stat Software. 2010;36(3):1–48.

Rosenberg MS, Adams DC, Gurevitch J. MetaWin: statistical software for meta-analysis. 2nd ed. Sunderland: Sinauer; 2000.

Marín-Martínez F, Sánchez-Meca J. Averaging dependent effect sizes in meta-analysis: a cautionary note about procedures. Spanish J Psychol. 1999;2:32–8.

Cheung MWL. Modeling dependent effect sizes with three-level meta-analyses: a structural equation modeling approach. Psychol Methods. 2014;19:211–29.

Sutton AJ, Higgins JPI. Recent developments in meta-analysis. Stat Med. 2008;27(5):625–50.

Mengersen K, Schmid CH, Jennions MD, Gurevitch J. Statistical models and approcahes to inference. In: Koricheva J, Gurevitch J, Mengersen K, editors. The handbook of meta-analysis in ecology and evolution. Princeton: Princeton University Press; 2013. p. 89–107.

Lajeunesse MJ. Meta-analysis and the comparative phylogenetic method. Am Nat. 2009;174:369–81.

Lajeunesse MJ. On the meta-analysis of response ratios for studies with correlated and multi-group designs. Ecology. 2011;92:2049–55.

Lajeunesse MJ, Rosenberg MS, Jennions MD. Phylogenetic nonindepedence and meta-analysis. In: Koricheva J, Gurevitch J, Mengersen K, editors. The handbook of meta-analysis in ecology and evolution. Princeton: Princeton University Press; 2013. p. 284–99.

Borenstein M, Hedges LV, Higgens JPT, Rothstein H. A basic introduction to fixed-effect and andom-effects models for meta-analysis. Res Synth Methods. 2010;1:97–111.

Vetter D, Rucker G, Storch I. Meta-analysis: a need for well-defined usage in ecology and conservation biology. Ecosphere. 2013;4(6):1–24.

Anzures-Cabrera J, Higgins JPT. Graphical displays for meta-analysis: an overview with suggestions for practice. Res Synth Methods. 2010;1(1):66–80.

Senior AM, Grueber CE, Kamiya T, Lagisz M, O'Dwyer K, Santos ESA, Nakagawa S. Heterogeneity in ecological and evolutionary meta-analyses: its magnitudes and implications. Ecology. 2016; in press.

Cochran WG. The combination of estimates from different experiments. Biometrics. 1954;10(1):101–29.

Higgins JPT, Thompson SG. Quantifying heterogeneity in a meta-analysis. Stat Med. 2002;12:1539–58.

Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327:557–60.

Huedo-Medina TB, Sanchez-Meca J, Marin-Martinez F, Botella J. Assessing heterogeneity in meta-analysis: Q statistic or I-2 index? Psychol Methods. 2006;11(2):193–206.

Rucker G, Schwarzer G, Carpenter JR, Schumacher M. Undue reliance on I-2 in assessing heterogeneity may mislead. BMC Med Res Methodol. 2008;8:79.

Harrell FEJ. Regression modeling strategies with applications to linear models, logistic regression, and survival analysis. New York: Springer; 2001.

Ioannidis JPA. Why most published research findings are false. PLoS Med. 2005;2(8):696–701.

Simmons JP, Nelson LD, Simonsohn U. False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychol Sci. 2011;22(11):1359–66.

Lipsey MW. Those confounded moderators in meta-analysis: Good, bad, and ugly. Ann Am Acad Polit Social Sci. 2003;587:69–81.

Schielzeth H. Simple means to improve the interpretability of regression coefficients. Methods Ecol Evol. 2010;1(2):103–13.

Higgins JPT, Green S. Cochrane handbook for systematic reviews of interventions. West Sussex: Wiley-Blackwell; 2009.

Cumming G, Finch S. A primer on the understanding, use, and calculation of confidence intervals that are based on central and noncentral distributions. Educ Psychol Meas. 2001;61:532–84.

Jennions MD, Lorite CJ, Koricheva J. Role of meta-analysis in interpreting the scientific literature. In: Koricheva J, Gurevitch J, Mengersen K, editors. The handbook of meta-analysis in ecology and evolution. Princeton: Princeton University Press; 2013. p. 364–80.

Thompson B. What future quantitative social science research could look like: confidence intervals for effect sizes. Educ Res. 2002;31:25–32.

Cohen J. Statistical power analysis for the beahvioral sciences. 2nd ed. Hillsdale: Lawrence Erlbaum; 1988.

Rothstein HR, Sutton AJ, Borenstein M. Publication bias in meta-analysis: prevention, assessment and adjustments. Chichester: Wiley; 2005.

Sena ES, van der Worp HB, Bath PMW, Howells DW, Macleod MR. Publication bias in reports of animal stroke studies leads to major overstatement of efficacy. PLoS Biol. 2010;8(3), e1000344.

Article   PubMed   PubMed Central   CAS   Google Scholar  

Moller AP, Jennions MD. Testing and adjusting for publication bias. Trends Ecol Evol. 2001;16(10):580–6.

Egger M, Smith GD, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ. 1997;315:629–34.

Sterne JAC, Egger M. Funnel plots for detecting bias in meta-analysis: guidelines on choice of axis. J Clin Epidemiol. 2001;54:1046–55.

Sutton AJ. Publication bias. In: Cooper H, Hedges L, Valentine J, editors. The handbook of research synthesis and meta-analysis. New York: Russell Sage Foundation; 2009. p. 435–52.

Lau J, Ioannidis JPA, Terrin N, Schmid CH, Olkin I. Evidence based medicine--the case of the misleading funnel plot. BMJ. 2006;333(7568):597–600.

Duval S, Tweedie R. Trim and fill: a simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics. 2000;56:455–63.

Duval S, Tweedie R. A nonparametric "trim and fill" method of accounting for publication bias in meta-analysis. J Am Stat Assoc. 2000;95(449):89–98.

Simonsohn U, Nelson LD, Simmons JP. p-curve and effect size: correcting for publication bias using only significant results. Perspect Psychol Sci. 2014;9(6):666–81.

Terrin N, Schmid CH, Lau J, Olkin I. Adjusting for publication bias in the presence of heterogeneity. Stat Med. 2003;22(13):2113–26.

Bruns SB, Ioannidis JPA. p-curve and p-hacking in observational research. PLoS One. 2016;11(2), e0149144.

Schuch FB, Vancampfort D, Rosenbaum S, Richards J, Ward PB, Veronese N, Solmi M, Cadore EL, Stubbs B. Exercise for depression in older adults: a meta-analysis of randomized controlled trials adjusting for publication bias. Rev Bras Psiquiatr. 2016;38(3):247–54.

Jennions MD, Moller AP. Relationships fade with time: a meta-analysis of temporal trends in publication in ecology and evolution. Proc R Soc Lond B Biol Sci. 2002;269(1486):43–8.

Trikalinos TA, Ioannidis JP. Assessing the evolution of effect sizes over time. In: Rothstein H, Sutton AJ, Borenstein M, editors. Publication bias in meta-analysis: prevention, assessment and adjustments. Chichester: Wiley; 2005. p. 241–59.

Koricheva J, Jennions MD, Lau J. Temporal trends in effect sizes: causes, detection and implications. In: Koricheva J, Gurevitch J, editors. Mengersen K, editors. Princeton: Princeton University Press; 2013. p. 237–54.

Lau J, Schmid CH, Chalmers TC. Cumulative meta-analysis of clinical trials builds evidence for exemplary medical care. J Clin Epidemiol. 1995;48(1):45–57. discussion 59–60.

Leimu R, Koricheva J. Cumulative meta-analysis: a new tool for detection of temporal trends and publication bias in ecology. Proc R Soc Lond B Biol Sci. 2004;271(1551):1961–6.

Murtaugh PA. Journal quality, effect size, and publication bias in meta-analysis. Ecology. 2002;83(4):1162–6.

Greenhouse JB, Iyengar S. Sensitivity analysis and diagnostics. In: Cooper H, Hedges L, Valentine J, editors. The handbook of research synthesis and meta-analysis. New York: Russell Sage Foundation; 2009. p. 417–34.

Lajeunesse MJ. Recovering missing or partial data from studies: a survey. In: Koricheva J, Gurevitch J, Mengersen K, editors. The handbook of meta-analysis in ecology and evolution. Princeton: Princeton University Press; 2013. p. 195–206.

Nakagawa S, Freckleton RP. Missing inaction: the dangers of ignoring missing data. Trends Ecol Evol. 2008;23(11):592–6.

Ellington EH, Bastille-Rousseau G, Austin C, Landolt KN, Pond BA, Rees EE, Robar N, Murray DL. Using multiple imputation to estimate missing data in meta-regression. Methods Ecol Evol. 2015;6(2):153–63.

Gurevitch J, Nakagawa S. Research synthesis methods in ecology. In: Fox GA, Negrete-Yankelevich S, Sosa VJ, editors. Ecological statistics: contemporary theory and application. Oxford: Oxford University Press; 2015. p. 201–28.

Nakagawa S. Missing data: mechanisms, methods and messages. In: Fox GA, Negrete-Yankelevich S, Sosa VJ, editors. Ecological statistics. Oxford: Oxford University Press; 2015. p. 81–105.

Chapter   Google Scholar  

Ioannidis J, Patsopoulos N, Evangelou E. Uncertainty in heterogeneity estimates in meta-analyses. BMJ. 2007;335:914–6.

Jennions MD, Lorite CJ, Koricheva J. Using meta-analysis to test ecological and evolutionary theory. In: Koricheva J, Gurevitch J, Mengersen K, editors. The handbook of meta-analysis in ecology and evolution. Princeton: Princeton University Press; 2013. p. 38–403.

Lajeunesse MJ. Power statistics for meta-analysis: tests for mean effects and homogeneity. In: Koricheva J, Gurevitch J, Mengersen K, editors. The handbook of meta-analysis in ecology and evolution. Princeton: Princeton University Press; 2013. p. 348–63.

Smith ML, Glass GV. Meta-analysis of psychotherapy outcome studies. Am Psychologist. 1977;32(9):752–60.

Article   CAS   Google Scholar  

Eysenck HJ. Exercise in mega-silliness. Am Psychologist. 1978;33(5):517.

Whittaker RJ. Meta-analyses and mega-mistakes: calling time on meta-analysis of the species richness-productivity relationship. Ecology. 2010;91(9):2522–33.

Whittaker RJ. In the dragon's den: a response to the meta-analysis forum contributions. Ecology. 2010;91(9):2568–71.

Ioannidis JP. Meta-research: the art of getting it wrong. Res Synth Methods. 2010;3:169–84.

Jackson D, Riley R, White IR. Multivariate meta-analysis: potential and promise. Stat Med. 2011;30(20):2481–98.

Salanti G, Schmid CH. Special issue on network meta-analysis: introduction from the editors. Res Synth Methods. 2012;3(2):69–70.

Download references

Acknowledgements

We are grateful for comments on our article from the members of I-DEEL. We also thank John Brookfield, one anonymous referee, and the BMC Biology editorial team for comments, which significantly improved our article. SN acknowledges an ARC (Australian Research Council) Future Fellowship (FT130100268), DWAN is supported by an ARC Discovery Early Career Research Award (DE150101774) and UNSW Vice Chancellors Fellowship. AMS is supported by a Judith and David Coffey Fellowship from the University of Sydney.

Competing interests

The authors declare that they have no competing interests.

Author information

Authors and affiliations.

Evolution & Ecology Research Centre and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, NSW, 2052, Australia

Shinichi Nakagawa, Daniel W. A. Noble & Malgorzata Lagisz

Diabetes and Metabolism Division, Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, Sydney, NSW, 2010, Australia

Shinichi Nakagawa

Charles Perkins Centre, University of Sydney, Sydney, NSW, 2006, Australia

Alistair M. Senior

School of Mathematics and Statistics, University of Sydney, Sydney, NSW, 2006, Australia

You can also search for this author in PubMed   Google Scholar

Corresponding author

Correspondence to Shinichi Nakagawa .

Additional information

All authors contributed equally to the preparation of this manuscript

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Cite this article.

Nakagawa, S., Noble, D.W.A., Senior, A.M. et al. Meta-evaluation of meta-analysis: ten appraisal questions for biologists. BMC Biol 15 , 18 (2017). https://doi.org/10.1186/s12915-017-0357-7

Download citation

Published : 03 March 2017

DOI : https://doi.org/10.1186/s12915-017-0357-7

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Effect size
  • Biological importance
  • Non-independence
  • Meta-regression
  • Meta-research
  • Publication bias
  • Quantitative synthesis
  • Reporting bias
  • Statistical significance
  • Systematic review

BMC Biology

ISSN: 1741-7007

meta analysis research questions

APS

Introduction to Meta-Analysis: A Guide for the Novice

  • Experimental Psychology
  • Methodology
  • Statistical Analysis

Free Meta-Analysis Software and Macros

MetaXL (Version 2.0)

RevMan (Version 5.3)

Meta-Analysis Macros for SAS, SPSS, and Stata

Opposing theories and disparate findings populate the field of psychology; scientists must interpret the results of any single study in the context of its limitations. Meta-analysis is a robust tool that can help researchers overcome these challenges by assimilating data across studies identified through a literature review. In other words, rather than surveying participants, a meta-analysis surveys studies. The goal is to calculate the direction and/or magnitude of an effect across all relevant studies, both published and unpublished. Despite the utility of this statistical technique, it can intimidate a beginner who has no formal training in the approach. However, any motivated researcher with a statistics background can complete a meta-analysis. This article provides an overview of the main steps of basic meta-analysis.

Meta-analysis has many strengths. First, meta-analysis provides an organized approach for handling a large number of studies. Second, the process is systematic and documented in great detail, which allows readers to evaluate the researchers’ decisions and conclusions. Third, meta-analysis allows researchers to examine an effect within a collection of studies in a more sophisticated manner than a qualitative summary.

However, meta-analysis also involves numerous challenges. First, it consumes a great deal of time and requires a great deal of effort. Second, meta-analysis has been criticized for aggregating studies that are too different (i.e., mixing “apples and oranges”). Third, some scientists argue that the objective coding procedure used in meta-analysis ignores the context of each individual study, such as its methodological rigor. Fourth, when a researcher includes low-quality studies in a meta-analysis, the limitations of these studies impact the mean effect size (i.e., “garbage in, garbage out”). As long as researchers are aware of these issues and consider the potential influence of these limitations on their findings, meta-analysis can serve as a powerful and informative approach to help us draw conclusions from a large literature base.

  Identifying the Right Question

Similar to any research study, a meta-analysis begins with a research question. Meta-analysis can be used in any situation where the goal is to summarize quantitative findings from empirical studies. It can be used to examine different types of effects, including prevalence rates (e.g., percentage of rape survivors with depression), growth rates (e.g., changes in depression from pretreatment to posttreatment), group differences (e.g., comparison of treatment and placebo groups on depression), and associations between variables (e.g., correlation between depression and self-esteem). To select the effect metric, researchers should consider the statistical form of the results in the literature. Any given meta-analysis can focus on only one metric at a time. While selecting a research question, researchers should think about the size of the literature base and select a manageable topic. At the same time, they should make sure the number of existing studies is large enough to warrant a meta-analysis.

Determining Eligibility Criteria

After choosing a relevant question, researchers should then identify and explicitly state the types of studies to be included. These criteria ensure that the studies overlap enough in topic and methodology that it makes sense to combine them. The inclusion and exclusion criteria depend on the specific research question and characteristics of the literature. First, researchers can specify relevant participant characteristics, such as age or gender. Second, researchers can identify the key variables that must be included in the study. Third, the language, date range, and types (e.g., peer-reviewed journal articles) of studies should be specified. Fourth, pertinent study characteristics, such as experimental design, can be defined. Eligibility criteria should be clearly documented and relevant to the research question. Specifying the eligibility criteria prior to conducting the literature search allows the researcher to perform a more targeted search and reduces the number of irrelevant studies. Eligibility criteria can also be revised later, because the researcher may become aware of unforeseen issues during the literature search stage.

Conducting a Literature Search and Review

The next step is to identify, retrieve, and review published and unpublished studies. The goal is to be exhaustive; however, being too broad can result in an overwhelming number of studies to review.

Online databases, such as PsycINFO and PubMed, compile millions of searchable records, including peer-reviewed journals, books, and dissertations.  In addition, through these electronic databases, researchers can access the full text of many of the records. It is important that researchers carefully choose search terms and databases, because these decisions impact the breadth of the review. Researchers who aren’t familiar with the research topic should consult with an expert.

Additional ways to identify studies include searching conference proceedings, examining reference lists of relevant studies, and directly contacting researchers. After the literature search is completed, researchers must evaluate each study for inclusion using the eligibility criteria. At least a subset of the studies should be reviewed by two individuals (i.e., double coded) to serve as a reliability check. It is vital that researchers keep meticulous records of this process; for publication, a flow diagram is typically required to depict the search and results. Researchers should allow adequate time, because this step can be quite time consuming.

Calculating Effect Size

Next, researchers calculate an effect size for each eligible study. The effect size is the key component of a meta-analysis because it encodes the results in a numeric value that can then be aggregated. Examples of commonly used effect size metrics include Cohen’s d (i.e., group differences) and Pearson’s r (i.e., association between two variables). The effect size metric is based on the statistical form of the results in the literature and the research question. Because studies that include more participants provide more accurate estimates of an effect than those that include fewer participants, it is important to also calculate the precision of the effect size (e.g., standard error).

Meta-analysis software guides researchers through the calculation process by requesting the necessary information for the specified effect size metric. I have identified some potentially useful resources and programs below. Although meta-analysis software makes effect size calculations simple, it is good practice for researchers to understand what computations are being used.

The effect size and precision of each individual study are aggregated into a summary statistic, which can be done with meta-analysis software. Researchers should confirm that the effect sizes are independent of each other (i.e., no overlap in participants). Additionally, researchers must select either a fixed effects model (i.e., assumes all studies share one true effect size) or a random effects model (i.e., assumes the true effect size varies among studies). The random effects model is typically preferred when the studies have been conducted using different methodologies. Depending on the software, additional specifications or adjustments may be possible.

During analysis, the effect sizes of the included studies are weighted by their precision (e.g., inverse of the sampling error variance) and the mean is calculated. The mean effect size represents the direction and/or magnitude of the effect summarized across all eligible studies. This statistic is typically accompanied by an estimate of its precision (e.g., confidence interval) and p -value representing statistical significance. Forest plots are a common way of displaying meta-analysis results.

Depending on the situation, follow-up analyses may be advised. Researchers can quantify heterogeneity (e.g., Q, t 2 , I 2 ), which is a measure of the variation among the effect sizes of included studies. Moderator variables, such as the quality of the studies or age of participants, may be included to examine sources of heterogeneity. Because published studies may be biased towards significant effects, it is important to evaluate the impact of publication bias (e.g., funnel plot, Rosenthal’s Fail-safe N ). Sensitivity analysis can indicate how the results of the meta-analysis would change if one study were excluded from the analysis.

If properly conducted and clearly documented, meta-analyses often make significant contributions to a specific field of study and therefore stand a good chance of being published in a top-tier journal. The biggest obstacle for most researchers who attempt meta-analysis for the first time is the amount of work and organization required for proper execution, rather than their level of statistical knowledge.

Recommended Resources

Borenstein, M., Hedges, L. V., Higgins, J. P., & Rothstein, H. R. (2009). Introduction to meta-analysis . Hoboken, NJ: Wiley.

Cooper, H., Hedges, L., & Valentine, J. (2009). The handbook of research synthesis and meta-analysis (2nd ed.). New York, NY: Russell Sage Foundation.

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis . Thousand Oaks, California: Sage Publications.

Rothstein, H. R., Sutton, A. J., & Borenstein, M. (2005). Publication bias in meta-analysis: Prevention, assessment, and adjustments . Hoboken, NJ: Wiley.

meta analysis research questions

It is nice to see the software we developed (MetaXL) being mentioned. However, the reason we developed the software and made publicly available for free is that we disagree with an important statement in the review. This statement is “researchers must select either a fixed effects model (i.e., assumes all studies share one true effect size) or a random effects model (i.e., assumes the true effect size varies among studies)”. We developed MetaXL because we think that the random effects model is seriously flawed and should be abandoned. We implemented in MetaXL two additional models, the Inverse Variance heterogeneity model and the Quality Effects model, both meant to be used in case of heterogeneity. More details are in the User Guide, available from the Epigear website.

meta analysis research questions

Thank you very much! The article really helped me to start understanding what meta-analysis is about

meta analysis research questions

thank you for sharing this article; it is very helpful.But I am still confused about how to remove quickly duplicates papers without wasting time if we more than 10 000 papers?

meta analysis research questions

Not being one to blow my own horn all the time, but I would like to suggest that you may want to take a look at a web based application I wrote that conducts a Hunter-Schmidt type meta-analysis. The Meta-Analysis is very easy to use and corrects for sampling and error variance due to reliability. It also exports the results in excel format. You can also export the dataset effect sizes (r, d, and z), sample sizes and reliability information in excel as well.

http://www.lyonsmorris.com/lyons/MaCalc/index.cfm

APS regularly opens certain online articles for discussion on our website. Effective February 2021, you must be a logged-in APS member to post comments. By posting a comment, you agree to our Community Guidelines and the display of your profile information, including your name and affiliation. Any opinions, findings, conclusions, or recommendations present in article comments are those of the writers and do not necessarily reflect the views of APS or the article’s author. For more information, please see our Community Guidelines .

Please login with your APS account to comment.

About the Author

Laura C. Wilson is an Assistant Professor in the Psychology Department at the University of Mary Washington. She earned a PhD in Clinical Psychology from Virginia Tech and MA in General/Experimental Psychology from The College of William & Mary. Her main area of expertise is post-trauma functioning, particularly in survivors of sexual violence or mass trauma (e.g., terrorism, mass shootings, combat). She also has interest in predictors of violence and aggression, including psychophysiological and personality factors.

meta analysis research questions

Careers Up Close: Joel Anderson on Gender and Sexual Prejudices, the Freedoms of Academic Research, and the Importance of Collaboration

Joel Anderson, a senior research fellow at both Australian Catholic University and La Trobe University, researches group processes, with a specific interest on prejudice, stigma, and stereotypes.

meta analysis research questions

Experimental Methods Are Not Neutral Tools

Ana Sofia Morais and Ralph Hertwig explain how experimental psychologists have painted too negative a picture of human rationality, and how their pessimism is rooted in a seemingly mundane detail: methodological choices. 

APS Fellows Elected to SEP

In addition, an APS Rising Star receives the society’s Early Investigator Award.

Privacy Overview

CookieDurationDescription
__cf_bm30 minutesThis cookie, set by Cloudflare, is used to support Cloudflare Bot Management.
CookieDurationDescription
AWSELBCORS5 minutesThis cookie is used by Elastic Load Balancing from Amazon Web Services to effectively balance load on the servers.
CookieDurationDescription
at-randneverAddThis sets this cookie to track page visits, sources of traffic and share counts.
CONSENT2 yearsYouTube sets this cookie via embedded youtube-videos and registers anonymous statistical data.
uvc1 year 27 daysSet by addthis.com to determine the usage of addthis.com service.
_ga2 yearsThe _ga cookie, installed by Google Analytics, calculates visitor, session and campaign data and also keeps track of site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognize unique visitors.
_gat_gtag_UA_3507334_11 minuteSet by Google to distinguish users.
_gid1 dayInstalled by Google Analytics, _gid cookie stores information on how visitors use a website, while also creating an analytics report of the website's performance. Some of the data that are collected include the number of visitors, their source, and the pages they visit anonymously.
CookieDurationDescription
loc1 year 27 daysAddThis sets this geolocation cookie to help understand the location of users who share the information.
VISITOR_INFO1_LIVE5 months 27 daysA cookie set by YouTube to measure bandwidth that determines whether the user gets the new or old player interface.
YSCsessionYSC cookie is set by Youtube and is used to track the views of embedded videos on Youtube pages.
yt-remote-connected-devicesneverYouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt-remote-device-idneverYouTube sets this cookie to store the video preferences of the user using embedded YouTube video.
yt.innertube::nextIdneverThis cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen.
yt.innertube::requestsneverThis cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen.

U.S. flag

An official website of the United States government

The .gov means it’s official. Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

The site is secure. The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

  • Publications
  • Account settings
  • My Bibliography
  • Collections
  • Citation manager

Save citation to file

Email citation, add to collections.

  • Create a new collection
  • Add to an existing collection

Add to My Bibliography

Your saved search, create a file for external citation management software, your rss feed.

  • Search in PubMed
  • Search in NLM Catalog
  • Add to Search

Systematic Reviews and Meta-Analysis: A Guide for Beginners

Affiliation.

  • 1 Department of Pediatrics, Advanced Pediatrics Centre, PGIMER, Chandigarh. Correspondence to: Prof Joseph L Mathew, Department of Pediatrics, Advanced Pediatrics Centre, PGIMER Chandigarh. [email protected].
  • PMID: 34183469
  • PMCID: PMC9065227
  • DOI: 10.1007/s13312-022-2500-y

Systematic reviews involve the application of scientific methods to reduce bias in review of literature. The key components of a systematic review are a well-defined research question, comprehensive literature search to identify all studies that potentially address the question, systematic assembly of the studies that answer the question, critical appraisal of the methodological quality of the included studies, data extraction and analysis (with and without statistics), and considerations towards applicability of the evidence generated in a systematic review. These key features can be remembered as six 'A'; Ask, Access, Assimilate, Appraise, Analyze and Apply. Meta-analysis is a statistical tool that provides pooled estimates of effect from the data extracted from individual studies in the systematic review. The graphical output of meta-analysis is a forest plot which provides information on individual studies and the pooled effect. Systematic reviews of literature can be undertaken for all types of questions, and all types of study designs. This article highlights the key features of systematic reviews, and is designed to help readers understand and interpret them. It can also help to serve as a beginner's guide for both users and producers of systematic reviews and to appreciate some of the methodological issues.

PubMed Disclaimer

Similar articles

  • How to review and assess a systematic review and meta-analysis article: a methodological study (secondary publication). Myung SK. Myung SK. J Educ Eval Health Prof. 2023;20:24. doi: 10.3352/jeehp.2023.20.24. Epub 2023 Aug 27. J Educ Eval Health Prof. 2023. PMID: 37619974 Free PMC article.
  • How to Critically Appraise and Interpret Systematic Reviews and Meta-Analyses of Diagnostic Accuracy: A User Guide. Frank RA, Salameh JP, Islam N, Yang B, Murad MH, Mustafa R, Leeflang M, Bossuyt PM, Takwoingi Y, Whiting P, Dawit H, Kang SK, Ebrahimzadeh S, Levis B, Hutton B, McInnes MDF. Frank RA, et al. Radiology. 2023 May;307(3):e221437. doi: 10.1148/radiol.221437. Epub 2023 Mar 14. Radiology. 2023. PMID: 36916896 Free PMC article. Review.
  • Characteristics, quality and volume of the first 5 months of the COVID-19 evidence synthesis infodemic: a meta-research study. Abbott R, Bethel A, Rogers M, Whear R, Orr N, Shaw L, Stein K, Thompson Coon J. Abbott R, et al. BMJ Evid Based Med. 2022 Jun;27(3):169-177. doi: 10.1136/bmjebm-2021-111710. Epub 2021 Jun 3. BMJ Evid Based Med. 2022. PMID: 34083212 Free PMC article. Review.
  • The future of Cochrane Neonatal. Soll RF, Ovelman C, McGuire W. Soll RF, et al. Early Hum Dev. 2020 Nov;150:105191. doi: 10.1016/j.earlhumdev.2020.105191. Epub 2020 Sep 12. Early Hum Dev. 2020. PMID: 33036834
  • Detecting, quantifying and adjusting for publication bias in meta-analyses: protocol of a systematic review on methods. Mueller KF, Meerpohl JJ, Briel M, Antes G, von Elm E, Lang B, Gloy V, Motschall E, Schwarzer G, Bassler D. Mueller KF, et al. Syst Rev. 2013 Jul 25;2:60. doi: 10.1186/2046-4053-2-60. Syst Rev. 2013. PMID: 23885765 Free PMC article.
  • Evidence, Experience, and Eminence: Building Blocks for Pediatric Pulmonology Practice and Research in India. Mathew JL. Mathew JL. Indian J Pediatr. 2023 Jul;90(7):690-692. doi: 10.1007/s12098-023-04649-y. Epub 2023 Jun 2. Indian J Pediatr. 2023. PMID: 37264278 Free PMC article. No abstract available.
  • Efficiency, Economy and Excellence: Experimental Exploration of Evidence-based Guideline Development in Resource-Constrained Settings. Mathew JL. Mathew JL. Indian J Pediatr. 2023 Jul;90(7):700-707. doi: 10.1007/s12098-023-04575-z. Epub 2023 Apr 28. Indian J Pediatr. 2023. PMID: 37106227 Free PMC article. Review.
  • Sackett D, Strauss S, Richardson W, et al. Evidence-Based Medicine: How to practice and teach EBM. 2nd ed. Churchill Livingstone: 2000.
  • Cook DJ, Mulrow CD, Haynes RB. Systematic reviews: Synthesis of best evidence for clinical decisions. Ann Intern Med. 1997;126:376–80. doi: 10.7326/0003-4819-126-5-199703010-00006. - DOI - PubMed
  • PennState Eberley College of Science. Lesson 4: Bias and Random Error. Accessed October 01, 2020. Available from: https://online.stat.psu.edu/stat509/node/26/
  • Comprehensive Meta-analysis. Accessed October 01, 2020. Available from: https://www.meta-analysis.com/pages/why_do.php?cart=
  • National Institute for Health Research. PROSPERO International prospective register of systematic reviews. Accessed October 01, 2020. Available from: https://utas.libguides.com/SystematicReviews/Protocol

Publication types

  • Search in MeSH

LinkOut - more resources

Full text sources.

  • Europe PubMed Central
  • Indian Pediatrics
  • PubMed Central

full text provider logo

  • Citation Manager

NCBI Literature Resources

MeSH PMC Bookshelf Disclaimer

The PubMed wordmark and PubMed logo are registered trademarks of the U.S. Department of Health and Human Services (HHS). Unauthorized use of these marks is strictly prohibited.

Study Design 101: Meta-Analysis

  • Case Report
  • Case Control Study
  • Cohort Study
  • Randomized Controlled Trial
  • Practice Guideline
  • Systematic Review

Meta-Analysis

  • Helpful Formulas
  • Finding Specific Study Types

A subset of systematic reviews; a method for systematically combining pertinent qualitative and quantitative study data from several selected studies to develop a single conclusion that has greater statistical power. This conclusion is statistically stronger than the analysis of any single study, due to increased numbers of subjects, greater diversity among subjects, or accumulated effects and results.

Meta-analysis would be used for the following purposes:

  • To establish statistical significance with studies that have conflicting results
  • To develop a more correct estimate of effect magnitude
  • To provide a more complex analysis of harms, safety data, and benefits
  • To examine subgroups with individual numbers that are not statistically significant

If the individual studies utilized randomized controlled trials (RCT), combining several selected RCT results would be the highest-level of evidence on the evidence hierarchy, followed by systematic reviews, which analyze all available studies on a topic.

  • Greater statistical power
  • Confirmatory data analysis
  • Greater ability to extrapolate to general population affected
  • Considered an evidence-based resource

Disadvantages

  • Difficult and time consuming to identify appropriate studies
  • Not all studies provide adequate data for inclusion and analysis
  • Requires advanced statistical techniques
  • Heterogeneity of study populations

Design pitfalls to look out for

The studies pooled for review should be similar in type (i.e. all randomized controlled trials).

Are the studies being reviewed all the same type of study or are they a mixture of different types?

The analysis should include published and unpublished results to avoid publication bias.

Does the meta-analysis include any appropriate relevant studies that may have had negative outcomes?

Fictitious Example

Do individuals who wear sunscreen have fewer cases of melanoma than those who do not wear sunscreen? A MEDLINE search was conducted using the terms melanoma, sunscreening agents, and zinc oxide, resulting in 8 randomized controlled studies, each with between 100 and 120 subjects. All of the studies showed a positive effect between wearing sunscreen and reducing the likelihood of melanoma. The subjects from all eight studies (total: 860 subjects) were pooled and statistically analyzed to determine the effect of the relationship between wearing sunscreen and melanoma. This meta-analysis showed a 50% reduction in melanoma diagnosis among sunscreen-wearers.

Real-life Examples

Goyal, A., Elminawy, M., Kerezoudis, P., Lu, V., Yolcu, Y., Alvi, M., & Bydon, M. (2019). Impact of obesity on outcomes following lumbar spine surgery: A systematic review and meta-analysis. Clinical Neurology and Neurosurgery, 177 , 27-36. https://doi.org/10.1016/j.clineuro.2018.12.012

This meta-analysis was interested in determining whether obesity affects the outcome of spinal surgery. Some previous studies have shown higher perioperative morbidity in patients with obesity while other studies have not shown this effect. This study looked at surgical outcomes including "blood loss, operative time, length of stay, complication and reoperation rates and functional outcomes" between patients with and without obesity. A meta-analysis of 32 studies (23,415 patients) was conducted. There were no significant differences for patients undergoing minimally invasive surgery, but patients with obesity who had open surgery had experienced higher blood loss and longer operative times (not clinically meaningful) as well as higher complication and reoperation rates. Further research is needed to explore this issue in patients with morbid obesity.

Nakamura, A., van Der Waerden, J., Melchior, M., Bolze, C., El-Khoury, F., & Pryor, L. (2019). Physical activity during pregnancy and postpartum depression: Systematic review and meta-analysis. Journal of Affective Disorders, 246 , 29-41. https://doi.org/10.1016/j.jad.2018.12.009

This meta-analysis explored whether physical activity during pregnancy prevents postpartum depression. Seventeen studies were included (93,676 women) and analysis showed a "significant reduction in postpartum depression scores in women who were physically active during their pregnancies when compared with inactive women." Possible limitations or moderators of this effect include intensity and frequency of physical activity, type of physical activity, and timepoint in pregnancy (e.g. trimester).

Related Terms

A document often written by a panel that provides a comprehensive review of all relevant studies on a particular clinical or health-related topic/question.

Publication Bias

A phenomenon in which studies with positive results have a better chance of being published, are published earlier, and are published in journals with higher impact factors. Therefore, conclusions based exclusively on published studies can be misleading.

Now test yourself!

1. A Meta-Analysis pools together the sample populations from different studies, such as Randomized Controlled Trials, into one statistical analysis and treats them as one large sample population with one conclusion.

a) True b) False

2. One potential design pitfall of Meta-Analyses that is important to pay attention to is:

a) Whether it is evidence-based. b) If the authors combined studies with conflicting results. c) If the authors appropriately combined studies so they did not compare apples and oranges. d) If the authors used only quantitative data.

Evidence Pyramid - Navigation

  • Meta- Analysis
  • Case Reports
  • << Previous: Systematic Review
  • Next: Helpful Formulas >>

Creative Commons License

  • Last Updated: Sep 25, 2023 10:59 AM
  • URL: https://guides.himmelfarb.gwu.edu/studydesign101

GW logo

  • Himmelfarb Intranet
  • Privacy Notice
  • Terms of Use
  • GW is committed to digital accessibility. If you experience a barrier that affects your ability to access content on this page, let us know via the Accessibility Feedback Form .
  • Himmelfarb Health Sciences Library
  • 2300 Eye St., NW, Washington, DC 20037
  • Phone: (202) 994-2850
  • [email protected]
  • https://himmelfarb.gwu.edu
  • - Google Chrome

Intended for healthcare professionals

  • My email alerts
  • BMA member login
  • Username * Password * Forgot your log in details? Need to activate BMA Member Log In Log in via OpenAthens Log in via your institution

Home

Search form

  • Advanced search
  • Search responses
  • Search blogs
  • A guide to prospective...

A guide to prospective meta-analysis

  • Related content
  • Peer review
  • Kylie E Hunter , senior project officer 1 ,
  • Saskia Cheyne , senior evidence analyst 1 ,
  • Davina Ghersi , senior principal research scientist, adjunct professor 1 2 ,
  • Jesse A Berlin , vice president, global head of epidemiology 3 ,
  • Lisa Askie , professor and director of systematic reviews and health technology assessment, manager of the Australian New Zealand Clinical Trials Registry 1
  • 1 NHMRC Clinical Trials Centre, University of Sydney, Locked bag 77, Camperdown NSW 1450, Australia
  • 2 National Health and Medical Research Council, Canberra, Australia
  • 3 Johnson & Johnson, Titusville, NJ, USA
  • Correspondence to: A L Seidler lene.seidler{at}ctc.usyd.edu.au
  • Accepted 8 August 2019

In a prospective meta-analysis (PMA), study selection criteria, hypotheses, and analyses are specified before the results of the studies related to the PMA research question are known, reducing many of the problems associated with a traditional (retrospective) meta-analysis. PMAs have many advantages: they can help reduce research waste and bias, and they are adaptive, efficient, and collaborative. Despite an increase in the number of health research articles labelled as PMAs, the methodology remains rare, novel, and often misunderstood. This paper provides detailed guidance on how to address the key elements for conducting a high quality PMA with a case study to illustrate each step.

Summary points

In a prospective meta-analysis (PMA), studies are identified and determined to be eligible for inclusion before the results of the studies related to the PMA research question are known

PMAs are applicable to high priority research questions where limited previous evidence exists and where new studies are expected to emerge

Compared with standard systematic review and meta-analysis protocols, key adaptations should be made to a PMA protocol, including search methods to identify planned and ongoing studies, details of studies that have already been identified for inclusion, core outcomes to be measured by all studies, collaboration management, and publication policy

A systematic search for planned and ongoing studies should precede a PMA, including a search of clinical trial registries and medical literature databases, and contacting relevant stakeholders in the specialty

PMAs are ideally conducted by a collaboration or consortium, including a central steering and data analysis committee, and representatives from each individual study

Usually PMAs collect individual participant data, but PMAs of aggregate data are also possible. PMAs can include interventional or observational studies

PMAs can enable harmonised collection of core outcomes, which can be particularly useful for rare but important outcomes, such as adverse side effects

Adaptive forms of PRISMA (preferred reporting items for systematic reviews and meta-analyses) and quality assessment approaches such as GRADE (grading of recommendations assessment, development, and evaluation) should be used to report and assess the quality of evidence for a PMA. The development of a standardised set of reporting guidelines and PMA specific evidence rating tools is highly desirable

PMAs can help to reduce research waste and bias, and they are adaptive, efficient, and collaborative

Systematic reviews and meta-analyses of the best available evidence are widely used to inform healthcare policy and practice. 1 2 Yet the retrospective nature of traditional systematic reviews and meta-analyses can be problematic. Positive results are more likely to be reported and published (phenomena known as selective outcome reporting and publication bias), and therefore including only published results in a meta-analysis can produce misleading results 3 and pose a threat to the validity of evidence based medicine. 4 In the planning stage of a traditional meta-analysis, knowledge of individual study results can influence the study selection process as choosing the key components of the review question and eligibility criteria might be based on one or more positive studies. 2 5 Meta-analyses on the same topic can reach conflicting conclusions because of different eligibility criteria. 2 Also, inconsistencies across individual studies in outcome measurement and analyses can make the combination of data difficult. 6

Prospective meta-analyses (PMAs, see box 1) have recently been described as next generation systematic reviews 7 that reduce the problems of traditional retrospective meta-analyses. Ioannidis and others even argue that “all primary original research may be designed, executed, and interpreted as prospective meta-analyses.” 8 9 For PMAs, studies are included prospectively, meaning before any individual study results related to the PMA research question are known. 10 This reduces the risk of publication bias and selective reporting bias and can enable better harmonisation of study outcomes.

Definition of a prospective meta-analysis

The key feature of a prospective meta-analysis (PMA) is that the studies or cohorts are identified as eligible for inclusion in the meta-analysis, and hypotheses and analysis strategies are specified, before the results of the studies or cohorts related to the PMA research question are known

The number of meta-analyses described as PMAs is increasing ( fig 1 ). But the definition, methodology, and reporting of previous PMAs vary greatly, and guidance on how to conduct them is limited, outdated, and inconsistent. 11 12 With recent advancements in computing capabilities, and the ability to identify planned and ongoing studies through increased trial registration, the planning and conduct of PMAs have become more efficient and effective. For PMAs to be successfully implemented in future health research, a revised PMA definition and expanded guidance are required. In this article, we, the Cochrane PMA Methods Group, present a step by step guide on how to perform a PMA. Our aim is to provide up to date guidance on the key principles, rationale, methods, and challenges for each step, to enable more researchers to understand and use this methodology successfully. Figure 2 shows a summary of the steps needed to perform a PMA.

Fig 1

Number of prospective meta-analyses (PMAs) over time. Possible PMA describes studies that seem to fulfil the criteria for a PMA but not enough information was reported to make a definite decision on their status as a PMA. These data are based on a systematic search of the literature (see appendix 1 for methodology)

  • Download figure
  • Open in new tab
  • Download powerpoint

Fig 2

Steps in conducting a prospective meta-analysis (PMA)

Case study: Neonatal Oxygenation Prospective Meta-analysis (NeOProM)

We will illustrate each step with an example of a PMA of randomised controlled trials conducted by the Neonatal Oxygenation Prospective Meta-analysis (NeOProM) Collaboration. 13 In this PMA, five groups prospectively planned to conduct separate, but similar, trials assessing different target ranges for oxygen saturation in preterm infants, and combine their results on completion. Although no difference was found in the composite primary outcome of death or major disability, a statistically significant reduction in the secondary outcome of death alone was found for the higher oxygen target range, but no change in major disability. This PMA resolved a major debate in neonatology.

Steps for performing a prospective meta-analysis

Step 0: deciding if a pma is the right methodology.

PMA methodology should be considered for a high priority research question for which new studies are expected to emerge and limited previous evidence exists (fig 3):

Priority research question —PMAs should be planned for research questions that are a high priority for healthcare decision makers. Ideally, these questions should be identified using priority setting methods within consumer-clinician collaborations, and/or they should address priorities identified by guideline committees, funding bodies, or clinical and research associations. Often these questions are in areas where important new treatment or prevention strategies have recently emerged, or where practice varies because of insufficient evidence.

New studies expected —PMAs are only feasible if new studies are likely to be included—for example, if the research question is an explicit priority for funding bodies or research associations. Some PMAs have been initiated after researchers learnt they were planning or conducting similar studies, and so they decided to collaborate and prospectively plan to combine their data. In other cases, a research question is posed by a consortium of investigators who then decide to plan similar studies that are combined on completion. A research team planning a PMA can play an active role in motivating other researchers to conduct similar studies addressing the same research question. A PMA can therefore be a catalyst for initiating a programme of priority research to answer important questions. 8 Initiating a PMA rather than conducting a large multicentre study can be advantageous as PMAs allow flexibility for each study to answer additional local questions, and the studies can be funded independently, which circumvents the problem of funding a mega study.

Insufficient previous evidence —A PMA should only be conducted if insufficient evidence exists to answer the research question. If sufficient evidence is available (eg, based on a retrospective meta-analysis), no further studies and no PMA should be planned, to avoid research waste.

Fig 3

When to conduct a prospective meta-analysis (PMA)

If evidence is available, but is insufficient for clinical decision making, a nested PMA should be considered. A nested PMA integrates prospective evidence into a retrospective meta-analysis, making best use of existing and emerging evidence while also retaining some benefits of PMAs. A nested PMA allows the assessment of publication bias and selective reporting bias by comparing prospectively included evidence with retrospective evidence in a sensitivity analysis. Studies that are prospectively included can be harmonised with other ongoing studies, and with previous related retrospective studies, to optimise evidence synthesis (see step 5).

PMA methodology was chosen to determine the optimal target range for oxygen saturation in preterm infants for several reasons:

Priority research question —oxygen has been used to treat preterm infants for more than 60 years. The different oxygen saturation target ranges used in practice have been associated with clinically important outcomes, such as mortality, disability, and blindness. Changing the oxygen saturation target range would be relatively easy to implement in clinical practice.

Insufficient previous evidence —evidence was mainly observational, with no recent, high quality randomised controlled trials available.

New studies expected— a total sample size of about 5000 infants was needed to detect an absolute difference in death or major disability of 4%. The NeOProM PMA was originally proposed as one large multicentre, multinational trial. 14 But because expensive masked pulse oximeters were needed, one funder could not support a study of sufficient sample size to reliably answer the clinical question. Instead, a PMA collaboration was initiated. Each group of NeOProM investigators obtained funding to conduct their own trial (although alone each study was underpowered to answer the main clinical question), could choose their own focus, and publish their own results, but with agreement to contribute data to the PMA to ensure sufficient combined statistical power to reliably detect differences in important outcomes.

Step 1: defining the research question and the eligibility criteria

At the start of a PMA, a research question needs to be specified. Research questions for PMAs should be formed in a similar way to traditional retrospective systematic reviews. Guidance for formulating a review question is available in the Cochrane Handbook for Systematic Reviews of Interventions . 15 For PMAs of interventional studies, the PICO system (population, intervention, comparison, outcome) should be used. To avoid selective reporting bias, the PMA research question and hypotheses need to be specified before any study results related to the PMA research questions are known.

PMAs are possible for a wide range of different study types—their applicability reaches beyond randomised controlled trials. An interventional PMA includes interventional studies (eg, randomised controlled trials or non-randomised studies of interventions). For interventional PMAs, the key inclusion criterion of “no results being known” usually means that the analyses have not been conducted in any of the trials included in the PMA.

An observational PMA includes observational studies. For observational PMAs, “no results being known” would mean that no analyses related to the PMA research question have been done. As many observational studies collect data on different outcomes, a meta-analysis can be classified as a PMA if unrelated research questions have already been analysed before inclusion in the PMA. For instance, for a PMA on the risk of lung cancer for people exposed to air pollution, observational studies where the relation between cardiovascular disease and air pollution has already been analysed can be included in the PMA, but only if the analyses on the association between lung cancer and air pollution have not been done. In this case, however, little harmonisation of outcome collection is possible (unless the investigators agree to collect additional data).

The NeOProM PMA addressed the research question, does targeting a lower oxygen saturation range in extremely preterm infants, from birth or soon after, increase or decrease the composite outcome of death or major disability in survivors by 4% or more?

The PICOS system was applied to define the eligibility criteria:

• Participants=infants born before 28 weeks’ gestation and enrolled within 24 hours of birth

• Intervention=target a lower (85-89%) oxygen saturation (SpO 2 ) range

• Comparator=target a higher (91-95%) SpO 2 range

• Outcome=composite of death or major disability at a corrected age of 18-24 months

• Study type=double blinded, randomised controlled trial (making this an interventional PMA).

Step 2: writing the protocol

Key elements of the protocol need to be finalised for the PMA before any individual study results related to the PMA research question are known. These include specification of the research questions, eligibility criteria for inclusion of studies, hypotheses, outcomes, and the statistical analysis strategy. The preferred reporting items for systematic reviews and meta-analyses extension for protocols (PRISMA-P) 16 provides some guidance on what should be included. As these reporting items were created for retrospective meta-analyses, however, key adaptations need to be made for PMA protocols (see box 2).

Key additional reporting items for a PMA protocol

For a PMA, several key items should be reported in the protocol in addition to PRISMA-P items:

Search methods

The search methods need to include how planned and ongoing studies are identified and how potential collaborators will be or have been contacted to participate (see step 3)

Study details

Details for studies already identified for inclusion should be listed, along with a statement that their results related to the PMA research question are not yet known (see step 1)

Core outcomes

Any core outcomes that will be measured by all the included studies should be specified, along with details on how and why they should be measured, to facilitate outcome harmonisation (see step 5)

Type of data collected

PMAs often collect individual participant data (that is, row by row data for each participant) but they may also collect aggregate data (that is, summary data for each study), and some combine both (see step 6)

Collaboration management and publication policy

Collaboration management and publication policy (see steps 4 and 7) should be specified, including details of any central steering and data analysis committees

An initial PMA protocol should be drafted before the search for eligible studies, but it can be amended after searching and after all studies have been included if the results of the included studies are not known when the PMA protocol is finalised. The investigators of the included studies can agree on the collection and analysis of additional rare outcomes and these outcomes can be included in a revised version of the protocol.

The final PMA protocol should be publicly available on the international prospective register of systematic reviews, PROSPERO 17 (which supports registration of PMAs), before the results (relating to the PMA research question) of any of the included studies are known. A full version of the PMA protocol can be published in a peer reviewed journal or elsewhere.

For the NeOProM PMA, an initial protocol was drafted by the lead investigators and discussed and refined by collaborators from all the included trials. The PMA protocol was registered on ClinicalTrials.gov in 2010 ( NCT01124331 ) because PROSPERO had not yet been launched. After the launch of PROSPERO in 2011, the protocol was registered (CRD42015019508). The full version of the protocol was published in BMC Pediatrics . 18

Step 3: searching for studies

After the PMA protocol is finalised, a systematic literature search is conducted, similar to that of a systematic review for a high quality meta-analysis. The main resources available for identifying planned and ongoing studies are clinical trial registries. Currently, 17 global clinical trial registries provide data to the World Health Organization’s International Clinical Trials Registry Platform. 19 Views on the best strategies for searching trial registries differ. 20 Limiting the search by date can be useful (eg, only studies registered within a reasonable time frame, taking into account the expected study duration and follow-up times) to reduce the search burden and exclude studies registered earlier that would likely be completed and thus ineligible for a PMA. Ideally, searches should be repeated on a regular basis to identify new eligible studies.

Prospective trial registration is mandated by various legislative, ethical, and regulatory bodies but compliance is not complete. 21 22 23 Observational studies are not required to be registered. Hence additional approaches to identifying planned and ongoing studies should be pursued, including searching bibliographic databases for conference abstracts, study protocols, and cohort descriptions, and approaching relevant stakeholders. The existence and possibility of joining the PMA can be publicised through the publication of PMA protocols, presentations at relevant conferences and research forums, and through an online presence (eg, a collaboration website).

For NeOProM, the Cochrane Central Register of Controlled Trials, Medline through PubMed, Embase, and CINAHL, clinical trial registries (using the WHO portal ( www.who.int/ictrp/en/ ) and ClinicalTrials.gov), conference proceedings, and the reference lists of retrieved articles were searched. Key researchers in the specialty were contacted to inquire if they were aware of additional trials. The abstracts of the relevant perinatal meetings (including the Neonatal Register and the Society for Paediatric Research) were searched using the keywords “oxygen saturation”. Five planned or ongoing trials meeting the inclusion criteria for the NeOProM PMA were identified, based in Australia, New Zealand, Canada, the United Kingdom, and the United States. The trials completed enrolment and follow-up between 2005 and 2014 and recruited a total of 4965 preterm infants born before 28 weeks’ gestation. No results for any of the trials were known at the time each trial agreed to be included in the PMA. All the NeOProM trials were identified by discussion with collaborators, and no additional trials were identified from electronic database searches.

Step 4: forming a collaboration of study investigators

Ideally, PMAs are conducted by a collaboration or consortium, including a central steering committee (leading the PMA and managing the collaboration), a data analysis committee (responsible for data management, processing, and analysis), and representatives from each study (involved in decisions on the protocol, analysis, and interpretation of the results). Regular collaboration meetings can be beneficial for achieving consensus on disagreements and in keeping study investigators involved in the PMA process. Transparent processes and a priori agreements are crucial for building and maintaining trust within a PMA collaboration.

Investigators might refuse to collaborate. Refusal to collaborate is less likely in a PMA than in a retrospective individual participant data meta-analysis as reaching agreement to share data is easier if studies are in their planning phases and can still be amended and harmonised after internal discussions. Aggregate data can be included in the PMA even if investigators refuse to collaborate, if the relevant summary data can be extracted from the resulting publications when the studies are completed. The ability to harmonise studies (step 5), however, may be limited if eligible investigators refuse to participate.

The NeOProM Collaboration comprised at least one investigator and a statistician from each of the included trials, and a steering group. All investigators and the steering group agreed on key aspects of the protocol before the results of the trials were known, and they also developed and agreed on a common data collection form, coding sheet, and detailed analysis plan. The NeOProM Collaboration met regularly by teleconference, and at least once a year face to face, to reach consensus on disagreements and to discuss the progress of individual trials, funding, data harmonisation, analysis plans, and interpretation of the PMA findings.

Step 5: harmonisation of included study population, intervention/exposure, and outcome collection

When a collaboration of investigators of planned or ongoing studies has been formed, the investigators can work together to harmonise the design, conduct, and outcome collection of the included studies to facilitate a meta-analysis and interpretation. A common problem with retrospective meta-analyses is that interventions are administered slightly differently across studies, or to different populations, and outcome collection, measurement, or reporting can differ. These differences make it difficult, and sometimes impossible, to synthesise results that are directly relevant to the study outcomes, interventions, and populations. In a PMA, studies are included as they are being planned or are ongoing, allowing researchers to agree on how to conduct their studies and collect common core outcomes. The PMA design enables the generation of evidence that is directly relevant to the research questions and thus increases confidence in the strength of the statements and recommendations derived from the PMA.

The ability to harmonise varies depending on the time when the PMA is first planned ( fig 4 ). In a de novo PMA, studies are planned as part of a PMA. For PMAs of interventional studies, a de novo PMA is similar to a multicentre trial: the included trials often share a common protocol, and usually the study population, interventions, and outcome collection are fully harmonised. In contrast, some PMAs identify studies for inclusion when data collection has already finished but no analyses related to the PMA research question have been conducted (outside of data safety monitoring committees). These types of PMAs allow little to no data harmonisation and are more similar to traditional retrospective meta-analyses. Yet they still have the advantage of reducing selection bias as the studies are deemed eligible for inclusion before their PMA specific results are known.

Fig 4

Different scenarios and time points when studies can be included in a prospective meta-analysis (PMA)

Harmonisation of studies in a PMA can occur for different elements of the included studies: study populations and settings; interventions or exposures (that is, independent variables); and outcomes collection. For study populations, settings, and interventions/exposures, harmonisation of studies to some degree is often beneficial to enable their successful synthesis. But some variation in the individual study protocols, populations, and interventions/exposures is often desirable to improve the generalisability (that is, external validity) of the research findings beyond one study, one form of the intervention, or narrow study specific populations. The variation in populations also enables subgroup analyses, evaluating if differences in populations between and within the studies leads to differences in treatment effects. If particular subgroups appear in more than one study, additional statistical power for subgroup analyses is also achieved.

Harmonisation of outcome collection requires careful consideration of the amount of common data needed to answer the relevant research questions. These discussions should aim to minimise unnecessary burden on participants and reduce research waste by avoiding excessive data collection, while increasing the ability to answer important research questions. Researchers can also agree to collect and analyse rare outcomes, such as severe but rare adverse events, that their individual studies would not have had the statistical power to detect. Collaborations should be specific on exactly how shared outcomes will be measured to avoid heterogeneity in outcome collection and difficulties in combining data. The COMET (core outcome measures in effectiveness trials) initiative ( www.comet-initiative.org/ ) has introduced methods for the development of core outcome sets, as detailed in its handbook. 24 These core outcome sets specify what and how outcomes should be measured by all studies of specific conditions to facilitate comparison and synthesis of the results. For health conditions with common core outcome sets, PMA collaborators should include the core outcomes, and also consider collecting other common outcomes that are particularly relevant for the specific research question posed. Not all outcomes have to be harmonised and collected by all studies: individual studies in a PMA have more autonomy than individual centres in a multicentre study and can collect study specific outcomes for their own purposes.

The improved availability of common core outcomes in a PMA has recently been shown in a PMA of childhood obesity interventions. 25 Harmonisation increased from 18% of core outcomes collected by all trials before the trial investigators agreed to collaborate, to 91% after the investigators decided to collaborate in a PMA.

Investigators of the five NeOProM trials first met in 2005 when the first trial was about to begin and the other four studies were in the early planning stages. With de novo PMA planning, all trials had the same intervention and comparator and collected similar outcome and subgroup variables. Some inconsistencies in outcome definitions and assessment methods across studies remained, however, and required substantial discussion to harmonise the final outcome collection and analyses.

Step 6: synthesising the evidence and assessing certainty of evidence

When all the individual studies have been completed, data can be synthesised in a PMA. For aggregate data PMA, results are extracted from publications or provided by the study authors. For individual participant data PMA, the line by line data from each participant in each study must be collated, harmonised, and analysed. This process is usually easier for PMAs than for traditional, retrospective individual participant data meta-analyses because if outcome collection and coding were previously harmonised, fewer inconsistencies should arise. If possible, plans to share data should be outlined in each study’s ethics application and consent form. For PMAs that are planned after the eligible studies have commenced, amendments to ethics applications may be necessary for data sharing. To assure independent data analysis, some PMAs appoint an independent data manager and statistician who have not been involved in any of the studies. The initial time intensive planning and harmonisation phase is followed by a waiting period when all the individual studies are completed before their data are made available and synthesised. During this middle period, PMAs usually demand little time and can run alongside other projects.

For studies where data safety monitoring committees are appropriate, it might be sensible for the committees to communicate and plan joint interim analyses to take account of all the available evidence when making recommendations to continue or stop a study. The PMA collaboration should consider establishing a joint data monitoring committee to synthesise data from all included studies at prespecified times. Methods for sequential meta-analysis and adaptive trial design could be considered in this context. 26

When all studies have been synthesised, the methodological quality of the included studies needs to be appraised with validated tools, such as those recommended by Cochrane. 27 28 The certainty of the evidence can be assessed with the grading of recommendations assessment, development and evaluation (GRADE) approach. 29

The NeOProM Collaboration was established in 2005, the first trial commenced in 2005, the last trial’s results were available in 2016, and the final combined analysis was published in 2018. At the request of two of the trials’ data monitoring committees, an interim analysis of data from these two trials was undertaken in 2011 and both trials were stopped early. 30 The five trials included in NeOProM were assessed for risk of bias with the Cochrane domains, 31 and consensus was reached by discussion with the full study group. The risk of bias assessments were more accurate and complete after detailed discussion of several domains (eg, allocation concealment and blinding) between the NeOProM Collaborators than would have been possible with their publications alone. GRADE assessments were performed and published in the Cochrane version of the meta-analysis. 32

Step 7: interpretation and reporting of results

Generally, the quality of the evidence derived from a PMA, and the extent to which causal inferences can be made, directly depend on the type and quality of the studies included in a PMA. The prospective nature of interventional PMAs make them similar to large multicentre trials, allowing for causal conclusions to be drawn rather than only associations, as sometimes suggested for traditional retrospective meta-analyses. The results of observational PMAs should generally be interpreted as providing associations, not causal effects, as only the results of observational studies are included. But with modern methods for causal inference from observational studies, justification for supporting conclusions about causality can sometimes be found. 33

Currently no PMA specific reporting standards exist, but where applicable, PMA authors should follow the PRISMA-IPD (PRISMA of individual participant data) statement 34 if they are reporting an individual participant data PMA, or the PRISMA statement 35 if they are reporting an aggregate data PMA. As well as the PRISMA items, authors of PMAs need to report on identification of planned and ongoing studies, the PMA timeline, collaboration policies, and outcome harmonisation processes.

Discussions about methodology and interpretation of the results among all collaborators can sometimes be difficult to navigate, particularly if the results from the combination of the studies contradict the results of some of the individual studies. Although these discussions can be demanding and time consuming, robust discussion among experts can lead to well considered and high quality publications that can directly inform policy and practice.

For the successful management of a PMA collaboration, an explicit authorship policy should be in place. One model is to offer authorship to each member of the secretariat, and one investigator from each included study, for the main PMA publication, assuming they fulfil the authorship criteria of the International Committee of Medical Journal Editors (ICMJE). This model incentivises ongoing involvement and allows for multiple viewpoints to be integrated in the final publication. The collaborators usually agree that the final PMA results cannot be published until the results of each study are accepted for publication, but this is not essential.

At least one investigator from each of the participating trials was a co-author on the final publication for NeOProM. 13 Collaborators met regularly, face to face and by phone, to resolve opposing views and achieve consensus on the interpretation of the PMA findings. Face to face meetings were crucial in resolving major disagreements within the NeOProM Collaboration. The collaborators used the PRISMA-IPD checklist for reporting of the PMA.

PMAs have many advantages: they help reduce research waste and bias, while greatly improving use of data, and they are adaptive, efficient, and collaborative. PMAs increase the statistical power to detect effects of treatment and enable harmonised collection of core outcomes, while allowing enough variation to obtain greater generalisability of findings. Compared with a multicentre study, PMAs are more decentralised and allow greater flexibility in terms of funding and timelines. Compared with a retrospective meta-analysis, PMAs enable more data harmonisation and control. Planning a PMA can help a group of researchers prioritise a research question they can address collaboratively and determine the optimal sample size a priori. Disadvantages of PMAs include difficulties in searching for planned and ongoing studies, often long waiting periods for studies to be completed, and difficulties in reaching consensus on the interpretation of the results. Table 1 shows a detailed comparison of the features and advantages and disadvantages of PMAs, multicentre studies, and retrospective meta-analyses.

Advantages and disadvantages of a prospective meta-analysis (PMA) compared with a multicentre study and a retrospective meta-analysis

  • View inline

Integration of PMAs with other next generation systematic review methodologies

PMAs can be combined with other new systematic review methodologies. Living systematic reviews begin with a traditional systematic review but have continual updates with a predetermined frequency. Living systematic reviews address similar research questions as PMAs (high priority questions with inconclusive evidence in an active research field). 36 In some instances it might be beneficial to combine these two methodologies. If authors are considering a PMA in a discipline where evidence is expected to become available gradually, a living PMA is an option. In living PMAs, new studies are included as they are being planned (but importantly before any of the results related to the PMA research questions are known), until a definitive effect has been found or the maximum required statistical information has been reached to conclude that no clinically important effect has been found. 37 Appropriate statistical methods for multiple testing should be strongly considered in living PMAs, such as sequential meta-analysis methodology which controls for type 1 and type 2 errors and takes into account heterogeneity. 26 PMA methodology can also be combined with other methods, such as network meta-analysis or meta-analysis of prognostic models.

Future for PMAs

With the advancement of machine learning, artificial intelligence, and big data, new horizons are seen for PMAs. Several steps need to be taken to improve the feasibility and quality of PMAs. Firstly, the ability to identify planned and ongoing studies needs to be improved by introducing further mechanisms to promote and enforce study registration and providing guidance on the best search strategies. The ICMJE requirement for prospective registration of clinical trials, together with several other ethical and regulatory initiatives, has improved registration rates of clinical trials but more improvement is needed. 38 22 Possible solutions include the integration of data submitted to ethics committees, funding bodies, and clinical trial registries. 21 The Cochrane PMA Methods Group, in collaboration with several trial registries, is working on improving methods for identifying planned and ongoing studies. Future technologies might automate the searching and screening process for planned and ongoing studies and automatically connect researchers who are planning similar relevant studies. Furthermore, the reporting and quality of PMAs needs to be improved. The reporting of PMAs would be greatly helped by the development of a standardised set of reporting guidelines to which PMA authors can adhere. Such guidelines are currently under development. Also, the development of PMA specific evidence rating tools (such as an extension to the GRADE approach) would be highly desirable. The Cochrane PMA Methods Group will publicise any new developments in this area on their website ( https://methods.cochrane.org/pma/ ).

PMAs have many advantages, and mandating trial registration, development of core outcome sets, and improved data sharing abilities have increased opportunities for conducting PMAs. We hope this step by step guidance on PMAs will improve the understanding of PMAs in the research community and enable more researchers to conduct successful PMAs. The Cochrane PMA Methods Group can offer advice for researchers planning to undertake PMAs.

Contributors: ALS conceived the idea and facilitated the workshop and discussions. LA, DG, KEH, and ALS participated in the workshop, and JAB and SC contributed to further discussions after the workshop. ALS, SC, and KEH performed the searches for a scoping review that was conducted in preparation for this article, reviewing all prospective meta-analyses and methods papers on prospective meta-analyses in health research to date. LA was the coordinator of the NeOProM Collaboration and KEH was a member. ALS wrote the first draft of the manuscript. All authors contributed to and revised the manuscript. ALS is the guarantor.

Competing interests: We have read and understood the BMJ Group policy on declaration of interests and declare the following: all authors are convenors or members of the Cochrane PMA Methods Group and have been involved in numerous prospective meta-analyses. LA, DG, and JAB have published several methods articles on prospective meta-analyses and are authors of the prospective meta-analysis chapter in the Cochrane Handbook for Systematic Reviews of Interventions . LA manages the Australian New Zealand Clinical Trials Registry (ANZCTR). ALS and KEH work for the ANZCTR. JAB is a full time employee of Johnson & Johnson.

Provenance and peer review: Not commissioned; externally peer reviewed.

  • National Health and Medical Research Council
  • Ioannidis JP
  • Krleza-Jerić K ,
  • Berlin JA ,
  • Ioannidis J
  • Halpern SD ,
  • Karlawish JHT ,
  • ↵ Ghersi D, Berlin J, Askie L. Prospective meta‐analysis. In: Higgins JPT, Green S (eds), Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 (updated March 2011). The Cochrane Collaboration, 2011:559-70.
  • Margitić SE ,
  • Morgan TM ,
  • Probstfield J ,
  • Applegate WB
  • Darlow BA ,
  • Neonatal Oxygenation Prospective Meta-analysis (NeOProM) Collaboration
  • Wright KW ,
  • Tarnow-Mordi W ,
  • Phelps DL ,
  • Pulse Oximetry Saturation Trial for Prevention of Retinopathy of Prematurity Planning Study Group
  • ↵ Green S, Higgins JPT (eds). Preparing a Cochrane review. In: Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 (updated March 2011). The Cochrane Collaboration, 2011.
  • Shamseer L ,
  • PRISMA-P Group
  • Brocklehurst P ,
  • Schmidt B ,
  • NeOProM Collaborative Group
  • ↵ World Health Organization (WHO). WHO International Clinical Trials Registry Platform (ICTRP) Search Portal: http://apps.who.int/trialsearch/ [accessed 6 November 2018].
  • Isojarvi J ,
  • Lefebvre C ,
  • Glanville J
  • Hunter KE ,
  • Seidler AL ,
  • Harriman SL ,
  • Williamson PR ,
  • Altman DG ,
  • Seidler A ,
  • Mihrshahi S ,
  • Simmonds M ,
  • Salanti G ,
  • McKenzie J ,
  • Elliott J ,
  • Living Systematic Review Network
  • Higgins JPT ,
  • Sterne JAC ,
  • Savović J ,
  • Sterne JA ,
  • Hernán MA ,
  • Reeves BC ,
  • Stenson B ,
  • U.K. BOOST II trial ,
  • Australian BOOST II trial ,
  • New Zealand BOOST II trial
  • Higgins J ,
  • Stewart LA ,
  • PRISMA-IPD Development Group
  • Liberati A ,
  • Tetzlaff J ,
  • PRISMA Group
  • Elliott JH ,
  • Guyatt GH ,

meta analysis research questions

  • Download PDF
  • Share X Facebook Email LinkedIn
  • Permissions

Practical Guide to Meta-analysis

  • 1 Stanford-Surgery Policy Improvement, Research and Education (S-SPIRE) Center, Palo Alto, California
  • 2 Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill
  • 3 Department of Surgery, University of Michigan, Ann Arbor
  • Editorial Maximizing the Impact of Surgical Health Services Research Amir A. Ghaferi, MD, MS; Adil H. Haider, MD, MPH; Melina R. Kibbe, MD JAMA Surgery
  • Guide to Statistics and Methods Practical Guide to Qualitative Analysis Margaret L. Schwarze, MD, MPP; Amy H. Kaji, MD, PhD; Amir A. Ghaferi, MD, MS JAMA Surgery
  • Guide to Statistics and Methods Practical Guide to Mixed Methods Lesly A. Dossett, MD, MPH; Amy H. Kaji, MD, PhD; Justin B. Dimick, MD, MPH JAMA Surgery
  • Guide to Statistics and Methods Practical Guide to Cost-effectiveness Analysis Benjamin S. Brooke, MD, PhD; Amy H. Kaji, MD, PhD; Kamal M. F. Itani, MD JAMA Surgery
  • Guide to Statistics and Methods Practical Guide to Comparative Effectiveness Research Using Observational Data Ryan P. Merkow, MD, MS; Todd A. Schwartz, DrPH; Avery B. Nathens, MD, MPH, PhD JAMA Surgery
  • Guide to Statistics and Methods Practical Guide to Health Policy Evaluation Using Observational Data John W. Scott, MD, MPH; Todd A. Schwartz, DrPH; Justin B. Dimick, MD, MPH JAMA Surgery
  • Guide to Statistics and Methods Practical Guide to Survey Research Karen Brasel, MD, MPH; Adil Haider, MD, MPH; Jason Haukoos, MD, MSc JAMA Surgery
  • Guide to Statistics and Methods Practical Guide to Assessment of Patient-Reported Outcomes Giana H. Davidson, MD, MPH; Jason S. Haukoos, MD, MSc; Liane S. Feldman, MD JAMA Surgery
  • Guide to Statistics and Methods Practical Guide to Implementation Science Heather B. Neuman, MD, MS; Amy H. Kaji, MD, PhD; Elliott R. Haut, MD, PhD JAMA Surgery
  • Guide to Statistics and Methods Practical Guide to Decision Analysis Dorry L. Segev, MD, PhD; Jason S. Haukoos, MD, MSc; Timothy M. Pawlik, MD, MPH, PhD JAMA Surgery

Meta-analysis is a systematic approach of synthesizing, combining, and analyzing data from multiple studies (randomized clinical trials 1 or observational studies 2 ) into a single effect estimate to answer a research question. Meta-analysis is especially useful if there is debate around the research question in the literature published to date or the individual published studies are underpowered. Vital to a high-quality meta-analysis is a comprehensive literature search, prespecified hypothesis and aims, reporting of study quality, consideration of heterogeneity and examination of bias. In the hierarchy of evidence, meta-analysis appears above observational studies and randomized clinical trials because it rigorously collates evidence across a larger body of literature; however, meta-analysis is largely dependent on the quality of the primary data.

  • Editorial Maximizing the Impact of Surgical Health Services Research JAMA Surgery

Read More About

Arya S , Schwartz TA , Ghaferi AA. Practical Guide to Meta-analysis. JAMA Surg. 2020;155(5):430–431. doi:10.1001/jamasurg.2019.4523

Manage citations:

© 2024

Artificial Intelligence Resource Center

Surgery in JAMA : Read the Latest

Browse and subscribe to JAMA Network podcasts!

Others Also Liked

Select your interests.

Customize your JAMA Network experience by selecting one or more topics from the list below.

  • Academic Medicine
  • Acid Base, Electrolytes, Fluids
  • Allergy and Clinical Immunology
  • American Indian or Alaska Natives
  • Anesthesiology
  • Anticoagulation
  • Art and Images in Psychiatry
  • Artificial Intelligence
  • Assisted Reproduction
  • Bleeding and Transfusion
  • Caring for the Critically Ill Patient
  • Challenges in Clinical Electrocardiography
  • Climate and Health
  • Climate Change
  • Clinical Challenge
  • Clinical Decision Support
  • Clinical Implications of Basic Neuroscience
  • Clinical Pharmacy and Pharmacology
  • Complementary and Alternative Medicine
  • Consensus Statements
  • Coronavirus (COVID-19)
  • Critical Care Medicine
  • Cultural Competency
  • Dental Medicine
  • Dermatology
  • Diabetes and Endocrinology
  • Diagnostic Test Interpretation
  • Drug Development
  • Electronic Health Records
  • Emergency Medicine
  • End of Life, Hospice, Palliative Care
  • Environmental Health
  • Equity, Diversity, and Inclusion
  • Facial Plastic Surgery
  • Gastroenterology and Hepatology
  • Genetics and Genomics
  • Genomics and Precision Health
  • Global Health
  • Guide to Statistics and Methods
  • Hair Disorders
  • Health Care Delivery Models
  • Health Care Economics, Insurance, Payment
  • Health Care Quality
  • Health Care Reform
  • Health Care Safety
  • Health Care Workforce
  • Health Disparities
  • Health Inequities
  • Health Policy
  • Health Systems Science
  • History of Medicine
  • Hypertension
  • Images in Neurology
  • Implementation Science
  • Infectious Diseases
  • Innovations in Health Care Delivery
  • JAMA Infographic
  • Law and Medicine
  • Leading Change
  • Less is More
  • LGBTQIA Medicine
  • Lifestyle Behaviors
  • Medical Coding
  • Medical Devices and Equipment
  • Medical Education
  • Medical Education and Training
  • Medical Journals and Publishing
  • Mobile Health and Telemedicine
  • Narrative Medicine
  • Neuroscience and Psychiatry
  • Notable Notes
  • Nutrition, Obesity, Exercise
  • Obstetrics and Gynecology
  • Occupational Health
  • Ophthalmology
  • Orthopedics
  • Otolaryngology
  • Pain Medicine
  • Palliative Care
  • Pathology and Laboratory Medicine
  • Patient Care
  • Patient Information
  • Performance Improvement
  • Performance Measures
  • Perioperative Care and Consultation
  • Pharmacoeconomics
  • Pharmacoepidemiology
  • Pharmacogenetics
  • Pharmacy and Clinical Pharmacology
  • Physical Medicine and Rehabilitation
  • Physical Therapy
  • Physician Leadership
  • Population Health
  • Primary Care
  • Professional Well-being
  • Professionalism
  • Psychiatry and Behavioral Health
  • Public Health
  • Pulmonary Medicine
  • Regulatory Agencies
  • Reproductive Health
  • Research, Methods, Statistics
  • Resuscitation
  • Rheumatology
  • Risk Management
  • Scientific Discovery and the Future of Medicine
  • Shared Decision Making and Communication
  • Sleep Medicine
  • Sports Medicine
  • Stem Cell Transplantation
  • Substance Use and Addiction Medicine
  • Surgical Innovation
  • Surgical Pearls
  • Teachable Moment
  • Technology and Finance
  • The Art of JAMA
  • The Arts and Medicine
  • The Rational Clinical Examination
  • Tobacco and e-Cigarettes
  • Translational Medicine
  • Trauma and Injury
  • Treatment Adherence
  • Ultrasonography
  • Users' Guide to the Medical Literature
  • Vaccination
  • Venous Thromboembolism
  • Veterans Health
  • Women's Health
  • Workflow and Process
  • Wound Care, Infection, Healing
  • Register for email alerts with links to free full-text articles
  • Access PDFs of free articles
  • Manage your interests
  • Save searches and receive search alerts

Banner

Systematic Reviews & Meta-Analysis

Identifying your research question.

  • Developing Your Protocol
  • Conducting Your Search
  • Screening & Selection
  • Data Extraction & Appraisal
  • Meta-Analyses
  • Writing the Systematic Review
  • Suggested Readings

The first step in performing a Systematic Review is to develop your research question. The guidance provided on how to develop your research question for literature reviews will still apply here. The difference with a systematic review research question is that you must have a clearly defined question and consider what problem are you trying to address by conducting the review. The most important point is that you focus your question and design the question so that it is answerable by the research that you will be systematically examining.

Once you have developed your research question, it should not be changed once the review process has begun, as the review protocol needs to be formed around the question. 

Literature Review Question Systematic Review Question
Can be broad; highlight only particular pieces of literature, or support a particular viewpoint. Requires the question to be well-defined and focused so it is possible to answer.

To help develop and focus your research question you may use one of the question frameworks below.

Methods for Refining a Research Topic

PICO questions can be useful in the health or social sciences. PICO stands for:

  • Patient, Population, or Problem : What are the characteristics of the patient(s) or population, i.e. their ages, genders, or other demographics? What is the situation, disease, etc., that you are interested in?
  • Intervention or Exposure : What do you want to do with the patient, person, or population (i.e. observe, diagnose, treat)?
  • Comparison : What is the alternative to the intervention (i.e. a different drug, a different assignment in a classroom)?
  • Outcome : What are the relevant outcomes (i.e. complications, morbidity, grades)?

Additionally, the following are variations to the PICO framework:

  • PICO(T) : The 'T' stands for Timing, where you would define the duration of treatment and the follow-up schedule that matter to patients. Consider both long- and short-term outcomes.
  • PICO(S) : The 'S' stands for Study type (eg. randomized controlled trial), sometimes S can be used to stand for Setting or Sample Size

PPAARE is a useful question framework for patient care:

Problem -  Description of the problem related to the disease or condition

Patient - Description of the patient related to their demographics and risk factors

Action  - Description of the action related to the patient’s diagnosis, diagnostic test, etiology, prognosis, treatment or therapy, harm, prevention, patient ed.

Alternative - Description of the alternative to the action when there is one. (Not required)

Results   -   Identify the patient’s result of the action to produce, improve, or reduce the outcome for the patient

Evidence   -   Identify the level of evidence available after searching

SPIDER is a useful question framework for qualitative evidence synthesis:

Sample  - The group of participants, population, or patients being investigated. Qualitative research is not easy to generalize, which is why sample is preferred over patient.

Phenomenon of Interest - The reasons for behavior and decisions, rather than an intervention.

Design  - The research method and study design used for the research, such as interview or survey.

Evaluation  - The end result of the research or outcome measures.

Research type -   The research type; Qualitative, quantitative and/or mixed methods.

SPICE is a particularly useful method in the social sciences. It stands for

  • Setting (e.g. United States)
  • Perspective (e.g. adolescents)
  • Intervention (e.g. text message reminders)
  • Comparisons (e.g. telephone message reminders)
  • Evaluation (e.g. number of homework assignments turned in after text message reminder compared to the number of assignments turned in after a telephone reminder)

CIMO is useful method in the social sciences or organisational context. It stands for

  • Context - Which individuals, relationships, institutional settings, or wider systems are being studied?
  • Intervention - The effects of what event, action, or activity are being studied?
  • Mechanism - What are the mechanisms that explain the relationship between interventions and outcomes? Under what circumstances are these mechanisms activated or not activated?
  • Outcomes - What are the effects of the intervention? How will the outcomes be measured? What are the intended and unintended effects?

Has Your Systematic Review Already Been Done?

Once you have a reasonably well defined research question, it is important to check if your question has already been asked, or if there are other systematic reviews that are similar to that which you're preparing to do.

In the context of conducting a review, even if you do find one on your topic, it may be sufficiently out of date or you may find other defendable reasons to undertake a new or updated one. In addition, locating an existing systematic reviews may also provide a starting point for selecting a review topic, it may help you refocus your question, or redirect your research toward other gaps in the literature.

You may locate existing systematic reviews or protocols on the following resources:

  • Cochrane Library This link opens in a new window The Cochrane Library is a database collection containing high-quality, independent evidence, including systematic reviews and controlled trials, to inform healthcare decision-making. Terms of Use .
  • MEDLINE (EBSCO) This link opens in a new window Medline (EBSCO) produced by the U.S. National Library of Medicine is the premier database of biomedicine and health sciences, covering life sciences including biology, environmental science, marine biology, plant and animal science, biophysics and chemistry. Terms of Use . Coverage: 1950-present.

Open Access

  • PsycINFO This link opens in a new window Contains over 5 million citations and summaries of peer-reviewed journal articles, book chapters, and dissertations from the behavioral and social sciences in 29 languages from 50 countries. Terms of Use . Coverage: 1872-present.
  • << Previous: Systematic Reviews
  • Next: Developing Your Protocol >>
  • Last Updated: Jun 28, 2024 10:04 AM
  • URL: https://libguides.chapman.edu/systematic_reviews

Meta-analysis of data

Meta-analysis

Reviewed by Psychology Today Staff

Meta-analysis is an objective examination of published data from many studies of the same research topic identified through a literature search. Through the use of rigorous statistical methods, it can reveal patterns hidden in individual studies and can yield conclusions that have a high degree of reliability. It is a method of analysis that is especially useful for gaining an understanding of complex phenomena when independent studies have produced conflicting findings.

Meta-analysis provides much of the underpinning for evidence-based medicine. It is particularly helpful in identifying risk factors for a disorder, diagnostic criteria, and the effects of treatments on specific populations of people, as well as quantifying the size of the effects. Meta-analysis is well-suited to understanding the complexities of human behavior.

  • How Does It Differ From Other Studies?
  • When Is It Used?
  • What Are Some Important Things Revealed by Meta-analysis?

Person performing a meta-analysis

There are well-established scientific criteria for selecting studies for meta-analysis. Usually, meta-analysis is conducted on the gold standard of scientific research—randomized, controlled, double-blind trials. In addition, published guidelines not only describe standards for the inclusion of studies to be analyzed but also rank the quality of different types of studies. For example, cohort studies are likely to provide more reliable information than case reports.

Through statistical methods applied to the original data collected in the included studies, meta-analysis can account for and overcome many differences in the way the studies were conducted, such as the populations studied, how interventions were administered, and what outcomes were assessed and how. Meta-analyses, and the questions they are attempting to answer, are typically specified and registered with a scientific organization, and, with the protocols and methods openly described and reviewed independently by outside investigators, the research process is highly transparent.

Meta-analysis of data

Meta-analysis is often used to validate observed phenomena, determine the conditions under which effects occur, and get enough clarity in clinical decision-making to indicate a course of therapeutic action when individual studies have produced disparate findings. In reviewing the aggregate results of well-controlled studies meeting criteria for inclusion, meta-analysis can also reveal which research questions, test conditions, and research methods yield the most reliable results, not only providing findings of immediate clinical utility but furthering science.

The technique can be used to answer social and behavioral questions large and small. For example, to clarify whether or not having more options makes it harder for people to settle on any one item, a meta-analysis of over 53 conflicting studies on the phenomenon was conducted. The meta-analysis revealed that choice overload exists—but only under certain conditions. You will have difficulty selecting a TV show to watch from the massive array of possibilities, for example, if the shows differ from each other in multiple ways or if you don’t have any strong preferences when you finally get to sit down in front of the TV.

Person analyzing results of meta-analysis

A meta-analysis conducted in 2000, for example, answered the question of whether physically attractive people have “better” personalities . Among other traits, they prove to be more extroverted and have more social skills than others. Another meta-analysis, in 2014, showed strong ties between physical attractiveness as rated by others and having good mental and physical health. The effects on such personality factors as extraversion are too small to reliably show up in individual studies but real enough to be detected in the aggregate number of study participants. Together, the studies validate hypotheses put forth by evolutionary psychologists that physical attractiveness is important in mate selection because it is a reliable cue of health and, likely, fertility.

meta analysis research questions

When considered across a lifetime, no within-person association exists between religiosity and psychological well-being.

meta analysis research questions

What are the prevalence rates of psychosis for displaced refugees?

meta analysis research questions

A recent review provides compelling evidence that arts engagement significantly reduces cognitive decline and enhances the quality of life among healthy older adults.

meta analysis research questions

Personal Perspective: Mental healthcare AI is evolving beyond administrative roles. By automating routine tasks, therapists can spend sessions focusing on human interactions.

meta analysis research questions

Investing in building a positive classroom climate holds benefits for students and teachers alike.

meta analysis research questions

Mistakenly blaming cancer-causing chemicals and radiation for most cancers lets us avoid the simple lifestyle changes that could protect us from cancer far more.

meta analysis research questions

According to astronomer Carl Sagan, "Extraordinary claims require extraordinary evidence." Does the claim that pet owners live longer pass the extraordinary evidence requirement?

meta analysis research questions

People, including leading politicians, are working later in life than ever before. Luckily, social science suggests that aging does not get in the way of job performance.

meta analysis research questions

The healthcare industry is regulated to ensure patient safety, efficacy of treatments, and ethical practices. Why aren't these standards applied to mental health apps?

meta analysis research questions

Being able to forgive others makes you more resilient. You can learn to let go of anger and bitterness.

  • Find a Therapist
  • Find a Treatment Center
  • Find a Psychiatrist
  • Find a Support Group
  • Find Online Therapy
  • United States
  • Brooklyn, NY
  • Chicago, IL
  • Houston, TX
  • Los Angeles, CA
  • New York, NY
  • Portland, OR
  • San Diego, CA
  • San Francisco, CA
  • Seattle, WA
  • Washington, DC
  • Asperger's
  • Bipolar Disorder
  • Chronic Pain
  • Eating Disorders
  • Passive Aggression
  • Personality
  • Goal Setting
  • Positive Psychology
  • Stopping Smoking
  • Low Sexual Desire
  • Relationships
  • Child Development
  • Self Tests NEW
  • Therapy Center
  • Diagnosis Dictionary
  • Types of Therapy

July 2024 magazine cover

Sticking up for yourself is no easy task. But there are concrete skills you can use to hone your assertiveness and advocate for yourself.

  • Emotional Intelligence
  • Gaslighting
  • Affective Forecasting
  • Neuroscience

Recent Developments in Group Psychotherapy Research

Information & authors, metrics & citations, view options, research on group psychotherapy outcomes.

meta analysis research questions

Meta-Analyses on the Efficacy of Group Therapy in Treating Mental Disorders

meta analysis research questions

Moderators of Group Treatment Outcomes

Group therapy for medical conditions, general conclusions regarding group psychotherapy outcomes, limitations of studies on group psychotherapy outcomes, other relevant research topics, feedback systems in group therapy, cohesion and alliance, promising developments, conclusions, information, published in.

Go to American Journal of Psychotherapy

  • Group psychotherapy

Affiliations

Competing interests, funding information, export citations.

If you have the appropriate software installed, you can download article citation data to the citation manager of your choice. Simply select your manager software from the list below and click Download. For more information or tips please see 'Downloading to a citation manager' in the Help menu .

Format
Citation style
Style

To download the citation to this article, select your reference manager software.

There are no citations for this item

View options

Login options.

Already a subscriber? Access your subscription through your login credentials or your institution for full access to this article.

Purchase Options

Purchase this article to access the full text.

PPV Articles - APT - American Journal of Psychotherapy

Not a subscriber?

Subscribe Now / Learn More

PsychiatryOnline subscription options offer access to the DSM-5-TR ® library, books, journals, CME, and patient resources. This all-in-one virtual library provides psychiatrists and mental health professionals with key resources for diagnosis, treatment, research, and professional development.

Need more help? PsychiatryOnline Customer Service may be reached by emailing [email protected] or by calling 800-368-5777 (in the U.S.) or 703-907-7322 (outside the U.S.).

Share article link

Copying failed.

PREVIOUS ARTICLE

Next article, request username.

Can't sign in? Forgot your username? Enter your email address below and we will send you your username

If the address matches an existing account you will receive an email with instructions to retrieve your username

Create a new account

Change password, password changed successfully.

Your password has been changed

Reset password

Can't sign in? Forgot your password?

Enter your email address below and we will send you the reset instructions

If the address matches an existing account you will receive an email with instructions to reset your password.

Your Phone has been verified

American Psychological Association Logo

Who is prejudiced, and toward whom? The big five traits and generalized prejudice

Suggested instructions.

This research summary can be incorporated into a lecture or shared with students as a reading to demonstrate how psychological science explores the diversity of human experience. This resource can also be used to encourage students to identify the major components of a research study (i.e., the hypothesis or study question, sample, method, and findings). Students can be asked to identify potential limitations of the study and encouraged to discuss the implications for our understanding of human behavior. Keywords highlight how concepts within and across pillars are incorporated into a single research study.

Research summary

This study by Jarret Crawford (The College of New Jersey) and Mark Brandt (Tilburg University) examined two questions: (1) Are people with particular personality traits prejudiced against most different types of groups (generalized prejudice), and (2) are people with particular personality traits prejudiced against certain types of groups? To test these questions, they used a meta-analysis, a type of research method that combines the results of multiple different scientific studies. This study combined the results of four different studies (including more than 7,500 people). Their findings revealed that the trait of agreeableness was inversely related to general feelings of prejudice, with people higher in agreeableness showing less prejudice against people in different groups. Their findings also revealed that the trait of openness was slightly inversely related to general feelings of prejudice, with people higher in openness showing somewhat less prejudice against people in different groups. This research tells us that people who are low in the trait of agreeableness seem to see other people more negatively, which includes showing higher levels of prejudice against people in different types of groups.

Crawford, J. T., & Brandt, M. J. (2019). Who is prejudiced, and toward whom? The big five traits and generalized prejudice. Personality and Social Psychology Bulletin , 45 (10), 1455–1467. https://doi.org/10.1177/0146167219832335

Race, Personality, Prejudice

The development of resources to broaden diversity and representation in the teaching of high school psychology resources is an APA Committee of Teachers of Psychology in Secondary Schools initiative supported by funding from the American Psychological Foundation David and Carol Myers Fund. This resource was developed by Eric Castro, Shawn C.T. Jones, PhD, Allyn Olsen, and Catherine Sanderson, PhD.

  • Resources to Broaden Diversity and Representation in the High School Curriculum
  • Social and Personality Pillar Resources

Information

  • Author Services

Initiatives

You are accessing a machine-readable page. In order to be human-readable, please install an RSS reader.

All articles published by MDPI are made immediately available worldwide under an open access license. No special permission is required to reuse all or part of the article published by MDPI, including figures and tables. For articles published under an open access Creative Common CC BY license, any part of the article may be reused without permission provided that the original article is clearly cited. For more information, please refer to https://www.mdpi.com/openaccess .

Feature papers represent the most advanced research with significant potential for high impact in the field. A Feature Paper should be a substantial original Article that involves several techniques or approaches, provides an outlook for future research directions and describes possible research applications.

Feature papers are submitted upon individual invitation or recommendation by the scientific editors and must receive positive feedback from the reviewers.

Editor’s Choice articles are based on recommendations by the scientific editors of MDPI journals from around the world. Editors select a small number of articles recently published in the journal that they believe will be particularly interesting to readers, or important in the respective research area. The aim is to provide a snapshot of some of the most exciting work published in the various research areas of the journal.

Original Submission Date Received: .

  • Active Journals
  • Find a Journal
  • Proceedings Series
  • For Authors
  • For Reviewers
  • For Editors
  • For Librarians
  • For Publishers
  • For Societies
  • For Conference Organizers
  • Open Access Policy
  • Institutional Open Access Program
  • Special Issues Guidelines
  • Editorial Process
  • Research and Publication Ethics
  • Article Processing Charges
  • Testimonials
  • Preprints.org
  • SciProfiles
  • Encyclopedia

nutrients-logo

Article Menu

meta analysis research questions

  • Subscribe SciFeed
  • Google Scholar
  • on Google Scholar
  • Table of Contents

Find support for a specific problem in the support section of our website.

Please let us know what you think of our products and services.

Visit our dedicated information section to learn more about MDPI.

JSmol Viewer

The effect of probiotics on the management of pain and inflammation in osteoarthritis: a systematic review and meta-analysis of clinical studies.

meta analysis research questions

1. Introduction

2. materials and methods, 2.1. search strategy, 2.2. inclusion/exclusion criteria and study selection, 2.3. outcomes measures, 2.4. data items and data extraction, 2.5. data analysis, 2.6. quality review, 2.7. certainty of evidence, 2.8. publication bias, 3.1. study selection, 3.2. study characteristics, 3.3. meta-analysis/forest plot interpretation, 3.4. risk of bias of included studies, 3.5. certainty of evidence, 4. discussion, 5. conclusions, supplementary materials, author contributions, conflicts of interest.

  • Steinmetz, J.D.; Culbreth, G.T.; Haile, L.M.; Rafferty, Q.; Lo, J.; Fukutaki, K.G.; Cruz, J.A.; Smith, A.E.; Vollset, S.E.; Brooks, P.M.; et al. Global, Regional, and National Burden of Osteoarthritis, 1990–2020 and Projections to 2050: A Systematic Analysis for the Global Burden of Disease Study 2021. Lancet Rheumatol. 2023 , 5 , e508–e522. [ Google Scholar ] [ CrossRef ]
  • Hunter, D.J.; Bierma-Zeinstra, S. Osteoarthritis. Lancet 2019 , 393 , 1745–1759. [ Google Scholar ] [ CrossRef ]
  • Berenbaum, F.; Griffin, T.M.; Liu-Bryan, R. Review: Metabolic Regulation of Inflammation in Osteoarthritis. Arthritis Rheumatol. 2017 , 69 , 9–21. [ Google Scholar ] [ CrossRef ]
  • Azzini, G.O.M.; Santos, G.S.; Visoni, S.B.C.; Azzini, V.O.M.; dos Santos, R.G.; Huber, S.C.; Lana, J.F. Metabolic Syndrome and Subchondral Bone Alterations: The Rise of Osteoarthritis—A Review. J. Clin. Orthop. Trauma 2020 , 11 , S849–S855. [ Google Scholar ] [ CrossRef ]
  • Ramires, L.C.; Santos, G.S.; Ramires, R.P.; da Fonseca, L.F.; Jeyaraman, M.; Muthu, S.; Lana, A.V.; Azzini, G.; Smith, C.S.; Lana, J.F. The Association between Gut Microbiota and Osteoarthritis: Does the Disease Begin in the Gut? Int. J. Mol. Sci. 2022 , 23 , 1494. [ Google Scholar ] [ CrossRef ]
  • Marchese, L.; Contartese, D.; Giavaresi, G.; Di Sarno, L.; Salamanna, F. The Complex Interplay between the Gut Microbiome and Osteoarthritis: A Systematic Review on Potential Correlations and Therapeutic Approaches. Int. J. Mol. Sci. 2023 , 25 , 143. [ Google Scholar ] [ CrossRef ]
  • Biver, E.; Berenbaum, F.; Valdes, A.M.; Araujo de Carvalho, I.; Bindels, L.B.; Brandi, M.L.; Calder, P.C.; Castronovo, V.; Cavalier, E.; Cherubini, A.; et al. Gut Microbiota and Osteoarthritis Management: An Expert Consensus of the European Society for Clinical and Economic Aspects of Osteoporosis, Osteoarthritis and Musculoskeletal Diseases (ESCEO). Ageing Res. Rev. 2019 , 55 , 100946. [ Google Scholar ] [ CrossRef ]
  • Kolasinski, S.L.; Neogi, T.; Hochberg, M.C.; Oatis, C.; Guyatt, G.; Block, J.; Callahan, L.; Copenhaver, C.; Dodge, C.; Felson, D.; et al. 2019 American College of Rheumatology/Arthritis Foundation Guideline for the Management of Osteoarthritis of the Hand, Hip, and Knee. Arthritis Care Res. 2020 , 72 , 149–162. [ Google Scholar ] [ CrossRef ]
  • Bannuru, R.R.; Osani, M.C.; Vaysbrot, E.E.; Arden, N.K.; Bennell, K.; Bierma-Zeinstra, S.M.A.; Kraus, V.B.; Lohmander, L.S.; Abbott, J.H.; Bhandari, M.; et al. OARSI Guidelines for the Non-Surgical Management of Knee, Hip, and Polyarticular Osteoarthritis. Osteoarthr. Cartil. 2019 , 27 , 1578–1589. [ Google Scholar ] [ CrossRef ]
  • Kloppenburg, M.; Kroon, F.P.; Blanco, F.J.; Doherty, M.; Dziedzic, K.S.; Greibrokk, E.; Haugen, I.K.; Herrero-Beaumont, G.; Jonsson, H.; Kjeken, I.; et al. 2018 Update of the EULAR Recommendations for the Management of Hand Osteoarthritis. Ann. Rheum. Dis. 2019 , 78 , 16–24. [ Google Scholar ] [ CrossRef ]
  • da Costa, B.R.; Pereira, T.V.; Saadat, P.; Rudnicki, M.; Iskander, S.M.; Bodmer, N.S.; Bobos, P.; Gao, L.; Kiyomoto, H.D.; Montezuma, T.; et al. Effectiveness and Safety of Non-Steroidal Anti-Inflammatory Drugs and Opioid Treatment for Knee and Hip Osteoarthritis: Network Meta-Analysis. BMJ 2021 , 375 , n2321. [ Google Scholar ] [ CrossRef ]
  • Qiu, D.; Xia, Z.; Deng, J.; Jiao, X.; Liu, L.; Li, J. Glucorticoid-induced Obesity Individuals Have Distinct Signatures of the Gut Microbiome. BioFactors 2019 , 45 , 892–901. [ Google Scholar ] [ CrossRef ]
  • Garg, K.; Mohajeri, M.H. Potential Effects of the Most Prescribed Drugs on the Microbiota-Gut-Brain-Axis: A Review. Brain Res. Bull. 2024 , 207 , 110883. [ Google Scholar ] [ CrossRef ]
  • Zádori, Z.S.; Király, K.; Al-Khrasani, M.; Gyires, K. Interactions between NSAIDs, Opioids and the Gut Microbiota—Future Perspectives in the Management of Inflammation and Pain. Pharmacol. Ther. 2023 , 241 , 108327. [ Google Scholar ] [ CrossRef ]
  • Wang, Z.; Jones, G.; Blizzard, L.; Aitken, D.; Zhou, Z.; Wang, M.; Balogun, S.; Cicuttini, F.; Antony, B. Prevalence and Correlates of the Use of Complementary and Alternative Medicines among Older Adults with Joint Pain. Int. J. Rheum. Dis. 2023 , 26 , 1760–1769. [ Google Scholar ] [ CrossRef ]
  • Basedow, M.; Runciman, W.B.; March, L.; Esterman, A. Australians with Osteoarthritis; the Use of and Beliefs about Complementary and Alternative Medicines. Complement. Ther. Clin. Pract. 2014 , 20 , 237–242. [ Google Scholar ] [ CrossRef ]
  • Liu, X.; Machado, G.C.; Eyles, J.P.; Ravi, V.; Hunter, D.J. Dietary Supplements for Treating Osteoarthritis: A Systematic Review and Meta-Analysis. Br. J. Sports Med. 2018 , 52 , 167–175. [ Google Scholar ] [ CrossRef ]
  • FAO/WHO Evaluation of Health and Nutritional Properties of Probiotics in Food Including Powder Milk with Live Acid Bacteria ; Report of a Joint FAO/WHO Expert Consultation; FAO Food and Nutrition Paper; World Health Organization: Cordoba, Argentina, 2001; Volume 85, pp. 5–35.
  • Cunningham, M.; Azcarate-Peril, M.A.; Barnard, A.; Benoit, V.; Grimaldi, R.; Guyonnet, D.; Holscher, H.D.; Hunter, K.; Manurung, S.; Obis, D.; et al. Shaping the Future of Probiotics and Prebiotics. Trends Microbiol. 2021 , 29 , 667–685. [ Google Scholar ] [ CrossRef ]
  • Kim, S.-K.; Guevarra, R.B.; Kim, Y.-T.; Kwon, J.; Kim, H.; Cho, J.H.; Kim, H.B.; Lee, J.-H. Role of Probiotics in Human Gut Microbiome-Associated Diseases. J. Microbiol. Biotechnol. 2019 , 29 , 1335–1340. [ Google Scholar ] [ CrossRef ]
  • Sophocleous, A.; Azfer, A.; Huesa, C.; Stylianou, E.; Ralston, S.H. Probiotics Inhibit Cartilage Damage and Progression of Osteoarthritis in Mice. Calcif. Tissue Int. 2023 , 112 , 66–73. [ Google Scholar ] [ CrossRef ]
  • Lei, M.; Guo, C.; Wang, D.; Zhang, C.; Hua, L. The Effect of Probiotic Lactobacillus Casei Shirota on Knee Osteoarthritis: A Randomised Double-Blind, Placebo-Controlled Clinical Trial. Benef. Microbes 2017 , 8 , 697–704. [ Google Scholar ] [ CrossRef ]
  • Lyu, J.L.; Wang, T.M.; Chen, Y.H.; Chang, S.T.; Wu, M.S.; Lin, Y.H.; Lin, Y.H.; Kuan, C.M. Oral Intake of Streptococcus Thermophilus Improves Knee Osteoarthritis Degeneration: A Randomized, Double-Blind, Placebo-Controlled Clinical Study. HELIYON 2020 , 6 , e03757. [ Google Scholar ] [ CrossRef ]
  • Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews. BMJ 2021 , 372 , n71. [ Google Scholar ] [ CrossRef ]
  • Higgins, J.; Green, S. (Eds.) Cochrane Handbook for Systematic Reviews of Interventions ; Version 5.1.0; [Updated March 2011]; The Nordic Cochrane Centre, The Cochrane Collaboration: Copenhagen, Denmark, 2011; Available online: www.handbook.cochrane.org (accessed on 12 May 2024).
  • Ma, J.; Liu, W.; Hunter, A.; Zhang, W. Performing Meta-Analysis with Incomplete Statistical Information in Clinical Trials. BMC Med. Res. Methodol. 2008 , 8 , 56. [ Google Scholar ] [ CrossRef ]
  • Weir, C.J.; Butcher, I.; Assi, V.; Lewis, S.C.; Murray, G.D.; Langhorne, P.; Brady, M.C. Dealing with Missing Standard Deviation and Mean Values in Meta-Analysis of Continuous Outcomes: A Systematic Review. BMC Med. Res. Methodol. 2018 , 18 , 25. [ Google Scholar ] [ CrossRef ]
  • Review Manager (RevMan) [Computer Program] , Version 5.3; The Nordic Cochrane Centre, The Cochrane Collaboration: Copenhagen, Denmark, 2014.
  • Punja, S.; Schmid, C.H.; Hartling, L.; Urichuk, L.; Nikles, C.J.; Vohra, S. To Meta-Analyze or Not to Meta-Analyze? A Combined Meta-Analysis of N-of-1 Trial Data with RCT Data on Amphetamines and Methylphenidate for Pediatric ADHD. J. Clin. Epidemiol. 2016 , 76 , 76–81. [ Google Scholar ] [ CrossRef ]
  • Higgins, J.P.T.; Altman, D.G.; Gøtzsche, P.C.; Jüni, P.; Moher, D.; Oxman, A.D.; Savović, J.; Schulz, K.F.; Weeks, L.; Sterne, J.A.C. The Cochrane Collaboration’s Tool for Assessing Risk of Bias in Randomised Trials. BMJ 2011 , 343 , d5928. [ Google Scholar ] [ CrossRef ]
  • Tate, R.L.; Perdices, M.; Rosenkoetter, U.; Wakim, D.; Godbee, K.; Togher, L.; McDonald, S. Revision of a Method Quality Rating Scale for Single-Case Experimental Designs and n -of-1 Trials: The 15-Item Risk of Bias in N -of-1 Trials (RoBiNT) Scale. Neuropsychol. Rehabil. 2013 , 23 , 619–638. [ Google Scholar ] [ CrossRef ]
  • Ryan, R.; Hill, S. How to GRADE the Quality of the Evidence , Version 3.0; The Nordic Cochrane Centre, The Cochrane Collaboration: Copenhagen, Denmark, 2016. Available online: http://cccrg.cochrane.org/author-resources (accessed on 14 May 2024).
  • Cumpston, M.; Li, T.; Page, M.J.; Chandler, J.; Welch, V.A.; Higgins, J.P.; Thomas, J. Updated Guidance for Trusted Systematic Reviews: A New Edition of the Cochrane Handbook for Systematic Reviews of Interventions. Cochrane Database Syst. Rev. 2019 , 2019 , ED000142. [ Google Scholar ] [ CrossRef ]
  • Taye, I.; Bradbury, J.; Grace, S.; Avila, C. Probiotics for Pain of Osteoarthritis; An N-of-1 Trial of Individual Effects. Complement. Ther. Med. 2020 , 54 , 102548. [ Google Scholar ] [ CrossRef ]
  • Perdices, M.; Tate, R.L.; Rosenkoetter, U. An Algorithm to Evaluate Methodological Rigor and Risk of Bias in Single-Case Studies. Behav. Modif. 2023 , 47 , 1482–1509. [ Google Scholar ] [ CrossRef ]
  • Guan, Z.; Jia, J.; Zhang, C.; Sun, T.; Zhang, W.; Yuan, W.; Leng, H.; Song, C. Gut Microbiome Dysbiosis Alleviates the Progression of Osteoarthritis in Mice. Clin. Sci. 2020 , 134 , 3159–3174. [ Google Scholar ] [ CrossRef ]
  • Boer, C.G.; Radjabzadeh, D.; Medina-Gomez, C.; Garmaeva, S.; Schiphof, D.; Arp, P.; Koet, T.; Kurilshikov, A.; Fu, J.; Ikram, M.A.; et al. Intestinal Microbiome Composition and Its Relation to Joint Pain and Inflammation. Nat. Commun. 2019 , 10 , 4881. [ Google Scholar ] [ CrossRef ]
  • Sanchez, P.; Letarouilly, J.-G.; Nguyen, Y.; Sigaux, J.; Barnetche, T.; Czernichow, S.; Flipo, R.-M.; Sellam, J.; Daïen, C. Efficacy of Probiotics in Rheumatoid Arthritis and Spondyloarthritis: A Systematic Review and Meta-Analysis of Randomized Controlled Trials. Nutrients 2022 , 14 , 354. [ Google Scholar ] [ CrossRef ]
  • Zeng, L.; Deng, Y.; He, Q.; Yang, K.; Li, J.; Xiang, W.; Liu, H.; Zhu, X.; Chen, H. Safety and efficacy of probiotic supplementation in 8 types of inflammatory arthritis: A systematic review and meta-analysis of 34 randomized controlled trials. Front. Immunol. 2022 , 13 , 961325. [ Google Scholar ] [ CrossRef ]
  • Farì, G.; Megna, M.; Scacco, S.; Ranieri, M.; Raele, M.V.; Noya, E.C.; Macchiarola, D.; Bianchi, F.P.; Carati, D.; Gnoni, A.; et al. Effects of Terpenes on the Osteoarthritis Cytokine Profile by Modulation of IL-6: Double Face versus Dark Knight? Biology 2023 , 12 , 1061. [ Google Scholar ] [ CrossRef ] [ PubMed ] [ PubMed Central ]
  • Farì, G.; Megna, M.; Scacco, S.; Ranieri, M.; Raele, M.V.; Noya, E.C.; Macchiarola, D.; Bianchi, F.P.; Carati, D.; Panico, S.; et al. Hemp Seed Oil in Association with β-Caryophyllene, Myrcene and Ginger Extract as a Nutraceutical Integration in Knee Osteoarthritis: A Double-Blind Prospective Case-Control Study. Medicina 2023 , 59 , 191. [ Google Scholar ] [ CrossRef ] [ PubMed ] [ PubMed Central ]
  • Johnson, J.S.; Spakowicz, D.J.; Hong, B.-Y.; Petersen, L.M.; Demkowicz, P.; Chen, L.; Leopold, S.R.; Hanson, B.M.; Agresta, H.O.; Gerstein, M.; et al. Evaluation of 16S RRNA Gene Sequencing for Species and Strain-Level Microbiome Analysis. Nat. Commun. 2019 , 10 , 5029. [ Google Scholar ] [ CrossRef ]
  • Jansson, P.-A.; Curiac, D.; Lazou Ahrén, I.; Hansson, F.; Martinsson Niskanen, T.; Sjögren, K.; Ohlsson, C. Probiotic Treatment Using a Mix of Three Lactobacillus Strains for Lumbar Spine Bone Loss in Postmenopausal Women: A Randomised, Double-Blind, Placebo-Controlled, Multicentre Trial. Lancet Rheumatol. 2019 , 1 , e154–e162. [ Google Scholar ] [ CrossRef ]
  • Wirth, W.; Ladel, C.; Maschek, S.; Wisser, A.; Eckstein, F.; Roemer, F. Quantitative Measurement of Cartilage Morphology in Osteoarthritis: Current Knowledge and Future Directions. Skeletal Radiol. 2023 , 52 , 2107–2122. [ Google Scholar ] [ CrossRef ]
  • Fillingim, R.B. Individual Differences in Pain: Understanding the Mosaic That Makes Pain Personal. Pain 2017 , 158 , S11–S18. [ Google Scholar ] [ CrossRef ]
  • Stone, A.A.; Broderick, J.E.; Goldman, R.E.; Junghaenel, D.U.; Bolton, A.; May, M.; Schneider, S.I. Indices of Pain Intensity Derived from Ecological Momentary Assessments: Rationale and Stakeholder Preferences. J. Pain 2021 , 22 , 359–370. [ Google Scholar ] [ CrossRef ]
  • Schneider, S.; Junghaenel, D.U.; Broderick, J.E.; Ono, M.; May, M.; Stone, A.A., II. Indices of Pain Intensity Derived from Ecological Momentary Assessments and Their Relationships with Patient Functioning: An Individual Patient Data Meta-Analysis. J. Pain 2021 , 22 , 371–385. [ Google Scholar ] [ CrossRef ]
  • Schneider, S.; Junghaenel, D.U.; Ono, M.; Broderick, J.E.; Stone, A.A., III. Detecting Treatment Effects in Clinical Trials with Different Indices of Pain Intensity Derived from Ecological Momentary Assessment. J. Pain 2021 , 22 , 386–399. [ Google Scholar ] [ CrossRef ]
  • Coulson, S.; Butt, H.; Vecchio, P.; Gramotnev, H.; Vitetta, L. Green-lipped mussel extract ( Perna canaliculus ) and glucosamine sulphate in patients with knee osteoarthritis: Therapeutic efficacy and effects on gastrointestinal microbiota profiles. Inflammopharmacology 2013 , 21 , 79–90. [ Google Scholar ] [ CrossRef ]
  • Mizrahi, A.; Pilmis, B.; Lambert, T.; Mohamed-Hadj, A.; Wolff, S.; Desplaces, N.; Le Monnier, A. Thumb osteoarthritis caused by Lactobacillus plantarum . Med. Mal. Infect. 2016 , 46 , 237–239. [ Google Scholar ] [ CrossRef ]
  • Pasticci, M.B.; Baldelli, F.; Malincarne, L.; Mancini, G.B.; Marroni, M.; Morosi, S.; Stagni, G. Vancomycin-resistant Enterococcus faecium osteoarthritis following Staphylococcus aureus hip infection. Orthopedics 2005 , 28 , 1457–1458. [ Google Scholar ] [ CrossRef ]

Click here to enlarge figure

Study/YearPopulationAge (Years)No. of Patients
after Dropout
(No. Male/Female)
Study DesignTreatment Groups (n)Treatment DoseTreatment PeriodOutcomes Reported
Lei, et al., (2017) [ ]Patients with knee OA66.9 ± 5.0433 (192/241)RCT = 215) = 218) CFU of Lactobacillus casei Shirota, daily 6 months
Lyu, et al., (2020) [ ]Patients with knee OA 60.8 ± 12.267 (14/53)RCT = 37) = 30) bacteria per capsule, daily 12 weeks
Taye, et al., (2020) [ ]Patients with OA in lower back and right ankle671 (0/1)N-of-1 trial (LGG ), Saccharomyces cerevisiae (boulardii), and Bifidobacterium animalis ssp lactis (BB-12 ) (LGG ) (10 × 10 CFU), Saccharomyces cerevisiae (boulardii) (7.5 × 10 CFU), and Bifidobacterium animalis ssp lactis (BB-12 ) (5 × 10 CFU), daily NOTE: Three treatment blocks, each with one pair of active/placebo interventions, randomly orderedSix intervention periods, each lasting 3 weeks and separated by a 2-week washout period
In total: 32 weeks
OutcomeStd. Mean Difference (95% CI)Statistical MethodTest for HeterogeneityTest for Overall Effect
WOMAC pain subscale score−1.13 (−13.32, 11.05)Std. mean difference, IV, Random effects, 95% CIChi = 457.60, df = 1,
p < 0.00001, I = 100%
z = 0.18, p = 0.86
WOMAC stiffness subscale score−21.31 (−63.41, 20.79)Std. mean difference, IV, Random effects, 95% CIChi = 125.52, df = 1,
p < 0.00001, I = 99%
z = 0.99, p = 0.32
WOMAC physical function subscale score−4.59 (−20.75, 11.56)Std. mean difference, IV, Random effects, 95% CIChi = 740.16, df = 1,
p < 0.00001, I = 100%
z = 0.56, p = 0.58
Study IDInternal Validity (IV) Subscale Total IV
(Out of 14)
External Validity and Interpretation (EVI) Subscale Total EVI
(Out of 16)
Total Score (Out of 30)
123456789101112131415
Taye et al., 2020 [ ] 2 22 2 1 1 0 10112212021121
The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Moyseos, M.; Michael, J.; Ferreira, N.; Sophocleous, A. The Effect of Probiotics on the Management of Pain and Inflammation in Osteoarthritis: A Systematic Review and Meta-Analysis of Clinical Studies. Nutrients 2024 , 16 , 2243. https://doi.org/10.3390/nu16142243

Moyseos M, Michael J, Ferreira N, Sophocleous A. The Effect of Probiotics on the Management of Pain and Inflammation in Osteoarthritis: A Systematic Review and Meta-Analysis of Clinical Studies. Nutrients . 2024; 16(14):2243. https://doi.org/10.3390/nu16142243

Moyseos, Maria, Jenny Michael, Nuno Ferreira, and Antonia Sophocleous. 2024. "The Effect of Probiotics on the Management of Pain and Inflammation in Osteoarthritis: A Systematic Review and Meta-Analysis of Clinical Studies" Nutrients 16, no. 14: 2243. https://doi.org/10.3390/nu16142243

Article Metrics

Article access statistics, supplementary material.

ZIP-Document (ZIP, 336 KiB)

Further Information

Mdpi initiatives, follow mdpi.

MDPI

Subscribe to receive issue release notifications and newsletters from MDPI journals

Log in using your username and password

  • Search More Search for this keyword Advanced search
  • Latest content
  • Current issue
  • BMJ Journals

You are here

  • Online First
  • What is the vibration of effects?
  • Article Text
  • Article info
  • Citation Tools
  • Rapid Responses
  • Article metrics

Download PDF

  • http://orcid.org/0000-0002-6899-1838 Constant Vinatier 1 ,
  • Sabine Hoffmann 2 , 3 ,
  • Chirag Patel 4 ,
  • http://orcid.org/0000-0001-8286-1995 Nicholas J DeVito 5 ,
  • http://orcid.org/0000-0002-9854-7076 Ioana Alina Cristea 6 ,
  • Braden Tierney 7 ,
  • http://orcid.org/0000-0003-3118-6859 John P A Ioannidis 8 ,
  • http://orcid.org/0000-0003-3760-3801 Florian Naudet 1 , 9
  • 1 Univ Rennes, CHU Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, Centre d’investigation clinique de Rennes (CIC1414) , Rennes , France
  • 2 Department of Statistics , Ludwig-Maximilians-Universität München , München , Germany
  • 3 LMU Open Science Center , Ludwig-Maximilians-Universität München , München , Germany
  • 4 Department of Biomedical Informatics , Harvard Medical School , Boston , Massachusetts , USA
  • 5 Nuffield Primary Care Health Sciences , University of Oxford , Oxford , UK
  • 6 Department of General Psychology , University of Padova , Pavia , Italy
  • 7 Department of Physiology and Biophysics , Weill Cornell Medical College , New York , New York , USA
  • 8 Departments of Medicine, of Epidemiology and Population Health, of Biomedical Data Science, and of Statistics, and Meta-Research Innovation Center at Stanford (METRICS) , Stanford University , Stanford , California , USA
  • 9 Institut Universitaire de France (IUF) , Paris , France
  • Correspondence to Constant Vinatier, University of Rennes, Rennes, France; constant.vinatier1{at}gmail.com

https://doi.org/10.1136/bmjebm-2023-112747

Statistics from Altmetric.com

Request permissions.

If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.

  • Evidence-Based Practice

Introduction

Navigating between contradictory results is not rare in the practice of evidence-based medicine. Recently, two papers published in the same year and in the same journal investigated the same research question with the same dataset and reached divergent results regarding the benefits of retrieval bag use during laparoscopic appendectomy. 1 The two studies reached contrasting conclusions, one found that these bags actually reduce the risk of infection 2 while the other study found no support for a difference. 3 Likewise, a multitude of network meta-analyses about the treatment of psoriasis reached divergent conclusions on the best drug to use, 4 the best drug always being the one of the drug manufacturer in case of industry-funded meta-analyses. Implementing the findings of medical research for decision-making in clinical practice is quite challenging when scientific results stand on such unstable ground. One reason, among others, is analytical flexibility that represents the variability in results arising from ‘researcher degrees of freedom’ (ie, uncertain decisions researchers have to make in study design, data collection and data analysis 5 ). Analytical flexibility arises, for instance, when researchers have to choose among multiple justifiable methods, models or measurements. Given this analytical variability and under the pressure to publish, researchers may try different analysis strategies and selectively report the most impressive, desirable, publishable result. 6 Not surprisingly, reported results may be, on average, inflated. 7

Assessing and reporting analytical variability

A realistic approach to explore analytical variability may rely on seeing how different investigators can reasonably approach and vary their choices for a given research question and dataset. In a ‘Multi-analyst study’ 8 several independent teams analyse the same dataset to assess the analytical choices and variability in results. 9 Each team may choose how they best want to analyse the data. For instance, 29 research teams independently investigated the same dataset to examine whether skin tone was associated with red cards in soccer. The teams employed a variety of statistical models leading to considerable differences in effect size and statistical significance. 9 This approach is, however, challenging to implement as it relies on recruiting and managing a large network of independent teams which still may leave many plausible analytic strategies unexplored. Moreover, very often it is difficult to justify which specific analytical choices are more meaningful than others.

Another less complex approach in term of feasibility may help. Accordingly, the VoE is a more general framework that can be used for any research project to explore analytical variability in a more comprehensive way. It involves computing the results of a very large number of possible analysis strategies by varying one or more analytical choices in all possible analytical scenarios and comparing their impact on the observed results.

Figure 1 illustrates the VoE approach using observational data from National Health and Nutrition Examination Survey (NHANES) prescription data by fitting 9595 different models (of which 6242 models converge) exploring the association of systolic blood pressure (in mm Hg) and use of lisinopril. 10 Lisinopril is a drug to treat high blood pressure and it is the most commonly prescribed drug in the NHANES 2011–2018 prescription data. The estimated beta coefficients of interest ranged from −0.553 to 0.575, with a median of 0.003 mm Hg.

  • Download figure
  • Open in new tab
  • Download powerpoint

Vibration of effects of beta coefficient in the exploration of the association between lisinopril usage and systolic blood pressure. An estimate <0 suggests lower systolic blood pressure with lisinopril. This figure was produced using data from Tierney et al 10 by fitting 9595 random select models, among all possible models, exploring the association and using 253 covariates, with a maximum number of variables in the model set to 20. Data and code to reproduce the figure are available on the Open Science Framework at https://osf.io/xfy75/ . (A) Dots represent the 6242 convergent regression models among the 9595 randomly selected models. Colours represent densities (red=high, blue=low), with marginal density plot of distributions. (B) Point estimates and 95% CIs for all models. Colours represent densities (red=high, blue=low).

Various indicators have been proposed to assess and quantify VoE. Some are focused on p values, including the range of p values (RP): the difference between the 99th and the 1st percentile of the negative log transformation of the p value. 11 12 Others are based on effect sizes, including, for instance, relative OR and relative HR, which are the ratio of the 1st and the 99th percentile of the OR and the HR, respectively. 11 13 A Janus effect, named after the two-faced Greek-Roman god Janus, is defined in a VoE study by the presence of opposite results (eg, ORs on both sides of the no-effect line) among all possible analysis strategies, 11 and indicates substantial analytical variability: for example, a treatment seems to be better than control in some analyses while the control is better than the treatment in some other analyses of the very same data; or a biomarker seems to be a risk factor for a disease in some analyses and a protective factor for that disease in some other analyses using the same dataset.

Regarding the example presented in figure 1 , there was a Janus effect with the 1st percentile being negative (−0.247) and the 99th percentile being positive (0.126). A total of 1.6% (154/9595) of the associations were statistically significant at p<0.05 (141 and 13 of which were negative and positive, respectively). The p values of the different models ranged from 0.1510×10– 6 to 0.9997 with a median of 0.6908 ( figure 1 ). Given this wide variability in results, it would have been easy to selectively report a favourable or unfavourable association between lisinopril intake and systolic blood pressure based on analytical flexibility .

VoE in primary research and evidence syntheses

From data processing choices 14 (eg, eligibility criteria, handling of outliers, 15 dichotomisation of outcome 16 and of covariates) to model selection, many sources of analytical flexibility exist in primary research and can be explored using the VoE framework. There is a continuum between study designs. Randomised controlled trials (RCTs) control flexibility through generally more stringent design choices than observational studies, for example, randomisation limits confounding. 17 The presence of analytical variability also depends on (1) the richness of datasets (eg, big, wide, deep data), especially when data are not collected for research purpose 10 and need to be heavily preprocessed, 18 and (2) the complexity 19 of the models being considered (eg, linear models vs more complex ones), 20 as there are many more junctures where choices can be made with complex models. The VoE framework allows for exploring the stability of results within a primary study.

However, the phenomenon of analytical variability is not restricted to primary research and is also observed in evidence synthesis methods. While meta-analyses are supposed to be exhaustive, reproducible quantitative syntheses of the available evidence on a given research question, they are also prone to analytical flexibility. Differences in inclusion/exclusion criteria regarding Population-Intervention-Comparison-Outcomes-Study design and other analytical choices can lead to substantial analytical variability, especially for controversial topics with high clinical and statistical heterogeneity, 21 or when the evidence synthesis methods are complex and rely on assumptions that are difficult to verify (eg, exchangeability in comparative effectiveness research). Analytical variability has been observed in head-to-head meta-analyses assessing the efficacy of acupuncture for smoking cessation 21 and operative compared with non-operative treatments for proximal humerus fractures, 12 in an indirect comparison of nalmefene to naltrexone 22 and in a network meta-analysis of 21 antidepressants 23 with 172/231 (74%) comparisons exhibiting a Janus effect. 24 To a smaller extent, analytical variability has been observed in 16 332 individual-level data meta-analyses exploring the efficacy of canagliflozin versus placebo in type 2 diabetes, depending on the combinations of RCTs to be included. 25

Implications for evidence-based medicine

The VoE framework may be helpful to assess the robustness of results to alternative plausible analytical choices in a systematic way. It also offers a valuable meta-research tool to explore specific replicability issues, such as controversies across discrepant meta-analyses. For instance, there is some controversy regarding the additional benefit of escitalopram, a single-enantiomer drug of citalopram, whose launch coincided with the expiration of exclusivity for citalopram. 26 Despite its large commercial success, the superiority of escitalopram over citalopram remains uncertain with contradictory claims based on conflicting meta-analyses. 23 26–29 Using the most comprehensive Network Meta Analysis to date, 23 it could be interesting to explore whether difference in treatment selection can lead to different effect estimates in terms of magnitude and statistical significance. 24 Among the 4 116 254 possible network meta-analyses based on the 21 included treatments, 1 174 541 included both drugs. The estimated ORs ranged from 0.735 to 0.982, with a median of 0.881 (1st percentile 0.747, 99th percentile 0.965). There was no Janus effect since all OR estimates were in favour of escitalopram, possibly owing to the identification of an effect in the direct comparisons (OR=0.753 (0.630, 0.900), 13 studies). However, the RP was 0.0003–0.8258 with median of 0.1196 and an RP of 2.348. Only 33% (389 726) of the associations reached statistical significance at p<0.05 ( figure 2 ). The VoE framework allowed to explore the robustness of the identified difference depending on treatment selection. If a genuine difference exists, its identification and magnitude depends on the pathways used for indirect comparisons as defined by the different network geometry. As a direct consequence, the VoE framework has the potential to help explore controversies in evidence-based medicine such as conflicting meta-analyses on the same topic. And indeed, it has been argued that there are inconsistencies between direct and indirect evidence in the escitalopram-citalopram comparison, with some doubts concerning even the reliability of the direct evidence. 26

Vibration of effects for the comparison of escitalopram versus citalopram in the treatment of major depressive disorder. Data and code to reproduce the figure are respectively available on the Open Science Framework at https://osf.io/xfy75/ . (A) An OR<1 is in favour of escitalopram. In the graphs on the right, dots represent meta-analyses and colours represent densities (red=high; blue=low), with marginal density plot of distributions. Full methods are detailed at https://doi.org/10.17605/OSF.IO/MB5DY . (B) Example of a network of 12 treatments (in blue) that failed to identify a difference, OR=0.98 (0.84; 1.15) (p=0.823). Treatments in grey are treatment of the full meta-analysis not included in the network. Size of the points represents the number of patients included. (C) Example of a network of 12 treatments (in blue) that identified a difference, OR=0.74 (0.61; 0.89) (p=0.001). Treatments in grey are treatment of the full meta-analysis not included in the network. Size of the points represents the number of patients included.

The VoE framework can be very informative, but it should be handled with care. Even a strong association that is seemingly robust in a VoE analysis could be a false positive. Additionally, care should be given to choosing what parameters should vary and what reasonable and plausible limits to set on the variation examined. Since there is usually no consensus regarding all potential methodological choices, defining the set of model specifications to consider, VoE analysis is itself a subjective choice. Moreover, some model specifications examined may not be sensible or valid, for example, if they include collider variables that can impact the effect of interest. In the same vein, the VoE framework is an agnostic approach that explores all possible choices, although researchers may usually consider existing framework and rationales when making their choices and some combinations may make less sense than others. Finally, conducting all possible subset analyses within a study can lead to an overwhelming number of analyses, posing computational challenges.

The existence of analytical variability in primary research and in evidence syntheses is an important argument for registration of statistical analysis plans. 17 It allows, for primary research and also for evidence syntheses, 30 to check if they deviated from their initial plans. In addition, registration for evidence syntheses may help limit the conduct of redundant meta-analysis that may end in divergent results and even more divergent interpretations, 31 adding confusion. Detailed statistical analysis plans can be employed to prespecify the proposed approach to handle multiplicity by making any choice transparent and to limit outcome-dependent analytical choices. Registration is currently mandatory for clinical trials but still optional for observational research and meta-analyses. Even for clinical trials, mandatory registration does not extend to mandating also the public availability of detailed statistical analysis plans in advance.

If used with care, the VoE framework can be a useful tool to explore and visualise uncertainties related to a universe of possible analytical choices in primary studies, datasets and meta-analyses. It can increase transparency in the reporting of results arising from different data processing, data/study eligibility and model specifications and help explore controversies in evidence-based medicine such as conflicting meta-analyses on the same topic.

Ethics statements

Patient consent for publication.

Not applicable.

Ethics approval

  • Childers CP ,
  • Maggard-Gibbons M
  • Fields AC ,
  • Palenzuela DL , et al
  • Turner SA ,
  • Scarborough JE
  • Guelimi R ,
  • Régnaux JP , et al
  • Simmons JP ,
  • Nelson LD ,
  • Simonsohn U
  • Munafò MR ,
  • Bishop DVM , et al
  • Ioannidis JPA
  • Nilsonne G , et al
  • Silberzahn R ,
  • Uhlmann EL ,
  • Martin DP , et al
  • Tierney BT ,
  • Anderson E ,
  • Tan Y , et al
  • Burford B ,
  • Aagaard TV ,
  • Hróbjartsson A , et al
  • Schönbrodt FD ,
  • Patel CJ , et al
  • Deussen AR ,
  • Pollet TV ,
  • van der Meij L
  • DeVito NJ , et al
  • Haibe-Kains B ,
  • Hosny A , et al
  • El Bahri M ,
  • Biaggi T , et al
  • Palpacuer C ,
  • Duprez R , et al
  • Cipriani A ,
  • Furukawa TA ,
  • Salanti G , et al
  • Vinatier C ,
  • Scanff A , et al
  • Gouraud H ,
  • Wallach JD ,
  • Boussageon R , et al
  • Alkhafaji AA ,
  • Trinquart L ,
  • Baron G , et al
  • Siebert M ,
  • Caquelin L ,
  • Madera M , et al
  • Bauchner H ,

X @NaudetFlorian

Contributors During the writing of another paper about registration of observational studies, editor Juan Franco invited this educational paper on vibration of effects. FN invited the team of coauthors. CV and FN wrote the first draft. All other authors contributed to revising it critically and agreed on the final content.

Funding The author(s) received no specific funding for this work. Publications fees will be paid by Rennes University Hospital.

Competing interests CV is a PhD student in the OSIRIS (Open Science to Increase Reproducibility in Science) project. The OSIRIS project has received funding from the European Union’s Horizon Europe research and innovation programme under the grant agreement number 101094725. SH has received funding from the European Union’s Horizon Europe programme, the German Federal Ministry for the Environment, Nature Conservation, Nuclear Safety and Consumer Protection (BMUV) and the LMUExcellent. CP received funding from NIH (NIEHS R01ES0324702 and NIA RF1AG074372). NJD has received funding from the European Union’s Horizon Europe programme, also via the OSIRIS project, the Naji Foundation, the German Federal Ministry of Education and Research (BMBF) and the Fetzer Franklin Memorial Fund, and has been employed on grants from the Mohn-Westlake Foundation, Laura and John Arnold Foundation, Elsevier and the Good Thinking Society in the last 5 years. BT is compensated for consulting with Seed Health and Enzymetrics Biosciences on microbiome study design. FN received funding from the French National Research Agency (ANR-17-CE36-0010), the French Ministry of Health and the French Ministry of Research. He is a work package leader in the OSIRIS project. He is a work package leader for the doctoral network MSCA-DN SHARE-CTD (HORIZON-MSCA-2022-DN-01 101120360), funded by the EU. The work of JPAI has been supported by an unrestricted gift from Sue and Bob O’Donnell to Stanford University.

Provenance and peer review Not commissioned; externally peer reviewed.

Read the full text or download the PDF:

U.S. flag

An official website of the United States government

The .gov means it’s official. Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

The site is secure. The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

  • Publications
  • Account settings

Preview improvements coming to the PMC website in October 2024. Learn More or Try it out now .

  • Advanced Search
  • Journal List
  • Hippokratia
  • v.14(Suppl 1); 2010 Dec

Meta-analysis in medical research

The objectives of this paper are to provide an introduction to meta-analysis and to discuss the rationale for this type of research and other general considerations. Methods used to produce a rigorous meta-analysis are highlighted and some aspects of presentation and interpretation of meta-analysis are discussed.

Meta-analysis is a quantitative, formal, epidemiological study design used to systematically assess previous research studies to derive conclusions about that body of research. Outcomes from a meta-analysis may include a more precise estimate of the effect of treatment or risk factor for disease, or other outcomes, than any individual study contributing to the pooled analysis. The examination of variability or heterogeneity in study results is also a critical outcome. The benefits of meta-analysis include a consolidated and quantitative review of a large, and often complex, sometimes apparently conflicting, body of literature. The specification of the outcome and hypotheses that are tested is critical to the conduct of meta-analyses, as is a sensitive literature search. A failure to identify the majority of existing studies can lead to erroneous conclusions; however, there are methods of examining data to identify the potential for studies to be missing; for example, by the use of funnel plots. Rigorously conducted meta-analyses are useful tools in evidence-based medicine. The need to integrate findings from many studies ensures that meta-analytic research is desirable and the large body of research now generated makes the conduct of this research feasible.

Important medical questions are typically studied more than once, often by different research teams in different locations. In many instances, the results of these multiple small studies of an issue are diverse and conflicting, which makes the clinical decision-making difficult. The need to arrive at decisions affecting clinical practise fostered the momentum toward "evidence-based medicine" 1 – 2 . Evidence-based medicine may be defined as the systematic, quantitative, preferentially experimental approach to obtaining and using medical information. Therefore, meta-analysis, a statistical procedure that integrates the results of several independent studies, plays a central role in evidence-based medicine. In fact, in the hierarchy of evidence ( Figure 1 ), where clinical evidence is ranked according to the strength of the freedom from various biases that beset medical research, meta-analyses are in the top. In contrast, animal research, laboratory studies, case series and case reports have little clinical value as proof, hence being in the bottom.

An external file that holds a picture, illustration, etc.
Object name is hippokratia-14-29-g001.jpg

Meta-analysis did not begin to appear regularly in the medical literature until the late 1970s but since then a plethora of meta-analyses have emerged and the growth is exponential over time ( Figure 2 ) 3 . Moreover, it has been shown that meta-analyses are the most frequently cited form of clinical research 4 . The merits and perils of the somewhat mysterious procedure of meta-analysis, however, continue to be debated in the medical community 5 – 8 . The objectives of this paper are to introduce meta-analysis and to discuss the rationale for this type of research and other general considerations.

An external file that holds a picture, illustration, etc.
Object name is hippokratia-14-30-g001.jpg

Meta-Analysis and Systematic Review

Glass first defined meta-analysis in the social science literature as "The statistical analysis of a large collection of analysis results from individual studies for the purpose of integrating the findings" 9 . Meta-analysis is a quantitative, formal, epidemiological study design used to systematically assess the results of previous research to derive conclusions about that body of research. Typically, but not necessarily, the study is based on randomized, controlled clinical trials. Outcomes from a meta-analysis may include a more precise estimate of the effect of treatment or risk factor for disease, or other outcomes, than any individual study contributing to the pooled analysis. Identifying sources of variation in responses; that is, examining heterogeneity of a group of studies, and generalizability of responses can lead to more effective treatments or modifications of management. Examination of heterogeneity is perhaps the most important task in meta-analysis. The Cochrane collaboration has been a long-standing, rigorous, and innovative leader in developing methods in the field 10 . Major contributions include the development of protocols that provide structure for literature search methods, and new and extended analytic and diagnostic methods for evaluating the output of meta-analyses. Use of the methods outlined in the handbook should provide a consistent approach to the conduct of meta-analysis. Moreover, a useful guide to improve reporting of systematic reviews and meta-analyses is the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-analyses) statement that replaced the QUOROM (QUality Of Reporting of Meta-analyses) statement 11 – 13 .

Meta-analyses are a subset of systematic review. A systematic review attempts to collate empirical evidence that fits prespecified eligibility criteria to answer a specific research question. The key characteristics of a systematic review are a clearly stated set of objectives with predefined eligibility criteria for studies; an explicit, reproducible methodology; a systematic search that attempts to identify all studies that meet the eligibility criteria; an assessment of the validity of the findings of the included studies (e.g., through the assessment of risk of bias); and a systematic presentation and synthesis of the attributes and findings from the studies used. Systematic methods are used to minimize bias, thus providing more reliable findings from which conclusions can be drawn and decisions made than traditional review methods 14 , 15 . Systematic reviews need not contain a meta-analysisthere are times when it is not appropriate or possible; however, many systematic reviews contain meta-analyses 16 .

The inclusion of observational medical studies in meta-analyses led to considerable debate over the validity of meta-analytical approaches, as there was necessarily a concern that the observational studies were likely to be subject to unidentified sources of confounding and risk modification 17 . Pooling such findings may not lead to more certain outcomes. Moreover, an empirical study showed that in meta-analyses were both randomized and non-randomized was included, nonrandomized studies tended to show larger treatment effects 18 .

Meta-analyses are conducted to assess the strength of evidence present on a disease and treatment. One aim is to determine whether an effect exists; another aim is to determine whether the effect is positive or negative and, ideally, to obtain a single summary estimate of the effect. The results of a meta-analysis can improve precision of estimates of effect, answer questions not posed by the individual studies, settle controversies arising from apparently conflicting studies, and generate new hypotheses. In particular, the examination of heterogeneity is vital to the development of new hypotheses.

Individual or Aggregated Data

The majority of meta-analyses are based on a series of studies to produce a point estimate of an effect and measures of the precision of that estimate. However, methods have been developed for the meta-analyses to be conducted on data obtained from original trials 19 , 20 . This approach may be considered the "gold standard" in metaanalysis because it offers advantages over analyses using aggregated data, including a greater ability to validate the quality of data and to conduct appropriate statistical analysis. Further, it is easier to explore differences in effect across subgroups within the study population than with aggregated data. The use of standardized individual-level information may help to avoid the problems encountered in meta-analyses of prognostic factors 21 , 22 . It is the best way to obtain a more global picture of the natural history and predictors of risk for major outcomes, such as in scleroderma 23 – 26 .This approach relies on cooperation between researchers who conducted the relevant studies. Researchers who are aware of the potential to contribute or conduct these studies will provide and obtain additional benefits by careful maintenance of original databases and making these available for future studies.

Literature Search

A sound meta-analysis is characterized by a thorough and disciplined literature search. A clear definition of hypotheses to be investigated provides the framework for such an investigation. According to the PRISMA statement, an explicit statement of questions being addressed with reference to participants, interventions, comparisons, outcomes and study design (PICOS) should be provided 11 , 12 . It is important to obtain all relevant studies, because loss of studies can lead to bias in the study. Typically, published papers and abstracts are identified by a computerized literature search of electronic databases that can include PubMed ( www.ncbi.nlm.nih.gov./entrez/query.fcgi ), ScienceDirect ( www.sciencedirect.com ), Scirus ( www.scirus.com/srsapp ), ISI Web of Knowledge ( http://www.isiwebofknowledge.com ), Google Scholar ( http://scholar.google.com ) and CENTRAL (Cochrane Central Register of Controlled Trials, http://www.mrw.interscience.wiley.com/cochrane/cochrane_clcentral_articles_fs.htm ). PRISMA statement recommends that a full electronic search strategy for at least one major database to be presented 12 . Database searches should be augmented with hand searches of library resources for relevant papers, books, abstracts, and conference proceedings. Crosschecking of references, citations in review papers, and communication with scientists who have been working in the relevant field are important methods used to provide a comprehensive search. Communication with pharmaceutical companies manufacturing and distributing test products can be appropriate for studies examining the use of pharmaceutical interventions.

It is not feasible to find absolutely every relevant study on a subject. Some or even many studies may not be published, and those that are might not be indexed in computer-searchable databases. Useful sources for unpublished trials are the clinical trials registers, such as the National Library of Medicine's ClinicalTrials.gov Website. The reviews should attempt to be sensitive; that is, find as many studies as possible, to minimize bias and be efficient. It may be appropriate to frame a hypothesis that considers the time over which a study is conducted or to target a particular subpopulation. The decision whether to include unpublished studies is difficult. Although language of publication can provide a difficulty, it is important to overcome this difficulty, provided that the populations studied are relevant to the hypothesis being tested.

Inclusion or Exclusion Criteria and Potential for Bias

Studies are chosen for meta-analysis based on inclusion criteria. If there is more than one hypothesis to be tested, separate selection criteria should be defined for each hypothesis. Inclusion criteria are ideally defined at the stage of initial development of the study protocol. The rationale for the criteria for study selection used should be clearly stated.

One important potential source of bias in meta-analysis is the loss of trials and subjects. Ideally, all randomized subjects in all studies satisfy all of the trial selection criteria, comply with all the trial procedures, and provide complete data. Under these conditions, an "intention-totreat" analysis is straightforward to implement; that is, statistical analysis is conducted on all subjects that are enrolled in a study rather than those that complete all stages of study considered desirable. Some empirical studies had shown that certain methodological characteristics, such as poor concealment of treatment allocation or no blinding in studies exaggerate treatment effects 27 . Therefore, it is important to critically appraise the quality of studies in order to assess the risk of bias.

The study design, including details of the method of randomization of subjects to treatment groups, criteria for eligibility in the study, blinding, method of assessing the outcome, and handling of protocol deviations are important features defining study quality. When studies are excluded from a meta-analysis, reasons for exclusion should be provided for each excluded study. Usually, more than one assessor decides independently which studies to include or exclude, together with a well-defined checklist and a procedure that is followed when the assessors disagree. Two people familiar with the study topic perform the quality assessment for each study, independently. This is followed by a consensus meeting to discuss the studies excluded or included. Practically, the blinding of reviewers from details of a study such as authorship and journal source is difficult.

Before assessing study quality, a quality assessment protocol and data forms should be developed. The goal of this process is to reduce the risk of bias in the estimate of effect. Quality scores that summarize multiple components into a single number exist but are misleading and unhelpful 28 . Rather, investigators should use individual components of quality assessment and describe trials that do not meet the specified quality standards and probably assess the effect on the overall results by excluding them, as part of the sensitivity analyses.

Further, not all studies are completed, because of protocol failure, treatment failure, or other factors. Nonetheless, missing subjects and studies can provide important evidence. It is desirable to obtain data from all relevant randomized trials, so that the most appropriate analysis can be undertaken. Previous studies have discussed the significance of missing trials to the interpretation of intervention studies in medicine 29 , 30 . Journal editors and reviewers need to be aware of the existing bias toward publishing positive findings and ensure that papers that publish negative or even failed trials be published, as long as these meet the quality guidelines for publication.

There are occasions when authors of the selected papers have chosen different outcome criteria for their main analysis. In practice, it may be necessary to revise the inclusion criteria for a meta-analysis after reviewing all of the studies found through the search strategy. Variation in studies reflects the type of study design used, type and application of experimental and control therapies, whether or not the study was published, and, if published, subjected to peer review, and the definition used for the outcome of interest. There are no standardized criteria for inclusion of studies in meta-analysis. Universal criteria are not appropriate, however, because meta-analysis can be applied to a broad spectrum of topics. Published data in journal papers should also be cross-checked with conference papers to avoid repetition in presented data.

Clearly, unpublished studies are not found by searching the literature. It is possible that published studies are systemically different from unpublished studies; for example, positive trial findings may be more likely to be published. Therefore, a meta-analysis based on literature search results alone may lead to publication bias.

Efforts to minimize this potential bias include working from the references in published studies, searching computerized databases of unpublished material, and investigating other sources of information including conference proceedings, graduate dissertations and clinical trial registers.

Statistical analysis

The most common measures of effect used for dichotomous data are the risk ratio (also called relative risk) and the odds ratio. The dominant method used for continuous data are standardized mean difference (SMD) estimation. Methods used in meta-analysis for post hoc analysis of findings are relatively specific to meta-analysis and include heterogeneity analysis, sensitivity analysis, and evaluation of publication bias.

All methods used should allow for the weighting of studies. The concept of weighting reflects the value of the evidence of any particular study. Usually, studies are weighted according to the inverse of their variance 31 . It is important to recognize that smaller studies, therefore, usually contribute less to the estimates of overall effect. However, well-conducted studies with tight control of measurement variation and sources of confounding contribute more to estimates of overall effect than a study of identical size less well conducted.

One of the foremost decisions to be made when conducting a meta-analysis is whether to use a fixed-effects or a random-effects model. A fixed-effects model is based on the assumption that the sole source of variation in observed outcomes is that occurring within the study; that is, the effect expected from each study is the same. Consequently, it is assumed that the models are homogeneous; there are no differences in the underlying study population, no differences in subject selection criteria, and treatments are applied the same way 32 . Fixed-effect methods used for dichotomous data include most often the Mantel-Haenzel method 33 and the Peto method 34 (only for odds ratios).

Random-effects models have an underlying assumption that a distribution of effects exists, resulting in heterogeneity among study results, known as τ2. Consequently, as software has improved, random-effects models that require greater computing power have become more frequently conducted. This is desirable because the strong assumption that the effect of interest is the same in all studies is frequently untenable. Moreover, the fixed effects model is not appropriate when statistical heterogeneity (τ2) is present in the results of studies in the meta-analysis. In the random-effects model, studies are weighted with the inverse of their variance and the heterogeneity parameter. Therefore, it is usually a more conservative approach with wider confidence intervals than the fixed-effects model where the studies are weighted only with the inverse of their variance. The most commonly used random-effects method is the DerSimonian and Laird method 35 . Furthermore, it is suggested that comparing the fixed-effects and random-effect models developed as this process can yield insights to the data 36 .

Heterogeneity

Arguably, the greatest benefit of conducting metaanalysis is to examine sources of heterogeneity, if present, among studies. If heterogeneity is present, the summary measure must be interpreted with caution 37 . When heterogeneity is present, one should question whether and how to generalize the results. Understanding sources of heterogeneity will lead to more effective targeting of prevention and treatment strategies and will result in new research topics being identified. Part of the strategy in conducting a meta-analysis is to identify factors that may be significant determinants of subpopulation analysis or covariates that may be appropriate to explore in all studies.

To understand the nature of variability in studies, it is important to distinguish between different sources of heterogeneity. Variability in the participants, interventions, and outcomes studied has been described as clinical diversity, and variability in study design and risk of bias has been described as methodological diversity 10 . Variability in the intervention effects being evaluated among the different studies is known as statistical heterogeneity and is a consequence of clinical or methodological diversity, or both, among the studies. Statistical heterogeneity manifests itself in the observed intervention effects varying by more than the differences expected among studies that would be attributable to random error alone. Usually, in the literature, statistical heterogeneity is simply referred to as heterogeneity.

Clinical variation will cause heterogeneity if the intervention effect is modified by the factors that vary across studies; most obviously, the specific interventions or participant characteristics that are often reflected in different levels of risk in the control group when the outcome is dichotomous. In other words, the true intervention effect will differ for different studies. Differences between studies in terms of methods used, such as use of blinding or differences between studies in the definition or measurement of outcomes, may lead to differences in observed effects. Significant statistical heterogeneity arising from differences in methods used or differences in outcome assessments suggests that the studies are not all estimating the same effect, but does not necessarily suggest that the true intervention effect varies. In particular, heterogeneity associated solely with methodological diversity indicates that studies suffer from different degrees of bias. Empirical evidence suggests that some aspects of design can affect the result of clinical trials, although this may not always be the case.

The scope of a meta-analysis will largely determine the extent to which studies included in a review are diverse. Meta-analysis should be conducted when a group of studies is sufficiently homogeneous in terms of subjects involved, interventions, and outcomes to provide a meaningful summary. However, it is often appropriate to take a broader perspective in a meta-analysis than in a single clinical trial. Combining studies that differ substantially in design and other factors can yield a meaningless summary result, but the evaluation of reasons for the heterogeneity among studies can be very insightful. It may be argued that these studies are of intrinsic interest on their own, even though it is not appropriate to produce a single summary estimate of effect.

Variation among k trials is usually assessed using Cochran's Q statistic, a chi-squared (χ 2 ) test of heterogeneity with k-1 degrees of freedom. This test has relatively poor power to detect heterogeneity among small numbers of trials; consequently, an α-level of 0.10 is used to test hypotheses 38 , 39 .

Heterogeneity of results among trials is better quantified using the inconsistency index I 2 , which describes the percentage of total variation across studies 40 . Uncertainty intervals for I 2 (dependent on Q and k) are calculated using the method described by Higgins and Thompson 41 . Negative values of I 2 are put equal to zero, consequently I 2 lies between 0 and 100%. A value >75% may be considered substantial heterogeneity 41 . This statistic is less influenced by the number of trials compared with other methods used to estimate the heterogeneity and provides a logical and readily interpretable metric but it still can be unstable when only a few studies are combined 42 .

Given that there are several potential sources of heterogeneity in the data, several steps should be considered in the investigation of the causes. Although random-effects models are appropriate, it may be still very desirable to examine the data to identify sources of heterogeneity and to take steps to produce models that have a lower level of heterogeneity, if appropriate. Further, if the studies examined are highly heterogeneous, it may be not appropriate to present an overall summary estimate, even when random effects models are used. As Petiti notes 43 , statistical analysis alone will not make contradictory studies agree; critically, however, one should use common sense in decision-making. Despite heterogeneity in responses, if all studies had a positive point direction and the pooled confidence interval did not include zero, it would not be logical to conclude that there was not a positive effect, provided that sufficient studies and subject numbers were present. The appropriateness of the point estimate of the effect is much more in question.

Some of the ways to investigate the reasons for heterogeneity; are subgroup analysis and meta-regression. The subgroup analysis approach, a variation on those described above, groups categories of subjects (e.g., by age, sex) to compare effect sizes. The meta-regression approach uses regression analysis to determine the influence of selected variables (the independent variables) on the effect size (the dependent variable). In a meta-regresregression, studies are regarded as if they were individual patients, but their effects are properly weighted to account for their different variances 44 .

Sensitivity analyses have also been used to examine the effects of studies identified as being aberrant concerning conduct or result, or being highly influential in the analysis. Recently, another method has been proposed that reduces the weight of studies that are outliers in meta-analyses 45 . All of these methods for examining heterogeneity have merit, and the variety of methods available reflects the importance of this activity.

Presentation of results

A useful graph, presented in the PRISMA statement 11 , is the four-phase flow diagram ( Figure 3 ).

An external file that holds a picture, illustration, etc.
Object name is hippokratia-14-33-g001.jpg

This flow-diagram depicts the flow of information through the different phases of a systematic review or meta-analysis. It maps out the number of records identified, included and excluded, and the reasons for exclusions. The results of meta-analyses are often presented in a forest plot, where each study is shown with its effect size and the corresponding 95% confidence interval ( Figure 4 ).

An external file that holds a picture, illustration, etc.
Object name is hippokratia-14-34-g001.jpg

The pooled effect and 95% confidence interval is shown in the bottom in the same line with "Overall". In the right panel of Figure 4 , the cumulative meta-analysis is graphically displayed, where data are entered successively, typically in the order of their chronological appearance 46 , 47 . Such cumulative meta-analysis can retrospectively identify the point in time when a treatment effect first reached conventional levels of significance. Cumulative meta-analysis is a compelling way to examine trends in the evolution of the summary-effect size, and to assess the impact of a specific study on the overall conclusions 46 . The figure shows that many studies were performed long after cumulative meta-analysis would have shown a significant beneficial effect of antibiotic prophylaxis in colon surgery.

Biases in meta-analysis

Although the intent of a meta-analysis is to find and assess all studies meeting the inclusion criteria, it is not always possible to obtain these. A critical concern is the papers that may have been missed. There is good reason to be concerned about this potential loss because studies with significant, positive results (positive studies) are more likely to be published and, in the case of interventions with a commercial value, to be promoted, than studies with non-significant or "negative" results (negative studies). Studies that produce a positive result, especially large studies, are more likely to have been published and, conversely, there has been a reluctance to publish small studies that have non-significant results. Further, publication bias is not solely the responsibility of editorial policy as there is reluctance among researchers to publish results that were either uninteresting or are not randomized 48 . There are, however, problems with simply including all studies that have failed to meet peer-review standards. All methods of retrospectively dealing with bias in studies are imperfect.

It is important to examine the results of each meta-analysis for evidence of publication bias. An estimation of likely size of the publication bias in the review and an approach to dealing with the bias is inherent to the conduct of many meta-analyses. Several methods have been developed to provide an assessment of publication bias; the most commonly used is the funnel plot. The funnel plot provides a graphical evaluation of the potential for bias and was developed by Light and Pillemer 49 and discussed in detail by Egger and colleagues 50 , 51 . A funnel plot is a scatterplot of treatment effect against a measure of study size. If publication bias is not present, the plot is expected to have a symmetric inverted funnel shape, as shown in Figure 5A .

An external file that holds a picture, illustration, etc.
Object name is hippokratia-14-35-g001.jpg

In a study in which there is no publication bias, larger studies (i.e., have lower standard error) tend to cluster closely to the point estimate. As studies become less precise, such as in smaller trials (i.e., have a higher standard error), the results of the studies can be expected to be more variable and are scattered to both sides of the more precise larger studies. Figure 5A shows that the smaller, less precise studies are, indeed, scattered to both sides of the point estimate of effect and that these seem to be symmetrical, as an inverted funnel-plot, showing no evidence of publication bias. In contrast to Figure 5A , Figure 5B shows evidence of publication bias. There is evidence of the possibility that studies using smaller numbers of subjects and showing an decrease in effect size (lower odds ratio) were not published.

Asymmetry of funnel plots is not solely attributable to publication bias, but may also result from clinical heterogeneity among studies. Sources of clinical heterogeneity include differences in control or exposure of subjects to confounders or effect modifiers, or methodological heterogeneity between studies; for example, a failure to conceal treatment allocation. There are several statistical tests for detecting funnel plot asymmetry; for example, Eggers linear regression test 50 , and Begg's rank correlation test 52 but these do not have considerable power and are rarely used. However, the funnel plot is not without problems. If high precision studies really are different than low precision studies with respect to effect size (e.g., due different populations examined) a funnel plot may give a wrong impression of publication bias 53 . The appearance of the funnel plot plot can change quite dramatically depending on the scale on the y-axis - whether it is the inverse square error or the trial size 54 .

Other types of biases in meta-analysis include the time lag bias, selective reporting bias and the language bias. The time lag bias arises from the published studies, when those with striking results are published earlier than those with non-significant findings 55 . Moreover, it has been shown that positive studies with high early accrual of patients are published sooner than negative trials with low early accrual 56 . However, missing studies, either due to publication bias or time-lag bias may increasingly be identified from trials registries.

The selective reporting bias exists when published articles have incomplete or inadequate reporting. Empirical studies have shown that this bias is widespread and of considerable importance when published studies were compared with their study protocols 29 , 30 . Furthermore, recent evidence suggests that selective reporting might be an issue in safety outcomes and the reporting of harms in clinical trials is still suboptimal 57 . Therefore, it might not be possible to use quantitative objective evidence for harms in performing meta-analyses and making therapeutic decisions.

Excluding clinical trials reported in languages other than English from meta-analyses may introduce the language bias and reduce the precision of combined estimates of treatment effects. Trials with statistically significant results have been shown to be published in English 58 . In contrast, a later more extensive investigation showed that trials published in languages other than English tend to be of lower quality and produce more favourable treatment effects than trials published in English and concluded that excluding non-English language trials has generally only modest effects on summary treatment effect estimates but the effect is difficult to predict for individual meta-analyses 59 .

Evolution of meta-analyses

The classical meta-analysis compares two treatments while network meta-analysis (or multiple treatment metaanalysis) can provide estimates of treatment efficacy of multiple treatment regimens, even when direct comparisons are unavailable by indirect comparisons 60 . An example of a network analysis would be the following. An initial trial compares drug A to drug B. A different trial studying the same patient population compares drug B to drug C. Assume that drug A is found to be superior to drug B in the first trial. Assume drug B is found to be equivalent to drug C in a second trial. Network analysis then, allows one to potentially say statistically that drug A is also superior to drug C for this particular patient population. (Since drug A is better than drug B, and drug B is equivalent to drug C, then drug A is also better to drug C even though it was not directly tested against drug C.)

Meta-analysis can also be used to summarize the performance of diagnostic and prognostic tests. However, studies that evaluate the accuracy of tests have a unique design requiring different criteria to appropriately assess the quality of studies and the potential for bias. Additionally, each study reports a pair of related summary statistics (for example, sensitivity and specificity) rather than a single statistic (such as a risk ratio) and hence requires different statistical methods to pool the results of the studies 61 . Various techniques to summarize results from diagnostic and prognostic test results have been proposed 62 – 64 . Furthermore, there are many methodologies for advanced meta-analysis that have been developed to address specific concerns, such as multivariate meta-analysis 65 – 67 , and special types of meta-analysis in genetics 68 but will not be discussed here.

Meta-analysis is no longer a novelty in medicine. Numerous meta-analyses have been conducted for the same medical topic by different researchers. Recently, there is a trend to combine the results of different meta-analyses, known as a meta-epidemiological study, to assess the risk of bias 79 , 70 .

Conclusions

The traditional basis of medical practice has been changed by the use of randomized, blinded, multicenter clinical trials and meta-analysis, leading to the widely used term "evidence-based medicine". Leaders in initiating this change have been the Cochrane Collaboration who have produced guidelines for conducting systematic reviews and meta-analyses 10 and recently the PRISMA statement, a helpful resource to improve reporting of systematic reviews and meta-analyses has been released 11 . Moreover, standards by which to conduct and report meta-analyses of observational studies have been published to improve the quality of reporting 71 .

Meta-analysis of randomized clinical trials is not an infallible tool, however, and several examples exist of meta-analyses which were later contradicted by single large randomized controlled trials, and of meta-analyses addressing the same issue which have reached opposite conclusions 72 . A recent example, was the controversy between a meta-analysis of 42 studies 73 and the subsequent publication of the large-scale trial (RECORD trial) that did not support the cardiovascular risk of rosiglitazone 74 . However, the reason for this controversy was explained by the numerous methodological flaws found both in the meta-analysis and the large clinical trial 75 .

No single study, whether meta-analytic or not, will provide the definitive understanding of responses to treatment, diagnostic tests, or risk factors influencing disease. Despite this limitation, meta-analytic approaches have demonstrable benefits in addressing the limitations of study size, can include diverse populations, provide the opportunity to evaluate new hypotheses, and are more valuable than any single study contributing to the analysis. The conduct of the studies is critical to the value of a meta-analysis and the methods used need to be as rigorous as any other study conducted.

IMAGES

  1. Meta-Analysis Methodology for Basic Research: A Practical Guide

    meta analysis research questions

  2. What is a Meta-Analysis? The benefits and challenges

    meta analysis research questions

  3. PPT

    meta analysis research questions

  4. How is a meta-analysis performed?

    meta analysis research questions

  5. A practical Guide to do Primary research on Meta analysis Methodology

    meta analysis research questions

  6. PPT

    meta analysis research questions

VIDEO

  1. Statistical Procedure in Meta-Essentials

  2. Meta Analysis Research Methodology #research

  3. Meta Analysis Research

  4. Statistical Power of a Meta-Analysis

  5. Meta Analysis Research (मेटा विश्लेषण अनुसंधान) #ugcnet #ResearchMethodology #educationalbyarun

  6. What is a Meta-Analysis?

COMMENTS

  1. How to conduct a meta-analysis in eight steps: a practical guide

    2.1 Step 1: defining the research question. The first step in conducting a meta-analysis, as with any other empirical study, is the definition of the research question. Most importantly, the research question determines the realm of constructs to be considered or the type of interventions whose effects shall be analyzed.

  2. Systematic Reviews and Meta-Analysis: A Guide for Beginners

    The graphical output of meta-analysis is a forest plot which provides information on individual studies and the pooled effect. Systematic reviews of literature can be undertaken for all types of questions, and all types of study designs. This article highlights the key features of systematic reviews, and is designed to help readers understand ...

  3. Meta-Analytic Methodology for Basic Research: A Practical Guide

    Meta-analysis refers to the statistical analysis of the data from independent primary studies focused on the same question, which aims to generate a quantitative estimate of the studied phenomenon, for example, the effectiveness of the intervention (Gopalakrishnan and Ganeshkumar, 2013). In clinical research, systematic reviews and meta ...

  4. Introduction to systematic review and meta-analysis

    It is easy to confuse systematic reviews and meta-analyses. A systematic review is an objective, reproducible method to find answers to a certain research question, by collecting all available studies related to that question and reviewing and analyzing their results. A meta-analysis differs from a systematic review in that it uses statistical ...

  5. Meta-Analysis

    Definition. "A meta-analysis is a formal, epidemiological, quantitative study design that uses statistical methods to generalise the findings of the selected independent studies. Meta-analysis and systematic review are the two most authentic strategies in research. When researchers start looking for the best available evidence concerning ...

  6. PDF How to conduct a meta-analysis in eight steps: a practical guide

    2 Eight steps in conducting a meta‑analysis 2.1 Step 1: dening the research question The rst step in conducting a meta-analysis, as with any other empirical study, is the denition of the research question. Most importantly, the research question deter-mines the realm of constructs to be considered or the type of interventions whose

  7. Meta‐analysis and traditional systematic literature reviews—What, why

    The research question for a meta-analysis could be formulated around specific theory (e.g., regulatory fit theory; Motyka et al., 2014) or model (e.g., technology acceptance model; King & He, 2006)). Defining a research question in meta-analysis requires a deep understanding of the topic and literature, and entails specifying a valuable ...

  8. Meta-analysis of social science research: A practitioner's guide

    It is a problem of primary empirical research, and meta-analysis represents one of two ways of effectively addressing the bias. Preregistration of large multi-lab experiments is the other (Nosek et al., ... when enough primary studies have been conducted on the specific research question. Still, current bias-correction techniques are not ...

  9. A step by step guide for conducting a systematic review and meta

    Systematic review/meta-analysis steps include development of research question and its validation, forming criteria, search strategy, searching databases, importing all results to a library and exporting to an excel sheet, protocol writing and registration, title and abstract screening, full-text screening, manual searching, extracting data and ...

  10. Getting Started

    It may take several weeks to complete and run a search. Moreover, all guidelines for carrying out systematic reviews recommend that at least two subject experts screen the studies identified in the search. The first round of screening can consume 1 hour per screener for every 100-200 records. A systematic review is a labor-intensive team effort.

  11. Meta-evaluation of meta-analysis: ten appraisal questions for

    Meta-analysis is a statistical procedure for analyzing the combined data from different studies, and can be a major source of concise up-to-date information. The overall conclusions of a meta-analysis, however, depend heavily on the quality of the meta-analytic process, and an appropriate evaluation of the quality of meta-analysis (meta-evaluation) can be challenging. We outline ten questions ...

  12. A Guide to Conducting a Meta-Analysis

    Abstract. Meta-analysis is widely accepted as the preferred method to synthesize research findings in various disciplines. This paper provides an introduction to when and how to conduct a meta-analysis. Several practical questions, such as advantages of meta-analysis over conventional narrative review and the number of studies required for a ...

  13. Introduction to Meta-Analysis: A Guide for the Novice

    Similar to any research study, a meta-analysis begins with a research question. Meta-analysis can be used in any situation where the goal is to summarize quantitative findings from empirical studies. It can be used to examine different types of effects, including prevalence rates (e.g., percentage of rape survivors with depression), growth ...

  14. Systematic Reviews and Meta-Analysis: A Guide for Beginners

    Meta-analysis is a statistical tool that provides pooled estimates of effect from the data extracted from individual studies in the systematic review. The graphical output of meta-analysis is a forest plot which provides information on individual studies and the pooled effect. Systematic reviews of literature can be undertaken for all types of ...

  15. Research Guides: Study Design 101: Meta-Analysis

    Meta-analysis would be used for the following purposes: To establish statistical significance with studies that have conflicting results. To develop a more correct estimate of effect magnitude. To provide a more complex analysis of harms, safety data, and benefits. To examine subgroups with individual numbers that are not statistically significant.

  16. Ten simple rules for carrying out and writing meta-analyses

    Rule 1: Specify the topic and type of the meta-analysis. Considering that a systematic review [ 10] is fundamental for a meta-analysis, you can use the Population, Intervention, Comparison, Outcome (PICO) model to formulate the research question. It is important to verify that there are no published meta-analyses on the specific topic in order ...

  17. A guide to prospective meta-analysis

    In a prospective meta-analysis (PMA), study selection criteria, hypotheses, and analyses are specified before the results of the studies related to the PMA research question are known, reducing many of the problems associated with a traditional (retrospective) meta-analysis. PMAs have many advantages: they can help reduce research waste and bias, and they are adaptive, efficient, and ...

  18. Practical Guide to Meta-analysis

    Practical Guide to Meta-analysis. Meta-analysis is a systematic approach of synthesizing, combining, and analyzing data from multiple studies (randomized clinical trials 1 or observational studies 2) into a single effect estimate to answer a research question. Meta-analysis is especially useful if there is debate around the research question in ...

  19. Identifying Your Research Question

    The difference with a systematic review research question is that you must have a clearly defined question and consider what problem are you trying to address by conducting the review. The most important point is that you focus your question and design the question so that it is answerable by the research that you will be systematically examining.

  20. Meta-analysis

    Meta-analysis is an objective examination of published data from many studies of the same research topic identified through a literature search. Through the use of rigorous statistical methods, it ...

  21. Comparison of early and late norepinephrine administration in ...

    Study question: Does early versus late administration of norepinephrine improve outcomes in septic shock patients? Results: Our meta-analysis compared outcomes between early and late norepinephrine use categorized by time points and differences in vasopressor protocols across 12 studies and found no significant differences in overall mortality ...

  22. Recent Developments in Group Psychotherapy Research

    This article reviews group psychotherapy research published within the past 30 years, predominantly focusing on outcomes of group treatments for patients with various mental disorders. Additionally, meta-analyses on the efficacy of group treatments for patients with cancer or chronic pain are summarized. Results strongly support the use of group therapy and demonstrate outcomes equivalent to ...

  23. Systematic Reviews and Meta-analysis: Understanding the Best Evidence

    This is the most important part of systematic reviews/meta-analysis. The research question for the systematic reviews may be related to a major public health problem or a controversial clinical situation which requires acceptable intervention as a possible solution to the present healthcare need of the community. This step is most important ...

  24. Who is prejudiced, and toward whom? The big five traits and generalized

    Topics in Psychology. Explore how scientific research by psychologists can inform our professional lives, family and community relationships, emotional wellness, and more. ... To test these questions, they used a meta-analysis, a type of research method that combines the results of multiple different scientific studies. This study combined the ...

  25. Perceived warmth, competence predict callback decisions in meta

    Perceived warmth, competence predict callback decisions in meta-analysis of hiring experiments. ScienceDaily . Retrieved July 11, 2024 from www.sciencedaily.com / releases / 2024 / 07 ...

  26. Nutrients

    Osteoarthritis (OA) is one of the most common musculoskeletal disorders. Recently, research has focused on the role of intestinal microbiome dysbiosis in OA. The aim of this study was to systematically review randomized intervention clinical studies investigating the effect of probiotics on the management of OA-related pain and inflammation. Pre-clinical studies and non-randomized trials were ...

  27. What is the vibration of effects?

    Navigating between contradictory results is not rare in the practice of evidence-based medicine. Recently, two papers published in the same year and in the same journal investigated the same research question with the same dataset and reached divergent results regarding the benefits of retrieval bag use during laparoscopic appendectomy.1 The two studies reached contrasting conclusions, one ...

  28. Meta‐analysis: Key features, potentials and misunderstandings

    A meta‐analysis is an objective procedure. Every meta‐analysis is characterized by decisions regarding research question, eligibility criteria, risk of bias analysis, and statistical approach. These decisions should be reasonable and transparently reported. Probably, no single best and ultimately objective procedure exists.

  29. Understanding and optimising gratitude interventions: The right methods

    ObjectiveGratitude has consistently been associated with various beneficial health-related outcomes, including subjective wellbeing, positive mental health, and positive physical health. In light of such effects, positive psychology researchers and practitioners have often implemented gratitude interventions in an attempt to build individuals' orientations toward appreciation and thankfulness.

  30. Meta-analysis in medical research

    Meta-Analysis and Systematic Review. Glass first defined meta-analysis in the social science literature as "The statistical analysis of a large collection of analysis results from individual studies for the purpose of integrating the findings" 9.Meta-analysis is a quantitative, formal, epidemiological study design used to systematically assess the results of previous research to derive ...