Systematic Reviews: Step 6: Assess Quality of Included Studies
Created by health science librarians.
About Step 6: Assess Quality of Included Studies
In step 6 you will evaluate the articles you included in your review for quality and bias. To do so, you will:
- Use quality assessment tools to grade each article.
- Create a summary of the quality of literature included in your review.
This page has links to quality assessment tools you can use to evaluate different study types. Librarians can help you find widely used tools to evaluate the articles in your review.
Reporting your review with PRISMA
If you reach the quality assessment step and choose to exclude articles for any reason, update the number of included and excluded studies in your PRISMA flow diagram.
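One low-tech way to keep those numbers consistent is to track the counts programmatically. The sketch below is a generic illustration (the field names and figures are invented, and this is not a PRISMA-specific library); it simply shows the bookkeeping involved when studies are excluded at the quality assessment stage.

```python
from dataclasses import dataclass, field

@dataclass
class PrismaCounts:
    """Minimal bookkeeping for the bottom of a PRISMA flow diagram (hypothetical names)."""
    assessed_full_text: int
    excluded_reasons: dict = field(default_factory=dict)  # reason -> count

    def exclude(self, n: int, reason: str) -> None:
        """Record n studies excluded for a given reason (e.g. at quality assessment)."""
        self.excluded_reasons[reason] = self.excluded_reasons.get(reason, 0) + n

    @property
    def included(self) -> int:
        """Studies remaining in the review after all recorded exclusions."""
        return self.assessed_full_text - sum(self.excluded_reasons.values())

counts = PrismaCounts(assessed_full_text=58)
counts.exclude(12, "wrong study design")
counts.exclude(3, "critical risk of bias")  # exclusion made at the quality assessment step
print(counts.included)  # 43
```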
Managing your review with Covidence
Covidence includes the Cochrane Risk of Bias 2.0 quality assessment template, but you can also create your own custom quality assessment template.
How a librarian can help with Step 6
- What the quality assessment or risk of bias stage of the review entails
- How to choose an appropriate quality assessment tool
- Best practices for reporting quality assessment results in your review
After the screening process is complete, the systematic review team must assess each article for quality and bias. There are various types of bias, which are described in detail in the Cochrane Handbook.
The most important thing to remember when choosing a quality assessment tool is to pick one that was created and validated to assess the study design(s) of your included articles.
For example, if one item in the inclusion criteria of your systematic review is to include only randomized controlled trials (RCTs), then you need to pick a quality assessment tool specifically designed for RCTs (for example, the Cochrane Risk of Bias tool).
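If your team wants to make this matching rule explicit, it can be written down as a simple lookup, as in the hypothetical sketch below. The tool names are drawn from the lists on this page, but the mapping is illustrative only and is not an exhaustive or authoritative recommendation.

```python
# Hypothetical, simplified mapping from study design to commonly used appraisal tools.
TOOLS_BY_DESIGN = {
    "randomized controlled trial": ["Cochrane RoB 2.0", "CASP RCT checklist", "JBI RCT checklist"],
    "cohort study": ["Newcastle-Ottawa Scale", "CASP cohort checklist", "JBI cohort checklist"],
    "case-control study": ["Newcastle-Ottawa Scale", "CASP case-control checklist"],
    "diagnostic accuracy study": ["QUADAS-2", "CASP diagnostic checklist"],
    "systematic review": ["AMSTAR 2", "ROBIS"],
}

def suggest_tools(study_design: str) -> list[str]:
    """Return candidate quality assessment tools for a study design, if any are listed."""
    return TOOLS_BY_DESIGN.get(study_design.lower(), [])

print(suggest_tools("Randomized controlled trial"))
# ['Cochrane RoB 2.0', 'CASP RCT checklist', 'JBI RCT checklist']
```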
Once you have gathered your included studies, you will need to appraise the evidence for its relevance, reliability, validity, and applicability.
Ask questions like:
Relevance:
- Is the research method/study design appropriate for answering the research question?
- Are specific inclusion/exclusion criteria used?
Reliability:
- Is the effect size practically relevant? How precise is the estimate of the effect? Were confidence intervals given?
Validity:
- Were there enough subjects in the study to establish that the findings did not occur by chance?
- Were subjects randomly allocated? Were the groups comparable? If not, could this have introduced bias?
- Are the measurements/ tools validated by other studies?
- Could there be confounding factors?
Applicability:
- Can the results be applied to my organization and my patient?
What are Quality Assessment tools?
Quality Assessment tools are questionnaires created to help you assess the quality of a variety of study designs. Depending on the types of studies you are analyzing, the questionnaire will be tailored to ask specific questions about the methodology of the study. There are appraisal tools for most kinds of study designs. You should choose a Quality Assessment tool that matches the types of studies you expect to see in your results. If you have multiple types of study designs, you may wish to use several tools from one organization, such as the CASP or LEGEND tools, as they have a range of assessment tools for many study designs.
Click on a study design below to see some examples of quality assessment tools for that type of study.
Randomized Controlled Trials (RCTs)
- Cochrane Risk of Bias (ROB) 2.0 Tool Templates are tailored to randomized parallel-group trials, cluster-randomized parallel-group trials (including stepped-wedge designs), and randomized cross-over trials and other matched designs.
- CASP- Randomized Controlled Trial Appraisal Tool A checklist for RCTs created by the Critical Appraisal Skills Programme (CASP)
- The Jadad Scale A scale that assesses the quality of published clinical trials based on methods relevant to random assignment, double blinding, and the flow of patients
- CEBM-RCT A critical appraisal tool for RCTs from the Centre for Evidence Based Medicine (CEBM)
- Checklist for Randomized Controlled Trials (JBI) A critical appraisal checklist from the Joanna Briggs Institute (JBI)
- Scottish Intercollegiate Guidelines Network (SIGN) Checklists for quality assessment
- LEGEND Evidence Evaluation Tools A series of critical appraisal tools from the Cincinnati Children's Hospital. Contains tools for a wide variety of study designs, including prospective, retrospective, qualitative, and quantitative designs.
Cohort Studies
- CASP- Cohort Studies A checklist created by the Critical Appraisal Skills Programme (CASP) to assess key criteria relevant to cohort studies
- Checklist for Cohort Studies (JBI) A checklist for cohort studies from the Joanna Briggs Institute
- The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomised studies in meta-analyses A validated tool for assessing case-control and cohort studies
- STROBE Checklist A checklist for quality assessment of case-control, cohort, and cross-sectional studies
Case-Control Studies
- CASP- Case Control Study A checklist created by the Critical Appraisal Skills Programme (CASP) to assess key criteria relevant to case-control studies
- Tool to Assess Risk of Bias in Case Control Studies by the CLARITY Group at McMaster University A quality assessment tool for case-control studies from the CLARITY Group at McMaster University
- Checklist for Case-Control Studies A checklist created by the Joanna Briggs Institute
Cross-Sectional Studies
Diagnostic Studies
- CASP- Diagnostic Studies A checklist for diagnostic studies created by the Critical Appraisal Skills Programme (CASP)
- QUADAS-2 A quality assessment tool developed by a team at the Bristol Medical School: Population Health Sciences at the University of Bristol
- Critical Appraisal Checklist for Diagnostic Test Accuracy Studies (JBI) A checklist for quality assessment of diagnostic studies developed by the Joanna Briggs Institute
Economic Studies
- Consensus Health Economic Criteria (CHEC) List A list of 19 yes-or-no questions for assessing economic evaluations
- CASP- Economic Evaluation A checklist for quality assessment of economic studies by the Critical Appraisal Skills Programme
Mixed Methods
- McGill Mixed Methods Appraisal Tool (MMAT) 2018 User Guide See full site for additional information, including FAQs, references and resources, earlier versions, and more
Qualitative Studies
- CASP- Qualitative Studies 10 questions to help assess qualitative research from the Critical Appraisal Skills Programme
Systematic Reviews and Meta-Analyses
- Critical Appraisal Checklist for Systematic Reviews and Research Syntheses An 11-item checklist for evaluating systematic reviews
- AMSTAR Checklist A 16-question measurement tool to assess systematic reviews
- AHRQ Methods Guide for Effectiveness and Comparative Effectiveness Reviews A guide to selecting eligibility criteria, searching the literature, extracting data, assessing quality, and completing other steps in the creation of a systematic review
- CASP - Systematic Review A checklist for quality assessment of systematic reviews from the Critical Appraisal Skills Programme
Clinical Practice Guidelines
- National Guideline Clearinghouse Extent of Adherence to Trustworthy Standards (NEATS) Instrument A 15-item instrument using a scale of 1-5 to evaluate a guideline's adherence to the Institute of Medicine's standards for trustworthy guidelines
- AGREE-II Appraisal of Guidelines for Research and Evaluation The Appraisal of Guidelines for Research and Evaluation (AGREE) Instrument evaluates the process of practice guideline development and the quality of reporting
Other Study Designs
- NTACT Quality Checklists Quality indicator checklists for correlational studies, group experimental studies, single case research studies, and qualitative studies developed by the National Technical Assistance Center on Transition (NTACT). (Users must make an account.)
Below, you will find a sample of four popular quality assessment tools and some basic information about each. For more quality assessment tools, please view the blue tabs in the boxes above, organized by study design.
Covidence uses Cochrane Risk of Bias (which is designed for rating RCTs and cannot be used for other study types) as the default tool for quality assessment of included studies. You can opt to manually customize the quality assessment template and use a different tool better suited to your review. More information about quality assessment using Covidence, including how to customize the quality assessment template, can be found below. If you decide to customize the quality assessment template, you cannot switch back to using the Cochrane Risk of Bias template.
More Information
- Quality Assessment on the Covidence Guide
- Covidence FAQs on Quality Assessment Commonly asked questions about quality assessment using Covidence
- Covidence YouTube Channel A collection of Covidence-created videos
- URL: https://guides.lib.unc.edu/systematic-reviews
Systematic Review Toolbox
Quality Assessment
Critical Appraisal Questions
- Is the study question relevant?
- Does the study add anything new?
- What type of research question is being asked?
- Was the study design appropriate for the research question?
- Did the study methods address the most important potential sources of bias?
- Was the study performed according to the original protocol?
- Does the study test a stated hypothesis?
- Were the statistical analyses performed correctly?
- Do the data justify the conclusions?
- Are there any conflicts of interest?
The University of Sydney Library, Systematic Reviews: Assessment Tools and Critical Appraisal
Taylor, P., Hussain, J. A., & Gadoud, A. (2013). How to appraise a systematic review. British Journal of Hospital Medicine, 74(6), 331-334. doi: 10.12968/hmed.2013.74.6.331
Young, J. M., & Solomon, M. J. (2009). How to critically appraise an article. Nature Clinical Practice Gastroenterology and Hepatology, 6(2), 82-91. doi: 10.1038/ncpgasthep1331
Assessing the quality of evidence contained within a systematic review is as important as analyzing the data within. Results from a poorly conducted study can be skewed by biases from the research methodology and should be interpreted with caution. Such studies should be flagged as lower quality in the systematic review or excluded outright. Selecting an appropriate tool to help analyze the strength of evidence and the biases embedded within each paper is also essential. If using a systematic review manuscript development tool (e.g., RevMan), a checklist may be built into the software. Other software (e.g., Rayyan) may help with screening search results and discarding irrelevant studies. The following tools/checklists may help with study assessment and critical appraisal.
- Assessing the Methodological Quality of Systematic Reviews (AMSTAR 2) is widely used to critically appraise systematic reviews.
- Centre for Evidence-Based Medicine (CEBM) contains a collection of critical appraisal tools for studies of all types and examples of usage.
- Cochrane risk-of-bias (RoB 2) tool is the recommended tool for assessing quality and risk of bias in randomized clinical trials in Cochrane-submitted systematic reviews.
- Critical Appraisal Skills Programme (CASP) has 25 years of experience and expertise in critical appraisal and offers appraisal checklists for a wide range of study types.
- Joanna Briggs Institute (JBI) provides robust checklists for the appraisal and assessment of most types of studies.
- National Academies of Sciences, Health and Medicine Division provides standards for assessing bias in primary studies comprising systematic reviews of therapeutic or medical interventions.
- Newcastle-Ottawa Scale (NOS) is used to assess non-randomised observational studies of the cohort and case-control varieties.
- Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool surveys diagnostic accuracy studies on four domains: index test, reference standard, patient selection, and flow and timing.
- Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) framework is often used to measure the quality of cohort, case-control and cross-sectional studies.
Requesting Research Consultation
The Health Sciences Library provides consultation services for University of Hawaiʻi-affiliated students, staff, and faculty. The John A. Burns School of Medicine Health Sciences Library does not have the staffing to conduct reviews for, or otherwise assist, researchers unaffiliated with the University of Hawaiʻi. Please utilize the publicly available guides and support pages that address research databases and tools.
Before Requesting Assistance
Before requesting systematic review assistance from the librarians, please review the relevant guides and the various pages of the Systematic Review Toolbox. Most inquiries received have been answered there previously. Support for research software issues is limited to help with basic installation and setup. Please contact the software developer directly if further assistance is needed.
- URL: https://hslib.jabsom.hawaii.edu/systematicreview
- Published: 17 October 2017
A proposed framework for developing quality assessment tools
- Penny Whiting ORCID: orcid.org/0000-0003-1138-5682 1 , 2 ,
- Robert Wolff 3 ,
- Susan Mallett 4 , 5 ,
- Iveta Simera 6 &
- Jelena Savović 1 , 2
Systematic Reviews, volume 6, Article number: 204 (2017)
Assessment of the quality of included studies is an essential component of any systematic review. A formal quality assessment is facilitated by using a structured tool. There are currently no guidelines available for researchers wanting to develop a new quality assessment tool.
This paper provides a framework for developing quality assessment tools based on our experiences of developing a variety of quality assessment tools for studies of differing designs over the last 14 years. We have also drawn on experience from the work of the EQUATOR Network in producing guidance for developing reporting guidelines.
We do not recommend a single ‘best’ approach. Instead, we provide a general framework with suggestions as to how the different stages can be approached. Our proposed framework is based around three key stages: initial steps, tool development and dissemination.
Conclusions
We recommend that anyone who would like to develop a new quality assessment tool follow the stages outlined in this paper. We hope that our proposed framework will increase the number of tools developed using robust methods.
Systematic reviews are generally considered to provide the most reliable form of evidence for decision makers [ 1 ]. A formal assessment of the quality of the included studies is an essential component of any systematic review [ 2 , 3 ]. Quality can be considered to have three components—internal validity (risk of bias), external validity (applicability/variability) and reporting quality. The quality of included studies depends on them being sufficiently well designed and conducted to be able to provide reliable results [ 4 ]. Poor design, conduct or analysis can introduce bias or systematic error affecting study results and conclusions—this is also known as internal validity. External validity or the applicability of the study to the review question is also an important component of study quality. Reporting quality relates to how well the study is reported—it is difficult to assess other components of study quality if the study is not reported with the appropriate level of detail.
When conducting a systematic review, stronger conclusions can be derived from studies at low risk of bias, rather than when evidence is based on studies with serious methodological flaws. Formal quality assessment as part of a systematic review, therefore, provides an indication of the strength of the evidence on which conclusions are based and allows comparisons between studies based on risk of bias [ 3 ]. The GRADE system for rating the overall quality of the evidence included in a systematic review is recommended by many guidelines and systematic review organisations such as National Institute for Health and Care Excellence (NICE) and Cochrane. Risk of bias is a key component of this along with publication bias, imprecision, inconsistency, indirectness and magnitude of effect [ 5 , 6 ].
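As a rough illustration of how these GRADE domains combine (a deliberate simplification of the published GRADE guidance, not a substitute for it), evidence from randomized trials might start at high certainty and be downgraded one level for each serious concern, with a large magnitude of effect as a possible reason to upgrade:

```python
LEVELS = ["very low", "low", "moderate", "high"]

def grade_certainty(start: str, serious_concerns: list[str], large_effect: bool = False) -> str:
    """Very simplified GRADE-style rating: downgrade one level per serious concern,
    optionally upgrade for a large magnitude of effect (mainly relevant to observational evidence).
    Illustration only; real GRADE judgements involve more nuance than this."""
    level = LEVELS.index(start)
    level -= len(serious_concerns)  # e.g. risk of bias, imprecision, inconsistency, indirectness, publication bias
    if large_effect:
        level += 1
    return LEVELS[max(0, min(level, len(LEVELS) - 1))]

# Randomized evidence with serious risk of bias and serious imprecision -> "low"
print(grade_certainty("high", ["risk of bias", "imprecision"]))
```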
A formal quality assessment is facilitated by using a structured tool. Although it is possible for reviewers to simply assess what they consider to be key components of quality, this may result in important sources of bias being omitted, inappropriate items included or too much emphasis being given to particular items guided by reviewers’ subjective opinions. In contrast, a structured tool provides a convenient standardised way to assess quality providing consistency across reviews. Robust tools are usually developed based on empirical evidence refined by expert consensus.
This paper provides a framework for developing quality assessment tools. We use the term ‘quality assessment tool’ to refer to any tool designed to target one or more aspects of the quality of a research study. This term can apply to any tool, whether it focuses specifically on one aspect of study quality (usually risk of bias) or covers additional aspects such as applicability/generalisability and reporting quality. We do not place any restrictions on the type of ‘tool’ to which this framework can be applied; it should be appropriate for a variety of different approaches such as checklists, domain-based approaches, tables or graphics or any other format that developers may want to consider. We do not recommend a single ‘best’ approach. Instead, we provide a general framework with suggestions on how the different stages can be approached. This is based on our experience of developing quality assessment tools for studies of differing designs over the last 14 years. These include QUADAS [ 7 ] and QUADAS-2 [ 8 ] for diagnostic accuracy studies, ROBIS [ 9 ] for systematic reviews, PROBAST [ 10 ] for prediction modelling studies, ROBINS-I [ 11 ] for non-randomised studies of interventions and the new version of the Cochrane risk of bias tool for randomised trials (RoB 2.0) [ 12 ]. We have also drawn on experience from the work of the EQUATOR Network in producing guidance for developing reporting guidelines [ 13 ].
Over the years that we have been involved in the development of quality assessment tools and through involvement in different development processes, we noticed that the methods used to develop each tool could be mapped to a similar underlying process. The proposed framework evolved through discussion among the team, describing the steps involved in developing the different tools, and then grouping these into appropriate headings and stages.
Results: Proposed framework
Fig. 1 and Table 1 outline the proposed steps in our framework, grouped into three stages. The table also includes examples of how each step was approached for the tools that we have been involved in developing. Each step is discussed in detail below.
Fig. 1: Overview of proposed framework
Stage 1: initial steps
Identify the need for a new tool.
The first step in developing a new quality assessment (QA) tool is to identify the need for a new tool: What is the rationale for developing the new tool? In their guidance on developing reporting guidelines, Moher et al. [ 13 ] stated that “developing a reporting guidelines is complex and time consuming, so a compelling rationale is needed”. The same applies to the development of QA tools. It may be that there is no existing QA tool for the specific study design of interest; a QA tool is available but not directly targeted to the specific context required (e.g. tools designed for clinical interventions may not be appropriate for public health interventions), existing tools might not be up to date, new evidence on particular sources of bias may have emerged that is not adequately addressed by existing tools, or new approaches to quality assessment mean that a new approach is needed. For example, QUADAS-2 and RoB 2.0 were developed as experience, anecdotal reports, and feedback suggested areas for improvement of the original QUADAS and Cochrane risk of bias tools [ 7 ]. ROBIS was developed as we felt there was no tool that specifically addressed risk of bias in systematic reviews [ 9 ].
It is important to consider whether a completely new tool is needed or whether it may be possible to modify or adapt an existing tool. If modifying an existing tool, then the original can act as a starting point, although in practice, the new tool may look very different from the original. Both QUADAS-2 [ 8 ] and the new Cochrane risk of bias tool used the original versions of these tools as a starting point [ 12 ].
Obtain funding for the tool development
There are costs involved in developing a new QA tool. These will vary depending on the approach taken but items that may need to be funded include researcher time, literature searching, travel and subsistence for attending meetings, face-to-face meetings, piloting the tool, online survey software, open access publication costs, website fees and conference attendance for dissemination. We have used different approaches to fund the development of quality assessment tools. QUADAS-2 [ 8 ] was funded by the UK Medical Research Council Methodology Programme as part of a larger project grant. ROBIS, [ 9 ] ROBINS-I [ 11 ] and Cochrane ROB 2.0 [ 12 ] were funded through smaller project-specific grants, and PROBAST [ 10 ] received no specific funding. Instead, the host institutions for each steering group member allowed them time to work on the project and covered travel and subsistence for regular steering group meetings and conference attendance. Freely available survey monkey software ( www.surveymonkey.co.uk ) was used to run an online Delphi process.
Assemble team
Assembling a team with the appropriate expertise is a key step in developing a quality assessment tool. As tool development usually relies on expert consensus, it is essential that the team includes people with an appropriate range of expertise. This generally includes methodologists with expertise in the study designs targeted by the tool, people with expertise in QA tool development and also end users, i.e. reviewers who will be using the tool. Reviewers are a group that may sometimes be overlooked but are essential to ensure that the final tool is usable by those for whom it is developed. If the tool is likely to be used in different content areas, then it is important to include reviewers who will be using the tool in all contexts. For example, ROBIS is targeted at different types of systematic reviews including reviews of interventions, diagnostic accuracy, aetiology and prognosis. We included team members who were familiar with all different types of review to ensure that the team included the appropriate expertise to develop the tool. It can also be helpful to include reviewers with a range of expertise from those new to quality assessment to more experienced reviewers. Including representatives from a wide range of organisations can also be helpful for the future uptake and dissemination of the tool. Thinking about this at an early stage is helpful. The more organisations that are involved in the development of the tool, the more likely these organisations are to feel some ownership of the tool and to want to implement the tool within their organisation in the future. The total number of people involved in tool development varies. For our tools, the number of people involved directly in the development of each tool ranged from 27 to 51 with a median of 40.
Manage the project
The size and the structure of the project team also need to be carefully considered. In order to cover an appropriate range of expertise, it is generally necessary to include a relatively large group of people. It may not be practical for such a large group to be involved in the day-to-day development of the tool, and so it may be desirable to have a smaller group responsible for driving the project by leading and coordinating all activities, and involving the larger group where their input is required. For example, when developing QUADAS-2 and PROBAST, a steering group of around 6–8 people led the development of the tool, bringing in a larger consensus group to help inform decisions on the scope and content of the tool. For ROBINS-I and Cochrane ROB 2.0, a smaller steering group led the development with domain-based working groups developing specific areas of the tool.
Define the scope
The scope of the quality assessment tool needs to be defined at an early stage. The Table 2 outlines key questions to consider when defining the scope. Tools generally target one specific type of study. The specific study design to be considered is one of the first components to define. For example, QUADAS-2 [ 8 ] focused on diagnostic accuracy studies, PROBAST [ 10 ] on prediction modelling studies and the Cochrane Risk of Bias tool on randomised trials. Some tools may be broader, targeted at multiple related designs. For example, ROBINS-I targets all non-randomised studies of interventions rather than one single study design such as cohort studies. When deciding on the focus of the tool, it is important to clearly define the design and topic areas targeted. Trade-offs of different approaches need consideration. A more focused tool can be tailored to a specific topic area. A broader tool may not be as specific but can be used to assess a wider variety of studies. For example, we developed ROBIS to be used to assess any type of systematic review, e.g. intervention, prognostic, diagnostic or aetiology. Previous tools, such as the AMSTAR tool, were developed to assess reviews of RCTs [ 14 ]. Key to any quality assessment tool is a definition of quality as addressed by the tool, i.e. defining what exactly the tool is trying to address. We have found that once the definition of quality has been clearly agreed, then it becomes much easier to decide on which items to include in the tool.
Other features to consider include whether to address both internal (risk of bias) and external validity (applicability) and the structure of the tool. The original QUADAS tool used a simple checklist design and combined items on risk of bias, reporting quality and applicability. Our more recently developed tools have followed a domain-based approach with a clear focus on assessment of risk of bias. Many of these domain-based tools also include sections covering applicability/relevance. How to rate individual items included in the tool also forms part of the scope. The original QUADAS tool [ 7 ] used a simple ‘yes, no or unclear’ rating for each question. The domain-based tools such as QUADAS-2, [ 8 ] ROBIS [ 9 ] and PROBAST [ 10 ] have signalling questions which flag the potential for bias. These are generally factual questions and can be answered as ‘yes, no or no information’. Some tools include a ‘probably yes’ or ‘probably no’ response to help reviewers answer these questions when there is not sufficient information for a more definite response. The overall domain ratings then use decision ratings like ‘high, low or unclear’ risk of bias. Some tools, such as ROBINS-I [ 11 ] and the RoB 2.0 [ 12 ], include additional domain level ratings such as ‘critical, severe, moderate or low’ and ‘low, some concerns, high’. We strongly recommend that at this stage, tool developers are explicit that quality scores should not be incorporated into the tools. Numerical summary quality scores have been shown to be poor indicators of study quality, and so, alternatives to their use should be encouraged [ 15 , 16 ]. When developing many of our tools, we were explicit at the scope stage that we wanted to come up with an overall assessment of study quality but avoid the use of quality scores. One of the reasons for introducing the domain-level structure first used with the QUADAS-2 tool was explicitly to avoid users calculating quality scores by simply summing the number of items fulfilled.
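To make the contrast with summed scores concrete, here is a minimal sketch of how answers to signalling questions might roll up into a domain-level judgement. The mapping is deliberately naive and is not the decision algorithm of RoB 2.0 or any other published tool; note that no numeric score is produced or summed across domains.

```python
def judge_domain(answers: dict[str, str]) -> str:
    """Roll signalling-question answers ('yes', 'probably yes', 'probably no', 'no',
    'no information') up into a domain-level judgement. Naive illustration only:
    real tools use domain-specific decision rules and reviewer judgement."""
    if any(a in ("no", "probably no") for a in answers.values()):
        return "high risk of bias"
    if any(a == "no information" for a in answers.values()):
        return "unclear risk of bias"
    return "low risk of bias"

domain_1 = {
    "Was the allocation sequence random?": "yes",
    "Was the allocation sequence concealed?": "no information",
}
print(judge_domain(domain_1))  # 'unclear risk of bias'
```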
Agreeing the scope of the tool may not be straightforward and can require much discussion between team members. An additional consideration is how decisions on scope will be made. Will this be done by a single person or by the steering group, and should some or all decisions be agreed by the larger group? The approach that we have often taken is for a smaller group (e.g. steering group) to propose the scope of the tool with the agreement reached following consultation with the larger group. Questions on the scope can often form the first discussion points at a face-to-face meeting (e.g. ROBIS [ 9 ] and QUADAS-2 [ 8 ]) or the first questions on a web-based survey (e.g. PROBAST [ 10 ]).
As with any research project, a protocol that clearly defines the scope and proposed plans for the development of the tool should be produced at an early stage of the tool development process.
Stage 2: tool development
Generate initial list of items for inclusion.
The starting point for a tool is an initial list of items to consider for inclusion. There are various ways in which this list can be generated. These include looking at existing tools, evidence reviews and expert knowledge. The most comprehensive way is to review the literature for potential sources of bias and to provide a systematic review summarising the evidence for the effects of these. This is the approach we took for the original QUADAS tool [ 7 ] and also the updated QUADAS-2 [ 8 , 17 , 18 ]. Reviewing the items included in existing tools and summarising the number of tools that included each potential item can be a useful initial step as it shows which potential items of bias have been considered as important by previous tool developers. This process was followed for the original QUADAS tool [ 7 ] and for ROBIS [ 9 ]. Examining how previous systematic reviews have incorporated quality into their results can also be helpful to provide an indication of the requirements of a QA tool. If you are updating a previous QA tool then this will often form the starting point for potential items to include in the updated tool. This was the case for QUADAS-2 [ 8 ] and the RoB 2.0 [ 12 ]. For ROBINS-I [ 11 ], domains were agreed at a consensus meeting, and then expert working groups identified potential items to include in each domain. Generating the list of items for inclusion was, therefore, based on expert consensus rather than reviewing existing evidence. This can also be a valid approach. The development of PROBAST used a combined approach of using an existing tool for a related area as the starting point (QUADAS-2), non-systematic literature reviews and expert input from both steering group members and wider PROBAST group [ 10 ].
Agree initial items and scope
After the initial stages of tool development which can often be performed by a smaller group, input from the larger group should be sought. Methods for gaining input from the larger group include holding a face-to-face meeting or a web-based survey. At this stage, the scope defined in step 1.5 can be brought to the larger group for further discussion and refinement. The initial list of items needs to be further refined until agreement is reached on which items should be included in an initial draft of the tool. If a face-to-face meeting is held, smaller break-out groups focussing on specific domains can be a helpful structure to the meeting. QUADAS-2, ROBIS and ROBINS-I all involved face-to-face meetings with smaller break-out groups early in the development process [ 8 , 9 , 11 ]. If moving straight to a web-based survey, then respondents can be asked about the scope with initial questions considering possible items to include. This approach was taken for PROBAST [ 10 ] and the original QUADAS tool [ 7 ]. For PROBAST, we also asked group members to provide supporting evidence for why items should be included in the tool [ 10 ]. Items should be turned into potential questions/signalling questions for inclusion in the tool at this relatively early stage in the development of the tool.
Produce first draft of tool and develop guidance
Following the face-to-face meeting or initial survey rounds, a first draft of the tool can be produced. The initial draft may be produced by a smaller group (e.g. steering group), single person, or by taking a domain-based approach with the larger group split into groups with each taking responsibility for single domains. For QUADAS-2 [ 8 ] and PROBAST [ 10 ], a single person developed the first draft which was then agreed by the steering group before moving forwards. The first draft of ROBIS was developed following the face-to-face meeting by two team members. Initial drafts of ROBINS-I [ 11 ] and the RoB 2.0 [ 12 ] were produced by teams working on single domains proposing initial versions for their domains. Drafts for each domain were then put together by the steering group to give a first draft of the tool. Once a first draft of the tool is available, it may be helpful to start producing a clear guidance document describing how to assess each of the items included in the tool. The earlier such a guide can be produced, the more opportunity there will be to pilot and refine it alongside the tool.
Pilot and refine
The first draft of the tool needs to go through a process of refinement until a final version that has agreement of the wider group is achieved. Consensus may be achieved in various ways. Online surveys consisting of multiple rounds until agreement on the final tool is reached are a good way of involving large numbers of experts in this process. This is the approach used for QUADAS, [ 7 ], QUADAS-2 [ 8 ], ROBIS, [ 9 ] and PROBAST [ 10 ]. If domain-based working groups were adopted for the initial development of the tool, these can also be used to finalise the tool. Members of the full group can then provide feedback on draft versions, including domains that they were not initially assigned to. This approach was used for ROBINS-I and RoB 2.0. It would also be feasible to combine such an approach with a web-based survey.
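Where a web-based survey is used, agreement is usually summarised per item between rounds. The sketch below assumes a simple percentage-agreement threshold (the 75% figure is an assumption for illustration, not a standard) and shows the kind of calculation involved.

```python
def items_reaching_consensus(votes: dict[str, list[str]], threshold: float = 0.75) -> dict[str, float]:
    """Return the items whose most common response reaches the agreement threshold.
    `votes` maps each candidate item to the list of responses from one survey round."""
    reached = {}
    for item, responses in votes.items():
        top_share = max(responses.count(r) for r in set(responses)) / len(responses)
        if top_share >= threshold:
            reached[item] = top_share
    return reached

round_2 = {
    "Was follow-up complete?": ["include"] * 18 + ["exclude"] * 2,    # 90% agreement -> retained
    "Was the trial registered?": ["include"] * 11 + ["exclude"] * 9,  # 55% -> carried to next round
}
print(items_reaching_consensus(round_2))  # {'Was follow-up complete?': 0.9}
```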
Whilst the tool is being refined, initial piloting work can be undertaken. If a guidance document has been produced, then it can be included in the piloting process. If the tool is available in different formats, for example paper-based or Access database, then these could also be made available and tested as part of the piloting. The research team may ask reviewers working on appropriate review topics to pilot the tool in their review. Alternatively, reviewers can be asked to pilot the tool on a series of sample papers and to provide feedback on their experience of using the tool. An efficient way of completing such a process is to hold a piloting event where reviewers try out the tool on a sample of papers which they can either bring with them or that are provided to them. This can be a good approach to get feedback in a timely and interactive manner. However, there are costs associated with running such an event. Asking reviewers to pilot the tool in ongoing reviews can result in delays as piloting cannot be started until the review is at the data extraction stage. Identifying reviews at an appropriate stage with reviewers willing to spend the extra time needed to pilot a new tool is not always straightforward. We held a piloting event when developing the RoB 2.0 and found this to be very efficient in providing immediate feedback on the tool. We were also able to hold a group discussion for reviewers to provide suggestions for improvements to the tool and to highlight any items that they found difficult. For previous tools, we used remote piloting which provided helpful feedback but was not as efficient as the piloting event. Ideally, any piloting process should involve reviewers with a broad range of experience ranging from those with extensive experience of conducting quality assessment of studies of a variety of designs to those relatively new to the process.
The time taken for piloting and refining the tool can vary considerably. For some tools, such as ROBIS and QUADAS-2, this process was completed in around 6–9 months. For PROBAST and ROBINS-I, the process took over 4 years.
Stage 3: dissemination
Develop a publication strategy.
A strategy to disseminate the tool is required. This should be discussed at the start of the project but may evolve as the tool is developed. The primary means of dissemination is usually through publication in a peer-reviewed journal. A more detailed guidance document can accompany the publication and be made available as a web appendix. Another option is to have dual publications, one reporting the tool and outlining how it was developed, and a second providing additional guidance on how to use the tool. This is sometimes known as an ‘E&E’ (explanation and elaboration) publication and is an approach adopted by many reporting guidelines [ 13 ].
Establish a website
Developing a website for the tool can help with dissemination. Ideally, the website should be developed before publication of the tool so that details can be included in the publication. The final version of the tool can be posted on the website together with the full guidance document. Details on who contributed to the tool development and any funding should also be acknowledged on the website. Additional resources to help reviewers use the tool can also be posted there. For example, the ROBIS ( www.robis-tool.info ) and QUADAS ( www.quadas.org ) websites both contain a Microsoft Access database that reviewers can use to complete their assessments, and templates to produce graphical and tabular displays. They also contain links to other relevant resources and details of training opportunities. Other resources that may be useful to include on tool websites include worked examples and translations of the tools, where available. QUADAS-2 has been translated into Italian and Japanese, and the translations of these tools can be accessed via its website. If the tool has been endorsed or recommended for use by particular organisations (e.g. Cochrane, UK National Institute for Health and Care Excellence (NICE)), then this could also be included on the website.
The website is also a helpful way to encourage comments about the tool, which can lead to its further improvement, and exchange of experiences with the tool implementation.
Encourage uptake of tool by leading organisations
Encouraging organisations, both national and international, to recommend the tool for use in their systematic reviews is a very effective means of making sure that, once developed, the tool is used. There are different ways this can be achieved. Involving representatives from a wide range of organisations as part of the development team may mean that they are more likely to recommend the use of the tool in their organisations. Presentations at conferences, for example the Cochrane Colloquium or Health Technology Assessment Conference, may increase knowledge of the tool within that organisation making it more likely that the tool may be recommended for use. Running workshops on the tool for organisations can help increase familiarity and usability of the tool. These can also provide helpful feedback for what to include in guidance documents and to inform future updates of the tool. For example, we have been running workshops on QUADAS and ROBIS within Cochrane for a number of years. We have also provided training to institutions such as NICE on how to use the tools. QUADAS is now recommended by both these organisations, among many others, for use in diagnostic accuracy reviews. We have also run workshops on ROBIS, PROBAST, ROBINS-I and RoB 2.0 at the annual Cochrane Colloquium. We were recently approached by the Estonian Health Insurance Fund with a request to provide training to some of their reviewers so that they could implement ROBIS within their guideline development process. We supported this by running a specific training session for them.
Ultimately, the best way to encourage tool uptake is to make sure that the tool was developed robustly and fills a gap where there is currently no existing tool or there are limitations with existing tools. Ensuring that the tool is widely disseminated also means that the tool is more likely to be used and recommended.
Translate tools
After the tool has been published, you may receive requests to translate the tool. Translation can help to disseminate the tool and encourage its use in a much broader range of countries. Tool translations, therefore, should be encouraged but it is important to reassure yourself that the translation has been completed appropriately. One method to do this is via back translation.
In this paper, we suggest a framework for developing quality assessment tools. The framework consists of three stages: (1) initial steps, (2) tool development and (3) dissemination. Each stage includes defined steps that we consider important to follow when developing a tool; there is some flexibility on how these stages may be approached. In developing this framework, we have drawn on our extensive experience of developing quality assessment tools. Despite having used different approaches to the development of each of these tools, we found that all approaches shared common features and processes. This led to the development of the framework. We recommend that anyone who would like to develop a new quality assessment tool follow the stages outlined in this paper.
When developing a new tool, you need to decide how to approach each of the proposed stages. We have given some examples of how to do this; other approaches may also be valid. Factors that may influence how you choose to approach the development of your tool include available funding, topic area, number and range of people to involve, target audience and tool complexity. For example, holding face-to-face meetings and running piloting events incur greater costs than web-based surveys or asking reviewers to pilot the tool at their own convenience. More complex tools may take longer, require additional expertise, and require more piloting and refinement.
We are not aware of any existing guidance on how to develop QA tools. Moher and colleagues have produced guidance on how to develop reporting guidelines [ 13 ]. These have been cited over 190 times, mainly by new reporting guidelines, suggesting that many reporting guideline developers have found a structured approach helpful. In the absence of guidance specifically for the development of QA tools, we also based our development of QUADAS-2 [ 8 ] and ROBIS [ 9 ] on the guidance for developing reporting guidelines. Although many of the steps proposed by Moher et al. apply to the development of QA tools, there are areas where these are not directly relevant and where specific guidance on developing QA tools would be helpful.
There are a very large number of quality assessment tools available. When developing ROBIS and QUADAS, we conducted reviews of existing quality assessment tools. These identified 40 tools to assess the quality of systematic reviews [ 19 ] and 91 tools to assess the quality of diagnostic accuracy studies [ 20 ]. However, only three systematic review tools (7.5%) [ 19 ] and two diagnostic tools (2%) reported being rigorously developed [ 20 ]. The lack of a rigorous development process for most tools suggests a need for guidance on how to develop quality assessment tools. We hope that our proposed framework will increase the number of tools developed using robust methods.
The large number of quality assessment tools available makes it difficult for people working on systematic reviews to choose the most appropriate tool(s) for use in their reviews. Therefore, we are developing an initiative similar to the EQUATOR Network to improve the process of quality assessment in systematic reviews. This will be known as the LATITUDES Network ( www.latitudes-network.org ). LATITUDES aims to highlight and increase the use of key risk of bias assessment tools, help people to use these tools more effectively, improve incorporation of results of the risk of bias assessment into the review and to disseminate best practice in risk of bias assessment.
Murad MH, Montori VM. Synthesizing evidence: shifting the focus from individual studies to the body of evidence. JAMA. 2013;309(21):2217–8.
Centre for Reviews and Dissemination. Systematic reviews: CRD's guidance for undertaking reviews in health care [Internet]. York: University of York; 2009 [accessed 23 Mar 2011].
Higgins JPT, Green S (eds.). Cochrane handbook for systematic reviews of interventions [Internet]. Version 5.1.0 [updated March 2011]. The Cochrane Collaboration; 2011 [accessed 23 Mar 2011].
Torgerson D, Torgerson C. Designing randomised trials in health, education and the social sciences: an introduction. New York: Palgrave MacMillan; 2008.
Guyatt GH, Oxman AD, Vist G, Kunz R, Brozek J, Alonso-Coello P, Montori V, Akl EA, Djulbegovic B, Falck-Ytter Y, et al. GRADE guidelines: 4. Rating the quality of evidence—study limitations (risk of bias). J Clin Epidemiol. 2011;64(4):407–15.
Balshem H, Helfand M, Schunemann HJ, Oxman AD, Kunz R, Brozek J, Vist GE, Falck-Ytter Y, Meerpohl J, Norris S, et al. GRADE guidelines: 3. Rating the quality of evidence. J Clin Epidemiol. 2011;64(4):401–6.
Whiting P, Rutjes AW, Reitsma JB, Bossuyt PM, Kleijnen J. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol. 2003;3:25.
Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, Leeflang MM, Sterne JA, Bossuyt PM. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155(8):529–36.
Whiting P, Savovic J, Higgins JP, Caldwell DM, Reeves BC, Shea B, Davies P, Kleijnen J, Churchill R, group R. ROBIS: a new tool to assess risk of bias in systematic reviews was developed. J Clin Epidemiol. 2016;69:225–34.
Mallett S, Wolff R, Whiting P, Riley R, Westwood M, Kleinen J, Collins G, Reitsma H, Moons K. Methods for evaluating medical tests and biomarkers. 04 prediction model study risk of bias assessment tool (PROBAST). Diagn Prognostic Res. 2017;1(1):7.
Sterne JA, Hernán MA, Reeves BC, Savović J, Berkman ND, Viswanathan M, Henry D, Altman DG, Ansari MT, Boutron I. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ. 2016;355:i4919.
Higgins J, Sterne J, Savović J, Page M, Hróbjartsson A, Boutron I, Reeves B, Eldridge S. A revised tool for assessing risk of bias in randomized trials. In: Chandler J, McKenzie J, Boutron I, Welch V, editors. Cochrane Methods. Cochrane Database of Systematic Reviews, Issue 10 (Suppl 1); 2016.
Moher D, Schulz KF, Simera I, Altman DG. Guidance for developers of health research reporting guidelines. PLoS Med. 2010;7(2):e1000217.
Shea BJ, Grimshaw JM, Wells GA, Boers M, Andersson N, Hamel C, Porter AC, Tugwell P, Moher D, Bouter LM. Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews. BMC Med Res Methodol. 2007;7:10.
Whiting P, Harbord R, Kleijnen J. No role for quality scores in systematic reviews of diagnostic accuracy studies. BMC Med Res Methodol. 2005;5:19.
Juni P, Witschi A, Bloch R, Egger M. The hazards of scoring the quality of clinical trials for meta-analysis. JAMA. 1999;282(11):1054–60.
Whiting P, Rutjes AW, Reitsma JB, Glas AS, Bossuyt PM, Kleijnen J. Sources of variation and bias in studies of diagnostic accuracy: a systematic review. Ann Intern Med. 2004;140(3):189–202.
Whiting PF, Rutjes AW, Westwood ME, Mallett S, Group Q-S. A systematic review classifies sources of bias and variation in diagnostic test accuracy studies. J Clin Epidemiol. 2013;66(10):1093–104.
Whiting P, Davies P, Savović J, Caldwell D, Churchill R. Evidence to inform the development of ROBIS, a new tool to assess the risk of bias in systematic reviews. 2013.
Whiting P, Rutjes AW, Dinnes J, Reitsma JB, Bossuyt PM, Kleijnen J. A systematic review finds that diagnostic reviews fail to incorporate quality despite available tools. J Clin Epidemiol. 2005;58(1):1–12.
Moons KG, Altman DG, Reitsma JB, Ioannidis JP, Macaskill P, Steyerberg EW, Vickers AJ, Ransohoff DF, Collins GS. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162(1):W1–W73.
Moons KG, de Groot JA, Bouwmeester W, Vergouwe Y, Mallett S, Altman DG, Reitsma JB, Collins GS. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Med. 2014;11(10):e1001744.
Higgins JP, Altman DG, Gøtzsche PC, Jüni P, Moher D, Oxman AD, Savović J, Schulz KF, Weeks L, Sterne JA. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ. 2011;343:d5928.
Acknowledgements
Not applicable.
Funding
The development of QUADAS-2, ROBIS and the new version of the Cochrane risk of bias tool for randomised trials (RoB 2.0) was funded by grants from the UK Medical Research Council (G0801405/1, MR/K01465X/1, MR/L004933/1-N61 and MR/K025643/1). ROBINS-I was funded by the Cochrane Methods Innovation Fund.
PW and JS time was partially supported by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care (CLAHRC) West at University Hospitals Bristol NHS Foundation Trust. SM received support from the NIHR Birmingham Biomedical Research Centre.
The views expressed in this article are those of the authors and not necessarily those of the NHS, NIHR, MRC and Cochrane or the Department of Health. The funders had no role in the design of the study, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Authors and affiliations
NIHR CLAHRC West, University Hospitals Bristol NHS Foundation Trust, Bristol, UK
Penny Whiting & Jelena Savović
School of Social and Community Medicine, University of Bristol, Bristol, UK
Kleijnen Systematic Reviews Ltd., Escrick, York, UK
Robert Wolff
Institute of Applied Health Research, University of Birmingham, Birmingham, UK
Susan Mallett
National Institute for Health Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK
Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, UK
Iveta Simera
Contributions
PW conceived the idea for this paper and drafted the manuscript. JS, IS, RW and SM contributed to the writing of the manuscript. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Penny Whiting.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article.
Whiting, P., Wolff, R., Mallett, S. et al. A proposed framework for developing quality assessment tools. Syst Rev 6, 204 (2017). https://doi.org/10.1186/s13643-017-0604-6
Received: 11 July 2017
Accepted: 04 October 2017
Published: 17 October 2017
DOI: https://doi.org/10.1186/s13643-017-0604-6
- Risk of bias
- Systematic reviews
Lorenc T, Petticrew M, Whitehead M, et al. Crime, fear of crime and mental health: synthesis of theory and systematic reviews of interventions and qualitative evidence. Southampton (UK): NIHR Journals Library; 2014 Mar. (Public Health Research, No. 2.2.)
Appendix 5: Quality assessment for the systematic review of qualitative evidence
The quality assessment tool used for the qualitative studies was drawn directly from Appendix D of Hawker et al. 245 This tool contains nine questions, each of which can be answered ‘good’, ‘fair’, ‘poor’ or ‘very poor’. Having applied the tool to the studies, we converted it into a numerical score by assigning the answers from 1 point (very poor) to 4 points (good). This produced a score for each study of a minimum of 9 points and a maximum of 36 points. To create the overall quality grades we used the following definitions: high quality (A), 30–36 points; medium quality (B), 24–29 points; low quality (C), 9–23 points. The nine questions in the tool are as follows (a short scoring sketch follows the list):
- Abstract and title . Did they provide a clear description of the study? Good: structured abstract with full information and clear title. Fair: abstract with most of the information. Poor: inadequate abstract. Very poor: no abstract.
- Introduction and aims . Was there a good background section and clear statement of the aims of the research? Good: full but concise background to discussion/study containing up-to-date literature review and highlighting gaps in knowledge; clear statement of aim AND objectives including research questions. Fair: some background and literature review; research questions outlined. Poor: some background but no aim/objectives/questions OR aims/objectives but inadequate background. Very poor: no mention of aims/objectives; no background or literature review.
- Method and data . Is the method appropriate and clearly explained? Good: method is appropriate and described clearly (e.g. questionnaires included); clear details of the data collection and recording. Fair: method appropriate, description could be better; data described. Poor: questionable whether method is appropriate; method described inadequately; little description of data. Very poor: no mention of method AND/OR method inappropriate AND/OR no details of data.
- Sampling . Was the sampling strategy appropriate to address the aims? Good: details (age/gender/race/context) of who was studied and how they were recruited and why this group was targeted; the sample size was justified for the study; response rates shown and explained. Fair: sample size justified; most information given but some missing. Poor: sampling mentioned but few descriptive details. Very poor: no details of sample.
- Data analysis . Was the description of the data analysis sufficiently rigorous? Good: clear description of how analysis was carried out; description of how themes derived/respondent validation or triangulation. Fair: descriptive discussion of analysis. Poor: minimal details about analysis. Very poor: no discussion of analysis.
- Ethics and bias . Have ethical issues been addressed and has necessary ethical approval been gained? Has the relationship between researchers and participants been adequately considered? Good: ethics: when necessary, issues of confidentiality, sensitivity and consent were addressed; bias: researcher was reflexive and/or aware of own bias. Fair: lip service was paid to above (i.e. these issues were acknowledged). Poor: brief mention of issues. Very poor: no mention of issues.
- Results . Is there a clear statement of the findings? Good: findings explicit, easy to understand and in logical progression; tables, if present, are explained in text; results relate directly to aims; sufficient data are presented to support findings. Fair: findings mentioned but more explanation could be given; data presented relate directly to results. Poor: findings presented haphazardly, not explained and do not progress logically from results. Very poor: findings not mentioned or do not relate to aims.
- Transferability or generalisability . Are the findings of this study transferable (generalisable) to a wider population? Good: context and setting of the study are described sufficiently to allow comparison with other contexts and settings, plus high score in Q4 (sampling). Fair: some context and setting described but more needed to replicate or compare the study with others, plus fair score or higher in Q4. Poor: minimal description of context/setting. Very poor: no description of context/setting.
- Implications and usefulness . How important are these findings to policy and practice? Good: contributes something new and/or different in terms of understanding/insight or perspective; suggests ideas for further research; suggests implications for policy and/or practice. Fair: two of the above. Poor: only one of the above. Very poor: none of the above.
The results of the quality assessment are shown in Table 6 of the source report (caption: “Results of the quality assessment for the qualitative studies, n = 39”; the table itself is not reproduced here).
Included under terms of UK Non-commercial Government License .
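To make the scoring conversion described in the appendix concrete (1–4 points per question, summed to a 9–36 total and graded A–C), here is a minimal Python sketch. The function names and the example answer list are illustrative only and are not part of the Hawker et al. tool.

```python
# Minimal sketch of the scoring conversion described above for the
# Hawker et al. tool: nine answers, 1-4 points each, summed to a 9-36
# score and mapped to an overall grade. Function names are illustrative.
POINTS = {"very poor": 1, "poor": 2, "fair": 3, "good": 4}

def total_score(answers):
    """Convert nine 'good'/'fair'/'poor'/'very poor' answers into a 9-36 score."""
    if len(answers) != 9:
        raise ValueError("The tool has exactly nine questions.")
    return sum(POINTS[answer.lower()] for answer in answers)

def overall_grade(score):
    """Map a total score to the quality grades used in the review."""
    if score >= 30:
        return "A (high quality)"    # 30-36 points
    if score >= 24:
        return "B (medium quality)"  # 24-29 points
    return "C (low quality)"         # 9-23 points

# Hypothetical answers for one study, in question order (Q1-Q9).
answers = ["good", "fair", "fair", "poor", "good", "poor", "fair", "fair", "good"]
score = total_score(answers)
print(score, overall_grade(score))   # 28 B (medium quality)
```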
Systematically Reviewing the Literature: Building the Evidence for Health Care Quality
Suzanne Austin Boren, PhD, and David Moxley, MLIS
There are important research and non-research reasons to systematically review the literature. This article describes a step-by-step process to systematically review the literature along with links to key resources. An example of a graduate program using systematic literature reviews to link research and quality improvement practices is also provided.
Introduction
Systematic reviews that summarize the available information on a topic are an important part of evidence-based health care. There are both research and non-research reasons for undertaking a literature review. It is important to systematically review the literature when one would like to justify the need for a study, to update personal knowledge and practice, to evaluate current practices, to develop and update guidelines for practice, and to develop work-related policies. 1 A systematic review draws upon the best health services research principles and methods to address: What is the state of the evidence on the selected topic? The systematic process enables others to reproduce the methods and to make a rational determination of whether to accept the results of the review. An abundance of articles exists focusing on different aspects of systematic reviews. 2 – 9 The purpose of this article is to describe a step-by-step process of systematically reviewing the health care literature and to provide links to key resources.
Systematic Review Process: Six Key Steps
Six key steps to systematically review the literature are outlined in Table 1 and discussed here.
Systematic Review Steps
1. Formulate the Question and Refine the Topic
When preparing a topic to conduct a systematic review, it is important to ask at the outset, “What exactly am I looking for?” Hopefully it seems like an obvious step, but explicitly writing a one or two sentence statement of the topic before you begin to search is often overlooked. It is important for several reasons; in particular because, although we usually think we know what we are searching for, in truth our mental image of a topic is often quite fuzzy. The act of writing something concise and intelligible to a reader, even if you are the only one who will read it, clarifies your thoughts and can inspire you to ask key questions. In addition, in subsequent steps of the review process, when you begin to develop a strategy for searching the literature, your topic statement is the ready raw material from which you can extract the key concepts and terminology for your strategies. The medical and related health literature is massive, so the more precise and specific your understanding of your information need, the better your results will be when you search.
2. Search, Retrieve, and Select Relevant Articles
The retrieval tools chosen to search the literature should be determined by the purpose of the search. Questions to ask include: For what and by whom will the information be used? A topical expert or a novice? Am I looking for a simple fact? A comprehensive overview on the topic? Exploration of a new topic? A systematic review? For the purpose of a systematic review of journal research in the area of health care, PubMed/MEDLINE is the most appropriate retrieval tool to start with; however, other databases may be useful ( Table 2 ). In particular, Google Scholar allows one to search the same set of articles as PubMed/MEDLINE, in addition to some from other disciplines, but it lacks a number of key advanced search features that a skilled searcher can exploit in PubMed/MEDLINE.
Examples of Electronic Bibliographic Databases Specific to Health Care
Note: These databases may be available through university or hospital library systems.
An effective way to search the literature is to break the topic into different “building blocks.” The building blocks approach is the most systematic and works best in periodical databases such as PubMed/MEDLINE. The “blocks” in a “building blocks” strategy consist of the key concepts in the search topic. For example, suppose we are interested in mobile phone-based interventions for monitoring of patient status or disease management. We could break the topic into the following concepts or blocks: (1) mobile phones, (2) patient monitoring, and (3) disease management. Gather synonyms and related terms to represent each concept and match them to available subject headings in databases that offer them. Organize the resulting concepts into individual queries. Run the queries and examine your results to find relevant items and to suggest query modifications that improve your results. Revise and re-run your strategy based on your observations. Repeat this process until you are satisfied or further modifications produce no improvements. For example, in MEDLINE these terms would be combined as follows: cellular phone AND (ambulatory monitoring OR disease management), where each of the key word phrases is an official subject heading in the MEDLINE vocabulary. Keep detailed notes on the literature search, as it will need to be reported in the methods section of the systematic review paper. Careful noting of search strategies also allows you to revisit a topic in the future and confidently replicate the same results, with the addition of those subsequently published on your topic.
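As an illustration of running the building-blocks strategy above against PubMed programmatically, the sketch below uses Biopython's wrapper around the NCBI E-utilities. The query string mirrors the MEDLINE example in the text; the email address is a placeholder and the retmax value is arbitrary.

```python
# Sketch: run the building-blocks query against PubMed via NCBI E-utilities,
# using Biopython's Entrez module. Replace the placeholder email with your own;
# NCBI asks for a contact address with every request.
from Bio import Entrez

Entrez.email = "[email protected]"  # placeholder contact address

query = (
    '"cellular phone"[MeSH Terms] AND '
    '("ambulatory monitoring"[MeSH Terms] OR "disease management"[MeSH Terms])'
)

handle = Entrez.esearch(db="pubmed", term=query, retmax=100)
record = Entrez.read(handle)
handle.close()

print("Total hits:", record["Count"])    # number of matching citations
print("First PMIDs:", record["IdList"])  # up to retmax PubMed IDs

# Record the exact query string and the date it was run so the search can be
# reported in the methods section and replicated later.
```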
3. Assess Quality
There is no consensus on the best way to assess study quality. Many quality assessment tools cover issues such as: appropriateness of the study design to the research objective, risk of bias, generalizability, statistical issues, quality of the intervention, and quality of reporting. Reporting guidelines for most literature types are available at the EQUATOR Network website ( http://www.equator-network.org/ ). These guidelines are a useful starting point; however, they address completeness of reporting and should not be used for assessing study quality.
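The article does not prescribe a particular recording format for these judgements, but a structured record per study makes later tabulation easier. A minimal sketch follows; the class, field names, and example values are assumptions for illustration, not a validated tool.

```python
# Illustrative structure for recording quality-assessment judgements so they
# can be tabulated across studies. The domains mirror the items mentioned
# above; the class and example values are hypothetical, not a validated tool.
from dataclasses import dataclass, asdict

@dataclass
class QualityRecord:
    study_id: str
    design_appropriate: str   # "yes" / "no" / "unclear"
    risk_of_bias: str         # "low" / "some concerns" / "high"
    generalizability: str
    statistical_issues: str
    intervention_quality: str
    reporting_quality: str
    notes: str = ""

records = [
    QualityRecord("Study A (hypothetical)", "yes", "some concerns",
                  "limited", "adequately handled", "well described", "good"),
]

for record in records:
    print(asdict(record))
```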
4. Extract Data and Information
Extract information from each eligible article into a standardized format to permit the findings to be summarized. This will involve building one or more tables. When making tables, each row should represent an article and each column a variable. Not all of the information that is extracted into the tables will end up in the paper. All of the information that is extracted from the eligible articles will help you obtain an overview of the topic; however, you will want to reserve the use of tables in the literature review paper for the more complex information. All tables should be introduced and discussed in the narrative of the literature review. An example of an evidence summary table is presented in Table 3 .
Example of an evidence summary table
Notes: BP = blood pressure, HbA1c = Hemoglobin A1c, Hypo = hypoglycemic, I = Internet, NS = not significant, PDA = personal digital assistant, QOL = quality of life, SMBG = self-monitored blood glucose, SMS = short message service, V = voice
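As a sketch of how such an evidence summary table can be assembled in code, the example below uses pandas, with one row per included article and one column per extracted variable. Both rows are entirely hypothetical placeholders, not data from real studies.

```python
# Sketch of an evidence summary table built during data extraction:
# one row per included article, one column per extracted variable.
# Both rows are hypothetical placeholders, not real study data.
import pandas as pd

evidence = pd.DataFrame(
    [
        {"study": "Study A (hypothetical)", "design": "RCT", "n": 120,
         "intervention": "SMS reminders", "outcome": "HbA1c",
         "result": "improvement reported"},
        {"study": "Study B (hypothetical)", "design": "Cohort", "n": 85,
         "intervention": "Phone-based monitoring", "outcome": "BP",
         "result": "no significant difference"},
    ]
)

print(evidence.to_string(index=False))
```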
5. Analyze and Synthesize Data and Information
The findings from individual studies are analyzed and synthesized so that the overall effectiveness of the intervention can be determined. At this stage it should also be noted whether the effect of an intervention is comparable across different studies, participants, and settings.
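The article does not specify a synthesis method; one common quantitative approach, when the included studies report comparable effect estimates, is fixed-effect inverse-variance pooling. A minimal sketch with hypothetical numbers follows; it is not the method of any cited review.

```python
# Generic fixed-effect inverse-variance pooling of study effect estimates.
# The effects and standard errors below are hypothetical; this is not the
# synthesis method of any particular cited review.
import math

effects = [0.30, 0.10, 0.25]     # per-study effect estimates (e.g., mean differences)
std_errors = [0.10, 0.15, 0.12]  # their standard errors

weights = [1.0 / se ** 2 for se in std_errors]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
pooled_se = math.sqrt(1.0 / sum(weights))

low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"Pooled effect: {pooled:.2f} (95% CI {low:.2f} to {high:.2f})")
```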
6. Write the Systematic Review
The PRISMA 12 and ENTREQ 13 checklists can be useful resources when writing a systematic review. These uniform reporting tools focus on how to write coherent and comprehensive reviews and help readers and reviewers evaluate a review's relative strengths and weaknesses. A systematic literature review has the same structure as an original research article:
TITLE : The systematic review title should indicate the content. The title should reflect the research question, however it should be a statement and not a question. The research question and the title should have similar key words.
STRUCTURED ABSTRACT: The structured abstract recaps the background, methods, results and conclusion in usually 250 words or less.
INTRODUCTION: The introduction summarizes the topic or problem and specifies the practical significance for the systematic review. The first paragraph or two of the paper should capture the attention of the reader. It might be dramatic, statistical, or descriptive, but above all, it should be interesting and very relevant to the research question. The topic or problem is linked with earlier research through previous attempts to solve the problem. Gaps in the literature regarding research and practice should also be noted. The final sentence of the introduction should clearly state the purpose of the systematic review.
METHODS: The methods provide a specification of the study protocol with enough information so that others can reproduce the results. It is important to include information on the:
Eligibility criteria for studies: Who are the patients or subjects? What are the study characteristics, interventions, and outcomes? Were there language restrictions?
Literature search: What databases were searched? Which key search terms were used? Which years were searched?
Study selection: What was the study selection method? Was the title screened first, followed by the abstract, and finally the full text of the article?
Data extraction: What data and information will be extracted from the articles?
Data analysis: What are the statistical methods for handling any quantitative data?
RESULTS: The results should also be well-organized. One way to approach the results is to include information on the:
Search results: What are the numbers of articles identified, excluded, and ultimately eligible?
Study characteristics: What are the type and number of subjects? What are the methodological features of the studies?
Study quality score: What is the overall quality of included studies? Does the quality of the included studies affect the outcome of the results?
Results of the study: What are the overall results and outcomes? Could the literature be divided into themes or categories?
DISCUSSION: The discussion begins with a nonnumeric summary of the results. Next, gaps in the literature as well as limitations of the included articles are discussed with respect to the impact that they have on the reliability of the results. The final paragraph provides conclusions as well as implications for future research and current practice. For example, questions for future research on this topic are revealed, as well as whether or not practice should change as a result of the review.
REFERENCES: A complete bibliographical list of all journal articles, reports, books, and other media referred to in the systematic review should be included at the end of the paper. Referencing software can facilitate the compilation of citations and is useful in terms of ensuring the reference list is accurate and complete.
The following resources may be helpful when writing a systematic review:
CEBM: Centre for Evidence-based Medicine. Dedicated to the practice, teaching and dissemination of high quality evidence based medicine to improve health care Available at: http://www.cebm.net/ .
CITING MEDICINE: The National Library of Medicine Style Guide for Authors, Editors, and Publishers. This resource provides guidance in compiling, revising, formatting, and setting reference standards. Available at http://www.ncbi.nlm.nih.gov/books/NBK7265/ .
EQUATOR NETWORK: Enhancing the QUAlity and Transparency Of health Research. The EQUATOR Network promotes the transparent and accurate reporting of research studies. Available at: http://www.equator-network.org/ .
ICMJE RECOMMENDATIONS: International Committee of Medical Journal Editors Recommendations for the Conduct, Reporting, Editing and Publication of Scholarly Work in Medical Journals. The ICMJE recommendations are followed by a large number of journals. Available at: http://www.icmje.org/about-icmje/faqs/icmje-recommendations/ .
PRISMA STATEMENT: Preferred Reporting Items for Systematic Reviews and Meta-Analyses. Authors can utilize the PRISMA Statement checklist to improve the reporting of systematic reviews and meta-analyses. Available at: http://prisma-statement.org .
THE COCHRANE COLLABORATION: A reliable source for making evidence generated through research useful for informing decisions about health. Available at: http://www.cochrane.org/ .
Examples of Systematic Reviews To Link Research and Quality Improvement
Over the past 17 years more than 300 learners, including physicians, nurses, and health administrators have completed a course as part of a Master of Health Administration or a Master of Science in Health Informatics degree at the University of Missouri. An objective of the course is to educate health informatics and health administration professionals about how to utilize a systematic, scientific, and evidence-based approach to literature searching, appraisal, and synthesis. Learners in the course conduct a systematic review of the literature on a health care topic of their choosing that could suggest quality improvement in their organization. Students select topics that make sense in terms of their core educational competencies and are related to their work. The categories of topics include public health, leadership, information management, health information technology, electronic medical records, telehealth, patient/clinician safety, treatment/screening evaluation, cost/finance, human resources, planning and marketing, supply chain, education/training, policies and regulations, access, and satisfaction. Some learners have published their systematic literature reviews 14 – 15 . Qualitative comments from the students indicate that the course is well received and the skills learned in the course are applicable to a variety of health care settings.
Undertaking a literature review includes identification of a topic of interest, searching and retrieving the appropriate literature, assessing quality, extracting data and information, analyzing and synthesizing the findings, and writing a report. A structured step-by-step approach facilitates the development of a complete and informed literature review.
Suzanne Austin Boren, PhD, MHA, (above) is Associate Professor and Director of Academic Programs, and David Moxley, MLIS, is Clinical Instructor and Associate Director of Executive Programs. Both are in the Department of Health Management and Informatics at the University of Missouri School of Medicine.
Disclosure: None reported.
- 1. Polit DF, Beck CT. Nursing Research: Principles and Methods. 9th edition. Philadelphia: Lippincott, Williams & Wilkins; 2011.
- 2. Bruce J, Mollison J. Reviewing the literature: adopting a systematic approach. Journal of Family Planning and Reproductive Health Care. 2004;30(1). doi: 10.1783/147118904322701901.
- 3. Cronin P, Ryan F, Coughlin M. Undertaking a literature review: a step-by-step approach. British Journal of Nursing. 2008;17(1):38–43. doi: 10.12968/bjon.2008.17.1.28059.
- 4. Crowther DM. A clinician's guide to systematic reviews. Nutr Clin Pract. 2013;28:459–462. doi: 10.1177/0884533613490742.
- 5. Hasse SC. Systematic reviews and meta-analysis. Plast Reconstr Surg. 2011;127:955–966. doi: 10.1097/PRS.0b013e318200afa9.
- 6. Mandrekar JN, Mandrekar SJ. Systematic reviews and meta-analysis of published studies: an overview and best practices. J Thorac Oncol. 2011;6(8):1301–1303. doi: 10.1097/JTO.0b013e31822461b0.
- 7. Ng KH, Peh WC. Writing a systematic review. Singapore Med J. 2010 May;51(5):362–6.
- 8. Price B. Guidance on conducting a literature search reviewing mixed literature. Nursing Standard. 2009;23(24):43–49. doi: 10.7748/ns2009.02.23.24.43.c6829.
- 9. Engberg S. Systematic reviews and meta-analysis: studies of studies. J Wound Ostomy Continence Nurs. 2008;35(3):258–265. doi: 10.1097/01.WON.0000319122.76112.23.
- 10. Benhamou PY, Melki V, Boizel R, et al. One-year efficacy and safety of Web-based follow-up using cellular phone in type 1 diabetic patients under insulin pump therapy: the PumpNet study. Diabetes & Metabolism. 2007;33(3):220–6. doi: 10.1016/j.diabet.2007.01.002.
- 11. Marquez Contreras E, de la Figuera von Wichmann M, Gil Guillen V, Ylla-Catala A, Figueras M, Balana M, Naval J. Effectiveness of an intervention to provide information to patients with hypertension as short text messages and reminders sent to their mobile phone [Spanish]. Aten Primaria. 2004;34(8):399–405. doi: 10.1016/S0212-6567(04)78922-2.
- 12. Moher D, Liberati A, Tetzlaff J, Altman DG; The PRISMA Group. Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. Ann Intern Med. 2009;151(4):264–269, W64. doi: 10.7326/0003-4819-151-4-200908180-00135.
- 13. Tong A, Flemming K, McInnes E, Oliver S, Craig J. Enhancing transparency in reporting the synthesis of qualitative research: ENTREQ. BMC Med Res Methodol. 2012;12(1):181. doi: 10.1186/1471-2288-12-181.
- 14. Hart MD. Informatics competency and development within the US nursing population workforce. CIN: Computers, Informatics, Nursing. 2008;26(6):320–329. doi: 10.1097/01.NCN.0000336462.94939.4c.
- 15. Bryan C, Boren SA. The use and effectiveness of electronic clinical decision support tools in the ambulatory, primary care setting: a systematic review of the literature. Informatics in Primary Care. 2008 Jun;16(2):79–91. doi: 10.14236/jhi.v16i2.679.