How to Write the Rationale of the Study in Research (Examples)


What is the Rationale of the Study?

The rationale of the study is the justification for taking on a given study. It explains the reason the study was conducted or should be conducted. This means the study rationale should explain to the reader or examiner why the study is/was necessary. It is also sometimes called the “purpose” or “justification” of a study. While this is not difficult to grasp in itself, you might wonder how the rationale of the study is different from your research question or from the statement of the problem of your study, and how it fits into the rest of your thesis or research paper. 

The rationale of the study links the background of the study to your specific research question and justifies the need for the latter on the basis of the former. In brief, you first provide and discuss existing data on the topic, and then you tell the reader, based on the background evidence you just presented, where you identified gaps or issues and why you think it is important to address those. The problem statement, lastly, is the formulation of the specific research question you choose to investigate, following logically from your rationale, and the approach you are planning to use to do that.

How to Write a Rationale for a Research Paper

The basis for writing a research rationale is preliminary data or a clear description of an observation. If you are doing basic/theoretical research, then a literature review will help you identify gaps in current knowledge. In applied/practical research, you base your rationale on an existing issue with a certain process (e.g., vaccine proof registration) or practice (e.g., patient treatment) that is well documented and needs to be addressed. By presenting the reader with earlier evidence or observations, you can (and have to) convince them that you are not just repeating what other people have already done or said and that your ideas are not coming out of thin air. 

Once you have explained where you are coming from, you should justify the need for doing additional research–this is essentially the rationale of your study. Finally, when you have convinced the reader of the purpose of your work, you can end your introduction section with the statement of the problem of your research that contains clear aims and objectives and also briefly describes (and justifies) your methodological approach. 

When is the Rationale for Research Written?

The author can present the study rationale both before and after the research is conducted. 

  • Before conducting research: The study rationale is a central component of the research proposal. It represents the plan of your work, constructed before the study is actually executed.
  • Once research has been conducted: After the study is completed, the rationale is presented in a research article or PhD dissertation to explain why you focused on this specific research question. When writing the study rationale for this purpose, the author should link the rationale of the research to the aims and outcomes of the study.

What to Include in the Study Rationale

Although every study rationale is different and discusses different specific elements of a study’s method or approach, there are some elements that should be included to write a good rationale. Make sure to touch on the following:

  • A summary of conclusions from your review of the relevant literature
  • What is currently unknown (gaps in knowledge)
  • Inconclusive or contested results  from previous studies on the same or similar topic
  • The necessity to improve or build on previous research, such as to improve methodology or utilize newer techniques and/or technologies

There are different types of limitations that you can use to justify the need for your study. In applied/practical research, the justification for investigating something is always that an existing process/practice has a problem or is not satisfactory. Let’s say, for example, that people in a certain country/city/community commonly complain about hospital care on weekends (not enough staff, not enough attention, no decisions being made), but you looked into it and realized that nobody has ever investigated whether these perceived problems reflect objective shortages or unavailability of care, or whether the services provided on weekends are in fact commensurate with the lower numbers of patients treated then.

In this case, “lack of data” is your justification for digging deeper into the problem. Or, if it is obvious that there is a shortage of staff and provided services on weekends, you could decide to investigate which of the usual procedures are skipped during weekends as a result and what the negative consequences are. 

In basic/theoretical research, lack of knowledge is of course a common and accepted justification for additional research—but make sure that it is not your only motivation. “Nobody has ever done this” is only a convincing reason for a study if you explain to the reader why you think we should know more about this specific phenomenon. If there is earlier research but you think it has limitations, then those can usually be classified into “methodological”, “contextual”, and “conceptual” limitations. To identify such limitations, you can ask specific questions and let those questions guide you when you explain to the reader why your study was necessary:

Methodological limitations

  • Did earlier studies try but fail to measure/identify a specific phenomenon?
  • Was earlier research based on incorrect conceptualizations of variables?
  • Were earlier studies based on questionable operationalizations of key concepts?
  • Did earlier studies use questionable or inappropriate research designs?

Contextual limitations

  • Have recent changes in the studied problem made previous studies irrelevant?
  • Are you studying a new/particular context that previous findings do not apply to?

Conceptual limitations

  • Do previous findings only make sense within a specific framework or ideology?

Study Rationale Examples

Let’s look at an example from one of our earlier articles on the statement of the problem to clarify how your rationale fits into your introduction section. This is a very short introduction for a practical research study on the challenges of online learning. Your introduction might be much longer (especially the context/background section), and this example does not contain any sources (which you will have to provide for all claims you make and all earlier studies you cite)—but please pay attention to how the background presentation, rationale, and problem statement blend into each other in a logical way so that the reader can follow and has no reason to question your motivation or the foundation of your research.

Background presentation

Since the beginning of the Covid pandemic, most educational institutions around the world have transitioned to a fully online study model, at least during peak times of infections and social distancing measures. This transition has not been easy and even two years into the pandemic, problems with online teaching and studying persist (reference needed).

While the increasing gap between those with access to technology and equipment and those without access has been determined to be one of the main challenges (reference needed), others claim that online learning offers more opportunities for many students by breaking down barriers of location and distance (reference needed).

Rationale of the study

Since teachers and students cannot wait for circumstances to go back to normal, the measures that schools and universities have implemented during the last two years, their advantages and disadvantages, and the impact of those measures on students’ progress, satisfaction, and well-being need to be understood so that improvements can be made and demographics that have been left behind can receive the support they need as soon as possible.

Statement of the problem

To identify what changes in the learning environment were considered the most challenging and how those changes relate to a variety of student outcome measures, we conducted surveys and interviews among teachers and students at ten institutions of higher education in four different major cities, two in the US (New York and Chicago), one in South Korea (Seoul), and one in the UK (London). Responses were analyzed with a focus on different student demographics and how they might have been affected differently by the current situation.

How long is a study rationale?

In a research article bound for journal publication, your rationale should not be longer than a few sentences (no longer than one brief paragraph). A  dissertation or thesis  usually allows for a longer description; depending on the length and nature of your document, this could be up to a couple of paragraphs in length. A completely novel or unconventional approach might warrant a longer and more detailed justification than an approach that slightly deviates from well-established methods and approaches.

Consider Using Professional Academic Editing Services

Now that you know how to write the rationale of the study for a research proposal or paper, you should make use of Wordvice AI’s free AI Grammar Checker, or receive professional academic proofreading services from Wordvice, including research paper editing services and manuscript editing services to polish your submitted research documents.

You can also find many more articles, for example on writing the other parts of your research paper, on choosing a title, or on making sure you understand and adhere to the author instructions before you submit to a journal, on the Wordvice academic resources pages.

How to Write the Rationale for a Research Paper

A research rationale answers the big SO WHAT? that every adviser, peer reviewer, and editor has in mind when they critique your work. A compelling research rationale increases the chances of your paper being published or your grant proposal being funded. In this article, we look at the purpose of a research rationale, its components and key characteristics, and how to create an effective research rationale.

Updated on September 19, 2022

The rationale for your research is the reason why you decided to conduct the study in the first place. The motivation for asking the question. The knowledge gap. This is often the most significant part of your publication. It justifies the study's purpose, novelty, and significance for science or society. It's a critical part of standard research articles as well as funding proposals.

Essentially, the research rationale answers the big SO WHAT? that every (good) adviser, peer reviewer, and editor has in mind when they critique your work.

A compelling research rationale increases the chances of your paper being published or your grant proposal being funded. In this article, we look at:

  • the purpose of a research rationale
  • its components and key characteristics
  • how to create an effective research rationale

What is a research rationale?

Think of a research rationale as a set of reasons that explain why a study is necessary and important based on its background. It's also known as the justification of the study, rationale, or thesis statement.

Essentially, you want to convince your reader that you're not reciting what other people have already said and that your opinion hasn't appeared out of thin air. You've done the background reading and identified a knowledge gap that this rationale now explains.

A research rationale is usually written toward the end of the introduction. You'll see this section clearly in high-impact-factor international journals like Nature and Science. At the end of the introduction there's always a phrase that begins with something like, "here we show..." or "in this paper we show..." This text is part of a logical sequence of information, typically (but not necessarily) provided in this order:

[Figure: the typical order of information in the introduction to a research paper]

Here's an example from a study by Cataldo et al. (2021) on the impact of social media on teenagers' lives.

[Figure: an example introduction to a research paper, from Cataldo et al. (2021)]

Note how the research background, gap, rationale, and objectives logically blend into each other.

The authors chose to put the research aims before the rationale. This is not a problem, though: they still achieve a logical sequence that helps the reader follow their thinking and convinces them of the research's foundation.

Elements of a research rationale

We saw that the research rationale follows logically from the research background and literature review/observation and leads into your study's aims and objectives.

This might sound somewhat abstract. A helpful way to formulate a research rationale is to answer the question, “Why is this study necessary and important?”

Generally, that something has never been done before should not be your only motivation. Use it only if you can give the reader valid reasons why we should learn more about this specific phenomenon.

A well-written introduction covers three key elements:

  • What's the background to the research?
  • What has been done before (information relevant to this particular study, but NOT a literature review)?
  • Research rationale

Now, let's see how you might answer the question.

1. This study complements scientific knowledge and understanding

Discuss the shortcomings of previous studies and explain how you'll correct them. Your short review can identify:

  • Methodological limitations. The methodology (research design, research approach or sampling) employed in previous works is somewhat flawed.

Example: Here, the authors claim that previous studies have failed to explore the role of apathy “as a predictor of functional decline in healthy older adults” (Burhan et al., 2021). At the same time, we know a lot about other age-related neuropsychiatric disorders, like depression.

Their study is necessary, then, “to increase our understanding of the cognitive, clinical, and neural correlates of apathy and deconstruct its underlying mechanisms.” (Burhan et al., 2021).

  • Contextual limitations. External factors have changed and this has minimized or removed the relevance of previous research.

Example : You want to do an empirical study to evaluate the effects of the COVID-19 pandemic on the number of tourists visiting Sicily. Previous studies might have measured tourism determinants in Sicily, but they preceded COVID-19.

  • Conceptual limitations. Previous studies are too bound to a specific ideology or a theoretical framework.

Example : The work of English novelist E. M. Forster has been extensively researched for its social, political, and aesthetic dimensions. After the 1990s, younger scholars wanted to read his novels as an example of gay fiction. They justified the need to do so based on previous studies' reliance on homophobic ideology.

This kind of rationale is most common in basic/theoretical research.

2. This study can help solve a specific problem

Here, you base your rationale on a process that has a problem or is not satisfactory.

For example, patients complain about low-quality hospital care on weekends (staff shortages, inadequate attention, etc.). No one has looked into this (there is a lack of data). So, you explore if the reported problems are true and what can be done to address them. This is a knowledge gap.

Or you set out to explore a specific practice. You might want to study the pros and cons of several entry strategies into the Japanese food market.

It's vital to explain the problem in detail and stress the practical benefits of its solution. In the first example, the practical implications are recommendations to improve healthcare provision.

In the second example, the impact of your research is to inform the decision-making of businesses wanting to enter the Japanese food market.

This kind of rationale is more common in applied/practical research.

3. You're the best person to conduct this study

It's a bonus if you can show that you're uniquely positioned to deliver this study, especially if you're writing a funding proposal.

For an anthropologist wanting to explore gender norms in Ethiopia, this could be that they speak Amharic (Ethiopia's official language) and have already lived in the country for a few years (ethnographic experience).

Or if you want to conduct an interdisciplinary research project, consider partnering up with collaborators whose expertise complements your own. Scientists from different fields might bring different skills and a fresh perspective or have access to the latest tech and equipment. Teaming up with reputable collaborators justifies the need for a study by increasing its credibility and likely impact.

When is the research rationale written?

You can write your research rationale before, or after, conducting the study.

In the first case, you might have a new research idea and be applying for funding to implement it.

Or you're preparing a call for papers for a journal special issue or a conference. Here, for instance, the authors seek to collect studies on the impact of apathy on age-related neuropsychiatric disorders.

In the second case, you have completed the study and are writing a research paper for publication. Looking back, you explain why you did the study in question and how it worked out.

Although the research rationale is part of the introduction, it's best to write it at the end. Stand back from your study and look at it in the big picture. At this point, it's easier to convince your reader why your study was both necessary and important.

How long should a research rationale be?

The length of the research rationale is not fixed. Ideally, this will be determined by the guidelines (of your journal, sponsor etc.).

The prestigious journal Nature, for instance, calls for articles to be no more than 6 or 8 pages, depending on the content. The introduction should be around 200 words, and, as mentioned, two to three sentences at the end of the introduction serve as a brief account of the background and rationale of the study.

If you're not provided guidelines, consider these factors:

  • Research document: In a thesis or book-length study, the research rationale will be longer than in a journal article. For example, the background and rationale of this book exploring the collective memory of World War I cover more than ten pages.
  • Research question: Research into a new sub-field may call for a longer or more detailed justification than a study that plugs a gap in the literature.

Which verb tense should you use in the research rationale?

It's best to use the present tense. In a research proposal, however, the research rationale is likely written in the future tense, as you're describing the intended or expected outcomes of the research project (the gaps it will fill, the problems it will solve).

Example of a research rationale

Research question: What are the teachers' perceptions of how a sense of European identity is developed and what underlies such perceptions?

[Figure: an example of a research rationale]

References

Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77-101.

Burhan, A. M., Yang, J., & Inagawa, T. (2021). Impact of apathy on aging and age-related neuropsychiatric disorders. Research Topic. Frontiers in Psychiatry.

Cataldo, I., Lepri, B., Neoh, M. J. Y., & Esposito, G. (2021). Social media usage and development of psychiatric disorders in childhood and adolescence: A review. Frontiers in Psychiatry, 11.

CiCe Jean Monnet Network (2017). Guidelines for citizenship education in school: Identities and European citizenship, children's identity and citizenship in Europe.

Cohen, L., Manion, L., & Morrison, K. (2018). Research methods in education (8th ed.). London: Routledge.

de Prat, R. C. (2013). Euroscepticism, Europhobia and Eurocriticism: The radical parties of the right and left “vis-à-vis” the European Union. P.I.E.-Peter Lang S.A., Éditions Scientifiques Internationales.

European Commission. (2017). Eurydice Brief: Citizenship education at school in Europe.

Polyakova, A., & Fligstein, N. (2016). Is European integration causing Europe to become more nationalist? Evidence from the 2007–9 financial crisis. Journal of European Public Policy, 23(1), 60-83.

Winter, J. (2014). Sites of Memory, Sites of Mourning: The Great War in European Cultural History. Cambridge: Cambridge University Press.


How do you Write the Rationale for Research?

  • By DiscoverPhDs
  • October 21, 2020

What is the Rationale of Research?

The term rationale of research means the reason for performing the research study in question. In writing your rationale, you should be able to convey why there was a need for your study to be carried out. It’s an important part of your research paper: it should explain how your research was novel and why it was significant, helping the reader understand why your research question needed to be addressed in your research paper, term paper or other research report.

The rationale for research is also sometimes referred to as the justification for the study. When writing your rationale, begin by introducing and explaining what other researchers have published within your research field.

Having explained the work of previous literature and prior research, include discussion about where the gaps in knowledge are in your field. Use these to define potential research questions that need answering and explain the importance of addressing these unanswered questions.

The rationale conveys to the reader of your publication exactly why your research topic was needed and why it was significant. Having defined your research rationale, you would then go on to define your hypothesis and your research objectives.

Final Comments

Defining the research rationale is a key part of the research process and academic writing in any research project. You use it in your research paper first to explain the research problem within your dissertation topic. This gives you the research justification you need to define your research question and what the expected outcomes may be.


How to Write a Rationale: A Guide for Research and Beyond

Ever found yourself scratching your head, wondering how to justify your choice of a research topic or project? You’re not alone! Writing a rationale, which essentially means explaining the ‘why’ behind your decisions, is crucial to any research process. It’s like the secret sauce that adds flavour to your research recipe. So, the only thing you need to know is how to write a rationale.


What is a Rationale?

A rationale in research is essentially the foundation of your study. It serves as the justification for undertaking a particular research project. At its core, the rationale explains why the research was conducted or needs to be conducted, thus addressing a specific knowledge gap or research question.

Here’s a breakdown of the key elements involved in crafting a rationale:

Linking Background to Research Question: 

The rationale should connect the background of the study to your specific research question. It involves presenting and discussing existing data on your topic, identifying gaps or issues in the current understanding, and explaining why addressing them is important.

Objectives and Significance: 

Your rationale should clearly outline your research objectives – what you hope to discover or achieve through the study. It should also emphasize the subject’s significance in your field and explain why more or better research is needed.

Methodological Approach: 

The rationale should briefly describe your proposed research method, whether qualitative (descriptive) or quantitative (experimental), and justify this choice.

Justifying the Need for Research: 

The rationale isn’t just about what you’re doing and why it’s necessary. It can also involve highlighting methodological, contextual, or conceptual limitations in previous studies and explaining how your research aims to overcome them. Essentially, you’re making a case for why your research fills a crucial gap in existing knowledge.

Presenting Before and After Research: 

Interestingly, the rationale can be presented before and after the research. Before the research, it forms a central part of the research proposal, setting out the plan for the work. After the research, it’s presented in a research article or dissertation to explain the focus on a specific research question and link it to the study’s aims and outcomes.

Elements to Include: 

A good rationale should include a summary of conclusions from your literature review, identify what is currently unknown, discuss inconclusive or contested results from previous studies, and emphasize the necessity to improve or build on previous research.

Creating a rationale is a vital part of the research process, as it not only sets the stage for your study but also convinces readers of the value and necessity of your work.


How to Write a Rationale:

Writing a rationale for your research is crucial in conducting and presenting your study. It involves explaining why your research is necessary and important. Here’s a guide to help you craft a compelling rationale:

Identify the Problem or Knowledge Gap: 

Begin by clearly stating the issue or gap in knowledge that your research aims to address. Explain why this problem is important and merits investigation. It is the foundation of your rationale and sets the stage for the need for your research.

Review the Literature: 

Conduct a thorough review of existing literature on your topic. It helps you understand what research has already been done and what gaps or open questions exist. Your rationale should build on this background by highlighting these gaps and emphasizing the importance of addressing them.

Define Your Research Questions/Hypotheses: 

Based on your understanding of the problem and literature review, clearly state the research questions or hypotheses that your study aims to explore. These should logically stem from the identified gaps or issues.

Explain Your Research Approach: 

Describe the methods you will use for your research, including data collection and analysis techniques. Justify why these methods are appropriate for addressing your research questions or hypotheses.

Discuss the Potential Impact of Your Research: 

Explain the significance of your study. Consider both theoretical contributions and practical implications. For instance, how does your research advance existing knowledge? Does it have real-world applications? Is it relevant to a specific field or community?

Consider Ethical Considerations: 

If your research involves human or animal subjects, discuss the ethical aspects and how you plan to conduct your study responsibly.

Contextualise Your Study: 

Justify the relevance of your research by explaining how it fits into the broader context. Connect your study to current trends, societal needs, or academic discussions.

Support with Evidence: 

Provide evidence or examples that underscore the need for your research. It could include citing relevant studies, statistics, or scenarios that illustrate the problem or gap your research addresses.

Methodological, Contextual, and Conceptual Limitations: 

Address any limitations of previous research and how your study aims to overcome them. It can include methodological flaws in previous studies, changes in external factors that make past research less relevant, or the need to study a phenomenon within a new conceptual framework.

Placement in Your Paper: 

Typically, the rationale is written toward the end of the introduction section of your paper, providing a logical lead-in to your research questions and methodology.

By following these steps and considering your audience’s perspective, you can write a strong and compelling rationale that clearly communicates the significance and necessity of your research project.

Frequently Asked Questions:

What makes a good research rationale?

A good rationale clearly identifies a gap in existing knowledge, builds on previous research, and outlines why your study is necessary and significant.

How detailed should my literature review be in the rationale?

Your literature review should be comprehensive enough to highlight the gaps your research aims to fill, but it should not overshadow the rationale itself.

Conclusion:

A well-crafted rationale is your ticket to making your research stand out. It’s about bridging gaps, challenging norms, and paving the way for new discoveries. So go ahead, make your rationale the cornerstone of your research narrative!



Setting Rationale in Research: Cracking the code for excelling at research

Knowledge and curiosity lay the foundation of scientific progress. The quest for knowledge has always been a timeless endeavor. Scholars seek reasons to explain the phenomena they observe, paving the way for the development of research. Every investigation should offer clarity, and a well-defined rationale in research is the cornerstone upon which the entire study can be built.

The research rationale is the heartbeat of every academic pursuit: it guides researchers toward unexplored areas of their field, illuminates the gaps in existing knowledge, and identifies the potential contributions that the study aims to make.

What Is Research Rationale and When Is It Written

The research rationale is the “why” behind every piece of academic research. It not only frames the study but also outlines its objectives, questions, and expected outcomes, and it helps to identify the study’s potential limitations. It serves as a lighthouse for researchers, guiding data collection and analysis and ensuring their efforts remain focused and purposeful.

Typically, a rationale is written at the beginning of the research proposal or research paper. It is an essential component of the introduction section and provides the foundation for the entire study, giving readers a clear understanding of the purpose and significance of the research before they delve into its specific details. In some cases, the rationale is written before the methodology, data analysis, and other sections. It also serves as the justification for the research and explains how it contributes to the field. Defining a research rationale can help a researcher in the following ways:

1. Justification of a Research Problem

  • Research rationale helps to understand the essence of a research problem.
  • It helps design the right approach to solving the problem. This aspect is particularly important for applied research, where the outcomes can have real-world relevance and impact.
  • Also, it explains why the study is worth conducting and why resources should be allocated to pursue it.
  • Additionally, it guides a researcher to highlight the benefits and implications of a strategy.

2. Elimination of the Literature Gap

  • The research rationale helps you identify new topics that have received little attention.
  • Additionally, it offers fresh perspectives on existing research and discusses the shortcomings in previous studies.
  • It shows that your study aims to contribute to filling these gaps and advancing the field’s understanding.

3. Originality and Novelty

  • The rationale highlights the unique aspects of your research and how it differs from previous studies.
  • Furthermore, it explains why your research adds something new to the field and how it expands upon existing knowledge.
  • It highlights how your findings might contribute to a better understanding of a particular issue or problem and potentially lead to positive changes.
  • Besides these benefits, it provides a personal motivation to the researchers. In some cases, researchers might have personal experiences or interests that drive their desire to investigate a particular topic.

4. An Increase in Chances of Funding

  • It is essential to convince funding agencies, supervisors, or reviewers that a piece of research is worth pursuing.
  • Therefore, a good rationale can help get your research approved for funding and increase your chances of getting published in journals, as it addresses a potential knowledge gap in existing research.

Overall, research rationale is essential for providing a clear and convincing argument for the value and importance of your research study, setting the stage for the rest of the research proposal or manuscript. Furthermore, it helps establish the context for your work and enables others to understand the purpose and potential impact of your research.

5 Key Elements of a Research Rationale

Research rationale must include certain components which make it more impactful. Here are the key elements of a research rationale:

[Figure: Elements of a research rationale]

By incorporating these elements, you provide a strong and convincing case for the legitimacy of your research, which is essential for gaining support and approval from academic institutions, funding agencies, or other stakeholders.

How to Write a Rationale in Research

Writing a rationale requires careful consideration of the reasons for conducting the study. It is usually written in the present tense.

Here are some steps to guide you through the process of writing a research rationale:

[Figure: Steps to write a research rationale]

After writing the initial draft, it is essential to review and revise the research rationale to ensure that it effectively communicates the purpose of your research. The research rationale should be persuasive and compelling, convincing readers that your study is worthwhile and deserves their attention.

How Long Should a Research Rationale be?

Although there is no pre-defined length for a rationale in research, its length may vary depending on the specific requirements of the research project. It also depends on the academic institution or organization, and the guidelines set by the research advisor or funding agency. In general, a research rationale is usually a concise and focused document.

Typically, it ranges from a few paragraphs to a few pages, but it is usually recommended to keep it as crisp as possible while ensuring all the essential elements are adequately covered. The length of a research rationale can be roughly as follows:

1. For Research Proposal:

A. Around 1 to 3 pages

B. Ensure a clear and comprehensive explanation of the research question, its significance, literature review, and methodological approach.

2. Thesis or Dissertation:

A. Around 3 to 5 pages

B. Ensure extensive coverage of the literature review, theoretical framework, and research objectives to provide a robust justification for the study.

3. Journal Article:

A. Usually concise, ranging from a few paragraphs to one page

B. The research rationale is typically included as part of the introduction section.

However, remember that the quality and content of the research rationale are more important than its length. The reasons for conducting the research should be well-structured, clear, and persuasive when presented. Always adhere to the specific institution or publication guidelines.

Example of a Research Rationale

[Figure: Example of a research rationale]

In conclusion, the research rationale serves as the cornerstone of a well-designed and successful research project. It ensures that research efforts are focused, meaningful, and ethically sound. Additionally, it provides a comprehensive and logical justification for embarking on a specific investigation. Therefore, by identifying research gaps, defining clear objectives, emphasizing significance, explaining the chosen methodology, addressing ethical considerations, and recognizing potential limitations, researchers can lay the groundwork for impactful and valuable contributions to the scientific community.


Frequently Asked Questions

How do you write the rationale of a study?

A rationale of the study can be written by including the following points: 1. Background of the Research/Study 2. Identifying the Knowledge Gap 3. An Overview of the Goals and Objectives of the Study 4. Methodology and its Significance 5. Relevance of the Research

How do you start a research rationale?

Start writing a research rationale by defining the research problem and discussing the literature gap associated with it.

How do you end a research rationale?

A research rationale can be ended by discussing the expected results and summarizing the need for the study.

How do you write a rationale for a thesis?

A rationale for a thesis can be made by covering the following points: 1. Extensive coverage of the existing literature 2. Explaining the knowledge gap 3. Provide the framework and objectives of the study 4. Provide a robust justification for the study/research 5. Highlight the potential of the research and the expected outcomes

How do you write a rationale for a dissertation?

A rationale for a dissertation can be made by covering the following points: 1. Highlight the existing references 2. Bridge the gap and establish the context of your research 3. Describe the problem and the objectives 4. Give an overview of the methodology


How to Write the Rationale for Your Research

By Charlesworth Author Services, 19 November 2021

The rationale for one’s research is the justification for undertaking a given study. It states the reason(s) why a researcher chooses to focus on the topic in question, including what the significance is and what gaps the research intends to fill. In short, it is an explanation that rationalises the need for the study. The rationale is typically followed by a hypothesis/research question(s) and the study objectives.

When is the rationale for research written?

The rationale of a study can be presented both before and after the research is conducted. 

  • Before: The rationale is a crucial part of your research proposal, representing the plan of your work as formulated before you execute your study.
  • After: Once the study is completed, the rationale is presented in a research paper or dissertation to explain why you focused on the particular question. In this instance, you would link the rationale of your research project to the study aims and outcomes.

Basis for writing the research rationale

The study rationale is predominantly based on preliminary data. A literature review will help you identify gaps in the current knowledge base and also ensure that you avoid duplicating what has already been done. You can then formulate the justification for your study from the existing literature on the subject and the perceived outcomes of the proposed study.

Length of the research rationale

In a research proposal or research article, the rationale would not take up more than a few sentences. A thesis or dissertation would allow for a longer description, which could even run into a couple of paragraphs. The length might even depend on the field of study or nature of the experiment. For instance, a completely novel or unconventional approach might warrant a longer and more detailed justification.

Basic elements of the research rationale

Every research rationale should include some mention or discussion of the following: 

  • An overview of your conclusions from your literature review
  • Gaps in current knowledge
  • Inconclusive or controversial findings from previous studies
  • The need to build on previous research (e.g. unanswered questions, the need to update concepts in light of new findings and/or new technical advancements). 

Example of a research rationale

Note: This uses a fictional study.

Abc xyz is a newly identified microalgal species isolated from fish tanks. While Abc xyz algal blooms have been seen as a threat to pisciculture, some studies have hinted at their unusually high carotenoid content and unique carotenoid profile. Carotenoid profiling has been carried out only in a handful of microalgal species from this genus, and the search for microalgae rich in bioactive carotenoids has not yielded promising candidates so far. This in-depth examination of the carotenoid profile of Abc xyz will help identify and quantify novel and potentially useful carotenoids from an untapped aquaculture resource.

In conclusion

It is important to describe the rationale of your research in order to put the significance and novelty of your specific research project into perspective. Once you have successfully articulated the reason(s) for your research, you will have convinced readers of the importance of your work!


How to Write a Rationale for Your Research Paper

Learn how to write a compelling research rationale. Discover key elements, steps, and tips to justify your study and strengthen your academic paper.

Jun 25, 2024

A research rationale is a crucial component of any academic paper, serving as a concise explanation of why your research project is necessary and valuable.

It justifies the importance of your study and outlines its potential contributions to the field, effectively bridging the gap between your research question and the existing body of knowledge.

In academic writing , the rationale plays a vital role by providing context for your research and helping readers understand its relevance and significance. It demonstrates that you've identified a meaningful gap or problem in the current literature, justifying the time, effort, and resources required for your study.

A well-crafted rationale helps convince readers - whether they're supervisors, peers, or funding bodies - of the merit of your research. In other words, writing the rationale for research is essential.

Moreover, it sets the stage for your methodology and expected outcomes, strengthening the overall structure and coherence of your research paper.

By clearly articulating the purpose and value of your study, a strong rationale for a research paper lays the foundation for a compelling and impactful research article.

Purpose of a Rationale

The purpose of a research rationale is to justify your study and explain its intended contributions, whether in a research proposal or a finished paper.

Essentially, the rationale should clarify two points. Justifying the need involves answering "Why does this study matter?": highlight the problem's significance, explain why current knowledge is insufficient, and point out gaps, contradictions, or new developments in existing research.

Explaining contributions demonstrates your work's value and impact. This may include developing new methods, challenging theories, providing evidence, or exploring overlooked aspects of your research topic. A clear rationale shows why your research is necessary and valuable to the field, convincing readers that your work advances collective knowledge in meaningful ways.

Elements of an Effective Rationale

An effective research rationale has three key elements: a clear problem statement, relevance to the existing literature, and the potential impact of the research.

  • Clear problem statement: Identify the specific issue or gap your research addresses. State the problem concisely and explain why it matters.
  • Relevance to existing literature: Show how your work fits into current knowledge. Highlight connections to previous studies and explain what's missing or needs further exploration.
  • Potential impact of the research: Describe the expected outcomes and their significance. Explain how your findings could advance theory, improve practice, or benefit society.

Steps to write a rationale:

Identify the research problem:

  • Pinpoint the specific issue or question your study addresses.
  • Ensure it's clear, focused, and significant to your research field.

Review relevant literature:

  • Examine current research related to your topic.
  • Identify gaps, contradictions, or areas needing further study.

Articulate the significance of your study:

  • Explain why your research matters.
  • Highlight potential contributions to theory or practice.

Explain your unique approach or perspective:

  • Describe how your method or viewpoint differs from previous research.
  • Show how your approach adds value to the field.

Tips for Crafting a Compelling Rationale:

Be concise and focused:

  • Use clear, direct language.
  • Stick to essential information.
  • Avoid unnecessary jargon or repetition.

Use evidence to support your claims:

  • Cite relevant studies or statistics.
  • Provide concrete examples when possible.
  • Show how the evidence links to your arguments.

Address potential counterarguments:

  • Anticipate possible objections to your research.
  • Acknowledge the limitations of your approach.
  • Explain why your study is still valuable despite these challenges.

Conclusion 

A well-written rationale is crucial for a strong research paper: it justifies the need for your study, explains its contribution, and sets the stage for your work. Remember to clearly state your research problem, show relevance to the existing literature, and highlight the potential impact. Be concise and focused, use evidence to support claims, and address counterarguments.

Avoid common pitfalls like vagueness, disconnection from prior research, and overstatement. A solid rationale strengthens your paper by providing context for your study, demonstrating its significance and originality, and convincing readers of its value and necessity. Ultimately, a compelling rationale lays the foundation for a persuasive and impactful research paper and enhances its credibility and relevance in your field of study.


Rationale for the Study

It is important to be able to explain the importance of the research you are conducting by providing valid arguments. The rationale for the study, also referred to as the justification for the study, is the reason why you have conducted your study in the first place. This part of your paper needs to explain the uniqueness and importance of your research. The rationale for the study needs to be specific and, ideally, should relate to the following points:

1. The research needs to contribute to the elimination of a gap in the literature. Eliminating a gap in the present literature is one of the compulsory requirements for your study. In other words, you don't need to 're-invent the wheel': your research aims and objectives need to focus on new topics. For example, you can choose to conduct an empirical study to assess the implications of the COVID-19 pandemic on the number of tourists visiting your city. This might be a previously unaddressed topic, taking into account that the COVID-19 pandemic is a relatively recent phenomenon.

Alternatively, if you cannot find a new topic to research, you can attempt to offer fresh perspectives on existing management, business or economic issues. For example, while thousands of studies have been conducted on various aspects of leadership, this topic is far from being exhausted as a research area. Specifically, new studies can be conducted in the area of leadership to analyze the impact of new communication media, such as TikTok and other social networking sites, on leadership practices.

You can also discuss the shortcomings of previous works devoted to your research area. Shortcomings in previous studies can be divided into three groups:

a) Methodological limitations. The methodology employed in a previous study may be flawed in terms of research design, research approach or sampling.

b) Contextual limitations. Previous works may no longer be relevant to the present because external factors have changed.

c) Conceptual limitations. Previous studies may be unjustifiably bound to a particular model or ideology.

While discussing the shortcomings of previous studies, you should explain how you are going to correct them. This principle holds for almost all areas of business studies; that is, gaps or shortcomings in the literature can be found in relation to almost all areas of business and economics.

2. The research can be conducted to solve a specific problem. It helps if you can explain why you are the right person and in the right position to solve the problem. You have to explain the essence of the problem in a detailed manner and highlight the practical benefits associated with its solution. Suppose your dissertation topic is "a study into advantages and disadvantages of various entry strategies into the Chinese market". In this case, you can say that the practical implications of your research relate to assisting businesses aiming to enter the Chinese market in making more informed decisions.

Alternatively, if your research is devoted to the analysis of the impact of CSR programs and initiatives on brand image, the practical contribution of your study would be to improve the effectiveness of businesses' CSR programs.

Additional examples of studies that can assist to address specific practical problems may include the following:

  • A study into the reasons for high employee turnover at Hanson Brick
  • A critical analysis of employee motivation problems at Esporta, Finchley Road, London
  • A study into effective succession planning at Microsoft
  • A study into major differences between private and public primary education in the USA and the implications of these differences for the quality of education

However, it is important to note that it is not obligatory for a dissertation to be associated with the solution of a specific problem. Dissertations can be purely theory-based as well. Examples of such studies include the following:

  • Born or bred: revising The Great Man theory of leadership in the 21st century
  • A critical analysis of the relevance of McClelland's Achievement theory to the US information technology industry
  • Neoliberalism as a major reason behind the emergence of the global financial and economic crisis of 2007-2009
  • Analysis of Lewin's Model of Change and its relevance to the pharmaceutical sector of France

3. Your study has to contribute to the professional development of the researcher, that is, you. You have to explain in detail how your research contributes to the achievement of your long-term career aspirations.

For example, suppose you have selected the research topic "A critical analysis of the relevance of McClelland's Achievement theory in the US information technology industry". You may state that you associate your career aspirations with becoming an IT executive in the US, and accordingly, in-depth knowledge of employee motivation in this industry is going to contribute to your chances of success in your chosen career path.

Therefore, you are in a better position if you have already identified your career objectives, so that during the research process you can get detailed knowledge about various aspects of your chosen industry.


My e-book, The Ultimate Guide to Writing a Dissertation in Business Studies: a step-by-step assistance, offers practical help to complete a dissertation with minimum or no stress. The e-book covers all stages of writing a dissertation, from the selection of the research area to submitting the completed version of the work within the deadline.

John Dudovskiy


Formulating a convincing rationale for a research study

Céline Rojon

2012, Coaching: An International Journal of Theory, Research and Practice

Explaining the purpose of a research study and providing a compelling rationale is an important part of any research project, enabling the work to be set in the context of both existing evidence (and theory) and its practical applications. This necessitates formulating a clear research question and deriving specific research objectives, thereby justifying and contextualising the study. In this research note we consider the characteristics of good research questions and research objectives and the role of theory in developing these. We conclude with a summary and a checklist to help ensure the rationale for a research study is convincing.


Writing a Research Paper Introduction | Step-by-Step Guide

Published on September 24, 2022 by Jack Caulfield. Revised on March 27, 2023.


The introduction to a research paper is where you set up your topic and approach for the reader. It has several key goals:

  • Present your topic and get the reader interested
  • Provide background or summarize existing research
  • Position your own approach
  • Detail your specific research problem and problem statement
  • Give an overview of the paper’s structure

The introduction looks slightly different depending on whether your paper presents the results of original empirical research or constructs an argument by engaging with a variety of sources.


Table of contents

  • Step 1: Introduce your topic
  • Step 2: Describe the background
  • Step 3: Establish your research problem
  • Step 4: Specify your objective(s)
  • Step 5: Map out your paper
  • Research paper introduction examples
  • Frequently asked questions about the research paper introduction

The first job of the introduction is to tell the reader what your topic is and why it’s interesting or important. This is generally accomplished with a strong opening hook.

The hook is a striking opening sentence that clearly conveys the relevance of your topic. Think of an interesting fact or statistic, a strong statement, a question, or a brief anecdote that will get the reader wondering about your topic.

For example, the following could be an effective hook for an argumentative paper about the environmental impact of cattle farming:

A more empirical paper investigating the relationship of Instagram use with body image issues in adolescent girls might use the following hook:

Don’t feel that your hook necessarily has to be deeply impressive or creative. Clarity and relevance are still more important than catchiness. The key thing is to guide the reader into your topic and situate your ideas.


This part of the introduction differs depending on what approach your paper is taking.

In a more argumentative paper, you’ll explore some general background here. In a more empirical paper, this is the place to review previous research and establish how yours fits in.

Argumentative paper: Background information

After you’ve caught your reader’s attention, specify a bit more, providing context and narrowing down your topic.

Provide only the most relevant background information. The introduction isn’t the place to get too in-depth; if more background is essential to your paper, it can appear in the body .

Empirical paper: Describing previous research

For a paper describing original research, you’ll instead provide an overview of the most relevant research that has already been conducted. This is a sort of miniature literature review —a sketch of the current state of research into your topic, boiled down to a few sentences.

This should be informed by genuine engagement with the literature. Your search can be less extensive than in a full literature review, but a clear sense of the relevant research is crucial to inform your own work.

Begin by establishing the kinds of research that have been done, and end with limitations or gaps in the research that you intend to respond to.

The next step is to clarify how your own research fits in and what problem it addresses.

Argumentative paper: Emphasize importance

In an argumentative research paper, you can simply state the problem you intend to discuss, and what is original or important about your argument.

Empirical paper: Relate to the literature

In an empirical research paper, try to lead into the problem on the basis of your discussion of the literature. Think in terms of these questions:

  • What research gap is your work intended to fill?
  • What limitations in previous work does it address?
  • What contribution to knowledge does it make?

You can make the connection between your problem and the existing research using phrases like the following.

Although [topic X] has been studied in detail, insufficient attention has been paid to [aspect Y]. You will address a previously overlooked aspect of your topic.
The implications of [a previous study] deserve to be explored further. You will build on something suggested by a previous study, exploring it in greater depth.
It is generally assumed that [assumption]. However, this paper suggests that … You will depart from the consensus on your topic, establishing a new position.

Now you’ll get into the specifics of what you intend to find out or express in your research paper.

The way you frame your research objectives varies. An argumentative paper presents a thesis statement, while an empirical paper generally poses a research question (sometimes with a hypothesis as to the answer).

Argumentative paper: Thesis statement

The thesis statement expresses the position that the rest of the paper will present evidence and arguments for. It can be presented in one or two sentences, and should state your position clearly and directly, without providing specific arguments for it at this point.

Empirical paper: Research question and hypothesis

The research question is the question you want to answer in an empirical research paper.

Present your research question clearly and directly, with a minimum of discussion at this point. The rest of the paper will be taken up with discussing and investigating this question; here you just need to express it.

A research question can be framed either directly or indirectly.

  • This study set out to answer the following question: What effects does daily use of Instagram have on the prevalence of body image issues among adolescent girls?
  • We investigated the effects of daily Instagram use on the prevalence of body image issues among adolescent girls.

If your research involved testing hypotheses , these should be stated along with your research question. They are usually presented in the past tense, since the hypothesis will already have been tested by the time you are writing up your paper.

For example, the following hypothesis might respond to the research question above:

The final part of the introduction is often dedicated to a brief overview of the rest of the paper.

In a paper structured using the standard scientific “introduction, methods, results, discussion” format, this isn’t always necessary. But if your paper is structured in a less predictable way, it’s important to describe the shape of it for the reader.

If included, the overview should be concise, direct, and written in the present tense.

  • This paper will first discuss several examples of survey-based research into adolescent social media use, then will go on to …
  • This paper first discusses several examples of survey-based research into adolescent social media use, then goes on to …

Full examples of research paper introductions are shown in the tabs below: one for an argumentative paper, the other for an empirical paper.

  • Argumentative paper
  • Empirical paper

Are cows responsible for climate change? A recent study (RIVM, 2019) shows that cattle farmers account for two thirds of agricultural nitrogen emissions in the Netherlands. These emissions result from nitrogen in manure, which can degrade into ammonia and enter the atmosphere. The study’s calculations show that agriculture is the main source of nitrogen pollution, accounting for 46% of the country’s total emissions. By comparison, road traffic and households are responsible for 6.1% each, the industrial sector for 1%. While efforts are being made to mitigate these emissions, policymakers are reluctant to reckon with the scale of the problem. The approach presented here is a radical one, but commensurate with the issue. This paper argues that the Dutch government must stimulate and subsidize livestock farmers, especially cattle farmers, to transition to sustainable vegetable farming. It first establishes the inadequacy of current mitigation measures, then discusses the various advantages of the results proposed, and finally addresses potential objections to the plan on economic grounds.

The rise of social media has been accompanied by a sharp increase in the prevalence of body image issues among women and girls. This correlation has received significant academic attention: Various empirical studies have been conducted into Facebook usage among adolescent girls (Tiggermann & Slater, 2013; Meier & Gray, 2014). These studies have consistently found that the visual and interactive aspects of the platform have the greatest influence on body image issues. Despite this, highly visual social media (HVSM) such as Instagram have yet to be robustly researched. This paper sets out to address this research gap. We investigated the effects of daily Instagram use on the prevalence of body image issues among adolescent girls. It was hypothesized that daily Instagram use would be associated with an increase in body image concerns and a decrease in self-esteem ratings.

The introduction of a research paper includes several key elements:

  • A hook to catch the reader’s interest
  • Relevant background on the topic
  • Details of your research problem and your problem statement
  • A thesis statement or research question
  • Sometimes an overview of the paper

Don’t feel that you have to write the introduction first. The introduction is often one of the last parts of the research paper you’ll write, along with the conclusion.

This is because it can be easier to introduce your paper once you’ve already written the body ; you may not have the clearest idea of your arguments until you’ve written them, and things can change during the writing process .

The way you present your research problem in your introduction varies depending on the nature of your research paper . A research paper that presents a sustained argument will usually encapsulate this argument in a thesis statement .

A research paper designed to present the results of empirical research tends to present a research question that it seeks to answer. It may also include a hypothesis —a prediction that will be confirmed or disproved by your research.


  • Consensus Statement
  • Published: 19 July 2024

Reporting guidelines for precision medicine research of clinical relevance: the BePRECISE checklist

  • Siew S. Lim 1 ,
  • Zhila Semnani-Azad   ORCID: orcid.org/0000-0001-7822-5072 2 ,
  • Mario L. Morieri   ORCID: orcid.org/0000-0001-6864-0547 3 , 4 ,
  • Ashley H. Ng 5 , 6 , 7 ,
  • Abrar Ahmad 8 ,
  • Hugo Fitipaldi 8 ,
  • Jacqueline Boyle 1 ,
  • Christian Collin 9 ,
  • John M. Dennis   ORCID: orcid.org/0000-0002-7171-732X 10 ,
  • Claudia Langenberg   ORCID: orcid.org/0000-0002-5017-7344 7 , 11 ,
  • Ruth J. F. Loos 12 , 13 ,
  • Melinda Morrison 14 ,
  • Michele Ramsay   ORCID: orcid.org/0000-0002-4156-4801 15 ,
  • Arun J. Sanyal   ORCID: orcid.org/0000-0001-8682-5748 16 ,
  • Naveed Sattar   ORCID: orcid.org/0000-0002-1604-2593 17 ,
  • Marie-France Hivert   ORCID: orcid.org/0000-0001-7752-2585 18 ,
  • Maria F. Gomez   ORCID: orcid.org/0000-0001-6210-3142 8 ,
  • Jordi Merino   ORCID: orcid.org/0000-0001-8312-1438 12 , 19 , 20 , 21 ,
  • Deirdre K. Tobias 2 , 22 ,
  • Michael I. Trenell 23 ,
  • Stephen S. Rich   ORCID: orcid.org/0000-0003-3872-7793 24 ,
  • Jennifer L. Sargent 25 &
  • Paul W. Franks   ORCID: orcid.org/0000-0002-0520-7604 2 , 26  

Nature Medicine (2024)


  • Medical research
  • Translational research

Precision medicine should aspire to reduce error and improve accuracy in medical and health recommendations by comparison with contemporary practice, while maintaining safety and cost-effectiveness. The etiology, clinical manifestation and prognosis of diseases such as obesity, diabetes, cardiovascular disease, kidney disease and fatty liver disease are heterogeneous. Without standardized reporting, this heterogeneity, combined with the diversity of research tools used in precision medicine studies, makes comparisons across studies and implementation of the findings challenging. Specific recommendations for reporting precision medicine research do not currently exist. The BePRECISE (Better Precision-data Reporting of Evidence from Clinical Intervention Studies & Epidemiology) consortium, comprising 23 experts in precision medicine, cardiometabolic diseases, statistics, editorial and lived experience, conducted a scoping review and participated in a modified Delphi and nominal group technique process to develop guidelines for reporting precision medicine research. The BePRECISE checklist comprises 23 items organized into 5 sections that align with typical sections of a scientific publication. A specific section about health equity serves to encourage precision medicine research to be inclusive of individuals and communities that are traditionally under-represented in clinical research and/or underserved by health systems. Adoption of BePRECISE by investigators, reviewers and editors will facilitate and accelerate equitable clinical implementation of precision medicine.

Precision medicine represents an evolution in the long history of evidence-based medicine and healthcare. Spanning disease classifications and risk factor boundaries, precision medicine is underpinned by four key ‘pillars’ (prevention, diagnosis, treatment and prognosis) 1 , 2 . The overarching objective of precision medicine is to reduce error and improve accuracy in medical and health recommendations compared with contemporary approaches 3 . Precision medicine solutions should meet or improve on existing standards for safety. They should also be compatible with the individual’s preferences, capabilities and needs and tailored to the cultural and societal conditions of the population. Furthermore, precision medicine should be cost-effective and enhance health equity by increasing access to better medical and healthcare practices for the people most in need.

Cardiometabolic diseases are the leading causes of mortality globally 4 . With this burden projected to worsen over the coming decades 5 , innovative approaches to disease prevention, diagnosis and treatment are urgently needed. A plethora of precision medicine approaches are being explored in translational and clinical research. However, translating, scaling and implementing these findings for clinical practice have proved difficult. The heterogeneous nature of disease presentation and the etiology of cardiometabolic diseases contribute to these challenges, as does the range and diversity of clinical information, molecular data types and computational analyses used in precision medicine research.

The ability to synthesize data and reproduce research findings are tenets of the modern scientific process, which help maximize progress in evidence-based healthcare and medicine. The ‘Second international consensus report on gaps and opportunities for the clinical translation of precision diabetes medicine’ 3 was supported by a series of systematic evidence reviews 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 . The report focused on key dimensions of precision diabetes medicine, including evidence for prevention, diagnosis, treatment and prognosis in monogenic forms of diabetes, gestational diabetes, and type 1 and type 2 diabetes. A key finding from the report and the systematic evidence reviews underpinning it is that the published literature on precision diabetes medicine lacks evidence standardization or benchmarking against contemporary standards and often overlooks under-represented populations, who tend to bear the greatest burden of diabetes and its complications.

In the present report, we present reporting guidelines for clinically relevant precision medicine research, using common cardiometabolic diseases as the example. We first evaluated a representative sample of the literature on precision medicine in cardiometabolic diseases, determining that the quality of evidence reporting is low, akin to the level previously observed for precision diabetes medicine 3 . We then generated consensus guidelines and a corresponding checklist for reporting of research germane to precision medicine. The purpose of these guidelines is to improve reporting standards so that: (1) evidence can be combined and synthesized in a way that yields meaningful insights from collective efforts; (2) claims of clinical utility can be benchmarked against contemporary standards; and (3) end-user engagement and health equity will be strengthened.

Scoping review

The literature search focused on identifying precision medicine publications using the term ‘precision medicine’ and associated proxy nomenclature, among other keywords and phrases ( Supplementary Methods ). The search identified 2,679 publications, of which 13 were excluded owing to duplication. The remaining 2,666 papers were screened, of which 47 were randomly selected (through computer-generated, random-number sequence) for full text review and quality assessment. The summary (count and percentage) of each quality assessment item across all papers and the quality assessment results for each paper are shown in Supplementary Tables 2 and 3 . This quality assessment yielded a median score of 6 (interquartile range = 4–7) with none of the papers achieving a positive quality evaluation across all 11 items (Fig. 1 ).

Figure 1: Median scores of 47 published precision medicine manuscripts randomly selected for full text review and quality assessment through computer-generated, random-number sequence. IQR, interquartile range.

A summary of the itemized evidence reporting quality is shown in Supplementary Table 2 . Most abstracts (81%) reported findings relevant to the four pillars of precision medicine (prevention, diagnosis, treatment and/or prognosis) and provided sufficient detail in the methods sections to determine whether the study was designed to test hypotheses on precision medicine (77%), details about participant eligibility (75%) and descriptions of standard reporting definitions (70%). The items that were less frequently reported were the description of patient and public involvement and engagement (PPIE) in determining the impact and utility of precision medicine (15%), the inclusion of the term ‘precision medicine’ in the title or abstract (17%), the reporting of measures of discriminative or predictive accuracy (23%), the description of the approach used to control risk of false-positive reporting (28%), the reporting of effect estimates with 95% confidence intervals and units underlying effect estimates (57%) and the reporting of a statistical test for comparisons of subgroups (for example, interaction test) (60%).

Stakeholder survey

Delphi panel demographics

Of the 23 Delphi panelists, 22 (96%) completed Delphi survey 1, 18 (78%) attended the full-panel consensus meeting and 22 (96%) completed Delphi survey 2. All panelists engaged in further extensive dialog around key topics through online communication.

Delphi results

The initial checklist in Delphi survey 1 contained 68 items. After Delphi survey 1 and the full-panel consensus meeting, 2 items were added, resulting in 70 items in Delphi survey 2. At the Consensus meeting, it was determined that the checklist should be used together with existing relevant checklists. These include the CONSORT (Consolidated Standards of Reporting Trials) 17 and STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) 18 checklists for interventional trials and observational studies, respectively. This led to a recommendation to remove items covered in established checklists (Supplementary Fig. 1 ). The scoring from Delphi survey 1, Delphi survey 2 and notes from the Consensus meetings are as shown in Supplementary Table 4 . After Delphi survey 2, the consensus was to retain 25 items across 6 core categories.

Guidelines finalization

The executive oversight committee reviewed the panel scores and free-text comments from all the rounds of Delphi surveys to determine the final checklist items and wording. The group discussed five items with inconsistent consensus (between 70% and 80% consensus), resulting in the removal of one item because it overlapped conceptually with another item (17b and 17g in Supplementary Table 4 ). It was also determined that ‘health equity’ should be included as an overarching theme, thereby encouraging users of the checklist to consider this topic more broadly when describing precision medicine research. This resulted in removal of two items.

The final checklist comprised 23 items that the executive oversight committee concluded are unique and essential for reporting standards in precision medicine. The final BePRECISE checklist is presented in Table 1 , with a downloadable version of the checklist available online ( https://www.be-precise.org , and https://www.equator-network.org/reporting-guidelines/ ).

Explanation of checklist Items

The checklist and the explanation of each item are presented in Table 1 . The BePRECISE checklist is intended to complement existing guidelines such as CONSORT 17 , STROBE 18 and PRISMA (Preferred Reporting System for Systematic Reviews and Meta-Analyses) 19 .

These reporting guidelines use the terms ‘precision medicine’ and ‘personalized medicine’ as defined in the ‘Second international consensus report on gaps and opportunities for the clinical translation of precision diabetes medicine’ 3 , as follows:

‘Precision medicine focuses on minimizing errors and improving accuracy in medical decisions and health recommendations. It seeks to maximize efficacy, cost-effectiveness, safety, access for those in need and compliance compared with contemporary evidence-based medicine. Precision medicine emphasizes tailoring diagnostics or therapeutics (prevention or treatment) to subgroups of populations sharing similar characteristics.’
Personalized medicine refers to ‘the use of a person’s own data to objectively gauge the efficacy, safety, and tolerability of therapeutics, and, subjectively, to tailor health recommendations and/or medical decisions to the individual’s preferences, circumstances, and capabilities’.

Accordingly, personalized medicine can be viewed as being nested within the broader concept of precision medicine.

Equity and PPIE (E1–E4)

Equity, diversity and inclusivity considerations and the involvement of patients and the public are a crosscutting theme in this checklist. Where relevant, papers should include a description of how equity has been considered, including the diversity and inclusivity of study participants, and whether there was PPIE. Cohort selection biases and probable risks when extrapolating the study's results to other populations should be clearly described.

The selection of participants should consider racial, ethnic, ancestral, geographic and sociodemographic characteristics 20 , and include an explanation for the inclusion or exclusion of groups that are typically under-represented in clinical research (E1 and E2). Race and ethnicity are social constructs but, as they are categories recognized by some government and health authorities in contexts that are relevant to precision medicine, we have elected to retain inclusion of these somewhat controversial terms here.

PPIE in any part of the study should be described, including but not limited to design, conduct and reporting (E3).

Where possible, and ideally with guidance from those with lived experience, the potential impact of the research findings on the target population(s) should be discussed (E4). Consider co-writing these aspects with PPIE representatives.

Title and abstract (1.1–1.4)

In the title and/or abstract, the term ‘precision medicine’ should be included to highlight that the research is relevant to precision medicine (1.1). Given that precision medicine is an approach that can be used in several research contexts, the study design (for example, randomized clinical trial (RCT), retrospective observational) and the research question should be stated clearly (1.2). Use of the terms ‘prevention’, ‘diagnostics’, ‘treatment’ or ‘prognostics’ is needed to highlight which pillar of precision medicine the study concerns 3 (1.3). To ensure transparency about generalizability and/or applicability of the findings to a specific population or subgroup, the study population must be described (1.4).

Background and objectives (2.1–2.2)

The background should clearly describe the rationale for the chosen precision medicine approach, including the context and prior work that led to it and the specific hypothesis being tested (2.1). To provide the reader with greater context, papers should also state the nature and objective of the precision medicine study as ‘etiological’, ‘discovery’, ‘predictive’ and/or ‘confirmatory’ (2.2).

Methods (general)

Although this reporting guide focuses on clarifying elements of papers that are germane to precision medicine, authors are strongly encouraged to ensure that methods also adhere to other appropriate reporting guidelines (for example, CONSORT and STROBE), with the overarching goal of ensuring that the study protocol described therein could, in principle, be accurately reproduced by third-party investigators.

Methods (3.1–3.7)

Methods should describe the aspects of a study design relating to precision medicine in such detail that the design can be understood and replicated (3.1). The rationale for the choice of primary outcome should be clearly stated (3.2).

To enable readers to assess bias and interpret the study findings, this section should state how the participants were identified and enrolled in the study (4.1) and (if applicable) how a subset of a broader group of participants was selected from an existing study (3.3). Any markers used for stratification or prediction should be explicitly stated with an explanation of how the marker(s) was(were) chosen (3.4).

The sample size and how it was derived should be described, for example, following a priori power calculations, or if the sample size was limited primarily by availability or cost, and any implications that this might have for type 2 error (3.5). Authors should also describe attempts to minimize false-positive discovery, especially when multiple testing has occurred (3.5).

If any replication and/or validation analyses were undertaken, a clear description should be given of the approach, including whether these analyses were planned and relevant datasets identified before or after conclusion of primary analyses (3.6), in addition to justification for the sample size and choice of replication cohort (3.7).

Results (4.1–4.4)

The number of participants in the study should be provided, along with a table of baseline characteristics (4.1). If the analysis involves comparison (rather than discovery) of subgroups, the baseline characteristics and numbers of participants should be provided by the subgroup.

Results from any statistical tests done should be reported. Any comparisons of subgroups should include appropriate test statistics, which may include tests of interaction and heterogeneity, and in cluster analyses tests of probability for cluster assignment (for example, relative entropy statistic) (4.2).
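
To make item 4.2 concrete, below is a minimal sketch of a subgroup interaction test on entirely synthetic data with hypothetical variable names (statsmodels is used purely for illustration; it is not a tool prescribed by BePRECISE). The coefficient and p-value on the treated-by-subgroup term are the kind of interaction statistic the checklist asks authors to report alongside any subgroup comparison.

```python
# Illustrative sketch only: synthetic data, hypothetical variable names.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),    # 0 = control, 1 = intervention
    "subgroup": rng.integers(0, 2, n),   # e.g. a binary stratification marker
})
# Simulate an outcome whose treatment effect is larger in subgroup 1
df["outcome"] = (
    2.0 * df["treated"]
    + 1.0 * df["subgroup"]
    + 1.5 * df["treated"] * df["subgroup"]
    + rng.normal(0, 2, n)
)

# 'treated * subgroup' expands to both main effects plus the interaction term;
# the row for 'treated:subgroup' provides the interaction test statistic.
model = smf.ols("outcome ~ treated * subgroup", data=df).fit()
print(model.summary().tables[1])
```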

Key findings should be benchmarked against current reference standards or practice, if they exist, so that the reader can determine the likely benefit of translating the study’s findings into clinical practice. This may include, for example, the comparison of the new and existing approaches using tests of discriminative (cross-sectional) or predictive (prospective) accuracy, or estimation of net reclassification or changes in numbers needed to treat. If benchmarking has not been done, a clear explanation should be given (4.3).
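
The benchmarking in item 4.3 can be as simple as reporting discriminative accuracy for the new and reference approaches side by side. The sketch below is purely illustrative (synthetic data, with scikit-learn chosen for convenience rather than named by the authors); a formal comparison would add a DeLong or bootstrap test of the AUC difference, or a net reclassification analysis.

```python
# Illustrative sketch only: synthetic data standing in for an existing
# reference model versus a new model that adds candidate precision markers.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=6, n_informative=4, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# "Reference" model: only the first two (established) predictors
reference = LogisticRegression().fit(X_train[:, :2], y_train)
# "New" model: established predictors plus the candidate precision markers
new = LogisticRegression().fit(X_train, y_train)

auc_reference = roc_auc_score(y_test, reference.predict_proba(X_test[:, :2])[:, 1])
auc_new = roc_auc_score(y_test, new.predict_proba(X_test)[:, 1])
print(f"Reference model AUC: {auc_reference:.3f}  New model AUC: {auc_new:.3f}")
```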

If validation and/or replication analyses were undertaken, the results of all such attempts at analyses should be clearly described (4.4).

Discussion (5.1–5.2)

The paper should include a balanced and nuanced discussion of any limitations to the interpretation and/or implementation of the reported findings. The limitations section should consider biases that might prevent fair and equitable generalization of the study’s findings to other populations, particularly to groups that are under-represented within the published literature. Authors are also encouraged to consider other potential biases that might arise with stratified and subgroup analyses (5.1).

If there is a direct clinical implication of the study's findings, authors should describe how their findings might be applied in clinical practice. This might, for example, include an explanation of how any algorithms, technologies or risk markers that stem directly from the research might benefit clinical practice (5.2).

The BePRECISE guidelines are intended to enhance publication of research on precision medicine by improving quality and standardization of reporting. In turn, it is anticipated that this will help improve and accelerate the impact of precision medicine research on the health and well-being of target populations and individuals.

BePRECISE was initiated to follow up on recommendations from the ‘Second international consensus report on gaps and opportunities for the clinical translation of precision diabetes medicine’ 3 . The report, founded on 16 systematic evidence reviews summarizing research described in >100,000 published papers, found a low degree of standardization across the published literature, with a broad absence of key information needed for benchmarking against contemporary standards, validation analyses and meaningful interpretation of research findings.

Implementation of the checklist

These reporting guidelines were derived through structured evaluation and consensus processes undertaken by subject-matter experts in precision medicine for complex traits. The report is premised on cardiometabolic disease translational research but is relevant to translation of research in other complex diseases. These guidelines are directed toward authors describing translational research in precision medicine, as well as for journal editors handling submissions in this field. These guidelines may also be of value to funding agencies, policy advisers and health educators.

The BePRECISE guidelines are designed to be used together with existing study-specific checklists such as CONSORT 17 , STROBE 18 and STORMS (Strengthening the Organization and Reporting of Microbiome Studies) 21 . Publications relevant to precision medicine cover diverse topics and study designs; thus, to accommodate this diversity, we recommend that authors elaborate on relevant details related to checklist items to facilitate manuscript evaluations by journal editors and peer reviewers who will determine whether a given paper has addressed the BePRECISE checklist criteria.

Health equity

Precision medicine has the potential to improve health equity by making health advice and medical therapies more accessible to those in most need and by being more effective and acceptable to the recipient than contemporary clinical approaches. Nevertheless, as the ‘inverse care law’ 22 highlights, the best healthcare often reaches those who need it least. We believe that precision medicine research should place emphasis on the development of solutions for people in greatest need, regardless of who or where they are.

Ensuring representation of underserved populations, where the disease burden can be high, is important because determining the effectiveness of precision medicine solutions requires data from the target populations. Research in population genetics provides clear evidence of this, where the predictive accuracy of polygenic burden scores can be low when applied outside the data-source population, even when these populations are geographically proximal 23 , 24 . Raising awareness of these challenges by discussing them in the health literature and, ultimately, by addressing them through improved study design could facilitate enhancement of health equity using precision medicine approaches.

Promoting equity through precision medicine requires awareness of the many biases. For this reason, the BePRECISE guidelines place emphasis on equity, diversity and inclusion as an overarching concept throughout the checklist.

As with health equity, the BePRECISE guidelines position PPIE as a crosscutting theme to motivate its consideration in all elements of precision medicine translational research. We encourage those using the BePRECISE checklist to follow existing guidance on PPIE 25 . Ensuring that the eventual recipients of precision medicine solutions are adequately represented in the planning, execution and reporting of precision medicine research will help maximize the translational value of the research. Ideally, research teams should include members of the communities that will eventually benefit from this work, including in leadership roles, although to achieve this will often require long-term capacity strengthening. This engagement will help ensure that the relevance and utility of the research output are maximized. It will also strengthen the potential for target populations to determine their own health trajectories. Where this is not immediately achievable, establishing authentic partnerships with representatives from these target populations should be prioritized. This may involve community consultations, training opportunities and co-creation of research proposals with assigned community members, through dissemination and translation of research findings. Moreover, the selection of study participants should be done equitably and result in study cohorts that are representative of the populations who are the focus of the research 26 . The use of patient-reported outcome measures and patient-reported experience measures should be considered during the research design and execution phases, and reported in research papers wherever possible following established guidelines 27 , 28 . Doing so will amplify the patient voice and maximize the relevance of the research to the target populations and individuals.

Cost-effectiveness

The translation of precision medicine research into practice will invariably depend on it being cost-effective, affordable and accessible. This initial version of the BePRECISE checklist does not include checklist items pertaining directly to these important factors. The consensus view was that such analyses are sufficiently complex to stand alone and are likely to be outside the scope of most current precision medicine research. This topic may be revisited in subsequent versions of the checklist.

Strengths and limitations

We believe that implementation of the BePRECISE checklist in the context of academic publishing will strengthen standardization of reporting across precision medicine research, ultimately enabling improved and equitable translation of research findings into the clinical and public health settings. The checklist will also encourage investigators to improve study design, particularly with respect to health equity. Other strengths include rigor of our consensus methods and the diverse range of societal backgrounds and expertise of our group.

We acknowledge that precision medicine in many complex diseases is relatively nascent (with the exception of precision oncology), with the needs of the field and stakeholders evolving. We plan to evaluate uptake of the checklist among journals and authors to assess whether items should be added or removed from the checklist as the field matures. An additional limitation is that the BePRECISE consensus group is small by comparison with similar efforts in other fields of research. We will involve a larger group of experts with broader global and technical representation in future efforts, including increased representation from low- and middle-income countries and individuals with more diverse lived experiences. Additional technical expertise may also be needed from other disciplines, including health economics and health systems administration, for example.

We acknowledge that journal formatting requirements and procedures may not always entirely align with the checklist specifications. We removed a checklist item for provision of a plain language summary, for example, because many journal formats are presently unable to accommodate this type of additional material. However, we hope that in the future editors and publishers of medical and scientific journals will include space for this incredibly important component that facilitates scientific communication with the public.

We defer to editorial and reviewer discretion in implementation of the BePRECISE checklist. Although the BePRECISE checklist items are included to support best scientific practices, at least in the short term, some ongoing precision medicine studies will not have addressed the health equity or PPIE considerations in their design. We do not expect that insufficient attention to these items would be a sole reason for not considering a manuscript for review, unless blatant disregard for participant and/or community safety, privacy or respect has occurred in the study design and/or conduct. Over time, however, we hope that health equity and PPIE will be considered as standard practice in precision medicine research and implementation.

Conclusions

The BePRECISE reporting guidelines have been generated through a structured consensus process to address the need for better reporting of clinical translational research in precision medicine in common complex diseases. The burgeoning literature on this topic is reported inconsistently, impeding the assimilation, syntheses and interpretation of evidence. There is a general lack of benchmarking against contemporary standards, a situation that makes it impossible to determine whether new precision medicine approaches might be beneficial, feasible and sustainable. Moreover, very little existing precision medicine research has incorporated PPIE or focused on the groups within societies most in need of innovative precision medicine solutions. These barriers limit the positive impact that precision medicine could have on the health and well-being of those most in need. The BePRECISE reporting guidelines are intended to help address these and other important challenges.

Consortium structure

The BePRECISE Consortium comprised an executive oversight committee (S.S.L., Z.S.-A., M.L.M., A.H.N., S.S.R., J.L.S. and P.W.F.), which oversaw the full process, with representation across key domain areas, and an evidence evaluation group (Z.S.-A., M.L.M., A.A., H.F., M.-F.H., M.F.G., J.M., D.K.T., M.I.T., S.S.R., J.L.S. and P.W.F.), which undertook the scoping review to determine current reporting standards. All consortium members participated in a Delphi consensus process 29 . The Consortium chair and co-chair were P.W.F. and S.S.L., respectively (Supplementary Table 1 ).

Protocols and registrations

A scoping review protocol was developed before initiating the literature review or consensus activities and was registered in the Open Science Framework (http://osf-registrations-nh4g2-v1). The consensus process followed the EQUATOR (Enhancing the QUAlity and TRansparency of health Research) Network recommendations for reporting guidelines development ( https://www.equator-network.org/library/equator-network-reporting-guideline-manual ) and was registered with EQUATOR as ‘Reporting guidelines under development’ ( https://www.equator-network.org/library/reporting-guidelines-under-development/reporting-guidelines-under-development-for-other-study-designs ). The final BePRECISE guidelines are available on the Equator website ( https://www.equator-network.org/reporting-guidelines/ ).

The purpose of the scoping review was to determine whether the published literature on precision medicine in cardiometabolic diseases met a minimum threshold for reporting quality. We set the minimum expectation as a condition where most (that is, ≥50%) published papers in this domain are adequately reported. To define a study as adequately or inadequately reported (as a binary variable), members of the scoping review committee identified, through consultation, 11 key items (Supplementary Tables 2 and 3 ). Papers that met all 11 reporting criteria were deemed, a priori, to be adequately reported.

The checklist items used to assess the reporting quality of studies captured in the scoping review were determined before the Delphi surveys were undertaken. These scoping review checklist items correspond with some of those used in the Delphi surveys that formed the basis of the final BePRECISE checklist, because both the scoping review and Delphi surveys are, to varying degrees, derived from the findings of the ‘Second international consensus report on gaps and opportunities for the clinical translation of precision diabetes medicine’ 3 . The scoping review was intended to provide a snapshot of the quality of reporting in a subset of literature relating to precision medicine. It was not undertaken to inform the items in the BePRECISE checklist; this purpose was served by the systematic evidence reviews 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 and the Consensus report 3 described above.

Based on the findings of the precision diabetes medicine Consensus report 3 , we hypothesized that no more than 30% of currently published studies are adequately reported. This assumption was tested by full text reviewing a statistically powered, random subsample of published papers on precision medicine across cardiometabolic diseases ('Search strategy' and 'Sample size estimation'). This scoping review was conducted in accordance with the PRISMA Extension for Scoping Review guidelines 30 to identify and assess the current literature on precision medicine in cardiometabolic diseases and was completed before the 'guidelines consensus process' described below.

Sample size estimation

The literature search was not intended to be a comprehensive evaluation of the published evidence, but instead to provide an unbiased representation of this literature. To determine how many papers should be reviewed as a representative sample of the published literature, an a priori sample size calculation was performed using SAS software v.9.4 (SAS Institute). Given the scenario described, we used a two-sided test with a type 1 error threshold (critical α) of 0.05, assuming a null hypothesis proportion of 0.50, which corresponds to our minimum expectation, an expected number of adequately reported papers of <30% and nominal power of 80%. This calculation determined that 47 randomly selected papers should be full text reviewed to ascertain whether the assumed proportion of adequately reported studies is significantly lower than the prespecified null proportion (that is, to infer that the quality of papers reported in this field is lower than the minimum expectation).
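
For readers who want to reproduce the arithmetic, the sketch below recomputes the 47-paper sample size with the standard normal-approximation formula for a one-sample proportion test, using the inputs stated above (two-sided α = 0.05, null proportion 0.50, anticipated proportion 0.30, power 0.80). This is only an illustration; the authors performed the calculation in SAS v.9.4, which may apply a slightly different method.

```python
# Normal-approximation sample size for a one-sample test of a proportion,
# reproducing the inputs described in the text (illustrative only).
from math import ceil, sqrt
from scipy.stats import norm

p0, p1 = 0.50, 0.30        # null (minimum expectation) and anticipated proportions
alpha, power = 0.05, 0.80

z_alpha = norm.ppf(1 - alpha / 2)   # ~1.96 for a two-sided test
z_beta = norm.ppf(power)            # ~0.84

n = ((z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))) / (p1 - p0)) ** 2
print(ceil(n))   # -> 47 papers for full text review
```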

Search strategy

We searched the PubMed database ( https://pubmed.ncbi.nlm.nih.gov ) to identify relevant articles published in the past 5 years (January 2019 to January 2024). The search strategy incorporated keywords and terms ( https://www.ncbi.nlm.nih.gov/mesh ) in human epidemiological cohorts and clinical trials representing: (1) precision medicine, (2) cardiometabolic diseases and (3) clinical translation (see Supplementary Methods for the detailed search strategy). The search was constrained to publications written in English. Conference abstracts, case reports, study protocols, reviews and animal studies were excluded.

Study selection and quality assessment

Covidence software ( https://www.covidence.org ; Veritas Health Innovation) was used to manage the scoping review selection process. Studies were filtered in three stages: (1) removal of duplicate publications; (2) ascertainment of study eligibility based on title and abstract by at least two independent reviewers; and (3) full text review of 47 randomly selected studies, where at least 2 independent reviewers assessed the eligibility of each publication according to the inclusion and exclusion criteria. Each paper was further evaluated to determine whether it met the 11 predetermined quality criteria. Any conflicts were subsequently resolved by an independent reviewer.

Consensus process

The five-step consensus process was based on a modified Delphi and nominal group technique 29. The process involved: (1) an initial Delphi survey (6–13 February 2024); (2) a consensus meeting (15–16 February 2024); (3) a second Delphi survey (19–26 February 2024); (4) finalization of the checklist at a second consensus meeting of the executive oversight committee (5–6 March 2024), which reviewed the voting from all rounds of the Delphi surveys, made final decisions about item inclusion and refined the wording of the BePRECISE checklist; and (5) circulation of the final version of the checklist to all panel members for consultation and approval (13–19 March 2024). The executive oversight committee also evaluated the checklist against two publications on precision medicine determined through the scoping review to be of high and low quality, respectively.

The items in the first iteration of Delphi survey 1 were derived from existing checklists: CONSORT 17 , STROBE 18 , CONSORT-Equity 2017 extension 31 and STrengthening the REporting of Genetic Association Studies (STREGA)—an extension of the STROBE guidelines 32 . Additional items specific to precision medicine were generated based on the reporting gaps identified from the series of systematic reviews (11 published) that underpinned the ‘Second international consensus report on gaps and opportunities for the clinical translation of precision diabetes medicine’ 3 . The draft of Delphi survey 1 was presented to the full panel at a roundtable discussion followed by co-development with the full panel through an online document-sharing platform. The final items for Delphi survey 1, including the input sources for its development, are shown in Supplementary Table 3 .

The Delphi survey response scale had five options: ‘Completely inappropriate’, ‘Somewhat inappropriate’, ‘Neither appropriate nor inappropriate’, ‘Somewhat appropriate’ and ‘Completely appropriate’. The consensus threshold was defined a priori as at least 80% of the panel voting for ‘Completely appropriate’ or ‘Somewhat appropriate’. Items with voting scores under this consensus threshold were discussed at the Consensus meetings. The Delphi surveys were administered online and were anonymous. Panelists were invited to provide free-text comments to suggest new items (survey 1 only), suggest a change of wording for a given item or justify their voting decision. The voting scores and anonymous comments for each item from the previous consensus round were provided to panelists at the subsequent rounds, such that consensus was reached iteratively.
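As a simple illustration of this prespecified consensus rule, the sketch below (with hypothetical vote counts) checks whether a single item reaches the 80% threshold:

```python
# Hypothetical tally of panel votes for one checklist item across the
# five-point response scale used in the Delphi surveys.
votes = {
    "Completely inappropriate": 0,
    "Somewhat inappropriate": 1,
    "Neither appropriate nor inappropriate": 2,
    "Somewhat appropriate": 6,
    "Completely appropriate": 13,
}

def reaches_consensus(votes: dict[str, int], threshold: float = 0.80) -> bool:
    """Consensus = at least `threshold` of the panel voting 'Somewhat appropriate'
    or 'Completely appropriate'."""
    total = sum(votes.values())
    supportive = votes["Somewhat appropriate"] + votes["Completely appropriate"]
    return total > 0 and supportive / total >= threshold

print(reaches_consensus(votes))  # True: 19 of 22 votes (86%) are supportive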

Delphi panel and executive oversight committee

The BePRECISE checklist panelists cover the core areas of expertise outlined in the EQUATOR Network recommendations for reporting guidelines development (https://www.equator-network.org/library/equator-network-reporting-guideline-manual). The panel includes subject-matter experts across relevant disease areas and with expertise in the topics highlighted as gaps in the ‘Second international consensus report on gaps and opportunities for the clinical translation of precision diabetes medicine’. Moreover, BePRECISE panelist selection focused on ensuring diversity in: (1) global representation (Europe, North America, sub-Saharan Africa and Australia); (2) career stage (23% early-career researchers with up to 10 years of research experience, 27% mid-career researchers with 11–15 years of experience and 50% senior researchers with >20 years of experience); and (3) gender (55% of authors being female).

Accordingly, the Delphi panel comprised subject-matter experts in key cardiometabolic disorders (diabetes, obesity, cardiovascular disease, fatty liver disease, renal disease), statistics, study design (epidemiologists and clinical trialists), journal editorial, lived experience, benchmarking and technology, education and translation, health equity, community engagement and clinical practice. Several of these experts are based in or have worked extensively with investigators in low- and middle-income countries (M.R., N.S., J.L.S. and P.W.F.).

The executive oversight committee for this report consisted of multidisciplinary experts in cardiometabolic disorders, equity research, medical journal editorial and lived experience (P.W.F., S.S.L., S.S.R., J.L.S., A.H.N., Z.S.A. and M.L.M.).

References

1. Chung, W. K. et al. Precision medicine in diabetes: a consensus report from the American Diabetes Association (ADA) and the European Association for the Study of Diabetes (EASD). Diabetes Care 43, 1617–1635 (2020).
2. Chung, W. K. et al. Precision medicine in diabetes: a consensus report from the American Diabetes Association (ADA) and the European Association for the Study of Diabetes (EASD). Diabetologia 63, 1671–1693 (2020).
3. Tobias, D. K. et al. Second international consensus report on gaps and opportunities for the clinical translation of precision diabetes medicine. Nat. Med. 29, 2438–2457 (2023).
4. GBD 2019 Diseases and Injuries Collaborators. Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet 396, 1204–1222 (2020).
5. Sun, H. et al. IDF diabetes atlas: global, regional and country-level diabetes prevalence estimates for 2021 and projections for 2045. Diabetes Res. Clin. Pract. 183, 109119 (2022).
6. Ahmad, A. et al. Precision prognostics for cardiovascular disease in type 2 diabetes: a systematic review and meta-analysis. Commun. Med. 4, 11 (2024).
7. Semnani-Azad, Z. et al. Precision stratification of prognostic risk factors associated with outcomes in gestational diabetes mellitus: a systematic review. Commun. Med. 4, 9 (2024).
8. Francis, E. C. et al. Refining the diagnosis of gestational diabetes mellitus: a systematic review and meta-analysis. Commun. Med. 3, 185 (2023).
9. Misra, S. et al. Precision subclassification of type 2 diabetes: a systematic review. Commun. Med. 3, 138 (2023).
10. Benham, J. L. et al. Precision gestational diabetes treatment: a systematic review and meta-analyses. Commun. Med. 3, 135 (2023).
11. Felton, J. L. et al. Disease-modifying therapies and features linked to treatment response in type 1 diabetes prevention: a systematic review. Commun. Med. 3, 130 (2023).
12. Murphy, R. et al. The use of precision diagnostics for monogenic diabetes: a systematic review and expert opinion. Commun. Med. 3, 136 (2023).
13. Lim, S. et al. Participant characteristics in the prevention of gestational diabetes as evidence for precision medicine: a systematic review and meta-analysis. Commun. Med. 3, 137 (2023).
14. Jacobsen, L. M. et al. Utility and precision evidence of technology in the treatment of type 1 diabetes: a systematic review. Commun. Med. 3, 132 (2023).
15. Semple, R. K., Patel, K. A., Auh, S., ADA/EASD PMDI & Brown, R. J. Genotype-stratified treatment for monogenic insulin resistance: a systematic review. Commun. Med. 3, 134 (2023).
16. Naylor, R. N. et al. Systematic review of treatment of beta-cell monogenic diabetes. Preprint at medRxiv https://doi.org/10.1101/2023.05.12.23289807 (2023).
17. Schulz, K. F., Altman, D. G., Moher, D. & the CONSORT Group. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. PLoS Med. 7, e1000251 (2010).
18. Cuschieri, S. The STROBE guidelines. Saudi J. Anaesth. 13, S31–S34 (2019).
19. Page, M. J. et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Br. Med. J. 372, n71 (2021).
20. O’Neill, J. et al. Applying an equity lens to interventions: using PROGRESS ensures consideration of socially stratifying factors to illuminate inequities in health. J. Clin. Epidemiol. 67, 56–64 (2014).
21. Mirzayi, C. et al. Reporting guidelines for human microbiome research: the STORMS checklist. Nat. Med. 27, 1885–1892 (2021).
22. Hart, J. T. The inverse care law. Lancet i, 405–412 (1971).
23. Kamiza, A. B. et al. Transferability of genetic risk scores in African populations. Nat. Med. 28, 1163–1166 (2022).
24. Choudhury, A. et al. Meta-analysis of sub-Saharan African studies provides insights into genetic architecture of lipid traits. Nat. Commun. 13, 2578 (2022).
25. Aiyegbusi, O. L. et al. Considerations for patient and public involvement and engagement in health research. Nat. Med. 29, 1922–1929 (2023).
26. Retzer, A. et al. A toolkit for capturing a representative and equitable sample in health research. Nat. Med. 29, 3259–3267 (2023).
27. Calvert, M. et al. Reporting of patient-reported outcomes in randomized trials: the CONSORT PRO extension. J. Am. Med. Assoc. 309, 814–822 (2013).
28. Calvert, M. et al. Guidelines for inclusion of patient-reported outcomes in clinical trial protocols: the SPIRIT-PRO extension. J. Am. Med. Assoc. 319, 483–494 (2018).
29. Rankin, N. M. et al. Adapting the nominal group technique for priority setting of evidence-practice gaps in implementation science. BMC Med. Res. Methodol. 16, 110 (2016).
30. Tricco, A. C. et al. PRISMA extension for Scoping Reviews (PRISMA-ScR): checklist and explanation. Ann. Intern. Med. 169, 467–473 (2018).
31. Welch, V. A. et al. CONSORT-Equity 2017 extension and elaboration for better reporting of health equity in randomised trials. Br. Med. J. 359, j5085 (2017).
32. Little, J. et al. STrengthening the REporting of Genetic Association studies (STREGA)—an extension of the STROBE statement. Eur. J. Clin. Invest. 39, 247–266 (2009).


Acknowledgements

As a PPIE representative from Australia, A.H.N. was remunerated by the Cardiometabolic Health Implementation Research in Postpartum women (CHIRP) consumer group, Eastern Health Clinical School, Monash University according to the Monash Partners Remuneration and Reimbursement Guidelines for Consumer and Community Involvement Activity. The Covidence license was paid for in part by Lund University’s Medical Library (Faculty of Medicine, Lund University, Lund, Sweden). Z.S.-A. was supported by the Canadian Institutes of Health Research Fellowship; M.L.M. by the Italian Ministry of Health Grant ‘Ricerca Finalizzata 2019’ (no. GR-2019-12369702); A.A. by Swedish Heart–Lung Foundation (grant no. 20190470), Swedish Research Council (2018-02837), EU H2020-JTI-lMl2-2015-05 (grant no. 115974—BEAt-DKD) and HORIZON-RIA project (grant no. 101095146—PRIME-CKD); H.F. by EU H2020-JTI-lMl2-2015-05 (grant no. 115974—BEAt-DKD) and HORIZON-RIA project (grant no. 101095146—PRIME-CKD). J.M.D. is a Wellcome Trust Early Career Fellow (no. 227070/Z/23/Z) and is supported by the Medical Research Council (UK) (grant no. MR/N00633X/1) and the National Institute for Health and Care Research (NIHR), Exeter Biomedical Research Centre. The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care. R.J.F.L. is employed at the Novo Nordisk Foundation Center for Basic Metabolic Research, which is supported by grants from the Novo Nordisk Foundation (nos. NNF23SA0084103 and NNF18CC0034900), and in addition by personal grants from the Novo Nordisk Foundation (Laureate award no. NNF20OC0059313) and the Danish National Research Fund (Chair DNRF161). M.R. is a South African Research Chair on the Genomics and Bioinformatics of African Populations, funded by the Department of Science and Innovation. N.S. is Chair of the Obesity Mission for the Office of Life Science, UK Government. M.F.G. is supported by the Swedish Research Council (EXODIAB, no. 2009-1039), Swedish Foundation for Strategic Research (LUDC-IRC, no. 15-0067) and EU H2020-JTI-lMl2-2015-05 (grant no. 115974—BEAt-DKD). A.H.N’.s salary is supported by funding from the Medical Research Future Fund and Monash Centre for Health Research and Implementation. S.S.R. is supported by grants from the National Institute of Diabetes and Digestive and Kidney Diseases (no. R01 DK122586), the Juvenile Diabetes Research Foundation (no. 2-SRA-202201260-S-B) and the Leona M. and Harry B. Helmsley Charitable Trust (no. 2204–05134). P.W.F. is supported by grants from the Swedish Research Council (no. 2019-01348), the European Commission (ERC-2015-CoG-681742-NASCENT), and Swedish Foundation for Strategic Research (no. LUDC-IRC, 15-0067).

Author information

Authors and Affiliations

Health Systems and Equity, Eastern Health Clinical School, Monash University, Box Hill, Victoria, Australia

Siew S. Lim & Jacqueline Boyle

Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA

Zhila Semnani-Azad, Deirdre K. Tobias & Paul W. Franks

Unit of Metabolic Disease, University-Hospital of Padua, Padua, Italy

Mario L. Morieri

Department of Medicine, University of Padua, Padua, Italy

Monash Centre for Health Research Implementation, Monash University and Monash Health, Melbourne, Victoria, Australia

Ashley H. Ng

Monash Partners Academic Health Science Centre, Melbourne, Victoria, Australia

Precision Healthcare University Research Institute, Queen Mary University of London, London, UK

Ashley H. Ng & Claudia Langenberg

Diabetic Complications Unit, Department of Clinical Sciences, Lund University Diabetes Centre, Malmo, Sweden

Abrar Ahmad, Hugo Fitipaldi & Maria F. Gomez

Board of Directors, Steno Diabetes Center, Copenhagen, Denmark

Christian Collin

Department of Clinical and Biomedical Sciences, University of Exeter Medical School, Exeter, UK

John M. Dennis

Computational Medicine, Berlin Institute of Health at Charité—Universitätsmedizin Berlin, Berlin, Germany

Claudia Langenberg

Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark

Ruth J. F. Loos & Jordi Merino

Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA

Ruth J. F. Loos

Diabetes Australia, Canberra, Australian Capital Territory, Australia

Melinda Morrison

Sydney Brenner Institute for Molecular Bioscience, University of the Witwatersrand, Faculty of Health Sciences, Johannesburg, South Africa

Michele Ramsay

Division of Gastroenterology, Hepatology and Nutrition, Virginia Commonwealth University School of Medicine, Richmond, VA, USA

Arun J. Sanyal

School of Cardiovascular and Metabolic Health, University of Glasgow, Glasgow, UK

Naveed Sattar

Division of Chronic Disease Research Across the Lifecourse, Department of Population Medicine, Harvard Medical School, Harvard Pilgrim Health Care Institute; Diabetes Unit, Massachusetts General Hospital, Boston, MA, USA

Marie-France Hivert

Diabetes Unit, Endocrine Division, Massachusetts General Hospital, Boston, MA, USA

Jordi Merino

Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA

Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA

Division of Preventive Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA

Deirdre K. Tobias

Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK

Michael I. Trenell

Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA

Stephen S. Rich

School of Public Health, Imperial College London, London, UK

Jennifer L. Sargent

Department of Clinical Sciences, Lund University, Helsingborg, Sweden

Paul W. Franks


Contributions

S.S.L. (co-chair), Z.S.-A., M.L.M., A.H.N., S.S.R., J.L.S. and P.W.F. (chair) formed the executive oversight committee. Z.S.-A. (lead), M.L.M., A.A., H.F., M.-F.H., M.F.G., J.M., D.K.T., M.I.T., S.S.R., J.L.S. and P.W.F. formed the evidence evaluation group. S.S.L. (lead), Z.S.-A., M.L.M., A.A., H.F., J.B., C.C., J.M.D., C.L., R.J.F.L., M.M., M.R., A.J.S., N.S., M.-F.H., M.F.G., J.M., D.K.T., M.I.T., A.H.N., S.S.R., J.L.S. and P.W.F. formed the consensus review panel. A.H.N. and C.C. were the PPIE representatives. S.S.L., Z.S.-A., M.L.M., A.H.N., S.S.R., J.L.S. and P.W.F. wrote the first draft of the manuscript. All the authors edited and approved the final version of the manuscript before submission for journal review.

Corresponding author

Correspondence to Paul W. Franks .

Ethics declarations

Competing interests

M.L.M. has consulted for and/or received speaker honoraria from Amarin, Amgen, AstraZeneca, Boehringer Ingelheim, Daichi, Eli Lilly, Merck Sharp & Dohme, Novo Nordisk, Novartis and Servier. In the past 5 years, A.H.N. has received an investigator-initiated grant from Abbott Diabetes Care and consulting honoraria from Roche Diabetes Care, Australia and the Australian Diabetes Educators Association. There are no perceived conflicts from previous involvements on this work. C.C. is a member of the Board of Directors for the Steno Diabetes Center in Copenhagen, Denmark. The views expressed in this paper do not necessarily reflect those of the Steno Center. M.R. is a consultant on the Genentech. ‘One Roche: Race, Ethnicity and Ancestry (“REA”) Initiative’. A.J.S. received research grants (paid to the institution) from: Intercept, Lilly, Novo Nordisk, Echosense, Boehringer Ingelhiem, Pfizer, Merck, Bristol Myers Squibb, Hanmi, Madrigal, Galmed, Gilead, Salix and Malinckrodt; was a consultant for Intercept, Gilead, Merck, NGM Bio, Terns, Regeneron, Alnylam, Amgen, Genentech, Pfizer, Novo Nordisk, AstraZeneca, Salix, Malinckrodt, Lilly, Histoindex, Path AI, Rivus, Hemoshear, Northsea, 89Bio, Altimmune, Surrozen and Poxel; and had ownership interests in Tiziana, Durect, Exhalenz, GENFIT, Galmed, Northsea and Hemoshear. N.S. has consulted for and/or received speaker honoraria from Abbott Laboratories, AbbVie, Amgen, AstraZeneca, Boehringer Ingelheim, Eli Lilly, Hanmi Pharmaceuticals, Janssen, Menarini-Ricerche, Novartis, Novo Nordisk, Pfizer, Roche Diagnostics and Sanofi; and received grant support (paid to the institution) from AstraZeneca, Boehringer Ingelheim, Novartis and Roche Diagnostics outside the submitted work. M.F.G. received financial and nonfinancial (in-kind) support (paid to the institution) from Boehringer Ingelheim Pharma, JDRF International, Eli Lilly, AbbVie, Sanofi-Aventis, Astellas, Novo Nordisk, Bayer, within EU grant H2020-JTI-lMl2-2015-05 (grant no. 115974—BEAt-DKD); also received financial and in-kind support from Novo Nordisk, Pfizer, Follicum, Coegin Pharma, Abcentra, Probi and Johnson & Johnson, within a project funded by the Swedish Foundation for Strategic Research on precision medicine in diabetes (LUDC-IRC no. 15-0067); and received personal consultancy fees from Lilly and Tribune Therapeutics AB. M.I.T. has, within the past 5 years, received consulting/honoraria from the Novo Nordisk Foundation, Abbott Nutrition, Changing Health and DAISER. This work is independent and does not represent the opinions of these organizations. S.S.R. has received consulting honoraria from Westat and investigator-initiated grants from the US National Institutes of Health, the Juvenile Diabetes Research Foundation and the Leona M. and Harry B. Helmsley Charitable Trust. J.L.S. receives consulting fees from the World Health Organization and the University of Bergen. This work was done outside these roles and the opinions expressed in these guidelines do not necessarily reflect those of the World Health Organization or the University of Bergen. J.L.S. was deputy editor of Nature Medicine until December 2023. She left employment at Springer Nature before any of her work on this Consensus Statement was initiated. P.W.F. was an employee of the Novo Nordisk Foundation at the time that these guidelines were written, although this work was done entirely within his academic capacity. The opinions expressed in these guidelines do not necessarily reflect those of the Novo Nordisk Foundation. 
Within the past 5 years, he has received consulting honoraria from Eli Lilly, Novo Nordisk Foundation, Novo Nordisk, UBS and Zoe, and previously had other financial interests in Zoe. He has also received investigator-initiated grants (paid to the institution) from numerous pharmaceutical companies as part of the Innovative Medicines Initiative of the European Union. J.A.B. received royalties from Elsevier as an editor on a medical textbook that does not impact this work. A.J.S. has stock options in Rivus, is a consultant to Boehringer Ingelheim and Akero, and has grants from Takeda. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Medicine thanks Jose Florez and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Joao Monteiro, in collaboration with the Nature Medicine team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Lim, S.S., Semnani-Azad, Z., Morieri, M.L. et al. Reporting guidelines for precision medicine research of clinical relevance: the BePRECISE checklist. Nat Med (2024). https://doi.org/10.1038/s41591-024-03033-3


Received : 21 March 2024

Accepted : 29 April 2024

Published : 19 July 2024

DOI : https://doi.org/10.1038/s41591-024-03033-3




A review of standards in GCSE computer science

Ofqual

Published 18 July 2024

Applies to England


© Crown copyright 2024

This publication is licensed under the terms of the Open Government Licence v3.0 except where otherwise stated. To view this licence, visit nationalarchives.gov.uk/doc/open-government-licence/version/3 or write to the Information Policy Team, The National Archives, Kew, London TW9 4DU, or email: [email protected] .

Where we have identified any third party copyright information you will need to obtain permission from the copyright holders concerned.

This publication is available at https://www.gov.uk/government/publications/a-review-of-standards-in-gcse-computer-science/a-review-of-standards-in-gcse-computer-science

  • Tim Stratton

With thanks to

  • Charlotte Draper
  • Rachel Taylor
  • Ian Stockford

Executive Summary

Computer science is a relatively new GCSE that was first awarded in 2012. It has since gone through a series of changes, in terms of the content and assessment structure of the qualification, and in terms of the size and makeup of the cohort taking the qualification. Due to these changes, and following representations from stakeholders, Ofqual undertook a substantial programme of research to consider grading standards in GCSE computer science over time.

GCSE computer science represents an unusual scenario due to the number of changes which have occurred within the short lifespan of the qualification. These have included changes to the assessment structure, reform, the COVID-19 pandemic and reported high levels of malpractice in the pre-reform specifications. Entries have grown substantially: the qualification was initially offered by only a small number of schools and colleges but has become much more widely available since its inclusion in the Ebacc school performance measure and the discontinuation of GCSE Information and Communication Technology (ICT). Substantial changes to the design of a qualification, the context within which it is operating, and the nature of its entry can introduce challenges in effectively maintaining standards over time.

This report includes details of the programme of research that Ofqual has undertaken to consider grading standards over time in GCSE computer science. There are 2 main strands of work: the first strand used a range of methodologies and analyses to consider whether there is any evidence that standards have not been consistently maintained over time, and the second aimed to consider the possible impact of any changes to the current grading standard, by reviewing examples of student work from summer 2023.

Strand 1 of this programme of work utilised a series of analytical approaches, both judgemental and statistical, to evaluate the grading standard of GCSE computer science over time. These analyses indicated that there has been a small reduction in the likelihood of students receiving at least a grade 7 (grade A pre-reform) or grade 4 (grade C pre-reform) between 2014 and 2019. No consistent effect was seen at grade 1 (grade G pre-reform). Through these analyses we aimed to exclude possible valid reasons for a change in outcomes, such as a change in the ability of the cohort, changes in the familiarity of centres with the qualification and its assessments, and changes in the types of school or student taking the qualification. This change in outcomes therefore likely reflects a small unintended change in the qualification standard. The analyses indicate that this change in standard may have occurred primarily between 2014 and 2017, when there was a large increase in entry to the qualification, including many new centres offering computer science.

The second strand of work aimed to consider the possible impact a change in standards would have on the performance necessary to achieve a grade 7 or grade 4 in the most recent assessments. A group of 8 subject experts reviewed examples of students’ work at various mark points. The findings indicated that the experts believed a small change in the performance standard necessary for students to attain a grade 7 or grade 4 would have a limited impact on the skills and knowledge demonstrated by students at these grades. However, a larger change would risk undermining the value of the qualification. Experts also provided various qualitative insights into the standard of the qualification.

Taking into account the range of evidence, there is a compelling case that standards may not have been consistently maintained through the period from 2014 to 2019, with the standard being set slightly more severely during that period. This change in standards appears to have been the result of a gradual change over a series of years. These small incremental changes are unlikely to have been detectable by senior examiners in any individual year, but cumulatively have resulted in a more substantive change. This is not a consequence of awarding organisations failing to provide sufficient oversight and care through the awarding process, but rather of the changes to the qualification and the context in which it was operating during this period.

Introduction

Computer science is a relatively new GCSE that was first awarded in 2012. It has since gone through a series of changes, in terms of the content and assessment structure of the qualification, and in terms of the size and makeup of the cohort taking the qualification. Due to these changes, and following representations from stakeholders, Ofqual undertook a substantial programme of research to consider grading standards in GCSE computer science over time, with a view to considering whether or not standards have been effectively maintained.

A brief history of GCSE computer science qualifications and assessment

A GCSE in ‘computing’ was first offered by the awarding organisation (AO) OCR, with a pilot award in 2011 and the first full award in 2012. The qualification was designed to develop students’ understanding of the inner workings and programming of computer systems, distinct from the end-user focus of the existing GCSE in ICT (Dallaway, 2015).

Assessments in this first specification consisted of a single exam counting for 40% of the qualification grade and 2 controlled assessments, conducted in the classroom, worth 30% each. Each controlled assessment lasted around 20 hours and the final piece of work was generated under controlled conditions, that is, direct teacher supervision. These assessments were marked internally by schools and colleges (referred to as centres throughout) and moderated by OCR.

In 2013, the Department for Education (DfE) published a national curriculum for computing, covering key stage 1 to key stage 4 and in 2014 ‘computer science’ was added to the Ebacc school performance measure, in the sciences category. This aimed to incentivise schools to provide computer science education (Brown et al, 2014). In 2014, a GCSE in computer science was offered by 2 more AOs, WJEC and AQA, and OCR revised their specification to align with the new computer science requirements for inclusion in the Ebacc. The qualification was made available from a fourth AO, Pearson, for first assessment from 2015.

During the period from 2014 to 2017, the structure of the qualifications offered by the different AOs was similar to that initially offered by OCR, although there were some differences between the AOs. All the qualifications consisted of an exam and one or more controlled assessments that made up 25% to 60% of the total qualification (see Table 1 for details). Controlled assessments were all marked internally by teachers and externally moderated by the AOs. All AOs had a single exam paper; however, in addition to the controlled assessment, WJEC also included a 2-hour onscreen, externally marked problem-solving assessment making up 30% of the qualification.

Table 1. Assessment structure for the different computer science specifications available before and after GCSE reform, including the percentage contribution of each assessment to qualification outcomes.

Pre-reform (first award: OCR 2012, AQA 2014, WJEC 2014, Pearson 2015):

  • OCR: 40% Written exam (Computer systems and programming); 30% Controlled assessment 1 (Practical investigation); 30% Controlled assessment 2 (Programming project)
  • AQA: 40% Written exam (Computing fundamentals); 60% Controlled assessment (Practical programming)
  • WJEC: 45% Written exam (Understanding computer science); 30% Onscreen assessment (Solving problems using computers); 25% Controlled assessment (Developing computing solutions)
  • Pearson: 75% Written exam (Principles of computer science); 25% Controlled assessment (Practical programming)

Post-reform:

  • OCR: 50% Written exam 1 (Computer systems); 50% Written exam 2 (Computational thinking, algorithms and programming)
  • AQA: 50% Written exam 1 (Computational thinking and problem solving); 50% Written exam 2 (Written assessment)
  • WJEC: 50% Written exam (Understanding computer science); 50% Onscreen exam (Computer programming)
  • Pearson: 50% Written exam (Principles of computer science); 50% Onscreen exam (Application of computational thinking)

Note: The post-reform structure represents the assessment structure of the qualification following the removal of the NEA. See text for details.

All GCSE subjects were reformed for first teaching between 2015 and 2018. Reformed GCSEs are graded on a 9 to 1 grading scale, rather than the pre-reform A* to G scale. Reformed GCSE computer science specifications were based on core subject content defined by the DfE (DfE, 2015) and were available for first teaching in 2016. At the same time, GCSE ICT, which had until that point sat alongside GCSE computer science, was discontinued (Ofqual, 2015a). The assessment requirements for the reformed qualifications were more specific, and therefore all AOs’ assessments followed the same structure. Assessment of the post-reform qualifications was intended to consist of exams contributing 80% of the marks and a non-examination assessment (NEA), taking 20 hours in total under tightly controlled conditions, making up 20% of the marks. The NEA was again permitted to be marked internally, but externally moderated.

The NEA task for the first year of reformed GCSE computer science was released by exam boards in September 2017 and due to be completed by March 2018, with first exams of the reformed qualifications sat in summer 2018. However, shortly after the NEA was released, reports of widespread malpractice, including solutions to the assessment being available online, led to the rapid withdrawal of the NEA (Ofqual, 2017). Following a public consultation, Ofqual stipulated temporary interim assessment arrangements. Centres were still required to conduct the 20-hour assessment, but it no longer counted towards a student’s overall grade. The exam boards updated the weighting of their exam papers so they counted for 50% of the marks each (see Table 1). These arrangements were intended to remain in place until 2021, while Ofqual consulted on long-term changes to the assessments (Ofqual, 2019).

Alongside reforms, in 2018, the National Centre for Computing Education (NCCE) was established to help train computer science teachers. The NCCE provides lesson plans and resources, as well as training programmes for teachers. By 2018 nearly 80% of year 11 pupils were in a school offering GCSE computer science (Kemp & Berry, 2019).

In 2020 and 2021, formal exams were cancelled for all GCSE and A level qualifications and were replaced with a system of teacher assessment, due to impacts of the COVID-19 pandemic. There was a return to exams in 2022 and at this point, following a further Ofqual consultation, the GCSE computer science exams were updated to include questions assessing students’ knowledge and understanding of programming skills, in lieu of the NEA (Ofqual, 2019). This has remained the case for 2023 and 2024.

In summary, despite being a relatively new GCSE that was first awarded in 2012, computer science has undergone many changes in terms of its content and assessment structure. The context within which it is operating has also changed, as has the size and makeup of the cohort taking the qualification.

Setting and maintaining grading standards

Determining if standards have been effectively maintained in a qualification is a challenging task. Outcomes from a qualification can change year on year for many reasons. However, in most cases these are legitimate increases or decreases in outcomes, which do not necessarily reflect a change in the standard of the qualification.

When seeking to maintain standards outside of any times of change, the aim is to ensure that results across successive years of the same qualification can be interpreted in the same way, in terms of what they tell us about student attainment in the subject. Typically, we would say that standards have been maintained if students receiving the same grade in different years show equivalent levels of attainment. By attainment we mean the level of skills or knowledge that students have developed through their course of study. When all else is stable, we would expect this attainment to be evidenced through students’ performance in their assessments. Therefore, during stable periods we would expect the quality of students’ work produced in exams, or other assessments, at each grade boundary (that is, the ‘performance standard’) to be highly similar between exam series. The aim of the awarding process is to set grade boundaries such that this is the case.

Identifying grade boundaries that maintain standards is not straightforward, as assessments change from year to year, both in terms of the content covered from the qualification’s specification and because of changes to the difficulty of the assessment. Although assessment writers aim to write assessments that are of similar difficulty each exam series, this is highly challenging to achieve in practice; therefore, no two exams are likely to be of exactly equal difficulty. Consequently, the grade boundaries are unlikely to be the same from year to year. If the assessment is more demanding in one year, then we would expect the grade boundaries to be lower to compensate. There is an added level of complexity in that GCSE assessments are ‘compensatory’. This means that students can gain marks in different areas of the assessment but receive the same total marks, potentially showing very different profiles in terms of their skills and knowledge. Therefore, to support examiners in their judgements, statistical evidence is used to help identify the direction and size of any changes in assessment demand. The details of these 2 types of evidence and how they are used in tandem are discussed below.

While there are complexities to the maintenance of standards, once a qualification is well established, the aim of maintaining the performance standard over time is relatively simple to conceptualise, as outlined above. This is less so during times of change, when assessments or qualification content is updated, such as during reform. When qualifications change, it is less meaningful to consider whether or not performance is maintained in the new reformed version, compared to the previous version for 2 reasons. First, the content of the qualification and the way that content is assessed is likely to have changed substantially meaning like-for-like comparisons are not possible. Second, there is evidence that student performance might be impacted during such changes. Previous data has shown that students’ performance in assessments is typically weaker in the first year after reform, and this is usually attributed to teachers being less familiar with new content or features of the updated assessments (Cuff et al., 2019). Performance then gradually improves over the following few years as teachers become more familiar with the reformed qualifications. This pattern of a dip in performance followed by gradual improvement is referred to as the Sawtooth Effect (more can be read about it in Newton, 2020). During these periods, it may not be meaningful for examiners to seek to identify similar levels of performance between pre- and post-reform qualifications, and it may not be fair to do so as students risk being disadvantaged if they happen to be in the first, or early, cohort taking a newly reformed qualification.

Therefore, during periods of reform in England statistical evidence is typically prioritised and judgemental evidence provides a more supporting role, to ensure that students are not disadvantaged. This approach seeks to reward students with the same level of underlying attainment similarly either side of the reforms, not disadvantaging those whose performance in the assessments may have been lower due to a lack of familiarity with the assessments post-reform. The assumption is made that if the makeup of the cohort has not substantially changed then we would not expect the outcomes to substantially change year on year, at the cohort level. The principle behind the use of statistical evidence is therefore that outcomes on the new assessments should be comparable to outcomes if the same cohort had taken the qualification in another year (Cresswell, 2003). However, this means that the quality of work produced by students during these periods may be weaker than that receiving the same grade during stable periods.

Operationalising the setting and maintenance of standards

Standards in GCSE assessments in England are maintained through the setting of grade boundaries. Grade boundaries represent the lowest mark where students demonstrate the performance necessary to receive each grade. Pre-reform at GCSE, awarding focused on the key judgemental boundaries of A, C and F. To support the maintenance of standards across the transition, grades A, C and G were referenced to grades 7, 4 and 1 post-reform, which became the new judgemental boundaries. The intermediate boundaries are calculated arithmetically, equally spaced between the judgemental boundaries. Examiners use a range of evidence to help guide their decision making when recommending grade boundaries.
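As a purely illustrative sketch (the mark values are hypothetical), the arithmetic spacing of the intermediate boundaries between judgemental boundaries at grades 7, 4 and 1 might look like this:

```python
# Hypothetical judgemental grade boundaries (marks).
judgemental = {7: 105, 4: 66, 1: 27}

def intermediate_boundaries(judgemental: dict[int, int]) -> dict[int, int]:
    """Space the boundaries for grades 6, 5 and 3, 2 arithmetically and
    equally between the judgemental boundaries for grades 7, 4 and 1."""
    boundaries = dict(judgemental)
    for upper, lower in [(7, 4), (4, 1)]:
        step = (judgemental[upper] - judgemental[lower]) / 3
        boundaries[lower + 2] = round(judgemental[lower] + 2 * step)
        boundaries[lower + 1] = round(judgemental[lower] + step)
    return dict(sorted(boundaries.items(), reverse=True))

print(intermediate_boundaries(judgemental))
# {7: 105, 6: 92, 5: 79, 4: 66, 3: 53, 2: 40, 1: 27}
```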

Examiners typically scrutinise examples of student work to identify the mark where students demonstrate the same level of performance as those at the grade boundary in the previous year. To achieve this, ‘archive evidence’ representing students’ work at each grade boundary from previous years is used to encapsulate the expected level of performance. Examiners review the quality of student work compared to the archive evidence, to identify the grade boundaries that most closely carry forward the performance standard from the previous year.

As discussed in the previous section, examiners’ decisions are supported by statistical evidence. One key source of statistical evidence is prior-attainment-based predictions. Predictions take into account the prior attainment of each cohort and provide an indication of what outcomes might be expected to look like if the cohort in the current year is similar to that in a previous reference year, in terms of all features which may affect outcomes except for prior attainment. They achieve this by carrying forward the value-added relationship for a qualification from a reference year, that is, the relationship between the cohort’s performance in a previous set of qualifications and the current assessment. For GCSEs this is typically the relationship between KS2 assessment results and GCSE results. Therefore, if the ability of the cohort as measured by their prior attainment is similar to the reference year, then the predicted outcomes will be similar to those in the reference year. However, if the ability of the cohort taking the subject has increased or decreased then the predictions will change accordingly. AOs use predictions to identify the grade boundaries which most closely maintain the relationship between prior attainment results of the cohort and results in the subject in question over time. The boundaries suggested by the predictions are then used to guide examiner judgements to set grade boundaries. Using statistical predictions in this way also helps support the alignment of standards between different AOs.
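The general principle can be sketched as follows; this is an illustration of carrying forward a value-added relationship, not the AOs' actual prediction methodology, and all numbers and group definitions are hypothetical:

```python
import numpy as np

# Hypothetical reference-year value-added matrix: for each prior-attainment
# group (rows, highest to lowest), the proportion of matched students
# achieving each grade from 9 down to 1 (columns). Each row sums to 1.
grades = [9, 8, 7, 6, 5, 4, 3, 2, 1]
ref_matrix = np.array([
    [0.30, 0.25, 0.20, 0.12, 0.07, 0.04, 0.01, 0.01, 0.00],
    [0.10, 0.18, 0.22, 0.20, 0.14, 0.10, 0.04, 0.01, 0.01],
    [0.02, 0.08, 0.15, 0.22, 0.22, 0.18, 0.08, 0.03, 0.02],
    [0.00, 0.02, 0.06, 0.14, 0.24, 0.26, 0.16, 0.08, 0.04],
])

# Hypothetical prior-attainment mix of the current cohort (proportions per group).
current_mix = np.array([0.20, 0.30, 0.30, 0.20])

# Carrying forward the reference-year relationship gives predicted outcomes
# for the current cohort, reported cumulatively from grade 9 downwards.
predicted = current_mix @ ref_matrix
cumulative = np.cumsum(predicted)

for g, c in zip(grades, cumulative):
    print(f"predicted % at grade {g} or above: {100 * c:.1f}")
```

If the current cohort's prior-attainment mix matches the reference year, the predicted outcomes reproduce the reference-year outcomes; if the mix shifts towards stronger or weaker prior attainment, the predictions move accordingly.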

Predictions are based on a subset of ‘matched candidates’: those who are of the target age group (for GCSEs, those who would be 16 on 31 August of the year they took their exams) and who have available prior attainment data (KS2 results). GCSE predictions also typically exclude students at selective or independent centres, as research has shown that they tend to have a different value-added relationship between prior attainment and current outcomes than students at other centres.

The reliability of statistical evidence will vary depending on the size and stability of the cohort taking the qualification. When the number of students taking a qualification is small, the statistical evidence is likely to be weaker, so AOs will put more weight on other sources of evidence, such as examiner judgement. Similarly, if there have been substantial changes in the cohort taking a qualification, such as large increases or decreases in entry, or changes to the types of students or centres taking a qualification, then statistical predictions may be a less reliable representation of the performance of the current cohort.

In the early years following reform to the GCSE, greater weight was placed on statistical predictions. As discussed previously, the intention of this was to avoid students being disadvantaged in the early years post-reform, when performance may be lower due to teachers’ unfamiliarity with the new content and assessments. However, awarding teams continued to scrutinise examples of student work to confirm that the quality of students’ work at the grade boundaries was acceptable.

Setting and maintaining standards in GCSE computer science, 2012 to 2023

In the first award of GCSE computer science in 2012 it was necessary to set the standard for this new qualification. The first award was largely judgemental, but statistical evidence was used to support awarders’ judgements. Statistical predictions were produced from a selection of related GCSE subjects (namely, ICT, physics and maths) to provide an indication of what outcomes might look like in the first award of computer science, taking into account the ability of the cohort. This was then used to inform examiner scrutiny of the quality of work students produced in the assessments to determine grade boundaries.

Following the first award in 2012, until 2017, grade boundaries continued to be set based on a balance of statistical evidence and examiner judgement with the aim of maintaining the performance standard. In most years, statistical predictions were produced based on the previous year’s outcomes, to guide examiners in making their judgements. In the early years of the qualification, AOs would also have been aware that these were new assessments and a new specification, with which teachers would have been somewhat unfamiliar.

Reformed GCSE computer science assessments were first awarded in 2018. As for all reformed GCSEs, statistical evidence was prioritised during the reform period (2018 and 2019). As described above, this was to ensure that students were not disadvantaged due to any dips in performance during the transition years and to overcome the challenges to the use of examiner judgement during this period.

During the COVID-19 pandemic (2020 and 2021), normal assessment arrangements were suspended, and grades were awarded based on teacher assessments. Normal exam arrangements for GCSEs returned in summer 2022. However, in 2022 grade boundaries were set in such a way that outcomes were broadly mid-way between results in 2021 and 2019 as part of the 2-year return to pre-pandemic standards.

Summer 2023 then represented the first year that grading returned to pre-pandemic grading standards. To facilitate this, standard setting in GCSE computer science in summer 2023 was guided by predictions so that overall results would be similar to outcomes in 2019. This approach was taken to carry forward the grading standard from before the pandemic, but with protection built into the grading process to recognise the disruption that students had faced. This allowed for the fact that exam performance may have been a little lower than before the pandemic, similar to the approach taken during reform. However, examiners were asked in awarding in 2023 to review students’ work at the grade boundaries and confirm that students were demonstrating an acceptable level of performance. Therefore 2023 provides a good representation of the current performance standard.

Structure of this report

The preceding sections of this report have outlined the history of GCSE computer science, the principles that underpin the setting and maintenance of standards, and how that is operationalised through the process of awarding. This aims to support understanding of the analytical approaches that are documented through the main sections of this report.

There are 2 main strands of work: the first strand used a range of methodologies and analyses to consider whether there is any evidence that standards have not been consistently maintained over time, and the second strand aimed to consider the possible impact of any changes to the current standard, by reviewing examples of student work from summer 2023.

The methodology, results and interim findings relating to each analysis are reported in the sections that follow, before the overall findings are discussed and conclusions drawn.

Throughout this report, reference will be made to grade A/7, C/4 or G/1 to describe effects at those grades that span across pre- and post-reform versions of the qualification. When referring to the percentage of students receiving each grade we mean the cumulative percentage, that is the percentage of students receiving either the grade in question or a higher grade.
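As a small illustration of this cumulative convention (with hypothetical grade counts):

```python
# Hypothetical counts of students at each grade, 9 down to 1.
counts = {9: 5, 8: 9, 7: 14, 6: 15, 5: 17, 4: 16, 3: 12, 2: 8, 1: 4}
total = sum(counts.values())

cumulative = {}
running = 0
for grade in sorted(counts, reverse=True):
    running += counts[grade]
    cumulative[grade] = 100 * running / total  # % at this grade or above

print(f"grade 7 or above: {cumulative[7]:.1f}%, grade 4 or above: {cumulative[4]:.1f}%")
```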

Strand 1 - Standards over time 2012 to 2019

The aim of this strand is to look back on standards in GCSE computer science historically, focusing on the period from when assessments were first sat (2012) to the last year before the pandemic (2019). This is principally to identify if there have been any unexpected changes in standards in this qualification over time. If any changes in standard are identified the aim is to try to understand the cause of these changes and the size of the impact on student outcomes.

Structure of strand 1

To achieve the above aim we have taken a variety of approaches, both purely quantitative and more qualitative, to consider standards in the qualification over time. The following sections of this report outline each of these methods in turn, detailing the aims, methodology and key findings of the individual approaches. Each of these methods allows us to control for different potentially confounding factors; however, each method also comes with its own limitations and assumptions, which we outline in each section. We will then draw together the findings from these individual analyses.

The first section (analyses 1 and 2) includes contextual background information to changes in the qualification. This includes descriptive information of how the qualification and cohort has changed over time, and also an overview of how the AOs approached standard maintenance and setting grade boundaries in each year.

The second section (analyses 3 to 7) contains a range of statistical methods that look at the relationship between outcomes in the qualification and other measures of student attainment over time.

Caution needs to be taken when directly comparing between outputs from the different analyses as they each have their own assumptions and in some cases are calculated using a slightly different subset of the population.

The key dataset used for the majority of the analyses presented in this report is the National Pupil Database (NPD). This is a dataset maintained by the DfE and contains details of students’ assessment results, along with a large number of other student and centre characteristics. Data was taken from NPD years 2011 to 2019 and filtered to students who had taken GCSE computer science for the primary analyses. Data from GCSE maths, physics and English language are also used in various analyses for comparison. Prior attainment data was available in the NPD for the majority of students in each year based on their key stage 2 (KS2) national curriculum assessment results in maths and English.

Data was filtered to 16-year-old students from England who had a valid GCSE grade. Results data were combined with Schools Census data to provide student characteristics; these included centre type attended, gender, ethnic group, language spoken, special education needs (SEN) status and free school meal (FSM) eligibility. Table 2 shows the number of students entered for GCSE computer science in each year, along with the percentage of students with available census data and prior attainment data. It is notable that the availability of prior attainment data is lower in 2015 due to boycotts of KS2 assessments in 2010. The majority of the analyses focus on the years 2014 to 2019. Data prior to 2014 is presented where possible but needs to be treated with caution as entries were small.
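A minimal sketch of this kind of filtering and linkage is shown below; the data and column names are hypothetical stand-ins, not actual NPD field names:

```python
import pandas as pd

# Hypothetical stand-ins for an NPD results extract and a Schools Census extract.
results = pd.DataFrame({
    "pupil_id":      [1, 2, 3, 4],
    "subject":       ["computer science"] * 4,
    "age_at_31_aug": [16, 16, 15, 16],
    "country":       ["England", "England", "England", "Wales"],
    "grade":         ["7", "4", "5", "6"],
})
census = pd.DataFrame({
    "pupil_id":     [1, 2],
    "centre_type":  ["maintained", "independent"],
    "gender":       ["F", "M"],
    "fsm_eligible": [True, False],
})

# Filter to 16-year-old students from England with a valid GCSE grade.
cs = results[
    (results["age_at_31_aug"] == 16)
    & (results["country"] == "England")
    & (results["grade"].notna())
]

# Attach student characteristics via a left join on the pupil identifier.
cs = cs.merge(census, on="pupil_id", how="left")

# Match rate with census data, in the spirit of Table 2.
print(f"% with census data: {100 * cs['centre_type'].notna().mean():.1f}")
```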

Data on statistical predictions used by AOs comes from datasets regularly shared with Ofqual as part of routine monitoring of results in each year. Additional data was also collected from the 2 AOs with the largest entry to computer science, OCR and AQA. This included documentation of decision-making during grade boundary setting in each year and some of the supporting information used.

Where additional datasets were used, or additional data processing was carried out, this is detailed in the relevant section.

Table 2. Summary of student numbers and match rates across data sets in each year

Year Total computer science students % with census data % with prior attainment data
2011 (pilot) 92 97.8 96.7
2012 1,745 92.3 90.7
2013 4,179 95.9 92.7
2014 16,011 96.7 92.2
2015 33,773 96.6 69.1
2016 61,751 96.9 92.6
2017 67,374 96.8 92.5
2018 71,111 96.2 91.6
2019 75,165 95.2 91.6

Strand 1. Analysis 1. Cohort changes and outcomes over time

Changes to the cohort taking an assessment can make it more challenging to effectively maintain standards over time in a qualification. The aim of this first section is to identify any changes to the cohort entered for GCSE computer science over time. This will identify whether such changes might indicate a case for further exploration and provide context for any further analysis.

Cohort size

Figure 1 shows the number of students entering GCSE computer science between 2011 and 2019, both overall and broken down by individual AO. There are 2 notable things from this figure. The first is that OCR, the AO that was first to offer the GCSE, has continued to have the majority of entries over time. Second, the number of students taking the qualification increased rapidly between 2014 and 2016, before the increase in entries slowed down.

Figure 1. Entry numbers to GCSE computer science over time, overall and broken down by AO.

Centre characteristics

Figure 2 shows the different types of centres entering students to GCSE computer science over time. Centres are categorised into independent centres, selective centres, maintained schools (including secondary schools, academies and free schools), and colleges. It is notable that in the first couple of years of the qualification being available there was a much larger proportion of students from independent and selective centres. However, following 2014, the proportion of students from different centre types stayed broadly stable.

Figure 2. Proportion of entry in each year from different centre types.

Figure 3 shows the percentage of students at ‘new’ centres entering the qualification each year. By ‘new’ centres we mean centres that had never previously entered students for the qualification. As can be seen from Figure 3, a large proportion of students taking the qualification were from ‘new’ centres until around 2016. Table 3 summarises the number of centres and the average entry per centre in each year. It is notable that as the entry size increased, the majority of this increase was through new centres offering the qualification, rather than existing centres increasing the number of students they entered. Average entry size per centre did gradually increase between 2012 and 2016 before stabilising from 2016 onwards, which may suggest there was some change in the cohorts within centres. The standard deviations also indicate that there is a large amount of variation in entry size between centres.

These changes to the cohort are worth noting in the context of evidence showing that when centres are unfamiliar with offering a qualification, students at these centres can perform less well in the assessments (Newton, 2020). In years where a large number of students entering the qualification were at new centres, that could lead to the performance of the cohort being weaker than might otherwise have been expected.

Figure 3. Percentage of students entering in each year from centres that were entering students for the first time.

Table 3. Number of centres and average number of students per centre over time.

Year N Centres Mean entry per centre SD entry per centre
2012 97 18.0 10.7
2013 210 19.9 14.2
2014 724 22.1 15.0
2015 1,437 23.5 15.8
2016 2,340 26.4 19.7
2017 2,652 25.4 17.4
2018 2,845 25.0 16.7
2019 2,922 25.7 17.5
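The ‘new centre’ measure shown in Figure 3 can be sketched as follows: a centre counts as new in the first year it enters any students, and the percentage is taken over student entries (the data and column names here are hypothetical):

```python
import pandas as pd

# Hypothetical entry-level data: one row per student entry.
entries = pd.DataFrame({
    "year":      [2014, 2014, 2015, 2015, 2015, 2016, 2016],
    "centre_id": ["A",  "B",  "A",  "C",  "C",  "C",  "D"],
})

# A centre's first year of entry defines when its students count as being
# at a 'new' centre.
first_year = entries.groupby("centre_id")["year"].min().rename("first_year")
entries = entries.join(first_year, on="centre_id")
entries["new_centre"] = entries["year"] == entries["first_year"]

# Percentage of each year's entries coming from centres entering for the first time.
pct_new = 100 * entries.groupby("year")["new_centre"].mean()
print(pct_new)
```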

Student characteristics

Next, we look at the characteristics of the students taking computer science over time. Figure 4 shows the average standardised prior attainment score of students over time. This is students’ attainment in KS2 assessments 5 years before taking the GCSE. KS2 score is presented here on a standardised scale between 0 and 100 with a mean of 50 across all GCSE students. From Figure 4 it can be seen that the prior attainment of students taking computer science decreased fairly rapidly between 2012 and 2014, stabilised in 2015, before dropping again in 2016 and gradually increasing until 2019. The relationship between prior attainment and GCSE outcomes is strong for many GCSE subjects (Benton & Sutch, 2014) and so can give a good indication of expected outcomes, where other factors remain stable. Crucially, it is also used in the generation of predictions which are used to help set grade boundaries (see the section Operationalising the setting and maintenance of standards).

Figure 4. Mean prior attainment score over time for GCSE computer science students.

Table 4 shows the proportion of students taking computer science with different characteristics. This shows that the candidature has gradually changed over time. Most notably, the proportions of students with English as a foreign language (EFL) and with special educational needs (SEN) have gradually increased since 2013. The candidature has also become more diverse, with a lower proportion of white students and a growing proportion of female students taking the subject. The largest step change in most of the characteristics was between 2015 and 2016, when entries also increased substantially.

Table 4. Characteristics of GCSE computer science cohort over time.

Year % FSM % EFL % SEN % White % Female
2012 5.2% 13.5% 10.3% 79.0% 13.5%
2013 8.4% 14.5% 8.5% 77.9% 14.5%
2014 9.9% 14.9% 9.6% 78.0% 15.4%
2015 9.2% 15.6% 9.0% 78.1% 16.2%
2016 10.3% 17.0% 9.3% 77.3% 20.5%
2017 9.8% 17.2% 9.3% 76.8% 20.2%
2018 9.7% 18.9% 9.6% 74.1% 20.4%
2019 10.7% 19.9% 9.7% 71.8% 21.6%

Finally, we look at outcomes in the qualification over time. This is intentionally presented after the above analysis of other changes over time, as outcomes can change for a number of legitimate reasons that may be related to some of the above changes in entry patterns.

Figure 5 shows the cumulative percentage of students attaining at least a grade C/4 and A/7 over time. While outcomes have generally fallen over time at both grades, there is a particularly notable shift between 2015 and 2016. This coincides with the large increase in entries and some of the changes in candidature noted above. It also coincides with the fall in average prior attainment of the cohort, which could represent a legitimate fall in outcomes.

Figure 5. Cumulative percentage outcomes of students receiving at least a grade A/7 and C/4 over time.

The descriptive analyses reported here provide context for the analysis and discussion that follows. These analyses confirm that the qualification has seen a change in the size and make-up of the cohort taking the subject, along with a change in the outcomes in GCSE computer science. These analyses have shown that outcomes have declined in GCSE computer science over time, most notably between 2012 and 2016.

As noted, changes in cohorts over time make the maintenance of standards more challenging. The analyses that follow seek to identify whether or not the changes in outcomes that have been observed reflect genuine changes in the attainment of the GCSE computer science cohort over time, or whether they may be attributable to a change in standards over that period.

From the above descriptive analysis of student characteristics there are 3 potential legitimate reasons for a change in outcomes:

  • Outcomes could decline because the cohort taking GCSE computer science became weaker over time. Evidence from the prior attainment data suggests that this may be the case.
  • Students at centres that are delivering the qualification for the first time may perform worse in the assessments, potentially due to teacher unfamiliarity with the course content and the assessments. Outcomes for those centres may therefore be lower than they will be once familiarity increases. In years with a large number of new centres, this could contribute to overall dips in outcomes, if the boundaries suggested by the predictions implied a quality of student work that could not be supported by the examiners.
  • Cohorts in later years may be functionally different from those in earlier years. As the number of centres increases, cohorts at newer centres may typically have lower outcomes (relative to their prior attainment) than students at centres that took up the qualification in earlier years. This could be due to demographic differences, or to factors such as centre resources or teacher expertise differing between early-uptake and later-uptake centres.

Each of the above factors could lead to legitimate changes in outcomes in the qualification. Throughout the rest of the report, we aim to control and compensate for one or more of these factors in the analyses to understand what may be contributing to a change in outcomes. If changes in outcomes cannot be attributed to the above factors this could indicate an unintended change in standards over time.

Strand 1. Analysis 2. Predictions, grade boundaries and awarding documents

In this section we review data from the AOs offering GCSE computer science over time that results from, or contributes to, decision making regarding grade boundary setting in each year. The aim is to identify whether there might be indicators of a potential change in standards, or risks to the maintenance of standards.

Outcomes relative to predictions

Grade boundaries in GCSE computer science were set using a balance of statistical and judgemental evidence. Each year statistical predictions were created based on a reference year, from which the relationship between prior attainment and outcomes is carried forward, as described in the section ‘Operationalising the setting and maintenance of standards’.

For GCSE computer science, each year, predictions were based on outcomes in the previous year, except for in 2016 when predictions were based on 2014. The reason for updating the reference year for predictions is typically to better reflect the cohort taking the assessment if the cohort make-up is changing over time, as was the case for computer science. The reference year may also be updated if entries have increased in small entry subjects, as larger samples usually provide a more reliable prediction. In 2016 the ‘reference year’ was not updated for computer science as, due to KS2 assessment boycotts in 2010, 2015 had fewer matched candidates.

Figure 6 and Figure 7 summarise the difference between predicted outcomes and matched candidate outcomes for grades A/7 and C/4, respectively. Data is combined across all AOs offering the qualification in each year, weighted by their total entry.

Figure 6. Cumulative actual outcomes and predicted outcomes for matched candidates for GCSE computer science at grade A/7.

Figure 7. Cumulative actual outcomes and predicted outcomes for matched candidates for GCSE computer science at grade C/4.

Figure 6 and Figure 7 show that between 2014 and 2016, outcomes at grade A/7 were slightly below predictions, although within a 1 percentage point (pp) difference. Given that predictions are likely to be less reliable when based on small entry numbers, 1pp does not represent a large difference and awarders may legitimately put more weight on other evidence when statistics are less reliable. At grade C/4 outcomes were close to predictions in all years except 2016, when they were overall around 3pp below prediction.

Information from awarding documents indicates that where outcomes were below predictions, it was typically because examiners judged the quality of work at the grade boundary indicated by the prediction to be too low, and so higher boundaries were recommended than those suggested by the predictions. Figure 8 shows the grade boundaries set over time for the AO with the largest entry (OCR) in both their examined and controlled assessments.

Figure 8. Grade boundaries over time for OCR assessments.

Malpractice

Documentation highlights examiners’ concerns around malpractice from as early as 2014 up until reform, and the risk that this would lead to grade inflation in the controlled assessments. Despite this concern, the grade boundaries in the controlled assessment element were typically kept stable to reflect the fact that the task, and therefore the demand of the assessment, remained similar from year to year. It is perhaps surprising that, despite grade boundaries typically being lowered in the examined element and possible grade inflation in the controlled assessment, this did not lead to higher, rather than lower, outcomes over time. This implies that successive cohorts were weaker and less well prepared relative to their prior attainment.

In 2016, OCR made changes to one of their controlled assessments to make the task more open ended, in an attempt to avoid malpractice such as solutions being posted online. This change to the assessment could have resulted in a temporary change in performance due to the newness of this assessment, leading to a sawtooth-like pattern of performance. The grade boundary in the controlled assessment was lowered by one mark to compensate for the potential increase in difficulty (see Figure 8); however, boundaries were raised again in 2017 as performance improved.

Reference year

One possible effect arising from the way standards were set during this period is the potential cumulative effect of repeatedly awarding below prediction, followed by the updating of reference years for calculating future predictions. If outcomes are awarded below prediction in a particular year, and that year becomes the reference year for a future year’s prediction (to best reflect the most recently observed value-added relationship), then predicted results will be lower for students with the same prior attainment than would previously have been the case. If this happens repeatedly, as in GCSE computer science, the expected value-added relationship is lowered cumulatively, reducing the expected outcomes for students with the same prior attainment over time in order to reflect the observed performance of students.

Table 5 gives a rough estimate of the size of this effect, taking into account the reference year used in each year. Although this is only a simple calculation that does not take into account possible changes in the prior attainment distribution over time, it does indicate a potential ‘deflationary effect’ on outcomes, albeit one based on judgements of the acceptability of students’ work. Table 5 indicates that by 2019 this ‘deflationary’ effect could have led to predictions around 1.5pp lower at grade A/7, 3.5pp lower at C/4 and 1.8pp lower at grade G/1 relative to 2014.

Table 5. Estimated cumulative effect of awarding below prediction on future predictions. Figures indicate the cumulative difference between predictions and outcomes over time in percentage points.

Year Reference year used for predictions Grade A/7 Grade C/4 Grade G/1
2014 2013 -0.87 -0.37 0.05
2015 2014 -1.72 -0.42 0.04
2016 2014 -1.45 -3.33 -1.46
2017 2016 -1.33 -3.70 -1.78
2018 2017 -1.37 -3.63 -1.67
2019 2018 -1.46 -3.49 -1.78
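
To make the chaining mechanism concrete, the sketch below shows how per-year differences between awarded outcomes and predictions can accumulate when each year’s prediction is based on an earlier awarded year. The reference-year mapping follows Table 5, but the per-year differences are hypothetical figures chosen for illustration; this is not the calculation behind Table 5.

```python
# Minimal illustration of how repeatedly awarding below prediction can compound
# when each year's prediction is based on an earlier awarded year.
# The per-year differences below are hypothetical, not the values behind Table 5.

# Per-year difference between awarded outcomes and that year's prediction
# (percentage points at a single grade), and the reference year used.
award_vs_prediction = {2014: -0.9, 2015: -0.8, 2016: -1.5, 2017: 0.1, 2018: -0.1, 2019: -0.1}
reference_year = {2014: 2013, 2015: 2014, 2016: 2014, 2017: 2016, 2018: 2017, 2019: 2018}

cumulative = {2013: 0.0}  # baseline year carries no accumulated difference
for year in sorted(award_vs_prediction):
    # The prediction for `year` already embeds whatever shortfall had accumulated
    # by its reference year, so the per-year differences chain together.
    cumulative[year] = cumulative[reference_year[year]] + award_vs_prediction[year]

for year, value in sorted(cumulative.items()):
    print(year, round(value, 2))
```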

In individual years, these outcomes reflect examiner judgements of the quality of performance that were made during that period. This lowering of predictions may therefore be justified if it represents a permanent change to the cohort’s expected value-added relationship, that is, if the cohort overall is performing worse relative to its prior attainment than previous cohorts and this is expected to continue indefinitely. However, if some of the weaker performance in previous years was due to temporary effects, such as sawtooth or sawtooth-like effects, this could result in an unjustified permanent shift in standards.

Overall, the awarding reports indicate that there were a number of challenges in maintaining standards over time, particularly around the period 2014 to 2016, when there were a large number of students from new centres. AOs set grade boundaries below those suggested by the prior attainment-based predictions on a number of occasions during this period, which may have been due to students at new centres demonstrating weaker performance. The position of the grade boundaries also suggests a growing mismatch in performance between the controlled assessment and exams, which could have been related to malpractice.

One potential risk highlighted by the review of awarding materials relates to the approach to calculating the predictions. The reference years were updated to ensure the predictions were a faithful representation of the value-added relationship observed in the previous award. This, combined with successively awarding below predictions, could have led to a small cumulative lowering of expected outcomes, which may or may not have remained appropriate in future years.

Strand 1. Analysis 3. Outcomes relative to other GCSE qualifications over time

One way to consider qualification standards is to look at how students taking a particular qualification performed in other qualifications they took alongside it. The aim of this section is to analyse whether the relationship between students’ results in GCSE computer science and their results in the other subjects they took alongside it changed over time. A change could indicate that standards have changed in computer science.

The intention for this analysis was not to focus on the direct statistical comparability between computer science and other subjects. Absolute differences between subjects in these analyses are not problematic, either in a particular year or persisting over time. Students’ grades in different subjects may be higher or lower than in other subjects for a large number of reasons, which may include teaching time dedicated to the subject, student motivation, how long students have studied the subject, among other factors (for a more detailed discussion see Ofqual, 2015b). We therefore would not expect students’ results to be perfectly aligned across subjects. Instead, the aim of this analysis was to use results in other subjects as a benchmark to identify if the relative difficulty of computer science had changed over time. The key assumption of this analysis is therefore that there is no reason to expect the relative difficulty of the subjects we are comparing to have changed over time.

We used 2 methods to provide a difficulty estimate for GCSE computer science compared with other subjects taken by the same students in each year, a Rasch difficulty model (see Coe, 2008) and Kelly’s method (Kelly, 1976). These analyses give an indication of how well students performed on average in other GCSE subjects in each year relative to computer science and provide a relative ‘difficulty’ estimate for each subject.

It is worth noting, however, that although these methods effectively control for the ‘general ability’ of the cohort, as measured by students’ performance in other GCSE subjects, they are not able to control for those other factors which may change over time and which may affect performance in specific subjects, such as teaching quality or student motivation.

The first method was to use a Rasch difficulty model to equate difficulty in different subjects in each year. For this model, each subject a student took was treated as an individual item on an assessment. However, only the key grades were used as thresholds for performance categories. To facilitate this, grades were converted to a score (see Table 6). Students not taking a subject were treated as missing responses. Only students who had taken at least 3 GCSEs were included in the analysis, and only subjects with at least 1,000 entries in each year.

Table 6. Details of grade conversions to scores for the Rasch analysis.

Score Legacy qualifications Reformed qualifications
0 Ungraded Ungraded
1 D, E, F, G 3, 2, 1
2 B, C 6, 5, 4
3 A, A* 9, 8, 7

The Rasch model was then fitted to simultaneously provide a ‘difficulty’ measure for each of the key grades for each subject and an ‘ability’ measure for each student. The difficulty measure is effectively the average ‘ability’ score of students achieving each grade in each subject in each year. A higher score on the Rasch difficulty scale therefore indicates that it is harder for the average student to achieve that grade. Outcomes from the Rasch model are inherently relative to other subjects and are on an arbitrary scale. Therefore, instead of presenting the Rasch difficulty scores in isolation, we provide the relative difference in difficulty estimates between computer science and 3 other subjects (maths, physics and English language) in each year. We use these subjects because of their large and relatively stable entries, and because maths and English language are taken by the vast majority of 16-year-old students in each year. Therefore, if we assume that the ability distribution of students included in the analysis is similar in each year, we can compare these scores between years to see how they change. For a more detailed discussion of the methodology see He and Black (2020) and He and Cadwallader (2022).
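
As a rough illustration of the general approach, the sketch below fits a simplified Rasch-style model at a single key grade (a dichotomous approximation, not the polytomous specification described above) to synthetic data, using joint maximum likelihood by gradient ascent. All abilities, difficulties and responses in the snippet are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 1,000 students x 5 "items" (one per subject), scored 1 if the
# student attained the key grade in that subject and 0 otherwise.
n_students, n_items = 1000, 5
ability_true = rng.normal(0.0, 1.0, n_students)
difficulty_true = np.array([-0.5, 0.0, 0.3, 0.6, 1.0])
p_true = 1 / (1 + np.exp(-(ability_true[:, None] - difficulty_true[None, :])))
x = rng.binomial(1, p_true)

# Joint maximum-likelihood estimation of a dichotomous Rasch model,
# P(attained) = sigmoid(ability - difficulty), by simple gradient ascent.
ability = np.zeros(n_students)
difficulty = np.zeros(n_items)
for _ in range(500):
    p = 1 / (1 + np.exp(-(ability[:, None] - difficulty[None, :])))
    resid = x - p
    ability += 0.5 * resid.mean(axis=1)      # students doing better than expected move up
    difficulty -= 0.5 * resid.mean(axis=0)   # items attained more often than expected get easier
    difficulty -= difficulty.mean()          # fix the arbitrary origin of the scale

# Relative difficulty of item 0 versus item 1, analogous to the computer science
# minus comparison subject differences plotted in Figures 9 to 11.
print("estimated difficulties:", difficulty.round(2))
print("relative difficulty (item 0 - item 1):", round(difficulty[0] - difficulty[1], 2))
```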

The second approach, Kelly’s method, provides an alternative difficulty estimate. It involves calculating the grade ‘adjustment’ required in each subject for the average difference between each student’s grade in that subject and the average of their other subject grades to be 0 (for more details of the methodology see Coe et al, 2008). This estimate can be loosely interpreted as the mean difference in difficulty for each subject from the average subject. The adjustment is calculated on the A*-G (8 to 1) grade scale. Therefore, for this analysis 9 to 1 grades were converted to an 8 to 1 scale, based on the estimated probability a student gaining each numbered grade would have received each lettered grade (see Table 7).

Table 7. Details of conversion of 9 to 1 grades to an 8 to 1 scale for analysis.

Grade on 9 to 1 scale Grade converted to 8 to 1 scale
9 8
8 7.25
7 7
6 6
5 5.5
4 5
3 3.75
2 2.5
1 1.25
0 0

Again, instead of providing the absolute score we present the relative difference in scores between computer science and physics, English language and maths to identify if the gap between computer science and these subjects has changed over time.
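
A minimal sketch of the iterative adjustment loosely described above is given below, using synthetic results. Subject names, grade offsets and values are invented; this is not the exact implementation used for the analysis reported here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_students = 500
subjects = ["comp_sci", "maths", "english", "physics"]
offset = {"comp_sci": -0.3, "maths": 0.1, "english": 0.0, "physics": 0.2}

# Long-format synthetic results: one row per (student, subject, grade on the 8-to-1 scale).
ability = rng.normal(5.5, 1.2, n_students)
rows = []
for i, a in enumerate(ability):
    for s in subjects:
        grade = np.clip(a + offset[s] + rng.normal(0, 0.8), 1, 8)
        rows.append({"student": i, "subject": s, "grade": grade})
df = pd.DataFrame(rows)

# Kelly's method: find an additive adjustment per subject so that, on average,
# a student's adjusted grade in each subject equals the mean of their adjusted
# grades in their other subjects.
adjustment = {s: 0.0 for s in subjects}
for _ in range(50):
    df["adjusted"] = df["grade"] + df["subject"].map(adjustment)
    totals = df.groupby("student")["adjusted"].transform("sum")
    counts = df.groupby("student")["adjusted"].transform("count")
    mean_other = (totals - df["adjusted"]) / (counts - 1)  # mean of each student's other subjects
    gap = (mean_other - df["adjusted"]).groupby(df["subject"]).mean()
    adjustment = {s: adjustment[s] + gap[s] for s in subjects}

# A positive adjustment indicates a subject in which students score lower than
# in their other subjects, that is, one that appears relatively more difficult.
print({s: round(v, 2) for s, v in adjustment.items()})
```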

Rasch difficulty

Figure 9. Relative difficulty of GCSE computer science compared with other subjects over time – A/7 grade.

Figure 10. Relative difficulty of GCSE computer science compared with other subjects over time – C/4 grade.

Figure 11. Relative difficulty of GCSE computer science compared with other subjects over time – G/1 grade.

Figure 9, Figure 10 and Figure 11 show the relative difficulty of English language, physics, and maths compared with computer science based on the statistical definition described above. An increase in the relative difficulty score indicates that, based on these measures, computer science has become more difficult relative to the comparison subject.

As can be seen from Figure 9 and Figure 10, there has been a general upward trend in the difficulty of computer science over time relative to other subjects at both grade A/7 and C/4. At grade A/7 this is an increase of between 0.19 and 0.40 on the Rasch scale from 2014 to 2019, and at grade C/4 this is an increase of between 0.16 and 0.30 over the same time period. The absolute change in the score for computer science over this period is 0.1 at grade C/4 and 0.29 at grade A/7. Converting these scores to grades is challenging but, on average, a Rasch score value of 1.4 equates to approximately 1 grade on the 9 to 1 scale in each year across subjects, so the above represents an increase in difficulty of somewhere between 0.12 and 0.21 of a grade between 2014 and 2019 at grade C/4 and 0.14 and 0.28 at grade A/7.

At grade G/1 the results are mixed (Figure 11), with some evidence suggesting a reduction in the difficulty of GCSE computer science relative to other subjects. Using a similar procedure to convert the Rasch score to grades, this would suggest a reduction in difficulty at grade G/1 of 0.18 grades on average between 2014 and 2019, although this varies between -1.5 grades (compared to physics) and +0.55 grades (compared to maths), depending on the comparison subject.

Kelly’s method

Figure 12. The relative difference between students’ grades in computer science and their grades in other subjects.

The Kelly’s method analysis indicates that the difference in difficulty between computer science and the other 3 subjects included here has increased over time, particularly between 2015 and 2017 (Figure 12). The analysis estimates that between 2014 and 2019, students received a grade between 0.15 and 0.24 lower in computer science on an A* to G scale compared with the other subjects. Following a simple proportional scaling to the 9 to 1 scale, this equates to between 0.17 and 0.27 grades, with an average adjustment of 0.18 across all other GCSE subjects included in the analysis in each year.

Overall, both of the above methods indicate that students have received increasingly lower outcomes in GCSE computer science compared with other GCSE subjects over time. It is also worth reiterating that we are not focused here on the absolute difference in scores between the different subjects, which, as discussed previously, can arise for a number of different reasons, but on the relative change over time. These relative changes could indicate a change in standards, representing an increase in the difficulty of GCSE computer science over time. However, this relative change in subject outcomes could also be due to other factors which could legitimately result in a change in outcomes in different subjects, such as students’ preparedness for the assessments, which cannot be controlled for by this method.

Strand 1. Analysis 4. Progression analysis

One of the stated purposes of GCSEs is to prepare students for further study. The aim of this analysis is to identify whether the relationship between GCSE and A level results in computer science has changed over time. If we assume that the standard of the A level has not changed, then the relationship between GCSE results and A level results should provide an indication of whether the value of a GCSE grade in indicating likely success at A level has changed over time. That is, do students with a particular grade in the GCSE show greater attainment in computer science in some years than in others, leading to better (or worse) A level outcomes?

If the GCSE has become more difficult then we might expect to see students with the same GCSE grade performing better in the A level over time, as they have higher underlying attainment in the subject than students receiving the same grade in previous years. Conversely, we may expect that students receiving the same A level grade may have, on average, lower GCSE outcomes over time.

It is worth reiterating, however, that a key assumption of this analysis is that grading standards have not changed in the A level through time – an assumption that we do not test here. There may also be an interaction with centre entry policies for A level courses, which cannot be controlled for. However, unlike the GCSE, A level computer science is not a new subject, and there has been no systematic change in the qualification during the period of interest that would suggest that this assumption may be problematic.

A level data was taken from the NPD for the years 2014 to 2019 and filtered to 18-year-old students taking computer science. This was then matched to students’ GCSE computer science results from 2 years earlier using their unique student ID.

The proportion of GCSE computer science students who went on to take the A level in the same subject was calculated in each year. The inverse was also calculated, that is, what proportion of A level students had previously taken the GCSE.

For the purposes of this analysis students’ A level grades were converted to numerical values with grades A* to E converted to a numeric 6 to 1 scale, respectively. For those that did take the A level, in each year the mean A level grade was calculated for students with different GCSE grades. We also calculated the proportion of students receiving at least a grade C at A level for students with each GCSE grade. The mean grade students received in the GCSE was then calculated for students receiving different A level grades.

For these analyses, students who took their GCSE at a centre offering the qualification for the first time were removed. We only include data from students taking the GCSE until 2017, as after this, students would have received A level grades based on teacher judgements due to the cancellation of exams during the pandemic.
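
As a simple illustration of these descriptive measures, the sketch below converts A level grades to points and summarises them by GCSE grade and year on a toy matched dataset. All column names and values are illustrative assumptions, not NPD fields.

```python
import pandas as pd

# Toy matched dataset: one row per student who took both qualifications.
matched = pd.DataFrame({
    "gcse_year":    [2016, 2016, 2017, 2017, 2017],
    "gcse_grade":   ["A",  "B",  "A",  "B",  "C"],
    "alevel_grade": ["B",  "D",  "A",  "C",  "D"],
})

# A* to E converted to a numeric 6 to 1 scale, as described above.
alevel_points = {"A*": 6, "A": 5, "B": 4, "C": 3, "D": 2, "E": 1}
matched["alevel_points"] = matched["alevel_grade"].map(alevel_points)
matched["at_least_C"] = matched["alevel_points"] >= 3

summary = (matched
           .groupby(["gcse_year", "gcse_grade"])
           .agg(mean_alevel=("alevel_points", "mean"),
                prop_at_least_C=("at_least_C", "mean"),
                n=("alevel_points", "size")))
print(summary)  # cells with small n would be suppressed in the real analysis
```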

Finally, we created a linear model to examine the relationship between GCSE and A level computer science grade over time. The model took the form below:

y_ij = β0 + β1·x1_ij + β2·x2_ij + β3·X_ij + u_j + ε_ij

In this model, the dependent variable was A level grade (y), the key predictor was year (x1), and students’ GCSE computer science grade was included as a covariate (x2). The model also included a series of control variables (X) for KS2 prior attainment, ethnicity, gender, SEN status, FSM eligibility, language spoken and centre type. A random effect (u_j) was included to take into account the clustering of students within centres. This controls for the fact that student outcomes within the same centre are not independent of each other and therefore prevents the overestimation of model effects.

If the estimated A level grade from the model increases over time (while keeping GCSE computer science attainment stable), this would indicate that students who attain a similar GCSE grade are performing better at A level. For this analysis we only include the 4 years of students sitting their GCSEs between 2014 and 2017, due to the small numbers of students available for analysis prior to 2014.
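
A hedged sketch of this kind of specification is shown below, using statsmodels’ mixed linear model on synthetic data. Variable names are assumptions and the control set is abbreviated; it is not the exact model fitted for this report.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 2000
df = pd.DataFrame({
    "centre": rng.integers(0, 40, n),
    "gcse_year": rng.choice([2014, 2015, 2016, 2017], n),
    "gcse_points": rng.integers(4, 9, n),        # GCSE grade on an 8-point scale
    "ks2_score": rng.normal(50, 10, n),
    "female": rng.integers(0, 2, n),
})
centre_effect = rng.normal(0, 0.3, 40)[df["centre"]]
df["alevel_points"] = (0.6 * df["gcse_points"] + 0.02 * (df["ks2_score"] - 50)
                       + 0.05 * (df["gcse_year"] - 2014) + centre_effect
                       + rng.normal(0, 0.8, n)).clip(1, 6)

# A level grade regressed on year (key predictor), GCSE grade and controls,
# with a random intercept for centre to account for clustering.
model = smf.mixedlm("alevel_points ~ C(gcse_year) + gcse_points + ks2_score + female",
                    data=df, groups=df["centre"])
result = model.fit()
print(result.summary())  # positive year coefficients = higher A level grade for a given GCSE grade
```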

Table 8.  Percentage of students that took A level computer science who had previously completed GCSE computer science.

Year of sitting A level N took A level N previously took GCSE Percentage previously took GCSE
2014 3,781 234 6.2%
2015 4,883 511 10.5%
2016 5,473 1,546 28.2%
2017 7,289 3,776 51.8%
2018 9,259 6,240 67.4%
2019 10,076 7,287 72.3%

Table 9. Percentage of students who took GCSE computer science who went on to do A level computer science.

Year of sitting GCSE N took GCSE N subsequently took A level Percent subsequently took A level
2012 1,745 234 13.4%
2013 4,179 511 12.2%
2014 16,011 1,546 9.7%
2015 33,773 3,776 11.2%
2016 61,751 6,240 10.1%
2017 67,374 7,287 10.8%

As shown in Table 8, the proportion of students taking A level computer science who previously completed the GCSE has increased over time, from 6.2% in 2014 to 72.3% in 2019. This may reflect the increasing entry size to GCSE computer science over this period. The inverse is not true, however: the proportion of students who took GCSE computer science going on to do the A level has remained broadly stable (Table 9).

Table 10. Mean A level score for students receiving different GCSE grades over time. Values from cells with fewer than 100 students have been removed (appears as ‘n/a’). Year indicates the year students took the GCSE.

GCSE Grade 2012 2013 2014 2015 2016 2017
A* n/a n/a 4.52 4.60 4.62 4.60
A n/a 3.57 3.52 3.46 3.42 3.53
B n/a n/a 2.28 2.46 2.45 2.52
C n/a n/a n/a 1.77 1.74 1.81
D n/a n/a n/a n/a n/a 1.65
E n/a n/a n/a n/a n/a n/a

Table 11. Proportion of students receiving at least a C at A level for students receiving different GCSE grades over time. Values from cells with fewer than 100 students have been removed (appears as ‘n/a’). Year indicates the year students took the GCSE.

GCSE Grade 2012 2013 2014 2015 2016 2017
A* n/a n/a 0.942 0.963 0.965 0.960
A n/a 0.795 0.792 0.788 0.780 0.811
B n/a n/a 0.429 0.476 0.474 0.509
C n/a n/a n/a 0.268 0.252 0.261
D n/a n/a n/a n/a n/a 0.284
E n/a n/a n/a n/a n/a n/a

Table 10 and Table 11 present the mean A level grade achieved and proportion of students achieving A level grade C or above both differentiated by GCSE grade achieved. These analyses do not show any strong patterns for a change in the relationship between GCSE and A level outcomes over time. There is some slight indication that those who received grade A, B or C at GCSE in 2017 may have had higher attainment in computer science than those who received an A, B or C in 2016. This is because, as shown in Table 10, they gained a slightly higher mean A level grade and their probability of attaining at least a C at A level increased. However, between 2013 and 2016 students who gained an A at GCSE received lower mean A level grades each year, which may suggest higher performing students in the GCSE actually had lower attainment over time.

Table 12. Mean GCSE grade of students receiving different grades at A level. Values from cells with fewer than 100 students have been removed (appears as ‘n/a’). Year indicates the year students took their GCSEs.

A level Grade 2012 2013 2014 2015 2016 2017
A* n/a n/a n/a n/a 7.80 7.74
A n/a n/a 7.43 7.43 7.42 7.40
B n/a n/a 7.09 6.97 6.96 6.91
C n/a n/a 6.64 6.51 6.58 6.43
D n/a n/a 6.37 6.20 6.18 6.06
E n/a n/a n/a 5.76 5.97 5.80

Table 12 shows the mean GCSE grade of students achieving each grade at A level. Here, there is some indication that students receiving higher A level grades in 2017 had slightly lower mean GCSE scores than in previous years, across all grades. This could indicate that students receiving these grades had higher computer science attainment than in previous years. For example, between 2014 and 2017 the mean GCSE grade of students attaining a B at A level dropped from 7.09 (just over an A at GCSE) to 6.91 (a high B at GCSE). This may suggest that, on average, students with lower GCSE grades were displaying the same level of ability in computer science, as represented by their A level grade, as those who achieved slightly higher GCSE grades in previous years.

Figure 13 shows the high-level output from the linear model. This shows changes in the A level grades achieved by students, controlling for differences in KS2 attainment, centre type and student background characteristics between years. Full model outputs can be seen in appendix A. The results of the linear model showed some indication that students in 2017 who achieved the same GCSE computer science grade as those in 2014 received a higher A level grade, by approximately 0.1 of a grade (β=0.107, p<0.05). Put another way, students in 2014 would need a grade 0.13 higher in GCSE computer science (on an A* to G scale) to receive the same A level grade as similar students in 2017. Proportionally, this converts to around 0.15 grades on a 9 to 1 scale. However, Figure 13 also indicates that the effect is not clearly linear after controlling for other factors. Beyond the difference described between 2017 and 2014, and taking into account the uncertainty in the model, there is not a clear trend over time.

Figure 13. Marginal effects from linear regression model for reference group students by year.

The aim of the analysis presented in this section was to identify whether the relationship between students’ performance in GCSE computer science and their success in A level computer science has changed over time. To summarise, the above analysis shows some evidence that students with a similar GCSE grade, and similar other characteristics, performed better at A level over time. This might indicate that these students were more able at computer science, suggesting the GCSE standard may have become more challenging; however, these effects are subtle. As discussed above, this interpretation relies on the assumption that the standard of the A level has not changed. These results could also reflect changes in centres’ entry policies for their A level courses.

Strand 1. Analysis 5. Simulated predictions

As discussed in the introduction, prior-attainment based statistical predictions are regularly used to support the setting of grade boundaries in each exam series, alongside expert judgement and other technical evidence. Details of this approach are described in the section ‘Operationalising the setting and maintenance of standards’. A key assumption of this method is that the cohort of students in the current year is similar to the cohort that took the qualification in the reference year in all ways that would affect their outcomes, except their prior attainment distribution, and therefore that we can reasonably expect the relationship between prior attainment and outcomes to be the same, on average.

The evidence discussed in Strand 1 Analysis 2 described the circumstances that led to the change in the value-added relationship in GCSE computer science over time. The aim of this analysis is to quantify the impact of those changes while taking account of the change in the cohort’s prior (or concurrent) attainment distribution.

For this analysis we generate predictions based on different reference years. We generate 2 sets of predictions, accounting for students’ prior attainment (KS2 score) and concurrent attainment (mean GCSE score) respectively. If there are large differences in the predictions generated depending on the reference year this may indicate that standards have changed between years. However, it could also indicate that other factors have changed that would affect outcomes, such as the makeup of the cohort or teaching time dedicated to the subject.

One additional factor that we can attempt to control for here is how familiar teachers are with the qualification. As discussed previously, outcomes for centres entering students for the first time may be lower if students at those centres are less well prepared for the assessments. We therefore look at the impact of excluding these ‘new’ centres from the predictions generated, as these centres may have a different value-added relationship.

We calculated predictions using a range of reference years (2012 to 2018) to predict outcomes in 2019, but otherwise following the same methodology as would be typically used by AOs.

For 2015, around 20% of the cohort were missing KS2 prior attainment data (due to boycotts of the KS2 assessments in 2010), meaning that using this group as a reference for prior-attainment based predictions may be less reliable. Therefore, 2 sets of predictions were produced. The first set of predictions included all 16-year-old students with prior attainment data, excluding students at selective and independent centres. This is typically the approach taken when GCSE predictions are produced in practice, since students at selective and independent centres have a different relationship between prior attainment and GCSE outcomes from that at other centres. The second set of predictions was produced using concurrent attainment (that is, mean GCSE) rather than prior attainment, and included all 16-year-old students who had taken at least 3 GCSEs. The second set of predictions is therefore based on the relationship between a student’s mean GCSE grade in the other subjects that they took concurrently and their grade in computer science. For this analysis, students at all centre types were included.

A normalised KS2 prior attainment score was calculated for each student, replicating the process for calculating prior-attainment-based predictions used in awarding. A similar process was followed to produce a ‘concurrent attainment’ score based on each student’s mean GCSE score (converted to an 8 to 1 scale) across all of the other subjects they studied at GCSE.

For each year, normalised prior or concurrent attainment scores were divided into deciles based on results for the whole GCSE cohort. For each reference year, the proportion of students in each decile attaining each grade in GCSE computer science was calculated in an outcome matrix. For 2019, we then calculated how many students fell into each attainment decile. The outcome matrix was then used to predict how many of the students in each decile in 2019 would receive each grade, based on the proportions in the reference year. The number of students predicted to receive each grade was then summed over all deciles and used to calculate a cumulative percentage predicted outcome at grades A/7, C/4 and G/1.
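
The sketch below illustrates the decile-based prediction mechanics described above on synthetic data. For simplicity, the deciles here are defined on the reference cohort and grades are coded 0 (lowest) to 8 (highest); all names and figures are illustrative only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)

def make_cohort(n, shift=0.0):
    attainment = rng.normal(50 + shift, 10, n)  # normalised attainment score
    grade = np.clip((attainment - 20) / 8 + rng.normal(0, 1.2, n), 0, 8).round().astype(int)
    return pd.DataFrame({"attainment": attainment, "grade": grade})

ref = make_cohort(20_000)            # reference-year cohort
cur = make_cohort(25_000, shift=-2)  # current (2019) cohort, slightly weaker here

# Decile boundaries from the reference cohort's attainment scores.
bins = np.quantile(ref["attainment"], np.linspace(0, 1, 11))
ref["decile"] = pd.cut(ref["attainment"], bins, labels=False, include_lowest=True)
cur["decile"] = pd.cut(cur["attainment"].clip(bins[0], bins[-1]), bins,
                       labels=False, include_lowest=True)

# Outcome matrix: proportion of reference-year students at each grade, by decile.
outcome_matrix = pd.crosstab(ref["decile"], ref["grade"], normalize="index")

# Apply the reference-year grade distribution to the current cohort's decile counts.
decile_counts = cur["decile"].value_counts().sort_index()
predicted_counts = outcome_matrix.mul(decile_counts, axis=0).sum()

# Cumulative percentage predicted at each grade or above.
predicted_cum_pct = predicted_counts.sort_index(ascending=False).cumsum() / len(cur) * 100
print(predicted_cum_pct.round(1))
```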

Finally, based on the results of the other analyses and the differences observed in the patterns between new and existing centres, a set of predictions was produced excluding ‘new’ centres in both the reference year and the current year (2019) for each prediction. New centres were defined as those with entries to the qualification for the first time in the year being analysed.

Prior attainment based predictions

Table 13. Simulated predictions for 2019 based on different reference years – matched candidates only, excluding selective and independent centres. ‘Difference’ indicates the percentage point difference between each prediction and actual outcomes in 2019.

Reference Year Matched Entry Cumulative % Predicted A/7 Cumulative % Predicted C/4 Cumulative % Predicted G/1 Difference A/7 Difference C/4 Difference G/1
2013 3,210 17.3 59.8 96.3 -1.0 0.0 -0.5
2014 13,100 18.4 61.8 97.0 0.2 2.0 0.3
2015 20,869 17.6 62.0 97.4 -0.7 2.1 0.6
2016 53,297 18.1 59.0 96.1 -0.1 -0.8 -0.6
2017 58,042 17.9 58.5 96.0 -0.4 -1.4 -0.7
2018 59,718 18.1 59.2 96.7 -0.1 -0.7 -0.1
2019 62,287 18.3 59.8 96.7 0.0 0.0 0.0

Table 14. Simulated predictions for 2019 based on different reference years – matched students excluding students at new centres and selective and independent centres. ‘Difference’ indicates the percentage point difference between each prediction and actual outcomes in 2019.

Reference Year Matched Entry Cumulative % Predicted A/7 Cumulative % Predicted C/4 Cumulative % Predicted G/1 Difference A/7 Difference C/4 Difference G/1
2013 1,116 21.5 65.7 97.8 3.0 5.5 1.0
2014 3,601 22.7 66.7 97.3 4.2 6.6 0.5
2015 10,450 19.3 64.8 97.7 0.8 4.6 0.9
2016 33,555 20.1 62.0 96.9 1.7 1.9 0.1
2017 50,693 18.4 59.1 96.1 -0.1 -1.1 -0.7
2018 56,031 18.2 59.4 96.7 -0.2 -0.8 -0.1
2019 59,047 18.5 60.2 96.8 0.0 0.0 0.0

The above analyses indicate that, when including all students, prior-attainment-based predictions based on 2014 outcomes would suggest outcomes around 2pp higher at grade C/4 than actual outcomes in 2019 (Table 13). The differences at grades A/7 and G/1 were much smaller and less consistent between years. When students at centres which had never offered GCSE computer science before were excluded, the size of the difference increased in most years (Table 14). When 2014 was used as the reference year, predictions were almost 7pp higher at grade C/4 and 4pp higher at A/7 than actual outcomes in 2019. This suggests that students at new centres tend to receive, on average, lower GCSE results relative to their prior attainment, and that excluding them would have led to higher predictions for the remaining centres.

Concurrent attainment based predictions

Table 15. Simulated predictions for 2019 based on different reference years – all students. ‘Difference’ indicates the percentage point difference between each prediction and actual outcomes in 2019.

Reference Year Matched Entry Cumulative % Predicted A/7 Cumulative % Predicted C/4 Cumulative % Predicted G/1 Difference A/7 Difference C/4 Difference G/1
2013 3,756 23.3 65.9 97.1 1.9 3.3 0.3
2014 15,092 23.5 65.3 97.1 2.1 2.7 0.3
2015 31,928 22.2 65.3 97.3 0.8 2.7 0.5
2016 59,334 22.5 62.5 96.4 1.0 -0.1 -0.4
2017 65,897 21.9 62.0 96.3 0.4 -0.6 -0.5
2018 68,966 21.7 62.9 96.9 0.2 0.3 0.1
2019 74,530 21.5 62.6 96.8 0.0 0.0 0.0

Table 16. Simulated predictions for 2019 based on different reference years – excluding students at new centres. ‘Difference’ indicates the percentage point difference between each prediction and actual outcomes in 2019.

Reference Year Matched Entry Cumulative % Predicted A/7 Cumulative % Predicted C/4 Cumulative % Predicted G/1 Difference A/7 Difference C/4 Difference G/1
2013 1,530 26.3 70.0 97.5 4.8 7.2 0.6
2014 4,612 27.6 70.1 97.4 6.1 7.3 0.5
2015 16,309 24.3 68.0 97.8 2.8 5.2 0.9
2016 37,995 24.2 65.0 97.1 2.7 2.2 0.3
2017 57,537 22.1 62.2 96.4 0.6 -0.6 -0.4
2018 64,507 21.5 62.9 96.9 0.0 0.1 0.0
2019 69,949 21.5 62.8 96.8 0.0 0.0 0.0

Concurrent-attainment-based predictions show a similar pattern to prior-attainment-based predictions, but with slightly higher predictions than those generated using prior attainment. When including all centres, predictions for 2019 based on 2014 outcomes were around 3pp higher at grade C/4 and 2pp higher at grade A/7 than actual outcomes (Table 15). After new centres had been removed, the prediction was around 7pp higher than actual outcomes at C/4 and around 6pp higher at grade A/7 (Table 16).

These analyses suggest that predictions based on 2014 outcomes would have been higher than actual outcomes in 2019, regardless of whether the predictions are based on prior attainment or concurrent attainment. This indicates that the value-added relationship has weakened for students taking GCSE computer science over time; that is, the same prior or concurrent attainment is associated with lower grades in 2019 than in 2014. Further, the size of this effect increased when new centres were removed, suggesting that students at new centres tended to perform less well than students at other centres with similar prior or concurrent attainment.

In practice, predictions are only used to guide awards, and it therefore cannot be assumed that a different prediction would have led to different outcomes, particularly in years where examiners’ recommended grade boundaries already resulted in outcomes below predictions. However, it is not possible to know how different statistical evidence might have influenced the final judgements of examiners in a particular year.

It is also worth considering whether the cohort in each reference year was similar enough to that in the ‘current’ year (2019) to expect a similar value-added relationship. The descriptive analysis presented previously indicated that there have been a large number of changes to the cohort since 2014. These changes could have led to legitimate differences in the value-added relationship over time. The reference year for predictions needs to be carefully considered to ensure the cohort is representative of the current year. A larger number of years between the reference year and the current year results in a higher likelihood that the cohort, and therefore outcomes, may have changed for legitimate reasons.

Disentangling these legitimate changes in outcomes from illegitimate ones is challenging. Therefore, in the next section we carry out some more sophisticated modelling aiming to control for some of these potentially confounding effects.

Strand 1. Analysis 6. Modelling of outcomes over time

In this section we present a series of models of outcomes in GCSE computer science in each year which, as in the previous analysis, control for concurrent or prior attainment, but also for a variety of other student and centre characteristics which may be related to outcomes. The aim of this modelling is to disentangle some of the factors which may be related to changes in outcomes over time, but which are not appropriate to factor into the statistical predictions, in order to identify whether the changes in outcomes can be reasonably accounted for by these factors.

Primarily, we controlled for students’ prior or concurrent attainment; however, we also controlled for other student characteristics which may be related to outcomes. We calculated both a model of GCSE grade on a linear scale and models of the probability that students would receive at least a grade A/7, grade C/4 or grade G/1. If the analysis indicates that outcomes differed between years after controlling for other variables which might be related to outcomes, this may suggest that standards have changed between years.

However, as discussed previously, there may be other factors influencing outcomes over time which do not directly relate to observable student characteristics. We therefore aim to control for 2 additional factors which could be related to outcomes. The first is centres’ experience of delivering the qualification, which we control for by removing centres entering students for only the first or second year from the analysis. The second is that outcomes could differ if there are qualitative differences between centres entering in different years; we therefore fit some further models including only the same set of centres in each year. If, in these models, we still see a change in outcomes over time, this suggests that there has been a change in standards which cannot easily be explained by other factors.

A numeric grade variable converting both A* to G grades and 9 to 1 grades to an 8-point scale was created (see Table 7 in Strand 1 Analysis 3), along with binary variables indicating whether each student received at least a grade G/1, C/4 or A/7. A variable was also created indicating how long each centre had been delivering GCSE computer science, calculated as the number of years since a student at that centre had first received a grade. This was then converted into a binary variable (new/not new centres). For this analysis a slightly more conservative approach than in the previous analyses was used, and new centres were classed as those entering students for the first or second year.
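
The sketch below shows one way these derived variables could be constructed from a results table. The columns and values are illustrative assumptions rather than the actual NPD fields used.

```python
import pandas as pd

# Toy results table: one row per (centre, year, student grade on the 8-point scale).
results = pd.DataFrame({
    "centre":    [101, 101, 101, 202, 202, 303],
    "exam_year": [2014, 2015, 2016, 2015, 2016, 2016],
    "grade_8pt": [6.0, 5.0, 7.0, 3.75, 5.0, 2.5],
})

# Binary attainment flags at the key grades (on the 8-point scale, G = 1, C = 5, A = 7).
results["at_least_G1"] = results["grade_8pt"] >= 1
results["at_least_C4"] = results["grade_8pt"] >= 5
results["at_least_A7"] = results["grade_8pt"] >= 7

# Years of delivery: counted from the first year in which any of the centre's
# students received a grade in the qualification.
first_entry = results.groupby("centre")["exam_year"].transform("min")
results["years_delivering"] = results["exam_year"] - first_entry + 1

# 'New' centres on the more conservative definition: first or second year of entry.
results["new_centre"] = results["years_delivering"] <= 2
print(results)
```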

The primary models were linear models with students’ numeric GCSE grade as the outcome variable, alongside a series of logistic regression models evaluating the probability of a student receiving at least each grade – G/1, C/4 and A/7. All models included ‘Year’ as the key predictor. Models were developed using both prior attainment (standardised KS2 score) and concurrent attainment (standardised mean GCSE). These variables were trialled both as continuous variables and as categorical variables (that is, attainment deciles), all producing similar results; however, the continuous models resulted in better model fit. Prior attainment data was missing for around 20% of students in 2015 due to boycotts of KS2 assessments 5 years earlier. We therefore focus on the results of the concurrent attainment models in the main text and figures (see appendix B for full model results).

All of the models controlled for other student characteristics, namely: gender (male/female), SEN status (SEN, no SEN, missing), FSM eligibility (yes, no, missing), primary language spoken (English, other, missing), ethnic group (Asian, Black, Chinese, Mixed, White, other, missing) and centre type (college, selective, independent, mainstream, missing). A random effect of centre number was included in all models to control for centre level clustering. Models only included 16-year-old students with a valid grade, from England only, with prior or concurrent attainment data available (depending on the model). See Table 17 for a summary of the sample included in the analysis.

Table 17. Summary of sample used for modelling of outcomes over time.

Year N students - Prior attainment models (all centres) N students - Concurrent attainment models (all centres)
2012 1,583 1,614
2013 3,876 3,756
2014 14,768 15,092
2015 23,322 31,928
2016 57,163 59,334
2017 62,321 65,897
2018 65,167 68,966
2019 68,814 74,530

The majority of models indicated that ‘Year’ had a statistically significant effect on the probability of students attaining key grades, except for the models at grade G/1 (see Table 18). Adding ‘Year’ to the models also improved model fit; however, the additional explanatory power was relatively small (an increase in R²/pseudo-R² of between 0.1pp and 0.9pp). This likely reflects the fact that the main predictor of a student’s outcome in an exam is their own ability, and other variables have only a weak relationship with outcomes in comparison.

For each of the models, we estimate what the difference in the mean grade, or in the probability of receiving key grades, predicted by the model would be for the full cohort of students included in the model in 2019, if the estimated coefficient for 2014 were applied instead. This estimate takes into account the effect of changes in the distributions of different subgroups of students, prior attainment and centre types, and so gives an estimate of the size of the ‘Year’ effect on actual outcomes (see Table 18).
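
A minimal sketch of this counterfactual calculation is shown below, using a plain OLS model on synthetic data for clarity (the report’s models additionally include a centre random effect). Variable names and coefficients are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 5000
df = pd.DataFrame({
    "year": rng.choice([2014, 2019], n),
    "concurrent": rng.normal(0, 1, n),   # standardised concurrent attainment
    "female": rng.integers(0, 2, n),
})
df["grade"] = (5 + 1.2 * df["concurrent"] + 0.1 * df["female"]
               - 0.3 * (df["year"] == 2019) + rng.normal(0, 1, n))

fit = smf.ols("grade ~ C(year) + concurrent + female", data=df).fit()

# Score the actual 2019 cohort twice: once with its own year effect and once
# with the year recoded to 2014, then compare the predicted means.
cohort_2019 = df[df["year"] == 2019].copy()
as_2014 = cohort_2019.assign(year=2014)

effect = fit.predict(cohort_2019).mean() - fit.predict(as_2014).mean()
print(round(effect, 3))  # estimated 'Year' effect on mean grade for the 2019 cohort
```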

Table 18. Summary of Year model effects from various different models using concurrent attainment.

Model Restriction Year-2019 coefficient [Ref 2014] (SE) Estimated difference in outcomes from 2014 predicted for 2019 cohort
Linear All centres -0.12 (0.01)*** -0.11
Linear Excluding new centres -0.41 (0.03)*** -0.41
Linear 2014 centres only -0.31 (0.03)*** -0.31
Linear 2015 centres only -0.33 (0.02)*** -0.33
A/7 Grade All centres -0.02 (0.03) -0.17pp
A/7 Grade Excluding new centres -0.48 (0.08)*** -3.47pp
A/7 Grade 2014 centres only -0.40 (0.10)*** -4.52pp
A/7 Grade 2015 centres only -0.30 (0.06)*** -3.40pp
C/4 Grade All centres -0.06 (0.03)* -0.76pp
C/4 Grade Excluding new centres -0.77 (0.10)*** -8.72pp
C/4 Grade 2014 centres only -0.39 (0.11)*** -4.43pp
C/4 Grade 2015 centres only -0.57 (0.07)*** -5.60pp
G/1 grade All centres +0.09 (0.07) -0.15pp
G/1 grade Excluding new centres -0.92 (0.33)** -1.17pp
G/1 grade 2014 centres only -0.27 (0.39) -0.26pp
G/1 grade 2015 centres only -0.22 (0.20) -0.16pp

Note. Statistical significance is indicated by p<0.001 (***), p<0.01 (**), p<0.05 (*).

It is notable that in all cases excluding new centres increases the estimated size of the ‘Year’ effect. This suggests that including these centres may have masked a potential change in standards. The sections below discuss the different models in detail. Figures show the predicted mean grade, or predicted probability of receiving the key grade or above, in each year for students in the reference group (that is, students with an average attainment score who are white, male, not FSM eligible, English speaking, not registered as SEN and attending a mainstream school).

All centre models

We start by looking at the linear models. These models are based on an 8-point grade scale equivalent to A* to G.

Figure 14 shows the output of the model including all 16-year-old students from all centres and gives an indication of the estimated mean grade of similar students in the reference group in each year after controlling for other factors.

Figure 14. Estimated mean grade for students in the reference group for students with average concurrent GCSE attainment in each year. Includes all centres.

Figure 14 shows a clear pattern of declining mean grade between 2015 and 2018, after controlling for student characteristics, centre type and students’ attainment in other GCSEs. Although this effect is relatively small, with an average estimated difference in outcomes of 0.12 grades between 2014 and 2019 for an average-attaining student, it is still a notable change, representing more than 1 in 10 students gaining a grade lower in 2019 than they would have in 2014. However, this model does not account for the effects discussed previously which could affect outcomes: whether centres are new to delivering the qualification, and unmeasured differences between centres entering in different years, such as changes in teaching quality. For the next set of models, we therefore first exclude students at centres that had entered students for fewer than 2 years previously.

Model excluding new centres

Figure 15. Estimated mean grade for students in the reference group for students with average concurrent GCSE attainment in each year. Only includes students at centres offering GCSE computer science for the third year or more.

Figure 15 shows that after excluding ‘new’ centres, the effect of declining outcomes becomes more pronounced. This suggests that the different value-added relationship in these centres, combined with differing numbers of new centres in each year, may have masked a larger shift in standards. This model estimates that this shift results in a difference of mean grade of 0.41 grades (once converted to a 9 to 1 scale) between 2014 and 2019.

However, this model still does not account for potential qualitative differences between centres taking up the qualification in different years, for example differences related to teaching quality or resources. For the final set of models, we therefore only include centres entering students to the qualification in every year included in the model. For the period between 2014 and 2019 this results in only 85 centres being included in the analysis, so we also repeat the analysis for centres entering students every year between 2015 and 2019, which increases the sample to 205 centres.

Models restricted to same set of centres in each year

Figure 16. Estimated mean grade for students in the reference group for students with average concurrent GCSE attainment in each year. Only includes students at centres offering GCSE computer science for the third year or more who entered students in every year 2014 to 2019.

Figure 17. Estimated mean grade for students in the reference group for students with average concurrent GCSE attainment in each year. Only includes students at centres offering GCSE computer science for the third year or more who entered students in every year 2015 to 2019.

Figure 16 and Figure 17 show that, even when the analysis only includes centres who entered students in every year, the estimated mean grade still declines between 2015 and 2017 by around 0.3 grades on average (see Table 18 above). This decline cannot be explained by effects due to centre unfamiliarity, as new centres were excluded from the model, and it also seems unlikely that teaching quality would have consistently declined in this same set of centres over time. There are other factors that might have changed though, such as entry policies for the subject or factors relating to student preparation or motivation during the period. It seems unlikely, however, that these effects would be consistent across centres.

Figure 18 and Figure 19 show outputs from logistic regression models estimating the probability of students attaining the key grades A/7 and C/4 or above. Like the previous model, these models only include centres with entries in all years 2015 to 2019 who first entered students in 2012 or 2013. We have not included the figures for grade G/1 or the models for centres entering students every year between 2014 and 2019 as the sample sizes for these models were small and therefore the models were unreliable.

Figure 18. Estimated probability of attaining an A/7 or above for students in the reference group for students with average concurrent GCSE attainment in each year. Only includes students at centres offering GCSE computer science for the third year or more who entered students in every year 2015 to 2019.

Figure 19. Estimated probability of attaining an C/4 or above for students in the reference group for students with average concurrent GCSE attainment in each year. Only includes students at centres offering GCSE computer science for the third year or more who entered students in every year 2015 to 2019.

The same pattern of declining outcomes can be seen at both grades A/7 and C/4. The models estimate the size of the difference as 3.4pp fewer students attaining an A/7 in 2019 compared with 2015 and 5.6pp fewer students attaining a C/4. Interestingly, the main decrease is slightly later for grade A/7 (occurring between 2016 and 2018), whereas for grade C/4 it occurs between 2015 and 2017.

For grade G/1, the modelling did not consistently indicate a statistically significant difference in the probability of students achieving a grade G/1 between 2019 and previous years (Table 18). A very small number of students receive a grade U, which makes consistently estimating model effects challenging. If an effect exists at this grade it is likely very small; the estimate from the model including centres with entries in every year between 2015 and 2019 suggested outcomes 0.16pp lower in 2019 compared with 2015.

In summary, these analyses indicate that after controlling as far as possible for changes in cohort characteristics, possible teacher unfamiliarity at new centres and changes between groups of centres entering the qualification in different years, there is a trend of lower results over time. This change is focused around the grade C/4 boundary, with a slightly smaller effect at the grade A/7 boundary. In the following section we aim to carry out a similar analysis focusing on centre level outcomes over time for centres with entries in adjacent pairs of years.

Strand 1. Analysis 7. Common centres analysis

Although evidence shows that outcomes do vary for individual centres from year to year, it is expected that, on average, across a large number of centres, outcomes remain fairly stable when standards are maintained, assuming that the cohort of students entering from each centre remains fairly stable. Schools or colleges which offer the same qualification across 2 or more years are referred to as ‘common centres’ as they are centres ‘in common’ across those years. The aim of this section is to consider evidence relating to the maintenance of standards over time based on changes (or an absence of changes) in outcomes for these common centres.

As outlined previously, we are focusing here on whether outcomes have changed, and are not considering other factors such as the quality of student work. This analysis relies on the assumption that centres typically have similar outcomes between years, reflecting a similar level of student performance over time. Where outcomes do change, the changes are expected to be statistically random: some centres' outcomes go up, balanced by others whose outcomes go down. This is built on the premise that students entering a qualification at the same centre will be similar from one year to the next, in terms of things like socioeconomic background, motivation and so on. It also assumes that centre-level factors, such as entry policies, resourcing and teaching quality, will remain stable from one year to the next (at least on average). Therefore, if these assumptions hold, a large and consistent difference across the population of common centres between the common centres predicted outcomes and the percentage of students who actually received each grade in each year may indicate a change in standards.

The simplest approach to common centres analysis is to consider all centres that offer the qualification in a pair of adjacent years and to directly compare the outcomes across the 2 years. For this simple common centres approach we are assuming that the distribution of grades (that is, the proportion of students attaining each grade) remains the same on average across all the centres included.

This approach does not take into account any changes in the entry size from individual centres. For example, if higher performing centres increased their entries, whereas lower performing centres decreased their entries, we might expect overall outcomes to improve. Therefore, we can calculate a weighted common centres analysis by weighting the outcomes from individual centres to take such changes into account. In this case the assumption is made that the distribution of grades remains the same within each centre regardless of changes in entry size.
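
The sketch below illustrates, under assumed column names (centre, year, attained_c4), how the simple and weighted common centres predictions could be computed for one pair of adjacent years. It is a simplified illustration of the two approaches rather than the exact implementation used here.

import pandas as pd

def common_centre_predictions(df, grade_col="attained_c4"):
    # df: one row per student with columns centre, year and attained_c4 (0/1),
    # restricted to centres that entered students in both years of the pair.
    year_1, year_2 = sorted(df["year"].unique())

    # Simple: assume the overall grade distribution is unchanged, so the
    # prediction for year 2 is the pooled year-1 rate.
    simple = df.loc[df["year"] == year_1, grade_col].mean() * 100

    # Weighted: assume each centre's grade distribution is unchanged, but
    # weight by the centre's entry size in year 2.
    rate_year_1 = df[df["year"] == year_1].groupby("centre")[grade_col].mean()
    entry_year_2 = df[df["year"] == year_2].groupby("centre").size()
    weighted = (rate_year_1 * entry_year_2).sum() / entry_year_2.sum() * 100

    actual_year_2 = df.loc[df["year"] == year_2, grade_col].mean() * 100
    return simple, weighted, actual_year_2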

A more complex version of common centres analysis takes account of the change in the prior attainment distribution between pairs of years for the centres included in the analysis. This is achieved by applying the prediction matrix methodology, similar to that used to aid in setting standards in GCSEs and A levels, but only applied to the centres in the sample. This is referred to as a ‘prior attainment adjusted’ common centres analysis.

Given that we are looking historically we can also use a fourth alternative. This approach is similar to the prior attainment adjusted analysis but using concurrent attainment. This ‘concurrent attainment adjusted’ analysis utilises a prediction matrix based on the centres in the sample, but uses mean GCSE score to group students by ability in the place of KS2 prior attainment scores.
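
As an illustration of the attainment-adjusted approaches, the sketch below builds a prediction matrix from the first year of the pair and applies it to the second year's attainment mix. The column names (ks2_score, attained_c4) and the use of simple deciles are assumptions made for the example; the operational analyses follow the standard prediction matrix methodology described above.

import pandas as pd

def attainment_adjusted_prediction(df, grade_col="attained_c4",
                                   attainment_col="ks2_score", n_groups=10):
    # df: one row per student with columns centre, year, attainment and grade
    # outcome, restricted to common centres with valid attainment data.
    year_1, year_2 = sorted(df["year"].unique())

    # Band students into attainment groups (deciles here) defined on the
    # reference (first) year and applied to both years.
    _, cuts = pd.qcut(df.loc[df["year"] == year_1, attainment_col],
                      n_groups, retbins=True, duplicates="drop")
    df = df.assign(group=pd.cut(df[attainment_col], bins=cuts, include_lowest=True))

    # Prediction matrix: proportion attaining the grade in each group in year 1.
    matrix = df[df["year"] == year_1].groupby("group", observed=True)[grade_col].mean()

    # Year 2 attainment mix: proportion of the year-2 entry in each group
    # (students outside the reference-year range are ignored in this sketch).
    mix = df[df["year"] == year_2]["group"].value_counts(normalize=True)

    predicted = (matrix * mix).sum() * 100
    actual = df.loc[df["year"] == year_2, grade_col].mean() * 100
    return predicted, actual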

A common restriction applied to common centres analysis is to only include ‘stable’ common centres. Typically, these are classed as centres with a minimum number of students in each year, and/or those where the number of students has not changed by more than a certain percentage. The rationale is that we might expect outcomes in these centres to be more consistent than in other centres. In practice, the effectiveness of these restrictions in improving prediction accuracy requires careful consideration: previous analysis has shown that the potential gain in accuracy from restricting the sample to more stable centres is often outweighed by the loss of sample size (Benton, 2013). However, we include them here for comparison and to potentially control for centres with large changes in entries in the early years of the qualification, where such changes may have a larger impact.

We applied all of the above methods to produce a range of potential predicted outcomes for each year: simple common centres, weighted common centres, prior attainment adjusted and concurrent attainment adjusted analyses. We also carried out each method with different levels of restriction on the sample of centres included. For the initial analysis, we included all centres with entries in each pair of consecutive years. For the ‘stable’ common centres analysis we carried out 2 versions: the first restricted the sample to centres with a minimum of 10 students in each of the pair of years being analysed and whose entry did not fluctuate by more than 40% between the first and second year; the second, ‘very stable’, analysis was restricted to centres with at least 20 students and whose entry fluctuated by less than 15%.
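
As a minimal illustration of these restrictions, the following sketch filters a table of common centres (with assumed columns n_year_1 and n_year_2 giving each centre's entry in the two years of the pair) to the 'stable' and 'very stable' samples.

import pandas as pd

def stable_centres(entries, min_n, max_change):
    # entries: one row per common centre with columns n_year_1 and n_year_2.
    size_ok = (entries[["n_year_1", "n_year_2"]] >= min_n).all(axis=1)
    change = (entries["n_year_2"] - entries["n_year_1"]).abs() / entries["n_year_1"]
    return entries[size_ok & (change <= max_change)]

# 'Stable': at least 10 students in each year, entry change of no more than 40%.
# stable = stable_centres(entries, min_n=10, max_change=0.40)
# 'Very stable': at least 20 students, entry change of less than 15%.
# very_stable = stable_centres(entries, min_n=20, max_change=0.15)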

For the prior attainment analyses we excluded selective and independent centres, as students at these centres tend to have a different relationship between their KS2 results and GCSE outcomes. Students without prior attainment or concurrent attainment data were also excluded from the respective analyses. For these attainment-adjusted analyses in each pair of years the first year was treated as the reference year. The standard methodology to produce prior-attainment-based predictions was applied here, but only for the subset of centres identified as common across years.

For each of the analyses we compared the common centres predicted outcomes against the actual outcomes at grades A/7, C/4 and G/1. However, pairs of years where fewer than 500 students were retained in the sample in either year have been removed, as the predictions are unlikely to be reliable. Therefore, predictions in most cases cover changes in outcomes during the period from 2014 to 2019, except for the analyses using very stable centres, which cover the period 2015 to 2019, and the prior attainment adjusted analysis with very stable centres, which only covers the period 2016 to 2019.

Similar to the previous analysis, in all methods we removed centres that had entered students for the assessments for fewer than 2 years prior to the ‘reference year’, as this is the period when their outcomes are most likely to change due to sawtooth-like effects.

Figure 20, Figure 21 and Figure 22 show, for each year, the difference between the common centres predicted outcomes and the actual outcomes for the sample of centres included in each analysis. The figures are cumulative over time, to give an indication of the possible cumulative change in standards over time from 2014. A separate line is included for the combination of each method (simple, weighted, prior attainment adjusted and concurrent attainment adjusted) and each sampling approach (all common centres, stable centres and very stable centres). Table 19 then shows a summary across different methods of the difference between the predicted outcomes and actual outcomes in each year.

Figure 20. Cumulative difference between common centres predictions and actual outcomes over time by common centres method. Grade A/7.

Figure 21. Cumulative difference between common centres predictions and actual outcomes over time by common centres method. Grade C/4.

Figure 22. Cumulative difference between common centres predictions and actual outcomes over time by common centres method. Grade G/1.

Table 19. Summary of common centres analyses across methods, showing the mean and median percentage point difference between predictions and outcomes in each year and the cumulative effect, with 95% confidence intervals.

Grade Method 2014 to 2015 2015 to 2016 2016 to 2017 2017 to 2018 2018 to 2019 Cumulative (2014 to 2019)
A/7 -0.5 0.4 -0.8 -1.0 0.4 -1.3
A/7 n/a n/a n/a n/a n/a
A/7 -0.8 0.4 -0.7 -1.0 0.4 -1.9
C/4 -1.7 -3.2 -1.9 0.3 0.9 -4.7
C/4 n/a n/a n/a n/a n/a
C/4 -1.7 -3.2 -2.0 0.2 1.1 -4.7
G/1 -0.2 -0.8 -0.5 0.5 0.1 -0.7
G/1 n/a n/a n/a n/a n/a
G/1 -0.1 -0.6 -0.4 0.5 0.1 -0.7

Although there is some variation across the different common centres methods, they present a similar picture. At grade A/7, outcomes were slightly lower than predicted by the common centres analyses in 2015, 2017 and 2018; however, this was somewhat compensated for by outcomes being above those predicted in 2016 and 2019. Totalling the estimates across all years, outcomes in 2019 were between 0.1pp and 2.5pp lower than we would have expected had centres’ outcomes remained stable over the period studied.

At grade C/4, the average effect across our different methods suggests that outcomes were below predictions in 2015, 2016 and 2017, by around 1.7pp, 3.2pp and 2pp respectively. In 2018 and 2019 outcomes may have been slightly above predictions, although with some variance across methods. This results in outcomes being around 3.3pp to 6.1pp lower in total than would be expected if centres’ outcomes had remained stable between 2014 and 2019.

At grade G/1 the effects are much smaller. Analysis suggests outcomes were again below prediction in 2016 and 2017, although this was mostly counterbalanced by outcomes being above prediction in 2018 and 2019. Overall, this suggests a slight negative effect, with outcomes around 0.7pp below predictions by 2019.

It is worth noting that the size, and even the direction, of these effects varied somewhat depending on the common centres method employed. The figures in Table 19 represent an average across methods, whereas individual methods may suggest a larger or smaller effect. At grade C/4, estimates of the cumulative difference in outcomes in 2019 ranged from almost 11pp below predictions to only 1.9pp below. For grade A/7 there was some variance in the direction of the effect, with estimates ranging from 4.2pp below prediction to 2.6pp above. Methods including all centres typically gave a more negative estimate than those restricted to stable centres.

However, estimates across methods for the cumulative change between 2014 and 2019 were almost uniformly negative. Only the simple common centres method, for both all and stable centres at grade A/7, suggested a positive change in outcomes relative to common centres predictions, estimating outcomes 1.9pp and 2.6pp above predictions respectively.

It is also worth noting that, even though we may expect outcomes to remain stable on average for centres over time, there will always be some fluctuation in outcomes. This is because many students receive the same mark in each year, so outcomes can only move in discrete steps and it may be impossible to reproduce cumulative percentage outcomes exactly, even if this were desirable.

Overall, these findings suggest there may have been a change in standards over time, particularly at grade C/4. Analysis showed a similar pattern to the previous modelling, with a fall in outcomes at grade C/4 between 2015 and 2017, and at grade A/7 between 2016 and 2018. It seems unlikely that the same centres would have consistently worse outcomes in subsequent years of offering the qualification.

One possible reason would be if the centres included in the analysis entered, on average, lower-performing students in subsequent years. However, the prior and concurrent attainment adjusted analyses should have compensated for changes in the general ability of the cohort, yet they still generally showed a decline in outcomes. The number of students entering at these centres also did not consistently increase between pairs of years, which suggests centres were not changing their entry policies and expanding their intake in a way that might have resulted in less able students taking computer science (see Table 20).

Table 20. Number of centres and change in number of students between each pair of years for common centres analysis.

Group Value 2014 to 2015 2015 to 2016 2016 to 2017 2017 to 2018 2018 to 2019
All centres Change in total entry 323 636 -911 -726 1,269
All centres N Centres 85 196 652 1,278 1,994
Stable centres Change in total entry -12 51 -113 -262 179
Stable centres N Centres 40 108 353 666 1,020
Very stable Centres Change in total entry -7 30 -35 1 -18
Very stable Centres N Centres 10 41 147 225 374

Strand 1. Analysis 8. Comparative judgement of script quality

The previous strands of work all took a statistical approach to comparing standards, focusing on measures of outcomes over time. The aim of this strand of work was to take a different approach, instead focusing on the performance standard, that is, the quality of work demonstrated by students to attain the key grades in each year. If the quality of work at the grade boundaries is different between years, this indicates that the performance standard of the qualification has changed.

This strand of research utilised subject experts to judge the quality of students’ work holistically and to compare the quality of students’ work across different assessments over time. To facilitate this, judgements were collected from experts using a paired comparative judgement (CJ) task. Comparative judgement allows us to collect the consensus view of a group of expert judges, while minimising the potential bias introduced by individual judges’ views. The method requires experts to make relative judgements about students’ work, which is arguably psychologically easier, and more intuitive, than making absolute judgements of quality.

Within this study, judges were presented with pairs of examples of students’ work, in the form of exam scripts from different years, and asked which script was of higher quality. Multiple comparisons between different pairs of exam scripts, based on experts’ holistic view of the quality of students’ work, make it possible to construct a scale of ‘perceived quality’. The location of each script on this scale depends both on the proportion of paired comparisons it ‘won’ and ‘lost’ and on the locations of the scripts it was compared with (Bramley, 2007). The greater the distance between 2 scripts on the scale, the higher the probability that the higher-scoring script would be judged as having greater perceived quality than the lower-scoring script (Bramley & Oates, 2011).

In this CJ exercise, experts judged the quality of work in students’ exam scripts at the grade A/7 or C/4 boundary for one exam paper that was broadly comparable pre- and post-reform. Given that the specifications changed, it was not possible to compare exactly the same exam over time; the exam papers that were most similar in terms of content and structure were therefore selected to facilitate comparisons. This means caution is needed when interpreting the findings pre- and post-reform, as there were some changes to exam content and to the overall structure of the qualification. Non-exam assessments were not included as there was no post-reform comparator and, given the size of the assessment materials, they were deemed unsuitable for inclusion in the CJ exercise. Details of the exams included are shown in the materials section below. The aim was to identify whether the performance standard at the grade boundaries in the exam had changed over time.

For this exercise, assessments from the 2 AOs with the largest entries for GCSE computer science were considered (AQA and OCR). Pre-reform (2011 to 2017), each AO’s specification comprised one exam and either one or 2 controlled assessments. Post-reform, following the removal of the non-examination assessment, each AO’s specification comprised 2 exams. To allow comparison pre- and post-reform, only one of these 2 post-reform exams was considered. For both AOs, one of the post-reform exams was similar in content and structure to the pre-reform exams; this exam was therefore used to make the most valid comparison. Details of the assessments are included in Table 21.

Table 21. Details of GCSE computer science assessments pre-reform (2012 to 2017) and post-reform (2018 and 2019). Assessments included in CJ exercise shown in bold.

OCR pre-reform

Exam Computer Systems and Programming 40% of the total GCSE, 1 hour 30 mins, 80 marks
Controlled Assessment Practical Investigation 30% of the total GCSE, ~20 hours, 45 marks
Controlled Assessment Programming Project 30% of the total GCSE, ~20 hours, 45 marks

OCR post-reform

Exam Computer Systems 50% of the total GCSE, 1 hour 30 mins, 80 marks
Exam Computational thinking, algorithms and programming 50% of the total GCSE, 1 hour 30 mins, 80 marks

AQA pre-reform

Exam Computing Fundamentals 40% of the total GCSE, 1 hour 30 mins, 84 marks
Controlled Assessment Practical programming 60% of the total GCSE, ~50 hours, 126 marks

AQA post-reform

Exam Written Assessment 50% of the total GCSE, 1 hour 30 mins, 80 marks
Exam Computational thinking and problem solving 50% of the total GCSE, 1 hour 30 mins, 80 marks

The CJ exercise included student scripts on the grade boundaries from both AOs for each year that the assessments were available between 2011 and 2019. The OCR specification was first available in 2011 and the AQA specification was first available in 2014, resulting in 15 sets of student scripts. Up to 5 students’ scripts were requested from each AO at each of the A/7 and C/4 grade boundaries for each exam paper in each year. For OCR only 3 scripts were available at each boundary in each year, and for AQA 5 scripts were available in most cases (4 in one case). Scripts were requested that, as far as possible, showed a relatively even or typical performance across the paper.

Students’ scripts were anonymised to remove information identifying the student, year and AO. All mark information was also removed from the scripts, and they were each given a unique ID. This ID could be matched to the blank question papers and mark schemes which were also provided to the judges (any information identifying the AO and year was also removed from these).

Sixteen judges were employed to complete the exercise, all of whom had experience of teaching GCSE computer science. Judges were initially recruited from Ofqual’s list of subject matter specialists and additional judges were then recruited by contacting teachers directly. Judges were paid for their time.

Judging procedure

Judges initially attended an orientation meeting where they were informed of the aims of the study, given an introduction to comparative judgement and the software they would be using. Following the meeting they were sent detailed instructions and access to the judging platform and all additional materials which were stored in a secure online environment.

Following the meeting, judges were asked to familiarise themselves with the exam papers and mark schemes for all of the assessments included in the judging. They were then asked to provide a rating for how demanding they felt each of the individual exam papers was on a 7-point scale, from significantly less demanding than the average paper to significantly more demanding than the average paper. Their reference for this was how demanding they felt the papers were on average. Experts were told that a paper would be considered more demanding if a typical student would likely score proportionally fewer marks, or overall perform less well, than if they had taken another paper. The judges were asked to revisit these scores after they had completed the CJ exercise in case reviewing actual student responses to the papers had changed their opinion.

We know that exam papers differ in demand from year to year, as it is highly challenging to write exam papers of exactly the same demand. This is usually compensated for by the setting of grade boundaries, as discussed in the introduction. Therefore, variation in exam demand was not a direct concern in itself. Instead, the aims of the rating exercise were 3-fold. First, to orientate the judges to the exam papers and to ensure they had thoroughly familiarised themselves with the papers and mark schemes. Second, to attempt to avoid judges’ views on the quality of students’ responses being influenced by the demand of the assessments; previous research has shown that judgements of the quality of students’ work can be influenced by the demand of the assessment being judged (Good and Cresswell, 1988). Third, so we could evaluate the relationship between judges’ perceptions of paper demand and student performance over time.

The judges were then asked to complete the CJ exercise. For this exercise they were given a unique login for an online judging platform where each judge was given a unique set of judgements to complete. For each judgement, judges were presented with 2 random scripts side by side and asked to consider “Which of these 2 students is the better computer scientist, based on a holistic judgement of script quality?”. Judges were able to scroll up and down on each script individually before making their decision. Judges were asked to make their judgements based on the overall quality of the students’ responses and not to attempt to re-mark the scripts to come to their decision. Judges were asked to make relatively rapid decisions and were informed that it should take around 5 to 6 minutes to judge each pair.

Initially each judge was given an allocation of 70 or 71 judgements, aiming for a total of 20 judgements per script across judges. Due to one judge not being able to complete the full task, their additional allocation was given to one of the other judges, resulting in one judge only completing 52 judgements and another completing 90 judgements.

Following completion of judging, a Bradley-Terry model (Bradley and Terry, 1952) was applied to the judgements to give each script a score indicating its likelihood to ‘win’ individual pairings. For this study, the script scores can be interpreted as indicating the quality of students’ responses, relative to other scripts. For a detailed discussion of CJ methodology and analysis in this context see Curcin et al (2019).
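
As an illustration of the underlying model, the sketch below fits a simple Bradley-Terry model by maximum likelihood to a list of (winner, loser) script pairs. It is a minimal example rather than the analysis code used in this study, and the small ridge term is included only to keep the scale identified.

import numpy as np
from scipy.optimize import minimize

def fit_bradley_terry(judgements, n_scripts):
    # judgements: list of (winner_index, loser_index) pairs of script indices.
    wins = np.zeros((n_scripts, n_scripts))
    for winner, loser in judgements:
        wins[winner, loser] += 1

    def neg_log_likelihood(theta):
        diff = theta[:, None] - theta[None, :]      # theta_i - theta_j
        # log P(i beats j) = -log(1 + exp(-(theta_i - theta_j)))
        nll = (wins * np.log1p(np.exp(-diff))).sum()
        return nll + 1e-4 * (theta ** 2).sum()      # ridge term keeps the scale identified

    result = minimize(neg_log_likelihood, np.zeros(n_scripts), method="BFGS")
    return result.x - result.x.mean()               # centred script quality scores

# scores = fit_bradley_terry(judgements, n_scripts=len(scripts))
# Scripts with higher scores were judged as showing higher quality work.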

Finally, after judges had completed all other tasks, they were sent a short survey asking how they had found the judging process, how confident they were in their judgements, and their general views about the quality of the students’ work they had seen.

Paper demand

Ratings of paper demand were first standardised within each judge (to a mean rating of 0 and standard deviation of 1) before being averaged across judges. The demand ratings are shown in Figure 23. On average, AQA papers were deemed to be more demanding than OCR papers; comments from the surveys suggested that judges felt AQA’s papers were less accessible than OCR’s. OCR papers were considered to be more demanding in 2015 and 2016, whereas AQA papers were considered to be most demanding in 2017. Post-reform (2018 and 2019) the demand of the papers from the 2 AOs was judged to be more similar. It is difficult to interpret the size of these perceived differences in demand directly, as they were all on a relative scale. Discussions with experts indicated that they did think some assessments were more challenging than others (and that this was not just an artefact of us asking the question), although it is unclear how much impact this may have had on student performance.
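
A minimal sketch of this standardisation step is shown below, assuming a table of ratings with columns judge, paper and rating (the file and column names are illustrative).

import pandas as pd

ratings = pd.read_csv("demand_ratings.csv")  # hypothetical: columns judge, paper, rating

# Standardise each judge's ratings to mean 0 and standard deviation 1...
ratings["z_rating"] = ratings.groupby("judge")["rating"].transform(
    lambda r: (r - r.mean()) / r.std()
)

# ...then average the standardised ratings across judges for each paper,
# keeping the standard error for the error bars.
mean_demand = ratings.groupby("paper")["z_rating"].agg(["mean", "sem"])
print(mean_demand)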

Figure 23. Mean standardised relative paper demand ratings by judges, with standard errors.

Figure 24 below shows the C/4 grade boundary positions for the same 2 assessments between 2012 and 2019. If all else remains stable, we would typically expect grade boundaries to change to compensate for a change in the demand of the assessment. We might therefore expect the inverse of the pattern in Figure 23; that is, in years when demand increases, the grade boundaries should decrease to maintain the same standard within the assessment.

Figure 24. Grade boundaries for the assessments used in the CJ study over time.

Although there is some relationship between the patterns of changing assessment demand and grade boundaries, it is evident from Figure 23 and Figure 24 that the grade boundaries do not move solely as a response to a change in the assessment demand ratings. However, there may be other factors that affect grade boundary position beyond assessment demand. In particular, as standards are maintained at qualification rather than assessment level, we need to take into account the relationship between different assessments which make up the qualification when interpreting changes in grade boundaries.

The main purpose of this exercise was to familiarise the experts with the assessments and to take into account their demands when making their judgements for the CJ exercise. These ratings of paper demand also provide useful context to interpret the main CJ findings presented in the next section.

CJ analysis of script quality

Outputs from the CJ model allow us to evaluate the reliability of the ratings provided by the judges. In particular, infit is a measure of the consistency of the judgements made by each judge, compared with the overall model fit. A high infit indicates that a judge was inconsistent, either within their own judgements or when compared with the judgements made by the other experts. Similarly, a script with a high infit may indicate that the script was unreliably judged.

Judges took on average just over 7 minutes per judgement. One judge was removed from further analysis as their infit score (1.44) was notably higher than that of the other judges, suggesting that their judgements were not consistent with those of the other judges. Their median judging time was only 47 seconds, which suggests that they may not have taken sufficient time to make accurate judgements. After removing this judge, the separation reliability was 0.85, which provides reassurance that judgements were consistent between and within judges. Scripts were judged on average 18.65 times (range 15 to 20). Four scripts were removed from the final presentation of results as they had a notably higher infit score than other scripts (over 1.5), suggesting they may have been particularly hard to judge and therefore their script quality scores may have been somewhat unreliable.
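
For reference, one common formulation of a judge-level infit statistic is an information-weighted mean square of the residuals between observed and model-predicted outcomes. The sketch below is illustrative of that formulation, assuming judgements stored as (judge, winner, loser) index tuples and the fitted script scores from the model above; it is not necessarily the exact statistic produced by the judging platform.

import numpy as np

def judge_infit(judgements, scores, n_judges):
    # judgements: list of (judge_index, winner_script, loser_script) tuples;
    # scores: fitted Bradley-Terry script scores.
    squared_residual = np.zeros(n_judges)
    information = np.zeros(n_judges)
    for judge, winner, loser in judgements:
        # Model probability that the script the judge chose would 'win'.
        p = 1.0 / (1.0 + np.exp(-(scores[winner] - scores[loser])))
        squared_residual[judge] += (1.0 - p) ** 2   # the observed outcome is always 1
        information[judge] += p * (1.0 - p)
    return squared_residual / information           # values well above 1 suggest misfit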

Figure 25 summarises the ratings across the different scripts for each year and each AO. Scripts with a higher score, and therefore further up the chart, are those rated as higher quality. Where all scripts in a year are rated as higher quality than those in other years, this may suggest that the performance standard needed to attain that grade was higher; in other words, it was more difficult for students to receive that grade in that year.

Figure 25. Line chart showing mean script scores at both A/7 and C/4 boundaries, with 95% confidence intervals.

A few patterns are notable from the CJ results, shown in Figure 25. At both the A/7 and C/4 boundaries during the period 2014 to 2015 the quality of work was judged as lower for AQA than for OCR. The quality of students’ work for AQA also gradually improved between 2014 and 2017. It should be noted, however, that these temporary differences and changes over time do not appear to be problematic, as this is consistent with patterns expected from the ‘sawtooth effect’. As centres offering the AQA specification were likely to be initially unfamiliar with the teaching material and the structure of the assessment when they were first available, we would expect gradual improvement as they become more familiar with the qualification and assessments. The grade boundaries indicated by the predictions during this period, which were initially based on the OCR outcomes, will have automatically compensated for this effect, suggesting a slightly lower quality of work at the boundary. This can also be seen in the increasing grade boundaries over this period in Figure 24. It is also important to note that the comparisons made between AOs here are based on a subset of assessments for each specification. The relationships between the assessments analysed should not be assumed to mirror those for the qualifications overall.

For OCR, the script quality ratings at the C/4 grade boundary from 2015 are noticeably higher and more variable than most other years. This paper was judged as one of the most demanding papers by experts (Figure 23) and the scripts for this paper were judged the most variable in quality. This may suggest caution in interpreting the results from this paper as experts may not have effectively compensated for the demand of the paper when comparing scripts from this paper with other papers.

The quality of work for OCR papers was lowest at the C/4 boundary for the years 2014 and 2016. Comparing these 2 years with 2017, this would represent a shift to a higher quality of script being required in 2017, which could indicate that it became more difficult to achieve a C/4 in that year. This pattern is similar across both OCR and AQA papers at grade C/4 and is somewhat evident at grade A/7 for OCR. However, this finding should be treated with caution for 2 reasons. First, for AQA, the grade boundaries may still have reflected sawtooth effects across this period, indicating a change in required quality for reasons other than an unintentional change in standard. Second, this finding ignores the 2015 data point for OCR at grade C/4, which suggests a higher quality of work was required in that year.

Overall, the results of the comparative judgement exercise are somewhat inconclusive. The quality of work required to achieve a C/4 may have dropped in 2014 (for OCR) before rising again in 2016 and 2017 (for both AQA and OCR). The drop in quality in 2014 may have come from predictions being used to aid the setting of boundaries at a time when students may have been less well prepared, as new and less specialist centres started offering the qualification. The suggested rise in quality following this would agree with the previous analyses, which indicate an increase in difficulty in 2016 and 2017, although the findings are not clear cut.

Strand 1. Discussion

To aid discussion of findings, Table 22 summarises all of the above analyses and what they indicate the size of the potential change in standards between 2014 and 2019 might be. These changes are expressed in terms of the estimated percentage point change in students receiving at least an A/7, C/4 or G/1 grade and in terms of mean grade, where appropriate. As discussed previously, each of these analyses has different assumptions and limitations and, in some cases, include slightly different samples of students. Therefore, there needs to be caution in directly comparing the results from the different analyses.

Table 22. Summary of findings from analyses in strand 1.

Method A/7 change from 2014 C/4 change from 2014 G/1 change from 2014 Mean grade change from 2014 (9 to 1 scale)
Raw change in outcomes -1.3pp -4.3pp +0.02pp -0.26
Cumulative effect of awarding under prediction -1.5pp -3.5pp -1.8pp n/a
Rasch analysis n/a n/a n/a -0.12 to -0.28
Kelly’s method n/a n/a n/a -0.17 to -0.27
Progression analysis n/a n/a n/a -0.15
Simulated predictions (excl new) - prior -4.2pp -6.6pp -0.5pp n/a
Simulated predictions (excl new) – concurrent -6.1pp -7.3pp -0.5pp n/a
Model of outcomes over time (2014/2015 centres only) - concurrent attainment -4.5pp / -3.4pp -4.4pp/-5.6pp No change -0.31/-0.33
Model of outcomes over time (2014/2015 centres only) - prior attainment -4.3pp/-2.0pp -3.9pp/-4.3pp No change -0.27/-0.28
Common Centres (95% CI of mean of different models) -2.5pp to -0.1pp -6.1pp to -3.3pp -1.0pp to -0.4pp n/a
Comparative Judgement study No difference More difficult? n/a n/a

Overall, based on the range of analyses conducted in strand 1, there appears to have been a subtle shift in standards in GCSE computer science between 2014 or 2015 and 2017. This is particularly noticeable around the C/4 grade boundary, but there also appears to be a slightly smaller change at the A/7 grade boundary. The various modelling carried out estimates that by 2019 similar students may have been around 3 to 6pp less likely to attain at least a grade C/4 when compared with students in 2014. At the A/7 boundary the evidence is somewhat less consistent, but there is some evidence that a similar shift may have happened, resulting in students being around 2 to 3pp less likely to attain an A/7 in 2019 compared with 2014.

This change in standards was identified across the analysis methods employed, although the exact size of the effect varies between the different analyses. In the introduction we discussed 3 possible factors that could cause a drop in outcomes: 1. changes in the ability of the cohort; 2. sawtooth-like effects where centres were offering the subject for the first time and may have been unfamiliar with the qualification or assessments; 3. changes in other factors over time which may affect students’ preparation for the assessments, such as teaching quality. However, even once we had controlled for the possible impact of all 3 factors, particularly in the common centres analysis and modelling approaches, the change in outcomes was still evident. This suggests that there has been an unintended change in the grading standards in the qualification over time. These changes are likely to have been small within each year, and may not have been observable to awarders within each year, but they resulted in a larger cumulative effect over time.

The reasons for this shift in standards can only be surmised from the data available and from awarding documents obtained from AOs. During the period 2015 to 2017, the number of students taking the qualification more than doubled, many of whom were being taught at centres that had never offered GCSE computer science before. The prior attainment of students from these centres was, on average, lower than that of students in previous years. Prior-attainment-based predictions subsequently fell during this period; however, despite this fall in predictions, across AOs grade boundaries were set such that outcomes were below predictions. All of this combined suggests that students may, on average, have performed less well in the assessments over time, potentially leading to a valid decrease in outcomes.

Prior to the first year of reformed assessments in 2018, the qualification included a controlled assessment element, consisting of a project carried out in class. Because the controlled assessment tasks typically stayed similar from year to year, their grade boundaries were typically kept consistent over time. This meant that AOs were most likely to take into account the evidence provided by the statistical predictions through the setting of the grade boundary on the examined component. On average, grade boundaries for the examined components were lowered between 2014 and 2017, but it was judged not appropriate to lower them far enough for student outcomes to meet predictions. Awarding reports suggest that awarders did not feel comfortable lowering the grade boundaries any further to meet predictions, as they believed the quality of the work was not of a sufficient standard. This may have led to a disparity between performance on the different assessments, which would have made maintaining appropriate standards at qualification level highly challenging.

Alongside these changes, AOs were also dealing with issues of malpractice. Malpractice in the controlled assessment elements had been an issue since the inception of the qualification, ultimately leading to the removal of NEA post-reform in 2018. In an attempt to counteract this, OCR (the largest provider of the qualification) made one of its controlled assessments more open ended, and therefore potentially more challenging, in 2016. Changing assessments can lead to temporary sawtooth-like effects as teachers become familiar with the new assessment structure, which may result in lower student performance.

The above suggests that AOs not meeting predictions in 2016 and the subsequent dropping outcomes in 2016 and 2017 may have been somewhat justified by the weaker performance of the candidature, in part due to the changing composition of the cohort. However, some of the effects that led to this weaker performance may well have been temporary.

In computer science, the reference year for predictions, that is, the year used to benchmark the relationship between prior attainment and outcomes, was updated almost every year between 2012 and 2019. The intention of this was to reflect the changing cohort and any resulting changes to the value-added relationship. While this is likely to have aided the management of changes in the cohort during this period to a considerable degree, the unintended effect may have been to also carry forward any changes to the value-added relationship from years when performance may have been temporarily weaker. 2018 and 2019 were also the first 2 years of the reformed qualification, a period in which there are additional challenges in ensuring that standards are maintained. This may have made it impossible to identify any positive changes in performance standard which might have followed if some of the effects leading to lower performance were temporary.

Ultimately this means that between 2014 or 2015 and 2017 there was a relatively rapid (albeit small) reduction in the value-added relationship between students’ prior attainment at KS2 and their performance in GCSE computer science. Between 2017 and 2019 this relationship then stayed relatively stable. This could be a valid reflection of the changing cohort and therefore a true representation of their ability in computer science. However, the number of transitory effects seen during this period, and the fact that our analysis also showed a fall in outcomes in centres whose outcomes we would have expected to be stable during this period, causes us to question whether this reduction reflects a genuine, permanent change in student attainment.

In 2023, GCSE computer science had been widely available to teach for over 10 years. It is therefore reasonable to believe that any of the temporary, transitional effects that may have affected students’ performance in the early years of the qualification should have passed.

Caveats and limitations

One key limitation of almost all of the analyses presented here is the assumption that there have not been legitimate changes in student outcomes in GCSE computer science assessments due to factors that have not been controlled for. These legitimate changes could come from a variety of sources. For example, for the analyses that compare outcomes in GCSE computer science to prior or concurrent attainment, a key assumption is that the relationship with prior attainment should have remained stable. Value-added relationships can vary for a wide variety of reasons which may be legitimate and which may be challenging to take account of during standard setting. For example, changes in teaching time, teaching quality, student motivation or changes to the content taught could all cause legitimate changes in outcomes. This also applies to the methods comparing outcomes in computer science with those in other subjects.

Another potentially legitimate change in outcomes may have come from the reduction in malpractice following the removal of NEA. Prior to reform there were substantial issues with malpractice, which may have led to inflated outcomes on this assessment. In 2016, OCR changed its controlled assessment structure to attempt to limit this, which may have led to a small drop in outcomes. Removal of the NEA component following reform could also cause challenges in maintaining a performance standard, although the approach to maintaining standards during reform was intended to compensate for this. Hypothetically, a drop in outcomes could represent a more valid reflection of students’ attainment than that represented by the pre-reform controlled assessments.

Along similar lines, there is anecdotal evidence that students were better prepared for controlled assessment tasks than they were for exams when both contributed to the qualification grade. Once the controlled assessment was removed, this could have led to a fall in outcomes if teachers were not well prepared to teach the examined content. However, this effect, along with other effects during reform, could have been temporary.

One additional assumption we have made throughout these analyses is that the standard set in the initial years of the qualification was an appropriate one and, consequently, that 2014 was an appropriate year against which to benchmark standards. Setting standards in the first years of a qualification is a challenging task and ultimately the appropriateness of the standard in a subject can only be determined by experts and stakeholders within that field. This is something we return to in strand 2 of this work.

Strand 2 - Performance standard in summer 2023

The previous analyses have considered whether there is evidence of a potential change in standards over time. The aim of this strand was to examine performance standards in GCSE computer science assessments in the most recent series they were available (summer 2023) to understand the impact any change in standards would have on the quality of work needed to be demonstrated by students to receive the key grades considered during awarding (grade 7 and grade 4). This study was therefore focused on the minimum level of performance required for these grades.

There were 2 elements to this. The first was to understand at which point on the mark scale experts consistently identified a difference in standards from the grade boundary and, where a difference was noticeable, whether experts believed the quality of work was still acceptable for the relevant grade. The aim of the second element was to understand where in the range of student performance experts felt the quality of work indicated students would succeed in further study in computer science. Here we rely on one of the key aims of GCSEs, to “provide a strong foundation for further academic and vocational study and for employment” (Ofqual, 2023), as a benchmark for the qualification standard.

This work required the subject experts to make holistic judgements about the quality of students’ work at various points in the mark distribution, and to identify where they could reliably perceive differences in the quality of work. This is a highly challenging task and is particularly difficult where students’ responses are uneven across the assessment, or when judges need to keep in mind a large amount of evidence to make their judgements (Leech and Vitello, 2023). However, given that expert judgement is a key component of setting standards for GCSEs, identifying where experts can identify differences in the quality of work and the magnitude of those differences is important to understanding the impact of any changes to that standard.

Finally, we also wanted to receive any other qualitative insights that the expert group of computer science specialists might have about the standard of the current GCSE computer science qualifications.

Methodology

Recruiting subject experts

Computer science subject experts were recruited to carry out the review exercise. We recruited experts from a range of backgrounds, all of whom had some familiarity with the current A level or GCSE qualifications. Experts were recruited via a number of sources, including Ofqual’s register of subject matter experts, recommendations by BCS – the chartered institute for IT, and by contacting an AO to recommend senior examiners to take part in the work. The intention was to recruit a panel of computer science experts with varied backgrounds, representing a range of stakeholders in the qualification, to provide detailed insights into the standard of GCSE computer science. We successfully recruited 8 experts with a wide range of experience, including current and previous A level and GCSE computer science teachers, representatives of BCS and Computing At Schools (CAS), those with marking experience for different AOs, those with experience of being a senior examiner and of awarding (grade boundary setting), those with experience of training other computer science teachers, and those with experience of writing textbooks and other materials to aid in teaching computer science (see Table 23 for a summary).

Table 23. Summary of subject experts’ background and computer science (CS) experience

Experience Number of experts (8 total)
Number of years teaching CS or related qualifications Median 17.5, min 14, max 36
Experience of teaching GCSE CS 8
Experience of teaching A level CS 7
Worked as an examiner for CS (any AO) 5
Experience writing/developing CS assessments 3
Experience writing CS training materials or textbooks 7
Experience training other CS teachers 6
Degree or higher in CS (or closely related subject) 6
Worked in CS outside of teaching 2

Exam materials

Exam scripts were requested from AQA, the AO with the second largest entry for GCSE computer science. The experts were generally less familiar with the AQA specification and so were likely to have fewer preconceived ideas about script quality or assessment demand. Our previous analyses suggested that the standard between AOs is highly similar, and no substantial concerns have been raised about inter-AO comparability. Therefore, we believed it was reasonable to assume that conclusions about the performance standard drawn from one AO could be applied to all AOs offering GCSE computer science.

Student work was requested across a range of marks, based on the total qualification mark achieved (more details about this are included below). A number of examples of student work were requested at each mark point – 5 on the grade boundaries and 3 on other marks. AQA’s specification includes 2 exam papers, Paper 1: Computational thinking and programming skills and Paper 2: Computing concepts. Paper 1 is available in 3 versions (1A, 1B, 1C) depending on the programming language used (C#, Python or VB.Net respectively). To aid the experts in making comparisons between scripts we only included students who had taken Paper 1B (Python) as it has by far the largest entry. Both exam papers from the same student were requested, and scripts were anonymised to remove any student identifiers and all mark information. Scripts were requested that had a relatively even profile across both exam papers. Both exam scripts from the same student were combined into a single PDF. ‘Packs’ of scripts of students with the same mark total (at qualification level) were then created.

Subject experts attended an orientation session where the researchers introduced the aims of the project, explained the tasks and allowed the experts to ask questions and seek any clarification. Experts were then asked to complete 2 tasks at home, in their own time. Finally, there was a review meeting to discuss the results of the tasks and for the experts to provide any additional insights.

For task 1, experts were provided with packs of scripts at the grade 7 and grade 4 qualification level grade boundary, along with mark schemes and specification documents. These were borderline students who received just enough marks for each grade. Experts were asked to review the scripts in these packs and provide a summary of the strengths and weaknesses demonstrated by students, and what skills or knowledge they displayed (or did not display). The experts were then asked to indicate if they thought the quality of work was at the level they expected for a GCSE grade 7 or 4.

Following this, the experts were presented with a series of packs of scripts at various mark points above and below the grade 4 and grade 7 boundary. Each pack was given a randomly assigned ID and the mark totals on the scripts were removed, so the experts were not aware which pack was which. There were 3 packs above each grade boundary at every other mark from +2 marks above the boundary to +6 marks, and there were 7 packs below each boundary from -2 marks to -14 marks below the boundary.

For each pack the experts were asked if they thought that the overall quality of work (across all students in the pack) was typically much better, slightly better, slightly worse, much worse or not noticeably different to the work at the grade boundary. Where the expert thought there was a difference, they were asked to provide a short summary of what these differences were, that is, were students typically better or worse at demonstrating particular skills or knowledge. Experts were asked as far as possible to form a holistic judgement across the students included in each pack, to get a sense of what was ‘typical’ at each mark point. We were aware that this could be challenging in some cases. Despite the scripts having a relatively even mark profile, different students may have had very different performance profiles across the exams.

After completing task 1 experts were asked to complete task 2. For task 2, experts were asked to think about a GCSE level student who they believe showed enough aptitude to go on to further study in computer science and be successful. They were then asked to describe what skills or knowledge they would expect this student to display. Experts were not initially told what ‘success’ should consist of, but this was discussed with the experts following the task (see results – task 2).

Experts were then presented with a different series of packs of scripts, with the mark information removed. However, this time they were numbered and presented in order of descending marks starting with the grade 7 boundary scripts. Packs were provided in 5-mark intervals working down the mark range until the grade 3 boundary. For this task experts were aware that the packs were ordered by mark total, although they were not aware of the exact mark or grade of each pack or the difference in marks between packs.

For each pack, experts were asked to indicate if they believed the students within that pack were highly likely to succeed in further study, somewhat likely, somewhat unlikely or highly unlikely to succeed. Finally, experts were asked to provide a short rationale for their decision and describe the skills or knowledge they had seen at different parts of the mark range that had informed their decision.

Review meeting

Following completion of the tasks, 2 review meetings were conducted each with 4 of the subject experts. At the meetings, experts were presented with a summary of the findings from the first 2 tasks and were asked for additional insights and reflections. A key part of this was discussing which of the packs presented in task 1 they felt demonstrated enough knowledge and skills to receive a grade 4 or grade 7 where the quality of work was noticeably different from the grade boundary. Experts were also invited to share their views on the overall standard of the qualification and any reflections on the perceived difficulty of GCSE computer science.

After reviewing the grade boundary scripts for each grade but before reviewing the packs for task 1, the experts were asked whether the quality of work at the qualification level grade boundaries was as they expected for that grade, in a free text response. A summary of experts’ responses is shown in Table 24. Experts were not given any guidance about what they should refer to when considering their expectations, as we were interested in their diverse views depending on their background and experience. When questioned, experts stated that they variously drew on their experience of teaching A level, teaching GCSE, awarding the subject and professional experience.

Table 24. Summary of experts’ responses when asked if the quality of work at the grade boundary was as they expected.

Comparison of quality to expectations Number of responses – Grade 4 Number of responses – Grade 7
Better than expected 1 3
Slightly better than expected 1 1
As expected 4 3
Slightly worse than expected 2 1
Worse than expected 0 0

At grade 4 there was a fairly even split between experts believing that the quality of work was higher or lower than expected, with only one expert expressing a strong view that the work was higher quality than they would expect from a borderline grade 4 student. However, at grade 7, 4 of the experts thought the work was better than they expected, whereas only one thought it was worse than expected for a borderline grade 7 student.

The main findings of task 1 are shown in Figure 26 and Figure 27 below. These figures show the percentage of the experts who rated each pack of scripts as being much better, slightly better, slightly worse, much worse or not noticeably different from the quality of work at the grade boundary.

There was a fair amount of variation in subject expert responses to each pack, both in terms of differences between experts and in relation to the mark totals. The discussions with the subject experts indicated that this may have been due to different experts prioritising different skills or parts of the exam papers when making their judgements. This may, in part, reflect the diverse nature of content in the subject. Experts also said that they found this task challenging as there was often a large amount of variation in the skills and knowledge displayed by students within each pack who received the same overall mark. This made it challenging to make a holistic judgement about the quality of student work at each mark point. These results are not dissimilar to previous work which showed that it can be challenging for examiners to consistently identify differences in the quality of students’ work when the difference in total marks is small (Baird and Dhillon, 2005).

This may also have led to differences between expert ratings and what was credited in the mark schemes, as subject experts may have put more weight on some areas of skills and knowledge than others. For example, the experts noted that they typically prioritised performance on the programming and/or extended response questions when making their judgements about the quality of student work. For some scripts, experts highlighted that students could be inconsistent, for example showing high quality responses in some areas but not responding to all questions, which led to lower mark totals.

Figure 26. Subject expert ratings of packs of scripts at different mark points around the grade 7 boundary.

The results from the review of students’ scripts around the grade 7 boundary indicate that, within the range of approximately +4 to -2 marks from the grade boundary, the experts did not consistently identify any difference in the quality of students’ work. Within this range, fewer than 50% of experts indicated that the packs were noticeably different from the grade boundary scripts in the appropriate direction.

At the review meeting experts were asked to review the scripts below this range and to provide further views on the quality of work that students demonstrated at these mark points. At -4 marks from the boundary, although experts thought the work was weaker than the work at the grade boundary, the majority of the experts still believed the scripts showed enough knowledge and skills to receive a grade 7 without this having a strong impact on the performance standard indicated by that grade. At -6 marks from the boundary there was some disagreement: while a few of the experts believed that students’ work at this mark showed high enough quality to receive a grade 7, others disagreed. Below this point, however, the majority of experts believed that there were too many weaknesses in students’ work, with students showing a lack of understanding and having too many gaps in their knowledge for a grade 7.

Figure 27. Subject expert ratings of packs of scripts at different mark points around the grade 4 boundary.

At grade 4 there was also a zone around the grade boundary where the experts did not consistently identify differences in the quality of scripts from the grade boundary. This ranged from +2 marks above the boundary to -4 marks below.

Discussions with the experts after reviewing scripts below this zone again indicated that students at -4 marks may show enough skills and knowledge for a grade 4, with experts agreeing that, overall, the quality of work met their expectations. At -6 marks there were mixed views from the subject experts: some believed the scripts indicated students showed enough aptitude for a grade 4, while others believed that there were too many missed answers and gaps in knowledge. Below this point, all of the experts believed that students showed significant weaknesses, with misunderstandings of key concepts and notably weaker programming skills.

Based on the outcomes from students who sat these assessments in summer 2023, we can convert the mark differences from the grade boundaries into a percentage point change in students who would receive the grade if the grade boundary were moved to different mark points. For this calculation we only included 16-year-old students. This information is summarised in Table 25 for the different mark points discussed above. We present this alongside a simplified summary of the experts’ comments on the difference in the quality of work.

Table 25. Summary of differences in performance standard at different mark points and the difference in percentage points (pp) of students at each mark point

| Grade | Mark difference | Difference from boundary performance standard | Change in % of students attaining grade |
| 7 | -2 | Not noticeable | +1.9pp |
| 7 | -4 | Minor | +3.6pp |
| 7 | -6 | Moderate | +5.4pp |
| 7 | -8 | Significant | +7.1pp |
| 4 | -2 | Not noticeable | +1.2pp |
| 4 | -4 | Not noticeable | +2.5pp |
| 4 | -6 | Minor or moderate | +3.7pp |
| 4 | -8 | Significant | +4.9pp |
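
The sketch below illustrates this kind of conversion using entirely hypothetical marks and a hypothetical boundary of 113; the report's actual summer 2023 mark data and grade boundaries are not reproduced here.

```python
import numpy as np

# Hypothetical mark distribution for 16-year-old students; the real summer 2023
# component totals are not reproduced here.
rng = np.random.default_rng(1)
marks = rng.integers(0, 161, size=10_000)

def pp_change(marks, boundary, shift):
    """Percentage point change in the proportion of students at or above the
    grade boundary if it is moved by `shift` marks (negative = lowered)."""
    before = np.mean(marks >= boundary) * 100
    after = np.mean(marks >= boundary + shift) * 100
    return after - before

# Hypothetical boundary of 113 marks, lowered by 2 to 8 marks.
for shift in (-2, -4, -6, -8):
    print(f"{shift:+d} marks: {pp_change(marks, boundary=113, shift=shift):+.1f}pp")
```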

The purpose of task 2 was to understand the subject experts’ view of what a successful student in further study in computer science would know and be able to do. During discussions with the experts at the review meeting, the majority of the experts confirmed they had considered a student continuing down a traditional academic route in computer science when completing this task. Most experts focused on students likely to achieve at least a grade C in A level computer science. However, the experts emphasised that for some students a grade E could still be considered a success. A minority of our experts also considered other routes, such as a T Level in digital skills.

The experts were asked to provide a summary of what skills and knowledge they would expect a student who would go on to be successful at computer science to show at GCSE level. Experts provided a broad list of skills (summarised in Table 26), which may reflect their varied experience, expertise and perceived priorities within the subject. However, in further discussion during the review meetings, experts agreed that they would not expect a single student to demonstrate all of these skills. They suggested that many of these skills could be summarised as good programming and problem-solving skills. Experts also made it clear that it is often not possible to define a list of skills or knowledge that indicates success; beyond the content of the qualification, it is often less tangible factors, such as motivation, maturity or willingness to learn, that predict a successful student.

Table 26. Summary of skills and knowledge identified by subject experts that might indicate a student who would be likely to succeed in further study in computer science.

  • Ability to read and debug code
  • Good communication skills and use of technical terms
  • Good understanding of theoretical concepts, although not necessarily their application
  • Clear understanding of data types and data representation
  • Basic understanding of computer systems and hardware
  • Ability to discuss legal and ethical issues relating to technology
  • Basic/strong mathematical skills
  • Understand the basics of networking and communication between devices
  • Passionate about the subject and able to see beyond the curriculum
  • Ability to think logically
  • Able to interpret and apply algorithms to solve problems (with reasonable efficiency)
  • Able to apply the principles of computational thinking
  • Reasonable/strong programming skills
  • Confidence with simple data structures (for example, arrays)
  • Proficiency in one programming language
  • Ability to think creatively
  • Strong ability in various number systems (base 8 and 16)
  • Able to write programs to solve non-trivial problems
  • Good ability at problem solving
  • Understand the underlying abstraction in computer systems

During the discussions, the experts noted that they also found task 2 challenging, particularly because of the varied profile of students within each pack. However, they found task 2 easier than task 1, as the packs were presented in mark order and they therefore had higher confidence in their judgements.

Figure 28 below shows the ratings given by experts to packs of scripts at different points in the mark scale. As in the previous figures, the coloured bars indicate the percentage of experts who gave each rating, from students highly likely to succeed in further study to those highly unlikely to succeed.

Figure 28. Subject expert ratings of the likelihood of students in each pack to succeed in further study in computer science.

The key points of Figure 28 are pack 7, the lowest pack where the majority of experts believed that students would be likely to succeed in further study; pack 4, where all experts believed students would be likely to succeed; and pack 3, where the majority of experts believed students would be highly likely to succeed in further study. In terms of grades, pack 7 represents students achieving a low grade 5, pack 4 a high grade 5 and pack 3 a low grade 6.

It is worth reiterating that for this task the packs were presented in order from highest mark total to lowest, and the experts were aware of this ordering. This ordering likely explains why ratings were more consistent across the mark range than in task 1.

The results of this task were shared with the experts at the review meeting. Some of our experts were surprised at how low in the grade distribution students had been rated as likely to succeed in further study, stating that they had expected these students to have received a higher grade and that students would not typically be admitted onto A level courses with a grade lower than a 6. Some experts attributed this to giving students the benefit of the doubt while completing the tasks, trying to find evidence in the scripts which might indicate quality, particularly where students may have shown aptitude but lost marks by articulating themselves poorly in the exam. Other experts were less surprised, particularly in light of the previous discussion that it is often not the subject-specific skills and knowledge displayed in the exam which indicate a promising student, but other factors. These other factors, which experts may have seen in students’ answers, may often not have been ‘credit worthy’ in the mark scheme, resulting in students receiving lower grades.

Broader views on the standard from subject experts

As part of the discussions the experts were encouraged to provide broader views about the qualification, and particularly any views they had about the current qualification standard. Although not a unanimous view, some of the experts believed that the GCSE was too challenging, a view which was explored further through discussion. This view was based on a variety of reasons and not simply on grading standards. In this section we discuss the main points made by the subject experts about the current qualification and the factors which may impact on its actual or perceived difficulty for students; these comments are summarised into themes below.

Teaching quality

The experts noted on a number of occasions that they thought that the majority of centres offering computer science do not have a specialist computer science teacher, and that this was reflected in the quality of students’ work. A number of experts believed that this was the primary driver of students receiving relatively lower grades in computer science compared with other subjects, and of the perception of the subject being difficult. However, it was also suggested that this is a difficult problem to overcome, as good computer scientists are likely to have opportunities for employment in other sectors that might be more financially rewarding. This concern has been raised elsewhere (Royal Society, 2019), with reports indicating that computer science teachers can earn more money in careers outside of teaching (Sibieta, 2018). Previous statistics have also suggested that historically only around 15% of computer science teachers were subject specialists (Dallaway, 2016). In 2017, 46% of computer science teachers in secondary schools held a relevant computing qualification (36% computer science, 10% ICT or business with ICT) (Royal Society, 2017). More recent data from the academic year 2022 to 2023 indicate that just over half of the hours taught in computing in secondary schools were taught by teachers with a relevant post-A level qualification (54.1%), in contrast to other EBacc science subjects where the majority of hours were taught by a subject specialist (73-95%) (DfE, 2023).

In some cases, the experts felt that they could see evidence of good students answering questions badly, which could be due to students being poorly prepared for the exam, resulting in poor exam technique. They also felt that there was some evidence of certain areas of the content being prioritised over others. It was also noted that computer science is a very practical subject that can be difficult to teach in a classroom setting, especially if schools do not have the right equipment available.

Content and teaching time

There was discussion throughout both of the review meetings about the variety of content included in the GCSE computer science curriculum. The experts disagreed about which skills or knowledge should be prioritised, as reflected in their ratings in task 1. It was noted that this was a broader concern within the subject, as there were varying views from those in the field about which skills were important. Experts speculated that this may have resulted in the GCSE content being too broad, leading to difficulty.

Some experts noted that when the content was originally designed it was expected to sit alongside the GCSE in ICT, which has since been discontinued, and suggested that were the qualification to be redesigned it might be advantageous to include some ICT content alongside the computer science content. There was a belief that this may make the subject more accessible.

Experts also noted that, due to the broad content, they did not think there was sufficient time to teach it all adequately. This particularly related to the programming elements, which experts believed take much longer to teach than is reflected by the weighting they are given in the curriculum and assessments. These concerns have been raised elsewhere (Royal Society, 2017; Ofsted, 2022). There also appears to be a trend of less, rather than more, time being dedicated to teaching computer science over time (Kemp & Berry, 2019; Royal Society, 2019). The experts suggested that students need to be sufficiently engaged to practise programming skills outside of the classroom if they are to succeed.

Exam structure and mark schemes

One comment that recurred in our discussions with the subject experts was that in some cases there was a disparity between the marks students received and the experts’ judgements of ‘quality’. In a number of cases, experts identified students who they believed showed some skill or understanding but missed out on marks. There was some supposition that this could be due to the material being taught poorly, or to students having poor exam technique and so missing out on ‘easy’ marks by articulating themselves badly. On the other hand, some experts believed that, due to the nature of the exams, students who did not have a good grasp of the subject could still gain a reasonable number of marks across the paper.

It was commented on a number of occasions that testing coding skills in a written exam may not give a valid indication of a student’s ability. Experts expressed a preference for computer science assessments taken on-screen where students can edit or even trial their code, although experts acknowledged that there were good reasons why the NEA had been removed.

Progression

There was some suggestion from the experts that attainment at GCSE did not necessarily indicate how well students would do at A level. This may be because many students taking the A level in computer science have not taken the GCSE, so A level teachers expect students to have gaps in their knowledge. Experts believed success in further study was more to do with effort, attitude and not being afraid to have a go and make mistakes than with subject knowledge. Related to this, one expert believed that students who received as high as a grade 7 in the GCSE were not necessarily well prepared for A level, as they could do well in the GCSE but still not have the skills to progress further in computer science. However, experts also believed that programming skills would be beneficial, along with creativity and problem-solving skills.

Grading standards

It was noted that reducing the expectations for each grade may benefit some students but might risk the integrity of the subject. This was borne out by our discussions with the experts following task 1, where there was consensus that work only a few marks below the grade boundaries did not show the knowledge and skills necessary for each grade. A small number of experts commented that they believed the quality of work at the boundaries was actually below what they expected, particularly at grade 7.

Those experts that expressed concern that the current exam standard was too challenging were not consistent about where in the grade range their concerns were focussed, with different experts suggesting that they thought the standard was too challenging at the higher grades (grade 7/8/9) or at the middle grades (4 and 5). Other experts instead thought that the assessment was inaccessible to weaker students at lower grades (4 and below). There was also some suggestion from experts that it was relatively easy to gain a grade 1.

Strand 2. Discussion

This strand aimed to review the performance standard in GCSE computer science in summer 2023; by this we mean the quality of work demonstrated by students to receive the key grades (grade 7 and grade 4). We did this by seeking the views of a group of 8 experts with a diverse range of experience in computer science. These experts represented a variety of views, from teachers, industry experts and subject bodies. Overall, there were mixed views from subject experts about whether the current performance standard is appropriate. Some subject experts believed the quality of work at the boundaries was lower than expected and others that it was higher, although slightly more experts believed that the standard of work at grade 7 was higher than expected.

The results of the first task and discussions during the review meeting indicate that there is a region around each grade boundary where the experts did not consistently identify a noticeable difference in the quality of work produced by students (extending to -2 marks below the boundary at grade 7 and -4 marks at grade 4). Therefore, for these assessments, if the grade boundary had been anywhere within this range, it would have had a negligible impact on the performance standard of the qualification. Evidence from experts suggested that, moving slightly lower down the mark distribution (-4 to -6 marks at grade 7 and -6 marks at grade 4), the difference in the quality of work became more noticeable. However, in discussion, experts believed that moving the standard within this range would not undermine the purpose of the qualification. Experts believed that moving any further than this would result in students showing noticeably weaker skills and knowledge, and would have a significant negative impact on the qualification standard.

The second task aimed to understand how well students receiving different grades in GCSE computer science were prepared for further study in the subject, representing one of the main purposes of GCSEs. The results of this task were mixed. The findings suggested that the experts believed that students with a high grade 5 could be successful in further study. To some experts, however, this came as a surprise, as it would be unusual to accept a student with a grade 5 to study A level computer science. From discussions with the experts, though, it was apparent that it can be difficult to judge which students will do well at A level. Experts believed success typically has less to do with students’ subject content knowledge on entering the A level than with their attitude and approach to the subject. This may be because a number of students take the A level without having taken the GCSE, and so teachers presume very little, or patchy, knowledge from students entering the A level. Experts may therefore have been generous in their rating of students’ work for this exercise, looking to find evidence of potential even when students performed poorly in the assessment.

Finally, discussions with the experts indicated that although there was some belief that the assessments in GCSE computer science were challenging, the experts thought that there are a large number of other potential reasons, beyond exam grading standards, why the qualification is seen as too difficult. Principal among these may be issues with recruiting subject specialist teachers, the challenges of validly assessing programming skills, and the breadth of content in the qualification.

To conclude, the experts considered that overall there was some justification for adjusting the standard in GCSE computer science, although views on this were mixed. The findings suggest that the standard in the assessment could be lowered by a small degree without undermining the qualification, but any larger changes would potentially be considered undesirable by subject experts. Broader issues with the perception of difficulty within GCSE computer science were highlighted which cannot be addressed through changes to the assessment standard or through grade boundary setting.

Overall Conclusion

The aim of strand 1 of this study was to understand whether there was any evidence of a change in standards in GCSE computer science over time, which may have led to the subject being more difficult than intended. Across a variety of methods there was an indication of a small change in standards over time, particularly during the period 2014 to 2017. During this period there were a large number of changes to the qualification in terms of the number and make-up of students taking the qualification, the number of new centres entering students for the qualification for the first time, and some changes to the assessment design and structure. These changes produce challenges in maintaining standards, which in this case may have led to some small incremental changes to the qualification standard. Given that such changes were likely to be small, they are unlikely to have been detectable by senior examiners when setting grade boundaries each year. Cumulatively, though, this appears to have led to a more substantive change in standards. Across the methods used in strand 1, evidence suggested that there had been a small change in standards at grade A/7 between 2014 and 2019, and a slightly larger change at grade C/4. Evidence for any change in standards at grade G/1 was weak.

In strand 2 we aimed to explore what the impact of any change in the standard of the qualification would be on the skills and knowledge demonstrated by students in the assessments and to understand what impact this might have on student progression. The findings indicated that a small change in standards at grade 7 and grade 4 would have a minor impact on the performance standard for each grade, and that this would be unlikely to impact on the progression of students to further study in computer science. However, any larger changes would start to have undesirable consequences for the skills and knowledge that our subject experts would expect students to demonstrate and may risk undermining the value of the qualification. The other feedback from experts in strand 2 did not indicate that a larger change in grading standards was felt to be necessary. Subject experts highlighted a number of factors which may influence the perceived and actual difficulty in the subject beyond grading standards in the assessments. These include teacher expertise, curriculum time, subject content and resourcing.

In summary, the evidence in this report suggests that consideration should be given to making an adjustment to grading standards in GCSE computer science. Evidence from strand 1 indicates that there is likely to have been a small change in standards over time in the qualification, and the findings from strand 2 suggest that a small adjustment to grading standards is unlikely to undermine the value of the qualification or the progression of students to further study in computer science.

References

Benton, T. (2013). Formalising and evaluating the benchmark centres methodology for setting GCSE standards. Cambridge Assessment Research Report.

Benton, T. S. T., & Sutch, T. (2014). Analysis of use of Key Stage 2 data in GCSE predictions. ARD Research Division.

Bradley, R.A., & Terry, M. (1952). The rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39, 324–345.

Bramley, T. (2007). Paired comparison methods. In P. E. Newton, J. Baird, H. Goldstein, H. Patrick & P. Tymms (Eds.), Techniques for monitoring the comparability of examination standards. (pp. 246-294). Qualifications and Curriculum Authority.

Bramley, T., & Oates, T. (2011). Rank ordering and paired comparisons - the way Cambridge Assessment is using them in operational and experimental work. Research Matters: A Cambridge Assessment Publication, 11, 32-35

Brown, N. C., Sentance, S., Crick, T., & Humphreys, S. (2014). Restart: The resurgence of computer science in UK schools. ACM Transactions on Computing Education (TOCE), 14(2), 1-22.

Coe, R. (2008). Comparability of GCSE examinations in different subjects: An application of the Rasch model. Oxford Review of Education, 34, 609–636.

Coe, R., Searle, J., Barmby, P., Jones, K., & Higgins, S. (2008). Relative difficulty of examinations in different subjects. CEM centre.

Cresswell, M.J. (2003). Heaps, prototypes and ethics: The consequences of using judgements of student performance to set exam standards in a time of change. University of London Institute of Education.

Cuff, B. M., Meadows, M., & Black, B. (2019). An investigation into the Sawtooth Effect in secondary school assessments in England. Assessment in Education: Principles, Policy & Practice, 26(3), 321-339.

Curcin, M., Howard, E., Sully, K., & Black, B. (2019). Improving awarding: 2018/2019 pilots. Ofqual.

Dallaway, E. (2016). GCSE Reform: A New Dawn of computer science. CREST.

DfE (2015). computer science: GCSE Subject content. Department for Education.

DfE (2023). Reporting year 2022: School workforce in England. National Statistics.

Good, F. J., & Cresswell, M. J. (1988). Grade awarding judgements in differentiated examinations. British Educational Research Journal, 14(3), 263-281.

He, Q. and Black, B. (2020). Impact of calculated grades, centre assessment grades and final grades on inter-subject comparability in GCSEs and A levels in 2020. Ofqual.

He, Q. and Cadwallader, S. (2022). An investigation of inter-subject comparability in GCSEs and A levels in summer 2021. Ofqual.

Kelly, A. (1976). A study of the comparability of external examinations in different subjects. Research in Education, 16, 37–63.

Kemp, P.E.J. & Berry, M.G. (2019). The Roehampton Annual Computing Education Report: pre-release snapshot from 2018. University of Roehampton

Newton, P. (2020). What is the Sawtooth effect? Ofqual.

Ofqual (2015a). Further Decisions for Completing GCSE, AS and A Level Reform in 2017. Ofqual.

Ofqual (2015b). Comparability of Different GCSE and A Level Subjects in England: An Introduction. Ofqual.

Ofqual (2016). Decisions on setting the grade standards of new GCSEs in England - part 2. Ofqual.

Ofqual (2017). Consultation on revised assessment arrangements for GCSE computer science. Ofqual.

Ofqual (2019). Decisions on future assessment arrangements for GCSE (9 to 1) computer science. Ofqual.

Ofqual (2023). GCSE (9 to 1) Qualification Level Conditions and Requirements. Ofqual.

Ofsted (2022). Research review series: computing. Ofsted.

Royal Society (2017). After the reboot: Computing education in UK schools. The Royal Society.

Royal Society (2019). Policy briefing on teachers of computing: recruitment, retention and development. The Royal Society.

Sibieta, L. (2018). The teacher labour market in England: shortages, subject expertise and incentives. Education Policy Institute.

Appendix A – Progression analysis modelling output

Table A1. Linear model output for the relationship between mean GCSE score and A level outcomes between years. Model includes control variables for Ethnic group, FSM eligibility, Language group, Gender, SEN status and Centre type, coefficients not shown.

| Variable | Coefficient | SE | p value |
| Standardised KS2 score | 0.353 | 0.014 | <0.001 |
| Standardised mean GCSE | 0.806 | 0.012 | <0.001 |
| Year 2015 [2014] | 0.116 | 0.056 | <0.05 |
| Year 2016 [2014] | 0.049 | 0.051 | 0.336 |
| Year 2017 [2014] | 0.107 | 0.051 | <0.05 |
| Marginal R-squared | 0.391 | | |
| Conditional R-squared | 0.473 | | |
| N students | 12,103 | | |
| N centres | 1,778 | | |
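
Table A1 summarises a multilevel (mixed-effects) regression of A level outcomes on prior and concurrent attainment, with year effects and centre-level grouping. As an illustrative sketch only, the code below fits a model with the same broad shape to synthetic data using Python's statsmodels; the variable names (alevel_score, ks2_std, mean_gcse_std, centre_id) are hypothetical and the report's control variables are omitted.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data with roughly the structure described for Table A1.
rng = np.random.default_rng(0)
n, n_centres = 2000, 100
df = pd.DataFrame({
    "ks2_std": rng.normal(size=n),        # standardised KS2 score
    "mean_gcse_std": rng.normal(size=n),  # standardised mean GCSE score
    "year": rng.choice([2014, 2015, 2016, 2017], size=n),
    "centre_id": rng.integers(0, n_centres, size=n),
})
centre_effect = rng.normal(scale=0.3, size=n_centres)
df["alevel_score"] = (0.35 * df["ks2_std"] + 0.8 * df["mean_gcse_std"]
                      + centre_effect[df["centre_id"]]
                      + rng.normal(size=n))

# A level outcome on prior and concurrent attainment, year effects with 2014
# as the reference category, and a random intercept for centre.
model = smf.mixedlm(
    "alevel_score ~ ks2_std + mean_gcse_std + C(year, Treatment(2014))",
    data=df,
    groups="centre_id",
)
print(model.fit().summary())
```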

Appendix B – Outcomes over time modelling output

Table B1. Summary of Year model effects from various different models using prior attainment. See text for details.

| Model | Restriction | Year 2019 coefficient [Ref 2014] (SE) | Estimated difference in outcomes from 2014 predicted for 2019 cohort |
| Linear | All centres | -0.12 (0.01)*** | -0.15 |
| Linear | Excluding new centres | -0.35 (0.04)*** | -0.39 |
| Linear | 2014 centres only | -0.24 (0.05)*** | -0.27 |
| Linear | 2015 centres only | -0.25 (0.03)*** | -0.28 |
| A/7 grade | All centres | -0.02 (0.03) | 0.19pp |
| A/7 grade | Excluding new centres | -0.28 (0.07)*** | -2.75pp |
| A/7 grade | 2014 centres only | -0.23 (0.08)** | -4.27pp |
| A/7 grade | 2015 centres only | -0.11 (0.06) | -2.04pp |
| C/4 grade | All centres | -0.08 (0.03)** | -1.81pp |
| C/4 grade | Excluding new centres | -0.53 (0.08)*** | -10.47pp |
| C/4 grade | 2014 centres only | -0.21 (0.10)* | -3.85pp |
| C/4 grade | 2015 centres only | -0.24 (0.06)*** | -4.30pp |
| G/1 grade | All centres | -0.05 (0.07) | -0.06pp |
| G/1 grade | Excluding new centres | -0.91 (0.30)** | -0.53pp |
| G/1 grade | 2014 centres only | -0.24 (0.38) | 0.00pp |
| G/1 grade | 2015 centres only | -0.02 (0.20) | -0.06pp |

Note: Statistical significance is indicated by p<0.001 (***), p<0.01 (**), p<0.05 (*).
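
The year coefficients in the binomial (grade attainment) models are on the log-odds scale, whereas the final column of Table B1 expresses the estimated difference as percentage points. The sketch below shows one common way of making such a conversion, by comparing predicted probabilities for a cohort with and without the year effect; the values here are synthetic and this is not necessarily the exact procedure used in the report.

```python
import numpy as np

def pp_difference(eta_2019, year_coef):
    """Mean change in predicted probability (in percentage points) when the
    2019-vs-2014 year effect (on the log-odds scale) is removed from the
    linear predictor."""
    p_2019 = 1 / (1 + np.exp(-eta_2019))
    p_as_2014 = 1 / (1 + np.exp(-(eta_2019 - year_coef)))
    return float(np.mean(p_2019 - p_as_2014) * 100)

# Synthetic log-odds for a 2019 cohort, and a year coefficient of -0.28
# (cf. the A/7 model excluding new centres in Table B1).
eta = np.random.default_rng(0).normal(loc=0.0, scale=1.5, size=5000)
print(round(pp_difference(eta, year_coef=-0.28), 2))
```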

Table B2. Detailed model output for prior attainment linear models. Student and centre characteristic control variables not shown, see text for details.

Variable M1 (all centres) M2 (no new centres) M3 (same centres 2014) M4 (same centres 2015)
Year 2012 [2019] -0.054 (0.039) NA NA NA
Year 2013 [2019] 0.071 (0.026)** NA NA NA
Year 2014 [2019] 0.128 (0.014)*** 0.35 (0.038)*** 0.241 (0.045)*** NA
Year 2015 [2019] 0.129 (0.012)*** 0.296 (0.025)*** 0.223 (0.046)*** 0.248 (0.03)***
Year 2016 [2019] 0.105 (0.009)*** 0.176 (0.013)*** 0.031 (0.042) 0.167 (0.027)***
Year 2017 [2019] 0.059 (0.008)*** 0.08 (0.01)*** -0.171 (0.042)*** -0.012 (0.027)
Year 2018 [2019] -0.025 (0.008)** -0.035 (0.009)*** -0.059 (0.042) -0.097 (0.027)***
Standardised KS2 score 1.069 (0.003)*** 1.118 (0.004)*** 0.995 (0.015)*** 1.031 (0.01)***
R-Squared (Marginal/conditional) 0.36/0.46 0.39/0.49 0.38/0.45 0.38/0.47
N (students/centres) 297,014/3,432 173,787/2,662 12,198/85 26,238/203

Table B3. Detailed model output for prior attainment A/7 binomial models. Student and centre characteristic control variables not shown, see text for details.

Variable M1 (all centres) M2 (no new centres) M3 (same centres 2014) M4 (same centres 2015)
Year 2012 [2019] -0.317 (0.071)*** NA NA NA
Year 2013 [2019] -0.101 (0.049)* NA NA NA
Year 2014 [2019] -0.019 (0.028) 0.283 (0.069)*** 0.233 (0.084)** NA
Year 2015 [2019] -0.091 (0.023)*** 0.139 (0.048)** 0.186 (0.086)* 0.111 (0.057)
Year 2016 [2019] 0.062 (0.018)*** 0.184 (0.027)*** 0.157 (0.079)* 0.241 (0.052)***
Year 2017 [2019] 0.018 (0.017) 0.065 (0.021)** 0.032 (0.08) 0.092 (0.052)
Year 2018 [2019] -0.009 (0.017) -0.011 (0.018) -0.048 (0.079) -0.102 (0.052)*
Standardised KS2 score 1.45 (0.008)*** 1.511 (0.01)*** 1.338 (0.034)*** 1.401 (0.024)***
R-Squared (Marginal/conditional) 0.39/0.48 0.41/0.5 0.38/0.44 0.4/0.47
N (students/centres) 297,014/3,432 173,787/2,662 12,198/85 26,238/203

Table B4. Detailed model output for prior attainment C/4 binomial models. Student and centre characteristic control variables not shown, see text for details.

Variable M1 (all centres) M2 (no new centres) M3 (same centres 2014) M4 (same centres 2015)
Year 2012 [2019] -0.125 (0.077) NA NA NA
Year 2013 [2019] -0.025 (0.048) NA NA NA
Year 2014 [2019] 0.081 (0.026)** 0.525 (0.082)*** 0.211 (0.101)* NA
Year 2015 [2019] 0.084 (0.021)*** 0.363 (0.051)*** 0.067 (0.1) 0.244 (0.062)***
Year 2016 [2019] 0.002 (0.015) 0.085 (0.025)*** -0.243 (0.088)** 0.031 (0.055)
Year 2017 [2019] -0.052 (0.015)*** -0.024 (0.019) -0.578 (0.087)*** -0.196 (0.055)***
Year 2018 [2019] -0.04 (0.015)** -0.055 (0.016)*** -0.206 (0.09)* -0.21 (0.055)***
Standardised KS2 score 1.419 (0.007)*** 1.512 (0.009)*** 1.408 (0.037)*** 1.412 (0.024)***
R-Squared (Marginal/conditional) 0.39/0.5 0.43/0.52 0.44/0.53 0.43/0.52
N (students/centres) 297,014/3,432 173,787/2,662 12,198/85 26,238/203

Table B5. Detailed model output for prior attainment G/1 binomial models. Student and centre characteristic control variables not shown, see text for details.

Variable M1 (all centres) M2 (no new centres) M3 (same centres 2014) M4 (same centres 2015)
Year 2012 [2019] 0.17 (0.251) NA NA NA
Year 2013 [2019] -0.136 (0.125) NA NA NA
Year 2014 [2019] 0.051 (0.068) 0.907 (0.305)** 0.238 (0.385) NA
Year 2015 [2019] 0.159 (0.057)** 0.499 (0.153)** 0.218 (0.385) 0.024 (0.2)
Year 2016 [2019] -0.107 (0.037)** -0.036 (0.061) -0.775 (0.288)** -0.374 (0.175)*
Year 2017 [2019] -0.174 (0.035)*** -0.172 (0.046)*** -1.176 (0.273)*** -0.935 (0.16)***
Year 2018 [2019] 0.012 (0.036) -0.027 (0.04) -0.477 (0.303) -0.418 (0.172)*
Standardised KS2 score 1.177 (0.014)*** 1.279 (0.019)*** 1.159 (0.086)*** 1.216 (0.057)***
R-Squared (Marginal/conditional) 0.34/0.52 0.38/0.52 0.92/0.95 0.65/0.76
N (students/centres) 297,014/3,432 173,787/2,662 12,198/85 26,238/203

Table B6. Detailed model output for concurrent attainment linear models. Student and centre characteristic control variables not shown, see text for details.

Variable M1 (all centres) M2 (no new centres) M3 (same centres 2014) M4 (same centres 2015)
Year 2012 [2019] -0.005 (0.029) NA NA NA
Year 2013 [2019] 0.127 (0.02)*** NA NA NA
Year 2014 [2019] 0.117 (0.011)*** 0.411 (0.027)*** 0.314 (0.034)*** NA
Year 2015 [2019] 0.128 (0.008)*** 0.35 (0.017)*** 0.257 (0.032)*** 0.331 (0.021)***
Year 2016 [2019] 0.097 (0.006)*** 0.208 (0.01)*** 0.078 (0.031)* 0.237 (0.02)***
Year 2017 [2019] 0.046 (0.006)*** 0.076 (0.007)*** -0.128 (0.031)*** 0.008 (0.02)
Year 2018 [2019] -0.018 (0.006)** -0.021 (0.006)*** -0.063 (0.031)* -0.051 (0.02)*
Standardised mean GCSE score 1.592 (0.002)*** 1.625 (0.003)*** 1.485 (0.011)*** 1.527 (0.007)***
R-Squared (Marginal/conditional) 0.65/0.7 0.68/0.73 0.64/0.68 0.65/0.7
N (students/centres) 321,117/3,442 185,439/2,654 13,663/84 28,408/203

Table B7. Detailed model output for concurrent attainment A/7 binomial models. Student and centre characteristic control variables not shown, see text for details.

Variable M1 (all centres) M2 (no new centres) M3 (same centres 2014) M4 (same centres 2015)
Year 2012 [2019] -0.356 (0.084)*** NA NA NA
Year 2013 [2019] -0.058 (0.059) NA NA NA
Year 2014 [2019] 0.024 (0.033) 0.476 (0.08)*** 0.4 (0.097)*** NA
Year 2015 [2019] -0.064 (0.025)* 0.308 (0.052)*** 0.257 (0.093)** 0.3 (0.063)***
Year 2016 [2019] 0.114 (0.021)*** 0.328 (0.031)*** 0.255 (0.091)** 0.441 (0.061)***
Year 2017 [2019] 0.031 (0.02) 0.1 (0.025)*** 0.046 (0.092) 0.127 (0.062)*
Year 2018 [2019] -0.021 (0.019) -0.018 (0.022) -0.078 (0.091) -0.118 (0.061)
Standardised mean GCSE score 3.064 (0.013)*** 3.2 (0.018)*** 2.911 (0.055)*** 3.036 (0.04)***
R-Squared (Marginal/conditional) 0.71/0.75 0.73/0.77 0.7/0.72 0.71/0.75
N (students/centres) 321,117/3,442 185,439/2,654 13,663/84 28,408/203

Table B8. Detailed model output for concurrent attainment C/4 binomial models. Student and centre characteristic control variables not shown, see text for details.

Variable M1 (all centres) M2 (no new centres) M3 (same centres 2014) M4 (same centres 2015)
Year 2012 [2019] -0.05 (0.091) NA NA NA
Year 2013 [2019] 0.079 (0.059) NA NA NA
Year 2014 [2019] 0.061 (0.03)* 0.769 (0.095)*** 0.389 (0.113)*** NA
Year 2015 [2019] 0.11 (0.022)*** 0.657 (0.056)*** 0.269 (0.107)* 0.568 (0.068)***
Year 2016 [2019] -0.01 (0.018) 0.181 (0.03)*** -0.195 (0.099)* 0.206 (0.065)**
Year 2017 [2019] -0.087 (0.017)*** -0.046 (0.023)* -0.659 (0.098)*** -0.196 (0.063)**
Year 2018 [2019] -0.021 (0.017) -0.032 (0.019) -0.24 (0.1)* -0.141 (0.063)*
Standardised mean GCSE score 2.994 (0.012)*** 3.231 (0.017)*** 2.849 (0.059)*** 2.941 (0.041)***
R-Squared (Marginal/conditional) 0.69/0.75 0.73/0.78 0.69/0.74 0.7/0.75
N (students/centres) 321,117/3,442 185,439/2,654 13,663/84 28,408/203

Table B9. Detailed model output for concurrent attainment G/1 binomial models. Student and centre characteristic control variables not shown, see text for details.

Variable M1 (all centres) M2 (no new centres) M3 (same centres 2014) M4 (same centres 2015)
Year 2012 [2019] 0.235 (0.275) NA NA NA
Year 2013 [2019] 0.016 (0.15) NA NA NA
Year 2014 [2019] -0.088 (0.074) 0.921 (0.325)** 0.279 (0.393) NA
Year 2015 [2019] 0.057 (0.056) 0.546 (0.154)*** 0.196 (0.355) 0.216 (0.198)
Year 2016 [2019] -0.205 (0.041)*** -0.005 (0.068) -0.744 (0.292)* -0.211 (0.181)
Year 2017 [2019] -0.248 (0.038)*** -0.226 (0.051)*** -1.264 (0.278)*** -0.886 (0.165)***
Year 2018 [2019] 0.014 (0.039) -0.002 (0.044) -0.524 (0.305) -0.227 (0.177)
Standardised mean GCSE score 2.533 (0.02)*** 2.761 (0.029)*** 2.319 (0.12)*** 2.525 (0.082)***
R-Squared (Marginal/conditional) 0.61/0.71 0.66/0.73 0.93/0.95 0.72/0.79
N (students/centres) 321,117/3,442 185,439/2,654 13,663/84 28,408/203
