
Validity – Types, Examples and Guide

Validity

Validity is a fundamental concept in research, referring to the extent to which a test, measurement, or study accurately reflects or assesses the specific concept that the researcher is attempting to measure. Ensuring validity is crucial as it determines the trustworthiness and credibility of the research findings.

Research Validity

Research validity pertains to the accuracy and truthfulness of the research. It examines whether the research truly measures what it claims to measure. Without validity, research results can be misleading or erroneous, leading to incorrect conclusions and potentially flawed applications.

How to Ensure Validity in Research

Ensuring validity in research involves several strategies:

  • Clear Operational Definitions : Define variables clearly and precisely.
  • Use of Reliable Instruments : Employ measurement tools that have been tested for reliability.
  • Pilot Testing : Conduct preliminary studies to refine the research design and instruments.
  • Triangulation : Use multiple methods or sources to cross-verify results.
  • Control Variables : Control extraneous variables that might influence the outcomes.

Types of Validity

Validity is categorized into several types, each addressing different aspects of measurement accuracy.

Internal Validity

Internal validity refers to the degree to which the results of a study can be attributed to the treatments or interventions rather than other factors. It is about ensuring that the study is free from confounding variables that could affect the outcome.

External Validity

External validity concerns the extent to which the research findings can be generalized to other settings, populations, or times. High external validity means the results are applicable beyond the specific context of the study.

Construct Validity

Construct validity evaluates whether a test or instrument measures the theoretical construct it is intended to measure. It involves ensuring that the test is truly assessing the concept it claims to represent.

Content Validity

Content validity examines whether a test covers the entire range of the concept being measured. It ensures that the test items represent all facets of the concept.

Criterion Validity

Criterion validity assesses how well one measure predicts an outcome based on another measure. It is divided into two types (a short code sketch follows the list):

  • Predictive Validity : How well a test predicts future performance.
  • Concurrent Validity : How well a test correlates with a currently existing measure.
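Criterion validity is typically quantified as a correlation coefficient between the test and its criterion. Here is a minimal sketch in Python; the test scores and performance ratings are entirely hypothetical, and NumPy's `corrcoef` stands in for whatever statistics tool you prefer.

```python
# Minimal sketch: predictive validity as the correlation between a test
# administered now and a criterion measured later. All data are hypothetical.
import numpy as np

test_scores = np.array([62, 75, 48, 90, 70, 55, 83, 66])          # entry test
job_ratings = np.array([3.1, 3.8, 2.5, 4.6, 3.5, 2.9, 4.2, 3.3])  # a year later

r = np.corrcoef(test_scores, job_ratings)[0, 1]  # Pearson correlation
print(f"Predictive validity coefficient: r = {r:.2f}")

# For concurrent validity, the criterion is measured at the same time as the
# test; the computation is identical, only the timing of the data differs.
```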

Face Validity

Face validity refers to the extent to which a test appears to measure what it is supposed to measure, based on superficial inspection. While it is the least scientific measure of validity, it is important for ensuring that stakeholders believe in the test’s relevance.

Importance of Validity

Validity is crucial because it directly affects the credibility of research findings. Valid results ensure that conclusions drawn from research are accurate and can be trusted. This, in turn, influences the decisions and policies based on the research.

Examples of Validity

  • Internal Validity : A randomized controlled trial (RCT) where the random assignment of participants helps eliminate biases.
  • External Validity : A study on educational interventions that can be applied to different schools across various regions.
  • Construct Validity : A psychological test that accurately measures depression levels.
  • Content Validity : An exam that covers all topics taught in a course.
  • Criterion Validity : A job performance test that predicts future job success.

Where to Write About Validity in a Thesis

In a thesis, the methodology section should include discussions about validity. Here, you explain how you ensured the validity of your research instruments and design. Additionally, you may discuss validity in the results section, interpreting how the validity of your measurements affects your findings.

Applications of Validity

Validity has wide applications across various fields:

  • Education : Ensuring assessments accurately measure student learning.
  • Psychology : Developing tests that correctly diagnose mental health conditions.
  • Market Research : Creating surveys that accurately capture consumer preferences.

Limitations of Validity

While ensuring validity is essential, it has its limitations:

  • Complexity : Achieving high validity can be complex and resource-intensive.
  • Context-Specific : Some validity types may not be universally applicable across all contexts.
  • Subjectivity : Certain types of validity, like face validity, involve subjective judgments.

By understanding and addressing these aspects of validity, researchers can enhance the quality and impact of their studies, leading to more reliable and actionable results.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer


Validity in research: a guide to measuring the right things

Last updated: 27 February 2023 | Reviewed by Cathy Heath


Validity is necessary for all types of studies ranging from market validation of a business or product idea to the effectiveness of medical trials and procedures. So, how can you determine whether your research is valid? This guide can help you understand what validity is, the types of validity in research, and the factors that affect research validity.


  • What is validity?

In the most basic sense, validity is the quality of being based on truth or reason. Valid research strives to eliminate the effects of unrelated information and to control the circumstances under which evidence is collected.

Validity in research is the ability to conduct an accurate study with the right tools and conditions to yield acceptable and reliable data that can be reproduced. Researchers rely on carefully calibrated tools for precise measurements. However, collecting accurate information can be more of a challenge.

To achieve and maintain validity, studies must be conducted in environments that don't sway the results. Validity can be compromised by asking the wrong questions or relying on limited data.

Why is validity important in research?

Research is used to improve life for humans. Every discovery and product, from innovative medical breakthroughs to advanced consumer goods, depends on accurate research to be dependable. Without it, the results couldn't be trusted, and products would likely fail. Businesses would lose money, and patients couldn't rely on medical treatments.

While wasting money on a lousy product is a concern, a lack of validity paints a much grimmer picture in fields like medicine or the manufacture of automobiles and airplanes. Whether you're launching an exciting new product or conducting scientific research, validity can determine success or failure.

  • What is reliability?

Reliability is the ability of a method to yield consistency. If the same result can be consistently achieved by using the same method to measure something, the measurement method is said to be reliable. For example, a thermometer that shows the same temperatures each time in a controlled environment is reliable.

While high reliability is a part of measuring validity, it's only part of the puzzle. If the reliable thermometer hasn't been properly calibrated and reliably measures temperatures two degrees too high, it doesn't provide a valid (accurate) measure of temperature. 

Similarly, if a researcher uses a thermometer to measure weight, the results won't be accurate because it's the wrong tool for the job. 

  • How are reliability and validity assessed?

While measuring reliability is a part of measuring validity, there are distinct ways to assess both measurements for accuracy. 

How is reliability measured?

Reliability is assessed through measures of consistency and stability, including:

  • Consistency and stability of the same measure when repeated multiple times under the same conditions
  • Consistency and stability of the measure across different test subjects
  • Consistency and stability of results from different parts of a test designed to measure the same thing

How is validity measured?

Since validity refers to how accurately a method measures what it is intended to measure, it can be difficult to assess directly. Validity can be estimated by comparing research results to other relevant data or theories, based on:

  • The adherence of a measure to existing knowledge of how the concept is measured
  • The ability to cover all aspects of the concept being measured
  • The relation of the result to other valid measures of the same concept

  • What are the types of validity in a research design?

Research validity is broadly grouped into two categories: internal and external. Yet this grouping alone doesn't capture all the distinctions involved, so research validity is often divided into seven distinct types.

Face validity : A test that appears valid simply because of the appropriateness or relevance of the testing method, included information, or tools used.

Content validity : The determination that the measure used in research covers the full domain of the content.

Construct validity : The assessment of the suitability of the measurement tool to measure the activity being studied.

Internal validity : The assessment of how well your study design rules out alternative explanations, so that an observed cause-and-effect response cannot be attributed to factors other than the variables under study.

External validity : The extent to which the study will be accurate beyond the sample and the level to which it can be generalized in other settings, populations, and measures.

Statistical conclusion validity: The determination of whether a relationship exists between procedures and outcomes (appropriate sampling and measuring procedures along with appropriate statistical tests).

Criterion-related validity : A measurement of the quality of your testing methods against a criterion measure (like a “gold standard” test) that is measured at the same time.

  • Examples of validity

Like different types of research and the various ways to measure validity, examples of validity can vary widely. These include:

A questionnaire may be considered valid because each question addresses specific and relevant aspects of the study subject.

In a brand assessment study, researchers can use comparison testing to verify the results of an initial study. For example, the results from a focus group response about brand perception are considered more valid when the results match that of a questionnaire answered by current and potential customers.

A test to measure a class of students' understanding of the English language contains reading, writing, listening, and speaking components to cover the full scope of how language is used.

  • Factors that affect research validity

Certain factors can affect research validity in both positive and negative ways. By understanding the factors that improve validity and those that threaten it, you can enhance the validity of your study. These include:

  • Random selection of participants vs. the selection of participants that are representative of your study criteria
  • Blinding with interventions the participants are unaware of (like the use of placebos)
  • Manipulating the experiment by inserting a variable that will change the results
  • Randomly assigning participants to treatment and control groups to avoid bias
  • Following specific procedures during the study to avoid unintended effects
  • Conducting a study in the field instead of a laboratory for more accurate results
  • Replicating the study with different factors or settings to compare results
  • Using statistical methods to adjust for inconclusive data

What are the common validity threats in research, and how can their effects be minimized or nullified?

Research validity can be difficult to achieve because of internal and external threats that produce inaccurate results. These factors can jeopardize validity.

  • History: Events that occur between an early and a later measurement
  • Maturation: Natural changes in subjects over the course of the study, which can be mistakenly attributed to the effects of the study
  • Repeated testing: The outcome of earlier tests can change the outcome of subsequent tests
  • Selection of subjects: Unconscious bias that can result in comparison groups that are not equivalent
  • Statistical regression: Choosing subjects based on extreme scores doesn't yield an accurate outcome for the majority of individuals
  • Attrition: The sample group diminishing significantly during the course of the study

While some validity threats can be minimized or wholly nullified, removing all threats from a study is impossible. For example, random selection can remove unconscious bias and statistical regression. 

Researchers can even hope to avoid attrition by using smaller study groups. Yet, smaller study groups could potentially affect the research in other ways. The best practice for researchers to prevent validity threats is through careful environmental planning and reliable data-gathering methods.

  • How to ensure validity in your research

Researchers should be mindful of the importance of validity in the early planning stages of any study to avoid inaccurate results. Researchers must take the time to consider tools and methods as well as how the testing environment matches closely with the natural environment in which results will be used.

The following steps can be used to ensure validity in research:

  • Choose appropriate methods of measurement
  • Use appropriate sampling to choose test subjects
  • Create an accurate testing environment

How do you maintain validity in research?

Accurate research is usually conducted over a period of time with different test subjects. To maintain validity across an entire study, you must take specific steps to ensure that gathered data has the same levels of accuracy. 

Consistency is crucial for maintaining validity in research. When researchers apply methods consistently and standardize the circumstances under which data is collected, validity can be maintained across the entire study.

Is there a need for validation of the research instrument before its implementation?

An essential part of validity is choosing the right research instrument or method for accurate results. Consider the thermometer that is reliable but still produces inaccurate results. You're unlikely to achieve research validity without validation activities such as calibration and checks of content and construct validity.

  • Understanding research validity for more accurate results

Without validity, research can't provide the accuracy necessary to deliver a useful study. By getting a clear understanding of validity in research, you can take steps to improve your research skills and achieve more accurate results.




Reliability vs. Validity in Research | Difference, Types and Examples

Published on July 3, 2019 by Fiona Middleton . Revised on June 22, 2023.

Reliability and validity are concepts used to evaluate the quality of research. They indicate how well a method, technique, or test measures something. Reliability is about the consistency of a measure, and validity is about the accuracy of a measure.

It’s important to consider reliability and validity when you are creating your research design , planning your methods, and writing up your results, especially in quantitative research . Failing to do so can lead to several types of research bias and seriously affect your work.

Reliability vs validity

|  | Reliability | Validity |
|---|---|---|
| What does it tell you? | The extent to which the results can be reproduced when the research is repeated under the same conditions. | The extent to which the results really measure what they are supposed to measure. |
| How is it assessed? | By checking the consistency of results across time, across different observers, and across parts of the test itself. | By checking how well the results correspond to established theories and other measures of the same concept. |
| How do they relate? | A reliable measurement is not always valid: the results might be reproducible, but they’re not necessarily correct. | A valid measurement is generally reliable: if a test produces accurate results, they should be reproducible. |

Table of contents

  • Understanding reliability vs validity
  • How are reliability and validity assessed?
  • How to ensure validity and reliability in your research
  • Where to write about reliability and validity in a thesis

Reliability and validity are closely related, but they mean different things. A measurement can be reliable without being valid. However, if a measurement is valid, it is usually also reliable.

What is reliability?

Reliability refers to how consistently a method measures something. If the same result can be consistently achieved by using the same methods under the same circumstances, the measurement is considered reliable.

What is validity?

Validity refers to how accurately a method measures what it is intended to measure. If research has high validity, that means it produces results that correspond to real properties, characteristics, and variations in the physical or social world.

High reliability is one indicator that a measurement is valid. If a method is not reliable, it probably isn’t valid.

For example, if a thermometer shows different temperatures each time, even though you have carefully controlled conditions to ensure the sample’s temperature stays the same, the thermometer is probably malfunctioning, and therefore its measurements are not valid.

However, reliability on its own is not enough to ensure validity. Even if a test is reliable, it may not accurately reflect the real situation.

Validity is harder to assess than reliability, but it is even more important. To obtain useful results, the methods you use to collect data must be valid: the research must be measuring what it claims to measure. This ensures that your discussion of the data and the conclusions you draw are also valid.


Reliability can be estimated by comparing different versions of the same measurement. Validity is harder to assess, but it can be estimated by comparing the results to other relevant data or theory. Methods of estimating reliability and validity are usually split up into different types.

Types of reliability

Different types of reliability can be estimated through various statistical methods.

| Type of reliability | What does it assess? | Example |
|---|---|---|
| Test-retest | The consistency of a measure across time: do you get the same results when you repeat the measurement? | A group of participants complete a questionnaire designed to measure personality traits. If they repeat the questionnaire days, weeks or months apart and give the same answers, this indicates high test-retest reliability. |
| Inter-rater | The consistency of a measure across raters or observers: do you get the same results when different people conduct the same measurement? | Based on an assessment criteria checklist, five examiners submit substantially different results for the same student project. This indicates that the assessment checklist has low inter-rater reliability (for example, because the criteria are too subjective). |
| Internal consistency | The consistency of the measurement itself: do you get the same results from different parts of a test that are designed to measure the same thing? | You design a questionnaire to measure self-esteem. If you randomly split the results into two halves, there should be a strong correlation between the two sets of results. If the two results are very different, this indicates low internal consistency. |
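To make the internal consistency row concrete, here is a minimal split-half sketch in Python. The questionnaire responses are hypothetical: the items are split into two halves, the halves are correlated, and the Spearman-Brown formula scales that correlation up to an estimate for the full-length test.

```python
# Minimal sketch: split-half reliability with the Spearman-Brown correction.
# Responses are hypothetical (rows = respondents, columns = Likert items).
import numpy as np

responses = np.array([
    [4, 3, 4, 4, 5, 4],
    [2, 2, 1, 2, 2, 3],
    [5, 5, 4, 5, 4, 5],
    [3, 2, 3, 3, 3, 2],
    [4, 4, 5, 4, 4, 4],
])

half_a = responses[:, 0::2].sum(axis=1)  # total score on odd-numbered items
half_b = responses[:, 1::2].sum(axis=1)  # total score on even-numbered items

r_halves = np.corrcoef(half_a, half_b)[0, 1]
# Spearman-Brown: estimate full-test reliability from the half-test correlation.
reliability = 2 * r_halves / (1 + r_halves)
print(f"Split-half reliability = {reliability:.2f}")
```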

Types of validity

The validity of a measurement can be estimated based on three main types of evidence. Each type can be evaluated through expert judgement or statistical methods.

| Type of validity | What does it assess? | Example |
|---|---|---|
| Construct | The adherence of a measure to existing theory and knowledge of the concept being measured. | A self-esteem questionnaire could be assessed by measuring other traits known or assumed to be related to the concept of self-esteem (such as social skills and optimism). Strong correlation between the scores for self-esteem and associated traits would indicate high construct validity. |
| Content | The extent to which the measurement covers all aspects of the concept being measured. | A test that aims to measure a class of students’ level of Spanish contains reading, writing and speaking components, but no listening component. Experts agree that listening comprehension is an essential aspect of language ability, so the test lacks content validity for measuring the overall level of ability in Spanish. |
| Criterion | The extent to which the result of a measure corresponds to other valid measures of the same concept. | A survey is conducted to measure the political opinions of voters in a region. If the results accurately predict the later outcome of an election in that region, this indicates that the survey has high criterion validity. |

To assess the validity of a cause-and-effect relationship, you also need to consider internal validity (the design of the experiment ) and external validity (the generalizability of the results).

The reliability and validity of your results depends on creating a strong research design , choosing appropriate methods and samples, and conducting the research carefully and consistently.

Ensuring validity

If you use scores or ratings to measure variations in something (such as psychological traits, levels of ability or physical properties), it’s important that your results reflect the real variations as accurately as possible. Validity should be considered in the very earliest stages of your research, when you decide how you will collect your data.

  • Choose appropriate methods of measurement

Ensure that your method and measurement technique are high quality and targeted to measure exactly what you want to know. They should be thoroughly researched and based on existing knowledge.

For example, to collect data on a personality trait, you could use a standardized questionnaire that is considered reliable and valid. If you develop your own questionnaire, it should be based on established theory or findings of previous studies, and the questions should be carefully and precisely worded.

  • Use appropriate sampling methods to select your subjects

To produce valid and generalizable results, clearly define the population you are researching (e.g., people from a specific age range, geographical location, or profession).  Ensure that you have enough participants and that they are representative of the population. Failing to do so can lead to sampling bias and selection bias .

Ensuring reliability

Reliability should be considered throughout the data collection process. When you use a tool or technique to collect data, it’s important that the results are precise, stable, and reproducible .

  • Apply your methods consistently

Plan your method carefully to make sure you carry out the same steps in the same way for each measurement. This is especially important if multiple researchers are involved.

For example, if you are conducting interviews or observations , clearly define how specific behaviors or responses will be counted, and make sure questions are phrased the same way each time. Failing to do so can lead to errors such as omitted variable bias or information bias .

  • Standardize the conditions of your research

When you collect your data, keep the circumstances as consistent as possible to reduce the influence of external factors that might create variation in the results.

For example, in an experimental setup, make sure all participants are given the same information and tested under the same conditions, preferably in a properly randomized setting. Failing to do so can lead to a placebo effect , Hawthorne effect , or other demand characteristics . If participants can guess the aims or objectives of a study, they may attempt to act in more socially desirable ways.

It’s appropriate to discuss reliability and validity in various sections of your thesis or dissertation or research paper . Showing that you have taken them into account in planning your research and interpreting the results makes your work more credible and trustworthy.

Reliability and validity in a thesis

| Section | Discuss |
|---|---|
| Literature review | What have other researchers done to devise and improve methods that are reliable and valid? |
| Methodology | How did you plan your research to ensure reliability and validity of the measures used? This includes the chosen sample set and size, sample preparation, external conditions and measuring techniques. |
| Results | If you calculate reliability and validity, state these values alongside your main results. |
| Discussion | This is the moment to talk about how reliable and valid your results actually were. Were they consistent, and did they reflect true values? If not, why not? |
| Conclusion | If reliability and validity were a big problem for your findings, it might be helpful to mention this here. |



Validity & Reliability In Research

A Plain-Language Explanation (With Examples)

By: Derek Jansen (MBA) | Expert Reviewer: Kerryn Warren (PhD) | September 2023

Validity and reliability are two related but distinctly different concepts within research. Understanding what they are and how to achieve them is critically important to any research project. In this post, we’ll unpack these two concepts as simply as possible.

This post is based on our popular online course, Research Methodology Bootcamp . In the course, we unpack the basics of methodology using straightforward language and loads of examples.

Overview: Validity & Reliability

  • The big picture
  • Validity 101
  • Reliability 101 
  • Key takeaways

First, The Basics…

First, let’s start with a big-picture view and then we can zoom in to the finer details.

Validity and reliability are two incredibly important concepts in research, especially within the social sciences. Both validity and reliability have to do with the measurement of variables and/or constructs – for example, job satisfaction, intelligence, productivity, etc. When undertaking research, you’ll often want to measure these types of constructs and variables and, at the simplest level, validity and reliability are about ensuring the quality and accuracy of those measurements .

As you can probably imagine, if your measurements aren’t accurate or there are quality issues at play when you’re collecting your data, your entire study will be at risk. Therefore, validity and reliability are very important concepts to understand (and to get right). So, let’s unpack each of them.


What Is Validity?

In simple terms, validity (also called “construct validity”) is all about whether a research instrument accurately measures what it’s supposed to measure .

For example, let’s say you have a set of Likert scales that are supposed to quantify someone’s level of overall job satisfaction. If this set of scales focused purely on only one dimension of job satisfaction, say pay satisfaction, this would not be a valid measurement, as it only captures one aspect of the multidimensional construct. In other words, pay satisfaction alone is only one contributing factor toward overall job satisfaction, and therefore it’s not a valid way to measure someone’s job satisfaction.

Oftentimes in quantitative studies, the way in which the researcher or survey designer interprets a question or statement can differ from how the study participants interpret it . Given that respondents don’t have the opportunity to ask clarifying questions when taking a survey, it’s easy for these sorts of misunderstandings to crop up. Naturally, if the respondents are interpreting the question in the wrong way, the data they provide will be pretty useless . Therefore, ensuring that a study’s measurement instruments are valid – in other words, that they are measuring what they intend to measure – is incredibly important.

There are various types of validity and we’re not going to go down that rabbit hole in this post, but it’s worth quickly highlighting the importance of making sure that your research instrument is tightly aligned with the theoretical construct you’re trying to measure .  In other words, you need to pay careful attention to how the key theories within your study define the thing you’re trying to measure – and then make sure that your survey presents it in the same way.

For example, sticking with the “job satisfaction” construct we looked at earlier, you’d need to clearly define what you mean by job satisfaction within your study (and this definition would of course need to be underpinned by the relevant theory). You’d then need to make sure that your chosen definition is reflected in the types of questions or scales you’re using in your survey . Simply put, you need to make sure that your survey respondents are perceiving your key constructs in the same way you are. Or, even if they’re not, that your measurement instrument is capturing the necessary information that reflects your definition of the construct at hand.


What Is Reliability?

As with validity, reliability is an attribute of a measurement instrument – for example, a survey, a weight scale or even a blood pressure monitor. But while validity is concerned with whether the instrument is measuring the “thing” it’s supposed to be measuring, reliability is concerned with consistency and stability . In other words, reliability reflects the degree to which a measurement instrument produces consistent results when applied repeatedly to the same phenomenon , under the same conditions .

As you can probably imagine, a measurement instrument that achieves a high level of consistency is naturally more dependable (or reliable) than one that doesn’t – in other words, it can be trusted to provide consistent measurements . And that, of course, is what you want when undertaking empirical research. If you think about it within a more domestic context, just imagine if you found that your bathroom scale gave you a different number every time you hopped on and off of it – you wouldn’t feel too confident in its ability to measure the variable that is your body weight 🙂

It’s worth mentioning that reliability also extends to the person using the measurement instrument . For example, if two researchers use the same instrument (let’s say a measuring tape) and they get different measurements, there’s likely an issue in terms of how one (or both) of them are using the measuring tape. So, when you think about reliability, consider both the instrument and the researcher as part of the equation.

As with validity, there are various types of reliability and various tests that can be used to assess the reliability of an instrument. A popular one that you’ll likely come across for survey instruments is Cronbach’s alpha , which is a statistical measure that quantifies the degree to which items within an instrument (for example, a set of Likert scales) measure the same underlying construct . In other words, Cronbach’s alpha indicates how closely related the items are and whether they consistently capture the same concept . 
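As an illustration of the arithmetic behind Cronbach's alpha, here is a minimal sketch in Python. The Likert-scale responses are hypothetical, and the function is a plain implementation of the standard formula rather than anything specific to a particular survey tool.

```python
# Minimal sketch: Cronbach's alpha for a set of Likert-scale items.
# Standard formula: alpha = k/(k-1) * (1 - sum(item variances)/variance(totals)).
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array, rows = respondents, columns = scale items."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of respondents' totals
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Five respondents answering four hypothetical job-satisfaction items (1-5).
responses = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
])
print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
```

An alpha of roughly 0.8 or higher is conventionally read as good internal consistency, though acceptable thresholds vary by field.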

Recap: Key Takeaways

Alright, let’s quickly recap to cement your understanding of validity and reliability:

  • Validity is concerned with whether an instrument (e.g., a set of Likert scales) is measuring what it’s supposed to measure
  • Reliability is concerned with whether that measurement is consistent and stable when measuring the same phenomenon under the same conditions.

In short, validity and reliability are both essential to ensuring that your data collection efforts deliver high-quality, accurate data that help you answer your research questions . So, be sure to always pay careful attention to the validity and reliability of your measurement instruments when collecting and analysing data. As the adage goes, “rubbish in, rubbish out” – make sure that your data inputs are rock-solid.




Reliability vs. Validity in Research: Types & Examples

When it comes to research, getting things right is crucial. That’s where the concepts of “Reliability vs Validity in Research” come in. 

Imagine it like a balancing act – making sure your measurements are consistent and accurate at the same time. This is where test-retest reliability, having different researchers check things, and keeping things consistent within your research plays a big role. 

As we dive into this topic, we’ll uncover the differences between reliability and validity, see how they work together, and learn how to use them effectively.

Understanding Reliability vs. Validity in Research

When it comes to collecting data and conducting research, two crucial concepts stand out: reliability and validity. 

These pillars uphold the integrity of research findings, ensuring that the data collected and the conclusions drawn are both meaningful and trustworthy. Let’s dive into the heart of these two concepts, reliability and validity, to truly comprehend their significance in the realm of research.

What is reliability?

Reliability refers to the consistency and dependability of the data collection process. It’s like having a steady hand that produces the same result each time it reaches for a task. 

In the research context, reliability is all about ensuring that if you were to repeat the same study using the same reliable measurement technique, you’d end up with the same results. It’s like having multiple researchers independently conduct the same experiment and getting outcomes that align perfectly.

Imagine you’re using a thermometer to measure the temperature of the water. You have a reliable measurement if you dip the thermometer into the water multiple times and get the same reading each time. This tells you that your method and measurement technique consistently produce the same results, whether it’s you or another researcher performing the measurement.

What is validity?

On the other hand, validity refers to the accuracy and meaningfulness of your data. It’s like ensuring that the puzzle pieces you’re putting together actually form the intended picture. When you have validity, you know that your method and measurement technique are consistent and capable of producing results aligned with reality.

Think of it this way; Imagine you’re conducting a test that claims to measure a specific trait, like problem-solving ability. If the test consistently produces results that accurately reflect participants’ problem-solving skills, then the test has high validity. In this case, the test produces accurate results that truly correspond to the trait it aims to measure.

In essence, while reliability assures you that your data collection process is like a well-oiled machine producing the same results, validity steps in to ensure that these results are not only consistent but also accurate and relevant.

Together, these concepts provide researchers with the tools to conduct research that stands on a solid foundation of dependable methods and meaningful insights.

Types of Reliability

Let’s explore the various types of reliability that researchers consider to ensure their work stands on solid ground.

Test-retest reliability

Test-retest reliability involves assessing the consistency of measurements over time. It’s like taking the same measurement or test twice – once and then again after a certain period. If the results align closely, it indicates that the measurement is reliable over time. Think of it as capturing the essence of stability. 

Inter-rater reliability

When multiple researchers or observers are part of the equation, interrater reliability comes into play. This type of reliability assesses the level of agreement between different observers when evaluating the same phenomenon. It’s like ensuring that different pairs of eyes perceive things in a similar way. 
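The article doesn't prescribe a statistic here, but a common way to quantify agreement between two raters on categorical judgments is Cohen's kappa, which discounts the agreement you'd expect by chance. Below is a minimal, self-contained sketch with hypothetical ratings.

```python
# Minimal sketch: Cohen's kappa for two raters assigning categorical labels.
# Ratings are hypothetical.
from collections import Counter

rater_1 = ["yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes", "yes"]
rater_2 = ["yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes", "yes"]

n = len(rater_1)
observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n  # raw agreement

# Chance agreement: probability both raters pick the same label independently.
c1, c2 = Counter(rater_1), Counter(rater_2)
expected = sum((c1[label] / n) * (c2[label] / n) for label in c1 | c2)

kappa = (observed - expected) / (1 - expected)
print(f"Observed agreement = {observed:.2f}, Cohen's kappa = {kappa:.2f}")
```

A kappa near 1 indicates strong agreement beyond chance; a kappa near 0 means the raters agree no more often than chance would predict.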

Internal reliability

Internal consistency dives into the harmony among different items within a measurement tool aiming to assess the same concept. This often comes into play in surveys or questionnaires, where participants respond to various items related to a single construct. If the responses to these items consistently reflect the same underlying concept, the measurement is said to have high internal consistency. 

Types of validity

Let’s explore the various types of validity that researchers consider to ensure their work stands on solid ground.

Content validity

It delves into whether a measurement truly captures all dimensions of the concept it intends to measure. It’s about making sure your measurement tool covers all relevant aspects comprehensively. 

Imagine designing a test to assess students’ understanding of a history chapter. It exhibits high content validity if the test includes questions about key events, dates, and causes. However, if it focuses solely on dates and omits causation, its content validity might be questionable.

Construct validity

It assesses how well a measurement aligns with established theories and concepts. It’s like ensuring that your measurement is a true representation of the abstract construct you’re trying to capture. 

Criterion validity

Criterion validity examines how well your measurement corresponds to other established measurements of the same concept. It’s about making sure your measurement accurately predicts or correlates with external criteria.

Differences between reliability and validity in research

Let’s delve into the differences between reliability and validity in research.

| No | Category | Reliability | Validity |
|---|---|---|---|
| 01 | Meaning | Focuses on the consistency of measurements over time and conditions. | Concerns the accuracy and relevance of measurements in capturing the intended concept. |
| 02 | What it assesses | Assesses whether the same results can be obtained consistently from repeated measurements. | Assesses whether measurements truly measure what they are intended to measure. |
| 03 | Assessment methods | Evaluated through test-retest consistency, interrater agreement, and internal consistency. | Assessed through content coverage, construct alignment, and criterion correlation. |
| 04 | Interrelation | A measurement can be reliable (consistent) without being valid (accurate). | A valid measurement is typically reliable, but high reliability doesn’t guarantee validity. |
| 05 | Importance | Ensures data consistency and replicability. | Guarantees meaningful and credible results. |
| 06 | Focus | Focuses on the stability and consistency of measurement outcomes. | Focuses on the meaningfulness and accuracy of measurement outcomes. |
| 07 | Outcome | Reproducibility of measurements is the key outcome. | Meaningful and accurate measurement outcomes are the primary goal. |

While both reliability and validity contribute to trustworthy research, they address distinct aspects. Reliability ensures consistent results, while validity ensures accurate and relevant results that reflect the true nature of the measured concept.

Example of Reliability and Validity in Research

In this section, we’ll explore instances that highlight the differences between reliability and validity and how they play a crucial role in ensuring the credibility of research findings.

Example of reliability

Imagine you are studying the reliability of a smartphone’s battery life measurement. To collect data, you fully charge the phone and measure the battery life three times in the same controlled environment—same apps running, same brightness level, and same usage patterns. 

If the measurements consistently show a similar battery life duration each time you repeat the test, it indicates that your measurement method is reliable. The consistent results under the same conditions assure you that the battery life measurement can be trusted to provide dependable information about the phone’s performance.

Example of validity

Researchers collect data from a group of participants in a study aiming to assess the validity of a newly developed stress questionnaire. To ensure validity, they compare the scores obtained from the stress questionnaire with the participants’ actual stress levels measured using physiological indicators such as heart rate variability and cortisol levels. 

If participants’ scores correlate strongly with their physiological stress levels, the questionnaire is valid. This means the questionnaire accurately measures participants’ stress levels, and its results correspond to real variations in their physiological responses to stress. 

Validity, assessed through the correlation between questionnaire scores and physiological measures, ensures that the questionnaire is effectively measuring what it claims to measure: participants’ stress levels.
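A correlation coefficient is the usual way to quantify this kind of criterion evidence. Below is a minimal sketch using SciPy; the scores and variable names are invented for illustration:

```python
from scipy.stats import pearsonr

# Hypothetical data for the same participants: questionnaire stress
# scores and one physiological indicator (e.g., cortisol level)
questionnaire = [22, 35, 18, 41, 30, 27, 38, 15]
cortisol = [10.2, 15.8, 8.9, 17.5, 13.1, 12.0, 16.2, 7.4]

r, p = pearsonr(questionnaire, cortisol)
print(f"r = {r:.2f}, p = {p:.3f}")
# A strong positive correlation is evidence that the questionnaire
# tracks real physiological stress responses (criterion validity).
```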

In the world of research, differentiating between reliability and validity is crucial. Reliability ensures consistent results, while validity confirms accurate measurements. Using tools like QuestionPro enhances data collection for both reliability and validity. For instance, measuring self-esteem over time showcases reliability, and aligning questions with theories demonstrates validity. 

QuestionPro empowers researchers to achieve reliable and valid results through its robust features, facilitating credible research outcomes. Contact QuestionPro to create a free account or learn more!


Validity In Psychology Research: Types & Examples

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

In psychology research, validity refers to the extent to which a test or measurement tool accurately measures what it’s intended to measure. It ensures that the research findings are genuine and not due to extraneous factors.

Validity can be categorized into different types, starting with the broad distinction between internal and external validity.

The concept of validity was formulated by Kelley (1927, p. 14), who stated that a test is valid if it measures what it claims to measure. For example, a test of intelligence should measure intelligence and not something else (such as memory).

Internal and External Validity In Research

Internal validity refers to whether the effects observed in a study are due to the manipulation of the independent variable and not some other confounding factor.

In other words, there is a causal relationship between the independent and dependent variables.

Internal validity can be improved by controlling extraneous variables, using standardized instructions, counterbalancing, and eliminating demand characteristics and investigator effects.

External validity refers to the extent to which the results of a study can be generalized to other settings (ecological validity), other people (population validity), and over time (historical validity).

External validity can be improved by using more natural experimental settings and by selecting participants through random sampling.

Types of Validity In Psychology

Two main categories of validity are used to assess the validity of a test (i.e., questionnaire, interview, IQ test, etc.): content validity and criterion validity.

  • Content validity refers to the extent to which a test or measurement represents all aspects of the intended content domain. It assesses whether the test items adequately cover the topic or concept.
  • Criterion validity assesses the performance of a test based on its correlation with a known external criterion or outcome. It can be further divided into concurrent (measured at the same time) and predictive (measuring future performance) validity.

[Table: the different types of validity]

Face Validity

Face validity is simply whether the test appears (at face value) to measure what it claims to. This is the least sophisticated measure of content-related validity, and is a superficial and subjective assessment based on appearance.

Tests wherein the purpose is clear, even to naïve respondents, are said to have high face validity. Accordingly, tests wherein the purpose is unclear have low face validity (Nevo, 1985).

A direct measurement of face validity is obtained by asking people to rate the validity of a test as it appears to them. These raters could use a Likert scale to assess face validity.

For example:

  • The test is extremely suitable for a given purpose;
  • The test is very suitable for that purpose;
  • The test is adequate;
  • The test is inadequate;
  • The test is irrelevant and, therefore, unsuitable.

It is important to select suitable people to rate a test (e.g., questionnaire, interview, IQ test, etc.). For example, individuals who actually take the test would be well placed to judge its face validity.

Also, people who work with the test could offer their opinion (e.g., employers, university administrators). Finally, the researcher could use members of the general public with an interest in the test (e.g., parents of testees, politicians, teachers, etc.).

The face validity of a test can be considered a robust construct only if a reasonable level of agreement exists among raters.
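As an illustrative sketch (with hypothetical ratings on the five-point scale above), agreement among face-validity raters can be summarized very simply:

```python
import numpy as np

# Hypothetical face-validity ratings from eight raters on the five-point
# scale above (1 = extremely suitable ... 5 = irrelevant and unsuitable)
ratings = np.array([1, 2, 1, 2, 2, 1, 3, 2])

median = np.median(ratings)
# Simple agreement check: the share of raters within one scale point
# of the median rating
agreement = np.mean(np.abs(ratings - median) <= 1)
print(f"median rating = {median}, raters within ±1 of median: {agreement:.0%}")
```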

It should be noted that the term face validity should be avoided when the rating is done by an “expert,” as content validity is more appropriate.

Having face validity does not mean that a test really measures what the researcher intends to measure; it means only that, in the judgment of raters, it appears to do so. Consequently, it is a crude and basic measure of validity.

A test item such as “ I have recently thought of killing myself ” has obvious face validity as an item measuring suicidal cognitions and may be useful when measuring symptoms of depression.

However, the implication of items on tests with clear face validity is that they are more vulnerable to social desirability bias. Individuals may manipulate their responses to deny or hide problems or exaggerate behaviors to present a positive image of themselves.

It is possible for a test item to lack face validity but still have general validity and measure what it claims to measure. This is good because it reduces demand characteristics and makes it harder for respondents to manipulate their answers.

For example, the test item “ I believe in the second coming of Christ ” would lack face validity as a measure of depression (as the purpose of the item is unclear).

This item appeared on the first version of The Minnesota Multiphasic Personality Inventory (MMPI) and loaded on the depression scale.

Because most of the original normative sample of the MMPI were good Christians, only a depressed Christian would think Christ is not coming back. Thus, for this particular religious sample, the item does have general validity but not face validity.

Construct Validity

Construct validity assesses how well a test or measure represents and captures an abstract theoretical concept, known as a construct. It indicates the degree to which the test accurately reflects the construct it intends to measure, often evaluated through relationships with other variables and measures theoretically connected to the construct.

Construct validity was introduced by Cronbach and Meehl (1955). This type of content-related validity refers to the extent to which a test captures a specific theoretical construct or trait, and it overlaps with some of the other aspects of validity.

Construct validity does not concern the simple, factual question of whether a test measures an attribute.

Instead, it is about the complex question of whether test score interpretations are consistent with a nomological network involving theoretical and observational terms (Cronbach & Meehl, 1955).

To test for construct validity, it must be demonstrated that the phenomenon being measured actually exists. So, the construct validity of a test for intelligence, for example, depends on a model or theory of intelligence.

Construct validity entails demonstrating the power of such a construct to explain a network of research findings and to predict further relationships.

The more evidence a researcher can demonstrate for a test’s construct validity, the better. However, there is no single method of determining the construct validity of a test.

Instead, different methods and approaches are combined to present the overall construct validity of a test. For example, factor analysis and correlational methods can be used.
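For instance, exploratory factor analysis can check whether a set of items loads on a single underlying factor. The following sketch uses scikit-learn on simulated data; the item structure and all values are assumptions made for illustration:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Simulated responses: 200 participants x 6 items. Items 0-2 are driven
# by one latent construct; items 3-5 are unrelated filler.
latent = rng.normal(size=(200, 1))
construct_items = latent + 0.5 * rng.normal(size=(200, 3))
filler_items = rng.normal(size=(200, 3))
X = np.hstack([construct_items, filler_items])

fa = FactorAnalysis(n_components=1).fit(X)
print(np.round(fa.components_, 2))
# High loadings on items 0-2 and near-zero loadings on items 3-5
# support the claim that items 0-2 measure a single construct.
```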

Convergent validity

Convergent validity is a subtype of construct validity. It assesses the degree to which two measures that theoretically should be related are related.

It demonstrates that measures of similar constructs are highly correlated. It helps confirm that a test accurately measures the intended construct by showing its alignment with other tests designed to measure the same or similar constructs.

For example, suppose there are two different scales used to measure self-esteem: Scale A and Scale B. If both scales effectively measure self-esteem, then individuals who score high on Scale A should also score high on Scale B, and those who score low on Scale A should score similarly low on Scale B.

If the scores from these two scales show a strong positive correlation, then this provides evidence for convergent validity because it indicates that both scales seem to measure the same underlying construct of self-esteem.

Concurrent Validity (i.e., occurring at the same time)

Concurrent validity evaluates how well a test’s results correlate with the results of a previously established and accepted measure, when both are administered at the same time.

It helps in determining whether a new measure is a good reflection of an established one without waiting to observe outcomes in the future.

If the new test is validated by comparison with a currently existing criterion, we have concurrent validity.

Very often, a new IQ or personality test might be compared with an older but similar test known to have good validity already.

Predictive Validity

Predictive validity assesses how well a test predicts a criterion that will occur in the future. It measures the test’s ability to foresee the performance of an individual on a related criterion measured at a later point in time. It gauges the test’s effectiveness in predicting subsequent real-world outcomes or results.

For example, a prediction may be made on the basis of a new intelligence test that high scorers at age 12 will be more likely to obtain university degrees several years later. If the prediction is borne out, then the test has predictive validity.
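One way to quantify such a prediction is a point-biserial correlation between test scores and the later binary outcome. A small sketch with hypothetical data:

```python
from scipy.stats import pointbiserialr

# Hypothetical data: intelligence test scores at age 12, and whether the
# same individuals later obtained a university degree (1 = yes, 0 = no)
scores = [118, 95, 130, 102, 88, 125, 110, 99, 135, 91]
degree = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

r, p = pointbiserialr(degree, scores)
print(f"point-biserial r = {r:.2f}, p = {p:.3f}")
# A substantial positive correlation would support the test's
# predictive validity for this criterion.
```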

Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281-302.

Hathaway, S. R., & McKinley, J. C. (1943). Manual for the Minnesota Multiphasic Personality Inventory. New York: Psychological Corporation.

Kelley, T. L. (1927). Interpretation of educational measurements. New York: Macmillan.

Nevo, B. (1985). Face validity revisited. Journal of Educational Measurement, 22(4), 287-293.

Validity and reliability in quantitative studies

  • Roberta Heale, School of Nursing, Laurentian University, Sudbury, Ontario, Canada
  • Alison Twycross, Faculty of Health and Social Care, London South Bank University, London, UK
  • Correspondence to: Dr Roberta Heale, School of Nursing, Laurentian University, Ramsey Lake Road, Sudbury, Ontario, Canada P3E2C6; rheale{at}laurentian.ca

https://doi.org/10.1136/eb-2015-102129


Evidence-based practice includes, in part, implementation of the findings of well-conducted quality research studies. So being able to critique quantitative research is an important skill for nurses. Consideration must be given not only to the results of the study but also to the rigour of the research. Rigour refers to the extent to which the researchers worked to enhance the quality of the studies. In quantitative research, this is achieved through measurement of validity and reliability. 1


Types of validity

The first category is content validity. This category looks at whether the instrument adequately covers all the content that it should with respect to the variable. In other words, does the instrument cover the entire domain related to the variable or construct it was designed to measure? In an undergraduate nursing course with instruction about public health, an examination with content validity would cover all the content in the course, with greater emphasis on the topics that had received greater coverage or more depth. A subset of content validity is face validity, where experts are asked their opinion about whether an instrument measures the concept intended.

Construct validity refers to whether you can draw inferences about test scores related to the concept being studied. For example, if a person has a high score on a survey that measures anxiety, does this person truly have a high degree of anxiety? In another example, a test of knowledge of medications that requires dosage calculations may instead be testing maths knowledge.

There are three types of evidence that can be used to demonstrate a research instrument has construct validity:

Homogeneity—meaning that the instrument measures one construct.

Convergence—this occurs when the instrument measures concepts similar to those of other instruments. If no similar instruments are available, this will not be possible to assess.

Theory evidence—this is evident when behaviour is similar to theoretical propositions of the construct measured in the instrument. For example, when an instrument measures anxiety, one would expect to see that participants who score high on the instrument for anxiety also demonstrate symptoms of anxiety in their day-to-day lives. 2

The final measure of validity is criterion validity . A criterion is any other instrument that measures the same variable. Correlations can be conducted to determine the extent to which the different instruments measure the same variable. Criterion validity is measured in three ways:

Convergent validity—shows that an instrument is highly correlated with instruments measuring similar variables.

Divergent validity—shows that an instrument is poorly correlated to instruments that measure different variables. In this case, for example, there should be a low correlation between an instrument that measures motivation and one that measures self-efficacy.

Predictive validity—means that the instrument should have high correlations with future criteria. 2 For example, a score of high self-efficacy related to performing a task should predict the likelihood of a participant completing the task.

Reliability

Reliability relates to the consistency of a measure. A participant completing an instrument meant to measure motivation should have approximately the same responses each time the test is completed. Although it is not possible to give an exact calculation of reliability, an estimate of reliability can be achieved through different measures. The three attributes of reliability are outlined in table 2. How each attribute is tested for is described below.

Attributes of reliability

Homogeneity (internal consistency) is assessed using item-to-total correlation, split-half reliability, the Kuder-Richardson coefficient and Cronbach's α. In split-half reliability, the results of a test, or instrument, are divided in half. Correlations are calculated comparing both halves. Strong correlations indicate high reliability, while weak correlations indicate the instrument may not be reliable. The Kuder-Richardson test is a more complicated version of the split-half test. In this process the average of all possible split-half combinations is determined and a correlation between 0 and 1 is generated. This test is more accurate than the split-half test, but can only be completed on questions with two answers (eg, yes or no, 0 or 1). 3
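A minimal sketch of the split-half procedure on simulated yes/no items follows. The Spearman-Brown correction applied at the end is a standard adjustment (not described in the text above) that rescales the half-test correlation to the full test length:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated yes/no (1/0) answers: 50 participants x 10 items, all items
# partly driven by the same underlying ability
ability = rng.normal(size=(50, 1))
items = (ability + rng.normal(size=(50, 10)) > 0).astype(int)

# Split-half: correlate the summed odd items with the summed even items,
# then apply the Spearman-Brown correction for full test length
half1 = items[:, ::2].sum(axis=1)
half2 = items[:, 1::2].sum(axis=1)
r = np.corrcoef(half1, half2)[0, 1]
corrected = 2 * r / (1 + r)
print(f"split-half r = {r:.2f}, Spearman-Brown corrected = {corrected:.2f}")
```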

Cronbach's α is the most commonly used test to determine the internal consistency of an instrument. In this test, the average of all correlations in every combination of split-halves is determined. Instruments with questions that have more than two responses can be used in this test. The Cronbach's α result is a number between 0 and 1. An acceptable reliability score is one that is 0.7 and higher. 1 , 3
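Cronbach's α can be computed directly from the item-variance formula. The sketch below uses hypothetical Likert responses:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: participants x items matrix of scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical Likert responses: 6 participants x 4 items
data = np.array([[4, 5, 4, 4],
                 [2, 2, 3, 2],
                 [5, 4, 5, 5],
                 [3, 3, 2, 3],
                 [4, 4, 4, 5],
                 [1, 2, 1, 2]])
print(f"alpha = {cronbach_alpha(data):.2f}")  # 0.7 or higher is commonly deemed acceptable
```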

Stability is tested using test–retest and parallel or alternate-form reliability testing. Test–retest reliability is assessed when an instrument is given to the same participants more than once under similar circumstances. A statistical comparison is made between participants' test scores for each of the times they have completed it. This provides an indication of the reliability of the instrument. Parallel-form reliability (or alternate-form reliability) is similar to test–retest reliability except that a different form of the original instrument is given to participants in subsequent tests. The domain, or concepts, being tested are the same in both versions of the instrument, but the wording of items is different. 2 For an instrument to demonstrate stability there should be a high correlation between the scores each time a participant completes the test. Generally speaking, a correlation coefficient of less than 0.3 signifies a weak correlation, 0.3–0.5 is moderate and greater than 0.5 is strong. 4

Equivalence is assessed through inter-rater reliability. This test includes a process for qualitatively determining the level of agreement between two or more observers. A good example of the process used in assessing inter-rater reliability is the scores of judges for a skating competition. The level of consistency across all judges in the scores given to skating participants is the measure of inter-rater reliability. An example in research is when researchers are asked to give a score for the relevancy of each item on an instrument. Consistency in their scores relates to the level of inter-rater reliability of the instrument.
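The paragraph above describes inter-rater agreement qualitatively; a common quantitative index (not named in the text) is Cohen's kappa, which corrects raw agreement for agreement expected by chance. A sketch with hypothetical ratings:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical relevancy scores (1-5) given by two raters to ten items
rater_a = [1, 2, 1, 3, 5, 2, 1, 4, 2, 3]
rater_b = [1, 2, 2, 3, 5, 2, 1, 4, 1, 3]

kappa = cohen_kappa_score(rater_a, rater_b)
print(f"Cohen's kappa = {kappa:.2f}")
# Kappa corrects raw percentage agreement for agreement expected by
# chance; values near 1 indicate strong inter-rater reliability.
```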

Determining how rigorously the issues of reliability and validity have been addressed in a study is an essential component in the critique of research, as well as influencing the decision about whether to implement the study findings in nursing practice. In quantitative studies, rigour is determined through an evaluation of the validity and reliability of the tools or instruments utilised in the study. A good quality research study will provide evidence of how all these factors have been addressed. This will help you to assess the validity and reliability of the research and help you decide whether or not you should apply the findings in your area of clinical practice.

  • Lobiondo-Wood, G.
  • Shuttleworth, M.
  • Laerd Statistics. Determining the correlation coefficient. 2013. https://statistics.laerd.com/premium/pc/pearson-correlation-in-spss-8.php



Reliability and Validity – Definitions, Types & Examples

Published by Alvin Nicolas at August 16th, 2021 , Revised On October 26, 2023

A researcher must test the collected data before drawing any conclusions. Every research design needs to be concerned with reliability and validity to measure the quality of the research.

What is Reliability?

Reliability refers to the consistency of the measurement. Reliability shows how trustworthy the score of a test is. If the collected data shows the same results after being tested using various methods and sample groups, the information is reliable. Reliability is necessary for validity, but it does not by itself make results valid.

Example: If you weigh yourself on a weighing scale throughout the day, you’ll get the same results. These are considered reliable results obtained through repeated measures.

Example: A teacher conducts the same math test with her students and repeats it the next week with the same questions. If the students get the same scores, then the reliability of the test is high.

What is the Validity?

Validity refers to the accuracy of the measurement. Validity shows how suitable a specific test is for a particular situation. If the results are accurate according to the researcher’s situation, explanation, and prediction, then the research is valid.

If the method of measuring is accurate, then it will produce accurate results. A method that is not reliable cannot be valid; however, a reliable method is not automatically valid.

Example:  Your weighing scale shows different results each time you weigh yourself within a day even after handling it carefully, and weighing before and after meals. Your weighing machine might be malfunctioning. It means your method had low reliability. Hence you are getting inaccurate or inconsistent results that are not valid.

Example: Suppose a questionnaire is distributed among a group of people to check the quality of a skincare product, and the same questionnaire is repeated with many groups. If you get the same responses from the various participants, the questionnaire has high reliability, which supports (but does not by itself establish) its validity.

Validity is often difficult to measure even when the process of measurement is reliable, because it requires judging how well the results reflect the real situation.

Example: If the weighing scale shows the same result, let’s say 70 kg each time, even though your actual weight is 55 kg, then the weighing scale is malfunctioning. It shows consistent results, so it is reliable, but the readings are wrong, so it is not valid. The method has high reliability but low validity.

Internal Vs. External Validity

One of the key features of randomised designs is that they have significantly high internal and external validity.

Internal validity is the ability to draw a causal link between your treatment and the dependent variable of interest. It means the observed changes should be due to the experiment conducted, and no external factor should influence the variables.

Examples of such external factors: age, level, height, and grade.

External validity  is the ability to identify and generalise your study outcomes to the population at large. The relationship between the study’s situation and the situations outside the study is considered external validity.



Threats to Internal Validity

Threat | Definition | Example
Confounding factors | Unexpected events during the experiment that are not part of the treatment. | You attribute the increased weight of your participants to a lack of physical activity, but it was actually due to their consumption of coffee with sugar.
Maturation | Changes in participants due to the passage of time that influence the dependent variable. | During a long-term experiment, subjects may become tired, bored, and hungry.
Testing | The results of one test affect the results of another test. | Participants of the first experiment may react differently during the second experiment.
Instrumentation | Changes in the instrument’s calibration. | A recalibrated or altered instrument may give different results than expected.
Statistical regression | Groups selected on the basis of extreme scores are not as extreme on subsequent testing. | Students who failed the pre-final exam are likely to pass the final exam; they might be more confident and conscientious than earlier.
Selection bias | Choosing comparison groups without randomisation. | A group of trained and efficient teachers is selected to teach children communication skills instead of selecting teachers randomly.
Experimental mortality | Participants may drop out when an experiment runs longer than expected. | Because of multi-tasking and competing demands, participants may leave the experiment, dissatisfied with the time extension even if they were doing well.

Threats to External Validity

Threat | Definition | Example
Reactive/interactive effects of testing | Participants of a pre-test may become aware of the coming experiment, and the treatment may not be effective without the pre-test. | Students who failed the pre-final exam are likely to pass the final exam; they might be more confident and conscientious than earlier.
Selection of participants | A group of participants is selected with specific characteristics, and the treatment may work only on participants possessing those characteristics. | If an experiment is conducted specifically on the health issues of pregnant women, the same treatment cannot be applied to male participants.

How to Assess Reliability and Validity?

Reliability can be measured by comparing the consistency of the procedure and its results. There are various methods to measure validity and reliability. Reliability can be measured through various statistical methods depending on the type of reliability, as explained below:

Types of Reliability

Type of reliability | What does it measure? | Example
Test-retest | The consistency of results at different points in time; whether the results are the same after repeated measures. | Suppose a questionnaire is distributed among a group of people to check the quality of a skincare product, and the same questionnaire is repeated with the same group later. If you get the same responses both times, the questionnaire has high test-retest reliability.
Inter-rater | The consistency of results obtained at the same time by different raters (researchers). | Suppose five researchers measure the academic performance of the same student with questions from all the academic subjects and report widely differing results. This shows the assessment has low inter-rater reliability.
Parallel forms | Equivalence; different forms of the same test performed on the same participants. | Suppose the same researcher conducts two different forms of a test on the same topic with the same students, such as a written and an oral test. If the results agree, the parallel-forms reliability of the test is high; otherwise, it is low.
Split-half (internal consistency) | The consistency of the measurement within the test itself; the results of the same test are split into two halves and compared with each other. | If there is a large difference between the two halves’ results, the internal consistency of the test is low.

Types of Validity

As discussed above, the reliability of a measurement alone cannot determine its validity. Validity is difficult to measure even if the method is reliable. The following types of tests are conducted to measure validity.

Type of validity | What does it measure? | Example
Content validity | Whether all the aspects of the test/measurement are covered. | A language test is designed to measure writing, reading, listening, and speaking skills. This indicates the test has high content validity.
Face validity | The apparent validity of a test or of its procedure. | The type of questions included in the question paper, the time and marks allotted, and the number of questions and their categories: is it a good question paper for measuring the academic performance of students?
Construct validity | Whether the test is measuring the correct construct (ability, attribute, trait, or skill). | Is a test designed to measure communication skills actually measuring communication skills?
Criterion validity | Whether the test scores obtained are similar to other measures of the same concept. | The results obtained from a pre-final exam accurately predict the results of the later final exam. This shows the test has high criterion validity.


How to Increase Reliability?

  • Use an appropriate questionnaire to measure the competency level.
  • Ensure a consistent environment for participants
  • Make the participants familiar with the criteria of assessment.
  • Train the participants appropriately.
  • Analyse the research items regularly to avoid poor performance.

How to Increase Validity?

Ensuring validity is not an easy job either. Measures that help ensure validity are given below:

  • Reactivity should be minimised as a first concern.
  • The Hawthorne effect should be reduced.
  • The respondents should be motivated.
  • The intervals between the pre-test and post-test should not be lengthy.
  • High dropout rates should be avoided.
  • Inter-rater reliability should be ensured.
  • Control and experimental groups should be matched with each other.

How to Implement Reliability and Validity in your Thesis?

According to experts, it is helpful to implement the concepts of reliability and validity in your research, and these concepts are adopted especially often in theses and dissertations. A method for implementation is given below:

  • In the methodology, discuss all the planning about reliability and validity, including the chosen samples and sample size and the techniques used to measure reliability and validity.
  • When presenting results, talk about the level of reliability and validity of your results and their influence on the values obtained.
  • Discuss the contribution of other researchers to improving reliability and validity.

Frequently Asked Questions

What is reliability and validity in research?

Reliability in research refers to the consistency and stability of measurements or findings. Validity relates to the accuracy and truthfulness of results, measuring what the study intends to. Both are crucial for trustworthy and credible research outcomes.

What is validity?

Validity in research refers to the extent to which a study accurately measures what it intends to measure. It ensures that the results are truly representative of the phenomena under investigation. Without validity, research findings may be irrelevant, misleading, or incorrect, limiting their applicability and credibility.

What is reliability?

Reliability in research refers to the consistency and stability of measurements over time. If a study is reliable, repeating the experiment or test under the same conditions should produce similar results. Without reliability, findings become unpredictable and lack dependability, potentially undermining the study’s credibility and generalisability.

What is reliability in psychology?

In psychology, reliability refers to the consistency of a measurement tool or test. A reliable psychological assessment produces stable and consistent results across different times, situations, or raters. It ensures that an instrument’s scores are not due to random error, making the findings dependable and reproducible in similar conditions.

What is test retest reliability?

Test-retest reliability assesses the consistency of measurements taken by a test over time. It involves administering the same test to the same participants at two different points in time and comparing the results. A high correlation between the scores indicates that the test produces stable and consistent results over time.

How to improve reliability of an experiment?

  • Standardise procedures and instructions.
  • Use consistent and precise measurement tools.
  • Train observers or raters to reduce subjective judgments.
  • Increase sample size to reduce random errors.
  • Conduct pilot studies to refine methods.
  • Repeat measurements or use multiple methods.
  • Address potential sources of variability.

What is the difference between reliability and validity?

Reliability refers to the consistency and repeatability of measurements, ensuring results are stable over time. Validity indicates how well an instrument measures what it’s intended to measure, ensuring accuracy and relevance. While a test can be reliable without being valid, a valid test must inherently be reliable. Both are essential for credible research.

Are interviews reliable and valid?

Interviews can be both reliable and valid, but they are susceptible to biases. The reliability and validity depend on the design, structure, and execution of the interview. Structured interviews with standardised questions improve reliability. Validity is enhanced when questions accurately capture the intended construct and when interviewer biases are minimised.

Are IQ tests valid and reliable?

IQ tests are generally considered reliable, producing consistent scores over time. Their validity, however, is a subject of debate. While they effectively measure certain cognitive skills, whether they capture the entirety of “intelligence” or predict success in all life areas is contested. Cultural bias and over-reliance on tests are also concerns.

Are questionnaires reliable and valid?

Questionnaires can be both reliable and valid if well-designed. Reliability is achieved when they produce consistent results over time or across similar populations. Validity is ensured when questions accurately measure the intended construct. However, factors like poorly phrased questions, respondent bias, and lack of standardisation can compromise their reliability and validity.



Understanding Reliability and Validity

These related research issues ask us to consider whether we are studying what we think we are studying and whether the measures we use are consistent.

Reliability

Reliability is the extent to which an experiment, test, or any measuring procedure yields the same result on repeated trials. Without the agreement of independent observers able to replicate research procedures, or the ability to use research tools and procedures that yield consistent measurements, researchers would be unable to satisfactorily draw conclusions, formulate theories, or make claims about the generalizability of their research. In addition to its important role in research, reliability is critical for many parts of our lives, including manufacturing, medicine, and sports.

Reliability is such an important concept that it has been defined in terms of its application to a wide range of activities. For researchers, four key types of reliability are:

Equivalency Reliability

Equivalency reliability is the extent to which two items measure identical concepts at an identical level of difficulty. Equivalency reliability is determined by relating two sets of test scores to one another to highlight the degree of relationship or association. In quantitative studies and particularly in experimental studies, a correlation coefficient, statistically referred to as r , is used to show the strength of the correlation between a dependent variable (the subject under study), and one or more independent variables , which are manipulated to determine effects on the dependent variable. An important consideration is that equivalency reliability is concerned with correlational, not causal, relationships.

For example, a researcher studying university English students happened to notice that when some students were studying for finals, their holiday shopping began. Intrigued by this, the researcher attempted to observe how often, or to what degree, these two behaviors co-occurred throughout the academic year. The researcher used the results of the observations to assess the correlation between studying throughout the academic year and shopping for gifts. The researcher concluded there was poor equivalency reliability between the two actions. In other words, studying was not a reliable predictor of shopping for gifts.

Stability Reliability

Stability reliability (sometimes called test–retest reliability) is the agreement of measuring instruments over time. To determine stability, a measure or test is repeated on the same subjects at a future date. Results are compared and correlated with the initial test to give a measure of stability.

An example of stability reliability would be the method of maintaining weights used by the U.S. Bureau of Standards. Platinum objects of fixed weight (one kilogram, one pound, etc.) are kept locked away. Once a year they are taken out and weighed, allowing scales to be reset so they are "weighing" accurately. Keeping track of how much the scales are off from year to year establishes a stability reliability for these instruments. In this instance, the platinum weights themselves are assumed to have a perfectly fixed stability reliability.

Internal Consistency

Internal consistency is the extent to which tests or procedures assess the same characteristic, skill or quality. It is a measure of the precision between the observers or of the measuring instruments used in a study. This type of reliability often helps researchers interpret data and predict the value of scores and the limits of the relationship among variables.

For example, a researcher designs a questionnaire to find out about college students' dissatisfaction with a particular textbook. Analyzing the internal consistency of the survey items dealing with dissatisfaction will reveal the extent to which items on the questionnaire focus on the notion of dissatisfaction.
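A corrected item-total correlation is one standard way to run this kind of analysis. The sketch below uses simulated responses in which four items share a latent dissatisfaction factor and one deliberately unrelated item is mixed in; the data and structure are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated survey: 100 students, four items driven by a shared latent
# "dissatisfaction" factor plus one deliberately unrelated item
latent = rng.normal(size=(100, 1))
related = latent + 0.8 * rng.normal(size=(100, 4))
unrelated = rng.normal(size=(100, 1))
survey = np.hstack([related, unrelated])

# Corrected item-total correlation: each item vs. the sum of the others
for j in range(survey.shape[1]):
    rest = np.delete(survey, j, axis=1).sum(axis=1)
    r = np.corrcoef(survey[:, j], rest)[0, 1]
    print(f"item {j}: corrected item-total r = {r:.2f}")
# The unrelated item (item 4) should show a much lower correlation,
# flagging it as off-construct.
```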

Interrater Reliability

Interrater reliability is the extent to which two or more individuals (coders or raters) agree. Interrater reliability addresses the consistency of the implementation of a rating system.

A test of interrater reliability would be the following scenario: Two or more researchers are observing a high school classroom. The class is discussing a movie that they have just viewed as a group. The researchers have a sliding rating scale (1 being most positive, 5 being most negative) with which they are rating the students' oral responses. Interrater reliability assesses the consistency of how the rating system is implemented. For example, if one researcher gives a "1" to a student response, while another researcher gives a "5," obviously the interrater reliability would be inconsistent. Interrater reliability is dependent upon the ability of two or more individuals to be consistent. Training, education and monitoring skills can enhance interrater reliability.

Related Information: Reliability Example

An example of the importance of reliability is the use of measuring devices in Olympic track and field events. For the vast majority of people, ordinary measuring rulers and their degree of accuracy are reliable enough. However, for an Olympic event, such as the discus throw, the slightest variation in a measuring device -- whether it is a tape, clock, or other device -- could mean the difference between the gold and silver medals. Additionally, it could mean the difference between a new world record and outright failure to qualify for an event. Olympic measuring devices, then, must be reliable from one throw or race to another and from one competition to another. They must also be reliable when used in different parts of the world, as temperature, air pressure, humidity, interpretation, or other variables might affect their readings.

Validity

Validity refers to the degree to which a study accurately reflects or assesses the specific concept that the researcher is attempting to measure. While reliability is concerned with the consistency of the actual measuring instrument or procedure, validity is concerned with the study's success at measuring what the researchers set out to measure.

Researchers should be concerned with both external and internal validity. External validity refers to the extent to which the results of a study are generalizable or transferable. (Most discussions of external validity focus solely on generalizability; see Campbell and Stanley, 1966. We include a reference here to transferability because many qualitative research studies are not designed to be generalized.)

Internal validity refers to (1) the rigor with which the study was conducted (e.g., the study's design, the care taken to conduct measurements, and decisions concerning what was and wasn't measured) and (2) the extent to which the designers of a study have taken into account alternative explanations for any causal relationships they explore (Huitt, 1998). In studies that do not explore causal relationships, only the first of these definitions should be considered when assessing internal validity.

Scholars discuss several types of internal validity. Brief discussions of several of these types follow:

Face Validity

Face validity is concerned with how a measure or procedure appears. Does it seem like a reasonable way to gain the information the researchers are attempting to obtain? Does it seem well designed? Does it seem as though it will work reliably? Unlike content validity, face validity does not depend on established theories for support (Fink, 1995).

Criterion Related Validity

Criterion related validity, also referred to as instrumental validity, is used to demonstrate the accuracy of a measure or procedure by comparing it with another measure or procedure which has been demonstrated to be valid.

For example, imagine a hands-on driving test has been shown to be an accurate test of driving skills. The written driving test can then be validated using a criterion-related strategy, by comparing scores on the written test with scores from the hands-on test.

Construct Validity

Construct validity seeks agreement between a theoretical concept and a specific measuring device or procedure. For example, a researcher inventing a new IQ test might spend a great deal of time attempting to "define" intelligence in order to reach an acceptable level of construct validity.

Construct validity can be broken down into two sub-categories: convergent validity and discriminant validity. Convergent validity is the actual general agreement among ratings, gathered independently of one another, where measures should be theoretically related. Discriminant validity is the lack of a relationship among measures which theoretically should not be related.
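A correlation matrix over simulated scores illustrates the convergent/discriminant pattern; the measures and data below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated scores for 150 people: two tests of the same ability
# (theoretically related) and one unrelated measure
ability = rng.normal(size=150)
test_a = ability + 0.4 * rng.normal(size=150)
test_b = ability + 0.4 * rng.normal(size=150)
unrelated = rng.normal(size=150)

print(np.round(np.corrcoef([test_a, test_b, unrelated]), 2))
# Convergent validity: test_a and test_b should correlate highly.
# Discriminant validity: both should correlate near zero with the
# unrelated measure.
```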

To understand whether a piece of research has construct validity, three steps should be followed. First, the theoretical relationships must be specified. Second, the empirical relationships between the measures of the concepts must be examined. Third, the empirical evidence must be interpreted in terms of how it clarifies the construct validity of the particular measure being tested (Carmines & Zeller, p. 23).

Content Validity

Content validity is based on the extent to which a measurement reflects the specific intended domain of content (Carmines & Zeller, 1991, p. 20).

Content validity is illustrated using the following examples: Researchers aim to study mathematical learning and create a survey to test for mathematical skill. If these researchers only tested for multiplication and then drew conclusions from that survey, their study would not show content validity because it excludes other mathematical functions. Although the establishment of content validity for placement-type exams seems relatively straightforward, the process becomes more complex as it moves into the more abstract domain of socio-cultural studies. For example, a researcher needing to measure an attitude like self-esteem must decide what constitutes a relevant domain of content for that attitude. For socio-cultural studies, content validity forces the researchers to define the very domains they are attempting to study.

Related Information: Validity Example

Many recreational activities of high school students involve driving cars. A researcher, wanting to measure whether recreational activities have a negative effect on grade point average in high school students, might conduct a survey asking how many students drive to school and then attempt to find a correlation between these two factors. Because many students might use their cars for purposes other than or in addition to recreation (e.g., driving to work after school, driving to school rather than walking or taking a bus), this research study might prove invalid. Even if a strong correlation was found between driving and grade point average, driving to school in and of itself would seem to be an invalid measure of recreational activity.

The challenges of achieving reliability and validity are among the most difficult faced by researchers. In this section, we offer commentaries on these challenges.

Difficulties of Achieving Reliability

It is important to understand some of the problems concerning reliability which might arise. It would be ideal to reliably measure, every time, exactly those things which we intend to measure. However, researchers can go to great lengths and make every attempt to ensure accuracy in their studies, and still deal with the inherent difficulties of measuring particular events or behaviors. Sometimes, and particularly in studies of natural settings, the only measuring device available is the researcher's own observations of human interaction or human reaction to varying stimuli. As these methods are ultimately subjective in nature, results may be unreliable and multiple interpretations are possible. Three of these inherent difficulties are quixotic reliability, diachronic reliability and synchronic reliability.

Quixotic reliability refers to the situation where a single manner of observation consistently, yet erroneously, yields the same result. It is often a problem when research appears to be going well. This consistency might seem to suggest that the experiment was demonstrating perfect stability reliability. This, however, would not be the case.

For example, if a measuring device used in an Olympic competition always read 100 meters for every discus throw, this would be an example of an instrument consistently, yet erroneously, yielding the same result. However, quixotic reliability is often more subtle in its occurrences than this. For example, suppose a group of German researchers doing an ethnographic study of American attitudes ask questions and record responses. Parts of their study might produce responses which seem reliable, yet turn out to measure felicitous verbal embellishments required for "correct" social behavior. Asking Americans, "How are you?" for example, would in most cases, elicit the token, "Fine, thanks." However, this response would not accurately represent the mental or physical state of the respondents.

Diachronic reliability refers to the stability of observations over time. It is similar to stability reliability in that it deals with time. While this type of reliability is appropriate to assess features that remain relatively unchanged over time, such as landscape benchmarks or buildings, the same level of reliability is more difficult to achieve with socio-cultural phenomena.

For example, in a follow-up study one year later of reading comprehension in a specific group of school children, diachronic reliability would be hard to achieve. If the test were given to the same subjects a year later, many confounding variables would have impacted the researchers' ability to reproduce the same circumstances present at the first test. The final results would almost assuredly not reflect the degree of stability sought by the researchers.

Synchronic reliability refers to the similarity of observations within the same time frame; it is not about the similarity of things observed. Synchronic reliability, unlike diachronic reliability, rarely involves observations of identical things. Rather, it concerns itself with particularities of interest to the research.

For example, a researcher studies the actions of a duck's wing in flight and the actions of a hummingbird's wing in flight. Despite the fact that the researcher is studying two distinctly different kinds of wings, the action of the wings and the phenomenon produced is the same.

Comments on a Flawed, Yet Influential Study

An example of the dangers of generalizing from research that is inconsistent, invalid, unreliable, and incomplete is found in the Time magazine article, "On A Screen Near You: Cyberporn" (De Witt, 1995). This article relies on a study done at Carnegie Mellon University to determine the extent and implications of online pornography. Inherent to the study are methodological problems of unqualified hypotheses and conclusions, unsupported generalizations and a lack of peer review.

Ignoring the functional problems that manifest themselves later in the study, it seems that there are a number of ethical problems within the article. The article claims to be an exhaustive study of pornography on the Internet, but it was anything but exhaustive; it resembles a case study more than anything else. Marty Rimm, author of the undergraduate paper that Time used as a basis for the article, claims the paper was an "exhaustive study" of online pornography when, in fact, the study based most of its conclusions about pornography on the Internet on the "descriptions of slightly more than 4,000 images" (Meeks, 1995, p. 1). Some USENET groups see hundreds of postings in a day.

Considering the thousands of USENET groups, 4,000 images no longer carries the authoritative weight that its author intended. The real problem is that the study (an undergraduate paper similar to a second-semester composition assignment) was based not on pornographic images themselves, but on the descriptions of those images. This kind of reduction detracts significantly from the integrity of the final claims made by the author. In fact, this kind of research is commensurate with doing a study of the content of pornographic movies based on the titles of the movies, then making sociological generalizations based on what those titles indicate. (This is obviously a problem with a number of types of validity, because Rimm is not studying what he thinks he is studying, but instead something quite different.)

The author of the Time article, Philip Elmer De Witt writes, "The research team at CMU has undertaken the first systematic study of pornography on the Information Superhighway" (Godwin, 1995, p. 1). His statement is problematic in at least three ways. First, the research team actually consisted of a few of Rimm's undergraduate friends with no methodological training whatsoever. Additionally, no mention of the degree of interrater reliability is made. Second, this systematic study is actually merely a "non-randomly selected subset of commercial bulletin-board systems that focus on selling porn" (Godwin, p. 6). As pornography vending is actually just a small part of the whole concerning the use of pornography on the Internet, the entire premise of this study's content validity is firmly called into question. Finally, the use of the term "Information Superhighway" is a false assessment of what in actuality is only a few USENET groups and BBSs (Bulletin Board Systems), which make up only a small fraction of the entire "Information Superhighway" traffic. Essentially, what we have here is yet another violation of content validity.

De Witt is quoted as saying: "In an 18-month study, the team surveyed 917,410 sexually-explicit pictures, descriptions, short-stories and film clips. On those USENET newsgroups where digitized images are stored, 83.5 percent of the pictures were pornographic" (De Witt 40).

Statistically, some interesting contradictions arise. The figure 917,410 was taken from adult-oriented BBSs--none came from actual USENET groups or the Internet itself. This is a glaring discrepancy. Out of the 917,410 files, 212,114 are only descriptions (Hoffman & Novak, 1995, p.2). The question is, how many actual images did the "researchers" see?

"Between April and July 1994, the research team downloaded all available images (3,254)...the team encountered technical difficulties with 13 percent of these images...This left a total of 2,830 images for analysis" (p. 2). This means that out of 917,410 files discussed in this study, 914,580 of them were not even pictures! As for the 83.5 percent figure, this is actually based on "17 alt.binaries groups that Rimm considered pornographic" (p. 2).

In real terms, 17 USENET groups is a fraction of a percent of all USENET groups available. Worse yet, Time claimed that "...only about 3 percent of all messages on the USENET [represent pornographic material], while the USENET itself represents 11.5 percent of the traffic on the Internet" (De Witt, p. 40).

Time neglected to carry the interpretation of this data out to its logical conclusion, which is that less than half of 1 percent (3 percent of 11 percent) of the images on the Internet are associated with newsgroups that contain pornographic imagery. Furthermore, of this half percent, an unknown but even smaller percentage of the messages in newsgroups that are 'associated with pornographic imagery', actually contained pornographic material (Hoffman & Novak, p. 3).

Another blunder can be seen in the avoidance of peer review, which suggests that some political interests were being served in having the study become a Time cover story. Marty Rimm contracted the Georgetown Law Review and Time in an agreement to publish his study as long as they kept it under lock and key. During the months before publication, many interested scholars and professionals tried in vain to obtain a copy of the study in order to check it for flaws. De Witt justified not letting such peer review take place, and also justified the reliability and validity of the study, on the grounds that because the Georgetown Law Review had accepted it, it was therefore reliable and valid, and needed no peer review. What he didn't know was that law reviews are not edited by professionals, but by "third year law students" (Godwin, p. 4).

There are many consequences of the failure to subject such a study to the scrutiny of peer review. It was Rimm's desire to publish an article about on-line pornography in a manner that legitimized his article, yet escaped the kind of critical review the piece would have to undergo if published in a scholarly journal of computer science, engineering, marketing, psychology, or communications. What better venue than a law journal? A law journal article would have the added advantage of being taken seriously by law professors, lawyers, and legally-trained policymakers. By virtue of where it appeared, it would automatically be catapulted into the center of the policy debate surrounding online censorship and freedom of speech (Godwin).

Herein lies the dangerous implication of such a study: because the questions surrounding pornography are of such immediate political concern, the study was placed in the forefront of the U.S. domestic policy debate over censorship on the Internet (an integral aspect of current anti-First Amendment legislation), with little regard for its validity or reliability.

On June 26, the day the article came out, Senator Grassley (co-sponsor of the anti-porn bill, along with Senator Dole) began drafting a speech that was to be delivered that very day in the Senate, using the study as evidence. The same day, at the same time, Mike Godwin posted on the WELL (Whole Earth 'Lectronic Link, a forum for professionals on the Internet) what turned out to be the understatement of the year: "Philip's story is an utter disaster, and it will damage the debate about this issue because we will have to spend lots of time correcting misunderstandings that are directly attributable to the story" (Meeks, p. 7).

As Godwin was writing this, Senator Grassley was speaking to the Senate: "Mr. President, I want to repeat that: 83.5 percent of the 900,000 images reviewed--these are all on the Internet--are pornographic, according to the Carnegie-Mellon study" (p. 7). Several days later, Senator Dole was waving the magazine in front of the Senate like a battle flag.

Donna Hoffman, professor at Vanderbilt University, summed up the dangerous political implications by saying, "The critically important national debate over First Amendment rights and restrictions of information on the Internet and other emerging media requires facts and informed opinion, not hysteria" (p. 1).

In addition to the hysteria, Hoffman sees a plethora of other problems with the study. "Because the content analysis and classification scheme are 'black boxes,'" Hoffman said, "because no reliability and validity results are presented, because no statistical testing of the differences both within and among categories for different types of listings has been performed, and because not a single hypothesis has been tested, formally or otherwise, no conclusions should be drawn until the issues raised in this critique are resolved" (p. 4).

However, the damage has already been done. This questionable research by an undergraduate engineering major has been generalized to such an extent that even the U.S. Senate, and in particular Senators Grassley and Dole, have been duped, albeit through the strength of their own desires to see only what they wanted to see.

Annotated Bibliography

American Psychological Association. (1985). Standards for educational and psychological testing. Washington, DC: Author.

This work focuses on reliability and validity and the standards that testers must meet in order to ensure accuracy.

Babbie, E.R. & Huitt, R.E. (1979). The practice of social research (2nd ed.). Belmont, CA: Wadsworth Publishing.

An overview of social research and its applications.

Beauchamp, T.L., Faden, R.R., Wallace, Jr., R.J. & Walters, L. (1982). Ethical issues in social science research. Baltimore and London: The Johns Hopkins University Press.

A systematic overview of ethical issues in social science research, written by researchers with firsthand familiarity with the situations and problems researchers face in their work. This book raises several questions about how reliability and validity can be affected by ethics.

Borman, K.M. et al. (1986). Ethnographic and qualitative research design and why it doesn't work. American Behavioral Scientist, 30, 42-57.

The authors pose questions concerning threats to qualitative research and suggest solutions.

Bowen, K. A. (1996, Oct. 12). The sin of omission -punishable by death to internal validity: An argument for integration of quantitative research methods to strengthen internal validity. Available: http://trochim.human.cornell.edu/gallery/bowen/hss691.htm

An entire Web site that examines the merits of integrating qualitative and quantitative research methodologies through triangulation. The author argues that improving the internal validity of social science will be the result of such a union.

Brinberg, D. & McGrath, J.E. (1985). Validity and the research process . Beverly Hills: Sage Publications.

The authors investigate validity as value and propose the Validity Network Schema, a process by which researchers can infuse validity into their research.

Bussières, J-F. (1996, Oct.12). Reliability and validity of information provided by museum Web sites. Available: http://www.oise.on.ca/~jfbussieres/issue.html

This Web page examines the validity of museum Web sites, which calls into question the validity of Web-based resources in general. It addresses the issue that all Web sites should be examined with skepticism about the validity of the information they contain.

Campbell, D. T. & Stanley, J.C. (1963). Experimental and quasi-experimental designs for research. Boston: Houghton Mifflin.

An overview of experimental research that includes pre-experimental designs, controls for internal validity, and tables listing sources of invalidity in quasi-experimental designs. Reference list and examples.

Carmines, E. G. & Zeller, R.A. (1991). Reliability and validity assessment . Newbury Park: Sage Publications.

An introduction to research methodology that includes classical test theory, validity, and methods of assessing reliability.

Carroll, K. M. (1995). Methodological issues and problems in the assessment of substance use. Psychological Assessment, 7(3), 349-358.

Discusses methodological issues in research involving the assessment of substance abuse. Introduces strategies for avoiding problems with the reliability and validity of methods.

Connelly, F. M. & Clandinin, D.J. (1990). Stories of experience and narrative inquiry. Educational Researcher, 19(5), 2-12.

A survey of narrative inquiry that outlines criteria, methods, and writing forms. It includes a discussion of risks and dangers in narrative studies, as well as a research agenda for curricula and classroom studies.

De Witt, P.E. (1995, July 3). On a screen near you: Cyberporn. Time, 38-45.

The Time cover story reporting the Carnegie Mellon study of online pornography by Marty Rimm, an electrical engineering student.

Fink, A., ed. (1995). The survey handbook, v. 1. Thousand Oaks, CA: Sage.

A guide to surveys; this is the first in a series referred to as the "survey kit." It includes bibliographical references and addresses survey design, analysis, reporting surveys, and how to measure the validity and reliability of surveys.

Fink, A., ed. (1995). How to measure survey reliability and validity, v. 7. Thousand Oaks, CA: Sage.

This volume seeks to select and apply reliability criteria and select and apply validity criteria. The fundamental principles of scaling and scoring are considered.

Godwin, M. (1995, July). JournoPorn, dissection of the Time article. Available: http://www.hotwired.com

A detailed critique of Time magazine's Cyberporn , outlining flaws of methodology as well as exploring the underlying assumptions of the article.

Hambleton, R.K. & Zaal, J.N., eds. (1991). Advances in educational and psychological testing . Boston: Kluwer Academic.

Information on the concepts of reliability and validity in psychology and education.

Harnish, D.L. (1992). Human judgment and the logic of evidence: A critical examination of research methods in special education transition literature . In D.L. Harnish et al. eds., Selected readings in transition.

This article investigates threats to validity in special education research.

Haynes, N. M. (1995). How skewed is 'the bell curve'? Book Product Reviews . 1-24.

This paper claims that R.J. Herrnstein and C. Murray's The Bell Curve: Intelligence and Class Structure in American Life does not have scientific merit and claims that the bell curve is an unreliable measure of intelligence.

Healey, J. F. (1993). Statistics: A tool for social research, 3rd ed . Belmont: Wadsworth Publishing.

Inferential statistics, measures of association, and multivariate techniques in statistical analysis for social scientists are addressed.

Helberg, C. (1996, Oct. 12). Pitfalls of data analysis (or how to avoid lies and damned lies). Available: http://maddog/fammed.wisc.edu/pitfalls/

A discussion of things researchers often overlook in their data analysis and how statistics are often used to skew reliability and validity for the researcher's purposes.

Hoffman, D. L. and Novak, T.P. (1995, July). A detailed critique of the Time article: Cyberporn. Available: http://www.hotwired.com

A methodological critique of the Time article that uncovers some of the fundamental flaws in the statistics and the conclusions made by De Witt.

Huitt, William G. (1998). Internal and external validity. Available: http://www.valdosta.peachnet.edu/~whuitt/psy702/intro/valdgn.html

A Web document addressing key issues of external and internal validity.

Jones, J. E. & Bearley, W.L. (1996, Oct 12). Reliability and validity of training instruments. Organizational Universe Systems. Available: http://ous.usa.net/relval.htm

The authors discuss the reliability and validity of training design in a business setting. Basic terms are defined and examples provided.

Cultural Anthropology Methods Journal. (1996, Oct. 12). Available: http://www.lawrence.edu/~bradleyc/cam.html

An online journal containing articles on the practical application of research methods when conducting qualitative and quantitative research. Reliability and validity are addressed throughout.

Kirk, J. & Miller, M. M. (1986). Reliability and validity in qualitative research. Beverly Hills: Sage Publications.

This text describes objectivity in qualitative research by focusing on the issues of validity and reliability in terms of their limitations and applicability in the social and natural sciences.

Krakower, J. & Niwa, S. (1985). An assessment of validity and reliability of the institutional performance survey. Boulder, CO: National Center for Higher Education Management Systems.

Addresses educational surveys, higher education research, and organizational effectiveness.

Lauer, J. M. & Asher, J.W. (1988). Composition Research. New York: Oxford University Press.

A discussion of empirical designs in the context of composition research as a whole.

Laurent, J. et al. (1992, Mar.). Review of validity research on the Stanford-Binet Intelligence Scale: Fourth Edition. Psychological Assessment, 102-112.

This paper looks at the results of construct and criterion-related validity studies to determine whether the SB:FE is a valid measure of intelligence.

LeCompte, M. D., Millroy, W.L., & Preissle, J. eds. (1992). The handbook of qualitative research in education. San Diego: Academic Press.

A compilation of the range of methodological and theoretical qualitative inquiry in the human sciences and education research. Numerous contributing authors apply their expertise to discussing a wide variety of issues pertaining to educational and humanities research as well as suggestions about how to deal with problems when conducting research.

McDowell, I. & Newell, C. (1987). Measuring health: A guide to rating scales and questionnaires . New York: Oxford University Press.

This gives a variety of examples of health measurement techniques and scales and discusses the validity and reliability of important health measures.

Meeks, B. (1995, July). Muckraker: How Time failed. Available: http://www.hotwired.com

A step-by-step outline of the events which took place during the researching, writing, and negotiating of the Time article of 3 July, 1995 titled: On A Screen Near You: Cyberporn .

Merriam, S. B. (1995). What can you tell from an N of 1?: Issues of validity and reliability in qualitative research. Journal of Lifelong Learning v4 , 51-60.

Addresses issues of validity and reliability in qualitative research for education. Discusses philosophical assumptions underlying the concepts of internal validity, reliability, and external validity or generalizability. Presents strategies for ensuring rigor and trustworthiness when conducting qualitative research.

Morris, L.L, Fitzgibbon, C.T., & Lindheim, E. (1987). How to measure performance and use tests. In J.L. Herman (Ed.), Program evaluation kit (2nd ed.). Newbury Park, CA: Sage.

Discussion of reliability and validity as it pertains to measuring students' performance.

Murray, S., et al. (1979, April). Technical issues as threats to internal validity of experimental and quasi-experimental designs. San Francisco: University of California. 8-12.

(From Yang et al. bibliography--unavailable as of this writing.)

Russ-Eft, D. F. (1980). Validity and reliability in survey research. American Institutes for Research in the Behavioral Sciences, August. ED 227 151.

An investigation of validity and reliability in survey research with an overview of the concepts of reliability and validity. Specific procedures for measuring sources of error are suggested, as well as general suggestions for improving the reliability and validity of survey data. An extensive annotated bibliography is provided.

Ryser, G. R. (1994). Developing reliable and valid authentic assessments for the classroom: Is it possible? Journal of Secondary Gifted Education Fall, v6 n1 , 62-66.

Defines the meanings of reliability and validity as they apply to standardized measures of classroom assessment. The article defines reliability as scorability and stability, while validity is seen as students' ability to use knowledge authentically in the field.

Schmidt, W., et al. (1982). Validity as a variable: Can the same certification test be valid for all students? Institute for Research on Teaching, July. ED 227 151.

A technical report that presents specific criteria for judging content, instructional and curricular validity as related to certification tests in education.

Scholfield, P. (1995). Quantifying language. A researcher's and teacher's guide to gathering language data and reducing it to figures . Bristol: Multilingual Matters.

A guide to categorizing, measuring, testing, and assessing aspects of language. A source for language-related practitioners and researchers in conjunction with other resources on research methods and statistics. Questions of reliability, and validity are also explored.

Scriven, M. (1993). Hard-Won Lessons in Program Evaluation . San Francisco: Jossey-Bass Publishers.

A common sense approach for evaluating the validity of various educational programs and how to address specific issues facing evaluators.

Shou, P. (1993, Jan.). The Singer-Loomis Inventory of Personality: A review and critique. [Paper presented at the Annual Meeting of the Southwest Educational Research Association.]

Evidence for reliability and validity are reviewed. A summary evaluation suggests that SLIP (developed by two Jungian analysts to allow examination of personality from the perspective of Jung's typology) appears to be a useful tool for educators and counselors.

Sutton, L.R. (1992). Community college teacher evaluation instrument: A reliability and validity study . Diss. Colorado State University.

Studies of reliability and validity in occupational and educational research.

Thompson, B. & Daniel, L.G. (1996, Oct.). Seminal readings on reliability and validity: A "hit parade" bibliography. Educational and Psychological Measurement, 56, 741-745.

Editorial board members of Educational and Psychological Measurement generated this bibliography of definitive publications in measurement research. Many articles are directly related to reliability and validity.

Thompson, E. Y., et al. (1995). Overview of qualitative research . Diss. Colorado State University.

A discussion of strengths and weaknesses of qualitative research and its evolution and adaptation. Appendices and annotated bibliography.

Traver, C. et al. (1995). Case Study . Diss. Colorado State University.

This presentation gives an overview of case study research, providing definitions and a brief history and explanation of how to design research.

Trochim, William M. K. (1996). External validity. Available: http://trochim.human.cornell.edu/kb/EXTERVAL.htm

A comprehensive treatment of external validity found in William Trochim's online text about research methods and issues.

Trochim, William M. K. (1996). Introduction to validity. Available: http://trochim.human.cornell.edu/kb/INTROVAL.htm

An introduction to validity found in William Trochim's online text about research methods and issues.

Trochim, William M. K. (1996). Reliability. Available: http://trochim.human.cornell.edu/kb/reltypes.htm

A comprehensive treatment of reliability found in William Trochim's online text about research methods and issues.

Validity. (1996, Oct. 12). Available: http://vislab-www.nps.navy.mil/~haga/validity.html

A source for definitions of various forms and types of reliability and validity.

Vinsonhaler, J. F., et al. (1983, July). Improving diagnostic reliability in reading through training. Institute for Research on Teaching ED 237 934.

This technical report investigates the practical application of a program intended to improve the diagnoses of reading deficient students. Here, reliability is assumed and a pragmatic answer to a specific educational problem is suggested as a result.

Wentland, E. J. & Smith, K.W. (1993). Survey responses: An evaluation of their validity . San Diego: Academic Press.

This book looks at the factors affecting response validity (or the accuracy of self-reports in surveys) and provides several examples with varying accuracy levels.

Wiget, A. (1996). Father Juan Greyrobe: Reconstructing tradition histories, and the reliability and validity of uncorroborated oral tradition. Ethnohistory, 43(3), 459-482.

This paper presents a convincing argument for the validity of oral histories in ethnographic research where at least some of the evidence can be corroborated through written records.

Yang, G. H., et al. (1995). Experimental and quasi-experimental educational research . Diss. Colorado State University.

This discussion defines experimentation and considers the rhetorical issues and advantages and disadvantages of experimental research. Annotated bibliography.

Yarroch, W. L. (1991, Sept.). The implications of content versus item validity on science tests. Journal of Research in Science Teaching, 619-629.

The use of content validity as the primary assurance of the measurement accuracy for science assessment examinations is questioned. An alternative accuracy measure, item validity, is proposed to look at qualitative comparisons between different factors.

Yin, R. K. (1989). Case study research: Design and methods . London: Sage Publications.

This book discusses the design process of case study research, including collection of evidence, composing the case study report, and designing single and multiple case studies.

Related Links

Internal Validity Tutorial. An interactive tutorial on internal validity.

http://server.bmod.athabascau.ca/html/Validity/index.shtml

Howell, Jonathan, Paul Miller, Hyun Hee Park, Deborah Sattler, Todd Schack, Eric Spery, Shelley Widhalm, & Mike Palmquist. (2005). Reliability and Validity. Writing@CSU . Colorado State University. https://writing.colostate.edu/guides/guide.cfm?guideid=66


Validity vs. Reliability in Research: What's the Difference?


Introduction


In research, validity and reliability are crucial for producing robust findings. They provide a foundation that assures scholars, practitioners, and readers alike that the research's insights are both accurate and consistent. However, the nuanced nature of qualitative data often blurs the lines between these concepts, making it imperative for researchers to discern their distinct roles.

This article seeks to illuminate the intricacies of reliability and validity, highlighting their significance and distinguishing their unique attributes. By understanding these critical facets, qualitative researchers can ensure their work resonates with both authenticity and trustworthiness.


In the domain of research, whether qualitative or quantitative, two concepts often arise when discussing the quality and rigor of a study: reliability and validity. These two terms, while interconnected, have distinct meanings that hold significant weight in the world of research.

Reliability, at its core, speaks to the consistency of a study. If a study or test measures the same concept repeatedly and yields the same results, it demonstrates a high degree of reliability. A common method for assessing reliability is through internal consistency reliability, which checks if multiple items that measure the same concept produce similar scores.
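To make this concrete, internal consistency is often quantified with Cronbach's alpha. Below is a minimal sketch in Python, using made-up scores purely for illustration:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal consistency of a scale; items is a respondents x items matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Four respondents answering a hypothetical five-item scale (1-5 ratings):
scores = np.array([
    [4, 5, 4, 4, 5],
    [2, 2, 3, 2, 2],
    [5, 4, 5, 5, 4],
    [3, 3, 3, 2, 3],
])
print(round(cronbach_alpha(scores), 2))  # values near 1 indicate high consistency
```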

Another method often used is inter-rater reliability, which gauges the consistency of scores given by different raters. This approach is especially well suited to qualitative research, where it can help researchers assess the clarity of their code system and the consistency of their codings. For a study to be dependable, it is imperative to ensure that a sufficient level of reliability is achieved.
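A common statistic here is Cohen's kappa, which corrects raw agreement between two raters for chance. A short illustration, assuming scikit-learn is available and using invented codes:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical codes assigned by two raters to the same ten interview excerpts:
rater_a = ["support", "conflict", "support", "time", "support",
           "conflict", "conflict", "support", "time", "support"]
rater_b = ["support", "conflict", "support", "time", "conflict",
           "conflict", "conflict", "support", "time", "support"]

# Kappa corrects raw agreement for agreement expected by chance:
# 1.0 = perfect agreement, 0 = chance level.
print(round(cohen_kappa_score(rater_a, rater_b), 2))
```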

On the other hand, validity is concerned with accuracy. It looks at whether a study truly measures what it claims to. Within the realm of validity, several types exist. Construct validity, for instance, verifies that a study measures the intended abstract concept or underlying construct. If a study aims to measure self-esteem and accurately captures this abstract trait, it demonstrates strong construct validity.

Content validity ensures that a test or study comprehensively represents the entire domain of the concept it seeks to measure. For instance, if a test aims to assess mathematical ability, it should cover arithmetic, algebra, geometry, and more to showcase strong content validity.

Criterion validity is another form of validity that ensures that the scores from a test correlate well with a measure from a related outcome. A subset of this is predictive validity, which checks if the test can predict future outcomes. For instance, if an aptitude test can predict future job performance, it can be said to have high predictive validity.

The distinction between reliability and validity becomes clear when one considers the nature of their focus. While reliability is concerned with consistency and reproducibility, validity zeroes in on accuracy and truthfulness.

A research tool can be reliable without being valid. A faulty instrument might consistently give the same wrong reading (reliable but not valid). Conversely, a test administered multiple times might sometimes hit the mark and at other times miss it entirely, producing different scores each time; it would then be valid in some instances but not reliable.

For a study to be robust, it must achieve both reliability and validity. Reliability ensures the study's findings are reproducible while validity confirms that it accurately represents the phenomena it claims to. Ensuring both in a study means the results are both dependable and accurate, forming a cornerstone for high-quality research.


Understanding the nuances of reliability and validity becomes clearer when contextualized within a real-world research setting. Imagine a qualitative study where a researcher aims to explore the experiences of teachers in urban schools concerning classroom management. The primary method of data collection is semi-structured interviews.

To ensure the reliability of this qualitative study, the researcher crafts a consistent list of open-ended questions for the interview. This ensures that, while each conversation might meander based on the individual’s experiences, there remains a core set of topics related to classroom management that every participant addresses.

The essence of reliability in this context isn't necessarily about garnering identical responses but rather about achieving a consistent approach to data collection and subsequent interpretation. As part of this commitment to reliability, two researchers might independently transcribe and analyze a subset of these interviews. If they identify similar themes and patterns in their independent analyses, it suggests a consistent interpretation of the data, showcasing inter-rater reliability.

Validity, on the other hand, is anchored in ensuring that the research genuinely captures and represents the lived experiences and sentiments of teachers concerning classroom management. To establish content validity, the list of interview questions is thoroughly reviewed by a panel of educational experts. Their feedback ensures that the questions encompass the breadth of issues and concerns related to classroom management in urban school settings.

As the interviews are conducted, the researcher pays close attention to the depth and authenticity of responses. After the interviews, member checking could be employed, where participants review the researcher's interpretation of their responses to ensure that their experiences and perspectives have been accurately captured. This strategy helps in affirming the study's construct validity, ensuring that the abstract concept of "experiences with classroom management" has been truthfully and adequately represented.

In this example, we can see that while the interview study is rooted in qualitative methods and subjective experiences, the principles of reliability and validity can still meaningfully inform the research process. They serve as guides to ensure the research's findings are both dependable and genuinely reflective of the participants' experiences.

Ensuring validity and reliability in research, irrespective of its qualitative or quantitative nature, is pivotal to producing results that are both trustworthy and robust. Here's how you can integrate these concepts into your study to ensure its rigor:

Reliability is about consistency. One of the most straightforward ways to gauge it in quantitative research is using test-retest reliability. It involves administering the same test to the same group of participants on two separate occasions and then comparing the results.

A high degree of similarity between the two sets of results indicates good reliability. This can often be measured using a correlation coefficient, where a value closer to 1 indicates a strong positive consistency between the two test iterations.
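As an illustration, the sketch below (Python; the scores are invented) computes that correlation for two administrations of the same test:

```python
from scipy.stats import pearsonr

# Hypothetical scores for ten participants tested twice, a month apart:
test_1 = [12, 18, 9, 22, 15, 11, 19, 14, 16, 20]
test_2 = [13, 17, 10, 21, 16, 10, 18, 15, 15, 21]

r, p = pearsonr(test_1, test_2)
print(f"test-retest r = {r:.2f} (p = {p:.4f})")  # r near 1 suggests good reliability
```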

Validity, on the other hand, ensures that the research genuinely measures what it intends to. There are various forms of validity to consider. Convergent validity ensures that two measures of the same construct, or measures that should theoretically be related, are indeed correlated. For example, two different measures assessing self-esteem should show similar results for the same group, highlighting that they are measuring the same underlying construct.

Face validity is the most basic form of validity and is gauged by the sheer appearance of the measurement tool. If, at face value, a test seems like it measures what it claims to, it has face validity. This is often the first step and is usually followed by more rigorous forms of validity testing.

Criterion-related validity, a subtype of the previously discussed criterion validity, evaluates how well the outcomes of a particular test or measurement correlate with another related measure. For example, if a new tool is developed to measure reading comprehension, its results can be compared with those of an established reading comprehension test to assess its criterion-related validity. If the results show a strong correlation, it's a sign that the new tool is valid.
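The same correlational check applies here. A minimal sketch, using invented scores for a hypothetical new reading-comprehension tool and an established test:

```python
import numpy as np

# Hypothetical scores: a new reading-comprehension tool and an established
# test, both taken by the same twelve students.
new_tool    = np.array([55, 62, 70, 48, 81, 66, 59, 74, 52, 68, 77, 60])
established = np.array([58, 60, 72, 50, 79, 64, 61, 76, 49, 70, 80, 57])

r = np.corrcoef(new_tool, established)[0, 1]
print(f"criterion-related r = {r:.2f}")  # a strong correlation supports validity
```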

Ensuring both validity and reliability requires deliberate planning, meticulous testing, and constant reflection on the study's methods and results. This might involve using established scales or measures with proven validity and reliability, conducting pilot studies to refine measurement tools, and always staying cognizant of the fact that these two concepts are important considerations for research robustness.

While reliability and validity are foundational concepts in many traditional research paradigms, they have not escaped scrutiny, especially from critical and poststructuralist perspectives. These critiques often arise from the fundamental philosophical differences in how knowledge, truth, and reality are perceived and constructed.

From a poststructuralist viewpoint, the very pursuit of a singular "truth" or an objective reality is questionable. In such a perspective, multiple truths exist, each shaped by its own socio-cultural, historical, and individual contexts.

Reliability, with its emphasis on consistent replication, might then seem at odds with this understanding. If truths are multiple and shifting, how can consistency across repeated measures or observations be a valid measure of anything other than the research instrument's stability?

Validity, too, faces critique. In seeking to ensure that a study measures what it purports to measure, there's an implicit assumption of an observable, knowable reality. Poststructuralist critiques question this foundation, arguing that reality is too fluid, multifaceted, and influenced by power dynamics to be pinned down by any singular measurement or representation.

Moreover, the very act of determining "validity" often requires an external benchmark or "gold standard." This brings up the issue of who determines this standard and the power dynamics and potential biases inherent in such decisions.

Another point of contention is the way these concepts can inadvertently prioritize certain forms of knowledge over others. For instance, privileging research that meets stringent reliability and validity criteria might marginalize more exploratory, interpretive, or indigenous research methods. These methods, while offering deep insights, might not align neatly with traditional understandings of reliability and validity, potentially relegating them to the periphery of "accepted" knowledge production.

To be sure, reliability and validity serve as guiding principles in many research approaches. However, it's essential to recognize their limitations and the critiques posed by alternative epistemologies. Engaging with these critiques doesn't diminish the value of reliability and validity but rather enriches our understanding of the multifaceted nature of knowledge and the complexities of its pursuit.



What Makes Valid Research? How to Verify if a Source is Credible on the Internet

January 28, 2019 | David Childs | Democracy & Me Blog


By Dr. David Childs, Ph.D., Northern Kentucky University

Introduction

Computer and digital technology has advanced at an astounding rate within the last several decades. With the advent of informational Internet resources such as social media, online articles, and books, many people purport to do thorough research but lack an understanding of what research means. The advent of search engines has given everyone the illusion that they have done research and are experts on a particular topic. In reality, people often pull information from unreliable sources and believe they have researched a topic thoroughly. What makes a source unreliable? What makes certain information untrustworthy? This article offers information and resources to help people differentiate between valid and invalid sources of knowledge.

What is research?

Research should involve a thorough reading and analysis of an adequate number of sources on a given subject. One does not need a college degree to do research, but proper time should be devoted in order to draw conclusions that can be held up as reliable. As a side note, some information cannot be obtained without proper research methodologies and research tools. Examples are research in the natural sciences such as biology, chemistry, or physics, or in the social sciences in areas such as history, economics, or sociology. In the hard sciences, one must conduct countless experiments to arrive at conclusions that cannot be reached by simply reading Internet articles and watching videos. Likewise, to do valid historical work one must study many reliable primary sources or conduct countless interviews with people who were present during the period the historian is studying. Valid natural or social science experiments cannot be replaced by reading a few articles on the Internet.

At the very least, one can read the work of experts who have devoted their lives to research in a particular subject. Teachers in K-12 schools often have not spent their lives conducting research in their field (of course there are many exceptions). Even though some teachers may not be researchers, they have devoted their lives to studying, reading, and mastering their content. In this way, a middle school science teacher, for example, can read thoroughly within a discipline and gain a wide enough knowledge base to become a reliable source of information and somewhat of an expert. There is no shortcut for researching a topic thoroughly and adequately.

In contemporary times, the primary means of gathering information is the Internet. The Internet can be a great resource, but problems arise when people cannot differentiate between reliable and unreliable sources. Below are some key components to consider when trying to verify whether an online source is credible.

How to Find Reliable Information on the Internet

1) Identify the source of the information and determine whether it is reliable and credible. A good starting point is to identify the name of the writer and/or the organization from which the source was derived. Is the source reputable and reliable? Is the person or organization a respected authority on the subject matter? What makes a person or organization an authority on a particular topic? It has become very easy to publish information on the Internet, and as a result there are many people purporting to be experts in fields in which they are not qualified to write. A good way to understand the danger of this is to liken it to public school teachers teaching subjects outside of their certification in order to remedy teacher shortages; for example, a teacher certified in social studies teaching high school math. In such cases, students are not getting proper instruction in math. In the same way, there is a lot of information on the Internet written by individuals with no expertise in the content about which they are writing. For example, many people who dispute climate change and global warming are not scientists and often rely on political rhetoric to support their claims. Scientists who do work in climate change have devoted their entire lives to research in that area, often holding undergraduate and several graduate degrees in subjects like geology and earth science. When a person is a well-known and respected expert in a certain field, they have a proven track record of careful study and research and are validated by reputable institutions known for producing reliable research. Non-experts, by contrast, will often spend just a few days or weeks "researching" climate change in an effort to "dispute" data that is backed by decades of careful research. One does not need a Ph.D. to understand and challenge mainstream scientific knowledge, but the time and energy devoted to research cannot be bypassed.

2) Check sources for validity against other reliable sources. It is important when doing research on the Internet to check the provided information against other reliable sources to verify accuracy. For example, if every reputable source reports that cigarette smoking causes cancer and one source says otherwise, the lone source should be questioned, because it has no credibility or way to verify its information. When checking facts and data for accuracy, one should look for reliable and trusted sources: academic articles, books, universities, museums, mainline reputable religious organizations, government agencies, and academic associations. Libraries, universities, and professional organizations usually provide reliable information. There is a growing public mistrust of long-established institutions that has added to the level of uncertainty about knowledge, but it is important to know that institutions have credibility for good reason: their history, information, and knowledge base are backed by hard work and long-held traditions.

3) Determine whether the information is presented in a biased way. When one reads an article or any information on the Internet, it is important to determine whether that information has a specific agenda or goal in mind. What is the author's agenda? Does the author or organization have a particular religious, sociological, or political bent? These factors bear on the validity of an information source. For example, newspapers often feature op-ed pieces in which the author states up front that the article is largely based on personal views. Therefore, when one reads an op-ed piece, one understands going in that it will be slanted to the right or left or toward a certain worldview. The article is not completely useless, but the reader has to sort through the bias and decide what information is helpful for their research. The reader should also search for possible bias in the information presented (political, sociological, religious, or otherwise drawn from a particular worldview), and for claims that seem unrealistic or unreasonable with no evidence to back them up.

4) Search for citations that support the claims made by the author or organization. Most articles or information on the web will provide links for further research or to back claims made. When this information is not adequately provided, one should question whether the source is reputable. In addition, a site can have many citations while the sources themselves are not credible or reliable. Health and fitness writer Robin Reichert states that readers should "follow the links provided" in an article to "verify that the citations in fact support the writer's claims. Look for at least two other credible citations to support the information." Furthermore, readers should "always follow-up on citations that the writer provides to ensure that the assertions are supported by other sources." It is also worth noting that the end designation of a website can help determine credibility. Websites ending in ".com" are often for-profit organizations trying to sell a product or service. Sites ending in ".org" are often non-profit organizations advancing or advocating a particular social cause. Government agency websites end in ".gov," while educational institutions end in ".edu." Government agencies, educational institutions, and non-profits generally offer reliable and trustworthy information. Teachers in middle and high schools should spend more time having students write research papers, as doing so teaches students the value of citing valid sources. Such projects call for proper citations using one of the various citation styles, the most popular being APA, MLA, and Chicago.

How to Verify if a Source is Credible on the Internet

Below are a number of resources for everyday Internet researchers, students, and teachers. The ideas of truth and of valid, reliable resources are being challenged because people are unsure what information is valid and what is not. The links below offer tools that can help one understand how to do research properly.

Resources and References

A Comprehensive Guide to APA Citations and Format
EasyBib Guide to Citing and Writing in APA Format
MLA General Format
Formatting a Research Paper
EasyBib Guide to MLA 8 Format
Chicago Manual of Style 17th Edition
Evaluating Internet Resources
Check It Out: Verifying Information and Sources in News Coverage
How to Do Research: A Step-By-Step Guide: Get Started
How can I tell if a website is credible?
Detecting Fake News at its Source: Machine learning system aims to determine if an information outlet is accurate or biased.
What does "research" mean and are you doing it?


  • Open access
  • Published: 12 July 2024

Improving the introduction of telemedicine in pre-hospital emergency medicine: understanding users and how acceptability, usability and effectiveness influence this process

  • Seán O’Sullivan ORCID: orcid.org/0000-0002-3696-4632,
  • Jennifer Krautwald &
  • Henning Schneider

BMC Emergency Medicine, volume 24, Article number: 114 (2024)


Abstract

Increasing numbers of ambulance calls, vacant positions and growing workloads in Emergency Medicine (EM) are increasing the pressure to find adequate solutions. Telemedicine, which provides health-care services across large distances by connecting remote providers and even patients through modern communication technologies, seems beneficial here. As the process of developing an optimal solution is challenging, quantifying the processes involved could improve implementation. Existing models are based on qualitative studies, although standardised questionnaires for factors such as Usability, Acceptability and Effectiveness exist.

A survey was provided to participants within a German county. It was based on earlier telemedical surveys, the System Usability Scale (SUS) and earlier works describing Usability, Acceptability and Effectiveness. During the study period a telemedical system was introduced in the investigated county. A comparison between user groups as well as an exploratory factor analysis (EFA) was performed.

Of the n = 91 included participants, n = 73 (80.2%) were qualified emergency medical staff (including paramedics n = 36 (39.56%), EMTs n = 28 (30.77%) and call handlers n = 9 (9.89%)) and n = 18 (19.8%) were emergency physicians. Most participants agreed that telemedicine positively impacts EM and improves treatment options, with an overall usability score of 68.68. The EFA yielded a 3-factor solution involving Usability, Acceptability and Effectiveness.

With our results being comparable to earlier studies even though telemedicine had only been sparsely introduced, a positive attitude could still be attested. While our model explains 51.28% of the underlying variance, more research is needed to identify further influences. We showed that Usability correlates with Acceptability with a strong effect, and with Effectiveness with a medium effect, as do Acceptability and Effectiveness. Available systems therefore need to improve. Our approach can guide decision makers and developers: the focus during implementation must be on improving usability and on a valid, data-driven implementation process.

Introduction

With emergency calls to Emergency Medical Services (EMS) increasing, overcrowding an issue in Emergency Departments (ED) and workloads in Emergency Medicine (EM) generally rising, new solutions and approaches are needed to meet these challenges. Current strategies include increasing the number of staff and resources, but these come with rising costs for the medical system as a whole [1, 2, 3, 4, 5, 6, 7]. Therefore new technologies and structures need to be evaluated and introduced to solve these challenges and prevent a loss of performance in EM.

Technologies such as telemedicine are increasingly being introduced and implemented into regular care and treatment processes. Such a technology can deliver health-care services by bridging large distances and connecting remote healthcare providers with each other and/or with patients using modern communication technologies [8].

German EMS and telemedicine

In German EM, a nationally available solution has not yet been established; existing systems provide support only in regional or rural areas [7, 9, 10].

These systems often only provide the option of medical support in the pre-hospital field and do not connect to EDs or other specialist structures like a heart catheter laboratory, although this is technically possible and could improve treatment options and response times [11, 12, 13].

Because Germany has a physician-based EM system, most available telemedical systems address only situations in which an EM physician is required by paramedics for the treatment of the patient's condition but is not rapidly available. In such cases a telemedical EM physician (TEP) can be requested to support the paramedics. Some of the initial research projects have since been adapted to regular practice and can even provide long-term data [9, 14, 15, 16], but they still represent only certain regions of Germany.

While the possibilities and capabilities of telemedicine in pre-hospital EM are increasingly well understood, especially for time-critical scenarios, the area of non-time-critical emergencies (so-called non-emergencies) remains rather understudied [17, 18, 19].

Telemedical networks

Current German law and medical legislation require that patients in the pre-hospital field be seen by a physician, ideally a GP, yet GPs themselves face growing patient numbers and structural changes [20, 21]. A recent survey of not only physicians but also local German politicians and administrators showed that they support the idea of using telemedicine and of not limiting its application to rural areas [16, 22].

Therefore a telemedical solution should not only provide a replacement for an in-person visit, but should be developed as a digital network to connect patients with adequate healthcare providers. This could allow the treatment of both critical and non-critical patients within one network.

Concepts like the Emergency Talk Network (ETN) go even further and involve specialist structures like paediatricians in a digital emergency medical network [22]. Given that ECGs can be transferred using various telemetric devices and that healthcare networks seem to improve the treatment of, for example, acute coronary syndromes, the leading cardiovascular societies recommend the development and availability of telemedical systems and approaches [12, 13].

Extending this approach to involve not one but many more specialists in a single network could therefore further optimise patient flow and the use of limited resources, and improve patient safety as well as guideline adherence.

Users of telemedical systems in Germany

In a nationwide German survey, paramedics and emergency physicians approved of the integration of modern technologies to improve processes and treatments [10]. In an even older survey from 2012, conducted during a broad implementation of telemedicine in EM structures in Aachen, Germany, paramedics described telemedicine not as a tool of control but rather of supervision that improves therapies, especially for the critically ill or when specialist expertise is needed. In general, processes as well as communication between paramedics, dispatch centres and hospitals seemed to improve [23].

Given that this technology has long been viewed positively, this must have been achieved by developing systems that earn high acceptability ratings while also being reliable. Yet the basis of this positive attitude has neither been researched nor quantified with standardised methods. A better understanding could improve implementation processes and advance the integration of more EM providers and specialists, as factors still seem to limit the integration of telemedicine in many healthcare systems.

Implementation, continuity and understanding users

As reported for the field of paediatric EM, not only technical aspects such as feasibility and reliability but also a lack of knowledge appear to be challenges [24, 25].

Healthcare regulators should therefore not only support the development of technical connections and interfaces or provide initial financial investment, but also invest continuously in training programmes for the staff involved.

Users of such telemedical systems are mainly paramedics and physicians. Both groups are faced with the challenges of the introduction of a new technology in an already stressful work environment with constantly changing surroundings.

This also extends to aspects like user demographics, suitable use cases and the individual healthcare provider's needs [26, 27].

Sauers-Ford et al. investigated these aspects using qualitative methods for the application of telemedicine in a paediatric ED. In a connected model, the authors described that the acceptability of a telemedical system influences its perceived usability, while usability influences its effectiveness; in turn, perceived usability and effectiveness feed back into and influence perceived acceptability.

These aspects, "Usability", "Effectiveness" and "Acceptability", which are defined in and originate from the fields of user-centred development and computer technologies, have so far only been described qualitatively in EM, although quantitative approaches are regularly used in software development [28, 29, 30].

As this has, to our knowledge, not yet been described for the field of pre-hospital EM, we planned to investigate it; such an approach could improve the implementation and integration of telemedicine in EM.

We therefore saw the need to gain insight into the underlying challenges by investigating them in a German county that was in the process of introducing a new telemedical system.

Test region

The German county Main-Taunus-Kreis (MTK) is a suburban region with mixed population density in the federal state of Hesse. With one half of the county densely populated and city-like and the other more rural [31], it is a challenging region in which to introduce new emergency medical structures.

Located near the city of Frankfurt am Main, with multiple trauma centres, two university hospitals and many academic teaching centres available, patients can rapidly be transported to and treated at highly specialised facilities [31].

New telemedical system

Paramedics at the site of an emergency can request support from a TEP. The TEP can access vital signs, which are provided via the combined monitoring and defibrillator system Corpuls C3 from the company Corpuls. With the accompanying software application corpuls.mission, paramedics can optionally consult a TEP using live audio and visual communication [32, 33].

As implementation was planned as a step-wise process, not all ambulances were equipped immediately. During the trial, only ambulances from two of the five ambulance stations carried the system: 4 of 11 ambulances overall.

Two were located in a rural area and the other two in a densely populated area, as shown in Fig. 1.

[Fig. 1: Ambulance stations and ambulances in the county Main-Taunus-Kreis during the time of the study. Original image "Abbildung 25: Rettungswachenversorgungsbereich MTK", page 119 [31], modified by the author by adding the table.]

Trial design

To understand the processes involved as well as the users' opinions, a pre-test was performed for one week in March, followed by a revision and update of the questionnaire. The final questionnaire was then available via an online link from 3 April 2023 to 14 May 2023. Participants received an invitation via email from the county's medical director of prehospital EM (German: Ärztlicher Leiter Rettungsdienst, ÄLRD) and could participate voluntarily. During this time the medical director sent two reminders to the participants to complete the survey.

Results could only be included if participants worked within this county in the field of EM, the questionnaire was fully completed and a data-protection waiver according to EU-GDPR was approved by the participant. Other results were excluded.

Before the final questionnaire was launched, a pretest was performed for one week in March 2023. Participants were invited to answer the first version of the questionnaire and provide comments.

Participants of this pre-test were medical directors of prehospital EM from several counties of the state of Hesse, representatives of the Hessian Ministry of Social Affairs (HMSI), which is responsible for the field of EM and telemedicine, paramedics, EMTs, emergency physicians as well as qualified call handlers for emergency dispatch centres.

The participants had to work in regions in which telemedicine in EM was available. These were defined as the counties Giessen and Main-Kinzig-Kreis in Hesse. Other results were excluded. Cronbachs Alpha was analysed with an α-value of 0,05 [ 34 ]. After reviewing the results and comments the authors decided if parts of the questionnaire had to be adapted. Only if both authors SO and JK agreed, a change could be performed. If only one agreed, the other author HS would be involved and a change was performed if the majority approved. Between the pre-test and the finale questionnaire a time of 2–3 weeks was planned for revision of the final questionnaire.

Questionnaire

The final questionnaire consisted of 50 items in German, with single- and multiple-choice questions as well as Likert-scaled and open-ended answer options; it can be viewed in Annex 1 . The items were divided into 4 parts:

Part A - General Part - consisted of 5 questions regarding age, identified sex, field of work, current qualification and work experience.

Part B - Tele-Emergency Physician Concept - consisted of 29 questions regarding the use of telemedicine in emergency medicine with a 5-point Likert scale.

Of these items, 14 were based on the questionnaire from Metelmann et al., 3 on Kuntosch et al. [ 35 , 36 ], and 11 were adapted after reviewing the results of the pre-test.

Part C - Usability - consisted of the 10 questions of the System Usability Scale (SUS) in German [ 37 , 38 ] (a scoring sketch follows after this list).

Part D - Open Questions - consisted of 6 open-ended questions, of which 4 were based on Sauers-Ford et al. [ 28 ] and 2 on the results and comments of the pre-test.
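Since Part C uses the standard 10-item SUS, the conventional scoring rule can be sketched as follows. This is a generic illustration of Brooke's published scoring scheme, not the authors' code; the function name and example answers are hypothetical.

```python
def sus_score(responses: list[int]) -> float:
    """Score one completed 10-item SUS questionnaire (answers 1-5).

    Conventional SUS scoring: odd-numbered items contribute (answer - 1),
    even-numbered items contribute (5 - answer); the raw sum (0-40) is
    multiplied by 2.5 to yield a 0-100 score.
    """
    assert len(responses) == 10
    raw = sum((r - 1) if i % 2 == 1 else (5 - r)
              for i, r in enumerate(responses, start=1))
    return raw * 2.5

# Example: a mildly positive respondent
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0
```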

Participants

Participants of the final questionnaire had to be medically qualified to work in the field of EM, meaning they had to be paramedics, EMTs, emergency physicians, or qualified call handlers for emergency dispatch centres. All other participants were excluded.

Data analysis

Before analysis, all data were transferred from the online questionnaire platform to a database (Microsoft Excel, Version 22.10, Vermont, USA [ 39 ]). RStudio (2023.12.1+402, including R version 4.3.2, 2023-10-31) was used for statistical analysis, as a combined quantitative and qualitative process was planned:

Quantitative approach

The groups were to be compared depending on qualification, sex, availability of a telemedical system and level of knowledge about telemedicine.

For the analysis of the SUS results a t-test was performed, and for more than two groups an ANOVA. Significance was defined at p < 0.05.

T-tests were performed for the groups sex, physician vs. non-physician, and availability of a telemedical system. ANOVA was performed for the other groupings: qualification, age group and experience.

Homogeneity of variance was analysed using Levene's test (p < .05). If homogeneity of variance was not confirmed, a Welch test was performed [ 40 ].

For significant results, a further analysis was performed with the Bonferroni post-hoc test (α = 0.05).

For correlation analysis, Pearson's product-moment correlation was used and the α-value was defined at 0.05.
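As a minimal sketch of these quantitative steps (illustrative random data, and SciPy in place of the authors' actual R workflow):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Illustrative SUS scores for two groups (not the study data)
group_a = rng.normal(69, 12, 66)
group_b = rng.normal(67, 15, 25)

# Levene's test for homogeneity of variance (alpha = 0.05)
_, lev_p = stats.levene(group_a, group_b)

# Student's t-test if variances are homogeneous, otherwise Welch's t-test
t, p = stats.ttest_ind(group_a, group_b, equal_var=(lev_p >= 0.05))

# One-way ANOVA for more than two groups (e.g. three age groups)
g1, g2, g3 = (rng.normal(68, 13, n) for n in (42, 31, 18))
f, p_anova = stats.f_oneway(g1, g2, g3)

# Pearson's product-moment correlation (e.g. age vs. SUS score)
age = rng.uniform(19, 59, 91)
sus = rng.normal(68, 13, 91)
r, p_corr = stats.pearsonr(age, sus)
```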

Factor analysis

This part of the analysis was performed using an exploratory factor analysis (EFA). With regard to the earlier described qualitative results from Sauers-Ford et al., the factors “Acceptability”, “Usability” and “Effectiveness” [ 28 ] were identified and analysed [ 41 , 42 , 43 ].

Further methodical information on the EFA can be viewed in Supplement 1.
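A compact sketch of such an EFA follows, using scikit-learn's varimax-rotated factor analysis as a stand-in for whatever R routine was actually used; the data matrix is a random placeholder, since the study data are not reproduced here.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Placeholder: 91 respondents x 23 Likert items (not the real data)
X = np.random.default_rng(1).normal(size=(91, 23))

# Kaiser criterion: retain factors whose correlation-matrix eigenvalues
# exceed 1; the same eigenvalues, sorted descending, give the scree plot
eigenvalues = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
n_factors = int((eigenvalues > 1).sum())

# Exploratory factor analysis with orthogonal (varimax) rotation
fa = FactorAnalysis(n_components=3, rotation="varimax").fit(X)
loadings = fa.components_.T  # 23 items x 3 factors

# Rough proportion of variance carried by the first three components
explained = eigenvalues[:3].sum() / eigenvalues.sum()
```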

Following the results of the factor analysis, a multiple regression analysis was performed. Based on the results from Sauers-Ford et al. and an earlier performed correlation analysis, “Acceptability” was analysed as the dependent variable and “Usability” and “Effectiveness” as independent variables.

Multiple regression analysis

Alongside the multiple regression analysis, the Mann-Whitney U and Kruskal-Wallis tests were used as non-parametric methods.

For significant results of the Kruskal-Wallis test, a Bonferroni post-hoc test was performed. The α-value was defined at 0.05. To check for normal distribution, a Kolmogorov-Smirnov test was performed beforehand with the p-value defined at 0.05. If a significant group difference was found, the effect size was analysed using Pearson's correlation coefficient (r) [ 44 ].
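A sketch of these non-parametric steps with SciPy (illustrative inputs, not the study data; the z-to-r conversion shown is the commonly used rule r = |z|/√N):

```python
import numpy as np
from scipy import stats

# Illustrative Likert answers for two groups
physicians = np.array([4, 5, 4, 3, 5, 4, 4])
non_physicians = np.array([3, 3, 4, 2, 4, 3, 3, 2])

# Kolmogorov-Smirnov test against a normal distribution, run beforehand
z_scores = (non_physicians - non_physicians.mean()) / non_physicians.std(ddof=1)
ks_stat, ks_p = stats.kstest(z_scores, "norm")

# Mann-Whitney U test for two groups
u, p_mwu = stats.mannwhitneyu(physicians, non_physicians,
                              alternative="two-sided")

# Kruskal-Wallis test for three or more groups (e.g. knowledge levels)
h, p_kw = stats.kruskal([1, 2, 3, 3], [2, 3, 4, 4], [4, 4, 5, 3])

# Effect size from a standardized statistic: r = |z| / sqrt(N)
r = abs(-2.074) / np.sqrt(91)  # ~0.217, the value reported below
```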

All methods were carried out in accordance with relevant guidelines and regulations.

As no potential harm was to be expected, the local ethics committee (University Hospital Giessen, Germany) solely required informed consent including a data privacy agreement from the participants.

Overall n = 19 participants (8 paramedics, 6 emergency physicians, 3 EMTs and 2 call handlers) took part in the pre-test. Reliability was confirmed with a Cronbach's alpha of 0.837.

The average SUS score was 74.2 and was likewise confirmed with an α-value of 0.898, showing a satisfying internal consistency.

As some questions used the past tense, adaptations had to be made, e.g. into the present tense. The adaptations can be seen in Annex 1 , which also includes an explanation for each change.

Final questionnaire

Participants and qualifications

At the time of the study there were n = 308 registered professionals (including part-time employees and temporary staff), consisting of n = 238 (77.3%) non-physicians and n = 70 (22.7%) emergency physicians. Of these, n = 91 (29.5%) completed the survey in full and were included in the analysis (see Fig. 2).

Fig. 2 Visualization of the participants' replies as proportions of agreement (green) and disagreement (yellow)

The participants (n = 91) were on average 34.38 ± 10.89 years old, 95% CI [33.27; 35.50]; the oldest was 59 and the youngest 19 years old.

Regarding sex, n = 66 participants (72.5%) identified as male and n = 25 (27.5%) as female.

n = 73 (80.2%) participants were qualified as emergency medical staff, comprising paramedics (n = 36, 39.56%), EMTs (n = 28, 30.77%) and call handlers for emergency dispatch centres (n = 9, 9.89%). n = 18 (19.8%) were qualified as emergency physicians.

Referral of patients to the appropriate treatment centre

A Mann-Whitney U test was performed to evaluate the item “The telemedicine system supports the referral of patients to the appropriate treatment centre” between physicians and non-physicians. Physicians agreed with this item significantly more than non-physician medical staff [z = -2.074, p = .038]. The effect size was small (r = 0.217).

A comparison within the group of non-physicians was also performed between EMTs (n = 24, 42.1%) and paramedics (n = 33, 57.9%).

48.5% of paramedics and 33.3% of EMTs agreed with this item; the difference was not significant [z = -1.845, p = .065].

Level of knowledge about telemedicine

n = 43 (47.3%) participants were assigned to the group “little or no knowledge”, n = 26 (28.6%) to the group “moderate level of knowledge” and n = 22 (24.2%) to the group “high level of knowledge”.

The item “The telemedical system leads to an improvement of treatment options” showed a significant difference [χ² = 6.871, p = .032]. The post-hoc test showed a significant difference between the groups “little or no knowledge” and “high level of knowledge” [z = -2.401, p = .049], as 90.3% of the “high level of knowledge” group agreed with this item compared to only 65.1% of the “little or no knowledge” group. The effect size was weak (r = 0.295).

Intended use of the telemedical system

From the group with a “high level of knowledge”, 36.4% of participants reported an intended daily to weekly use, compared to 23.1% from the group with a “moderate level of knowledge” and 30.2% from the group with “little or no knowledge”. There was no significant difference between the groups [χ² = 9.521, p = .199].

Comparing age groups

Participants were assigned to 3 age groups: n = 42 (46.2%) participants to the age group 19–31 years, n = 31 (34.1%) to the group 32–45 years and n = 18 (19.8%) to the group 46–59 years.

The item “The tele-emergency physician performs higher-level supervisory and control functions” showed a significant group difference [χ² = 12.958, p = .002].

In the post-hoc tests, a significant difference between the age groups 32–45 and 46–59 was seen [z = -3.356, p = .002] with a medium effect size (r = 0.479).

A significant difference was also found between the age groups 19–31 and 46–59 [z = -3.205, p = .004] with a medium effect size (r = 0.413).

The participants from the groups 19–31 and 32–45 years agreed more with this item than the 46–59 year olds.

Frequency of intended use

The group 19–31 years replied with 33.3% for a daily to weekly use, compared to 32.3% of the 32–45 years group and 16.7% of the group 46–59 years. There was no significant group difference [χ² = 8.428, p = .215].

Request for support

Regarding the items asking participants in which area a request for support is likely, the answer “decision on diagnosis and therapy” (n = 71, 58.7%) was chosen most frequently, followed by “organisational support” (n = 24, 19.8%), “manual skills” (n = 16, 13.2%), “no support needed” (n = 4, 3.3%) and others (n = 6, 5.0%) (see Table 1).

Comparing the 3 age groups, n = 34 (81%) participants in the group 19–31 years and n = 26 (83.9%) from the group 32–45 years requested support on “diagnosis and therapy”, followed by “manual skills” with n = 7 (22.6%) from the 32–45 years group and n = 4 (22.2%) from the group 46–59 years.

Regarding the qualification of participants, 75.8% of paramedics (n = 25) and 95.8% of EMTs (n = 23) requested support on “diagnosis and therapy”.

System usability scale

With a Cronbach's alpha of 0.829, overall “Usability” received a score of 68.68 (SD 12.76), 95% CI [67.37; 69.99] (see Table 2), which allows the system to be described as usable [ 45 ].

Female and male participants

To compare the SUS score between the female and male groups, a two-sided t-test was performed.

There was no significant difference in SUS score between the male (M = 69.24, SD = 12, 95% CI [66.29; 72.18]) and female group (M = 67.20, SD = 14.74, 95% CI [61.12; 73.28]); [t(89) = 0.67952, p = .499] (see Fig. 3, label A).

Fig. 3 Evaluation of usability results by SUS score, differentiated by category: A - sex, B - physician vs. non-physician, C - availability of TEP, D - age

Physicians and non-physicians

There was no significant difference in SUS scores between the physician (M = 68.89, SD = 10.44, 95% CI [63.70; 74.08]) and non-physician group (M = 68.63, SD = 13.33, 95% CI [65.72; 71.94]); [t(89) = 0.076628, p = .939] (see Fig. 3, label B).

Availability of a TEP system

There was no significant difference in SUS scores between the group that had a TEP system available (M = 69.21, SD = 13.82, 95% CI [65.54; 72.88]) and the group that did not (M = 67.79, SD = 10.9, 95% CI [65.72; 71.59]); [t(89) = 0.51014, p = .611] (see Fig. 3, label C).

SUS and age

A Pearson correlation coefficient was computed to assess a linear relationship between the participants' age and SUS score. There was no correlation between the two variables [r(89) = -0.002, p = .989] (see Fig. 3, label D).

Results age groups

A one-way ANOVA was performed to compare the effect of age group on the SUS score. Homogeneity of variances was confirmed with Levene's test. There was no statistically significant difference in mean SUS scores between the groups [F(1, 89) = 0.038, p = .845] (see Fig. 4, label A).

Fig. 4 Evaluation of usability results by SUS score, differentiated by category: A - age group, B - qualification, C - experience

Qualifications

When comparing the effect of the participants' qualifications on the SUS score, Levene's test confirmed homogeneity of variances. No statistically significant difference in mean SUS scores between the groups could be seen [F(5, 85) = 2.028, p = .083] (see Fig. 4, label B).

When comparing participants' experience with the SUS score, homogeneity of variances was confirmed with Levene's test, but there was no statistically significant difference between the groups [F(5, 85) = 2.029, p = .083] (see Fig. 4, label C).

Acceptability - usability - effectiveness

To explore the factorial structure, 23 items (excluding sub-questions) were subjected to an exploratory factor analysis with orthogonal rotation. Further information on the factor analysis can be viewed in Supplement 1.

Based on the Kaiser criterion of eigenvalues greater than 1, and as indicated by the scree plot, a three-factor solution emerged as the best fit for the data, accounting for 51.28% of the variance (see Fig. 5). The results of this factor analysis, including each variable and MSA value, are presented in detail in Supplement 1.

Fig. 5 Top: scree plot indicating that a three-factor solution accounts for the majority of variance. Bottom: regression model for “Acceptability” based on the variables “Usability” and “Effectiveness”

As indicated by the scree plot, a 4- or 5-factor model would also be possible (see Fig. 5, top). However, no corresponding models appear to exist in the available literature, and the additional factors only marginally exceed an eigenvalue of 1.

Correlation and regression

The subsequent reliability analysis showed that Cronbach's alpha was 0.801 for the factor “Usability”, 0.779 for “Effectiveness” and 0.805 for “Acceptability”. All α-values were between 0.7 and 0.9, which describes consistent subscales [ 34 ].

A Pearson correlation coefficient was computed to assess the linear relationship between “Acceptability” and “Effectiveness”. A positive correlation between the two variables was seen, r(89) = .439, p < .001, with a medium effect size (see Table 3).

Between “Usability” and “Effectiveness” a positive correlation existed, r(89) = 0.435, p < .001, with a medium effect size (see Table 3).

Between “Acceptability” and “Usability” a positive correlation was also found, r(89) = 0.570, p < .001, with a strong effect size (see Table 3).

Linear regression model

Based on the earlier results and on Sauers-Ford et al.'s concept, the factor “Acceptability” was analysed as the predicted variable and “Usability” and “Effectiveness” as predictor variables.

The regression model was: Acceptability = 0.472 × Usability + 0.163 × Effectiveness − 3.943 (see Fig. 5, bottom).

Overall regression was statistically significant (R 2  = 0.355, F(2, 88) = 25.801, p  = .001).

It was found that “Usability” significantly predicted “Acceptability” (β = 0.467, p  < .001) and “Effectiveness” significantly predicted “Acceptability” (β = 0.236, p  = .014).

According to Cohen, the effect was strong (f² = 0.55).
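This effect size can be reproduced directly from the reported R² via Cohen's formula f² = R² / (1 − R²):

```python
R2 = 0.355
f2 = R2 / (1 - R2)
print(round(f2, 2))  # 0.55, matching the reported effect size
```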

Successfully introducing and applying a telemedical system is challenging, even more so in an interdisciplinary field like EM.

This is a challenge not only for those involved with the introduction of such solutions, but especially for those who will use them daily and those who manage clinical processes.

Understanding the relevant factors of a successful technological roll-out during its continuous implementation into workflows therefore becomes even more vital.

With our study we could not only confirm what Sauers-Ford et al. already described, that “Acceptability influences usability, which influences effectiveness” [ 28 ], but could also quantify these effects. With “Usability” affecting “Acceptability” even more than “Effectiveness”, software and system developers, but also those involved in choosing and implementing telemedical systems, need to focus more on “Usability”. Choosing and introducing a system that offers a high level of “Usability” will increase its “Acceptability” far more than focusing only on a system's capability to solve a problem.

So even if users perceive a system to be highly effective, they may still not accept it if the perceived “Usability” is not adequate. In the worst case, the system would not be used, and the misinvested funds could have been applied elsewhere in a healthcare system that already struggles with increasingly tight resources.

Our results therefore not only extend this principle but emphasize its relevance for the whole field of EM, as the three factors account for over 51% of the influencing effects.

We therefore recommend that regulators and administrators perform such analyses regularly and not only monitor existing introduction processes. Furthermore, the impact of training as well as system updates needs to be recognized, to allow earlier handling of problems and recurring challenges.

With biases and beliefs being relevant factors in accepting changes in one's work environment, understanding these becomes even more relevant:

Overall, a positive attitude could be attested to the participants, as telemedicine is seen as an advantageous tool to generally improve treatment options and processes. Especially those with access to a telemedical system see more treatment options becoming available.

Regarding the introduction within established processes, the results showed a small tendency to see the TEP system as increasing general workload.

This underlines that such systems need to focus not only on being effective but also on being usable. This includes rapid availability, improved “Acceptability” and adaptability to various scenarios, ideally within an established technological ecosystem to improve effectiveness.

Physician participants agreed more with the item that a TEP system supports the referral of patients to an appropriate treatment centre, making specialists rapidly available at the site of an emergency; this availability could improve patients' quality of treatment and reduce unnecessary transfers [ 46 ]. While this advantage seems to be present in the participating physicians' minds, further extended education will be needed for paramedics. As a new communication network could allow more treatment options, overall processes in the pre-clinical field will change and would, for example, involve paramedics performing more advanced treatments while being supervised by the TEP.

Most participants agreed that a decision on diagnosis and therapy would currently be the most common request for support. With this being a common theme in many surveys, including some published a decade ago [ 15 , 36 ], this reply provides a relevant insight, as current telemedical systems, especially in Germany, focus on being merely a tool to replace emergency physicians. With further development of telemedical solutions, such systems will probably not only be used for communication on treatment and diagnosis, but will also need to allow broader applications. For example, support on manual skills could be provided if technologies like augmented reality (AR) and point-of-care ultrasound were combined, while allowing supervision by a TEP.

Developing a broad network solution, like an emergency talk network [ 22 ], could therefore not only make specialists rapidly available but also allow advanced imaging to be used for diagnosis and treatment. Combining advanced technology with advanced treatment possibilities in the pre-hospital field could be one of many options.

With an overall “Usability” rating of 68.68, this system can be described as usable, but only at a marginal range [ 47 ]. In an early stage of introduction such an evaluation could be expected, but a continuous focus on “Usability” is needed if the long-term goal is high “Acceptability”.

Furthering education and advancing the available training could improve implementation, while the effects should be monitored. At the same time, optimising the system according to human-factors design and user recommendations [ 27 ] would be the most promising approach for improving telemedical systems.

Limitations

As this trial was performed in only one region of Germany and involved only one telemedical system, our results can only be generalized with regard to these limitations.

EM systems vary nationally, and some regions only involve emergency physicians in special circumstances. Systems that are developed and used in these countries will need different specifications regarding the users' needs.

As the sample sizes, especially for comparisons of different groups and specifications, were rather small, generalization is possible only with limited applicability.

With this being the first reported factor analysis of its kind, further confirmatory analyses should be performed in EM, especially to understand the other involved factors, as the effect of the remaining 48% of variance is still unclear. Further research is therefore needed to understand this large proportion, as not only implementation processes but also the development of better telemedical systems could allow an improved and more individualised application of telemedicine in prehospital EM.

Conclusions

When introducing a telemedical system, a deep understanding of the involved structures, legislation, medical cases, regional differences, and especially the users is required. Developing an understanding of these effects is relevant and requires a framework to improve the implementation process. Quantifying this process allows decision makers to understand the challenges and which steps can provide the most impact. Focusing on aspects of “Usability” will improve the acceptance of systems even more than focusing only on a system's “Effectiveness”.

With our approach we developed a framework which can be applied to various settings of telemedicine in EM, but it needs further testing and validation in other settings. Telemedicine will not only be a technology to replace currently missing staff or resources, but one that, combined with novel developments, will add more treatment options at the site of an emergency.

Data availability

Data is provided within the manuscript or supplementary information files. The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Andrew E, Nehme Z, Cameron P, Smith K. Drivers of increasing emergency ambulance demand. Prehosp Emerg Care. 2020;24(3):385–93.

Aringhieri R, Bruni ME, Khodaparasti S, van Essen JT. Emergency medical services and beyond: addressing new challenges through a wide literature review. Computers Oper Res. 2017;78:349–68.

Di Somma S, Paladino L, Vaughan L, Lalle I, Magrini L, Magnanti M. Overcrowding in emergency department: an international issue. Intern Emerg Med. 2015;10(2):171–5.

Andrew E, Nehme Z, Stephenson M, Walker T, Smith K. The impact of the COVID-19 pandemic on demand for emergency ambulances in Victoria, Australia. Prehosp Emerg Care. 2022;26(1):23–9.

Kim DK, Kim TH, Shin SD, Ro YS, Song KJ, Hong KJ, et al. Impact of crowding in local ambulance demand on call-to-ambulance scene arrival in out-of-hospital cardiac arrest. Am J Emerg Med. 2022;52:105–9.

Snoswell CL, Taylor ML, Comans TA, Smith AC, Gray LC, Caffery LJ. Determining if telehealth can reduce health system costs: scoping review. J Med Internet Res. 2020;22(10):e17298.

Regierungskommission für eine moderne und bedarfsgerechte Krankenhausversorgung. Vierte Stellungnahme und Empfehlung der Regierungskommission für eine moderne und bedarfsgerechte Krankenhausversorgung - Reform der Notfall- und Akutversorgung in Deutschland: Integrierte Notfallzentren und Integrierte Leitstellen [Internet]. Report No. 4. Berlin: Bundesministerium für Gesundheit; 2023 Feb [cited 16 February 2023]. Available at: https://www.bundesgesundheitsministerium.de/fileadmin/Dateien/3_Downloads/K/Krankenhausreform/Vierte_Stellungnahme_Regierungskommission_Notfall_ILS_und_INZ.pdf .

World Health Organization. Consolidated telemedicine implementation guide [Internet]. Geneva: World Health Organization; 2022 [cited 23 January 2024]. Available at: https://iris.who.int/bitstream/handle/10665/364221/9789240059184-eng.pdf?sequence=1 .

Rupp D, Benöhr P, König MK, Bollinger M, Wranze-Bielefeld E, Eichen PM. Telenotarztsysteme im deutschen Rettungsdienst: eine nationale Sachstandserhebung. Notfall Rettungsmed [Internet]. 2022 Aug 16 [cited 21 January 2024]. Available at: https://doi.org/10.1007/s10049-022-01063-3 .

Möllenhoff C, Eder PA, Rashid A, Möllenhoff C, Römer I, Franczyk B. Digitale Systeme zur Unterstützung von präklinischen Notfalleinsätzen. Anaesthesist [Internet]. 2022 Jan 6 [cited 9 January 2022]. Available at: https://doi.org/10.1007/s00101-021-01085-5 .

Amadi-Obi A, Gilligan P, Owens N, O'Donnell C. Telemedicine in pre-hospital care: a review of telemedicine applications in the pre-hospital environment. Int J Emerg Med. 2014;7:29.

Caldarola P, Gulizia MM, Gabrielli D, Sicuro M, De Gennaro L, Giammaria M, et al. ANMCO/SIT Consensus Document: telemedicine for cardiovascular emergency networks. Eur Heart J Suppl. 2017;19(Suppl D):D229–43.

Byrne RA, Rossello X, Coughlan JJ, Barbato E, Berry C, Chieffo A, et al. 2023 ESC guidelines for the management of acute coronary syndromes: developed by the task force on the management of acute coronary syndromes of the European Society of Cardiology (ESC). Eur Heart J. 2023;44(38):3720–826.

Institut für Notfallmedizin und Medizinmanagement (INM), Klinikum der Universität München. Rettungsdienstbericht Bayern 2019 [Internet]. München: INM, Klinikum der Universität München; 2019 [cited 30 March 2022], p. 128. (Reporting period: 2009 to 2018.) Available at: https://www.inm-online.de/images/stories/pdf/Rettungsdienstbericht_Bayern_2019.pdf .

Kuntosch J, Brinkrolf P, Metelmann C, Metelmann B, Fischer L, Hirsch F, et al. Etablierung einer Telenotarzt-Anwendung. In: Hahnenkamp K, Fleßa S, Hasebrook J, Brinkrolf P, Metelmann B, Metelmann C, editors. Notfallversorgung auf dem Land: Ergebnisse des Pilotprojektes Land|Rettung [Internet]. Berlin, Heidelberg: Springer; 2020. pp. 115–246. Available at: https://doi.org/10.1007/978-3-662-61930-8_4 .

Schröder H, Beckers SK, Borgs C, Sommer A, Rossaint R, Grüßer L, et al. Long-term effects of a prehospital telemedicine system on structural and process quality indicators of an emergency medical service. Sci Rep. 2024;14:310.

du Toit M, Malau-Aduli B, Vangaveti V, Sabesan S, Ray RA. Use of telehealth in the management of non-critical emergencies in rural or remote emergency departments: a systematic review. J Telemed Telecare. 2019;25(1):3–16.

Sharma R. The Availablists: emergency care without the emergency department. NEJM Catalyst [Internet]. 2021 Dec 21 [cited 7 April 2022];2(12). Available at: https://doi.org/10.1056/CAT.21.0310 .

Uscher-Pines L, Pines J, Kellermann A, Gillen E, Mehrotra A. Deciding to visit the emergency department for non-urgent conditions: a systematic review of the literature. Am J Manag Care. 2013;19(1):47–59.

Wangler J, Jansky M. How can primary care be secured in the long term? A qualitative study from the perspective of general practitioners in Germany. Eur J Gen Pract. 2023;29(1):2223928.

Kuhn B, Kleij KS, Liersch S, Steinhäuser J, Amelung V. Which strategies might improve local primary healthcare in Germany? An explorative study from a local government point of view. BMC Fam Pract. 2017;18(1):105.

O'Sullivan S, Schneider H. Comparing effects and application of telemedicine for different specialties in emergency medicine using the emergency talk application (U-Sim ETA Trial). Sci Rep. 2023;13(1):13332.

Schneiders MT, Herbst S, Schilberg D, Isenhardt I, Jeschke S, Fischermann H, et al. Telenotarzt auf dem Prüfstand. Notfall Rettungsmed. 2012;15(5):410–5.

Hansen M, Meckler G, Dickinson C, Dickenson K, Jui J, Lambert W, et al. Children's safety initiative: a national assessment of pediatric educational needs among emergency medical services providers. Prehosp Emerg Care. 2015;19(2):287–91.

Siew L, Hsiao A, McCarthy P, Agarwal A, Lee E, Chen L. Reliability of telemedicine in the assessment of seriously ill children. Pediatrics. 2016;137(3):e20150712.

Tang Z, Johnson TR, Tindall RD, Zhang J. Applying heuristic evaluation to improve the usability of a telemedicine system. Telemed J E Health. 2006;12(1):24–34.

Fouquet SD, Miranda AT. Asking the right questions – human factors considerations for telemedicine design. Curr Allergy Asthma Rep. 2020;20(11):66.

Sauers-Ford HS, Hamline MY, Gosdin MM, Kair LR, Weinberg GM, Marcin JP, et al. Acceptability, usability, and effectiveness: a qualitative study evaluating a pediatric telemedicine program. Acad Emerg Med. 2019;26(9):1022–33.

Jobé C, Carron PN, Métrailler P, Bellagamba JM, Briguet A, Zurcher L, et al. Introduction of telemedicine in a prehospital emergency care setting: a pilot study. Int J Telemed Appl. 2023;2023:1171401.

International Organization for Standardization. ISO 9241-11:2018(en), Ergonomics of human-system interaction - Part 11: Usability: definitions and concepts [Internet]. ISO Online Browsing Platform; 2018 [cited 14 June 2023]. Available at: https://www.iso.org/obp/ui/#iso:std:iso:9241:-11:ed-2:v1:en .

Der Main-Taunus-Kreis. Bereichsplan für den Rettungsdienstbereich Main-Taunus-Kreis - 7. Fortschreibung (Stand 01.10.2023) [Internet]. Hofheim am Taunus: Main-Taunus-Kreis - Der Kreisausschuss; 2023 Jan [cited 30 January 2024]. Available at: https://www.mtk.org/statics/ds_doc/downloads/Bereichsplan_7_Fortschreibung_01102023.pdf .

GS Elektromedizinische Geräte G. Stemple GmbH. Corpuls C3 Produktbroschüre & Zulassungen [Internet]. Kaufering, Germany; 2016 Jun, p. 6. Available at: https://corpuls.world/wAssets/docs/broschueren/broschuere-corpuls3-de-min.pdf .

GS corpuls. corpuls.mission | corpuls [Internet]. GS corpuls; 2024 [cited 3 February 2024]. Available at: https://corpuls.world/en/products/corpuls.mission/ .

Taber KS. The use of Cronbach's alpha when developing and reporting research instruments in science education. Res Sci Educ. 2018;48(6):1273–96.

Kuntosch J, Metelmann B, Zänger M, Maslo L, Fleßa S. Das Telenotarzt-System als Innovation im Rettungsdienst: Potenzialbewertung durch Mitarbeiter deutscher Einsatzleitstellen. Gesundheitswesen [Internet]. 2020 Sep 4 [cited 15 July 2021]. Available at: http://www.thieme.connect.de/DOI/DOI?10.1055/a-1144-2881 .

Metelmann C, Metelmann B, Bartels J, Laslo T, Fleßa S, Hasebrook J. Was erwarten Mitarbeiter der Notfallmedizin vom Telenotarzt? Notfall Rettungsmed. 2019;22(6):492–9.

Gao M, Kortum P, Oswald FL. Multi-language toolkit for the System Usability Scale. Int J Hum Comput Interact. 2020;36(20):1883–901.

Rummel B. System Usability Scale – jetzt auch auf Deutsch [Internet]. SAP User Experience Community; 2015 [cited 3 February 2021]. Available at: https://experience.sap.com/skillup/system-usability-scale-jetzt-auch-auf-deutsch/ .

Davenport T. Update history for Microsoft 365 Apps (listed by date) - Office release notes [Internet]. 2023 [cited 7 March 2023]. Available at: https://learn.microsoft.com/de-de/officeupdates/update-history-microsoft365-apps-by-date .

Lakens D. Taking parametric assumptions seriously: arguments for the use of Welch's F-test instead of the classical F-test in one-way ANOVA. Int Rev Soc Psychol. 2019;32(1):13.

Bryant FB, Yarnold PR, Michelson EA. Statistical methodology. Acad Emerg Med. 1999;6(1):54–66.

UCLA: Statistical Consulting Group. A practical introduction to factor analysis: exploratory factor analysis [Internet]. 2021 [cited 4 February 2024]. Available at: https://stats.oarc.ucla.edu/spss/seminars/introduction-to-factor-analysis/a-practical-introduction-to-factor-analysis/ .

Taherdoost H, Sahibuddin S, Jalaliyoon N. Exploratory factor analysis: concepts and theory. In: Balicki J, editor. Advances in Applied and Pure Mathematics [Internet]. WSEAS; 2014 [cited 4 February 2024], pp. 375–82. (Mathematics and Computers in Science and Engineering Series; vol. 27.) Available at: https://hal.science/hal-02557344 .

Sullivan GM, Feinn R. Using effect size - or why the P value is not enough. J Grad Med Educ. 2012;4(3):279–82.

Bangor A, Kortum PT, Miller JT. An empirical evaluation of the System Usability Scale. Int J Hum Comput Interact. 2008;24(6):574–94.

Dharmar M, Romano PS, Kuppermann N, Nesbitt TS, Cole SL, Andrada ER, et al. Impact of critical care telemedicine consultations on children in rural emergency departments. Crit Care Med. 2013;41(10):2388.

Bangor A, Kortum P, Miller J. Determining what individual SUS scores mean: adding an adjective rating scale. J Usability Stud. 2009;4(3):114–23.

Acknowledgements

The authors are grateful to the paramedics, emergency physicians, and experts in the field of Emergency Medicine and medical informatics for their continuous support. We are grateful for the contributions made by the staff and administration of the county of the Main-Taunus Kreis and especially its medical director of prehospital emergency medicine ÄLRD Jörg Blau.

Open Access funding enabled and organized by Projekt DEAL. The research did not receive specific funding, but was performed as part of the employment of the authors at the Technische Hochschule Mittelhessen Fachbereich Gesundheit.


Author information

Authors and Affiliations

Faculty of Health Sciences, Technische Hochschule Mittelhessen, Gießen, Germany

Seán O’Sullivan, Jennifer Krautwald & Henning Schneider


Contributions

O'Sullivan SF: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Project administration; Supervision; Validation; Visualization; Writing – original draft; Writing – review & editing. Krautwald J: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Project administration; Validation; Visualization; Writing – original draft. Schneider H: Conceptualization; Formal analysis; Funding acquisition; Methodology; Project administration; Resources; Supervision; Writing – review & editing.

Corresponding author

Correspondence to Seán O’Sullivan .

Ethics declarations

Ethics approval and consent to participate

As no potential harm was to be expected, the local ethics committee (University Hospital Giessen, Germany) solely required informed consent from the participants. Participation was only possible after written consent including a data-privacy agreement was signed. Therefore the study was approved by the ethics committee of the University of Gießen and informed consent was obtained from all participants.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

O’Sullivan, S., Krautwald, J. & Schneider, H. Improving the introduction of telemedicine in pre-hospital emergency medicine: understanding users and how acceptability, usability and effectiveness influence this process. BMC Emerg Med 24 , 114 (2024). https://doi.org/10.1186/s12873-024-01034-6


Received : 30 March 2024

Accepted : 27 June 2024

Published : 12 July 2024

DOI : https://doi.org/10.1186/s12873-024-01034-6


  • Telemedicine
  • Emergency medicine
  • Emergency medical services
  • Prehospital
  • Human factors
  • Mobile application

A Primer on the Validity of Assessment Instruments

1. What is reliability? 1

Reliability refers to whether an assessment instrument gives the same results each time it is used in the same setting with the same type of subjects. Reliability essentially means consistent or dependable results. Reliability is a part of the assessment of validity.

2. What is validity? 1

Validity in research refers to how accurately a study answers the study question or the strength of the study conclusions. For outcome measures such as surveys or tests, validity refers to the accuracy of measurement. Here validity refers to how well the assessment tool actually measures the underlying outcome of interest. Validity is not a property of the tool itself, but rather of the interpretation or specific purpose of the assessment tool with particular settings and learners.

Assessment instruments must be both reliable and valid for study results to be credible. Thus, reliability and validity must be examined and reported, or references cited, for each assessment instrument used to measure study outcomes. Examples of assessments include resident feedback survey, course evaluation, written test, clinical simulation observer ratings, needs assessment survey, and teacher evaluation. Using an instrument with high reliability is not sufficient; other measures of validity are needed to establish the credibility of your study.

3. How is reliability measured? 2–4

Reliability can be estimated in several ways; the method will depend upon the type of assessment instrument. Sometimes reliability is referred to as internal validity or internal structure of the assessment tool.

For internal consistency, 2 to 3 questions or items that measure the same concept are created, and the correlation among the answers is calculated.

Cronbach alpha is a test of internal consistency and frequently used to calculate the correlation values among the answers on your assessment tool. 5 Cronbach alpha calculates correlation among all the variables, in every combination; a high reliability estimate should be as close to 1 as possible.
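For illustration, Cronbach alpha can be computed directly from its definition. This is a generic sketch assuming a subjects-by-items score matrix, not code from any particular statistics package:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """scores: 2-D array, rows = subjects, columns = items.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)),
    where k is the number of items.
    """
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)
```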

For test/retest reliability, the test should give the same results each time, assuming there are no interval changes in what you are measuring; consistency is usually measured as a correlation, with Pearson r.

Test/retest is a more conservative estimate of reliability than Cronbach alpha, but it takes at least 2 administrations of the tool, whereas Cronbach alpha can be calculated after a single administration. To perform a test/retest, you must be able to minimize or eliminate any change (ie, learning) in the condition you are measuring, between the 2 measurement times. Administer the assessment instrument at 2 separate times for each subject and calculate the correlation between the 2 different measurements.

Interrater reliability is used to study the effect of different raters or observers using the same tool and is generally estimated by percent agreement, kappa (for binary outcomes), or Kendall tau.
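As a brief illustration, percent agreement and Cohen kappa for two raters with binary ratings might be computed like this (made-up ratings, using scikit-learn's cohen_kappa_score):

```python
from sklearn.metrics import cohen_kappa_score

rater_a = [1, 0, 1, 1, 0, 1, 0, 0]
rater_b = [1, 0, 1, 0, 0, 1, 0, 1]

# Raw percent agreement, uncorrected for chance
percent_agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

# Cohen's kappa: agreement corrected for chance agreement
kappa = cohen_kappa_score(rater_a, rater_b)
```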

Another method uses analysis of variance (ANOVA) to generate a generalizability coefficient, to quantify how much measurement error can be attributed to each potential factor, such as different test items, subjects, raters, dates of administration, and so forth. This model looks at the overall reliability of the results. 6

5. How is the validity of an assessment instrument determined? 4–8

Validity of assessment instruments requires several sources of evidence to build the case that the instrument measures what it is supposed to measure. 9,10 Determining validity can be viewed as constructing an evidence-based argument regarding how well a tool measures what it is supposed to do. Evidence can be assembled to support, or not support, a specific use of the assessment tool. Evidence can be found in content, response process, relationships to other variables, and consequences.

Content includes a description of the steps used to develop the instrument. Provide information such as who created the instrument (national experts would confer greater validity than local experts, who in turn would have more validity than nonexperts) and other steps that support the instrument has the appropriate content.

Response process includes information about whether the actions or thoughts of the subjects actually match the test and also information regarding training for the raters/observers, instructions for the test-takers, instructions for scoring, and clarity of these materials.

Relationship to other variables includes correlation of the new assessment instrument results with other performance outcomes that would likely be the same. If there is a previously accepted “gold standard” of measurement, correlate the instrument results to the subject's performance on the “gold standard.” In many cases, no “gold standard” exists and comparison is made to other assessments that appear reasonable (eg, in-training examinations, objective structured clinical examinations, rotation “grades,” similar surveys).

Consequences means that if there are pass/fail or cut-off performance scores, those grouped in each category tend to perform the same in other settings. Also, if lower performers receive additional training and their scores improve, this would add to the validity of the instrument.

Different types of instruments need an emphasis on different sources of validity evidence. 7 For example, for observer ratings of resident performance, interrater agreement may be key, whereas for a survey measuring resident stress, relationship to other variables may be more important. For a multiple choice examination, content and consequences may be essential sources of validity evidence. For high-stakes assessments (eg, board examinations), substantial evidence to support the case for validity will be required. 9

There are also other types of validity evidence, which are not discussed here.

6. How can researchers enhance the validity of their assessment instruments?

First, do a literature search and use previously developed outcome measures. If the instrument must be modified for use with your subjects or setting, modify and describe how, in a transparent way. Include sufficient detail to allow readers to understand the potential limitations of this approach.

If no assessment instruments are available, use content experts to create your own and pilot the instrument prior to using it in your study. Test reliability and include as many sources of validity evidence as are possible in your paper. Discuss the limitations of this approach openly.

7. What are the expectations of JGME editors regarding assessment instruments used in graduate medical education research?

JGME editors expect that discussions of the validity of your assessment tools will be explicitly mentioned in your manuscript, in the methods section. If you are using a previously studied tool in the same setting, with the same subjects, and for the same purpose, citing the reference(s) is sufficient. Additional discussion about your adaptation is needed if you (1) have modified previously studied instruments; (2) are using the instrument for different settings, subjects, or purposes; or (3) are using different interpretation or cut-off points. Discuss whether the changes are likely to affect the reliability or validity of the instrument.

Researchers who create novel assessment instruments need to state the development process, reliability measures, pilot results, and any other information that may lend credibility to the use of homegrown instruments. Transparency enhances credibility.

In general, little information can be gleaned from single-site studies using untested assessment instruments; these studies are unlikely to be accepted for publication.

8. What are useful resources for reliability and validity of assessment instruments?

The references for this editorial are a good starting point.

Gail M. Sullivan, MD, MPH, is Editor-in-Chief, Journal of Graduate Medical Education .

Global academic research is skewed toward rich countries. How do World Bank policy research papers fit in?

Brian Stacy, Quy-Toan Do, Deon Filmer
Academic research is a major source of insights from data. These insights can be critical in helping policymakers manage public resources, formulate policies, and help the public understand the world around them. However, academic research is not evenly distributed across countries: as previous research has shown, wealthier countries are more likely to be studied by academics.

The World Bank is a major producer of academic research. Policy Research Working Papers (PRWPs) are a key output of the World Bank. These papers aim to provide insights to policymakers in World Bank client countries, which are mostly low- and middle-income countries. How do these World Bank publications fare in terms of filling gaps in empirical academic research for World Bank clients? To examine this, we build on the approach of Stacy, Kitzmüller, Wang, Mahler, & Serajuddin (2024) to classify empirical academic articles based on data use, and compare the number of empirical academic articles and World Bank policy research working papers by country.

Do PRWPs fill gaps in empirical research for World Bank clients? To some extent, yes. By focusing more on low- and middle-income countries, PRWPs help fill critical gaps left by other empirical studies. However, while they contribute to a more equitable distribution of research, the challenge of fully correcting the skew towards wealthier countries remains. A lack of high-quality, timely, and open data sources is a major issue in several low- and middle-income countries. The continued production and dissemination of PRWPs are vital steps toward a more inclusive and comprehensive understanding of global development issues.

Brian Stacy

Data Scientist, Development Data Group, World Bank

Quy-Toan Do

Co-Director, World Development Report 2023

Deon Filmer

Director, Development Research Group, World Bank

RADIUS networking protocol blasted into submission through MD5-based flaw

If someone can do a little MITM'ing and hash cracking, they can log in with no valid password needed.

Cybersecurity experts at universities and Big Tech have disclosed a vulnerability in a common client-server networking protocol that allows snoops to potentially bypass user authentication via man-in-the-middle (MITM) attacks.

If the vulnerability, rated 7.5 out of 10 on the CVSS severity scale and tracked as CVE-2024-3596 , is exploited – and it's not that easy to pull off – attackers could theoretically gain access to network devices and services without needing to obtain any credentials. On a practical level, it requires MITM'ing someone's network traffic and performing some rapid hash cracking.

Dubbed Blast RADIUS by researchers at Cloudflare, Microsoft, UC San Diego, CWI Amsterdam, and BastionZero, you can probably guess it affects the RADIUS networking protocol. Essentially, the flaw allows someone to log into a client device that relies on a remote RADIUS server to perform the authentication check – without the correct credentials.

If you're wondering how this affects you, the team notes that:

Our attack requires the adversary to have network access to act as a man-in-the-middle attacker on the connection between the victim device’s RADIUS client and RADIUS server. When there are proxies, the attack can occur between any hop. Our attacker will need to be able to act as a full network man-in-the-middle who can read, intercept, block, and modify inbound and outbound network packets.

They go on to say it's not all plain sailing, though: "Such access to RADIUS traffic may happen through different mechanisms. Although sending RADIUS/UDP over the open internet is discouraged, this is still known to happen in practice. For internal network traffic, the attacker might initially compromise part of an enterprise network.

"Even if RADIUS traffic is confined to a protected part of an internal network, configuration or routing mistakes might unintentionally expose this traffic. An attacker with partial network access may be able to exploit DHCP or other mechanisms to cause victim devices to send traffic outside of a dedicated VPN."

The Remote Authentication Dial-In User Service ( RADIUS ) protocol was drummed up in the 1990s and is still used in networks today. The Blast RADIUS flaw is understood to affect RADIUS deployments that use PAP, CHAP, MS-CHAPv2, and other non-EAP authentication methods. IPSec, TLS, 802.1x, Eduroam, and OpenRoaming are all considered safe.

"The RADIUS protocol is a foundational element of most network access systems worldwide. As of July 9, nearly all of these systems are no longer secure," Alan DeKok, CEO of InkBridge Networks, claimed. 

"The discovery of the Blast RADIUS issue means that network technicians must install firmware upgrades on every device involved in network security, identity, and authentication. We believe that internet service providers, enterprises, and most cloud identity providers are likely to be affected by this issue."

Blast RADIUS hinges on the way RADIUS clients and servers handle authentication requests, and involves performing collision attacks against the MD5 hashing function. MD5 has been demonstrably broken since the 2000s, though the Blast RADIUS team say their abuse of the algorithm to exploit the RADIUS protocol vulnerability "is more complex than simply applying an old MD5 collision attack." They say their approach is better in terms of speed and scale.

As we indicated, a successful Blast RADIUS attack involves someone manipulating a victim's client-server RADIUS traffic to authenticate themselves to one of the target's clients, such as a router, to cause further mischief and mayhem, all without the need for a valid password. Given the hurdles involved, this sort of attack is largely of use to someone who already has a presence in a network and wants to drill in deeper.

How Blast RADIUS works

This will be a simplified explanation; for those who want the full story, a technical paper [PDF] is available from the vulnerability's branded website .

Blast RADIUS exploitation begins with an attacker trying to authenticate themselves to a client using whatever username and password combo they want – it doesn't matter, it doesn't need to work.

The client then contacts its RADIUS server over the network to perform the actual authentication using an Access-Request message. If the server determines the presented credentials are correct, it sends back an Access-Accept packet to the client, signaling the user should be allowed to log in. Of course, in this instance, the server won't do so because the creds are wrong – it will instead return an Access-Reject packet.

To somewhat protect the communications between the client and server from impersonation, they have a shared secret. When the client sends its Access-Request to the server, the client includes a 16-byte random value called the Request Authenticator, and when the server responds, the server includes a Response Authenticator value that is computed using the client's random Request Authenticator, the shared secret, and other data in the reply.

Thus when the client receives the server's response, the client can use its Request Authenticator value and the shared secret and data in the reply to check that the server computed and sent the correct Response Authenticator with its response. If someone tries to impersonate the server and doesn't know the secret, they can't send the right response, and the client can ignore it. This should ideally undermine MITM attacks.
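For reference, RFC 2865 defines this Response Authenticator as an MD5 digest over the response header, the request's authenticator, the response attributes, and the shared secret. A simplified sketch, with packet parsing and attribute encoding omitted:

```python
import hashlib

def response_authenticator(code: int, identifier: int, length: int,
                           request_authenticator: bytes,
                           attributes: bytes, secret: bytes) -> bytes:
    # RFC 2865: MD5(Code + Identifier + Length + Request Authenticator
    #               + response Attributes + shared Secret)
    header = bytes([code, identifier]) + length.to_bytes(2, "big")
    return hashlib.md5(header + request_authenticator
                       + attributes + secret).digest()

# A client recomputes this digest and compares it against the 16-byte
# authenticator field in the server's reply before trusting the packet.
```

Because this construction is a plain (unkeyed-by-HMAC) MD5 hash over attacker-influenceable data, it is exactly the surface the collision attack described next exploits.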

Diagram from the technical paper: an illustrated guide to exploitation

Let's rewind to the client trying to authenticate someone who doesn't know the correct credentials. Here's where the Blast RADIUS MITM happens.

The snoop can intercept the client's Access-Request, with its random Request Authenticator value, and manipulate its data so that when the altered message is sent on to the server, the server replies with an Access-Reject message that the attacker can again intercept and tamper with, converting the server's response into a valid forged Access-Accept message for the client.

This is done using an MD5 chosen-prefix hash collision attack based on earlier work by Marc Stevens et al, exploiting the fact that carefully crafted garbage data, added by the attacker to a proxy configuration attribute in the Access-Request message to the server, is included in the server's Access-Reject reply. With a little cryptographic dance, it's possible to create a forged Access-Accept response that is valid for the client's Request Authenticator value without knowing the shared secret.

This double interception and manipulation is needed because the attacker doesn't know the secret but can control the contents of the message payloads and thus, through the collision attack, the hashes so that what the attacker sends the client matches the client's expectations.

As far as the client is concerned, it receives a valid Access-Accept response from its server, and grants access to the attacker.


According to Cloudflare's write-up, the attack typically has to be carried out in under five minutes to work on most RADIUS kit in the field, accounting for standard client timeout tolerances (many devices accept timeouts of just 30 to 60 seconds), and theoretically, well-resourced attackers could make use of cloud computing platforms to speed up exploitation.

Mitigation strategies

We're told by the team behind the research that the makers of RADIUS authentication stacks have developed updates to thwart exploitation of this protocol-level weakness, which was apparently uncovered in February, though the security pitfalls of Access-Request exchanges have been known about for a while.

Judging by the boffins' note below, you should look out for and install updates for your deployments:

Our recommended short-term mitigation for implementers and vendors is to mandate that clients and servers always send and require Message-Authenticator attributes for all requests and responses. For Access-Accept or Access-Reject responses, the Message-Authenticator should be included as the first attribute. Patches implementing this mitigation have been implemented by all RADIUS implementations that we are aware of. This guidance is being put into an upcoming RADIUS RFC.
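
To illustrate why that helps: Message-Authenticator is an HMAC-MD5 over the packet, keyed with the shared secret (RFC 2869), and HMAC is not vulnerable to these collision tricks even though it uses MD5 internally. Here's a minimal sketch of the check, assuming the caller supplies the packet bytes with the attribute's own value field zeroed out, as the spec requires:

```python
import hashlib
import hmac

def message_authenticator(packet_with_zeroed_attr: bytes, secret: bytes) -> bytes:
    # RFC 2869: Message-Authenticator = HMAC-MD5 over the whole packet, keyed
    # with the shared secret, computed with the attribute's value set to zeros.
    return hmac.new(secret, packet_with_zeroed_attr, hashlib.md5).digest()

def verify(packet_with_zeroed_attr: bytes, received_mac: bytes, secret: bytes) -> bool:
    expected = message_authenticator(packet_with_zeroed_attr, secret)
    return hmac.compare_digest(expected, received_mac)  # constant-time comparison
```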

The best mitigation for client-server RADIUS deployments, we're told, is to implement RADIUS over TLS (RadSec) to protect RADIUS packets in a strongly encrypted stream from miscreants. See the vuln's website for more details and mitigations. ®



Pokémon Go ‘The Dawn of a New Discovery’ Radar Tuning choose a path quest steps: Emolga, Crabrawler, or Ducklett?

Get Cosmog for doing this short research


“The Dawn of a New Discovery” grants all Pokémon Go players a Cosmog during Day 2 of Go Fest 2024 Global.

Free for all players, with or without a Go Fest ticket, it grants a Cosmog and Cosmog Candy to help with unlocking a Dusk Mane or Dawn Wings Necrozma.

For those who do have a Go Fest ticket, completing this quest unlocks the exclusive quest “The Dusk Settles”, which rewards an encounter with one of Cosmog’s evolutions and enough Solar or Lunar Fusion Energy to fuse your chosen legendary with Necrozma.

One note before we begin — keep in mind that you’ll need to complete a raid in the final step, so you may want to hold on to your daily pass until you get there, if you’re planning on finishing the quest during the event.

‘The Dawn of a New Discovery’ Special Research quest steps

All players who open the game on Sunday, July 14 between 10 a.m. and 6 p.m. will get this quest. This quest is also mandatory for ticket holders to complete in order to unlock the next quest, “The Dusk Settles.”

‘The Dawn of a New Discovery’ step 1 of 3

  • Spin 3 PokéStops or gyms (20 Poké Balls)
  • Catch 10 Pokémon (3 Nanab Berries)
  • Complete 2 Field Research tasks (100 XP)

Rewards : 3 Potions, Pikachu (sun crown) encounter, 100 Stardust

After this point, you’ll need to pick one of Emolga, Crabrawler, or Ducklett. The steps and rewards are mostly the same across the paths, except for one step, which rewards your chosen Pokémon for catching seven of that Pokémon. Your radar will prioritize the chosen Pokémon, too. You can read more on this choice in a later section of this guide.

‘The Dawn of a New Discovery’ step 2 of 3

  • Catch 7 Emolga, Crabrawler or Ducklett (Emolga, Crabrawler, or Ducklett encounter)
  • Power up Pokémon 8 times (20 Poké Balls)
  • Evolve 9 Pokémon (3 Pinap Berries)

Rewards : 1 Silver Pinap Berry, Cosmog encounter, 1 Razz Berry

‘The Dawn of a New Discovery’ step 3 of 3

  • Explore 1 km (3 Revives)
  • Battle in a raid (3 Super Potions)
  • Earn 2,000 XP (3 Max Revives)

Rewards : 5 Necrozma stickers, Pikachu (moon crown) encounter, 10 Cosmog Candy

With the above done, those with a Go Fest Global ticket can then begin “The Dusk Settles” .

‘Radar Tuning’ choose a path outcome: Should you choose Emolga, Crabrawler, or Ducklett?

After step one, you get the choice of tuning your radar toward Emolga, Crabrawler, or Ducklett.

This choice results in two things:

  • The next quest objective will make you catch seven of your chosen Pokémon (Emolga, Crabrawler, or Ducklett), and the step's reward will be an encounter with that Pokémon
  • Your radar — also known as the list of Pokémon in the ‘Nearby’ window, accessed by tapping the box in the lower right corner of the map screen — will promote your chosen Pokémon to help find them faster

Though it will not result in additional spawns, your choice will temporarily help you find more of that Pokémon through the Nearby feature, as well as grant you an additional single encounter as part of your rewards.

In terms of what you should choose, it should be based on what you want more of, particularly if you want a shiny of that Pokémon. To clarify, the shiny odds won’t increase as part of this research — but by simply encountering more of that Pokémon, you are more likely to encounter a shiny as a result.

If you’re wondering why Emolga, Crabrawler, and Ducklett are the choices here, these are all shinies recently added as part of the in-person Go Fest 2024 events earlier this year, so they get more of the spotlight in the Global leg for those wanting to flesh out their shiny Pokédex.

Looking for more to do during Go Fest 2024? Remember to get plenty of Solar and Lunar Fusion Energy for Dusk Mane and Dawn Wings Necrozma — which you can do effectively with strong Necrozma counters — and complete Marshadow research “A Shadowy Caper” .


  • Open access
  • Published: 15 July 2024

Examining the psychometric properties of the Turkish version of the proactive coping scale in nursing students: A methodological study

  • Esra Özbudak Arıca, ORCID: orcid.org/0000-0003-2622-7863

BMC Nursing, volume 23, Article number: 481 (2024)

Abstract

The aim of this study was to adapt the PROACTIVE Coping Scale to Turkish culture, determine its validity and reliability in a sample of undergraduate nursing students, and evaluate the proactive coping levels of nursing students.

Proactive coping skills are very important for nursing students to cope effectively with various stressors that they may encounter both in their academic lives and in their future professional lives. There are no valid and reliable instruments for measuring the proactive coping levels of nursing students in Turkey.

The present study is a descriptive and methodological study. Research data were collected between 01.12.2023 and 01.01.2024 via face-to-face interviews. The study was completed by 272 nursing students who voluntarily agreed to participate. In the analysis of the data, descriptive statistics (numbers/percentages), exploratory and confirmatory factor analysis, and the Cronbach's alpha reliability coefficient were used.

The scale structure was confirmed with 19 items and 4 factors. The Cronbach’s alpha reliability coefficient of the PROACTIVE Coping Scale was found to be 0.816. The scale explains 67.17% of the total variance, and item correlation values vary between 0.263 and 0.650.

Conclusions

The study showed that the PROACTIVE Coping Scale is a valid and reliable instrument for evaluating the proactive coping levels of nursing students.


Introduction

The fact that nursing education programs are based on practice and that the theoretical knowledge learned in the classroom is put into practice in real clinical environments may cause nursing students to experience more stress than other university students [ 1 , 2 ]. While the clinical environment offers rich opportunities for gaining practical experience, it is the greatest source of stress for nursing students [ 2 ]. The main stressors that nursing students may encounter in the clinical environment include unfamiliarity with the clinical setting, lack of confidence, fear of making mistakes, negative reactions to patients’ deaths or suffering, student-instructor relationships, and nurses’ attitudes towards nursing students. In addition to these, nursing students must cope with academic stressors related to the challenges of nursing education, exams, and evaluations [ 1 , 2 ] as well as personal/social stressors related to their private lives [ 2 , 3 , 4 , 5 ]. Furthermore, after completing their education, nursing students face future stressors such as the transition to professional life, concerns about where and when to start their careers, and taking on financial responsibilities [ 6 , 7 ].

Stress can be beneficial to individuals at minimal levels because it increases excitement and motivation. However, inadequacy in coping with stress or chronic stress can negatively affect an individual’s mental and physical health [ 8 , 9 ]. Chronic stress can affect nursing students’ learning, decision-making, and thinking, ultimately impacting their academic success and even causing them to drop out of nursing education [ 8 , 10 ]. Inability to manage stress and inadequacy in coping with stress can lead nursing students to experience negative conditions such as anxiety, anger, worry, tension, depression, guilt, and insomnia [ 9 , 10 ]. Moreover, if adequate measures are not taken against the factors causing stress in future nurses and nursing students experience high levels of prolonged stress, it can negatively affect the workforce providing care services and the quality of patient care in the future [ 9 ]. Determining the most effective stress management methods during nursing education can increase success during education, contribute to the maintenance of knowledge and help students adapt to the nursing profession with a successful transition to professional life [ 7 ]. One of the coping strategies that nursing students can use to manage future stressors is proactive coping.

Aspinwall and Taylor (1997) defined proactive coping as “the efforts undertaken in advance to change the form of or prevent a potentially stressful event before it occurs”. Individuals who adopt proactive coping can recognize potential difficulties around them and deal with those difficulties before burnout occurs [ 11 ]. Since proactive coping behaviour is used before a potential stressor occurs, the degree of stress experienced in a stressful situation can be minimized. Individuals also have many more coping options and resources before stressors occur; these options become limited afterwards. In addition, proactive coping includes learning from past mistakes, making plans for future goals, predicting the likelihood of stress and taking precautions [ 12 ].

Research on this topic shows that the more individuals use proactive coping strategies when they face stressful events, the lower their stress levels will be [ 13 , 14 , 15 ]. Proactive individuals are less likely to let stress build to an unbearable point, because they take care to plan for the future and create resources to buffer against stress along the way. All these factors enable proactive individuals to reach challenging goals more easily and to support their personal development [ 11 ]. The literature indicates that nurses with high proactive coping skills have a high quality of life [ 16 ], and they report less emotional burnout, less depersonalization and more personal success [ 11 ]. In a study conducted in China with nursing students, students with a highly proactive personality had a lower risk of academic burnout [ 17 ], while in another study, a proactive personality in nursing students was positively correlated with career adaptability and self-perceived employability [ 18 ].

Proactive coping is very important for nursing students to cope effectively with the various stressors that they may encounter in their future education and clinical and professional lives. However, when the literature was reviewed, no valid and reliable instrument was found for measuring proactive coping levels in a sample of nurses or nursing students. Tian et al. developed the PROACTIVE Coping Scale in 2023 to measure the proactive coping levels of university students in the USA; in the analyses performed to evaluate its validity and reliability, the scale showed strong psychometric properties [ 12 ]. This formed the starting point of the present study, which set out to adapt the PROACTIVE Coping Scale to Turkish culture, determine its validity and reliability in a sample of undergraduate nursing students, and determine the proactive coping levels of nursing students.

Methods

Design and sample

This research was designed as a methodological and descriptive study. The study population consisted of a total of 346 students studying in the second, third and fourth years of the nursing department of a university located in the central region of Turkey. There are opinions that the sample size in scale adaptation studies should be at least five [ 19 ], 10 [ 20 ] or 15 [ 21 ] times the total number of items; since the PROACTIVE Coping Scale has 19 items, a sample of between 95 and 285 would be sufficient. However, in this study no sample selection was made, and we planned to include the whole population. The study was completed with the participation of 272 volunteer students. The inclusion criteria were having participated in clinical practice at least once and volunteering to take part. Since nursing education is based on both theoretical knowledge and clinical practice, and students are exposed to numerous stressors during clinical practice training, only students who had undertaken clinical practice were included. Because clinical practice starts in the second year in this faculty, first-year nursing students were excluded.

Instruments

The Personal Information Form and the PROACTIVE Coping Scale were used in this study as data collection instruments.

Personal information form

This form was prepared by the researcher and included a total of 10 questions about age, gender, marital status, family type, year of study, whether the student chose the nursing department willingly, whether they love the nursing profession, whether they hesitate to take on responsibility, how well they feel they cope with difficulties, and whether they worry about future stressors.

PROACTIVE coping scale

The PROACTIVE Coping Scale was developed by Tian et al. in 2023 for university students in the USA [ 12 ]. The original version of the Proactive Coping Scale consists of four sub-dimensions and 19 items: Active Preparation (items 1, 4, 5, 7, 17, 18), which describes activities done to identify, prevent, and prepare for future stressors; Ineffective Preparation (items 3, 6, 11, 15, 19), which identifies perceived difficulties in coping with future stressors; Self-Management (items 8, 9, 12, 14), which describes regulatory activities that facilitate and sustain positive conditions while preparing for future stressors; and Utilization of Social Resources (items 2, 10, 13, 16), which questions the use of social resources to cope with future stressors. The lowest score that can be obtained from the scale is 19 and the highest score is 95. As the score obtained from the scale increases, the proactive coping levels of the participants increase. The scale is a 5-point Likert-type scale, and the participants are asked to respond to all items on a scale ranging from 1 = strongly disagree to 5 = strongly agree. In Tian et al.’s (2023) study, the Cronbach’s alpha reliability coefficients of the original scale were 0.62 for the active preparation factor, 0.81 for the ineffective preparation factor, 0.69 for the self-management factor, and 0.69 for the utilization of social resources factor. In the present study, the Cronbach’s alpha reliability coefficients were 0.816 for the total scale, 0.761 for the active preparation factor, 0.729 for the ineffective preparation factor, 0.786 for the self-management factor, and 0.658 for the utilization of social resources factor. Permission was obtained from the scale owner [ 12 ] via email to conduct validity and reliability studies on the PROACTIVE Coping Scale in the Turkish population and nursing students.
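
To make the scoring concrete, here is a minimal Python sketch based on the item groupings and 1-5 response format described above; the function and variable names are illustrative assumptions, not part of the original instrument.

```python
# Item groupings as reported for the 19-item PROACTIVE Coping Scale.
SUBSCALES = {
    "active_preparation": [1, 4, 5, 7, 17, 18],
    "ineffective_preparation": [3, 6, 11, 15, 19],
    "self_management": [8, 9, 12, 14],
    "utilization_of_social_resources": [2, 10, 13, 16],
}

def score_scale(responses: dict[int, int]) -> dict[str, int]:
    """responses maps item number (1-19) to a 1-5 Likert rating."""
    assert len(responses) == 19
    assert all(1 <= v <= 5 for v in responses.values())
    scores = {name: sum(responses[i] for i in items)
              for name, items in SUBSCALES.items()}
    scores["total"] = sum(responses.values())  # ranges from 19 to 95
    return scores

# Example: a respondent answering "3" to every item scores 57 overall.
print(score_scale({i: 3 for i in range(1, 20)}))
```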

Linguistic validity

The validity and reliability studies of the PROACTIVE Coping Scale began with linguistic validity. Before the language validity work, permission to use the scale was obtained from Tian et al. (2023). The scale was first translated into Turkish by four translators who were fluent in English and native speakers of Turkish. Later, the scale was back-translated from Turkish into English by three translators fluent in both languages. The back-translated version was compared with the original, and the required revisions were made; the resulting scale had the same meaning as the original. The scale was then submitted for expert opinion on content validity.

Content validity

The PROACTIVE Coping Scale was presented to ten experts (7 faculty members in the field of nursing, 1 specialist nurse, and 2 faculty members in the field of psychology) for the evaluation of content validity. The experts evaluated the scale items in terms of comprehensibility, purpose, cultural appropriateness and discrimination. The Lawshe (1975) technique was used to evaluate expert opinions [ 22 ]: each expert rates every item from 1 to 3 (1 = item necessary, 2 = item useful but not sufficient, 3 = item unnecessary). For ten experts, the critical content validity ratio (CVR) was taken as 0.62 [ 23 ]. No item had a CVR below 0.62. The content validity index (CVI), calculated as the mean of the item CVRs, was found to be 0.95. Since CVI ≥ critical CVR (0.95 ≥ 0.62), the content validity of the 19-item structure was supported, and no item was removed.
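
For readers unfamiliar with the Lawshe technique: CVR = (ne − N/2) / (N/2), where ne is the number of experts rating an item as necessary and N is the total number of experts. A small sketch follows; the per-item rating counts are hypothetical.

```python
def cvr(n_essential: int, n_experts: int) -> float:
    # Lawshe (1975): CVR = (ne - N/2) / (N/2)
    half = n_experts / 2
    return (n_essential - half) / half

def cvi(item_cvrs: list[float]) -> float:
    # Content validity index: the mean of the item CVRs.
    return sum(item_cvrs) / len(item_cvrs)

# With 10 experts, an item needs at least 9 "necessary" ratings to clear the
# study's critical value of 0.62: cvr(9, 10) = 0.80, while cvr(8, 10) = 0.60.
hypothetical_counts = [10, 9, 10, 10, 9]  # "necessary" votes for five items
item_cvrs = [cvr(n, 10) for n in hypothetical_counts]
print(item_cvrs, cvi(item_cvrs))
```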

After content validity, a pilot study was conducted with a group of 30 nursing students who were not included in the sample to evaluate the comprehensibility of the scale items. Most of the students included in the pilot study stated that the scale items were understandable. A very small number of students made suggestions regarding spelling and understandability of some items. The required revisions were made, and the scale was finalized in line with the results of the pilot study.

Data collection and analysis

Research data were collected between 01.12.2023 and 01.01.2024 via face-to-face interviews. Before starting the research, the nursing students were informed about the research, and written and verbal consent was obtained. A total of 272 students agreed to participate voluntarily. The test-retest method was used to test the invariance of the scale over time ( n  = 50). The test-retest method is a second application of the measurement tool to the sample group under the same conditions, at a point when individuals do not remember their responses to the scale items and there has been no significant change in the characteristics being measured [ 24 ]; according to the literature, this interval should be between 15 and 30 days [ 25 ]. Therefore, the PROACTIVE Coping Scale was reapplied three weeks after the first application to 50 nursing students identified within the scope of the study, who had been assigned pseudonyms in the first application so that the two sets of responses could be matched.

In this study, the data were analysed with the SPSS 26 and AMOS 22 package programs. Reliability analyses were performed using the Cronbach's alpha coefficient, item-total score correlations and the test-retest technique. To test the validity of the scale, language validity, the content validity index and construct validity (exploratory factor analysis) were used. In addition, the Kaiser‒Meyer‒Olkin (KMO) test was used to assess sampling adequacy, and Bartlett's test of sphericity was used to determine the suitability of the data for factor analysis. p <  0.05 was considered to indicate statistical significance.
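
The study itself used SPSS and AMOS; as a rough open-source equivalent, the same preliminary checks can be sketched in Python with the factor_analyzer package. The data frame below is random placeholder data, so the statistics it yields are meaningless and purely illustrative.

```python
import numpy as np
import pandas as pd
from factor_analyzer.factor_analyzer import (
    calculate_bartlett_sphericity,
    calculate_kmo,
)

def cronbach_alpha(items: pd.DataFrame) -> float:
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(ddof=1).sum()
                          / items.sum(axis=1).var(ddof=1))

# Placeholder: 272 students x 19 items, Likert 1-5 (random, illustration only).
df = pd.DataFrame(np.random.randint(1, 6, size=(272, 19)),
                  columns=[f"item{i}" for i in range(1, 20)])

chi_square, p_value = calculate_bartlett_sphericity(df)  # suitability for EFA
_, kmo_total = calculate_kmo(df)                         # sampling adequacy
print(f"KMO={kmo_total:.3f}, Bartlett chi2={chi_square:.1f} (p={p_value:.4f}), "
      f"alpha={cronbach_alpha(df):.3f}")
```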

Ethical issues and permissions

Before starting the research, permission to use the scale was obtained from Tian, Tsai, Khalsa, Condie, Kopystynsky, Ohde and Zhao (2023) via e-mail. Later, ethics committee approval was received from the Yozgat Bozok University Ethics Commission (Decision No: 08/11, Date: 23.11.2023), and institutional permission was obtained from the institution where the research was conducted (No: E-88148187-020-186521). In addition, before starting the application, the students were informed that participation was completely voluntary and that the content, purpose, scope and data of the research would be kept confidential. Written consent was obtained from students who voluntarily agreed to participate. The study was conducted in accordance with the principles of the Declaration of Helsinki (WMA General Assembly, Fortaleza, Brazil, October 2013) and the Law on Medical Research Involving Human Subjects.

Results

Characteristics of the students

Among the nursing students who participated in the study ( n  = 272), 81.6% were female, 98.2% were single, 80.9% grew up in a nuclear family, 58.1% were in their third year, 58.1% chose the nursing department willingly, 67.6% loved the nursing profession, 82% were not willing to take responsibility, 91.2% believed that they could overcome difficulties and 87.5% were concerned about future stressors.

Results of validity

Exploratory factor analysis

Before determining the factor structure of the PROACTIVE Coping Scale, the KMO test was used to determine whether the data were suitable for factor analysis, and the Bartlett sphericity test was used to evaluate whether the correlations between the analysed variables were significant (Table  1 ).

The KMO value of the PROACTIVE Coping Scale was 0.836 (Table  1 ). The KMO value lies between 0 and 1 and indicates a more reliable factor structure as it approaches 1: a value above 0.50 is acceptable, 0.50-0.70 is normal, 0.70-0.80 is good, 0.80-0.90 is very good, and above 0.90 is interpreted as excellent [ 26 ]. The sample size, with a KMO value of 0.836, was therefore sufficient, and the Bartlett sphericity test results showed that the data were suitable for factor analysis (χ² = 1535.000, p  < 0.001) (Table  1 ).

Table  1 shows in detail the factor structure of the PROACTIVE Coping Scale, which items belong to which factors, and the factor loading of each item. According to these results, the factor loadings vary between 0.497 and 0.735 for the active preparation factor, between 0.653 and 0.745 for the ineffective preparation factor, between 0.648 and 0.809 for the self-management factor and between 0.522 and 0.716 for the utilization of social resources factor. The factor loadings of all the items are close to or above 0.500 [ 26 ], so each of the four factors of the PROACTIVE Coping Scale measures its intended sub-dimension (Table  1 ).

Item-total correlations above 0.20 indicate that an item contributes meaningfully to the scale. In this study, the item-total correlation values were between 0.263 and 0.650, supporting the validity and reliability of the instrument (Table  1 ) [ 27 ].

The explained variance ratio shows the strength of the factor structure of the scale. While the 6-item active preparation factor explained 15.39% of the total structure, the 5-item ineffective preparation factor explained 13.28% of the total structure, the 4-item self-management factor explained 14.31% of the total structure, and the 4-item utilization of social resources factor explained 16.61% of the total structure. These 4 factors and 19 items explained 67.17% of the total variance (Table  1 ).
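
Continuing the hypothetical factor_analyzer sketch above, a four-factor extraction and the variance it explains could be inspected as follows. The rotation method is an assumption, since the paper does not report one.

```python
from factor_analyzer import FactorAnalyzer

fa = FactorAnalyzer(n_factors=4, rotation="varimax")  # rotation assumed
fa.fit(df)                      # df: the 272 x 19 item data frame above
loadings = fa.loadings_         # 19 x 4 matrix of item-factor loadings
var, prop, cumulative = fa.get_factor_variance()
print(loadings.round(3))
print(f"cumulative variance explained: {cumulative[-1]:.2%}")
```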

Confirmatory factor analysis

CFA examines validity by testing a structure determined through EFA, or by confirming a previously established scale structure on a new data set. To confirm the 19-item, 4-factor structure established by the exploratory factor analysis, the measurement model was analysed with CFA. The χ²/df, RMSEA, IFI, TLI, CFI and GFI were used to evaluate the factor validity of the model (Table  2 ).

The RMSEA is the index least affected by sample size, although RMSEA cut-off criteria are inconsistent across studies; a cut-off value close to 0.06 or 0.08 is usually considered acceptable in this field. IFI, TLI, CFI and GFI fit indices exceeding 0.90 are accepted as proof of sufficient model fit [ 25 ]. In this study, the RMSEA was below 0.06; the IFI, TLI, and CFI were ≥ 0.90; and the GFI was ≥ 0.85, which is acceptable. The four-factor model obtained for the PROACTIVE Coping Scale (χ²/df = 1.947, df = 147) therefore shows an acceptable level of fit (Table  2 ).
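
For readers who want to reproduce this style of CFA outside AMOS, here is a hedged sketch using the semopy package, assuming the item data frame df from the earlier sketch; the model syntax mirrors lavaan-style descriptions and the factor names are my own labels.

```python
import semopy

desc = """
ActivePrep =~ item1 + item4 + item5 + item7 + item17 + item18
IneffPrep  =~ item3 + item6 + item11 + item15 + item19
SelfMgmt   =~ item8 + item9 + item12 + item14
SocialRes  =~ item2 + item10 + item13 + item16
"""

model = semopy.Model(desc)
model.fit(df)                     # df: the 272 x 19 item data frame from above
stats = semopy.calc_stats(model)  # reports chi2, DoF, RMSEA, CFI, TLI, GFI, ...
print(stats.T)
```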

A path diagram of the factor loading values of the confirmed measurement model is shown in Fig.  1 . The diagram shows which items make up the 19-item, 4-factor measurement model and the standardized regression coefficients of the paths on the one-way arrows (Fig.  1 ).

[Figure 1: PROACTIVE Coping Scale confirmatory factor analysis model]

Each of the path coefficients for the 19 questions was statistically significant ( p <  0.05). According to these results, the active preparation factor consisted of items 1, 4, 5, 7, 17 and 18; the ineffective preparation factor of items 3, 6, 11, 15 and 19; the self-management factor of items 8, 9, 12 and 14; and the utilization of social resources factor of items 2, 10, 13 and 16. The path coefficients of all four factors were statistically significant ( p <  0.05); the factor with the greatest effect was active preparation, and the factor with the least effect was ineffective preparation.

In addition, item 1 is the strongest indicator of the active preparation factor, with a value of 0.79; item 3 is the strongest indicator of the ineffective preparation factor, with a value of 0.65; item 12 is the strongest indicator of the self-management factor, with a value of 0.78; and items 2 and 10 are the strongest indicators of the utilization of social resources factor, with a value of 0.65.

Results of reliability

Internal consistency (Cronbach's alpha) coefficients

The internal consistency of the PROACTIVE Coping Scale and factors were evaluated with the Cronbach’s alpha reliability coefficient. According to these results, the 19-item PROACTIVE Coping Scale is highly reliable (α = 0.816), while the factors active preparation (α = 0.761), ineffective preparation (α = 0.729) and self-management (α = 0.786) are reliable at a generally accepted level, and the factor utilization of social resources is moderately reliable (α = 0.658). The results showed that the PROACTIVE Coping Scale is a valid and reliable instrument that can be used to measure the proactive coping behaviours of nursing students [ 24 ].

Test-retest reliability

To determine the invariance of the scale over time, the scale was reapplied to the participants three weeks after the first application (Table  3 ). The correlation between the scores obtained from the first and second administration of the scale was examined with the intraclass correlation coefficient (ICC), and the results are shown in Table  3 .

Table  3 shows that the agreement between the responses to the questions in the retest with the nursing students was very good ( p  < 0.001) [ 24 ].
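
As a quick sketch of the same test-retest check with the pingouin package, on synthetic data standing in for the 50 students' two administrations (the score distribution is invented for illustration):

```python
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(0)
first = rng.normal(66, 8, size=50)        # synthetic first-administration scores
retest_noise = rng.normal(0, 3, size=50)  # stable trait plus a little noise

long = pd.DataFrame({
    "student": np.tile(np.arange(50), 2),
    "time": np.repeat(["test", "retest"], 50),
    "score": np.concatenate([first, first + retest_noise]),
})

icc = pg.intraclass_corr(data=long, targets="student",
                         raters="time", ratings="score")
print(icc[["Type", "ICC", "CI95%", "pval"]])
```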

The mean score of the PROACTIVE Coping Scale was 66.02 ± 8.19; the mean score of the active preparation factor was 21.93 ± 3.41, the ineffective preparation factor 15.22 ± 3.38, the self-management factor 14.16 ± 2.83, and the utilization of social resources factor 14.71 ± 2.39.

The sub-dimension scores and total score of the PROACTIVE Coping Scale are obtained by summing the item scores, and there are no reverse-scored items. The minimum possible total score is 19 and the maximum is 95. The Active Preparation sub-dimension score ranges from 6 to 30, the Ineffective Preparation sub-dimension from 5 to 25, and the Self-Management and Utilization of Social Resources sub-dimensions from 4 to 20 each.

Discussion

While proactive coping contributes to the personality development of nursing students, it also prepares them for the nursing profession. For this reason, it is important to popularize proactive coping in the nursing literature and to introduce relevant measurement tools. The aim of this study was to examine the validity and reliability of the “PROACTIVE Coping Scale” in nursing students; to this end, language validity, content validity, construct validity, reliability and internal consistency analyses were conducted.

In the scale adaptation process, the psycholinguistic properties of the scale are examined first. For linguistic validity, the scale items must first be translated; the translation-backtranslation method is generally used. To minimize the difference between the culture in which the scale was developed and the culture to which it is being adapted, the literature recommends translations by at least two translators who know both the culture and the language well [ 25 , 28 ]. The translation-backtranslation method was used in this study for linguistic validity. After the required revisions were made, content validity studies were started.

Content validity is the ability of the scale items to adequately measure the behaviours of interest [ 29 ]. Expert opinions are obtained for content validity. In line with the experts' opinions, the content validity ratios of the individual items varied between 0.80 and 1.00, and the content validity index of the scale was calculated as 0.95. No item was removed from the scale, since the difference between expert opinions was not statistically significant ( p  > 0.05).

Following linguistic and content validity studies, a pilot study is required to evaluate the comprehensibility of scale items. Pilot studies should include individuals who have the same characteristics as the sample [ 25 ]. It is recommended that the pilot study be carried out with 10% of the main sample or 30 people [ 24 ]. Accordingly, final edits were made to the items by taking the opinions of 30 nursing students who were not included in the study.

In the exploratory factor analysis, the KMO test was conducted to evaluate the appropriateness of the sample size for factor analysis, and the KMO value was found to be 0.836, which shows that the sample size was "good enough" for factor analysis [ 30 ]. According to the Bartlett test results, the chi-square value was acceptable and appropriate for factor analysis (χ² = 1535.000, p  < 0.001). As a result of the factor analysis, the factor loadings of the 4-factor PROACTIVE Coping Scale were found to be between 0.497 and 0.809. According to this result, all items load sufficiently on their factors.

CFA is used to confirm the factor structure of a scale with a predetermined factor structure [ 31 ]. In the confirmatory factor analysis, χ²/df, GFI, IFI, TLI, CFI and RMSEA were used to evaluate the model fit of the scale [ 29 ]. When the fit indices of the model were examined (χ²/df: 1.947; GFI: 0.900; IFI: 0.903; TLI (NNFI): 0.905; CFI: 0.901; RMSEA: 0.059), the model was found to have an acceptable fit. Therefore, the 19-item, four-factor version of the PROACTIVE Coping Scale was introduced for the Turkish population.

In addition to being valid, an instrument should also be reliable. Reliability is defined as the degree to which an instrument consistently measures the phenomenon it is intended to measure [ 29 ]. The Cronbach's alpha reliability coefficient should be calculated to determine whether a Likert-type scale is reliable [ 32 ]; a coefficient close to 1 means that the scale is reliable [ 29 ]. In the present study, the Cronbach's alpha reliability coefficients were 0.816 for the total PROACTIVE Coping Scale, 0.761 for the active preparation factor, 0.729 for the ineffective preparation factor, 0.786 for the self-management factor and 0.658 for the utilization of social resources factor. According to these results, the PROACTIVE Coping Scale is a valid and reliable instrument that can be used to measure the proactive coping levels of nursing students.

Intraclass correlation coefficients between the test and retest scores, calculated to test the invariance of the PROACTIVE Coping Scale and its active preparation, ineffective preparation, self-management and utilization of social resources factors over time, were 0.817, 0.805, 0.737, 0.724 and 0.796, respectively. It has been reported in the literature that an ICC above 0.740 indicates very good reliability [ 33 ]. The total scale and two of the factors exceed this threshold, and the remaining factors fall only marginally below it, so the PROACTIVE Coping Scale and its factors can be considered stable over time.

In the present study, the mean score of the “PROACTIVE Coping Scale” was 66.02 ± 8.19, the mean score of the “Active Preparation” factor was 21.93 ± 3.41, the mean score of the “Ineffective Preparation” factor was 15.22 ± 3.38, the mean score of the “Self-Management” factor was 14.16 ± 2.83, and the mean score of the “Utilization of Social Resources” factor was 14.71 ± 2.39. Considering that the maximum possible score of the PROACTIVE Coping Scale is 95, it can be said that the nursing students had above-average proactive coping levels. Similar to the present results, another study in the literature found nurses' proactive coping levels to be above average and emphasized that nurses with high proactive coping levels had good internal control, active coping, and a high sense of personal success and self-efficacy [ 16 ]. In addition, the literature review showed that high proactive coping in nurses was correlated with low burnout [ 16 , 34 ].

Limitations

Before starting the research, the nursing students were asked what proactive coping was; the majority stated that they did not know the term. Accordingly, before the students completed the questionnaires, proactive coping was explained to them by the researcher over a 20-minute period. The most important limitations of the research are that proactive coping was a concept the students had only just heard of, and that the research took place in the nursing department of a single university.

In conclusion, the PROACTIVE Coping Scale consists of 19 items and 4 factors: active preparation (6 items), ineffective preparation (5 items), self-management (4 items) and utilization of social resources (4 items). There are no reverse-coded items in the scale. The minimum and maximum possible scores are 19 and 95, respectively; higher scores indicate higher proactive coping levels. The Cronbach's alpha of the PROACTIVE Coping Scale was 0.816, and the scale explained 67.17% of the total variance.

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

References

1. Bartlett ML, Taylor H, Nelson JD. Comparison of mental health characteristics and stress between baccalaureate nursing students and nonnursing students. J Nurs Educ. 2016;55(2):87–90. https://doi.org/10.3928/01484834-20160114-05

2. Lavoie-Tremblay M, Sanzone L, Aubé T, Paquet M. Sources of stress and coping strategies among undergraduate nursing students across all years. Can J Nurs Res. 2022;54(3):261–71. https://doi.org/10.1177/08445621211028076

3. Watson R, Rehman S, Ali PA. Stressors affecting nursing students in Pakistan. Int Nurs Rev. 2017;64(4):536–43. https://doi.org/10.1111/inr.12392

4. Llapa Rodrigues EO, Almeida Marques D, Lopes Neto D, López Montesinos MJ, Amado de Oliveira AS. Stressful situations and factors in students of nursing in clinical practice. Invest Educ Enferm. 2016;34(1):211–20. https://doi.org/10.17533/udea.iee.v34n1a23

5. Aljohani W, Banakhar M, Sharif L, Alsaggaf F, Felemban O, Wright R. Sources of stress among Saudi Arabian nursing students: a cross-sectional study. Int J Environ Res Public Health. 2021;18(22):11958. https://doi.org/10.3390/ijerph182211958

6. Mussi FC, Pires CGDS, Carneiro LS, Costa ALS, Ribeiro FMS, Santos AFD. Comparison of stress in freshman and senior nursing students. Rev Esc Enferm USP. 2019;53. https://doi.org/10.1590/S1980-220X2017023503431

7. Turner K, McCarthy VL. Stress and anxiety among nursing students: a review of intervention strategies in literature between 2009 and 2015. Nurse Educ Pract. 2017;22:21–9. https://doi.org/10.1016/j.nepr.2016.11.002

8. Labrague LJ, McEnroe-Petitte DM, Gloe D, Thomas L, Papathanasiou IV, Tsaras K. A literature review on stress and coping strategies in nursing students. J Ment Health. 2017;26(5):471–80. https://doi.org/10.1080/09638237.2016.1244721

9. Dasgupta A, Podder D, Paul B, Bandyopadhyay L, Mandal S, Pal A, Mandal M. Perceived stress and coping behavior among future nurses: a cross-sectional study in West Bengal, India. Indian J Community Med. 2020;45(2):204–8. https://doi.org/10.4103/ijcm.IJCM_200_19

10. Akkaya G, Gümüş AB, Akkuş Y. Determination of factors affecting nursing students' educational stress. J Educ Res Nurs. 2018;15(4):202–8.

11. Chang Y, Chan HJ. Optimism and proactive coping in relation to burnout among nurses. J Nurs Manag. 2015;23:401–8. https://doi.org/10.1111/jonm.12148

12. Tian L, Tsai C, Khalsa G, Condie M, Kopystynsky N, Ohde K, Zhao A. A PROACTIVE coping scale for U.S. college students: initial evidence and implications. J Psychoeduc Assess. 2023;41(4):395–415. https://doi.org/10.1177/07342829221151005

13. Gan Y, Hu Y, Zhang Y. Proactive and preventive coping in adjustment to college. Psychol Rec. 2010;60(4):643–58.

14. Gillespie GL, Gates DM. Using proactive coping to manage the stress of trauma patient care. J Trauma Nurs. 2013;20(1):44–50. https://doi.org/10.1097/JTN.0b013e318286608e

15. Vaculíková J. Proactive coping behavior in a sample of university students in helping professions. Sociální pedagogika/Social Educ. 2016;4(2):38–55. https://doi.org/10.7441/soced.2016.04.02.03

16. Cruz JP, Cabrera DNC, Hufana OD, Alquwez N, Almazan J. Optimism, proactive coping and quality of life among nurses: a cross-sectional study. J Clin Nurs. 2018;27(9–10):2098–108. https://doi.org/10.1111/jocn.14363

17. Kong LN, Yang L, Pan YN, Chen SZ. Proactive personality, professional self-efficacy and academic burnout in undergraduate nursing students in China. J Prof Nurs. 2021;37(4):690–5. https://doi.org/10.1016/j.profnurs.2021.04.003

18. Ma Y, Yue Y, Hou L. The impact of proactive personality and clinical learning environment on nursing college students' perceived employability. Nurse Educ Pract. 2021;56:103213. https://doi.org/10.1016/j.nepr.2021.103213

19. Bryman A, Cramer D. Quantitative data analysis with SPSS release 10 for Windows: a guide for social scientists. London: Routledge; 2021. https://doi.org/10.4324/9780203471548

20. Nunnally JC. Psychometric theory. New York: McGraw-Hill; 1978.

21. Gorsuch RL. Factor analysis. Hillsdale, NJ: Lawrence Erlbaum Associates; 1983.

22. Lawshe CH. A quantitative approach to content validity. Pers Psychol. 1975;28(4):563–75.

23. Veneziano L, Hooper J. A method for quantifying content validity of health-related questionnaires. Am J Health Behav. 1997;21(1):67–70.

24. Ercan İ, Kan İ. Reliability and validity of the scales. Uludağ Univ Fac Med J. 2004;30(3):211–6.

25. Seçer İ. Psychological test development and adaptation process: SPSS and LISREL applications. Ankara: Anı Publishing; 2015. p. 65–76.

26. Cerny BA, Kaiser HF. A study of a measure of sampling adequacy for factor-analytic correlation matrices. Multivar Behav Res. 1977;12(1):43–7.

27. Crocker L, Algina J. Introduction to classical and modern test theory. Belmont: Wadsworth; 2006.

28. Çapık C, Gözüm S, Aksayan S. Stages of cross-cultural scale adaptation, language and culture adaptation: updated guide. FNJN Florence Nightingale J Nurs. 2018;26(3):199–210. https://doi.org/10.26650/FNJN397481

29. Karagöz Y. SPSS and AMOS applied quantitative-qualitative mixed scientific research methods. Istanbul: Nobel Academic Publishing; 2017.

30. Field A. Discovering statistics using SPSS for Windows. London: Sage; 2000.

31. Ellez AM. Features that measurement tools should have. In: Tanrıöğen A, editor. Scientific research methods. Ankara: Anı Publishing; 2014. p. 167–88.

32. Alpar R. Applied statistics and validity-reliability. 5th ed. Ankara: Detay Publishing; 2018. p. 493–604.

33. Sönmez V, Alacapınar FG. Illustrated scientific research methods. 7th ed. Ankara: Anı Publishing; 2016.

34. Chang K, Taylor J. Do your employees use the right stress coping strategies? Int J Commer Strategy. 2014;5(2):99–116.

Acknowledgements

I would like to thank Tian and her colleagues for allowing us to use the PROACTIVE Coping Scale, all the experts who gave their opinions, and all the students who participated in the research.

Funding

No external funding.

Author information

Authors and affiliations

Faculty of Health Sciences, Yozgat Bozok University, Yozgat, Turkey

Esra Özbudak Arıca


Contributions

Esra Özbudak Arıca: Conceptualization; Data curation; Formal analysis; Funding acquisition; Investigation; Methodology; Project administration; Resources; Software; Supervision; Validation; Visualization; Roles/Writing - original draft; Writing - review & editing.

Corresponding author

Correspondence to Esra Özbudak Arıca .

Ethics declarations

Ethics approval and consent to participate

Before starting the research, permission to use the scale was obtained from Tian via e-mail. Later, ethics committee approval was received from Yozgat Bozok University Ethics Commission (Decision No: 08/11 Date: 23.11.2023) and institutional permission was obtained from the institution where the research was conducted (No: E-88148187-020-186521). In addition, before starting the application, the students were informed that the research was completely voluntary and that the content, purpose, scope and data of the research would be kept confidential. Written and verbal consent was obtained from students who voluntarily agreed to participate in the research.

Consent for publication

Not Applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Arıca, E.Ö. Examining the psychometric properties of the Turkish version of the proactive coping scale in nursing students: A methodological study. BMC Nurs 23 , 481 (2024). https://doi.org/10.1186/s12912-024-02150-1


Received : 27 February 2024

Accepted : 05 July 2024

Published : 15 July 2024

DOI : https://doi.org/10.1186/s12912-024-02150-1


Keywords

  • Nursing students
  • PROACTIVE coping
  • Translation
  • Validity and reliability

BMC Nursing

ISSN: 1472-6955

research on valid

IMAGES

  1. School essay: Components of valid research

    research on valid

  2. Reliability vs. Validity in Research

    research on valid

  3. Validity and reliability in research example

    research on valid

  4. Research validity and reliability

    research on valid

  5. How to establish the validity and reliability of qualitative research?

    research on valid

  6. Examples of reliability and validity in research

    research on valid

VIDEO

  1. Validation Of Research Instruments

  2. Basics of Research Methodology Lecture 1

  3. Junk Science About Men

  4. Got unexpected results from your research? Discover Experimental Results

  5. Reliability and Validity in Research || Validity and Reliability in Research in Urdu and Hindi

  6. Validity vs Reliability || Research ||

COMMENTS

  1. Validity

    Validity Validity is a fundamental concept in research, referring to the extent to which a test, measurement, or study accurately reflects or assesses the specific concept that the researcher is attempting to measure. Ensuring validity is crucial as it determines the trustworthiness and credibility of the research findings.

  2. The 4 Types of Validity in Research

    Validity tells you how accurately a method measures something. If a method measures what it claims to measure, and the results closely correspond to real-world values, then it can be considered valid. There are four main types of validity:

  3. Validity in research: a guide to measuring the right things

    Validity is necessary for all types of studies ranging from market validation of a business or product idea to the effectiveness of medical trials and procedures. So, how can you determine whether your research is valid? This guide can help you understand what validity is, the types of validity in research, and the factors that affect research validity.

  4. Reliability vs. Validity in Research

    Reliability is about the consistency of a measure, and validity is about the accuracy of a measure.opt. It's important to consider reliability and validity when you are creating your research design, planning your methods, and writing up your results, especially in quantitative research. Failing to do so can lead to several types of research ...

  5. Validity & Reliability In Research

    Validity and reliability are two incredibly important concepts in research, especially within the social sciences. Both validity and reliability have to do with the measurement of variables and/or constructs - for example, job satisfaction, intelligence, productivity, etc.

  6. The 4 Types of Validity

    In quantitative research, you have to consider the reliability and validity of your methods and measurements. Validity tells you how accurately a method

  7. What is Validity in Research?

    Validity is an important concept in establishing qualitative research rigor. At its core, validity in research speaks to the degree to which a study accurately reflects or assesses the specific concept that the researcher is attempting to measure or understand. It's about ensuring that the study investigates what it purports to investigate.

  8. Quantitative Research Excellence: Study Design and Reliable and Valid

    Learn how to design and measure quantitative research with excellence and validity from this comprehensive article.

  9. Validity in Research and Psychology: Types & Examples

    Validity in research, statistics, psychology, and testing evaluates how well test scores reflect what they're supposed to measure.

  10. Reliability vs Validity in Research

    Reliability and validity are concepts used to evaluate the quality of research. They indicate how well a method, technique, or test measures something.

  11. Validity in Qualitative Evaluation: Linking Purposes, Paradigms, and

    Although validity in qualitative research has been widely reflected upon in the methodological literature (and is still often subject of debate), the link with evaluation research is underexplored. In this article, I will explore aspects of validity of qualitative research with the explicit objective of connecting them with aspects of evaluation.

  12. Contemporary Test Validity in Theory and Practice: A Primer for

    One particular aspect of (quantitative) social science research that differs markedly from disciplines such as biology and chemistry is the instrumentation used to quantify phenomena. In response, this Research Methods essay offers a contemporary social science perspective on test validity and the validation process.

  13. Reliability vs. Validity in Research: Types & Examples

    Explore how reliability vs validity in research determines quality. Learn the differences and types + examples. Get insights!

  14. Internal and external validity: can you apply research study results to

    The validity of a research study includes two domains: internal and external validity. Internal validity is defined as the extent to which the observed results represent the truth in the population we are studying and, thus, are not due to methodological errors. In our example, if the authors can support that the study has internal validity ...

  15. Validity In Psychology Research: Types & Examples

    In psychology research, validity refers to the extent to which a test or measurement tool accurately measures what it's intended to measure. It ensures that the research findings are genuine and not due to extraneous factors. Validity can be categorized into different types, including construct validity (measuring the intended abstract trait), internal validity (ensuring causal conclusions ...

  16. Validity, reliability, and generalizability in qualitative research

    Validity, reliability, and generalizability in qualitative research. In general practice, qualitative research contributes as significantly as quantitative research, in particular regarding psycho-social aspects of patient-care, health services provision, policy setting, and health administrations. In contrast to quantitative research ...

  17. Validity and reliability in quantitative studies

    Validity. Validity is defined as the extent to which a concept is accurately measured in a quantitative study. For example, a survey designed to explore depression but which actually measures anxiety would not be considered valid. The second measure of quality in a quantitative study is reliability, or the accuracy of an instrument.

  18. Reliability and Validity

    Reliability in research refers to the consistency and stability of measurements or findings. Validity relates to the accuracy and truthfulness of results, measuring what the study intends to. Both are crucial for trustworthy and credible research outcomes.

  19. Understanding Reliability and Validity

    While reliability is concerned with the accuracy of the actual measuring instrument or procedure, validity is concerned with the study's success at measuring what the researchers set out to measure. Researchers should be concerned with both external and internal validity.

  20. Validity vs. Reliability

    In research, whether qualitative or quantitative, two concepts often arise when discussing the quality and rigor of a study: reliability and validity. While interconnected, the two terms have distinct meanings that carry significant weight in research.

  21. Validity and Reliability in Qualitative Research

    Reliability and validity are equally important to consider in qualitative research, and there are established strategies for strengthening each. The most well-known measure of qualitative reliability in education research is inter-rater reliability with consensus coding (a worked agreement sketch appears after this list).

  22. What Makes Valid Research? How to Verify if a Source is Credible on the Internet

    Outlines key components to consider when verifying whether an online source is credible, starting with identifying the writer and/or the organization behind the information and determining whether they are reliable and credible.

  23. Development and validity testing of a matrix to evaluate maturity of clinical pathways

    There are several areas for future research related to our clinical pathways maturity matrix. First, we did not perform test-retest or inter-rater reliability testing of the matrix, so future studies should evaluate its reliability and validity (a test-retest correlation sketch appears after this list).

  24. A Primer on the Validity of Assessment Instruments

    Validity in research refers to how accurately a study answers the study question or the strength of the study conclusions. For outcome measures such as surveys or tests, validity refers to the accuracy of measurement: how well the assessment tool actually measures the underlying outcome of interest.

  25. Examining the psychometric properties of the Turkish version of the

    There are no valid and reliable instruments for measuring the proactive coping levels of nursing students in Turkey. This descriptive, methodological study collected data between 01.12.2023 and 01.01.2024 via face-to-face interviews (an internal-consistency sketch appears after this list).
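
To make the contrast drawn in entries 10, 18, and 19 concrete, here is a minimal Python sketch (not taken from any of the sources above; the scale names and numbers are hypothetical). It simulates an instrument that is reliable but not valid (tightly clustered readings with a systematic bias) against one that is valid but less reliable (unbiased on average but noisy):

```python
# Hypothetical illustration of reliability vs. validity; all numbers are made up.
import random
import statistics

random.seed(42)
TRUE_WEIGHT_KG = 70.0  # hypothetical true value being measured

# Scale A: reliable but not valid -- readings cluster tightly (consistent)
# around a value 5 kg too high (systematic bias).
scale_a = [TRUE_WEIGHT_KG + 5.0 + random.gauss(0, 0.1) for _ in range(100)]

# Scale B: valid but less reliable -- readings center on the true value
# (accurate on average) but scatter widely (inconsistent).
scale_b = [TRUE_WEIGHT_KG + random.gauss(0, 3.0) for _ in range(100)]

for name, readings in [("Scale A", scale_a), ("Scale B", scale_b)]:
    bias = statistics.mean(readings) - TRUE_WEIGHT_KG   # low bias ~ valid
    spread = statistics.stdev(readings)                 # low spread ~ reliable
    print(f"{name}: bias = {bias:+.2f} kg, spread = {spread:.2f} kg")
```

Scale A reports almost exactly the same wrong number every time; Scale B is right on average but no single reading can be trusted. A good instrument needs both low bias and low spread.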
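
Entry 21 mentions inter-rater reliability and consensus coding. A common statistic for two coders is Cohen's kappa, which corrects raw agreement for the agreement expected by chance alone. A minimal sketch, assuming two hypothetical coders and made-up theme labels:

```python
# Inter-rater reliability via Cohen's kappa; coders and labels are hypothetical.
from collections import Counter

rater_1 = ["theme_a", "theme_a", "theme_b", "theme_c", "theme_b", "theme_a"]
rater_2 = ["theme_a", "theme_b", "theme_b", "theme_c", "theme_b", "theme_a"]

n = len(rater_1)
observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n  # raw agreement

# Agreement expected by chance, from each rater's marginal label frequencies.
freq_1, freq_2 = Counter(rater_1), Counter(rater_2)
labels = set(rater_1) | set(rater_2)
expected = sum((freq_1[lab] / n) * (freq_2[lab] / n) for lab in labels)

kappa = (observed - expected) / (1 - expected)
print(f"Observed agreement: {observed:.2f}, Cohen's kappa: {kappa:.2f}")
```

A kappa near 1 indicates strong agreement beyond chance; a value near 0 means the coders agree no more often than chance would predict.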
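
Entry 23 notes that test-retest reliability was not evaluated. The conventional check is the correlation between scores from two administrations of the same instrument to the same participants. A minimal sketch with hypothetical scores (statistics.correlation requires Python 3.10 or later):

```python
# Test-retest reliability as a Pearson correlation; scores are hypothetical.
import statistics

time_1 = [12, 15, 9, 20, 17, 11, 14, 18]   # first administration
time_2 = [13, 14, 10, 19, 18, 10, 15, 17]  # same participants at retest

r = statistics.correlation(time_1, time_2)  # Pearson's r
print(f"Test-retest correlation: r = {r:.2f}")
```

An r close to 1 suggests the instrument yields stable scores over time; a low r signals inconsistency that would undermine any validity claim built on those scores.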
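
Finally, instrument-validation studies such as entry 25 typically report internal consistency alongside validity evidence, most often as Cronbach's alpha. A minimal sketch with a hypothetical four-item, five-point scale:

```python
# Internal consistency via Cronbach's alpha; responses are hypothetical.
import statistics

# rows = respondents, columns = items on a 5-point scale
responses = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 3, 2],
    [4, 4, 5, 4],
]

k = len(responses[0])                                         # number of items
item_vars = [statistics.variance(col) for col in zip(*responses)]
total_var = statistics.variance([sum(row) for row in responses])

# alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values of alpha above roughly 0.7 are conventionally read as acceptable internal consistency, though the threshold depends on the scale's purpose and length.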