What Is Qualitative Content Analysis?

QCA explained simply (with examples).

By: Jenna Crosley (PhD). Reviewed by: Dr Eunice Rautenbach (DTech) | February 2021

If you’re in the process of preparing for your dissertation, thesis or research project, you’ve probably encountered the term “qualitative content analysis” – it’s quite a mouthful. If you’ve landed on this post, you’re probably a bit confused about it. Well, the good news is that you’ve come to the right place…

Overview: Qualitative Content Analysis

  • What (exactly) is qualitative content analysis
  • The two main types of content analysis
  • When to use content analysis
  • How to conduct content analysis (the process)
  • The advantages and disadvantages of content analysis

1. What is content analysis?

Content analysis is a qualitative analysis method that focuses on recorded human artefacts such as manuscripts, voice recordings and journals. Content analysis investigates these written, spoken and visual artefacts without explicitly extracting data from participants – this is called unobtrusive research.

In other words, with content analysis, you don’t necessarily need to interact with participants (although you can if necessary); you can simply analyse the data that they have already produced. With this type of analysis, you can analyse data such as text messages, books, Facebook posts, videos, and audio (just to mention a few).

The basics – explicit and implicit content

When working with content analysis, explicit and implicit content will play a role. Explicit data is transparent and easy to identify, while implicit data is that which requires some form of interpretation and is often of a subjective nature. Sounds a bit fluffy? Here’s an example:

Joe: Hi there, what can I help you with? 

Lauren: I recently adopted a puppy and I’m worried that I’m not feeding him the right food. Could you please advise me on what I should be feeding? 

Joe: Sure, just follow me and I’ll show you. Do you have any other pets?

Lauren: Only one, and it tweets a lot!

In this exchange, the explicit data indicates that Joe is helping Lauren to find the right puppy food. Joe asks Lauren whether she has any pets aside from her puppy. This data is explicit because it requires no interpretation.

On the other hand, implicit data, in this case, includes the fact that the speakers are in a pet store. This information is not clearly stated but can be inferred from the conversation, where Joe is helping Lauren to choose pet food. An additional piece of implicit data is that Lauren likely has some type of bird as a pet. This can be inferred from the way that Lauren states that her pet “tweets”.

As you can see, explicit and implicit data both play a role in human interaction and are an important part of your analysis. However, it’s important to differentiate between these two types of data when you’re undertaking content analysis. Interpreting implicit data can be rather subjective as conclusions are based on the researcher’s interpretation. This can introduce an element of bias, which risks skewing your results.


2. The two types of content analysis

Now that you understand the difference between implicit and explicit data, let’s move on to the two general types of content analysis: conceptual and relational content analysis. Importantly, while conceptual and relational content analysis both follow similar steps initially, the aims and outcomes of each are different.

Conceptual analysis focuses on the number of times a concept occurs in a set of data and is generally focused on explicit data. For example, if you were to have the following conversation:

Marie: She told me that she has three cats.

Jean: What are her cats’ names?

Marie: I think the first one is Bella, the second one is Mia, and… I can’t remember the third cat’s name.

In this data, you can see that the word “cat” has been used three times. Through conceptual content analysis, you can deduce that cats are the central topic of the conversation. You can also perform a frequency analysis, where you assess the term’s frequency in the data. For example, in the exchange above, the word “cat” makes up roughly 9% of the data (3 of the 33 words). In other words, conceptual analysis brings a little bit of quantitative analysis into your qualitative analysis.
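To make the arithmetic concrete, here is a minimal Python sketch of the frequency count for the exchange above. The tokenising rule and the decision to treat “cats” and “cat’s” as the same concept are assumptions made for illustration, not a standard procedure; your own coding scheme would define these rules.

```python
import re
from collections import Counter

# The three turns from the example conversation (straight apostrophes for simplicity).
conversation = [
    "She told me that she has three cats.",
    "What are her cats' names?",
    "I think the first one is Bella, the second one is Mia, and... "
    "I can't remember the third cat's name.",
]

# Assumed coding rule: these surface forms all count as the concept "cat".
CAT_FORMS = {"cat", "cats", "cat's"}


def tokenize(text):
    """Lowercase the text and split it into words, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


tokens = [tok for turn in conversation for tok in tokenize(turn)]
counts = Counter(tokens)

total_words = len(tokens)
cat_mentions = sum(counts[form] for form in CAT_FORMS)
share = 100 * cat_mentions / total_words

print(f"{cat_mentions} of {total_words} words ({share:.0f}%) refer to 'cat'")
```

Changing the tokenising rule (for example, splitting “can’t” into two words) would change the total and therefore the percentage, which is why the counting rules belong in your coding scheme.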

As you can see, the above analysis involves no interpretation and focuses on explicit data. Relational content analysis, on the other hand, takes a more holistic view by focusing more on implicit data in terms of context, surrounding words and relationships.

There are three types of relational analysis:

  • Affect extraction
  • Proximity analysis
  • Cognitive mapping

Affect extraction is when you assess concepts according to emotional attributes. These emotions are typically mapped on scales, such as a Likert scale or a rating scale ranging from 1 to 5, where 1 is “very sad” and 5 is “very happy”.

If participants are talking about their achievements, they are likely to be given a score of 4 or 5, depending on how good they feel about it. If a participant is describing a traumatic event, they are likely to have a much lower score, either 1 or 2.

Proximity analysis identifies explicit terms (such as those found in a conceptual analysis) and the patterns in terms of how they co-occur in a text. In other words, proximity analysis investigates the relationship between terms and aims to group these to extract themes and develop meaning.

Proximity analysis is typically utilised when you’re looking for hard facts rather than emotional, cultural, or contextual factors. For example, if you were to analyse a political speech, you may want to focus only on what has been said, rather than implications or hidden meanings. To do this, you would make use of explicit data, discounting any underlying meanings and implications of the speech.
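As a rough illustration of how co-occurrence can be counted, the sketch below tallies how often pairs of pre-selected terms appear within a few words of each other. The sample text, the term list and the window size of four words are all hypothetical choices made for the example, and the window deliberately ignores sentence boundaries to keep the code short.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrences(text, terms, window=4):
    """Count pairs of different terms that occur within `window` words of each other."""
    tokens = re.findall(r"[a-z]+", text.lower())
    # Record the position of every token that is one of the terms of interest.
    positions = [(i, tok) for i, tok in enumerate(tokens) if tok in terms]
    pairs = Counter()
    for (i, a), (j, b) in combinations(positions, 2):
        if a != b and abs(i - j) <= window:
            pairs[tuple(sorted((a, b)))] += 1
    return pairs


# Hypothetical extract from a political speech.
speech = (
    "We will cut taxes and create jobs. Lower taxes mean more jobs. "
    "Our schools need funding."
)

pairs = cooccurrences(speech, terms={"taxes", "jobs", "schools"})
for pair, count in pairs.most_common():
    print(pair, count)
```

In this toy extract, “taxes” and “jobs” co-occur more often than any other pair, which is the kind of pattern you would then group into a theme.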

Lastly, there’s cognitive mapping, which can be used in addition to, or along with, proximity analysis. Cognitive mapping involves taking different texts and comparing them in a visual format – i.e. a cognitive map. Typically, you’d use cognitive mapping in studies that assess changes in terms, definitions, and meanings over time. It can also serve as a way to visualise affect extraction or proximity analysis and is often presented in a form such as a graphic map.

Example of a cognitive map

To recap on the essentials, content analysis is a qualitative analysis method that focuses on recorded human artefacts. It involves both conceptual analysis (which is more numbers-based) and relational analysis (which focuses on the relationships between concepts and how they’re connected).


3. When should you use content analysis?

Content analysis is a useful tool that provides insight into trends of communication. For example, you could use a discussion forum as the basis of your analysis and look at the types of things the members talk about as well as how they use language to express themselves. Content analysis is flexible in that it can be applied to the individual, group, and institutional level.

Content analysis is typically used in studies where the aim is to better understand factors such as behaviours, attitudes, values, emotions, and opinions. For example, you could use content analysis to investigate an issue in society, such as miscommunication between cultures. In this example, you could compare patterns of communication in participants from different cultures, which will allow you to create strategies for avoiding misunderstandings in intercultural interactions.

Another example could include conducting content analysis on a publication such as a book. Here you could gather data on the themes, topics, language use and opinions reflected in the text to draw conclusions regarding the political (such as conservative or liberal) leanings of the publication.


4. How to conduct a qualitative content analysis

Conceptual and relational content analysis differ in terms of their exact process; however, there are some similarities. Let’s have a look at these first – i.e., the generic process:

  • Recap on your research questions
  • Undertake bracketing to identify biases
  • Operationalise your variables and develop a coding scheme
  • Code the data and undertake your analysis

Step 1 – Recap on your research questions

It’s always useful to begin a project with research questions, or at least with an idea of what you are looking for. In fact, if you’ve spent time reading this blog, you’ll know that it’s useful to recap on your research questions, aims and objectives when undertaking pretty much any research activity. In the context of content analysis, it’s difficult to know what needs to be coded and what doesn’t, without a clear view of the research questions.

For example, if you were to code a conversation focused on basic issues of social justice, you may be met with a wide range of topics that may be irrelevant to your research. However, if you approach this data set with the specific intent of investigating opinions on gender issues, you will be able to focus on this topic alone, which would allow you to code only what you need to investigate.


Step 2 – Reflect on your personal perspectives and biases

It’s vital that you reflect on your own preconceptions of the topic at hand and identify the biases that you might drag into your content analysis – this is called “bracketing”. By identifying these upfront, you’ll be more aware of them and less likely to have them subconsciously influence your analysis.

For example, if you were to investigate how a community converses about unequal access to healthcare, it is important to assess your views to ensure that you don’t project these onto your understanding of the opinions put forth by the community. If you have access to medical aid, for instance, you should not allow this to interfere with your examination of unequal access.


Step 3 – Operationalise your variables and develop a coding scheme

Next, you need to operationalise your variables. But what does that mean? Simply put, it means that you have to define each variable or construct. Give every item a clear definition – what does it mean (include) and what does it not mean (exclude). For example, if you were to investigate children’s views on healthy foods, you would first need to define what age group/range you’re looking at, and then also define what you mean by “healthy foods”.

In combination with the above, it is important to create a coding scheme, which will consist of information about your variables (how you defined each variable), as well as a process for analysing the data. For this, you would refer back to how you operationalised/defined your variables so that you know how to code your data.

For example, when coding, when should you code a food as “healthy”? What makes a food choice healthy? Is it the absence of sugar or saturated fat? Is it the presence of fibre and protein? It’s very important to have clearly defined variables to achieve consistent coding – without this, your analysis will get very muddy, very quickly.
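One way to keep definitions consistent is to write the coding scheme down in a structured form. The sketch below is a minimal, hypothetical codebook entry for “healthy food” with explicit include and exclude lists, plus a simple function that applies it to a segment of text. The definition, the term lists and the rule that an exclusion overrides an inclusion are assumptions for illustration only; your own operationalisation would come from your research questions and literature.

```python
from dataclasses import dataclass, field


@dataclass
class Code:
    """One entry in a coding scheme: a name, a definition and its boundaries."""
    name: str
    definition: str
    include: set = field(default_factory=set)  # terms that count towards the code
    exclude: set = field(default_factory=set)  # phrases that rule the code out


# Hypothetical operationalisation of "healthy food".
healthy_food = Code(
    name="healthy food",
    definition="Foods high in fibre or protein and low in added sugar",
    include={"broccoli", "peaches", "bananas", "lentils"},
    exclude={"banana milkshake", "peach sweets"},
)


def apply_code(code, segment):
    """Return True if the segment should receive the code under this scheme."""
    text = segment.lower()
    # Assumed rule: any excluded phrase overrides the include list.
    if any(phrase in text for phrase in code.exclude):
        return False
    return any(term in text for term in code.include)


segments = [
    "I eat bananas every day",
    "My favourite is a banana milkshake",
    "We had pizza last night",
]
for segment in segments:
    print(segment, "->", apply_code(healthy_food, segment))
```

Simple substring matching like this will miss context and nuance, so it is best treated as a consistency check alongside manual coding rather than a replacement for it.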


Step 4 – Code and analyse the data

The next step is to code the data. At this stage, there are some differences between conceptual and relational analysis.

As described earlier in this post, conceptual analysis looks at the existence and frequency of concepts, whereas a relational analysis looks at the relationships between concepts. For both types of analyses, it is important to pre-select a concept that you wish to assess in your data. Using the example of studying children’s views on healthy food, you could pre-select the concept of “healthy food” and assess the number of times the concept pops up in your data.

Here is where conceptual and relational analysis start to differ.

At this stage of conceptual analysis, it is necessary to decide on the level of analysis you’ll perform on your data, and whether this will exist on the word, phrase, sentence, or thematic level. For example, will you code the phrase “healthy food” on its own? Will you code each term relating to healthy food (e.g., broccoli, peaches, bananas, etc.) with the code “healthy food” or will these be coded individually? It is very important to establish this from the get-go to avoid inconsistencies that could result in you having to code your data all over again.

On the other hand, relational analysis looks at the type of analysis. So, will you use affect extraction? Proximity analysis? Cognitive mapping? A mix? It’s vital to determine the type of analysis before you begin to code your data so that you can maintain the reliability and validity of your research.


How to conduct conceptual analysis

First, let’s have a look at the process for conceptual analysis.

Once you’ve decided on your level of analysis, you need to establish how you will code your concepts, and how many of these you want to code. Here you can choose whether you want to code in a deductive or inductive manner. Just to recap, deductive coding is when you begin the coding process with a set of pre-determined codes, whereas inductive coding entails the codes emerging as you progress with the coding process. Here it is also important to decide what should be included and excluded from your analysis, and also what levels of implication you wish to include in your codes.

For example, if you have the concept of “tall”, can you include “up in the clouds”, derived from the sentence, “the giraffe’s head is up in the clouds” in the code, or should it be a separate code? In addition to this, you need to know what levels of words may be included in your codes or not. For example, if you say, “the panda is cute” and “look at the panda’s cuteness”, can “cute” and “cuteness” be included under the same code?

Once you’ve considered the above, it’s time to code the text. We’ve already published a detailed post about coding, so we won’t go into that process here. Once you’re done coding, you can move on to analysing your results. This is where you will aim to find generalisations in your data, and thus draw your conclusions.

How to conduct relational analysis

Now let’s return to relational analysis.

As mentioned, you want to look at the relationships between concepts. To do this, you’ll need to create categories by reducing your data (in other words, grouping similar concepts together) and then also code for words and/or patterns. These are both done with the aim of discovering whether these words exist, and if they do, what they mean.

Your next step is to assess your data and to code the relationships between your terms and meanings, so that you can move on to your final step, which is to sum up and analyse the data.

To recap, it’s important to start your analysis process by reviewing your research questions and identifying your biases. From there, you need to operationalise your variables, code your data and then analyse it.


5. What are the pros & cons of content analysis?

One of the main advantages of content analysis is that it allows you to use a mix of quantitative and qualitative research methods, which results in a more scientifically rigorous analysis.

For example, with conceptual analysis, you can count the number of times that a term or a code appears in a dataset, which can be assessed from a quantitative standpoint. In addition to this, you can then use a qualitative approach to investigate the underlying meanings of these and relationships between them.

Content analysis is also unobtrusive and therefore poses fewer ethical issues than some other analysis methods. As the content you’ll analyse oftentimes already exists, you’ll analyse what has been produced previously, and so you won’t have to collect data directly from participants. When coded correctly, data is analysed in a very systematic and transparent manner, which means that issues of replicability (how possible it is to recreate research under the same conditions) are reduced greatly.

On the downside, qualitative research (in general, not just content analysis) is often critiqued for being too subjective and for not being scientifically rigorous enough. This is where reliability (how replicable a study is by other researchers) and validity (how suitable the research design is for the topic being investigated) come into play – if you take these into account, you’ll be on your way to achieving sound research results.


Recap: Qualitative content analysis

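In this post, we’ve covered a lot of ground:

  • What (exactly) qualitative content analysis is
  • The two main types of content analysis
  • When to use content analysis
  • How to conduct content analysis (the process)
  • The advantages and disadvantages of content analysis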

If you have any questions about qualitative content analysis, feel free to leave a comment below.


Chapter 17. Content Analysis


Content analysis is a term that is used to mean both a method of data collection and a method of data analysis. Archival and historical works can be the source of content analysis, but so too can the contemporary media coverage of a story, blogs, comment posts, films, cartoons, advertisements, brand packaging, and photographs posted on Instagram or Facebook. Really, almost anything can be the “content” to be analyzed. This is a qualitative research method because the focus is on the meanings and interpretations of that content rather than strictly numerical counts or variables-based causal modeling. [1] Qualitative content analysis (sometimes referred to as QCA) is particularly useful when attempting to define and understand prevalent stories or communication about a topic of interest—in other words, when we are less interested in what particular people (our defined sample) are doing or believing and more interested in what general narratives exist about a particular topic or issue. This chapter will explore different approaches to content analysis and provide helpful tips on how to collect data, how to turn that data into codes for analysis, and how to go about presenting what is found through analysis. It is also a nice segue between our data collection methods (e.g., interviewing, observation) chapters and chapters 18 and 19, whose focus is on coding, the primary means of data analysis for most qualitative data. In many ways, the methods of content analysis are quite similar to the method of coding.


Although the body of material (“content”) to be collected and analyzed can be nearly anything, most qualitative content analysis is applied to forms of human communication (e.g., media posts, news stories, campaign speeches, advertising jingles). The point of the analysis is to understand this communication, to systematically and rigorously explore its meanings, assumptions, themes, and patterns. Historical and archival sources may be the subject of content analysis, but there are other ways to analyze (“code”) this data when not overly concerned with the communicative aspect (see chapters 18 and 19). This is why we tend to consider content analysis its own method of data collection as well as a method of data analysis. Still, many of the techniques you learn in this chapter will be helpful to any “coding” scheme you develop for other kinds of qualitative data. Just remember that content analysis is a particular form with distinct aims and goals and traditions.

An Overview of the Content Analysis Process

The First Step: Selecting Content

Figure 17.1 is a display of possible content for content analysis. The first step in content analysis is making smart decisions about what content you will want to analyze and clearly connecting this content to your research question or general focus of research. Why are you interested in the messages conveyed in this particular content? What will the identification of patterns here help you understand? Content analysis can be fun to do, but in order to make it research, you need to fit it into a research plan.

Figure 17.1. A Non-exhaustive List of "Content" for Content Analysis

To take one example, let us imagine you are interested in gender presentations in society and how presentations of gender have changed over time. There are various forms of content out there that might help you document changes. You could, for example, begin by creating a list of magazines that are coded as being for “women” (e.g., Women’s Daily Journal) and magazines that are coded as being for “men” (e.g., Men’s Health). You could then select a date range that is relevant to your research question (e.g., 1950s–1970s) and collect magazines from that era. You might create a “sample” by deciding to look at three issues for each year in the date range and a systematic plan for what to look at in those issues (e.g., advertisements? Cartoons? Titles of articles? Whole articles?). You are not just going to look at some magazines willy-nilly. That would not be systematic enough to allow anyone to replicate or check your findings later on. Once you have a clear plan of what content is of interest to you and what you will be looking at, you can begin, creating a record of everything you are including as your content. This might mean a list of each advertisement you look at or each title of stories in those magazines along with its publication date. You may decide to have multiple kinds of “content” in your research plan. For each kind, you want a clear plan for collecting, sampling, and documenting.
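The sampling plan described above (three issues per year across the date range) can be written down so that someone else could reproduce it. The following Python sketch is one possible way to do this, under stated assumptions: the issue identifiers are hypothetical placeholders in a year-month format, monthly publication is assumed, and a fixed random seed is used so the same sample is drawn every time.

```python
import random


def build_sample(issues_by_year, per_year=3, seed=0):
    """Draw a reproducible sample of `per_year` issues for each year."""
    rng = random.Random(seed)  # fixed seed so the draw can be repeated exactly
    sample = {}
    for year in sorted(issues_by_year):
        issues = sorted(issues_by_year[year])
        k = min(per_year, len(issues))  # some years may have fewer issues
        sample[year] = sorted(rng.sample(issues, k))
    return sample


# Hypothetical catalogue: twelve monthly issues per year, 1950 through 1979.
issues_by_year = {
    year: [f"{year}-{month:02d}" for month in range(1, 13)]
    for year in range(1950, 1980)
}

sample = build_sample(issues_by_year)
total_issues = sum(len(issues) for issues in sample.values())
print(f"{len(sample)} years, {total_issues} issues in the sample")
```

Recording the seed and the selection rule alongside your list of sampled issues is one way of documenting the plan so that your findings can be checked later.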

The Second Step: Collecting and Storing

Once you have a plan, you are ready to collect your data. This may entail downloading from the internet, creating a Word document or PDF of each article or picture, and storing these in a folder designated by the source and date (e.g., “Men’s Health advertisements, 1950s”). Sølvberg (2021), for example, collected posted job advertisements for three kinds of elite jobs (economic, cultural, professional) in Sweden. But collecting might also mean going out and taking photographs yourself, as in the case of graffiti, street signs, or even what people are wearing. Chaise LaDousa, an anthropologist and linguist, took photos of “house signs,” which are signs, often creative and sometimes offensive, hung by college students living in communal off-campus houses. These signs were a focal point of college culture, sending messages about the values of the students living in them. Some of the names will give you an idea: “Boot ’n Rally,” “The Plantation,” “Crib of the Rib.” The students might find these signs funny and benign, but LaDousa (2011) argued convincingly that they also reproduced racial and gender inequalities. The data here already existed (they were big signs on houses), but the researcher had to collect the data by taking photographs.

In some cases, your content will be in physical form but not amenable to photographing, as in the case of films or unwieldy physical artifacts you find in the archives (e.g., undigitized meeting minutes or scrapbooks). In this case, you need to create some kind of detailed log (fieldnotes even) of the content that you can reference. In the case of films, this might mean watching the film and writing down details for key scenes that become your data. [2] For scrapbooks, it might mean taking notes on what you are seeing, quoting key passages, describing colors or presentation style. As you might imagine, this can take a lot of time. Be sure you budget this time into your research plan.

Researcher Note

A note on data scraping: Data scraping, sometimes known as screen scraping or frame grabbing, is a way of extracting data generated by another program, as when a scraping tool grabs information from a website. This may help you collect data that is on the internet, but you need to be ethical in how you employ the scraper. A student once helped me scrape thousands of stories from the Time magazine archives at once (although it took several hours for the scraping process to complete). These stories were freely available, so the scraping process simply sped up the laborious process of copying each article of interest and saving it to my research folder. Scraping tools can sometimes be used to circumvent paywalls. Be careful here!

The Third Step: Analysis

There is often an assumption among novice researchers that once you have collected your data, you are ready to write about what you have found. Actually, you haven’t yet found anything, and if you try to write up your results, you will probably be staring sadly at a blank page. Between the collection and the writing comes the difficult task of systematically and repeatedly reviewing the data in search of patterns and themes that will help you interpret the data, particularly its communicative aspect (e.g., What is it that is being communicated here, with these “house signs” or in the pages of Men’s Health?).

The first time you go through the data, keep an open mind on what you are seeing (or hearing), and take notes about your observations that link up to your research question. In the beginning, it can be difficult to know what is relevant and what is extraneous. Sometimes, your research question changes based on what emerges from the data. Use the first round of review to consider this possibility, but then commit yourself to following a particular focus or path. If you are looking at how gender gets made or re-created, don’t follow the white rabbit down a hole about environmental injustice unless you decide that this really should be the focus of your study or that issues of environmental injustice are linked to gender presentation.

In the second round of review, be very clear about emerging themes and patterns. Create codes (more on these in chapters 18 and 19) that will help you simplify what you are noticing. For example, “men as outdoorsy” might be a common trope you see in advertisements. Whenever you see this, mark the passage or picture.

In your third (or fourth or fifth) round of review, begin to link up the tropes you’ve identified, looking for particular patterns and assumptions. You’ve drilled down to the details, and now you are building back up to figure out what they all mean. Start thinking about theory—either theories you have read about and are using as a frame of your study (e.g., gender as performance theory) or theories you are building yourself, as in the Grounded Theory tradition.

Once you have a good idea of what is being communicated and how, go back to the data at least one more time to look for disconfirming evidence. Maybe you thought “men as outdoorsy” was of importance, but when you look hard, you note that women are presented as outdoorsy just as often. You just hadn’t paid attention. It is very important, as any kind of researcher but particularly as a qualitative researcher, to test yourself and your emerging interpretations in this way.

The Fourth and Final Step: The Write-Up

Only after you have fully completed analysis, with its many rounds of review and analysis, will you be able to write about what you found. The interpretation exists not in the data but in your analysis of the data. Before writing your results, you will want to very clearly describe how you chose the data here and all the possible limitations of this data (e.g., historical-trace problem or power problem; see chapter 16). Acknowledge any limitations of your sample. Describe the audience for the content, and discuss the implications of this. Once you have done all of this, you can put forth your interpretation of the communication of the content, linking to theory where doing so would help your readers understand your findings and what they mean more generally for our understanding of how the social world works. [3]

Analyzing Content: Helpful Hints and Pointers

Although every data set is unique and each researcher will have a different and unique research question to address with that data set, there are some common practices and conventions. When reviewing your data, what do you look at exactly? How will you know if you have seen a pattern? How do you note or mark your data?

Let’s start with the last question first. If your data is stored digitally, there are various ways you can highlight or mark up passages. You can, of course, do this with literal highlighters, pens, and pencils if you have print copies. But there are also qualitative software programs to help you store the data, retrieve the data, and mark the data. This can simplify the process, although it cannot do the work of analysis for you.

Qualitative software can be very expensive, so the first thing to do is to find out if your institution (or program) has a universal license its students can use. If it does not, most programs have special student licenses that are less expensive. The two most used programs at this moment are probably ATLAS.ti and NVivo. Both can cost more than $500 [4] but provide everything you could possibly need for storing data, content analysis, and coding. They also have a lot of customer support, and you can find many official and unofficial tutorials on how to use the programs’ features on the web. Dedoose, created by academic researchers at UCLA, is a decent program that lacks many of the bells and whistles of the two big programs. Instead of paying all at once, you pay monthly, as you use the program. The monthly fee is relatively affordable (less than $15), so this might be a good option for a small project. HyperRESEARCH is another basic program created by academic researchers, and it is free for small projects (those that have limited cases and material to import). You can pay a monthly fee if your project expands past the free limits. I have personally used all four of these programs, and they each have their pluses and minuses.

Regardless of which program you choose, you should know that none of them will actually do the hard work of analysis for you. They are incredibly useful for helping you store and organize your data, and they provide abundant tools for marking, comparing, and coding your data so you can make sense of it. But making sense of it will always be your job alone.

So let’s say you have some software, and you have uploaded all of your content into the program: video clips, photographs, transcripts of news stories, articles from magazines, even digital copies of college scrapbooks. Now what do you do? What are you looking for? How do you see a pattern? The answers to these questions will depend partially on the particular research question you have, or at least the motivation behind your research. Let’s go back to the idea of looking at gender presentations in magazines from the 1950s to the 1970s. Here are some things you can look at and code in the content: (1) actions and behaviors, (2) events or conditions, (3) activities, (4) strategies and tactics, (5) states or general conditions, (6) meanings or symbols, (7) relationships/interactions, (8) consequences, and (9) settings. Table 17.1 lists these with examples from our gender presentation study.

Table 17.1. Examples of What to Note During Content Analysis

One thing to note about the examples in table 17.1: sometimes we note (mark, record, code) a single example, while other times, as in “settings,” we are recording a recurrent pattern. To help you spot patterns, it is useful to mark every setting, including a notation on gender. Using software can help you do this efficiently. You can then call up “setting by gender” and note this emerging pattern. There’s an element of counting here, which we normally think of as quantitative data analysis, but we are using the count to identify a pattern that will be used to help us interpret the communication. Content analyses often include counting as part of the interpretive (qualitative) process.
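To make the "setting by gender" idea concrete, here is a minimal Python sketch of the kind of tally a qualitative program produces when you call up two codes together. The coded segments are invented for illustration (they are not data from any real magazine sample), and dedicated software would do this through its own query tools rather than a script.

```python
from collections import Counter

# Hypothetical coded segments: (setting, gender) pairs a researcher
# might record while marking magazine images. Invented for illustration.
coded_segments = [
    ("kitchen", "woman"),
    ("office", "man"),
    ("kitchen", "woman"),
    ("outdoors", "man"),
    ("office", "woman"),
    ("kitchen", "man"),
    ("office", "man"),
]

# Tally how often each setting co-occurs with each gender.
setting_by_gender = Counter(coded_segments)

for (setting, gender), n in sorted(setting_by_gender.items()):
    print(f"{setting:10s} {gender:6s} {n}")
```

The counts only point you toward a possible pattern; deciding what the pattern means is still interpretive work.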

In your own study, you may not need or want to look at all of the elements listed in table 17.1. Even in our imagined example, some are more useful than others. For example, “strategies and tactics” is a bit of a stretch here. In studies that are looking specifically at, say, policy implementation or social movements, this category will prove much more salient.

Another way to think about “what to look at” is to consider aspects of your content in terms of units of analysis. You can drill down to the specific words used (e.g., the adjectives commonly used to describe “men” and “women” in your magazine sample) or move up to the more abstract level of concepts used (e.g., the idea that men are more rational than women). Counting for the purpose of identifying patterns is particularly useful here. How many times is that idea of women’s irrationality communicated? How is it communicated (in comic strips, fictional stories, editorials, etc.)? Does the incidence of the concept change over time? Perhaps the “irrational woman” was everywhere in the 1950s, but by the 1970s, it is no longer showing up in stories and comics. By tracing its usage and prevalence over time, you might come up with a theory or story about gender presentation during the period. Table 17.2 provides more examples of using different units of analysis for this work along with suggestions for effective use.
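As a rough illustration of tracing a concept's prevalence over time, the following Python sketch groups coded items by decade and reports the share in which the concept was marked. The records and the function name `share_by_decade` are assumptions made up for this example, not part of any real study.

```python
from collections import defaultdict

# Hypothetical records: (publication year, was the "irrational woman"
# concept coded in this item?). Invented for illustration.
coded_items = [
    (1952, True), (1955, True), (1958, False),
    (1961, True), (1966, False), (1969, False),
    (1972, False), (1975, False), (1978, True),
]

def share_by_decade(items):
    """Return {decade: share of items in which the concept was coded}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for year, present in items:
        decade = (year // 10) * 10
        totals[decade] += 1
        hits[decade] += int(present)
    return {d: hits[d] / totals[d] for d in sorted(totals)}

shares = share_by_decade(coded_items)
for decade, share in shares.items():
    print(f"{decade}s: {share:.0%}")
```

Reporting a share rather than a raw count helps when you have sampled more items from one decade than another.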

Table 17.2. Examples of Unit of Analysis in Content Analysis

Every qualitative content analysis is unique in its particular focus and particular data used, so there is no single correct way to approach analysis. You should have a better idea, however, of what kinds of things to look at and what to look for. The next two chapters will take you further into the coding process, the primary analytical tool for qualitative research in general.

Further Readings

Cidell, Julie. 2010. “Content Clouds as Exploratory Qualitative Data Analysis.” Area 42(4):514–523. A demonstration of using visual “content clouds” as a form of exploratory qualitative data analysis using transcripts of public meetings and content of newspaper articles.

Hsieh, Hsiu-Fang, and Sarah E. Shannon. 2005. “Three Approaches to Qualitative Content Analysis.” Qualitative Health Research 15(9):1277–1288. Distinguishes three distinct approaches to QCA: conventional, directed, and summative. Uses hypothetical examples from end-of-life care research.

Jackson, Romeo, Alex C. Lange, and Antonio Duran. 2021. “A Whitened Rainbow: The In/Visibility of Race and Racism in LGBTQ Higher Education Scholarship.” Journal Committed to Social Change on Race and Ethnicity (JCSCORE) 7(2):174–206.* Using a “critical summative content analysis” approach, examines research published on LGBTQ people between 2009 and 2019.

Krippendorff, Klaus. 2018. Content Analysis: An Introduction to Its Methodology. 4th ed. Thousand Oaks, CA: SAGE. A very comprehensive textbook on both quantitative and qualitative forms of content analysis.

Mayring, Philipp. 2022. Qualitative Content Analysis: A Step-by-Step Guide. Thousand Oaks, CA: SAGE. Formulates an eight-step approach to QCA.

Messinger, Adam M. 2012. “Teaching Content Analysis through ‘Harry Potter.’” Teaching Sociology 40(4):360–367. This is a fun example of a relatively brief foray into content analysis using the music found in Harry Potter films.

Neuendorf, Kimberly A. 2002. The Content Analysis Guidebook. Thousand Oaks, CA: SAGE. Although a helpful guide to content analysis in general, be warned that this textbook definitely favors quantitative over qualitative approaches to content analysis.

Schreier, Margrit. 2012. Qualitative Content Analysis in Practice. Thousand Oaks, CA: SAGE. Arguably the most accessible guidebook for QCA, written by a professor based in Germany.

Weber, Matthew A., Shannon Caplan, Paul Ringold, and Karen Blocksom. 2017. “Rivers and Streams in the Media: A Content Analysis of Ecosystem Services.” Ecology and Society 22(3).* Examines the content of a blog hosted by National Geographic and articles published in The New York Times and the Wall Street Journal for stories on rivers and streams (e.g., water quality, flooding).

  • There are ways of handling content analysis quantitatively, however. Some practitioners therefore specify qualitative content analysis (QCA). In this chapter, all content analysis is QCA unless otherwise noted. ↵
  • Note that some qualitative software allows you to upload whole films or film clips for coding. You will still have to get access to the film, of course. ↵
  • See chapter 20 for more on the final presentation of research. ↵
  • Actually, ATLAS.ti is an annual license, while NVivo is a perpetual license, but both are going to cost you at least $500 to use. Student rates may be lower. And don’t forget to ask your institution or program if they already have a software license you can use. ↵

A method of both data collection and data analysis in which a given content (textual, visual, graphic) is examined systematically and rigorously to identify meanings, themes, patterns and assumptions.  Qualitative content analysis (QCA) is concerned with gathering and interpreting an existing body of material.    

Introduction to Qualitative Research Methods Copyright © 2023 by Allison Hurst is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License , except where otherwise noted.


Content Analysis | A Step-by-Step Guide with Examples

Published on 5 May 2022 by Amy Luo. Revised on 5 December 2022.

Content analysis is a research method used to identify patterns in recorded communication. To conduct content analysis, you systematically collect data from a set of texts, which can be written, oral, or visual:

  • Books, newspapers, and magazines
  • Speeches and interviews
  • Web content and social media posts
  • Photographs and films

Content analysis can be both quantitative (focused on counting and measuring) and qualitative (focused on interpreting and understanding). In both types, you categorise or ‘code’ words, themes, and concepts within the texts and then analyse the results.

Table of contents

  • What is content analysis used for?
  • Advantages of content analysis
  • Disadvantages of content analysis
  • How to conduct content analysis

What is content analysis used for?

Researchers use content analysis to find out about the purposes, messages, and effects of communication content. They can also make inferences about the producers and audience of the texts they analyse.

Content analysis can be used to quantify the occurrence of certain words, phrases, subjects, or concepts in a set of historical or contemporary texts.

In addition, content analysis can be used to make qualitative inferences by analysing the meaning and semantic relationship of words and concepts.

Because content analysis can be applied to a broad range of texts, it is used in a variety of fields, including marketing, media studies, anthropology, cognitive science, psychology, and many social science disciplines. It has various possible goals:

  • Finding correlations and patterns in how concepts are communicated
  • Understanding the intentions of an individual, group, or institution
  • Identifying propaganda and bias in communication
  • Revealing differences in communication in different contexts
  • Analysing the consequences of communication content, such as the flow of information or audience responses

Advantages of content analysis

  • Unobtrusive data collection

You can analyse communication and social interaction without the direct involvement of participants, so your presence as a researcher doesn’t influence the results.

  • Transparent and replicable

When done well, content analysis follows a systematic procedure that can easily be replicated by other researchers, yielding results with high reliability.

  • Highly flexible

You can conduct content analysis at any time, in any location, and at low cost. All you need is access to the appropriate sources.

Disadvantages of content analysis

  • Reductive

Focusing on words or phrases in isolation can sometimes be overly reductive, disregarding context, nuance, and ambiguous meanings.

  • Subjective

Content analysis almost always involves some level of subjective interpretation, which can affect the reliability and validity of the results and conclusions.

  • Time intensive

Manually coding large volumes of text is extremely time-consuming, and it can be difficult to automate effectively.

How to conduct content analysis

If you want to use content analysis in your research, you need to start with a clear, direct research question.

Next, you follow these five steps.

Step 1: Select the content you will analyse

Based on your research question, choose the texts that you will analyse. You need to decide:

  • The medium (e.g., newspapers, speeches, or websites) and genre (e.g., opinion pieces, political campaign speeches, or marketing copy)
  • The criteria for inclusion (e.g., newspaper articles that mention a particular event, speeches by a certain politician, or websites selling a specific type of product)
  • The parameters in terms of date range, location, etc.

If there are only a small number of texts that meet your criteria, you might analyse all of them. If there is a large volume of texts, you can select a sample.

Step 2: Define the units and categories of analysis

Next, you need to determine the level at which you will analyse your chosen texts. This means defining:

  • The unit(s) of meaning that will be coded. For example, are you going to record the frequency of individual words and phrases, the characteristics of people who produced or appear in the texts, the presence and positioning of images, or the treatment of themes and concepts?
  • The set of categories that you will use for coding. Categories can be objective characteristics (e.g., aged 30–40, lawyer, parent) or more conceptual (e.g., trustworthy, corrupt, conservative, family-oriented).

Step 3: Develop a set of rules for coding

Coding involves organising the units of meaning into the previously defined categories. Especially with more conceptual categories, it’s important to clearly define the rules for what will and won’t be included to ensure that all texts are coded consistently.

Coding rules are especially important if multiple researchers are involved, but even if you’re coding all of the text by yourself, recording the rules makes your method more transparent and reliable.
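One way to record coding rules transparently is to write them down as a small, explicit codebook. The Python sketch below is only an illustration under stated assumptions: the categories come from the examples above, but the rule wording, the indicator terms, and the helper `suggest_codes` are hypothetical, and real coding rules are usually richer than keyword lists.

```python
import re

# Hypothetical codebook: each category has a written rule and a list of
# indicator terms. Invented for illustration.
CODEBOOK = {
    "trustworthy": {
        "rule": "Code when the text praises honesty or reliability.",
        "indicators": ["honest", "reliable", "dependable"],
    },
    "corrupt": {
        "rule": "Code when the text alleges bribery or abuse of office.",
        "indicators": ["bribe", "kickback", "embezzle"],
    },
}

def suggest_codes(sentence):
    """Return the categories whose indicator terms appear in the sentence.

    Matching is on exact whole words, so "bribes" would not match "bribe".
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return sorted(
        category
        for category, entry in CODEBOOK.items()
        if words & set(entry["indicators"])
    )

print(suggest_codes("Voters called the mayor honest and dependable."))
print(suggest_codes("He was accused of taking a bribe."))
```

A script like this can only suggest candidate codes; the written rule is what a human coder applies when deciding whether a passage really belongs in the category.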

Step 4: Code the text according to the rules

You go through each text and record all relevant data in the appropriate categories. This can be done manually or aided with computer programs, such as QSR NVivo, Atlas.ti, and Diction, which can help speed up the process of counting and categorising words and phrases.

Step 5: Analyse the results and draw conclusions

Once coding is complete, the collected data is examined to find patterns and draw conclusions in response to your research question. You might use statistical analysis to find correlations or trends, discuss your interpretations of what the results mean, and make inferences about the creators, context, and audience of the texts.

Luo, A. (2022, December 05). Content Analysis | A Step-by-Step Guide with Examples. Scribbr. Retrieved 7 June 2024, from


How to do a content analysis


  • What is content analysis?
  • Why would you use a content analysis?
  • Types of content analysis
  • Conceptual content analysis
  • Relational content analysis
  • Reliability and validity
  • The advantages and disadvantages of content analysis
  • A step-by-step guide to conducting a content analysis
  • Step 1: Develop your research questions
  • Step 2: Choose the content you’ll analyze
  • Step 3: Identify your biases
  • Step 4: Define the units and categories of coding
  • Step 5: Develop a coding scheme
  • Step 6: Code the content
  • Step 7: Analyze the results
  • Frequently asked questions about content analysis

What is content analysis?

In research, content analysis is the process of analyzing content and its features with the aim of identifying patterns and the presence of words, themes, and concepts within the content. Simply put, content analysis is a research method that aims to present the trends, patterns, concepts, and ideas in content as objective, quantitative or qualitative data, depending on the specific use case.

As such, some of the objectives of content analysis include:

  • Simplifying complex, unstructured content.
  • Identifying trends, patterns, and relationships in the content.
  • Determining the characteristics of the content.
  • Identifying the intentions of individuals through the analysis of the content.
  • Identifying the implied aspects in the content.

Typically, when doing a content analysis, you’ll gather data not only from written text sources like newspapers, books, journals, and magazines but also from a variety of other oral and visual sources of content like:

  • Voice recordings, speeches, and interviews.
  • Web content, blogs, and social media content.
  • Films, videos, and photographs.

One of content analysis’s distinguishing features is that you'll be able to gather data for research without physically gathering data from participants. In other words, when doing a content analysis, you don't need to interact with people directly.

The process of doing a content analysis usually involves categorizing or coding concepts, words, and themes within the content and analyzing the results. We’ll look at the process in more detail below.

Why would you use a content analysis?

Typically, you’ll use content analysis when you want to:

  • Identify the intentions, communication trends, or communication patterns of an individual, a group of people, or even an institution.
  • Analyze and describe the behavioral and attitudinal responses of individuals to communications.
  • Determine the emotional or psychological state of an individual or a group of people.
  • Analyze the international differences in communication content.
  • Analyze audience responses to content.

Keep in mind, though, that these are just some examples of use cases where a content analysis might be appropriate and there are many others.

The key thing to remember is that content analysis will help you quantify the occurrence of specific words, phrases, themes, and concepts in content. Moreover, it can also be used when you want to make qualitative inferences out of the data by analyzing the semantic meanings and interrelationships between words, themes, and concepts.

Types of content analysis

In general, there are two types of content analysis: conceptual and relational analysis. Although these two types follow largely similar processes, their outcomes differ. As such, each of these types can provide different results, interpretations, and conclusions. With that in mind, let’s now look at these two types of content analysis in more detail.

Conceptual content analysis

With conceptual analysis, you’ll determine the existence of certain concepts within the content and identify their frequency. In other words, conceptual analysis involves counting the number of times a specific concept appears in the content.

Conceptual analysis is typically focused on explicit data, which means you’ll focus your analysis on a specific concept to identify its presence in the content and determine its frequency.

However, when conducting a content analysis, you can also use implicit data. This approach is more involved and complicated, and it requires the use of a dictionary, contextual translation rules, or a combination of both.

No matter what type you use, conceptual analysis brings an element of quantitative analysis into a qualitative approach to research.

Relational content analysis

Relational content analysis takes conceptual analysis a step further. The process starts in the same way, by identifying concepts in the content, but it focuses not on the frequency of these concepts but on the relationships between them and the context in which they appear.

Before starting with a relational analysis, you’ll first need to decide on which subcategory of relational analysis you’ll use:

  • Affect extraction: With this relational content analysis approach, you’ll evaluate concepts based on their emotional attributes. You’ll typically assess these emotions on a rating scale with higher values assigned to positive emotions and lower values to negative ones. In turn, this allows you to capture the emotions of the writer or speaker at the time the content is created. The main difficulty with this approach is that emotions can differ over time and across populations.
  • Proximity analysis: With this approach, you’ll identify concepts as in conceptual analysis, but you’ll evaluate the way in which they occur together in the content. In other words, proximity analysis allows you to analyze the relationship between concepts and derive a concept matrix from which you’ll be able to develop meaning. Proximity analysis is typically used when you want to extract facts from the content rather than contextual, emotional, or cultural factors.
  • Cognitive mapping: Finally, cognitive mapping can be used with affect extraction or proximity analysis. It’s a visualization technique that allows you to create a model that represents the overall meaning of content and presents it as a graphic map of the relationships between concepts. As such, it’s also commonly used when analyzing the changes in meanings, definitions, and terms over time.
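To give a feel for proximity analysis, here is a minimal Python sketch that counts how often pairs of concepts occur in the same coded unit, which is one simple way to build the kind of concept matrix mentioned above. The coded sentences and the function name `cooccurrence_counts` are hypothetical, and treating a sentence as the unit of proximity is an assumption; a study might instead use a fixed window of words or a paragraph.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded sentences: each is the set of concepts found in it.
coded_sentences = [
    {"health care", "cost"},
    {"health care", "coverage", "cost"},
    {"coverage", "family"},
    {"health care", "coverage"},
]

def cooccurrence_counts(units):
    """Count how often each pair of concepts appears in the same unit."""
    counts = Counter()
    for concepts in units:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(concepts), 2):
            counts[pair] += 1
    return counts

pairs = cooccurrence_counts(coded_sentences)
for (a, b), n in pairs.most_common():
    print(f"{a} + {b}: {n}")
```

The resulting pair counts could also serve as the edge weights of a cognitive map, with concepts as nodes.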

Reliability and validity

Now that we’ve seen what content analysis is and looked at the different types of content analysis, it’s important to understand how reliable it is as a research method. We’ll also look at what criteria impact the validity of a content analysis.

There are three criteria that determine the reliability of a content analysis:

  • Stability. Stability refers to the tendency of coders to consistently categorize or code the same data in the same way over time.
  • Reproducibility. This criterion refers to the tendency of a group of coders to classify category membership in the same way.
  • Accuracy. Accuracy refers to the extent to which the classification of content corresponds to a specific standard.
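Reproducibility can be checked numerically by having two coders code the same units and comparing their decisions. The guide does not name a particular statistic, so the Python sketch below assumes two common choices, simple percent agreement and Cohen's kappa (which corrects agreement for chance). The coder data is invented for illustration.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Share of units that both coders placed in the same category."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is observed agreement; p_e is the agreement expected by chance
    from each coder's category frequencies. Undefined when p_e == 1.
    """
    n = len(coder_a)
    p_o = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes assigned by two coders to the same eight units.
coder_a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
coder_b = ["pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

The same functions can be used to check stability: compare one coder's decisions on the same material at two points in time.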

Keep in mind, though, that because you’ll need to code or categorize the concepts you’ll aim to identify and analyze manually, you’ll never be able to eliminate human error. However, you’ll be able to minimize it.

In turn, three criteria determine the validity of a content analysis:

  • Closeness of categories. This is achieved by using multiple classifiers to get an agreed-upon definition for a specific category by using either implicit variables or synonyms. In this way, the category can be broadened to include more relevant data.
  • Conclusions. Here, it’s crucial to decide what level of implication will be allowable. In other words, it’s important to consider whether the conclusions are valid based on the data or whether they can be explained using some other phenomena.
  • Generalizability of the results of the analysis to a theory. Generalizability comes down to how you determine your categories as mentioned above and how reliable those categories are. In turn, this relies on how accurate the categories are at measuring the concepts or ideas that you’re looking to measure.

Considering everything mentioned above, there are definite advantages and disadvantages when it comes to content analysis:

A step-by-step guide to conducting a content analysis

Let’s now look at the steps you’ll need to follow when doing a content analysis.

Step 1: Develop your research questions

The first step will always be to formulate your research questions. This is simply because, without clear and defined research questions, you won’t know what question to answer and, by implication, won’t be able to code your concepts.

Step 2: Choose the content you’ll analyze

Based on your research questions, you’ll then need to decide what content you’ll analyze. Here, you’ll use three factors to find the right content:

  • The type of content. Here you’ll need to consider the various types of content you’ll use and their medium like, for example, blog posts, social media, newspapers, or online articles.
  • What criteria you’ll use for inclusion. Here you’ll decide what criteria you’ll use to include content. This can, for instance, be the mentioning of a certain event or advertising a specific product.
  • Your parameters. Here, you’ll decide what content you’ll include based on specified parameters in terms of date and location.

Step 3: Identify your biases

The next step is to consider your own pre-conception of the questions and identify your biases. This process is referred to as bracketing and allows you to be aware of your biases before you start your research with the result that they’ll be less likely to influence the analysis.

Step 4: Define the units and categories of coding

Your next step would be to define the units of meaning that you’ll code. This will, for example, be the number of times a concept appears in the content or the treatment of concepts, words, or themes in the content. You’ll then need to define the set of categories you’ll use for coding which can be either objective or more conceptual.

Step 5: Develop a coding scheme

Based on the above, you’ll then organize the units of meaning into your defined categories. Apart from this, your coding scheme will also determine how you’ll analyze the data.

Step 6: Code the content

The next step is to code the content. During this process, you’ll work through the content and record the data according to your coding scheme. It’s also here where conceptual and relational analysis starts to deviate in relation to the process you’ll need to follow.

As mentioned earlier, conceptual analysis aims to identify the number of times a specific concept, idea, word, or phrase appears in the content. So, here, you’ll need to decide what level of analysis you’ll implement.

In contrast, with relational analysis, you’ll need to decide what type of relational analysis you’ll use. So, you’ll need to determine whether you’ll use affect extraction, proximity analysis, cognitive mapping, or a combination of these approaches.

Step 7: Analyze the results

Once you’ve coded the data, you’ll be able to analyze it and draw conclusions from the data based on your research questions.

Content analysis offers an inexpensive and flexible way to identify trends and patterns in communication content. In addition, it’s unobtrusive which eliminates many ethical concerns and inaccuracies in research data. However, to be most effective, a content analysis must be planned and used carefully in order to ensure reliability and validity.

The two general types of content analysis are conceptual and relational analysis. Although these two types follow largely similar processes, their outcomes differ. As such, each of these types can provide different results, interpretations, and conclusions.

In qualitative research, coding means categorizing concepts, words, and themes within your content to create a basis for analyzing the results. While coding, you work through the content and record the data according to your coding scheme.

Content analysis is the process of analyzing content and its features with the aim of identifying patterns and the presence of words, themes, and concepts within the content. The goal of a content analysis is to present the trends, patterns, concepts, and ideas in content as objective, quantitative or qualitative data, depending on the specific use case.

Content analysis is a qualitative method of data analysis and can be used in many different fields. It is particularly popular in the social sciences.

It is possible to do qualitative analysis without coding, but content analysis as a method of qualitative analysis requires coding or categorizing data to then analyze it according to your coding scheme in the next step.


Using Content Analysis

This guide provides an introduction to content analysis, a research methodology that examines words or phrases within a wide range of texts.

  • Introduction to Content Analysis : Read about the history and uses of content analysis.
  • Conceptual Analysis : Read an overview of conceptual analysis and its associated methodology.
  • Relational Analysis : Read an overview of relational analysis and its associated methodology.
  • Commentary : Read about issues of reliability and validity with regard to content analysis as well as the advantages and disadvantages of using content analysis as a research methodology.
  • Examples : View examples of real and hypothetical studies that use content analysis.
  • Annotated Bibliography : Complete list of resources used in this guide and beyond.

An Introduction to Content Analysis

Content analysis is a research tool used to determine the presence of certain words or concepts within texts or sets of texts. Researchers quantify and analyze the presence, meanings and relationships of such words and concepts, then make inferences about the messages within the texts, the writer(s), the audience, and even the culture and time of which these are a part. Texts can be defined broadly as books, book chapters, essays, interviews, discussions, newspaper headlines and articles, historical documents, speeches, conversations, advertising, theater, informal conversation, or really any occurrence of communicative language. Texts in a single study may also represent a variety of different types of occurrences, such as Palmquist's 1990 study of two composition classes, in which he analyzed student and teacher interviews, writing journals, classroom discussions and lectures, and out-of-class interaction sheets. To conduct a content analysis on any such text, the text is coded, or broken down, into manageable categories on a variety of levels--word, word sense, phrase, sentence, or theme--and then examined using one of content analysis' basic methods: conceptual analysis or relational analysis.

A Brief History of Content Analysis

Historically, content analysis was a time-consuming process. Analysis was done manually, or slow mainframe computers were used to analyze punch cards containing data punched in by human coders. Single studies could employ thousands of these cards. Human error and time constraints made this method impractical for large texts. However, despite its impracticality, content analysis was already an often utilized research method by the 1940s. Although initially limited to studies that examined texts for the frequency of the occurrence of identified terms (word counts), by the mid-1950s researchers were already starting to consider the need for more sophisticated methods of analysis, focusing on concepts rather than simply words, and on semantic relationships rather than just presence (de Sola Pool 1959). While both traditions still continue today, content analysis now is also utilized to explore mental models, and their linguistic, affective, cognitive, social, cultural and historical significance.

Uses of Content Analysis

Perhaps due to the fact that it can be applied to examine any piece of writing or occurrence of recorded communication, content analysis is currently used in a dizzying array of fields, ranging from marketing and media studies, to literature and rhetoric, ethnography and cultural studies, gender and age issues, sociology and political science, psychology and cognitive science, and many other fields of inquiry. Additionally, content analysis reflects a close relationship with socio- and psycholinguistics, and is playing an integral role in the development of artificial intelligence. The following list (adapted from Berelson, 1952) offers more possibilities for the uses of content analysis:

  • Reveal international differences in communication content
  • Detect the existence of propaganda
  • Identify the intentions, focus or communication trends of an individual, group or institution
  • Describe attitudinal and behavioral responses to communications
  • Determine psychological or emotional state of persons or groups

Types of Content Analysis

In this guide, we discuss two general categories of content analysis: conceptual analysis and relational analysis. Conceptual analysis can be thought of as establishing the existence and frequency of concepts most often represented by words or phrases in a text. For instance, say you have a hunch that your favorite poet often writes about hunger. With conceptual analysis you can determine how many times words such as hunger, hungry, famished, or starving appear in a volume of poems. In contrast, relational analysis goes one step further by examining the relationships among concepts in a text. Returning to the hunger example, with relational analysis, you could identify what other words or phrases hunger or famished appear next to and then determine what different meanings emerge as a result of these groupings.
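The hunger example can be sketched in a few lines of Python. This is only an illustration of the counting step in conceptual analysis: the verse below is invented as a stand-in for a volume of poems, and the term list and the helper `count_terms` are assumptions for the example.

```python
import re
from collections import Counter

# Terms taken to represent the concept "hunger" in this sketch.
HUNGER_TERMS = {"hunger", "hungry", "famished", "starving"}

# Invented lines standing in for a volume of poems.
poems = """
Hunger walks the empty street,
the hungry moon above it.
I am famished for the morning,
and the hungry sea replies.
"""

def count_terms(text, terms):
    """Count occurrences of each term, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w in terms)

counts = count_terms(poems, HUNGER_TERMS)
print(counts)
print("total mentions of the concept:", sum(counts.values()))
```

A relational analysis would go on to ask which words appear near these terms, rather than stopping at the tally.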

Conceptual Analysis

Traditionally, content analysis has most often been thought of in terms of conceptual analysis. In conceptual analysis, a concept is chosen for examination, and the analysis involves quantifying and tallying its presence. Also known as thematic analysis [although this term is somewhat problematic, given its varied definitions in current literature--see Palmquist, Carley, & Dale (1997) vis-a-vis Smith (1992)], the focus here is on looking at the occurrence of selected terms within a text or texts, although the terms may be implicit as well as explicit. While explicit terms obviously are easy to identify, coding for implicit terms and deciding their level of implication is complicated by the need to base judgments on a somewhat subjective system. To attempt to limit the subjectivity, then (as well as to limit problems of reliability and validity), coding such implicit terms usually involves the use of either a specialized dictionary or contextual translation rules. And sometimes, both tools are used--a trend reflected in recent versions of the Harvard and Lasswell dictionaries.

Methods of Conceptual Analysis

Conceptual analysis begins with identifying research questions and choosing a sample or samples. Once chosen, the text must be coded into manageable content categories. The process of coding is basically one of selective reduction. By reducing the text to categories consisting of a word, set of words or phrases, the researcher can focus on, and code for, specific words or patterns that are indicative of the research question.

An example of a conceptual analysis would be to examine several Clinton speeches on health care, made during the 1992 presidential campaign, and code them for the existence of certain words. In looking at these speeches, the research question might involve examining the number of positive words used to describe Clinton's proposed plan, and the number of negative words used to describe the current status of health care in America. The researcher would be interested only in quantifying these words, not in examining how they are related, which is a function of relational analysis. In conceptual analysis, the researcher simply wants to examine presence with respect to his/her research question, i.e. is there a stronger presence of positive or negative words used with respect to proposed or current health care plans, respectively.

Once the research question has been established, the researcher must make his/her coding choices with respect to the eight category coding steps indicated by Carley (1992).

Steps for Conducting Conceptual Analysis

The following discussion of steps that can be followed to code a text or set of texts during conceptual analysis uses campaign speeches made by Bill Clinton during the 1992 presidential campaign as an example. Each step is described in turn below:

  • Decide the level of analysis.

First, the researcher must decide upon the level of analysis. With the health care speeches, to continue the example, the researcher must decide whether to code for a single word, such as "inexpensive," or for sets of words or phrases, such as "coverage for everyone."

  • Decide how many concepts to code for.

The researcher must now decide how many different concepts to code for. This involves developing a pre-defined or interactive set of concepts and categories. The researcher must decide whether or not to code for every single positive or negative word that appears, or only certain ones that the researcher determines are most relevant to health care. Then, with this pre-defined number set, the researcher has to determine how much flexibility he/she allows him/herself when coding. The question of whether the researcher codes only from this pre-defined set, or allows him/herself to add relevant categories not included in the set as he/she finds them in the text, must be answered. Determining a certain number and set of concepts allows a researcher to examine a text for very specific things, keeping him/her on task. But introducing a level of coding flexibility allows new, important material to be incorporated into the coding process that could have significant bearings on one's results.

  • Decide whether to code for existence or frequency of a concept.

After a certain number and set of concepts are chosen for coding, the researcher must answer a key question: is he/she going to code for existence or frequency? This is important, because it changes the coding process. When coding for existence, "inexpensive" would only be counted once, no matter how many times it appeared. This would be a very basic coding process and would give the researcher a very limited perspective of the text. However, the number of times "inexpensive" appears in a text might be more indicative of importance. Knowing that "inexpensive" appeared 50 times, for example, compared to 15 appearances of "coverage for everyone," might lead a researcher to interpret that Clinton is trying to sell his health care plan based more on economic benefits, not comprehensive coverage. Knowing that "inexpensive" appeared, but not that it appeared 50 times, would not allow the researcher to make this interpretation, regardless of whether it is valid or not.
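
The difference between the two coding choices can be sketched in a few lines of Python. This is a minimal illustration, not a full content analysis program: the function names and the sample passage are invented for the example (the passage is not a real Clinton speech), and matching is assumed to be case-insensitive on whole words or phrases.

```python
import re

def count_phrase(text, phrase):
    """Count whole-word, case-insensitive occurrences of a word or phrase."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

def code_text(text, concepts):
    """Return two codings: existence (True/False) and frequency (count)."""
    frequency = {c: count_phrase(text, c) for c in concepts}
    existence = {c: n > 0 for c, n in frequency.items()}
    return existence, frequency

# Toy passage invented for illustration only.
sample = (
    "Our plan is inexpensive. It offers coverage for everyone. "
    "An inexpensive plan is a fair plan, and inexpensive care is good care."
)
concepts = ["inexpensive", "coverage for everyone"]

existence, frequency = code_text(sample, concepts)
print(existence)   # both concepts are simply marked as present
print(frequency)   # "inexpensive" is counted 3 times, the phrase once
```

Coded for existence, the two concepts look identical; only the frequency coding shows that one is used more often than the other.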

  • Decide on how you will distinguish among concepts.

The researcher must next decide on the level of generalization, i.e. whether concepts are to be coded exactly as they appear, or if they can be recorded as the same even when they appear in different forms. For example, "expensive" might also appear as "expensiveness." The researcher needs to determine if the two words mean radically different things to him/her, or if they are similar enough that they can be coded as being the same thing, i.e. "expensive words." In line with this is the need to determine the level of implication one is going to allow. This entails more than subtle differences in tense or spelling, as with "expensive" and "expensiveness." Determining the level of implication would allow the researcher to code not only for the word "expensive," but also for words that imply "expensive." This could perhaps include technical words, jargon, or political euphemisms, such as "economically challenging," that the researcher decides do not merit a separate category, but are better represented under the category "expensive," due to their implicit meaning of "expensive."

  • Develop rules for coding your texts.

After taking the generalization of concepts into consideration, a researcher will want to create translation rules that will allow him/her to streamline and organize the coding process so that he/she is coding for exactly what he/she wants to code for. Developing a set of rules helps the researcher ensure that he/she is coding things consistently throughout the text, in the same way every time. If a researcher coded "economically challenging" as a separate category from "expensive" in one paragraph, then coded it under the umbrella of "expensive" when it occurred in the next paragraph, his/her data would be invalid. The interpretations drawn from that data will subsequently be invalid as well. Translation rules protect against this and give the coding process a crucial level of consistency and coherence.
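
One way to picture translation rules is as a lookup table that maps every surface form to exactly one category, applied identically to the whole text. The sketch below assumes a hypothetical rule set built from the examples above; the rule table, function name and sample sentence are illustrative inventions, not part of any published coding dictionary.

```python
import re
from collections import Counter

# Hypothetical translation rules: each surface form maps to one category.
TRANSLATION_RULES = {
    "expensive": "expensive",
    "expensiveness": "expensive",             # different form, same concept
    "economically challenging": "expensive",  # implicit term
    "inexpensive": "inexpensive",
}

def apply_rules(text, rules):
    """Code a text by applying every rule in the same way throughout."""
    counts = Counter()
    remaining = text.lower()
    # Longer forms first, so a phrase is consumed before any shorter
    # form that might be contained in it.
    for form in sorted(rules, key=len, reverse=True):
        pattern = r"\b" + re.escape(form) + r"\b"
        remaining, n = re.subn(pattern, " ", remaining)
        if n:
            counts[rules[form]] += n
    return counts

# Toy sentence invented for illustration only.
sample = (
    "Care is expensive. The expensiveness of care is economically "
    "challenging for families, so we propose an inexpensive plan."
)
coded = apply_rules(sample, TRANSLATION_RULES)
print(coded["expensive"], coded["inexpensive"])
```

Because the rules live in one table, "economically challenging" cannot be coded under "expensive" in one paragraph and as a separate category in the next.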

  • Decide what to do with "irrelevant" information.

The next choice a researcher must make involves irrelevant information. The researcher must decide whether irrelevant information should be ignored (as Weber, 1990, suggests), or used to reexamine and/or alter the coding scheme. In the case of this example, words like "and" and "the," as they appear by themselves, would be ignored. They add nothing to the quantification of words like "inexpensive" and "expensive" and can be disregarded without impacting the outcome of the coding.

  • Code the texts.

Once these choices about irrelevant information are made, the next step is to code the text. This is done either by hand, i.e. reading through the text and manually writing down concept occurrences, or through the use of various computer programs. Coding with a computer is one of contemporary conceptual analysis' greatest assets. By inputting one's categories, content analysis programs can easily automate the coding process and examine huge amounts of data, and a wider range of texts, quickly and efficiently. But automation is very dependent on the researcher's preparation and category construction. When coding is done manually, a researcher can recognize errors far more easily. A computer is only a tool and can only code based on the information it is given. This problem is most apparent when coding for implicit information, where category preparation is essential for accurate coding.

  • Analyze your results.

Once the coding is done, the researcher examines the data and attempts to draw whatever conclusions and generalizations are possible. Of course, before these can be drawn, the researcher must decide what to do with the information in the text that is not coded. One's options include either deleting or skipping over unwanted material, or viewing all information as relevant and important and using it to reexamine, reassess and perhaps even alter one's coding scheme. Furthermore, given that the conceptual analyst is dealing only with quantitative data, the levels of interpretation and generalizability are very limited. The researcher can only extrapolate as far as the data will allow. But it is possible to see trends, for example, that are indicative of much larger ideas. Using the example from step three, if the concept "inexpensive" appears 50 times, compared to 15 appearances of "coverage for everyone," then the researcher can pretty safely extrapolate that there does appear to be a greater emphasis on the economics of the health care plan, as opposed to its universal coverage for all Americans. It must be kept in mind that conceptual analysis, while extremely useful and effective for providing this type of information when done right, is limited by its focus and the quantitative nature of its examination. To more fully explore the relationships that exist between these concepts, one must turn to relational analysis.

Relational Analysis

Relational analysis, like conceptual analysis, begins with the act of identifying concepts present in a given text or set of texts. However, relational analysis seeks to go beyond presence by exploring the relationships between the concepts identified. Relational analysis has also been termed semantic analysis (Palmquist, Carley, & Dale, 1997). In other words, the focus of relational analysis is to look for semantic, or meaningful, relationships. Individual concepts, in and of themselves, are viewed as having no inherent meaning. Rather, meaning is a product of the relationships among concepts in a text. Carley (1992) asserts that concepts are "ideational kernels;" these kernels can be thought of as symbols which acquire meaning through their connections to other symbols.

Theoretical Influences on Relational Analysis

The kind of analysis that researchers employ will vary significantly according to their theoretical approach. Key theoretical approaches that inform content analysis include linguistics and cognitive science.

Linguistic approaches to content analysis focus analysis of texts on the level of a linguistic unit, typically single clause units. One example of this type of research is Gottschalk (1975), who developed an automated procedure which analyzes each clause in a text and assigns it a numerical score based on several emotional/psychological scales. Another technique is to code a text grammatically into clauses and parts of speech to establish a matrix representation (Carley, 1990).

Approaches that derive from cognitive science include the creation of decision maps and mental models. Decision maps attempt to represent the relationship(s) between ideas, beliefs, attitudes, and information available to an author when making a decision within a text. These relationships can be represented as logical, inferential, causal, sequential, and mathematical relationships. Typically, two of these links are compared in a single study, and are analyzed as networks. For example, Heise (1987) used logical and sequential links to examine symbolic interaction. This methodology is thought of as a more generalized cognitive mapping technique, rather than the more specific mental models approach.

Mental models are groups or networks of interrelated concepts that are thought to reflect conscious or subconscious perceptions of reality. According to cognitive scientists, internal mental structures are created as people draw inferences and gather information about the world. Mental models are a more specific approach to mapping because, beyond extraction and comparison, they can be numerically and graphically analyzed. Such models rely heavily on the use of computers to help analyze and construct mapping representations. Typically, studies based on this approach follow five general steps:

  • Identifying concepts
  • Defining relationship types
  • Coding the text on the basis of the concepts and relationship types from the first two steps
  • Coding the statements
  • Graphically displaying and numerically analyzing the resulting maps

To create the model, a researcher converts a text into a map of concepts and relations; the map is then analyzed on the level of concepts and statements, where a statement consists of two concepts and their relationship. Carley (1990) asserts that this makes possible the comparison of a wide variety of maps, representing multiple sources, implicit and explicit information, as well as socially shared cognitions.

Relational Analysis: Overview of Methods

As with other sorts of inquiry, initial choices with regard to what is being studied and/or coded for often determine the possibilities of that particular study. For relational analysis, it is important to first decide which concept type(s) will be explored in the analysis. Studies have been conducted with as few as one and as many as 500 concept categories. Obviously, too many categories may obscure your results and too few can lead to unreliable and potentially invalid conclusions. Therefore, it is important to allow the context and necessities of your research to guide your coding procedures.

The steps to relational analysis that we consider in this guide suggest some of the possible avenues available to a researcher doing content analysis. We provide an example to make the process easier to grasp. However, the choices made within the context of the example are only a few of many possibilities. The diversity of techniques available suggests that there is quite a bit of enthusiasm for this mode of research. Once a procedure is rigorously tested, it can be applied and compared across populations over time. The process of relational analysis has achieved a high degree of computer automation but is still, like most forms of research, time consuming. Perhaps the strongest claim that can be made is that it maintains a high degree of statistical rigor without losing the richness of detail apparent in even more qualitative methods.

Three Subcategories of Relational Analysis

Affect extraction: This approach provides an emotional evaluation of concepts explicit in a text. It is problematic because emotion may vary across time and populations. Nevertheless, when extended it can be a potent means of exploring the emotional/psychological state of the speaker and/or writer. Gottschalk (1995) provides an example of this type of analysis. By assigning concepts identified a numeric value on corresponding emotional/psychological scales that can then be statistically examined, Gottschalk claims that the emotional/psychological state of the speaker or writer can be ascertained via their verbal behavior.

Proximity analysis: This approach, on the other hand, is concerned with the co-occurrence of explicit concepts in the text. In this procedure, the text is defined as a string of words. A given length of words, called a window, is determined. The window is then scanned across a text to check for the co-occurrence of concepts. The result is the creation of a concept matrix. In other words, a matrix, or a group of interrelated, co-occurring concepts, might suggest a certain overall meaning. The technique is problematic because the window records only explicit concepts and treats meaning as proximal co-occurrence. Other techniques such as clustering, grouping, and scaling are also useful in proximity analysis.
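
The window scan can be sketched as follows. This is only one plausible reading of the procedure, under stated assumptions: the text is split on whitespace, the window advances one word at a time, and a pair of concepts is counted once for every window position that contains both. The function name and the sample text are invented for the example.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_matrix(tokens, concepts, window):
    """Count, per concept pair, the window positions that contain both."""
    concepts = set(concepts)
    pairs = Counter()
    # Slide the window across the text one token at a time.
    for start in range(max(1, len(tokens) - window + 1)):
        present = sorted(concepts.intersection(tokens[start:start + window]))
        for a, b in combinations(present, 2):
            pairs[(a, b)] += 1
    return pairs

# Toy text invented for illustration only.
tokens = (
    "the plan lowers cost and the plan expands coverage "
    "while cost stays low"
).split()

matrix = cooccurrence_matrix(tokens, {"plan", "cost", "coverage"}, window=4)
for pair, count in sorted(matrix.items()):
    print(pair, count)
```

Because the windows overlap, concepts that sit close together are counted in several positions, so the counts act as a rough measure of proximity. Note that only the explicit words "plan," "cost" and "coverage" are ever recorded, which is exactly the limitation described above.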

Cognitive mapping: This approach is one that allows for further analysis of the results from the two previous approaches. It attempts to take the above processes one step further by representing these relationships visually for comparison. Whereas affective and proximal analysis function primarily within the preserved order of the text, cognitive mapping attempts to create a model of the overall meaning of the text. This can be represented as a graphic map that represents the relationships between concepts.

In this manner, cognitive mapping lends itself to the comparison of semantic connections across texts. This is known as map analysis, which allows for comparisons to explore "how meanings and definitions shift across people and time" (Palmquist, Carley, & Dale, 1997). Maps can depict a variety of different mental models (such as that of the text, the writer/speaker, or the social group/period), according to the focus of the researcher. This variety is indicative of the theoretical assumptions that support mapping: mental models are representations of interrelated concepts that reflect conscious or subconscious perceptions of reality; language is the key to understanding these models; and these models can be represented as networks (Carley, 1990). Given these assumptions, it's not surprising to see how closely this technique reflects the cognitive concerns of socio- and psycholinguistics, and lends itself to the development of artificial intelligence models.

Steps for Conducting Relational Analysis

The following discussion covers the steps (or, perhaps more accurately, strategies) that can be followed to code a text or set of texts during relational analysis. These explanations are accompanied by examples of relational analysis possibilities for statements made by Bill Clinton during the 1998 hearings.

  • Identify the Question.

The question is important because it indicates where you are headed and why. Without a focused question, the concept types and options open to interpretation are limitless and therefore the analysis difficult to complete. Possibilities for the Hairy Hearings of 1998 might be:

What did Bill Clinton say in the speech? OR What concrete information did he present to the public?

  • Choose a sample or samples for analysis.

Once the question has been identified, the researcher must select sections of text/speech from the hearings in which Bill Clinton may have not told the entire truth or is obviously holding back information. For relational content analysis, the primary consideration is how much information to preserve for analysis. One must be careful not to limit the results by doing so, but the researcher must also take special care not to take on so much that the coding process becomes too heavy and extensive to supply worthwhile results.

  • Determine the type of analysis.

Once the sample has been chosen for analysis, it is necessary to determine what type or types of relationships you would like to examine. There are different subcategories of relational analysis that can be used to examine the relationships in texts.

In this example, we will use proximity analysis because it is concerned with the co-occurrence of explicit concepts in the text. In this instance, we are not particularly interested in affect extraction because we are trying to get to the hard facts of what exactly was said rather than determining the emotional considerations of speaker and receivers surrounding the speech which may be unrecoverable.

Once the subcategory of analysis is chosen, the selected text must be reviewed to determine the level of analysis. The researcher must decide whether to code for a single word, such as "perhaps," or for sets of words or phrases like "I may have forgotten."

  • Reduce the text to categories and code for words or patterns.

At the simplest level, a researcher can code merely for existence. This is not to say that simplicity of procedure leads to simplistic results. Many studies have successfully employed this strategy. For example, Palmquist (1990) did not attempt to establish the relationships among concept terms in the classrooms he studied; his study did, however, look at the change in the presence of concepts over the course of the semester, comparing a map analysis from the beginning of the semester to one constructed at the end. On the other hand, the requirement of one's specific research question may necessitate deeper levels of coding to preserve greater detail for analysis.

In relation to our extended example, the researcher might code for how often Bill Clinton used words that were ambiguous, held double meanings, or left an opening for change or "re-evaluation." The researcher might also choose to code for what words he used that have such an ambiguous nature in relation to the importance of the information directly related to those words.

  • Explore the relationships between concepts (Strength, Sign & Direction).

Once words are coded, the text can be analyzed for the relationships among the concepts set forth. There are three concepts which play a central role in exploring the relations among concepts in content analysis.

  • Strength of Relationship: Refers to the degree to which two or more concepts are related. These relationships are easiest to analyze, compare, and graph when all relationships between concepts are considered to be equal. However, assigning strength to relationships retains a greater degree of the detail found in the original text. Identifying strength of a relationship is key when determining whether or not words like unless, perhaps, or maybe are related to a particular section of text, phrase, or idea.
  • Sign of a Relationship: Refers to whether or not the concepts are positively or negatively related. To illustrate, the concept "bear" is negatively related to the concept "stock market" in the same sense as the concept "bull" is positively related. Thus "it's a bear market" could be coded to show a negative relationship between "bear" and "market". Another approach to coding for strength entails the creation of separate categories for binary oppositions. The above example emphasizes "bull" as the negation of "bear," but could be coded as being two separate categories, one positive and one negative. There has been little research to determine the benefits and liabilities of these differing strategies. In regard to the hearings, sign coding may be used to find out whether the words under observation or in question were used adversely or in favor of the concepts (this is tricky, but important to establishing meaning).
  • Direction of the Relationship: Refers to the type of relationship categories exhibit. Coding for this sort of information can be useful in establishing, for example, the impact of new information in a decision making process. Various types of directional relationships include, "X implies Y," "X occurs before Y" and "if X then Y," or quite simply the decision whether concept X is the "prime mover" of Y or vice versa. In the case of the 1998 hearings, the researcher might note that, "maybe implies doubt," "perhaps occurs before statements of clarification," and "if possibly exists, then there is room for Clinton to change his stance." In some cases, concepts can be said to be bi-directional, or having equal influence. This is equivalent to ignoring directionality. Both approaches are useful, but differ in focus. Coding all categories as bi-directional is most useful for exploratory studies where pre-coding may influence results, and is also most easily automated, or computer coded.
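
A coded statement with these three properties can be represented as a small record. The sketch below is a hypothetical data structure, not the format of any particular map analysis program: the class name, field names and the numeric strength of 0.5 are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Statement:
    """Two concepts plus the relationship coded between them."""
    source: str
    target: str
    relation: str = "related to"  # e.g. "implies", "occurs before"
    strength: float = 1.0         # leave at 1.0 to treat all relationships as equal
    sign: int = 1                 # +1 positive, -1 negative
    bidirectional: bool = False   # True means direction is ignored

    def pair(self):
        """Concept pair for comparison; order is dropped when bidirectional."""
        if self.bidirectional:
            return tuple(sorted((self.source, self.target)))
        return (self.source, self.target)

coded = [
    Statement("bear", "market", sign=-1),
    Statement("bull", "market", sign=1),
    # "maybe implies doubt"; the 0.5 weight is an arbitrary illustrative value.
    Statement("maybe", "doubt", relation="implies", strength=0.5),
]

negative = [s.pair() for s in coded if s.sign < 0]
print(negative)
```

Leaving `strength` at its default and setting `bidirectional=True` everywhere corresponds to the simplest coding described above, in which all relationships are equal and direction is ignored.
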
  • Code the relationships.

One of the main differences between conceptual analysis and relational analysis is that the statements or relationships between concepts are coded. At this point, to continue our extended example, it is important to take special care with assigning value to the relationships in an effort to determine whether the ambiguous words in Bill Clinton's speech are just fillers, or hold information about the statements he is making.

  • Perform Statistical Analyses.

This step involves conducting statistical analyses of the data you've coded during your relational analysis. This may involve exploring for differences or looking for relationships among the variables you've identified in your study.

  • Map out the Representations.

In addition to statistical analysis, relational analysis often leads to viewing the representations of the concepts and their associations in a text (or across texts) in a graphical -- or map -- form. Relational analysis is also informed by a variety of different theoretical approaches: linguistic content analysis, decision mapping, and mental models.

The authors of this guide have created the following commentaries on content analysis.

Issues of Reliability & Validity

The issues of reliability and validity are concurrent with those addressed in other research methods. The reliability of a content analysis study refers to its stability, or the tendency for coders to consistently re-code the same data in the same way over a period of time; reproducibility, or the tendency for a group of coders to classify category membership in the same way; and accuracy, or the extent to which the classification of a text corresponds to a standard or norm statistically. Gottschalk (1995) points out that the issue of reliability may be further complicated by the inescapably human nature of researchers. For this reason, he suggests that coding errors can only be minimized, and not eliminated (he shoots for 80% as an acceptable margin for reliability).
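
Reproducibility between two coders can be checked with simple percent agreement, sketched below. This is an assumption about how such a figure might be computed, since the text does not say which measure lies behind the 80% margin; chance-corrected statistics such as Cohen's kappa are a common alternative. The function name and the two lists of codes are invented for the example.

```python
def percent_agreement(codes_a, codes_b):
    """Share of units that two coders assigned to the same category."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same units")
    if not codes_a:
        raise ValueError("no coded units to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

# Toy codings of the same five text units by two hypothetical coders.
coder_1 = ["expensive", "expensive", "inexpensive", "none", "expensive"]
coder_2 = ["expensive", "none", "inexpensive", "none", "expensive"]

agreement = percent_agreement(coder_1, coder_2)
print(f"{agreement:.0%}")
```

The same function can measure stability if the two lists come from one coder re-coding the same data at a later date.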

On the other hand, the validity of a content analysis study refers to the correspondence of the categories to the conclusions, and the generalizability of results to a theory.

The validity of categories in implicit concept analysis, in particular, is achieved by utilizing multiple classifiers to arrive at an agreed upon definition of the category. For example, a content analysis study might measure the occurrence of the concept category "communist" in presidential inaugural speeches. Using multiple classifiers, the concept category can be broadened to include synonyms such as "red," "Soviet threat," "pinkos," "godless infidels" and "Marxist sympathizers." "Communist" is held to be the explicit variable, while "red," etc. are the implicit variables.

The overarching problem of concept analysis research is the challenge-able nature of conclusions reached by its inferential procedures. The question lies in what level of implication is allowable, i.e. do the conclusions follow from the data or are they explainable due to some other phenomenon? For occurrence-specific studies, for example, can the second occurrence of a word carry equal weight as the ninety-ninth? Reasonable conclusions can be drawn from substantive amounts of quantitative data, but the question of proof may still remain unanswered.

This problem is again best illustrated when one uses computer programs to conduct word counts. The problem of distinguishing between synonyms and homonyms can completely throw off one's results, invalidating any conclusions one infers from the results. The word "mine," for example, variously denotes a personal pronoun, an explosive device, and a deep hole in the ground from which ore is extracted. One may obtain an accurate count of that word's occurrence and frequency, but not have an accurate accounting of the meaning inherent in each particular usage. For example, one may find 50 occurrences of the word "mine." But, if one is only looking specifically for "mine" as an explosive device, and 17 of the occurrences are actually personal pronouns, the resulting 50 is an inaccurate result. Any conclusions drawn from that number would be invalid.

The generalizability of one's conclusions, then, is very dependent on how one determines concept categories, as well as on how reliable those categories are. It is imperative that one defines categories that accurately measure the idea and/or items one is seeking to measure. Akin to this is the construction of rules. Developing rules that allow one, and others, to categorize and code the same data in the same way over a period of time, referred to as stability, is essential to the success of a conceptual analysis. Reproducibility, not only of specific categories, but of general methods applied to establishing all sets of categories, makes a study, and its subsequent conclusions and results, more sound. A study which does this, i.e. in which the classification of a text corresponds to a standard or norm, is said to have accuracy.

Advantages of Content Analysis

Content analysis offers several advantages to researchers who consider using it. In particular, content analysis:

  • looks directly at communication via texts or transcripts, and hence gets at the central aspect of social interaction
  • can allow for both quantitative and qualitative operations
  • can provide valuable historical/cultural insights over time through analysis of texts
  • allows a closeness to text which can alternate between specific categories and relationships and also statistically analyzes the coded form of the text
  • can be used to interpret texts for purposes such as the development of expert systems (since knowledge and rules can both be coded in terms of explicit statements about the relationships among concepts)
  • is an unobtrusive means of analyzing interactions
  • provides insight into complex models of human thought and language use

Disadvantages of Content Analysis

Content analysis suffers from several disadvantages, both theoretical and procedural. In particular, content analysis:

  • can be extremely time consuming
  • is subject to increased error, particularly when relational analysis is used to attain a higher level of interpretation
  • is often devoid of theoretical base, or attempts too liberally to draw meaningful inferences about the relationships and impacts implied in a study
  • is inherently reductive, particularly when dealing with complex texts
  • tends too often to simply consist of word counts
  • often disregards the context that produced the text, as well as the state of things after the text is produced
  • can be difficult to automate or computerize

The Palmquist, Carley and Dale study, a summary of "Applications of Computer-Aided Text Analysis: Analyzing Literary and Non-Literary Texts" (1997), offers two examples of studies that have been conducted using both conceptual and relational analysis. The Problematic Text for Content Analysis shows the differences in results obtained by a conceptual and a relational approach to a study.

Related Information: Example of a Problematic Text for Content Analysis

In this example, both students observed a scientist and were asked to write about the experience.

Student A: I found that scientists engage in research in order to make discoveries and generate new ideas. Such research by scientists is hard work and often involves collaboration with other scientists which leads to discoveries which make the scientists famous. Such collaboration may be informal, such as when they share new ideas over lunch, or formal, such as when they are co-authors of a paper.
Student B: It was hard work to research famous scientists engaged in collaboration and I made many informal discoveries. My research showed that scientists engaged in collaboration with other scientists are co-authors of at least one paper containing their new ideas. Some scientists make formal discoveries and have new ideas.

Content analysis coding for explicit concepts may not reveal any significant differences. For example, the existence of "I, scientist, research, hard work, collaboration, discoveries, new ideas, etc..." are explicit in both texts, occur the same number of times, and have the same emphasis. Relational analysis or cognitive mapping, however, reveals that while all concepts in the text are shared, only five concepts are common to both. Analyzing these statements reveals that Student A reports on what "I" found out about "scientists," and elaborated the notion of "scientists" doing "research." Student B focuses on what "I's" research was and sees scientists as "making discoveries" without emphasis on research.
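
The first half of this claim can be checked directly. The sketch below codes both student texts for the existence of six of the explicit concepts listed above, using plain case-insensitive substring matching (an assumption made for simplicity). It does not test the frequency claim, and it does not attempt the relational coding, which requires a researcher to identify statements by hand.

```python
student_a = (
    "I found that scientists engage in research in order to make "
    "discoveries and generate new ideas. Such research by scientists is "
    "hard work and often involves collaboration with other scientists "
    "which leads to discoveries which make the scientists famous. Such "
    "collaboration may be informal, such as when they share new ideas "
    "over lunch, or formal, such as when they are co-authors of a paper."
)
student_b = (
    "It was hard work to research famous scientists engaged in "
    "collaboration and I made many informal discoveries. My research "
    "showed that scientists engaged in collaboration with other "
    "scientists are co-authors of at least one paper containing their "
    "new ideas. Some scientists make formal discoveries and have new ideas."
)

concepts = ["scientists", "research", "hard work",
            "collaboration", "discoveries", "new ideas"]

def code_existence(text, concepts):
    """Mark each concept as present or absent in the text."""
    lowered = text.lower()
    return {c: c in lowered for c in concepts}

coded_a = code_existence(student_a, concepts)
coded_b = code_existence(student_b, concepts)
print(coded_a == coded_b)  # the two existence codings are indistinguishable
```

Every concept is present in both texts, so coding for existence alone cannot separate the two students' accounts.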

Related Information: The Palmquist, Carley and Dale Study

Consider these two questions: How has the depiction of robots changed over more than a century's worth of writing? And, do students and writing instructors share the same terms for describing the writing process? Although these questions seem totally unrelated, they do share a commonality: in the Palmquist, Carley & Dale study, their answers rely on computer-aided text analysis to demonstrate how different texts can be analyzed.

Literary texts

One half of the study explored the depiction of robots in 27 science fiction texts written between 1818 and 1988. After texts were divided into three historically defined groups, readers looked for how the depiction of robots had changed over time. To do this, researchers had to create concept lists and relationship types, create maps using computer software (see Fig. 1), modify those maps and then ultimately analyze them. The final product of the analysis revealed that over time authors were less likely to depict robots as metallic humanoids.

Non-literary texts

The second half of the study used student journals and interviews, teacher interviews, textbooks, and classroom observations as the non-literary texts from which concepts and words were taken. The purpose behind the study was to determine if, in fact, over time teachers and students would begin to share a similar vocabulary about the writing process. Again, researchers used computer software to assist in the process. This time, computers helped researchers generate a concept list based on frequently occurring words and phrases from all texts. Maps were also created and analyzed in this study (see Fig. 2).

Annotated Bibliography

Busch, Carol, Paul S. De Maret, Teresa Flynn, Rachel Kellum, Sheri Le, Brad Meyers, Matt Saunders, Robert White, and Mike Palmquist. (2005). Content Analysis. Writing@CSU. Colorado State University.

Content Analysis

Content analysis is a research tool used to determine the presence of certain words, themes, or concepts within some given qualitative data (i.e. text). Using content analysis, researchers can quantify and analyze the presence, meanings, and relationships of such words, themes, or concepts. As an example, researchers can evaluate language used within a news article to search for bias or partiality. Researchers can then make inferences about the messages within the texts, the writer(s), the audience, and even the culture and time surrounding the text.


Sources of data could be interviews, open-ended questions, field research notes, conversations, or literally any occurrence of communicative language (such as books, essays, discussions, newspaper headlines, speeches, media, historical documents). A single study may analyze various forms of text in its analysis. To analyze the text using content analysis, the text must be coded, or broken down, into manageable units for analysis (i.e. “codes”). Once the text is coded, the codes can then be further grouped into “code categories” to summarize data even further.

Three different definitions of content analysis are provided below.

Definition 1: “Any technique for making inferences by systematically and objectively identifying special characteristics of messages.” (from Holsti, 1968)

Definition 2: “An interpretive and naturalistic approach. It is both observational and narrative in nature and relies less on the experimental elements normally associated with scientific research (reliability, validity, and generalizability).” (from Ethnography, Observational Research, and Narrative Inquiry, 1994-2012)

Definition 3: “A research technique for the objective, systematic and quantitative description of the manifest content of communication.” (from Berelson, 1952)

Uses of Content Analysis

Identify the intentions, focus or communication trends of an individual, group or institution

Describe attitudinal and behavioral responses to communications

Determine the psychological or emotional state of persons or groups

Reveal international differences in communication content

Reveal patterns in communication content

Pre-test and improve an intervention or survey prior to launch

Analyze focus group interviews and open-ended questions to complement quantitative data

Types of Content Analysis

There are two general types of content analysis: conceptual analysis and relational analysis. Conceptual analysis determines the existence and frequency of concepts in a text. Relational analysis develops the conceptual analysis further by examining the relationships among concepts in a text. Each type of analysis may lead to different results, conclusions, interpretations and meanings.

Conceptual Analysis

Typically people think of conceptual analysis when they think of content analysis. In conceptual analysis, a concept is chosen for examination and the analysis involves quantifying and counting its presence. The main goal is to examine the occurrence of selected terms in the data. Terms may be explicit or implicit. Explicit terms are easy to identify. Coding of implicit terms is more complicated: you need to decide the level of implication and base judgments on subjectivity (an issue for reliability and validity). Therefore, coding of implicit terms involves using a dictionary or contextual translation rules or both.

To begin a conceptual content analysis, first identify the research question and choose a sample or samples for analysis. Next, the text must be coded into manageable content categories. This is basically a process of selective reduction. By reducing the text to categories, the researcher can focus on and code for specific words or patterns that inform the research question.

General steps for conducting a conceptual content analysis:

1. Decide the level of analysis: word, word sense, phrase, sentence, themes

2. Decide how many concepts to code for: develop a pre-defined or interactive set of categories or concepts. Decide either: A. to allow flexibility to add categories through the coding process, or B. to stick with the pre-defined set of categories.

Option A allows for the introduction and analysis of new and important material that could have significant implications to one’s research question.

Option B allows the researcher to stay focused and examine the data for specific concepts.

3. Decide whether to code for existence or frequency of a concept. The decision changes the coding process.

When coding for the existence of a concept, the researcher would count a concept only once if it appeared at least once in the data and no matter how many times it appeared.

When coding for the frequency of a concept, the researcher would count the number of times a concept appears in a text.

4. Decide on how you will distinguish among concepts:

Should words be coded exactly as they appear, or coded as the same when they appear in different forms? For example, “dangerous” vs. “dangerousness”. The point here is to create coding rules so that these word segments are transparently categorized in a logical fashion. The rules could make all of these word segments fall into the same category, or perhaps the rules can be formulated so that the researcher can distinguish these word segments into separate codes.

What level of implication is to be allowed? Words that imply the concept or words that explicitly state the concept? For example, “dangerous” vs. “the person is scary” vs. “that person could cause harm to me”. These word segments may not merit separate categories, due to the implicit meaning of “dangerous”.

5. Develop rules for coding your texts. After the decisions of steps 1-4 are complete, a researcher can begin developing rules for translation of text into codes. This will keep the coding process organized and consistent. The researcher can code for exactly what they want to code. Validity of the coding process is ensured when the researcher is consistent and coherent in their codes, meaning that they follow their translation rules. In content analysis, abiding by the translation rules is equivalent to validity.

6. Decide what to do with irrelevant information: should this be ignored (e.g. common English words like “the” and “and”), or used to reexamine the coding scheme in the case that it would add to the outcome of coding?

7. Code the text: This can be done by hand or by using software. By using software, researchers can input categories and have coding done automatically, quickly and efficiently, by the software program. When coding is done by hand, a researcher can recognize errors far more easily (e.g. typos, misspelling). If using computer coding, text could be cleaned of errors to include all available data. This decision of hand vs. computer coding is most relevant for implicit information where category preparation is essential for accurate coding.

8. Analyze your results: Draw conclusions and generalizations where possible. Determine what to do with irrelevant, unwanted, or unused text: reexamine, ignore, or reassess the coding scheme. Interpret results carefully as conceptual content analysis can only quantify the information. Typically, general trends and patterns can be identified.
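The coding decisions in steps 3, 4 and 6 can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the `CODING_RULES` dictionary, the `STOP_WORDS` set, the concept labels and the sample sentences are all hypothetical, and "existence" is interpreted here as counting a concept at most once per text.

```python
import re
from collections import Counter

# Hypothetical translation rules (steps 4 and 5): each surface form maps to one code.
# "dangerous" and "dangerousness" are deliberately merged into the same category.
CODING_RULES = {
    "dangerous": "DANGER",
    "dangerousness": "DANGER",
    "scary": "DANGER",
    "harm": "DANGER",
    "safe": "SAFETY",
    "safety": "SAFETY",
}

# Step 6: common words treated as irrelevant information (an assumed, tiny list).
STOP_WORDS = {"the", "and", "a", "is", "to", "of", "that"}


def code_text(text):
    """Translate one text into a list of concept codes using CODING_RULES."""
    words = re.findall(r"[a-z']+", text.lower())
    return [CODING_RULES[w] for w in words
            if w not in STOP_WORDS and w in CODING_RULES]


def code_for_frequency(texts):
    """Step 3, frequency: count every occurrence of each concept."""
    counts = Counter()
    for text in texts:
        counts.update(code_text(text))
    return counts


def code_for_existence(texts):
    """Step 3, existence: count each concept at most once per text."""
    counts = Counter()
    for text in texts:
        counts.update(set(code_text(text)))
    return counts


sample_texts = [
    "The dog is dangerous, and its dangerousness is scary.",
    "That person could cause harm to me.",
    "The park is safe.",
]
frequency = code_for_frequency(sample_texts)   # DANGER: 4, SAFETY: 1
existence = code_for_existence(sample_texts)   # DANGER: 2, SAFETY: 1
print(frequency, existence)
```

The two counters differ only for "DANGER", which shows how the existence-versus-frequency decision changes the result even when the coding rules stay the same.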

Relational Analysis

Relational analysis begins like conceptual analysis, where a concept is chosen for examination. However, the analysis involves exploring the relationships between concepts. Individual concepts are viewed as having no inherent meaning and rather the meaning is a product of the relationships among concepts.

To begin a relational content analysis, first identify a research question and choose a sample or samples for analysis. The research question must be focused so the concept types are not open to interpretation and can be summarized. Next, select text for analysis. Select it carefully, balancing two risks: too little information limits the results, while too much makes the coding process too arduous to yield meaningful and worthwhile results.

There are three subcategories of relational analysis to choose from prior to going on to the general steps.

Affect extraction: an emotional evaluation of concepts explicit in a text. A challenge to this method is that emotions can vary across time, populations, and space. However, it could be effective at capturing the emotional and psychological state of the speaker or writer of the text.

Proximity analysis: an evaluation of the co-occurrence of explicit concepts in the text. Text is defined as a string of words called a “window” that is scanned for the co-occurrence of concepts. The result is the creation of a “concept matrix”, or a group of interrelated co-occurring concepts that would suggest an overall meaning.

Cognitive mapping: a visualization technique for either affect extraction or proximity analysis. Cognitive mapping attempts to create a model of the overall meaning of the text such as a graphic map that represents the relationships between concepts.
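As a rough illustration of proximity analysis, the sketch below slides a fixed-size window over a text and counts which concepts fall inside the same window. The function name `concept_matrix`, the window size and the sample text are assumptions made for this example. Because the windows overlap, each count is the number of windows containing both concepts, not the number of distinct mentions.

```python
import re
from collections import Counter
from itertools import combinations


def concept_matrix(text, concepts, window_size=5):
    """Count co-occurrences of concept pairs within a sliding window of words.

    `concepts` must be a set of lower-case words. The result maps an
    alphabetically sorted pair of concepts to the number of windows
    in which both appear.
    """
    words = re.findall(r"[a-z']+", text.lower())
    pair_counts = Counter()
    # Slide the window one word at a time; short texts get a single window.
    for start in range(max(1, len(words) - window_size + 1)):
        window = words[start:start + window_size]
        present = sorted(set(window) & concepts)
        for pair in combinations(present, 2):
            pair_counts[pair] += 1
    return pair_counts


sample = ("Scientists do research. Research leads to discoveries. "
          "Scientists share discoveries.")
matrix = concept_matrix(sample, {"scientists", "research", "discoveries"},
                        window_size=4)
for pair, count in sorted(matrix.items()):
    print(pair, count)
```

In this toy text, "scientists" and "discoveries" share three windows, while each of the other pairs shares only one. A cognitive map could then draw the heavier link between those two concepts.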

General steps for conducting a relational content analysis:

1. Determine the type of analysis: Once the sample has been selected, the researcher needs to determine what types of relationships to examine and the level of analysis: word, word sense, phrase, sentence, themes.

2. Reduce the text to categories and code for words or patterns. A researcher can code for existence of meanings or words.

3. Explore the relationship between concepts: once the words are coded, the text can be analyzed for the following:

Strength of relationship: degree to which two or more concepts are related.

Sign of relationship: are concepts positively or negatively related to each other?

Direction of relationship: the types of relationship that categories exhibit. For example, “X implies Y” or “X occurs before Y” or “if X then Y” or if X is the primary motivator of Y.

4. Code the relationships: a difference between conceptual and relational analysis is that the statements or relationships between concepts are coded.

5. Perform statistical analyses: explore differences or look for relationships among the identified variables during coding.

6. Map out representations: such as decision mapping and mental models.

Reliability and Validity

Reliability: Because of the human nature of researchers, coding errors can never be eliminated but only minimized. Generally, 80% is an acceptable margin for reliability. Three criteria comprise the reliability of a content analysis:

Stability: the tendency for coders to consistently re-code the same data in the same way over a period of time.

Reproducibility: the tendency for a group of coders to classify category membership in the same way.

Accuracy: extent to which the classification of text corresponds to a standard or norm statistically.
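To make the 80% figure concrete, the sketch below computes simple percent agreement between two coders who coded the same ten text segments. It also computes Cohen's kappa, a chance-corrected agreement statistic that is not mentioned in the text above and is included here only as a commonly used complement. The coder data and function names are hypothetical.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of segments to which both coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of segments.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement between two coders, corrected for agreement expected by chance."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: probability that both coders pick the same code independently.
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same ten segments.
coder_a = ["DANGER", "SAFETY", "DANGER", "DANGER", "SAFETY",
           "DANGER", "SAFETY", "DANGER", "DANGER", "SAFETY"]
coder_b = ["DANGER", "SAFETY", "DANGER", "SAFETY", "SAFETY",
           "DANGER", "SAFETY", "DANGER", "DANGER", "DANGER"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

Here the coders agree on 8 of 10 segments, which meets the 80% threshold, yet kappa is noticeably lower because some of that agreement would be expected by chance with only two codes.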

Validity: Three criteria comprise the validity of a content analysis:

Closeness of categories: this can be achieved by utilizing multiple classifiers to arrive at an agreed upon definition of each specific category. Using multiple classifiers, a concept category that may be an explicit variable can be broadened to include synonyms or implicit variables.

Conclusions: What level of implication is allowable? Do conclusions correctly follow the data? Are results explainable by other phenomena? This becomes especially problematic when using computer software for analysis and distinguishing between synonyms. For example, the word “mine” variously denotes a possessive pronoun, an explosive device, and a deep hole in the ground from which ore is extracted. Software can obtain an accurate count of that word’s occurrence and frequency, but may not be able to produce an accurate accounting of the meaning inherent in each particular usage. This problem could throw off one’s results and make any conclusion invalid.

Generalizability of the results to a theory: dependent on the clear definitions of concept categories, how they are determined and how reliable they are at measuring the idea one is seeking to measure. Generalizability parallels reliability as much of it depends on the three criteria for reliability.

Advantages of Content Analysis

Directly examines communication using text

Allows for both qualitative and quantitative analysis

Provides valuable historical and cultural insights over time

Allows a closeness to data

Coded form of the text can be statistically analyzed

Unobtrusive means of analyzing interactions

Provides insight into complex models of human thought and language use

When done well, is considered a relatively “exact” research method

Content analysis is a readily understood and inexpensive research method

A more powerful tool when combined with other research methods such as interviews, observation, and use of archival records. It is very useful for analyzing historical material, especially for documenting trends over time.

Disadvantages of Content Analysis

Can be extremely time consuming

Is subject to increased error, particularly when relational analysis is used to attain a higher level of interpretation

Is often devoid of theoretical base, or attempts too liberally to draw meaningful inferences about the relationships and impacts implied in a study

Is inherently reductive, particularly when dealing with complex texts

Tends too often to simply consist of word counts

Often disregards the context that produced the text, as well as the state of things after the text is produced

Can be difficult to automate or computerize

Content Analysis – Methods, Types and Examples

Content analysis is a research method used to analyze and interpret the characteristics of various forms of communication, such as text, images, or audio. It involves systematically analyzing the content of these materials, identifying patterns, themes, and other relevant features, and drawing inferences or conclusions based on the findings.

Content analysis can be used to study a wide range of topics, including media coverage of social issues, political speeches, advertising messages, and online discussions, among others. It is often used in qualitative research and can be combined with other methods to provide a more comprehensive understanding of a particular phenomenon.

Types of Content Analysis

There are generally two types of content analysis:

Quantitative Content Analysis

This type of content analysis involves the systematic and objective counting and categorization of the content of a particular form of communication, such as text or video. The data obtained is then subjected to statistical analysis to identify patterns, trends, and relationships between different variables. Quantitative content analysis is often used to study media content, advertising, and political speeches.

Qualitative Content Analysis

This type of content analysis is concerned with the interpretation and understanding of the meaning and context of the content. It involves the systematic analysis of the content to identify themes, patterns, and other relevant features, and to interpret the underlying meanings and implications of these features. Qualitative content analysis is often used to study interviews, focus groups, and other forms of qualitative data, where the researcher is interested in understanding the subjective experiences and perceptions of the participants.

Methods of Content Analysis

There are several methods of content analysis, including:

Conceptual Analysis

This method involves analyzing the meanings of key concepts used in the content being analyzed. The researcher identifies key concepts and analyzes how they are used, defining them and categorizing them into broader themes.

Content Analysis by Frequency

This method involves counting and categorizing the frequency of specific words, phrases, or themes that appear in the content being analyzed. The researcher identifies relevant keywords or phrases and systematically counts their frequency.

Comparative Analysis

This method involves comparing the content of two or more sources to identify similarities, differences, and patterns. The researcher selects relevant sources, identifies key themes or concepts, and compares how they are represented in each source.
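One simple way to compare sources is to normalize keyword counts by text length so that a long source and a short one can be set side by side. The sketch below is only an assumption about how such a comparison might be set up: the function names, the per-1,000-words scaling and the two sample sources are invented for illustration.

```python
import re
from collections import Counter


def keyword_rates(text, keywords):
    """Occurrences of each keyword per 1,000 words of the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in keywords)
    total = len(words)
    return {k: (1000 * counts[k] / total if total else 0.0)
            for k in sorted(keywords)}


def compare_sources(text_a, text_b, keywords):
    """Rate in source A minus rate in source B, for each keyword."""
    rates_a = keyword_rates(text_a, keywords)
    rates_b = keyword_rates(text_b, keywords)
    return {k: rates_a[k] - rates_b[k] for k in rates_a}


# Two invented sources covering the same topic.
source_a = "Vaccine safety is the focus. The vaccine is safe."
source_b = "The risk is high. Risk matters more than the vaccine."

difference = compare_sources(source_a, source_b, {"vaccine", "risk"})
for keyword, delta in difference.items():
    print(f"{keyword}: {delta:+.1f} per 1,000 words")
```

A positive value means the keyword is relatively more frequent in the first source. Interpreting why the sources differ still requires reading the passages in context.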

Discourse Analysis

This method involves analyzing the structure and language of the content being analyzed to identify how the content constructs and represents social reality. The researcher analyzes the language used and the underlying assumptions, beliefs, and values reflected in the content.

Narrative Analysis

This method involves analyzing the content as a narrative, identifying the plot, characters, and themes, and analyzing how they relate to the broader social context. The researcher identifies the underlying messages conveyed by the narrative and their implications for the broader social context.

Content Analysis Conducting Guide

Here is a basic guide to conducting a content analysis:

  • Define your research question or objective: Before starting your content analysis, you need to define your research question or objective clearly. This will help you to identify the content you need to analyze and the type of analysis you need to conduct.
  • Select your sample: Select a representative sample of the content you want to analyze. This may involve selecting a random sample, a purposive sample, or a convenience sample, depending on the research question and the availability of the content.
  • Develop a coding scheme: Develop a coding scheme or a set of categories to use for coding the content. The coding scheme should be based on your research question or objective and should be reliable, valid, and comprehensive.
  • Train coders: Train coders to use the coding scheme and ensure that they have a clear understanding of the coding categories and procedures. You may also need to establish inter-coder reliability to ensure that different coders are coding the content consistently.
  • Code the content: Code the content using the coding scheme. This may involve manually coding the content, using software, or a combination of both.
  • Analyze the data: Once the content is coded, analyze the data using appropriate statistical or qualitative methods, depending on the research question and the type of data.
  • Interpret the results: Interpret the results of the analysis in the context of your research question or objective. Draw conclusions based on the findings and relate them to the broader literature on the topic.
  • Report your findings: Report your findings in a clear and concise manner, including the research question, methodology, results, and conclusions. Provide details about the coding scheme, inter-coder reliability, and any limitations of the study.

Applications of Content Analysis

Content analysis has numerous applications across different fields, including:

  • Media Research: Content analysis is commonly used in media research to examine the representation of different groups, such as race, gender, and sexual orientation, in media content. It can also be used to study media framing, media bias, and media effects.
  • Political Communication: Content analysis can be used to study political communication, including political speeches, debates, and news coverage of political events. It can also be used to study political advertising and the impact of political communication on public opinion and voting behavior.
  • Marketing Research: Content analysis can be used to study advertising messages, consumer reviews, and social media posts related to products or services. It can provide insights into consumer preferences, attitudes, and behaviors.
  • Health Communication: Content analysis can be used to study health communication, including the representation of health issues in the media, the effectiveness of health campaigns, and the impact of health messages on behavior.
  • Education Research: Content analysis can be used to study educational materials, including textbooks, curricula, and instructional materials. It can provide insights into the representation of different topics, perspectives, and values.
  • Social Science Research: Content analysis can be used in a wide range of social science research, including studies of social media, online communities, and other forms of digital communication. It can also be used to study interviews, focus groups, and other qualitative data sources.

Examples of Content Analysis

Here are some examples of content analysis:

  • Media Representation of Race and Gender: A content analysis could be conducted to examine the representation of different races and genders in popular media, such as movies, TV shows, and news coverage.
  • Political Campaign Ads: A content analysis could be conducted to study political campaign ads and the themes and messages used by candidates.
  • Social Media Posts: A content analysis could be conducted to study social media posts related to a particular topic, such as the COVID-19 pandemic, to examine the attitudes and beliefs of social media users.
  • Instructional Materials: A content analysis could be conducted to study the representation of different topics and perspectives in educational materials, such as textbooks and curricula.
  • Product Reviews: A content analysis could be conducted to study product reviews on e-commerce websites, such as Amazon, to identify common themes and issues mentioned by consumers.
  • News Coverage of Health Issues: A content analysis could be conducted to study news coverage of health issues, such as vaccine hesitancy, to identify common themes and perspectives.
  • Online Communities: A content analysis could be conducted to study online communities, such as discussion forums or social media groups, to understand the language, attitudes, and beliefs of the community members.

Purpose of Content Analysis

The purpose of content analysis is to systematically analyze and interpret the content of various forms of communication, such as written, oral, or visual, to identify patterns, themes, and meanings. Content analysis is used to study communication in a wide range of fields, including media studies, political science, psychology, education, sociology, and marketing research. The primary goals of content analysis include:

  • Describing and summarizing communication: Content analysis can be used to describe and summarize the content of communication, such as the themes, topics, and messages conveyed in media content, political speeches, or social media posts.
  • Identifying patterns and trends: Content analysis can be used to identify patterns and trends in communication, such as changes over time, differences between groups, or common themes or motifs.
  • Exploring meanings and interpretations: Content analysis can be used to explore the meanings and interpretations of communication, such as the underlying values, beliefs, and assumptions that shape the content.
  • Testing hypotheses and theories: Content analysis can be used to test hypotheses and theories about communication, such as the effects of media on attitudes and behaviors or the framing of political issues in the media.

When to use Content Analysis

Content analysis is a useful method when you want to analyze and interpret the content of various forms of communication, such as written, oral, or visual. Here are some specific situations where content analysis might be appropriate:

  • When you want to study media content: Content analysis is commonly used in media studies to analyze the content of TV shows, movies, news coverage, and other forms of media.
  • When you want to study political communication: Content analysis can be used to study political speeches, debates, news coverage, and advertising.
  • When you want to study consumer attitudes and behaviors: Content analysis can be used to analyze product reviews, social media posts, and other forms of consumer feedback.
  • When you want to study educational materials: Content analysis can be used to analyze textbooks, instructional materials, and curricula.
  • When you want to study online communities: Content analysis can be used to analyze discussion forums, social media groups, and other forms of online communication.
  • When you want to test hypotheses and theories: Content analysis can be used to test hypotheses and theories about communication, such as the framing of political issues in the media or the effects of media on attitudes and behaviors.

Characteristics of Content Analysis

Content analysis has several key characteristics that make it a useful research method. These include:

  • Objectivity: Content analysis aims to be an objective method of research, meaning that the researcher does not introduce their own biases or interpretations into the analysis. This is achieved by using standardized and systematic coding procedures.
  • Systematic: Content analysis involves the use of a systematic approach to analyze and interpret the content of communication. This involves defining the research question, selecting the sample of content to analyze, developing a coding scheme, and analyzing the data.
  • Quantitative: Content analysis often involves counting and measuring the occurrence of specific themes or topics in the content, making it a quantitative research method. This allows for statistical analysis and generalization of findings.
  • Contextual: Content analysis considers the context in which the communication takes place, such as the time period, the audience, and the purpose of the communication.
  • Iterative: Content analysis is an iterative process, meaning that the researcher may refine the coding scheme and analysis as they analyze the data, to ensure that the findings are valid and reliable.
  • Reliability and validity: Content analysis aims to be a reliable and valid method of research, meaning that the findings are consistent and accurate. This is achieved through inter-coder reliability tests and other measures to ensure the quality of the data and analysis.

Advantages of Content Analysis

There are several advantages to using content analysis as a research method, including:

  • Objective and systematic: Content analysis aims to be an objective and systematic method of research, which reduces the likelihood of bias and subjectivity in the analysis.
  • Large sample size: Content analysis allows for the analysis of a large sample of data, which increases the statistical power of the analysis and the generalizability of the findings.
  • Non-intrusive: Content analysis does not require the researcher to interact with the participants or disrupt their natural behavior, making it a non-intrusive research method.
  • Accessible data: Content analysis can be used to analyze a wide range of data types, including written, oral, and visual communication, making it accessible to researchers across different fields.
  • Versatile: Content analysis can be used to study communication in a wide range of contexts and fields, including media studies, political science, psychology, education, sociology, and marketing research.
  • Cost-effective: Content analysis is a cost-effective research method, as it does not require expensive equipment or participant incentives.

Limitations of Content Analysis

While content analysis has many advantages, there are also some limitations to consider, including:

  • Limited contextual information: Content analysis is focused on the content of communication, which means that contextual information may be limited. This can make it difficult to fully understand the meaning behind the communication.
  • Limited ability to capture nonverbal communication: Content analysis is limited to analyzing the content of communication that can be captured in written or recorded form. It may miss out on nonverbal communication, such as body language or tone of voice.
  • Subjectivity in coding: While content analysis aims to be objective, there may be subjectivity in the coding process. Different coders may interpret the content differently, which can lead to inconsistent results.
  • Limited ability to establish causality: Content analysis is a correlational research method, meaning that it cannot establish causality between variables. It can only identify associations between variables.
  • Limited generalizability: Content analysis is limited to the data that is analyzed, which means that the findings may not be generalizable to other contexts or populations.
  • Time-consuming: Content analysis can be a time-consuming research method, especially when analyzing a large sample of data. This can be a disadvantage for researchers who need to complete their research in a short amount of time.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

Analyst Answers

Data & Finance for Work & Life

Qualitative Content Analysis: a Simple Guide with Examples

Content analysis is a type of qualitative research (as opposed to quantitative research) that focuses on analyzing content in various mediums, the most common of which is written words in documents.

It’s a very common technique used in academia, especially for students working on theses and dissertations, but here we’re going to talk about how companies can use qualitative content analysis to improve their processes and increase revenue.

Whether you’re new to content analysis or a seasoned professor, this article provides all you need to know about how data analysts use content analysis to improve their business. It will also help you understand the relationship between content analysis and natural language processing — what some even call natural language content analysis.

What is qualitative content analysis, and what is it used for?

Any content analysis definition must consist of at least these three things: qualitative language, themes, and quantification.

In short, content analysis is the process of examining preselected words in video, audio, or written mediums and their context to identify themes, then quantifying them for statistical analysis in order to draw conclusions. More simply, it’s counting how often you see two words close to each other.

For example, let’s say I place in front of you an audio bit, a old video with a static image, and a document with lots of text but no titles or descriptions. At the start, you would have no idea what any of it was about.

Let’s say you transpose the video and audio recordings on paper. Then you use a counting software to count the top ten most used words, excluding prepositions (of, over, to, by) and articles (the, a), conjunctions (and, but, or) and other common words like “very.”

Your results are that the top 5 words are “candy,” “snow,” “cold,” and “sled.” These 5 words appear at least 25 times each, and the next highest word appears only 4 times. You also find that the words “snow” and “sled” appear adjacent to each other 95% of the time that “snow” appears.

Well, now you have performed a very elementary qualitative content analysis.

This means that you’re probably dealing with a text in which snow sleds are important. Snow sleds, thus, become a theme in these documents, which goes to the heart of qualitative content analysis.
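The counting described above can be sketched in a few lines of Python. This is only an illustration: the stop-word list, the function names, and the sample sentence are assumptions made for the example, not part of the original analysis.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption); a real analysis would use a fuller one.
STOP_WORDS = {"of", "over", "to", "by", "the", "a", "an", "and", "but", "or", "very"}

def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def top_words(text, n=10):
    """Return the n most frequent words, ignoring stop words."""
    words = [w for w in tokenize(text) if w not in STOP_WORDS]
    return Counter(words).most_common(n)

def adjacency_share(text, first, second):
    """Share of occurrences of `first` that are immediately followed by `second`."""
    words = tokenize(text)
    total = words.count(first)
    if total == 0:
        return 0.0
    followed = sum(1 for a, b in zip(words, words[1:]) if a == first and b == second)
    return followed / total

# Made-up sample text standing in for the transcribed recordings.
sample = "The snow sled was cold. We took the snow sled over the snow."
print(top_words(sample, 2))
print(adjacency_share(sample, "snow", "sled"))
```

The counts only become a theme once a researcher interprets them, which is why the adjacency share is reported next to the raw frequencies rather than instead of them.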

The goal of qualitative content analysis is to organize text into a series of themes. This is opposed to quantitative content analysis, which aims to organize the text into categories.

Types of qualitative content analysis

If you’ve heard about content analysis, it was most likely in an academic setting. The term itself is common among PhD students and Masters students writing their dissertations and theses. In that context, the most common type of content analysis is document analysis.

Content analysis can be applied to many types of material, including:

  • Short- and long-form survey questions
  • Focus group transcripts
  • Interview transcripts
  • Legislation
  • Public records
  • Comments sections
  • Messaging platforms

This list gives you an idea of the possibilities and industries in which qualitative content analysis can be applied.

For example, marketing departments or public relations groups in major corporations might collect survey, focus group, and interview data, then hand off the information to a data analyst who performs the content analysis.

A political analysis institution or think tank might look at legislation over time to identify potential emerging themes based on their slow introduction into policy margins. Perhaps it’s possible to identify certain beliefs in the Senate and House of Representatives before they enter the public discourse.

Non-governmental organizations (NGOs) might perform an analysis on public records to see how to better serve their constituents. If they have access to public records, it would be possible to identify citizen characteristics that align with their goal.

Analysis logic: inductive vs deductive

There are two types of logic we can apply to qualitative content analysis: inductive and deductive. Inductive content analysis is more of an exploratory approach. We don’t know what patterns or ideas we’ll discover, so we go in with an open mind.

On the other hand, deductive content analysis involves starting with an idea and identifying how it appears in the text. For example, we may approach legislation on wildlife by looking for rules on hunting. Perhaps we think hunting with a knife is too dangerous, and we want to identify trends in the text.

Neither one is better per se, and they each carry value in different contexts. For example, inductive content analysis is advantageous in situations where we want to identify author intent. Going in with a hypothesis can bias the way we look at the data, so the inductive method is better.

Deductive content analysis is better when we want to target a term. For example, if we want to see how important knife hunting is in the legislation, we’re doing deductive content analysis.

Measurements: idea coding vs word frequency

Two main methodologies exist for analyzing the text itself: coding and word frequency. Idea coding is the manual process of reading through a text and “coding” ideas in a column on the right. We call this coding because we take ideas and themes expressed in many words and turn them into one common phrase. This allows researchers to better understand how those ideas evolve. We will look at how to do this in Word below.

In short, coding in the context of qualitative content analysis follows 2 steps:

  • Reading through the text one time
  • Adding 2-5 word summaries each time a significant theme or idea appears
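Once the reading and coding are done by hand, tallying the codes is straightforward. The sketch below assumes the coded passages have been typed into a list of (passage, codes) pairs; the passages, the codes, and the function name are invented for illustration.

```python
from collections import Counter

# Each entry pairs a passage with the short codes a human reader assigned to it.
# Both the passages and the codes are invented for this example.
coded_passages = [
    ("Children who are read to daily pick up words faster.",
     ["reading aloud", "vocabulary growth"]),
    ("Teachers report that small classes help struggling readers.",
     ["class size"]),
    ("Daily story time also builds vocabulary.",
     ["reading aloud", "vocabulary growth"]),
]

def tally_codes(passages):
    """Count how often each manually assigned code appears across all passages."""
    counts = Counter()
    for _text, codes in passages:
        counts.update(codes)
    return counts

code_counts = tally_codes(coded_passages)
for code, count in code_counts.most_common():
    print(f"{code}: {count}")
```

The script does none of the interpretive work; it only counts the summaries a reader has already written.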

Word frequency is simply counting the number of times a word appears in a text, as well as its proximity to other words. In our “snow sled” example above, we counted the number of times a word appeared, as well as how often it appeared next to other words. There are online tools for this, which we’ll look at below.

In short, word frequency in the context of content analysis follows 2 steps:

  • Decide whether you want to find a word, or just look at the most common words
  • Use Word’s Replace function for the first, or an online tool such as Text Analyzer for the second (we’ll look at these in more detail below).

Many data scientists consider coding the only qualitative content analysis, since word frequency comes down to counting the number of times a word appears, making it quantitative.

While there is merit to this claim, I personally do not consider word frequency a part of quantitative content analysis. The fact that we count the frequency of a word does not mean we can draw direct conclusions from it. In fact, without a researcher to provide context on the number of times a word appears, word frequency is useless. True quantitative research carries conclusive value on its own.

Measurements AND analysis logic

There are four ways to approach qualitative content analysis given our two measurement types and inductive/deductive logical approaches. You could do inductive coding, inductive word frequency, deductive coding, and deductive word frequency.

The two best are inductive coding and deductive word frequency. If you want to discover what a document is about, searching for specific words will not inform you about its contents, so inductive word frequency is not insightful.

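Likewise, if you’re looking for the presence of a specific idea, you do not want to go through the whole document to code just to find it, so deductive coding is not insightful. Here’s a simple matrix to illustrate:

  • Inductive coding: insightful
  • Inductive word frequency: not insightful
  • Deductive coding: not insightful
  • Deductive word frequency: insightful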

Qualitative content analysis example

We looked at a small example above, but let’s play out all of the above information in a real-world example. I will post the link to the text source at the bottom of the article, but don’t look at it yet. Let’s jump in with a discovery mentality, meaning let’s use an inductive approach and code our way through each paragraph.


*Click the “1” superscript to the right for a link to the source text. 1

How to do qualitative content analysis

We could use word frequency analysis to find out which are the most common x% of words in the text (deductive word frequency), but this takes some time because we need to build a formula that excludes words that are common but that don’t have any value (a, the, but, and, etc).

As a shortcut, you can use online tools such as Text Analyzer and WordCounter, which will give you breakdowns by phrase length (6 words, 5 words, 4 words, etc), without excluding common terms. Here are a few insightful examples of 7-word phrases from our text:

[Image: 7-word phrases found in the text]

Perhaps more insightfully, here is a list of 5 word combinations, which are much more common:

[Image: 5-word phrases found in the text]

The downside to these tools is that you cannot find 2- and 1-word strings without excluding common words. This is a limitation, but it’s unlikely that the work required to get there is worth the value it brings.
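For readers who prefer a script to an online tool, phrase counts by length can be approximated in Python. This is a minimal sketch and not a reproduction of how Text Analyzer or WordCounter work internally; the function name and the sample sentence are assumptions.

```python
import re
from collections import Counter

def phrase_counts(text, length):
    """Count every run of `length` consecutive words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    phrases = (
        " ".join(words[i:i + length])
        for i in range(len(words) - length + 1)
    )
    return Counter(phrases)

# Made-up sample sentence for illustration.
sample = "learning to read takes time and learning to read takes practice"
print(phrase_counts(sample, 4).most_common(2))
```

Like the online tools, this version keeps common words, so short phrase lengths will be dominated by combinations such as “of the” unless a stop-word filter is added.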

OK. Now that we’ve seen how to go about coding our text into quantifiable data, let’s look at the deductive approach and try to figure out if the text contains a single word we’re looking for. (This is my favorite.)

Deductive word frequency

We know the text now because we’ve already looked through it. It’s about the process of becoming literate, namely, the elements that impact our ability to learn to read. But we only looked at the first four sections of the article, so there’s more to explore.

Let’s say we want to know how a household situation might impact a student’s ability to read. Instead of coding the entire article, we can simply look for this term and its synonyms. The process for deductive word frequency is the following:

  • Identify your term
  • Think of all the possible synonyms
  • Use Word’s Find function to see how many times they appear
  • If you suspect that this word often comes in connection with others, try searching for both of them

In my example, the process would be:

  • Parents, parent, home, house, household situation, household influence, parental, parental situation, at home, home situation
  • Go to “Edit>Find>Replace…” This will enable you to locate the number of instances in which your word or combinations appear. We use the Replace window instead of the simple Find bar because it allows us to visualize the information.
  • Related word combinations are already accounted for in the list of possible synonyms

The results: 0! None of these words appeared in the text, so we can conclude that this text has nothing to do with a child’s home life and its impact on his/her ability to learn to read. Here’s a picture:

[Image: deductive word frequency content analysis]
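The same search can be run outside a word processor. The sketch below counts whole-word, case-insensitive matches for a list of terms; the function name is an assumption, the term list is a shortened version of the synonyms above, and the sample sentence is made up rather than taken from the source text.

```python
import re

def count_terms(text, terms):
    """Count whole-word, case-insensitive matches for each term or phrase."""
    counts = {}
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts

# Shortened synonym list from the example above.
terms = ["parents", "parent", "home", "house", "household situation", "parental", "at home"]

# Made-up sentence standing in for the article being searched.
sample = "Phonics instruction and classroom practice shape early reading."
counts = count_terms(sample, terms)
print(counts)
print(sum(counts.values()))
```

Overlapping terms are counted separately, so a sentence containing “at home” adds to both “at home” and “home”. Keep that in mind before summing the results.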

Don’t Be Afraid of Content Analysis

Content analysis can be intimidating because it uses data analysis to quantify words. This article provides a starting point for your analysis, but to ensure you get 90% reliability in word coding, sign up to receive our eBook Beginner Content Analysis. I went from philosophy student to a data-heavy finance career, and I created it to cater to research and dissertation use cases.


Content analysis vs natural language processing

While similar, content analysis, even the deductive word frequency approach, and natural language processing (NLP) are not the same. The relationship is hierarchical. Natural language processing is a field of linguistics and data science that’s concerned with understanding the meaning behind language.

On the other hand, content analysis is a branch of natural language processing that focuses on the methodologies we discussed above: discovery-style coding (sometimes called “tokenization”) and word frequency (sometimes called the “bag of words” technique).

For example, we would use natural language processing to quantify huge amounts of linguistic information, turn it into row-and-column data, and run tests on it. NLP is incredibly complex in the details, which is why it’s nearly impossible to provide a synopsis or example technique here (we’ll provide them in coursework on ). However, content analysis only focuses on a few manual techniques.
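To make the “row-and-column data” idea concrete, here is a very small bag-of-words sketch in Python. It is only the simplest form of the technique, with a function name and sample documents chosen for illustration; real NLP pipelines involve far more than this.

```python
import re
from collections import Counter

def bag_of_words(documents):
    """Return (vocabulary, rows): one row of word counts per document."""
    counted = [Counter(re.findall(r"[a-z']+", doc.lower())) for doc in documents]
    vocabulary = sorted(set().union(*counted))
    rows = [[counts[word] for word in vocabulary] for counts in counted]
    return vocabulary, rows

# Made-up sample documents.
docs = ["The sled is cold", "Cold snow, cold sled"]
vocabulary, rows = bag_of_words(docs)
print(vocabulary)
for row in rows:
    print(row)
```

Each column is a word and each row is a document, so word order is discarded and only frequency survives, which is exactly what the “bag” in the name refers to.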

Content analysis in marketing

Content analysis in marketing is the use of content analysis to improve marketing reach and conversions. It has grown in importance over the past ten years. As digital platforms become more central to our understanding and interaction with others, we use them more.

We write out ideas and small texts. We post our thoughts on Facebook and Twitter, and we write blog posts like this one. But we also post videos on YouTube and express ourselves in podcasts.

All of these mediums contain valuable information about who we are and what we might want to buy. A good marketer aims to leverage this information in four ways:

  • Collect the data
  • Analyze the data
  • Modify his/her marketing messaging to better serve the consumer
  • Pretend, with bots or employees, to be a consumer and craft messages that influence potential buyers

The challenge for marketers doing this is getting the rights to access this data. Indeed, data privacy laws have come into force in the European Union (General Data Protection Regulation, or GDPR) as well as in Brazil (General Data Protection Law, or LGPD).

Content analysis vs narrative analysis

Content analysis is concerned with themes and ideas, whereas narrative analysis is concerned with the stories people express about themselves or others. Narrative analysis uses the same tools as content analysis, namely coding (or tokenization) and word frequency, but its focus is on narrative relationship rather than themes. This is easier to understand with an example. Let’s look at how we might code the following paragraph from the two perspectives:

I do not like green eggs and ham. I do not like them, Sam-I-Am. I do not like them here or there. I do not like them anywhere!

Content analysis: The ideas expressed include green eggs and ham. The narrator does not like them.

Narrative analysis: The narrator speaks in the first person. He has a relationship with Sam-I-Am. He orients himself with regard to time and space. He does not like green eggs and ham, and may be willing to act on that feeling.

Content analysis vs document analysis

Content analysis and document analysis are very similar, which explains why many people use them interchangeably. The core difference is that content analysis examines all mediums in which words appear, whereas document analysis only examines written documents.

For example, if I want to carry out content analysis on a master’s thesis in education, I would consult documents, videos, and audio files. I may transcribe the video and audio files into a document, but I wouldn’t exclude them from the beginning.

On the other hand, if I want to carry out document analysis on a master’s thesis, I would only use documents, excluding the other mediums from the start. The methodology is the same, but the scope is different. This dichotomy also explains why most academic researchers performing qualitative content analysis refer to the process as “document analysis.” They rarely look at other mediums.

Content Gap Analysis

Content gap analysis is a term common in the field of content marketing, but it applies to the analytical fields as well. In a sentence, content gap analysis is the process of examining a document or text and identifying the missing pieces, or “gap,” that it needs to be completed.

As you can imagine, a content marketer uses gap analysis to determine how to improve blog content. An analyst uses it for other reasons. For example, he/she may have a standard for documents that merit analysis. If a document does not meet the criteria, it must be rejected until it’s improved.

The key message here is that content gap analysis is not content analysis. It’s a way of measuring the distance an underperforming document is from an acceptable document. It is sometimes, but not always, used in a qualitative content analysis context.

  • Link to Source Text

About the Author

Noah is the founder & Editor-in-Chief at AnalystAnswers. He is a transatlantic professional and entrepreneur with 5+ years of corporate finance and data analytics experience, as well as 3+ years in consumer financial products and business software. He started AnalystAnswers to provide aspiring professionals with accessible explanations of otherwise dense finance and data concepts. Noah believes everyone can benefit from an analytical mindset in a growing digital world. When he's not busy at work, Noah likes to explore new European cities, exercise, and spend time with friends and family.


Content Analysis in the Research Field of Corporate Communication

  • Open Access
  • First Online: 25 September 2022


  • Juliane A. Lischka


Content analyses in corporate communication can reveal organizational phenomena that are otherwise hard to obtain. Research themes are manifold and range from corporate social responsibility (CSR) and corporate reputation to stakeholder relations and crisis responses as well as corporate culture and employee commitment. Content analyses are able to assess concepts such as the vagueness of annual reports or the courage in speeches of chief executive officers (CEOs). Research designs employing content analysis follow qualitative, standardized manual, dictionary and machine-learning approaches, partly combined with surveys of stakeholder groups or interviews with corporate actors.

1 Introduction

Corporate communication is an interdisciplinary concept that is approached from marketing, public relations (PR), organizational communication, and linguistic perspectives. In marketing, the role of corporate communication for loyal relationships with stakeholders is central. In PR, it is the managing of dialogic relations with an organization’s publics. For organizational communication, the social co-creation of the process of organizing is in focus (Mazzei 2014 ). In linguistics, business communication addresses the pragmatic dimension of language, often taking an (inter-)cultural perspective (Fuoli 2018 ). Regarding marketing and PR, corporate communication is often regarded as strategic communication (Zerfass et al. 2018 ). This contribution will largely focus on content analyses from a corporate communication perspective.

One central capacity of corporate communication is helping to build intangible resources that reduce transaction costs for organizations and are key for an organization’s long-term competitive advantage (Barney 1991, 2001). These intangible resources include concepts such as knowledge, trust, loyalty, reputation, responsibility, or identity (Cornelissen 2013; Fuoli 2018; Mazzei 2014). One major theme in corporate communication research is the role of corporate communication for explaining stakeholder attitudes and behavior, according to Zerfass and Viertmann’s (2017) meta study of research into corporate communication. Beyond the capacity of building intangible resources, corporate communication also enables operations, adjusts strategy, and ensures flexibility of firms (Zerfass and Viertmann 2017). That is, corporate communication supports strategic alignment, market positioning, innovation, or organizational change. These themes can become research topics in content analyses of corporate communication material.

As organizations require monetary and human resources from their environment as well as seek sales markets, organizations also acquire social support, i.e., legitimacy from their environment (Palazzo and Scherer 2006 ; Suddaby et al. 2017 ). In this institutional perspective, organizations employ strategic communication to pursue their goals and to manage their legitimacy (Suchman 1995 ). Against this background, corporate social responsibility (CSR) has become a focus in corporate communication research. CSR is often conceptualized as a company’s capacity to conform to business, legal, ethical, and philanthropic standards (Carroll 1991 , 2016 ). Operating profitably (business) and obeying the law (legal) comprise rather essential requirements, while to do what is just and fair (ethical) and to be a good citizen (philanthropic) is less obligatory but desired by society (Carroll 1991 ). Research in CSR studies has focused on perception, impact and promotion; image and reputation; performance; and generally the rhetoric of organizations (Ellerup Nielsen and Thomsen 2018 ). In CSR research, content analysis is used to assess the performance (Gunawan and Abadi 2017 ) and the credibility of CSR reports (Lock and Seele 2016 ), for instance.

Content analyses have gained popularity in corporate communication as well as CSR research since the availability of computer-aided text analysis (CATA) (Duriau et al. 2007; Short et al. 2010), a label used in organizational research. Cornelissen (2013) claims that most research into corporate communication uses surveys, e.g., for stakeholder evaluations of company reputation, while content analyses are often part of case studies alongside interviews and observations. Yet, content analyses are indispensable to identify “who says what,” in the terms of Lasswell (1948), and thus represent a classical method for analyzing corporate documents. Content analysis of annual reports “can be of real usefulness for understanding some issues of corporate strategy,” argues Bowman (1984, p. 70), because it can not only measure complex organizational constructs, including corporate culture, risk affinity, or CSR, but can also “show relationships [between constructs] which are otherwise difficult to obtain and which can be tested for validity” (ibid., p. 61). Similarly, Duriau et al. (2007, p. 6) emphasize that content analyses can reliably access “values, intentions, attitudes, and cognitions” that have manifested in corporate messages. Hence, content analyses are used in organizational studies to reveal attitudinal or cognitive aspects of organizations and organizing. In comparison to responsive methods such as surveys or interviews, Harris (2001, p. 195) suggests that content analyses serve as a “reality check” of managerial decision making.

The remainder of the article aims at providing an overview of the diversity of research themes and designs of content analyses in corporate communication.

2 Frequent Research Themes

To describe frequent research themes, I refer to two meta studies: Duriau et al. ( 2007 ) and Zerfass and Viertmann ( 2017 ). Duriau et al. ( 2007 ) conduct a meta study of content analyses in the field of organization studies between 1980 and 2005. Their analysis suggests that research into corporate communication differs regarding studies of corporate communication and studies using corporate communication material for researching corporate phenomena. They identify two major research themes that most frequently apply content analyses: (a) strategic management issues that address topics such as impression management, corporate reputation, or strategy reformulation and (b) the issue of managerial cognition involving corporate values and culture, sensemaking, blame attribution, or managerial attention in crises (Duriau et al. 2007 ).

Zerfass and Viertmann (2017, p. 69) analyze publications from the fields of “corporate communication, organizational communication, public relations, marketing, and strategic management,” independent from the application of content analyses. They identify twelve central constructs of tangible and intangible outcomes of corporate communication that are studied, i.e., relationships, trust, legitimacy, thought leadership, innovation potential, crisis resilience, reputation, brands, corporate culture, publicity, customer preferences, and employee commitment (Zerfass and Viertmann 2017). Beyond surveying stakeholder groups, for example for assessing corporate reputation (Wartick 2016), some of these concepts can in principle be measured by analyzing the content of corporate communication material and user-generated content.

The following examples provide an impression of the variety of themes studied in corporate communication and may serve as a starting point for further investigation into a specific area of interest. Interactivity dimensions of corporate websites are analyzed using content analysis (Ha and James 1998), addressing stakeholder relationships. In crisis communication research, content analysis is conducted to understand which crisis response strategies are used in corporate messages and how news coverage as well as users respond, for instance on social media (Holladay 2010). Combining document analysis and interviews, Huang-Horowitz and Evans (2020) reveal how small companies communicate their organizational identity to gain legitimacy. Regarding leadership, content analyses can reveal the degree of courage expressed by executives and related news coverage (Harris 2001). Li et al. (2018) regard innovation potential as one dimension of corporate culture, along with integrity, quality, respect, and teamwork. They measure corporate culture using a machine learning (ML) approach on a corpus of earnings calls, in which public companies discuss their financial results addressing the investor and analyst communities. The sentiment of user-generated online product reviews indicates customer preferences (Jo and Oh 2011; Tirunillai and Tellis 2014). Concerning employee commitment, Bujaki et al. (2018) reveal impression management strategies of accounting firms addressing diversity-sensitive employees. Regarding internal communication, Darics (2020) analyzes instant message conversations between employees and shows that instant messages are intended to achieve complex communication goals, including fostering informality and building team identity.

The themes of CSR messages are analyzed for various industries in CSR reports (Landrum and Ohsowski 2018) or on social media platforms like Instagram (Kwon and Lee 2021). Moreover, Lock and Seele (2016) quantitatively analyze the credibility of CSR reports by measuring truth of statements, accuracy, completeness, standards used, and sincerity, and reveal that CSR reports can be considered to be of mediocre credibility. Hoffmann et al. (2018) discursively analyze the speech of Facebook’s CEO, revealing that it centers on self-identity and constructs user identity and the relationship between Facebook and its users. As a final example, VanDyke and Tedesco (2016) analyze responsibility frames in green advertising over time, indicating that a habitat protection issue changes into energy efficiency.

3 Frequent Research Designs

Regarding research designs, corporate communication can represent the independent, dependent, or mediating variable. As the independent variable, corporate communication messages represent an antecedent to explain attitudinal outcomes (e.g., trust and reputation among customers) as well as operational outcomes (e.g., economic results, stock market performance, speed of news product releases) (see Duriau et al. 2007; Zerfass and Viertmann 2017). Here, content analyses are used to evaluate corporate content material, but also content generated by customers or followers. Moreover, research into corporate communication addresses the relation between symbolic communication, which can be assessed with content analyses, and substantive corporate action (Seiffert et al. 2011), often comparing the content of CSR communication and action (Jong and van der Meer 2017; Perez-Batres et al. 2012; Schons and Steinmeier 2016; Wickert et al. 2016).

As the dependent variable, corporate communication content is regarded as a manifestation of internal processes such as managerial sensemaking or cognition. In this case, content analysis is used to draw inferences about such internal processes (see Duriau et al. 2007). One central limitation for such inferences is intentional bias in corporate communication for specific stakeholder groups. For instance, annual reports include a bias toward the positive (Rutherford 2005) or dramatize ideas (Jameson 2000). Methodological responses to this challenge include using multiple data sources and richer databases, triangulation, and sophisticated methods that provide more accurate measurements (Duriau et al. 2007).

Corporate communication messages can also be conceptualized as a mediating variable between internal processes and organizational outcomes. For instance, Porcu et al. (2016) regard internal corporate communication as a mediator between corporate culture and operational outcomes but use a survey for data collection.

Methodologically, research designs employing content analysis follow qualitative, standardized manual, or quantitative-computational approaches, or combinations thereof. Which design to follow depends on the availability of data sources for the research question at hand and the production contexts of the specific material to be analyzed (Steenkamp and Northcott 2007). For instance, studies into corporate communication addressing journalists as a stakeholder group often compare corporate messages and news coverage using quantitative content analysis (e.g., Jonkman et al. 2020; Lischka et al. 2017; Nijkrake et al. 2015). Qualitative approaches aim at revealing organizational narratives, for instance regarding corporate responsibility (Haack et al. 2012), strategy change (Lischka 2019c), and legitimacy (van Leeuwen and Wodak 1999).

According to Duriau et al. (2007), primary data sources of corporate communication content analyses are annual reports, followed by proxy statements, trade magazines, publicly available corporate documents, mission statements, internal company documents, and notes from interviews or answers to open-ended survey questions. Moreover, news coverage (e.g., Seiffert et al. 2011; Strycharz et al. 2017), CSR reports (e.g., Lock and Seele 2016), CEO speech (e.g., Beelitz and Merkl-Davies 2012; Hoffmann et al. 2018), social media communication and engagement (e.g., Abitbol and Lee 2017; Choy and Wu 2018; Kim et al. 2014; Macnamara and Zerfass 2012), corporate blogs (e.g., Catalano 2007; Colton and Poploski 2018), advertising (e.g., VanDyke and Tedesco 2016), and text messages (Darics 2020) represent data sources. Researchers from linguistics often build a corpus based on one corporate material genre from multiple organizations, for instance, a corpus of annual reports (Fuoli 2018; Rutherford 2005) or CSR reports (Yu and Bondi 2017). Researchers from other disciplines may also create corpora but without labelling their approach as a corpus approach (e.g., Seiffert et al. 2011).

For computational analyses, researchers have developed dictionaries, for instance, a finance- and accounting-specific dictionary in English (Loughran and McDonald 2011, 2015) and German (Bannier et al. 2019), for environmental sustainability (Deng et al. 2017), and for vagueness in corporate communication (Guo et al. 2017). More general dictionaries such as Linguistic Inquiry and Word Count (LIWC) are also applied, as in Merkl-Davies et al. (2011) and Lee et al. (2020).
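In its simplest form, a dictionary approach scores a text by the share of its words that appear in a predefined word list. The Python sketch below illustrates that idea only. The word list is a hypothetical stand-in and is not taken from any of the published dictionaries cited above, and the function name and sample sentence are likewise assumptions.

```python
import re

# Hypothetical mini-dictionary for illustration only; it is NOT taken from
# any published word list.
VAGUE_TERMS = {"approximately", "may", "might", "possibly", "roughly", "somewhat"}

def dictionary_share(text, dictionary):
    """Share of words in the text that appear in the dictionary."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in dictionary)
    return hits / len(words)

# Made-up sentence standing in for a passage from a corporate report.
report = "Revenue may grow by roughly five percent and costs might possibly fall."
score = dictionary_share(report, VAGUE_TERMS)
print(round(score, 3))
```

Published dictionaries are typically validated against human coding for a specific genre, so a score from an ad hoc list like this one should not be read as a measurement of the underlying construct.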

There is a variety of methodological trends regarding content analyses of corporate communication. Research combines content analysis with other data collection methods, applies machine learning (ML) and (deep) natural language processing (NLP) techniques, and extends data capacity, contexts, and materiality. The following list provides recent exemplary studies for trends in computational methods, design, sampling, and material, with methods of computational content analysis representing a comparatively large evolving field.

ML and (deep) NLP

NLP is a computational method for analyzing naturally occurring human language by building statistical models of language, which has been applied in linguistics (Manning and Schütze 1999 ). With ML, algorithms are developed that should improve through training data and can be combined with human coding in supervised or semi-supervised settings. In deep ML, artificial neural networks are used for training (Deng and Liu 2018 ). Deep NLP can therefore use “both sentence structure and context of the text to provide a deeper understanding of the language” (Lee et al. 2020 ).

Combining human coding and ML (Park et al. 2019)

Applying semi-supervised ML (van Zoonen and van der Meer 2016 )

Applying topic modeling, which is unsupervised as it uses statistical associations of words in a text to generate topics without dictionaries or interpretive rules (Hannigan et al. 2019; Jaworska and Nanda 2016; Kobayashi et al. 2018; Schmiedel et al. 2018)

Specific dictionary development for corporate communication issues (Deng et al. 2017 ; Guo et al. 2017 )

Comparing deep NLP (IBM Watson Explorer) with dictionary approaches and human coding to detect the level of charisma in leadership speeches (Lee et al. 2020 )

Triangulation: Combining content analyses with surveys (Dudenhausen et al. 2020), combining qualitative and quantitative approaches (Jaworska and Nanda 2016)

Comparative designs: Comparative approaches within Western countries (Köhler and Zerfass 2019 ; Yu and Bondi 2019 ; Yuan 2019 ), and beyond, such as in Asia (Bondi and Yu 2015 ) and in Americana (Loureiro and Gomes 2016 )

Non-Western context: CSR communication in India (Jain and Moya 2016), in restrictive systems such as China (Zhang et al. 2017) and Russia (Sorokin et al. 2019)

Visuality: Analyzing visual rhetoric in corporate reports (Goransson and Fagerholm 2018; Greenwood et al. 2018; Ruggiero 2020) and multimodal (textual and visual) content analysis, for instance to account for the multimodality of corporate websites (Höllerer et al. 2019)
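To make the dictionary approach mentioned in the list above concrete, the following minimal sketch counts how often terms from predefined categories occur in a text. The dictionary, its two categories, and the sample sentence are hypothetical and invented for illustration; published dictionaries are much larger and are validated before use. Topic modeling, by contrast, would derive word groupings from the corpus itself without such a predefined list.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary: category -> indicator terms (assumption, not a published dictionary)
CSR_DICTIONARY = {
    "environment": {"emissions", "recycling", "climate", "renewable"},
    "community": {"donation", "volunteering", "education", "charity"},
}

def dictionary_code(text, dictionary):
    """Count how many tokens of `text` match each dictionary category."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for category, terms in dictionary.items():
        counts[category] = sum(1 for token in tokens if token in terms)
    return counts

# Invented example sentence in the style of a CSR report
report = (
    "Our recycling scheme lowered emissions, and staff volunteering "
    "supported local education."
)
category_counts = dictionary_code(report, CSR_DICTIONARY)
```

Exact matching of this kind misses inflected forms and negations, which is one reason dictionary results are typically compared against human coding.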

5 Research Desiderata

The trend toward employing large collections of texts combined with ML, such as applying topic modelling algorithms, requires advances in methodological standards, for instance regarding procedures such as structural topic models (Roberts et al. 2019), validity comparisons across content analysis methods (van Atteveldt et al. 2021), and quality criteria for automated content analyses (Laugwitz 2021). With the ability to analyze extensive data sets, complex research designs may become more attainable. For instance, the various agents and processes that constitute organizational legitimacy as proposed in Bitektine and Haack (2015) may be tackled. In doing so, qualitative approaches, for instance to understand the dynamics of corporate narratives as in Jaworksa and Nanda (2016), can be fruitfully combined with computational analyses.
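One common way to compare an automated method with human coding is a chance-corrected agreement statistic. The sketch below computes Cohen's kappa for two coders; choosing kappa is an assumption made for illustration (the text above does not prescribe a particular measure), and the two code sequences are invented.

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders over the same units."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of units")
    n = len(codes_a)
    # Share of units on which both coders assigned the same code
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's marginal code frequencies
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)

# Hypothetical sentiment codes for eight text units (invented for illustration)
human = ["pos", "neg", "pos", "neu", "neg", "pos", "neu", "neg"]
machine = ["pos", "neg", "neu", "neu", "neg", "pos", "pos", "neg"]
kappa = cohens_kappa(human, machine)
```

Here the coders agree on six of eight units, but kappa is lower than 0.75 because part of that agreement would be expected by chance.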

Regarding research objects, Zerfass and Viertmann (2017) suggest that the capacity of corporate communication should be assessed across various types and sizes of organizations (e.g., start-ups, small-and-medium enterprises, large corporations, non-profit organizations), across stakeholder groups (e.g., customers, employees, investors, and journalists), in various situational contexts (e.g., product launches, crises, and mergers), and across industries. While organizations in any industry can become objects of analysis for corporate communication research, scholars in the field of communication and journalism studies may be especially interested in the communication of organizations involved in public communication, such as media organizations (Bachmann 2016; Lischka 2019b; Siegert and Hangartner 2017) or social media platforms (Gillespie 2010; Iosifidis and Nicoli 2020; Lischka 2019a). Against the background of globally acting organizations having the power, and sometimes the obligation, to assume political roles on a global scale (Scherer and Palazzo 2011), future research should focus on such global corporations to understand how they communicate their political stances and roles. There is an additional need for comparative studies and, in particular, analyses of non-Western countries.

Moreover, the interaction of communication by multiple organizations can deliver relevant insights. As Suchman (1995, p. 592) argues, orchestrated communication by a group of companies, such as social media platforms and search engines, can become a powerful “collective evangelism” when occupying an issue. From an institutional perspective, analyzing potentially orchestrated communication of globally acting organizations can show how new institutions in societies are negotiated.

Lastly, there has been a normative turn in management research towards the “grand” challenges of global societies, including poverty, good economic growth, health disparities, climate change, and sustainability (United Nations n.d.). Against the background that organizations should build value for societies, management researchers wish to contribute to how organizations can help to address and solve these grand problems (George et al. 2016). Corporate communication researchers, especially those focusing on CSR, are uniquely positioned to address grand challenges from a corporate communication perspective. Content analyses using material from companies as well as material produced by various stakeholder groups can reveal links between communication and corporate goals as well as societal challenges on a broader scale.

Abitbol, A., & Lee, S. Y. (2017). Messages on CSR-dedicated Facebook pages: What works and what doesn’t. Public Relations Review , 43 (4), 796–808.


Bachmann, P. (2016). Medienunternehmen und der strategische Umgang mit Media Responsibility und Corporate Social Responsibility (Dissertation). Springer Fachmedien Wiesbaden GmbH.


Bannier, C., Pauls, T., & Walter, A. (2019). Content analysis of business communication: Introducing a German dictionary. Journal of Business Economics , 89 (1), 79–123.

Barney, J. B. (1991). Firm resources and sustained competitive advantage. Journal of Management , 17 (1), 99–120.

Barney, J. B. (2001). Resource-based theories of competitive advantage: A ten-year retrospective on the resource-based view. Journal of Management , 27 (6), 643–650.

Beelitz, A., & Merkl-Davies, D. M. (2012). Using discourse to restore organisational legitimacy: ‘CEO-speak’ after an incident in a German nuclear power plant. Journal of Business Ethics , 108 (1), 101–120.

Bitektine, A., & Haack, P. (2015). The "macro" and the "micro" of legitimacy: Toward a multilevel theory of the legitimacy process. Academy of Management Review , 40 (1), 49–75.

Bondi, M., & Yu, D. (2015). Textual voices in corporate reporting: A cross-cultural analysis of Chinese, Italian, and American CSR reports. International Journal of Business Communication , 56 (2), 173–197.

Bowman, E. H. (1984). Content analysis of annual reports for corporate strategy and risk. Interfaces , 14 (1), 61–71. Retrieved from

Bujaki, M., Durocher, S., Brouard, F., Neilson, L., & Pyper, R. (2018). Protect, profit, profess, promote: Establishing legitimacy through logics of diversity in Canadian accounting firm recruitment documents. Canadian Journal of Administrative Sciences / Revue Canadienne des Sciences de l'Administration , 35 (1), 162–178.

Carroll, A. B. (1991). The pyramid of corporate social responsibility: Toward the moral management of organizational stakeholders. Business Horizons , 34 (4), 39–48.

Carroll, A. B. (2016). Carroll’s pyramid of CSR: Taking another look. International Journal of Corporate Social Responsibility , 1 (1), 446.

Catalano, C. S. (2007). Megaphones to the internet and the world: The role of blogs in corporate communications. International Journal of Strategic Communication , 1 (4), 247–262.

Choy, C. H. Y., & Wu, F. (2018). Comparative case study: When brands handle online confrontations. International Journal of Conflict Management , 29 (5), 640–658.

Colton, D. A., & Poploski, S. P. (2018). A content analysis of corporate blogs to identify communications strategies, objectives and dimensions of credibility. Journal of Promotion Management , 25 (4), 609–630.

Cornelissen, J. P. (2013). Corporate communication. The International Encyclopedia of Communication . American Cancer Society.

Darics, E. (2020). E-Leadership or “How to be boss in instant messaging? ” The role of nonverbal communication. International Journal of Business Communication , 57 (1), 3–29.

Deng, L., & Liu, Y. (Eds.) (2018). Deep learning in natural language processing . Singapore: Springer Singapore. Retrieved from

Deng, Q., Hine, M., Ji, S., & Sur, S. (2017). Building an environmental sustainability dictionary for the IT industry. Proceedings of the Annual Hawaii International Conference on System Sciences . Hawaii International Conference on System Sciences.

Dudenhausen, A., Röttger, U., & Czeppel, D. (2020). Do corporations communicate what the general public expects? Investigating the gap between corporate self-image and public perceptions of corporate responsibility. International Journal of Strategic Communication , 14 (1), 25–40.

Duriau, V. J., Reger, R. K., & Pfarrer, M. D. (2007). A content analysis of the content analysis literature in organization studies: Research themes, data sources, and methodological refinements. Organizational Research Methods , 10 (1), 5–34.

Ellerup Nielsen, A., & Thomsen, C. (2018). Reviewing corporate social responsibility communication: A legitimacy perspective. Corporate Communications: An International Journal , 23 (4), 492–511.

Fuoli, M. (2018). Building a trustworthy corporate identity: A corpus-based analysis of stance in annual and corporate social responsibility reports. Applied Linguistics , 39 (6), 846–885.

George, G., Howard-Grenville, J., Joshi, A., & Tihanyi, L. (2016). Understanding and tackling societal grand challenges through management research. Academy of Management Journal , 59 (6), 1880–1895.

Gillespie, T. (2010). The politics of ‘platforms’. New Media & Society , 12 (3), 347–364.

Goransson, K., & Fagerholm, A.-S. (2018). Towards visual strategic communications. Journal of Communication Management , 22 (1), 46–66.

Greenwood, M., Jack, G., & Haylock, B. (2018). Toward a methodology for analyzing visual rhetoric in corporate reports. Organizational Research Methods , 22 (3), 798–827.

Gunawan, J., & Abadi, K. (2017). Content analysis method: A proposed scoring for quantitative and qualitative disclosures. In D. Crowther & L. M. Lauesen (Eds.), Handbook of research methods in corporate social responsibility (pp. 349–363). Cheltenham, UK, Northampton, MA, USA: Edward Elgar Publishing.

Guo, W., Yu, T., & Gimeno, J. (2017). Language and competition: Communication vagueness, interpretation difficulties, and market entry. Academy of Management Journal , 60 (6), 2073–2098.

Ha, L., & James, E. L. (1998). Interactivity reexamined: A baseline analysis of early business web sites. Journal of Broadcasting & Electronic Media , 42 (4), 457–474.

Haack, P., Schoeneborn, D., & Wickert, C. (2012). Talking the talk, moral entrapment, creeping commitment? Exploring narrative dynamics in corporate responsibility standardization. Organization Studies , 33 (5-6), 815–845.

Hannigan, T. R., Haans, R. F. J., Vakili, K., Tchalian, H., Glaser, V. L., Wang, M. S., . . . Jennings, P. D. (2019). Topic modeling in management research: Rendering new theory from textual data. The Academy of Management Annals , 13 (2), 586–632.

Harris, H. (2001). Content analysis of secondary data: A study of courage in managerial decision making. Journal of Business Ethics , 34 (3/4), 191–208.

Hoffmann, A. L., Proferes, N., & Zimmer, M. (2018). “Making the world more open and connected”: Mark Zuckerberg and the discursive construction of Facebook and its users. New Media & Society , 20 (1), 199-218.

Holladay, S. J. (2010). Are they practicing what we are preaching? An investigation of crisis communication strategies in the media coverage of chemical accidents. In S. J. Holladay & W. T. Coombs (Eds.), Handbooks in communication and media. The handbook of crisis communication (pp. 159–180). Chichester, U.K, Malden, MA: Wiley-Blackwell.

Höllerer, M. A., van Leeuwen, T., & Jancsary, D. (2019). Visual and multimodal research in organization and management studies . Routledge studies in management, organizations and society .

Huang-Horowitz, N. C., & Evans, S. K. (2020). Communicating organizational identity as part of the legitimation process: A case study of small firms in an Emerging Field. International Journal of Business Communication , 57 (3), 327–351.

Iosifidis, P., & Nicoli, N. (2020). The battle to end fake news: A qualitative content analysis of Facebook announcements on how it combats disinformation. International Communication Gazette , 82 (1), 60–81.

Jain, R., & Moya, M. de (2016). News media and corporate representation of CSR in India. International Journal of Strategic Communication , 11 (1), 61–78.

Jameson, D. A. (2000). Telling the investment story: A narrative analysis of shareholder reports. Journal of Business Communication , 37 (1), 7–38.

Jaworksa, S., & Nanda, A. (2016). Doing well by talking good: A topic modelling-assisted discourse study of corporate social responsibility. Applied Linguistics , 38 , amw014.

Jo, Y., & Oh, A. H. (2011). Aspect and sentiment unification model for online review analysis. In I. King, W. Nejdl, & H. Li (Eds.), Proceedings of the fourth ACM international conference on Web search and data mining - WSDM '11 (p. 815). New York, New York, USA: ACM Press.

Jong, M. D. T. de, & van der Meer, M. (2017). How does it fit? Exploring the congruence between organizations and their corporate social responsibility (CSR) activities. Journal of Business Ethics , 143 (1), 71–83.

Jonkman, J. G.F., Trilling, D., Verhoeven, P., & Vliegenthart, R. (2020). To pass or not to pass: How corporate characteristics affect corporate visibility and tone in company news coverage. Journalism Studies , 21 (1), 1–18.

Kim, S., Kim, S.-Y., & Hoon Sung, K. (2014). Fortune 100 companies’ Facebook strategies: Corporate ability versus social responsibility. Journal of Communication Management , 18 (4), 343–362.

Kobayashi, V. B., Mol, S. T., Berkers, H. A., Kismihók, G., & Den Hartog, D. N. (2018). Text mining in organizational research. Organizational Research Methods , 21 (3), 733–765.

Köhler, K., & Zerfass, A. (2019). Communicating the corporate strategy. Journal of Communication Management , 23 (4), 348–374.

Kwon, K., & Lee, J. (2021). Corporate social responsibility advertising in social media: A content analysis of the fashion industry’s CSR advertising on Instagram. Corporate Communications: An International Journal, 26 (4), 700–715.

Landrum, N. E., & Ohsowski, B. (2018). Identifying Worldviews on Corporate Sustainability: a content analysis of corporate sustainability reports. Business Strategy and the Environment 27 (1), 128–151.

Laugwitz, L. (2021). Qualitätskriterien für die automatische Inhaltsanalyse. Zur Integration von Verfahren des maschinellen Lernens in die Kommunikationswissenschaft.

Lasswell, H. D. (1948). The structure and function of communication in society. In L. Bryson (Ed.), The communication of ideas (pp. 37–52). New York: Harper.

Lee, L. W., Dabirian, A., McCarthy, I. P., & Kietzmann, J. (2020). Making sense of text: Artificial intelligence-enabled content analysis. European Journal of Marketing , 54 (3), 615–644.

Li, K., Mai, F., Shen, R., & Yan, X. (2018). Measuring corporate culture using machine learning. SSRN Electronic Journal. Advance online publication.


Lischka, J. A., Stressig, J., & Bünzli, F. (2017). News about newspaper advertisers: To what extent can corporate advertising budgets predict editorial uptake and coverage of corporate press releases? Journalism , 18 (10), 1397–1414.

Lischka, J. A. (2019a). A badge of honor? How the New York Times discredits president Trump’s fake news accusations. Journalism Studies , 20 (2), 287–304.

Lischka, J. A. (2019b). Strategic communication as discursive institutional work: A critical discourse analysis of Mark Zuckerberg’s legitimacy talk at the European Parliament. International Journal of Strategic Communication , 13 (3), 197–213.

Lischka, J. A. (2019c). Strategic renewal during technology change: Tracking the digital journey of legacy news companies. Journal of Media Business Studies , 16 (3), 182–201.

Lock, I., & Seele, P. (2016). The credibility of CSR (corporate social responsibility) reports in Europe. Evidence from a quantitative content analysis in 11 countries. Journal of Cleaner Production , 122 , 186–200.

Loughran, T., & McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. Journal of Finance , 66 (1), 35–65.

Loughran, T., & McDonald, B. (2015). The use of word lists in textual analysis. Journal of Behavioral Finance , 16 (1), 1–11.

Loureiro, S. M. C., & Gomes, D. G. (2016). Relationship between companies and the public on Facebook: The Portuguese and the Brazilian context. Journal of Promotion Management , 22 (5), 705–718.

Macnamara, J., & Zerfass, A. (2012). Social media communication in organizations: The challenges of balancing openness, strategy, and management. International Journal of Strategic Communication , 6 (4), 287–308.

Manning, C. D., & Schütze, H. (1999). Foundations of statistical natural language processing . Cambridge, Mass.: MIT Press.

Mazzei, A. (2014). A multidisciplinary approach for a new understanding of corporate communication. Corporate Communications: An International Journal , 19 (2), 216–230.

Merkl‐Davies, D. M., Brennan, N. M., & McLeay, S. J. (2011). Impression management and retrospective sense‐making in corporate narratives. Accounting, Auditing & Accountability Journal, 24 (3), 315–344.

Nijkrake, J., Gosselt, J. F., & Gutteling, J. M. (2015). Competing frames and tone in corporate communication versus media coverage during a crisis. Public Relations Review , 41 (1), 80–88.

Palazzo, G., & Scherer, A. G. (2006). Corporate legitimacy as deliberation: A communicative framework. Journal of Business Ethics , 66 (1), 71–88.

Park, Y. E., Son, H., Yang, S.-U., & Lee, J. K. (2019). A good company gone bad. Journal of Communication Management , 23 (1), 31–51.

Perez-Batres, L., Doh, J., van Miller, & Pisani, M. (2012). Stakeholder pressures as determinants of CSR strategic choice: Why do firms choose symbolic versus substantive self-regulatory codes of conduct? Journal of Business Ethics , 110 (2), 157–172.

Porcu, L., del Barrio-García, S., Alcántara-Pilar, J. M., & Crespo-Almendros, E. (2016). The mediating role of integrated corporate communication on the relationship between organizational culture and market performance. In L. Petruzzellis & R. S. Winer (Eds.), Developments in Marketing Science: Proceedings of the Academy of Marketing Science. Rediscovering the Essentiality of Marketing (pp. 433–438). Cham: Springer International Publishing.

Roberts, M. E., Stewart, B. M., & Tingley, D. (2019). stm: An R package for structural topic models. Journal of Statistical Software , 91 (2).

Ruggiero, P. (2020). No longer only numbers: An exploratory analysis of the visual turn in reporting of public sector organisations. In F. Manes-Rossi & R. Levy Orelli (Eds.), New Trends in Public Sector Reporting: Integrated Reporting and Beyond (pp. 105–127). Cham: Springer International Publishing.

Rutherford, B. A. (2005). Genre analysis of corporate annual report narratives: A corpus linguistics-based approach. Journal of Business Communication , 42 (4), 349–378.

Scherer, A. G., & Palazzo, G. (2011). The new political role of business in a globalized world: A review of a new perspective on CSR and its implications for the firm, governance, and democracy. Journal of Management Studies , 48 (4), 899–931.

Schmiedel, T., Müller, O., & Vom Brocke, J. (2018). Topic modeling as a strategy of inquiry in organizational research: A tutorial with an application example on organizational culture. Organizational Research Methods , 22 (4), 941–968.

Schons, L., & Steinmeier, M. (2016). Walk the talk? How symbolic and substantive CSR actions affect firm performance depending on stakeholder proximity. Corporate Social Responsibility and Environmental Management , 23 (6), 358–372.

Seiffert, J., Bentele, G., & Mende, L. (2011). An explorative study on discrepancies in communication and action of German companies. Journal of Communication Management , 15 (4), 349–367.

Short, J. C., Broberg, J. C., Cogliser, C. C., & Brigham, K. H. (2010). Construct validation using computer-aided text analysis (CATA). Organizational Research Methods , 13 (2), 320–347.

Siegert, G., & Hangartner, S. (2017). Media branding: A strategy to align values to media management? In K.-D. Altmeppen, C. A. Hollifield, & J. van Loon (Eds.), Value‐oriented media management: Decision making between profit and responsibility . Berlin: Springer International.

Sorokin, G. G., Rybakova, A. I., & Popova, I. N. (2019). Print mass media as a government tool in strategic communications: A study based on content analysis of publications in Russia. Media Watch , 10 (1).

Steenkamp, N., & Northcott, D. (2007). Content analysis in accounting research: The practical challenges. Australian Accounting Review , 17 (43), 12–25.

Strycharz, J., Strauss, N., & Trilling, D. (2017). The role of media coverage in explaining stock market fluctuations: Insights for strategic financial communication. International Journal of Strategic Communication , 12 (1), 67–85.

Suchman, M. C. (1995). Managing legitimacy: Strategic and institutional approaches. Academy of Management Review , 20 (3), 571–610.

Suddaby, R., Bitektine, A., & Haack, P. (2017). Legitimacy. The Academy of Management Annals , 11 (1), 451–478.

Tirunillai, S., & Tellis, G. J. (2014). Mining marketing meaning from online chatter: Strategic brand analysis of big data using latent dirichlet allocation. Journal of Marketing Research , 51 (4), 463–479.

United Nations (n.d.). Sustainable development goals: 17 goals to transform our world. Retrieved from

van Atteveldt, W., van der Velden, M. A. C. G., & Boukes, M. (2021). The validity of sentiment analysis: comparing manual annotation, crowd-coding, dictionary approaches, and machine learning algorithms. Communication Methods and Measures 15 (2), 121–140.

van Leeuwen, T., & Wodak, R. (1999). Legitimizing immigration control: A discourse-historical analysis. Discourse Studies , 1 (1), 83–118.

van Zoonen, W., & van der Meer, T. G.L.A. (2016). Social media research: The application of supervised machine learning in organizational communication research. Computers in Human Behavior , 63 , 132–141.

VanDyke, M. S., & Tedesco, J. C. (2016). Understanding green content strategies: An analysis of environmental advertising frames from 1990 to 2010. International Journal of Strategic Communication , 10 (1), 36–50.

Wartick, S. L. (2016). Measuring corporate reputation. Business & Society , 41 (4), 371–392.

Wickert, C., Scherer, A. G., & Spence, L. J. (2016). Walking and talking corporate social responsibility: Implications of firm size and organizational cost. Journal of Management Studies , 53 (7), 1169–1196.

Yu, D., & Bondi, M. (2017). The generic structure of CSR reports in Italian, Chinese, and English: A corpus-based analysis. IEEE Transactions on Professional Communication , 60 (3), 273–291.

Yu, D., & Bondi, M. (2019). A genre-based analysis of forward-looking statements in corporate social responsibility reports. Written Communication , 36 (3), 379–409.

Yuan, S. (2019). Comparative analysis of Chinese and Japanese corporate communication on facebook and twitter. Chinese Journal of Communication , 12 (2), 224–243.

Zerfass, A., Verčič, D., Nothhaft, H., & Werder, K. P. (2018). Strategic communication: Defining the field and its contribution to research and practice. International Journal of Strategic Communication , 12 (4), 487–505.

Zerfass, A., & Viertmann, C. (2017). Creating business value through corporate communication. Journal of Communication Management , 21 (1), 68–81.

Zhang, T., Khalitova, L., Myslik, B., Mohr, T. L., Kim, J. Y., & Kiousis, S. (2017). Comparing Chinese state-sponsored media’s agenda-building influence on Taiwan and Singapore media during the 2014 Hong Kong Protest. Chinese Journal of Communication , 11 (1), 66–87.


Author information

Authors and affiliations.

Journalistik und Kommunikationswissenschaft, Universität Hamburg, Hamburg, Germany

Juliane A. Lischka


Corresponding author

Correspondence to Juliane A. Lischka.

Editor information

Editors and affiliations.

Fachhochschule Graubünden, Chur, Schweiz

Franziska Oehmer-Pedrazzi

IKMZ - Institut für Kommunikationswissenschaft und Medienforschung, Universität Zürich, Zürich, Schweiz

Sabrina Heike Kessler

Edda Humprecht

Zürcher Hochschule für angewandte Wissenschaft (ZHAW), Zürich, Schweiz

Katharina Sommer

Laia Castro

Rights and permissions

Open Access This chapter is published under the Creative Commons Attribution 4.0 International License, which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

The images and other third-party material in this chapter are also subject to the stated Creative Commons license, unless the figure legend indicates otherwise. If the material in question is not covered by the stated Creative Commons license and the intended action is not permitted by statutory regulation, the consent of the respective rights holder must be obtained for the further uses of the material listed above.


Copyright information

© 2023 The Author(s)

About this chapter

Lischka, J.A. (2023). Content Analysis in the Research Field of Corporate Communication. In: Oehmer-Pedrazzi, F., Kessler, S.H., Humprecht, E., Sommer, K., Castro, L. (eds) Standardisierte Inhaltsanalyse in der Kommunikationswissenschaft – Standardized Content Analysis in Communication Research. Springer VS, Wiesbaden.



Published : 25 September 2022

Publisher Name : Springer VS, Wiesbaden

Print ISBN : 978-3-658-36178-5

Online ISBN : 978-3-658-36179-2

eBook Packages : Social Science and Law (German Language)

Afr J Emerg Med. 2017 Sep; 7(3).

A hands-on guide to doing content analysis

Christen Erlingsson

a Department of Health and Caring Sciences, Linnaeus University, Kalmar 391 82, Sweden

Petra Brysiewicz

b School of Nursing & Public Health, University of KwaZulu-Natal, Durban 4041, South Africa


There is a growing recognition of the important role played by qualitative research and its usefulness in many fields, including the emergency care context in Africa. Novice qualitative researchers are often daunted by the prospect of qualitative data analysis and thus may experience much difficulty in the data analysis process. Our objective with this manuscript is to provide a practical hands-on example of qualitative content analysis to aid novice qualitative researchers in their task.

African relevance

  • Qualitative research is useful to deepen the understanding of the human experience.
  • Novice qualitative researchers may benefit from this hands-on guide to content analysis.
  • Practical tips and data analysis templates are provided to assist in the analysis process.


There is a growing recognition of the important role played by qualitative research and its usefulness in many fields, including emergency care research. An increasing number of health researchers are currently opting to use various qualitative research approaches in exploring and describing complex phenomena, providing textual accounts of individuals’ “life worlds”, and giving voice to the vulnerable populations our patients so often represent. Many articles and books are available that describe qualitative research methods and provide overviews of content analysis procedures [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]. Some articles include step-by-step directions intended to clarify content analysis methodology. What we have found in our teaching experience is that these directions are indeed very useful. However, qualitative researchers, especially novice researchers, often struggle to understand what is happening on and between steps, i.e., how the steps are taken.

As research supervisors of postgraduate health professionals, we often meet students who present brilliant ideas for qualitative studies that have potential to fill current gaps in the literature. Typically, the suggested studies aim to explore human experience. Research questions exploring human experience are expediently studied through analysing textual data, e.g., data collected in individual interviews, focus groups, documents, or documented participant observation. When reflecting on the proposed study aim together with the student, we often suggest content analysis methodology as the best fit for the study and the student, especially the novice researcher. The interview data are collected and the content analysis adventure begins. Students soon realise that data based on human experiences are complex, multifaceted and often carry meaning on multiple levels.

For many novice researchers, analysing qualitative data is found to be unexpectedly challenging and time-consuming. As they soon discover, there is no step-wise analysis process that can be applied to the data like a pattern cutter at a textile factory. They may become extremely annoyed and frustrated during the hands-on enterprise of qualitative content analysis.

The novice researcher may lament, “I’ve read all the methodology but don’t really know how to start and exactly what to do with my data!” They grapple with qualitative research terms and concepts, for example, the differences between meaning units, codes, categories and themes, and the increasing levels of abstraction from raw data to categories or themes. The content analysis adventure may now seem to be a chaotic undertaking. But life is messy, complex and utterly fascinating. Experiencing chaos during analysis is normal. Good advice for the qualitative researcher is to be open to the complexity in the data and utilise one’s flow of creativity.

Inspired primarily by descriptions of “conventional content analysis” in Hsieh and Shannon [3], “inductive content analysis” in Elo and Kyngäs [5] and “qualitative content analysis of an interview text” in Graneheim and Lundman [1], we have written this paper to help the novice qualitative researcher navigate the uncertainty in-between the steps of qualitative content analysis. We will provide advice and practical tips, as well as data analysis templates, to attempt to ease frustration and, hopefully, inspire readers to discover how this exciting methodology contributes to developing a deeper understanding of human experience and our professional contexts.

Overview of qualitative content analysis

Synopsis of content analysis

A common starting point for qualitative content analysis is often transcribed interview texts. The objective in qualitative content analysis is to systematically transform a large amount of text into a highly organised and concise summary of key results. Analysis of the raw data from verbatim transcribed interviews to form categories or themes is a process of further abstraction of data at each step of the analysis; from the manifest and literal content to latent meanings (Fig. 1 and Table 1).

Fig. 1. Example of analysis leading to higher levels of abstraction; from manifest to latent content.

Table 1. Glossary of terms as used in this hands-on guide to doing content analysis.

The initial step is to read and re-read the interviews to get a sense of the whole, i.e., to gain a general understanding of what your participants are talking about. At this point you may already start to get ideas of the main points your participants are expressing. Next, you divide the text into smaller parts, namely, meaning units. You then condense these meaning units further, ensuring that the core meaning is still retained. The next step is to label the condensed meaning units by formulating codes and then to group these codes into categories. Depending on the study’s aim and the quality of the collected data, you may choose categories as the highest level of abstraction for reporting results, or you can go further and create themes [1], [2], [3], [5], [8].
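For readers who keep their analysis in a spreadsheet or script, the chain from meaning unit to condensed meaning unit, code, and category can be pictured as one record per meaning unit. The sketch below is only an illustration of that bookkeeping, not a tool used by the authors; the class name, field names, and all excerpts are hypothetical and invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class MeaningUnit:
    text: str       # verbatim excerpt from the transcript
    condensed: str  # shortened version that keeps the core meaning
    code: str       # short label for the condensed meaning unit
    category: str   # group of related codes

def group_codes_by_category(units):
    """Return {category: sorted list of the distinct codes it contains}."""
    grouped = defaultdict(set)
    for unit in units:
        grouped[unit.category].add(unit.code)
    return {category: sorted(codes) for category, codes in grouped.items()}

# Hypothetical excerpts, invented for illustration (not from the study's transcript)
units = [
    MeaningUnit(
        "I sat there for hours and nobody told me what was going on.",
        "Waited hours without information",
        "Lack of information",
        "Feeling left out",
    ),
    MeaningUnit(
        "The staff rushed past me as if I was not even there.",
        "Staff rushed past, felt unseen",
        "Feeling invisible",
        "Feeling left out",
    ),
    MeaningUnit(
        "One nurse finally sat down and explained everything.",
        "Nurse took time to explain",
        "Receiving explanations",
        "Being cared for",
    ),
]
categories = group_codes_by_category(units)
```

Keeping the verbatim text alongside each code makes it easy to return to the raw data when a code or category needs to be revised.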

Content analysis as a reflective process

You must mould the clay of the data , tapping into your intuition while maintaining a reflective understanding of how your own previous knowledge is influencing your analysis, i.e., your pre-understanding. In qualitative methodology, it is imperative to vigilantly maintain an awareness of one’s pre-understanding so that this does not influence analysis and/or results. This is the difficult balancing task of keeping a firm grip on one’s assumptions, opinions, and personal beliefs, and not letting them unconsciously steer your analysis process while simultaneously, and knowingly, utilising one’s pre-understanding to facilitate a deeper understanding of the data.

Content analysis, as in all qualitative analysis, is a reflective process. There is no “step 1, 2, 3, done!” linear progression in the analysis. This means that identifying and condensing meaning units, coding, and categorising are not one-time events. It is a continuous process of coding and categorising then returning to the raw data to reflect on your initial analysis. Are you still satisfied with the length of meaning units? Do the condensed meaning units and codes still “fit” with each other? Do the codes still fit into this particular category? Typically, a fair amount of adjusting is needed after the first analysis endeavour. For example: a meaning unit might need to be split into two meaning units in order to capture an additional core meaning; a code modified to more closely match the core meaning of the condensed meaning unit; or a category name tweaked to most accurately describe the included codes. In other words, analysis is a flexible reflective process of working and re-working your data that reveals connections and relationships. Once condensed meaning units are coded it is easier to get a bigger picture and see patterns in your codes and organise codes in categories.

Content analysis exercise

The synopsis above is representative of analysis descriptions in many content analysis articles. Although correct, such method descriptions still do not provide much support for the novice researcher during the actual analysis process. Aspiring to provide guidance and direction to support the novice, a practical example of doing the actual work of content analysis is provided in the following sections. This practical example is based on a transcribed interview excerpt that was part of a study that aimed to explore patients’ experiences of being admitted into the emergency centre ( Fig. 2 ).

Fig. 2. Excerpt from interview text exploring “Patient’s experience of being admitted into the emergency centre”.

This content analysis exercise provides instructions, tips, and advice to support the content analysis novice in a) familiarising oneself with the data and the hermeneutic spiral, b) dividing up the text into meaning units and subsequently condensing these meaning units, c) formulating codes, and d) developing categories and themes.

Familiarising oneself with the data and the hermeneutic spiral

An important initial phase in the data analysis process is to read and re-read the transcribed interview while keeping your aim in focus. Write down your initial impressions. Embrace your intuition. What is the text talking about? What stands out? How did you react while reading the text? What message did the text leave you with? In this analysis phase, you are gaining a sense of the text as a whole.

You may ask why this is important. During analysis, you will be breaking down the whole text into smaller parts. Returning to your notes with your initial impressions will help you see if your “parts” analysis is matching up with your first impressions of the “whole” text. Are your initial impressions visible in your analysis of the parts? Perhaps you need to go back and check for different perspectives. This is what is referred to as the hermeneutic spiral or hermeneutic circle. It is the process of comparing the parts to the whole to determine whether impressions of the whole verify the analysis of the parts in all phases of analysis. Each part should reflect the whole and the whole should be reflected in each part. This concept will become clearer as you start working with your data.

Dividing up the text into meaning units and condensing meaning units

You have now read the interview a number of times. Keeping your research aim and question clearly in focus, divide up the text into meaning units. Located meaning units are then condensed further while keeping the central meaning intact ( Table 2 ). The condensation should be a shortened version of the same text that still conveys the essential message of the meaning unit. Sometimes the meaning unit is already so compact that no further condensation is required. Some content analysis sources warn researchers against short meaning units, claiming that this can lead to fragmentation [1] . However, our personal experience as research supervisors has shown us that a greater problem for the novice is basing analysis on meaning units that are too large and include many meanings which are then lost in the condensation process.

Table 2. Suggestion for how the exemplar interview text can be divided into meaning units and condensed meaning units (condensations are in parentheses).

Formulating codes

The next step is to develop codes that are descriptive labels for the condensed meaning units ( Table 3 ). Codes concisely describe the condensed meaning unit and are tools to help researchers reflect on the data in new ways. Codes make it easier to identify connections between meaning units. At this stage of analysis you are still keeping very close to your data with very limited interpretation of content. You may adjust, re-do, re-think, and re-code until you get to the point where you are satisfied that your choices are reasonable. Just as in the initial phase of getting to know your data as a whole, it is also good to write notes during coding on your impressions and reactions to the text.

Table 3. Suggestions for coding of condensed meaning units.

Developing categories and themes

The next step is to sort codes into categories that answer the questions who, what, when or where? One does this by comparing codes and appraising them to determine which codes seem to belong together, thereby forming a category. In other words, a category consists of codes that appear to deal with the same issue, i.e., manifest content visible in the data with limited interpretation on the part of the researcher. Category names are most often short and factual sounding.

In data that is rich with latent meaning, analysis can be carried on to create themes. In our practical example, we have continued the process of abstracting data to a higher level, from category to theme level, and developed three themes as well as an overarching theme (Table 4). Themes express underlying meaning, i.e., latent content, and are formed by grouping two or more categories together. Themes answer questions such as why, how, in what way or by what means? Therefore, theme names include verbs, adverbs and adjectives and are very descriptive or even poetic.

Table 4. Suggestion for organisation of coded meaning units into categories and themes.

Some reflections and helpful tips

Understand your pre-understandings.

While conducting qualitative research, it is paramount that the researcher remains vigilant against bias during analysis. In other words, did you remain aware of your pre-understandings, i.e., your own personal assumptions, professional background, and previous experiences and knowledge? For example, did you zero in on particular aspects of the interview on account of your profession (as an emergency doctor, emergency nurse, pre-hospital professional, etc.)? Did you assume the patient’s gender? Did your assumptions affect your analysis? How about aspects of culpability; did you assume that this patient was at fault or that this patient was a victim in the crash? Did this affect how you analysed the text?

Staying aware of one’s pre-understandings is exactly as difficult as it sounds. But, it is possible and it is requisite. Focus on putting yourself and your pre-understandings in a holding pattern while you approach your data with an openness and expectation of finding new perspectives. That is the key: expect the new and be prepared to be surprised. If something in your data feels unusual, is different from what you know, atypical, or even odd – don’t by-pass it as “wrong”. Your reactions and intuitive responses are letting you know that here is something to pay extra attention to, besides the more comfortable condensing and coding of more easily recognisable meaning units.

Use your intuition

Intuition is a great asset in qualitative analysis and not to be dismissed as “unscientific”. Intuition results from tacit knowledge. Just as tacit knowledge is a hallmark of great clinicians [11], [12], it is also an invaluable tool in analysis work [13]. Literally, take note of your gut reactions and intuitive guidance and remember to write these down! These notes often form a framework of possible avenues for further analysis and are especially helpful as you lift the analysis to higher levels of abstraction; from meaning units to condensed meaning units, to codes, to categories and then to the highest level of abstraction in content analysis, themes.

Aspects of coding and categorising hard to place data

All too often, the novice gets overwhelmed by interview material that deals with the general subject matter of the interview, but doesn’t seem to answer the research question. Don’t be too quick to consider such text as off topic or dross [6] . There is often data that, although not seeming to match the study aim precisely, is still important for illuminating the problem area. This can be seen in our practical example about exploring patients’ experiences of being admitted into the emergency centre. Initially the participant is describing the accident itself. While not directly answering the research question, the description is important for understanding the context of the experience of being admitted into the emergency centre. It is very common that participants will “begin at the beginning” and prologue their narratives in order to create a context that sets the scene. This type of contextual data is vital for gaining a deepened understanding of participants’ experiences.

In our practical example, the participant begins by describing the crash and the rescue, i.e., experiences leading up to and prior to admission to the emergency centre. That is why we have chosen in our analysis to code the condensed meaning unit “Ambulance staff looked worried about all the blood” as “In the ambulance” and place it in the category “Reliving the rescue”. We did not choose to include this meaning unit in the categories specifically about admission to the emergency centre itself. Do you agree with our coding choice? Would you have chosen differently?
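For readers who keep their analysis in a script rather than a table template, the coding choice described above could be recorded roughly as follows. This is only a sketch: the class name `CodedUnit` and the helper `group_by_category` are our own illustrative assumptions, not part of the method, and the single record uses the condensed meaning unit, code and category quoted in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CodedUnit:
    """One row of an analysis table (hypothetical structure)."""
    condensed_meaning_unit: str
    code: str
    category: str
    # Left empty when categories are the highest level of abstraction reported.
    theme: Optional[str] = None


def group_by_category(units):
    """Collect the codes that have been placed in each category."""
    grouped = {}
    for unit in units:
        grouped.setdefault(unit.category, []).append(unit.code)
    return grouped


units = [
    CodedUnit(
        condensed_meaning_unit="Ambulance staff looked worried about all the blood",
        code="In the ambulance",
        category="Reliving the rescue",
    ),
]

by_category = group_by_category(units)
print(by_category)
```

Keeping the condensed meaning unit next to its code and category makes it easy to return to the text when a coding choice is questioned later.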

Another common problem for the novice is deciding how to code condensed meaning units when the unit can be labelled in several different ways. At this point researchers usually groan and wish they had thought to ask one of those classic follow-up questions like “Can you tell me a little bit more about that?” We have examples of two such coding conundrums in the exemplar, as can be seen in Table 3 (codes we conferred on) and Table 4 (codes we reached consensus on). Do you agree with our choices or would you have chosen different codes? Our best advice is to go back to your impressions of the whole and lean into your intuition when choosing codes that are most reasonable and best fit your data.

A typical problem area during categorisation, especially for the novice researcher, is overlap between content in more than one initial category, i.e., codes included in one category also seem to be a fit for another category. Overlap between initial categories is very likely an indication that the jump from code to category was too big, a problem not uncommon when the data is voluminous and/or very complex. In such cases, it can be helpful to first sort codes into narrower categories, so-called subcategories. Subcategories can then be reviewed for possibilities of further aggregation into categories. In the case of a problematic coding, it is advantageous to return to the meaning unit and check if the meaning unit itself fits the category or if you need to reconsider your preliminary coding.
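If codes and initial categories are kept in a simple script, overlap of this kind can be spotted mechanically before deciding whether subcategories are needed. The following is a minimal sketch under that assumption; the function name `find_overlapping_codes` and the placeholder categories and codes are hypothetical and do not come from the exemplar.

```python
from collections import defaultdict


def find_overlapping_codes(categories):
    """Return {code: [categories]} for every code placed in more than one category."""
    placements = defaultdict(set)
    for category, codes in categories.items():
        for code in codes:
            placements[code].add(category)
    return {
        code: sorted(cats)
        for code, cats in placements.items()
        if len(cats) > 1
    }


# Hypothetical placeholder data, not taken from the exemplar interview.
initial_categories = {
    "Category A": ["code 1", "code 2", "code 3"],
    "Category B": ["code 3", "code 4"],
}

overlaps = find_overlapping_codes(initial_categories)
print(overlaps)
```

A check like this only flags where overlap occurs; deciding whether to split a category or recode a meaning unit remains the researcher's judgement.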

It is not uncommon to be faced by thorny problems such as these during coding and categorisation. Here we would like to reiterate how valuable it is to have fellow researchers with whom you can discuss and reflect, in order to reach consensus on the best way forward in your data analysis. It is really advantageous to compare your analysis with meaning units, condensations, coding and categorisations done by another researcher on the same text. Have you identified the same meaning units? Do you agree on coding? See similar patterns in the data? Concur on categories? Sometimes referred to as “researcher triangulation,” this is actually a key element in qualitative analysis and an important component when striving to ensure trustworthiness in your study [14]. Qualitative research is about seeking out variations and not controlling variables, as in quantitative research. Collaborating with others during analysis lets you tap into multiple perspectives and often makes it easier to see variations in the data, thereby enhancing the quality of your results as well as contributing to the rigor of your study. It is important to note that it is not necessary to force consensus in the findings; one can embrace these variations in interpretation and use them to capture the richness in the data.
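When two researchers have coded the same text, a small script can list where their codings agree and where they differ, as a starting point for discussion. This is a sketch only: the function `compare_codings`, the meaning unit identifiers and the codes are hypothetical, and the output is meant to prompt reflection rather than to score agreement.

```python
def compare_codings(coder_a, coder_b):
    """Compare two {meaning_unit_id: code} mappings made on the same text.

    Returns the units coded identically, the units coded differently
    (with both codes), and the units only one researcher identified.
    """
    shared = sorted(set(coder_a) & set(coder_b))
    agreed = [unit for unit in shared if coder_a[unit] == coder_b[unit]]
    differing = {
        unit: (coder_a[unit], coder_b[unit])
        for unit in shared
        if coder_a[unit] != coder_b[unit]
    }
    only_one = sorted(set(coder_a) ^ set(coder_b))
    return agreed, differing, only_one


# Hypothetical placeholder codings from two researchers.
researcher_1 = {"MU1": "code X", "MU2": "code Y", "MU3": "code Z"}
researcher_2 = {"MU1": "code X", "MU2": "code W", "MU4": "code V"}

agreed, differing, only_one = compare_codings(researcher_1, researcher_2)
print("Agreed:", agreed)
print("To discuss:", differing)
print("Identified by one researcher only:", only_one)
```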

Yet there are times when neither openness, pre-understanding, intuition, nor researcher triangulation does the job; for example, when one is analysing an interview and is simply confused about how to code certain meaning units. At such times, there are a variety of options. A good starting place is to re-read all the interviews through the lens of this specific issue and actively search for other similar types of meaning units you might have missed. Another way to handle this is to conduct further interviews with specific queries that hopefully shed light on the issue. A third option is to have a follow-up interview with the same person and ask them to explain.

Additional tips

It is important to remember that in a typical project there are several interviews to analyse. Codes found in a single interview serve as a starting point as you then work through the remaining interviews coding all material. Form your categories and themes when all project interviews have been coded.

When submitting an article with your study results, it is a good idea to create a table or figure providing a few key examples of how you progressed from the raw data of meaning units, to condensed meaning units, coding, categorisation, and, if included, themes. Providing such a table or figure supports the rigor of your study [1] and is an element greatly appreciated by reviewers and research consumers.

During the analysis process, it can be advantageous to write down your research aim and questions on a sheet of paper that you keep nearby as you work. Frequently referring to your aim can help you keep focused and on track during analysis. Many find it helpful to colour code their transcriptions and write notes in the margins.

Having access to qualitative analysis software can be greatly helpful in organising and retrieving analysed data. Just remember, a computer does not analyse the data. As Jennings [15] has stated, “… it is ‘peopleware,’ not software, that analyses.” A major drawback is that qualitative analysis software can be prohibitively expensive. One way forward is to use table templates such as we have used in this article. (Three analysis templates, Templates A, B, and C, are provided as supplementary online material ). Additionally, the “find” function in word processing programmes such as Microsoft Word (Redmond, WA USA) facilitates locating key words, e.g., in transcribed interviews, meaning units, and codes.
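For those who keep transcripts as plain text files, the same kind of key word search can be approximated with a few lines of code. The sketch below is an assumption on our part rather than one of the supplementary templates: the function name `find_keyword`, the `window` parameter and the two-sentence transcript are invented for illustration. It returns each occurrence of a key word together with some surrounding text.

```python
import re


def find_keyword(text, keyword, window=30):
    """Return each whole-word, case-insensitive occurrence of `keyword`
    with up to `window` characters of context on either side."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for match in pattern.finditer(text):
        start = max(match.start() - window, 0)
        end = min(match.end() + window, len(text))
        hits.append(text[start:end])
    return hits


# Invented example text, not the interview excerpt used in this article.
transcript = (
    "The ambulance came quickly. "
    "I remember the ambulance staff talking to me."
)

hits = find_keyword(transcript, "ambulance", window=10)
for hit in hits:
    print(hit)
```

As with any software aid, this only locates text; it does not analyse it.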

Lessons learnt/key points

From our experience with content analysis we have learnt a number of important lessons that may be useful for the novice researcher. They are:

  • A method description is a guideline supporting analysis and trustworthiness. Don’t get caught up too rigidly following steps. Reflexivity and flexibility are just as important. Remember that a method description is a tool helping you in the process of making sense of your data by reducing a large amount of text to distil key results.
  • It is important to maintain a vigilant awareness of one’s own pre-understandings in order to avoid bias during analysis and in results.
  • Use and trust your own intuition during the analysis process.
  • If possible, discuss and reflect together with other researchers who have analysed the same data. Be open and receptive to new perspectives.
  • Understand that it is going to take time. Even if you are quite experienced, each set of data is different and all require time to analyse. Don’t expect to have all the data analysis done over a weekend. It may take weeks. You need time to think, reflect and then review your analysis.
  • Keep reminding yourself how excited you have felt about this area of research and how interesting it is. Embrace it with enthusiasm!
  • Let it be chaotic – have faith that some sense will start to surface. Don’t be afraid and think you will never get to the end – you will… eventually!

Peer review under responsibility of African Federation for Emergency Medicine.

Appendix A Supplementary data associated with this article can be found, in the online version, at .


Content Analysis vs Thematic Analysis: What's the Difference?


Thematic analysis and qualitative content analysis are two popular approaches used to analyze qualitative data. Confusingly, the two research approaches are often defined in similar ways or even used interchangeably in the literature.

Joffe (2012) points out that thematic analysis originally emerged from content analysis, but it developed into a separate approach with its own unique research goals. This evolution over time contributes to the mix-up between the two methods. Sub-categories like conventional content analysis and relational content analysis add another wrinkle of complexity by introducing variations and nuances to content analysis as a whole.

In this article, we clarify the difference between thematic analysis and the common forms of qualitative content analysis—and offer researchers a rational way to match the purpose of their intended research with the appropriate method of data analysis.

Thematic vs Content Analysis: TL;DR Version

Thematic analysis is an intuitive approach to qualitative data analysis that allows researchers to explore patterns across their data. It involves identifying and understanding key themes in the data and how they relate to one another. “Themes” are overarching categories of common information related to a research phenomenon, which tell a story about its dimensions.

On the other hand, content analysis is a more practical approach that can be used as a quantitative or qualitative method of data analysis. It can be applied to both textual and visual data but is more often applied to the former. At its core, content analysis is a research technique used to determine the presence of certain words, themes, or concepts within data.


Understanding Content vs Thematic Analysis

What is thematic analysis?

Thematic analysis is a qualitative research method for analyzing data that entails searching across a data set to identify, analyze, and report repeated patterns (Braun and Clarke 2006). You can conduct thematic analysis alone or with others through collaborative thematic analysis.

As the name suggests, the themes derived from the data actively construct the patterns of meaning to answer a research question. In short, themes are ‘a patterned response or meaning’ derived from coded data that represent overarching ideas embedded within the larger data set. [1][2]

As a result, thematic analysis is an effective qualitative research method for describing data that also involves your own interpretation to select codes and construct themes.

What are the main goals of thematic analysis?

The three main goals of thematic analysis are:

  • To identify important themes from the data.
  • To understand how themes relate to one another and how they are manifested in the data.
  • To use themes to generate new insights about a particular phenomenon.

When to use thematic analysis

Thematic analysis is a useful way to understand experiences, thoughts, or behaviors across a data set. Additionally, due to the clear, easy-to-follow processes outlined by Braun and Clarke (2006, 2012, 2017), researchers have suggested that thematic analysis is an ideal analytic method for novice qualitative researchers (Nowell et al. 2017).

What is Content Analysis?

Content analysis is a research technique used to determine the presence of certain words, themes, or concepts within qualitative data, either inductively or deductively, to explain a phenomenon. In short, the purpose of content analysis is to describe the characteristics of the document's content by examining who says what, to whom, and with what effect [3].

For example, researchers could use content analysis to evaluate language used within poems to search for a collective understanding of a phenomenon within a specific community—such as malaria in rural Africa . Researchers can then make inferences about the messages within the texts, the writer(s), the audience, and even the culture and time surrounding the poems.[4]

What are the main goals of qualitative content analysis?

The three main goals of qualitative content analysis are:

  • To identify and understand themes, patterns, and relationships within the data.
  • To explore how the data can inform theoretical claims made in research studies.
  • To quantify qualitative data.

When to use qualitative content analysis?

You can use qualitative content analysis to quantify and analyze the presence, meanings, and relationships of certain words, themes, or concepts within textual data. You can also consider using qualitative content analysis when you want to apply a more interpretive level of analysis to your data than would be possible through quantitative content analysis.


The difference between thematic analysis and content analysis in qualitative research

Thematic analysis focuses on extracting high-level themes from within data, while content analysis (especially subcategorical methods like summative content analysis) focuses on the recurrence of concepts or keywords at a more surface level of analysis, i.e., their frequency.

In essence, the main difference between the two methods lies in the possibility of quantification of data in content analysis by measuring the frequency of different categories and themes. [4] While frequency is generally a core tenet of qualitative content analysis where statistical findings are tabulated or visualized in the final write-up, it is not a focus of thematic analysis. 

Instead, in contrast to tallying concepts or keywords to infer meaning as you would in content analysis, a theme is not necessarily reflective of the frequency of its appearance within the data in a thematic analysis (Braun and Clarke 2006; Nowell et al. 2017).

In summary, statistical data is core to most content analysis but is not typically cited in thematic analysis. And while the former tends to focus on more manifest data that is apparent through surface-level analysis, neither method is inherently more beneficial or astute than the other. 
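To make the frequency idea concrete, here is a minimal sketch of how coded segments might be tallied before the counts are tabulated in a write-up. It assumes the coding has already been done by a researcher; the function name `code_frequencies` and the placeholder segments and codes are hypothetical.

```python
from collections import Counter


def code_frequencies(coded_segments):
    """Count how often each code was applied across (segment, code) pairs."""
    return Counter(code for _segment, code in coded_segments)


# Hypothetical placeholder data: each pair is (text segment, assigned code).
coded_segments = [
    ("segment 1", "code X"),
    ("segment 2", "code Y"),
    ("segment 3", "code X"),
]

freq = code_frequencies(coded_segments)
for code, count in freq.most_common():
    print(code, count)
```

In a thematic analysis the same tally could still be produced, but a high count would not by itself make a code a theme.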

The main differences between thematic analysis and content analysis are:

  • Thematic analysis (TA) is a qualitative method used to uncover themes in textual data, while content analysis (CA) is either a quantitative or a qualitative approach that also involves some quantification of data.
  • CA generally counts the occurrence of concepts or keywords to infer meaning, while TA assigns meaning by extracting high-level ideas.
  • TA focuses on the overarching themes in the data and how those themes relate to one another, while in CA researchers count instances of coded concepts and keywords within large amounts of textual data with less focus on comparing or contrasting those codes.

Some differences in how thematic analysis and content analysis are used:

To elaborate further, these next differences exemplify how thematic analysis and content analysis are commonly used in practice, though it is important to note that there are exceptions to each.

Thematic analysis always involves an inductive portion of analysis. While there are forms of inductive content analysis, it is more common in content analysis to apply existing theories and frameworks through a deductive analytical technique.

As the name implies, content analysis was historically applied to “content”. This includes qualitative data such as newspapers, books, research journals, and letters. The data for thematic analysis is often directly collected by the researcher, such as through semi-structured interviews . That being said, you may still apply thematic analysis to newspaper articles, and content analysis to semi-structured interviews.

Content analysis is able to use “automated” forms of analysis, and the researcher may not need to read their entire dataset. For example, in summative content analysis, you only seek specific keywords and could use Delve’s search functionality to quickly find those keywords and code them. In thematic analysis, automated forms of analysis are still a valuable aid, but the researcher will almost always still need to read the entire data set.

It's important to note that both methods have their advantages and disadvantages depending on the research question being asked and the type of data being analyzed.

Thematic analysis vs content analysis: the similarities

Now that we have covered the differences between qualitative content analysis and thematic analysis, it is important to note that similarities also exist between the two methods.

For instance, both content analysis and thematic analysis share the same aim of analytically examining narrative materials from life stories by breaking the text into relatively small units of content and submitting them to descriptive treatment (Sparkes, 2005). Both are descriptive qualitative approaches to data analysis that achieve a similar goal, just in different ways.

Beyond that, these are some other overlapping characteristics:

  • They both involve examining qualitative data.
  • Both are used to generate new knowledge from the data.
  • Both are iterative processes that require intimate knowledge of the data you study.
  • Both approaches can be used to inform theoretical claims in research studies.

No matter which method you choose, it's important to understand how each qualitative research method works so you can confidently decide which one best suits your research needs. Now that you’ve read this article, you are equipped with the knowledge to do just that!


Michelle E. Kiger & Lara Varpio (2020): Thematic analysis of qualitative data: AMEE Guide No. 131, Medical Teacher.

Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3 (2), 77–101.

Vaismoradi M, Turunen H, Bondas T. Content analysis and thematic analysis: Implications for conducting a qualitative descriptive study. Nurs Health Sci. 2013 Sep;15(3):398-405.

Content Analysis. (n.d.).

Nowell, L. S., Norris, J. M., White, D. E., & Moules, N. J. (2017). Thematic Analysis: Striving to Meet the Trustworthiness Criteria. International Journal of Qualitative Methods, 16(1).

Bloor, M. and Wood, F. (2006) Keywords in Qualitative Methods. Sage Publications, Inc., London.

Joffe, H. (2011). Thematic analysis. In D. Harper & A. R. Thompson (Eds.), Qualitative methods in mental health and psychotherapy: A guide for students and practitioners (pp. 209–223). Chichester, UK: Wiley.

Sparkes A. Narrative analysis: exploring the whats and hows of personal stories. In: Holloway I (ed.). Qualitative Research in Health Care (1st edn). Berkshire: Open University Press, 2005; 191–208.

Cite this blog post:

Delve, Ho, L., & Limpaecher, A. (2023c, February 15). Content Analysis vs Thematic Analysis: What's the Difference?

The University of Chicago The Law School

College Essays and Diversity in the Post-Affirmative Action Era

Sonja Starr’s Latest Research Adds Data, Legal Analysis to Discussion about Race in College Admissions Essays


Editor’s Note: This story is part of an occasional series on research projects currently in the works at the Law School.

The Supreme Court’s decision in June 2023 to bar the use of affirmative action in college admissions raised many questions. One of the most significant is whether universities should consider applicants’ discussion of race in essays. The Court’s decision in Students for Fair Admissions (SFFA) v. Harvard did not require entirely race-blind admissions. Rather, the Court explicitly stated that admissions offices may weigh what students say about how race affected their lives. Yet the Court also warned that this practice may not be used to circumvent the bar on affirmative action.

Many university leaders made statements after SFFA suggesting that they take this passage seriously, and that it potentially points to a strategy for preserving diversity. But it’s not obvious how lower courts will distinguish between consideration of “race-related experience” and consideration of “race qua race.” Sonja Starr, Julius Kreeger Professor of Law & Criminology at the Law School, was intrigued by the implication of that question, calling the key passage of the Court’s opinion the “essay carveout.”

“Where is the line?” she wrote in a forthcoming article, the first of its kind to discuss this issue in depth in the post-SFFA era. “And what other potential legal pitfalls could universities encounter in evaluating essays about race?”

To inform her paper’s legal analysis, Starr conducted empirical analyses of how universities and students have included race in essays, both before and after the Court’s decision. She concluded that large numbers of applicants wrote about race, and that college essay prompts encouraged them to do so, even before SFFA.

Some thought the essay carveout made no sense. Justice Sonia Sotomayor called it “an attempt to put lipstick on a pig” in her dissent. Starr, however, disagrees. She argues that universities are on sound legal footing relying on the essay carveout, so long as they consider race-related experience in an individualized way. In her article, Starr points out reasons the essay carveout makes sense in the context of the Court’s other arguments. However, she points to the potential for future challenges—on both equal protection and First Amendment grounds—and discusses how colleges can survive them.

What the Empirical Research Showed

After SFFA, media outlets suggested that universities would add questions about race or identity in their admissions essays and that students would increasingly focus on that topic. Starr decided to investigate this speculation. She commissioned a professional survey group to recruit a nationally representative sample of recent college applicants. The firm queried 881 people about their essay content, about half of whom applied in 2022-23, before SFFA, and half of whom submitted in 2023-24.

The survey found that more than 60 percent of students in non-white groups wrote about race in at least some of their essays, as did about half of white applicants. But contrary to what the media suggested, there were no substantial changes between the pre- and post-SFFA application cycles.

Starr also reviewed essay prompts that 65 top schools have used over the last four years. She found that diversity and identity questions—as well as questions about overcoming adversity, which, for example, provide opportunities for students to discuss discrimination that they have faced—are common and have increased in frequency both before and after SFFA.

A Personally Inspired Interest

Although Starr has long written about equal protection issues, until about two years ago, she would have characterized educational admissions as a bit outside her wheelhouse. Her research has mostly focused on the criminal justice system, though race is often at the heart of it. In the past, for example, she has assessed the role of race in sentencing, the constitutionality of algorithmic risk assessment instruments in criminal justice, as well as policies to expand employment options for people with criminal records.

But a legal battle around admissions policies at Fairfax County’s Thomas Jefferson High School for Science and Technology—the high school that Starr attended—caught her attention. Starr followed the case closely and predicted that “litigation may soon be an ever-present threat for race-conscious policymaking” in a 2024 Stanford Law Review article on that and other magnet school cases.

“I got really interested in that case partly because of the personal connection,” she said. “But I ended up writing about it as an academic matter, and that got me entrenched in this world of educational admissions questions and their related implications for other areas of equal protection law.”

Implications in Education and Beyond

Starr’s forthcoming paper argues that the essay carveout provides a way for colleges to maintain diversity and stay on the right side of the Court’s decision.

“I believe there’s quite a bit of space that’s open for colleges to pursue in this area without crossing that line,” she said. “I lay out the arguments that colleges can put forth.”

Nevertheless, Starr expects future litigation targeting the essay carveout.

“I think we could see cases filed as soon as this year when the admissions numbers come out,” she said, pointing out that conservative legal organizations, such as the Pacific Legal Foundation, have warned that they’re going to be keeping a close eye on admissions numbers and looking for ways that schools are circumventing SFFA.

Starr envisions her paper being used as a resource for schools that want to obey the law while also maintaining diversity. “The preservation of diversity is not a red flag that something unconstitutional is happening,” she said. “There are lots of perfectly permissible ways that we can expect diversity to be maintained in this post-affirmative action era.”

Starr’s article, “Admissions Essays after SFFA,” is slated to be published in Indiana Law Journal in early 2025.

2024 Theses Doctoral

Statistically Efficient Methods for Computation-Aware Uncertainty Quantification and Rare-Event Optimization

He, Shengyi

The thesis covers two fundamental topics that are important across operations research, statistics, and beyond, namely stochastic optimization and uncertainty quantification, with the common theme of addressing both statistical accuracy and computational constraints. Here, statistical accuracy encompasses the precision of estimated solutions in stochastic optimization, as well as the tightness or reliability of confidence intervals. Computational concerns arise from rare events or expensive models, necessitating efficient sampling methods or computation procedures. In the first half of this thesis, we study stochastic optimization that involves rare events, which arises in various contexts including risk-averse decision-making and training of machine learning models. Because of the presence of rare events, crude Monte Carlo methods can be prohibitively inefficient, as it takes a sample size on the order of the reciprocal of the rare-event probability to obtain valid statistical information about the rare event. To address this issue, we investigate the use of importance sampling (IS) to reduce the required sample size. IS is commonly used to handle rare events, and the idea is to sample from an alternative distribution that hits the rare event more frequently and to adjust the estimator with a likelihood ratio to retain unbiasedness. While IS has long been studied, most of its literature focuses on estimation problems and methodologies to obtain good IS in these contexts. In contrast to these studies, the first half of this thesis provides a systematic study of the efficient use of IS in stochastic optimization. In Chapter 2, we propose an adaptive procedure that converts an efficient IS for gradient estimation to an efficient IS procedure for stochastic optimization. Then, in Chapter 3, we provide an efficient IS for gradient estimation, which serves as the input for the procedure in Chapter 2.
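
The importance-sampling idea described above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the target (the tail probability P(X > 4) for a standard normal X), the shifted proposal N(4, 1), the sample size, and all function names are assumptions chosen for illustration, and the sketch covers plain estimation rather than the stochastic-optimization setting the thesis studies.

```python
import math
import random


def crude_mc(threshold, n, rng):
    """Crude Monte Carlo: fraction of N(0, 1) draws that exceed the threshold."""
    hits = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > threshold)
    return hits / n


def importance_sampling(threshold, n, rng):
    """Sample from the proposal N(threshold, 1) and reweight each hit.

    The weight is the likelihood ratio phi(x) / phi(x - threshold)
    = exp(-threshold * x + threshold**2 / 2), which keeps the estimator unbiased.
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n


def exact_tail(threshold):
    """Exact standard normal tail probability, used only as a reference."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


rng = random.Random(0)  # fixed seed so the run is reproducible
threshold = 4.0         # illustrative rare-event threshold (assumption)
n = 20_000              # illustrative sample size (assumption)

is_estimate = importance_sampling(threshold, n, rng)
crude_estimate = crude_mc(threshold, n, rng)
exact = exact_tail(threshold)

print(f"exact tail probability: {exact:.3e}")
print(f"importance sampling:    {is_estimate:.3e}")
print(f"crude Monte Carlo:      {crude_estimate:.3e}")
```

Because the rare-event probability here is on the order of 1e-5, a crude run of this size is expected to see the event less than once, while the shifted proposal lands in the rare region about half the time.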
In the second half of this thesis, we study uncertainty quantification in the sense of constructing a confidence interval (CI) for target model quantities or predictions. We are interested in the setting of expensive black-box models, which means that we are confined to using a low number of model runs, and we also lack the ability to obtain auxiliary model information such as gradients. In this case, a classical method is batching, which divides data into a few batches and then constructs a CI based on the batched estimates. Another method is the recently proposed cheap bootstrap, which is constructed on a few resamples in a similar manner to batching. These methods can save computation since they do not need an accurate variability estimator, which requires sufficiently many model evaluations to obtain. Instead, they cancel out the variability when constructing pivotal statistics, and thus obtain asymptotically valid t-distribution-based CIs with only a few batches or resamples. The second half of this thesis studies several theoretical aspects of these computation-aware CI construction methods. In Chapter 4, we study the statistical optimality on CI tightness among various computation-aware CIs. Then, in Chapter 5, we study the higher-order coverage errors of batching methods. Finally, Chapter 6 is a related investigation on the higher-order coverage and correction of distributionally robust optimization (DRO) as another CI construction tool, which assumes an amount of analytical information on the model but bears similarity to Chapter 5 in terms of analysis techniques.
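
As a rough illustration of the batching construction described above, the following Python sketch builds a t-distribution-based CI from a handful of batch estimates. It is one common textbook form of batching, not the thesis's own procedure: the function name `batching_ci`, the choice of five batches, the exponential test data, and the small hard-coded table of Student-t critical values are all assumptions made for the example.

```python
import math
import random
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom.
# Hard-coded from a standard t-table to keep the sketch dependency-free.
T_CRIT_95 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 9: 2.262}


def batching_ci(data, estimator, num_batches=5):
    """Return a 95% CI built from `num_batches` batch estimates.

    The data are split into equal-size batches, the estimator is applied to
    each batch, and the CI is centred on the average of the batch estimates
    with a half-width of t * (sample std of batch estimates) / sqrt(num_batches).
    """
    dof = num_batches - 1
    if dof not in T_CRIT_95:
        raise ValueError("no tabulated t critical value for this batch count")
    batch_size = len(data) // num_batches
    if batch_size == 0:
        raise ValueError("not enough data for the requested number of batches")

    estimates = [
        estimator(data[i * batch_size:(i + 1) * batch_size])
        for i in range(num_batches)
    ]
    center = statistics.fmean(estimates)
    spread = statistics.stdev(estimates)  # variability across batch estimates
    half_width = T_CRIT_95[dof] * spread / math.sqrt(num_batches)
    return center - half_width, center + half_width


rng = random.Random(1)  # fixed seed so the run is reproducible
sample = [rng.expovariate(1.0) for _ in range(500)]  # illustrative data

lower, upper = batching_ci(sample, statistics.fmean, num_batches=5)
print(f"95% batching CI for the mean: ({lower:.3f}, {upper:.3f})")
```

Only five evaluations of the estimator are needed here, which is the appeal in the expensive-model setting: the variability is estimated from the spread of the batch estimates rather than from a separate, costly variance calculation.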

  • Operations research
  • Stochastic processes--Mathematical models
  • Mathematical optimization
  • Bootstrap (Statistics)
  • Sampling (Statistics)



  1. (PDF) Content Analysis

  2. (PDF) Content Analysis or Thematic Analysis: Doctoral Students

  3. Differences Between Content Analysis and Thematic Analysis

  4. 2 Critical Types of Content Analysis

  5. Content analysis sample thesis proposal

  6. 10 Content Analysis Examples (2024)


  1. Making Thesis Content

  2. How to do content analysis in Excel and the concept of content analysis (Amharic tutorial)

  3. Content Analysis Method || Content Analysis Method in hindi || Content Analysis Research Method

  4. What is Thematic Content Analysis

  5. 5 Creative Tips to Write your Thesis Faster & Professional

  6. Thesis and Semester project Presentation | thesis defense | On presenting a graduation paper | #InAmharic


  1. Content Analysis

    Content analysis is a research method used to identify patterns in recorded communication. To conduct content analysis, you systematically collect data from a set of texts, which can be written, oral, or visual: Books, newspapers and magazines. Speeches and interviews. Web content and social media posts. Photographs and films.

  2. Qualitative Content Analysis 101 (+ Examples)

    2. The two types of content analysis. Now that you understand the difference between implicit and explicit data, let's move on to the two general types of content analysis: conceptual and relational content analysis. Importantly, while conceptual and relational content analysis both follow similar steps initially, the aims and outcomes of each are different.

  3. Chapter 17. Content Analysis

    Chapter 17. Content Analysis Introduction. Content analysis is a term that is used to mean both a method of data collection and a method of data analysis. Archival and historical works can be the source of content analysis, but so too can the contemporary media coverage of a story, blogs, comment posts, films, cartoons, advertisements, brand packaging, and photographs posted on Instagram or ...

  4. (PDF) Content Analysis: A Flexible Methodology

    Abstract. Content analysis is a highly flexible research method that has been widely used in library and information science (LIS) studies with varying research goals and objectives. The ...

  5. (PDF) Content Analysis: a short overview

    Inductive content analysis listed all the tweets and each frequent word in two coding books (Appendix : Tables 1 and 2). Content analysis is a research methodology; numerous other analytic ...

  6. A hands-on guide to doing content analysis

    A common starting point for qualitative content analysis is often transcribed interview texts. The objective in qualitative content analysis is to systematically transform a large amount of text into a highly organised and concise summary of key results. Analysis of the raw data from verbatim transcribed interviews to form categories or themes ...

  7. Content Analysis

    Step 1: Select the content you will analyse. Based on your research question, choose the texts that you will analyse. You need to decide: The medium (e.g., newspapers, speeches, or websites) and genre (e.g., opinion pieces, political campaign speeches, or marketing copy)

  8. How to plan and perform a qualitative study using content analysis

    In a review of the literature, different opinions on the use of concepts, procedures and interpretation in content analysis are presented. However, there are similarities in the way the researchers explain the process: either they do it by using different distinguishing stages, (Burnard, 1991, Downe-Wambolt, 1992), or in running text (Berg, 2001, Catanzaro, 1988).

  9. How to do a content analysis [7 steps]

    A step-by-step guide to conducting a content analysis. Step 1: Develop your research questions. Step 2: Choose the content you'll analyze. Step 3: Identify your biases. Step 4: Define the units and categories of coding. Step 5: Develop a coding scheme. Step 6: Code the content. Step 7: Analyze the Results. In Closing.

  10. Guide: Using Content Analysis

    Eberhardt uses content analysis in this thesis paper to analyze three journal articles that reported on President Ronald Reagan's address in which he responded to the Tower Commission report concerning the Iran-Contra Affair. The reports concentrated on three rhetorical elements: idea generation or content; linguistic style or choice of language ...

  11. Content Analysis Method and Examples

    Content analysis is a readily-understood and an inexpensive research method. A more powerful tool when combined with other research methods such as interviews, observation, and use of archival records. It is very useful for analyzing historical material, especially for documenting trends over time. Disadvantages of Content Analysis

  12. PDF Katie Reichenbach Thesis

    Using Content Analysis to Examine the Relationship between Commercial and Nonprofit Organizations' Motives and Consumer Engagement on Facebook Katherine Reichenbach Dr. Shelly Rodgers, Thesis Chair Abstract This study is a content analysis of 20 Facebook pages from 10 nonprofit

  13. Content analysis and thematic analysis ...

    Content analysis and thematic analysis as qualitative descriptive approaches. According to Sandelowski and Barroso research findings can be placed on a continuum indicating the degree of transformation of data during the data analysis process from description to interpretation.The use of qualitative descriptive approaches such as descriptive phenomenology, content analysis, and thematic ...

  14. Content Analysis

    Content analysis is a research method used to analyze and interpret the characteristics of various forms of communication, such as text, images, or audio. It involves systematically analyzing the content of these materials, identifying patterns, themes, and other relevant features, and drawing inferences or conclusions based on the findings.

  15. Content Analysis, Quantitative

    Abstract. Quantitative content analysis is a research method in which features of textual, visual, or aural material are systematically categorized and recorded so that they can be analyzed. Widely employed in the field of communication, it also has utility in a range of other fields. Central to content analysis is the process of coding, which ...

  16. Qualitative Content Analysis: a Simple Guide with Examples

    Content analysis is concerned with themes and ideas, whereas narrative analysis is concerned with the stories people express about themselves or others. ... On the other hand, if I want to carry out document analysis on a master's thesis, I would only use documents, excluding the other mediums from the start. The methodology is the same, but ...

  17. A Quantitative Content Analysis of Mercer University Theses

    Quantitative content analysis of a body of research not only helps budding researchers understand the culture, language, and expectations of scholarship, it helps identify deficiencies and inform policy and practice. Because of these benefits, an analysis of a census of 980 Mercer University M.Ed., Ed.S., and doctoral theses was conducted.

  18. Media Audiences' Engagement with Social Issues: A content analysis of

    A content analysis of online daily newspaper articles and audience comments Jessica Marie Edgar B.A., Integrated Communications, Spring Hill College - Mobile, 2011 A Thesis Submitted to The Graduate School at the University of Missouri - St. Louis in partial fulfillment of the requirements for the degree Master of Arts in Communication

  19. Content Analysis in the Research Field of Corporate Communication

    In CSR research, content analysis is used to assess the performance (Gunawan and Abadi ) and the credibility of CSR reports (Lock and Seele ), for instance. Content analyses have gained popularity in corporate communication as well as CSR research since the availability of computer-aided text analysis (CATA) (Duriau et al. ; Short et al. ), a ...

  20. Warrior women : a qualitative content analysis of the perceptions of

    Through the use of postcolonial feminist theory and qualitative content analysis methodology of ten articles from the Winter/Spring 2003 Special Issue : Native Experiences in the Ivory Tower of the American Indian Quarterly , this study examined the

  21. A hands-on guide to doing content analysis

    Content analysis, as in all qualitative analysis, is a reflective process. There is no "step 1, 2, 3, done!" linear progression in the analysis. This means that identifying and condensing meaning units, coding, and categorising are not one-time events. It is a continuous process of coding and categorising then returning to the raw data to ...

  22. Content Analysis vs Thematic Analysis: What's the Difference?

    The difference between thematic analysis and content analysis in qualitative research. Thematic analysis focuses on extracting high-level themes from within data, while content analysis, especially subcategorical methods like summative content analysis, focuses on the recurrence of concepts or keywords at a more surface level of analysis, i.e. their frequency.

  23. What Is a Thesis?

    Revised on April 16, 2024. A thesis is a type of research paper based on your original research. It is usually submitted as the final step of a master's program or a capstone to a bachelor's degree. Writing a thesis can be a daunting experience. Other than a dissertation, it is one of the longest pieces of writing students typically complete.
