Doing a content analysis

As with any research project, one starts with a question or a hypothesis. If your question can be addressed by content analysis, define the population of interest, for example, music videos, Internet blogs by adolescents, political cartoons, children's television, etc.

Creating categories

Generally, format is easier to code than content. Categories for structural elements (such as amount of space, time, color, and location) are quickly constructed and readily understood. Some content also will be straightforward, such as number of people shown, subject matter, and target audience. Other content-related themes may be subjective and less easily agreed upon; for example, emotional tone, hostility, pleasure, joy, achievement. If coders can independently agree (without influencing each other) on classifying material into a category, then the category is legitimate to use, even if it is subjective.

Select the unit of analysis: e.g., for narrative (either written or spoken) will you use words, sentences, or paragraphs as the unit of analysis? For photos, the entire photo or individuals in the photo? Children's drawings - the single picture, or parts of it? Will a single 2-page advertisement be scored the same way as two 1-page ads? More than one system can be used -- for example, coding ads by size (space used) and also by their number (frequency of occurrence).
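
As a rough illustration of using two systems at once, the sketch below records each ad with the space it occupies and then reports both frequency and total space. The ad records and the `summarize` helper are hypothetical, made up for this example only.

```python
from fractions import Fraction

# Hypothetical sample: each ad is recorded with the page space it occupies.
ads = [
    {"id": "A1", "pages": Fraction(2)},     # one 2-page advertisement
    {"id": "A2", "pages": Fraction(1)},     # a 1-page ad
    {"id": "A3", "pages": Fraction(1)},     # another 1-page ad
    {"id": "A4", "pages": Fraction(1, 4)},  # a quarter-page ad
]

def summarize(ads):
    """Return (number of ads, total pages) so both systems are reported."""
    frequency = len(ads)                     # coding by number
    space = sum(ad["pages"] for ad in ads)   # coding by size
    return frequency, space

frequency, space = summarize(ads)
print(f"ads counted: {frequency}, pages used: {float(space)}")
```

Under the frequency system the 2-page ad counts once and the two 1-page ads count twice; under the space system they contribute the same amount. Reporting both keeps that difference visible.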

Classifying format

Decide which aspects of format you want to cover. Create an easy-to-use checklist, for example:

Instructions: For each ad, select the best category for each of the following dimensions.
Size
  __ full page
  __ more than half
  __ half page
  __ between 1/4 and 1/2 page
  __ quarter page or less

Text/graphic
  __ text only
  __ graphic only
  __ both

Graphic
  __ photograph
  __ cartoon
  __ painting/sketch
  __ other (describe) _____________

Color
  __ Black and white
  __ Grayscale
  __ 2-toned (e.g., sepia, not B/W)
  __ 3 or more colors

Classifying content

  1. The best way to select categories is to skim the material and make a list of the main themes. When you run out of new themes, you are ready for the next step.

  2. Define your categories in ways that will be understandable to others. Use operational definitions. Later, you will need to have an independent coder replicate your classification. Make the definitions as clear and specific as possible.

  3. Check your categories for overlap. Can some be combined? If a category seems too broad, break it down into smaller categories. The list of categories must be comprehensive; that is, it must cover all of the possibilities. You can use an "other" or "miscellaneous" category. If the "other" category becomes large, create additional categories into which the information can be placed.

  4. The categories must also be mutually exclusive (i.e., non-overlapping), so that you can code each item into one category or another, but not both. For example, you might be coding ads as "hot" or "cool", but find some that are both. In that case you would need a third category labeled "hot/cool" or "both hot and cool."

  5. You may need to code non-instances (absence, or non-occurrence). Example: If you are testing a hypothesis of bias in the media, you cannot simply count the stereotypes, but need to count their absence as well.

  6. Then you can construct categories. The following could be added to the above classification of advertisements.

Subject matter (select one)
  __ fashion (clothes, handbags, shoes, etc.)
  __ health
  __ food or beverage (except alcohol)
  __ alcoholic beverage
  __ cigarettes
  __ media (movies, videos, DVDs, books)
  __ automotive
  __ technology (TV, cell phones, computers)
  __ travel (inc. vacation, hotels, resorts)
  __ home appliances
  __ furniture
  __ jewelry (inc. watches)
  __ other (describe) _______

Gender depicted (select one)
  __ female
  __ male
  __ both
  __ unable to determine

Target audience (check all that apply)
  __ adolescent (17 or younger)
  __ young adult (18-24)
  __ adult (25-39)
  __ middle age (40-64)
  __ senior citizen (65+)
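
If the coded checklists are entered into a computer, the "select one" and "check all that apply" rules can be checked automatically. The following is a minimal sketch, not a prescribed procedure: the dictionary layout, the shortened category labels, and the `check_record` helper are all assumptions made for illustration, as is the rule that at least one target audience must be checked.

```python
# Single-choice dimensions: exactly one category per ad (mutually exclusive).
SINGLE_CHOICE = {
    "subject": {"fashion", "health", "food or beverage", "alcoholic beverage",
                "cigarettes", "media", "automotive", "technology", "travel",
                "home appliances", "furniture", "jewelry", "other"},
    "gender": {"female", "male", "both", "unable to determine"},
}

# Multi-choice dimension: one or more categories per ad (assumed rule).
MULTI_CHOICE = {
    "audience": {"adolescent", "young adult", "adult", "middle age",
                 "senior citizen"},
}

def check_record(record):
    """Return a list of problems found in one coded ad (empty if none)."""
    problems = []
    for dim, allowed in SINGLE_CHOICE.items():
        chosen = record.get(dim, [])
        if len(chosen) != 1:
            problems.append(f"{dim}: need exactly one category, got {len(chosen)}")
        problems += [f"{dim}: unknown category {c!r}" for c in chosen if c not in allowed]
    for dim, allowed in MULTI_CHOICE.items():
        chosen = record.get(dim, [])
        if not chosen:
            problems.append(f"{dim}: need at least one category")
        problems += [f"{dim}: unknown category {c!r}" for c in chosen if c not in allowed]
    return problems

# Hypothetical coded ads: the second breaks both "select one" rules.
good = {"subject": ["travel"], "gender": ["both"], "audience": ["adult", "middle age"]}
bad = {"subject": ["fashion", "jewelry"], "gender": [], "audience": ["adult"]}

print(check_record(good))
print(check_record(bad))
```

An ad that fits two subject categories, like the second record, signals that the categories overlap and may need to be redefined or combined.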

Latrinalia - coding verbal content and emotional tone.


Unless you are coding information that you have generated as part of your research (in which case you probably will want to analyze all of it) or are analyzing all of the available information, you will need to select a sample. Estimate how much time you have, and decide how much material you can cover. Then create a decision rule for selecting the final sample in an unbiased way (see sampling module).

For periodicals (things published on a regular basis) you will need to decide which issues to analyze. For example, you might choose the preceding year, selecting every third issue. If you are sampling electronic media such as television or the Internet, you need to define your population of interest (e.g., children's programs, or political blogs), and then create a decision rule to generate as representative a sample as possible.
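
One possible decision rule for the "every third issue" example is a systematic sample with a random starting point. The sketch below assumes twelve monthly issues with made-up labels; the `systematic_sample` helper and the random start are illustrative choices, not part of the original procedure.

```python
import random

def systematic_sample(items, step, seed=None):
    """Pick every `step`-th item, starting at a random position
    among the first `step` items."""
    rng = random.Random(seed)      # a fixed seed makes the draw repeatable
    start = rng.randrange(step)    # 0, 1, ..., step - 1
    return items[start::step]

# Hypothetical population: the 12 issues published in the preceding year.
issues = [f"issue-{n:02d}" for n in range(1, 13)]

sample = systematic_sample(issues, step=3, seed=1)
print(sample)
```

Choosing the starting issue at random, rather than by hand, keeps the coder's preferences from deciding which issues end up in the sample.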

Example - testing the Face-ism hypothesis

Coding information into categories

Keep the need for reliability in mind as you code the material. Another coder should be able to independently classify the material into the same categories. That may require training and clarification, so be sure to allow time for it.

Pre-established coding systems

Standardized coding systems are available for verbal material. One is the Gottschalk analysis of verbal behavior, which is designed to uncover emotional themes. Another is the CHILDES system for analyzing talk. Learning these techniques requires a considerable amount of time and training (see Resources list for references).

Computer software

The availability of computer software has led to an explosion of studies analyzing text. Several software packages are available for coding material that has been transcribed into electronic files. The Resources section lists programs used in the social sciences. Others can be found by doing an electronic database search using the search terms "content analysis" and "computer software" (see Using the electronic database).

Establishing reliability

Have at least one other rater independently code the data into categories (independently means that they do it without your being present). Check the level of agreement between the two sets of codes. For straightforward categories the level of agreement should be very high. If it is not, clarify the categories and redo the classification. For more subjective ratings, it may be necessary to discuss and resolve differences.
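
The level of agreement can be computed in more than one way. The sketch below shows simple percent agreement and, as an addition not mentioned above, Cohen's kappa, a commonly used statistic that adjusts for the agreement expected by chance. The two lists of codes are made-up data, and the function names are chosen for this example.

```python
from collections import Counter

def percent_agreement(a, b):
    """Share of items that both coders placed in the same category."""
    if len(a) != len(b) or not a:
        raise ValueError("both coders must rate the same non-empty set of items")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = percent_agreement(a, b)            # observed agreement
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of each coder's proportion per category.
    p_e = sum(count_a[c] * count_b[c]
              for c in count_a.keys() | count_b.keys()) / (n * n)
    if p_e == 1:                             # both coders used one category only
        return 1.0
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes assigned by two independent coders to the same 10 ads.
coder_1 = ["hot", "hot", "cool", "both", "cool", "hot", "cool", "cool", "hot", "both"]
coder_2 = ["hot", "cool", "cool", "both", "cool", "hot", "cool", "hot", "hot", "both"]

print(f"percent agreement: {percent_agreement(coder_1, coder_2):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(coder_1, coder_2):.4f}")
```

With these made-up codes the coders agree on 8 of 10 ads. Kappa comes out lower than the raw percentage because some of that agreement would be expected by chance alone.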
