3.5. Writing your theory section

Your theory section should provide the hypotheses that you will test in your empirical research. Hypotheses are preferable to ‘conceptual models’. A concept does not explain anything; it merely captures or describes a phenomenon. Whether that phenomenon or relation actually occurs is an empirical issue that can be settled through research. A definition is an arbitrary agreement about the meaning of a certain word. If it is useful for your readers (see 2.2.2 above), discuss definitions in your introduction. Avoid talking about definitions in your theory section. Instead, formulate hypotheses.

A hypothesis is a predictive statement about facts before you know them. A hypothesis is not a question. In a causal model, a hypothesis is a statement about the relationship between two variables. A hypothesis is a statement you can test; a definition or a concept cannot be tested, because it cannot be true or false. Avoid HIDING: including a Hypothesis In a Definition.

The causal model graphically displays the hypotheses of your thesis. In your theory section, work through your causal model from left to right, or from right to left. Each arrow in your model represents a hypothesis about the relationship between two variables.

For each hypothesis you write a paragraph of text on which the hypothesis is based. The paragraph typically consists of the following three elements:

  1. The paragraph starts with an explanation of the argument from a theory about the influence of X on Y. If there are multiple theories about the influence, identify them and contrast the predictions from each of these theories.
  2. The paragraph continues with a summary of the findings of previous research on the influence of X on Y. Give an overall summary: “Seven studies have tested this hypothesis, five finding a weakly positive relationship, and two finding no association.” Next, you can go into some detail about these results, with the depth of your discussion depending on the number of words you have available.
  3. The paragraph ends with the literal formulation of the hypothesis. This text is often printed in italics, as in the following example:

Given the above, I expect:

H1. The frequency of volunteering increases with the level of religious orthodoxy.

 

3.5.1. Constructing hypotheses

  1. “Can I just invent a hypothesis myself?”

Yes, you must! That is not to say you can just state a hypothesis and posit it with no argument at all. For each hypothesis, develop an argument about the sign of the relationship. Start with the argument, and end your paragraph with the literal statement of the hypothesis, as a conclusion.

The argument is more than a phrase like ‘Previous research has shown that X is positively related to Y’. Obviously the results of previous research include important information, but they are not an argument for a hypothesis; they may support a specific argument about why X influences Y.

Also, the basis for a hypothesis is more than the statement that you expect it to be true ‘because it is logical’ or that it is ‘common sense’. Nor is it enough to summarize previous research supporting the hypothesis. A hypothesis may not have been tested at all in previous research. It is not necessary for a hypothesis to be supported by previous research. What matters is whether the argument is sound, and preferably based on a theory.

A proper argument takes the form of a syllogism. The syllogism is a logical form consisting of a General Law, an assertion about specific Conditions, and a Hypothesis. The Hypothesis is the conclusion drawn from the combination of the General Law and the assertion about specific Conditions.

An example of a general law is: the more strongly a person is attached to a group, the more likely it is that this person follows the norms in this group. An example of a condition is: church attendance indicates attachment to a religious group. The resulting hypothesis is: the higher the frequency of church attendance of a person, the higher the likelihood that this person follows the norms within the church. You can test this hypothesis for different forms of behavior that religious groups have norms about, such as monetary contributions to the group, or voting behavior. Additional conditions specify these norms: religious groups prescribe that their members should contribute time and money to the benefit of the group.

Sometimes there are valid arguments for a positive as well as a negative relationship between two variables. In that case, try to reason which of the two effects is stronger. The empirical test will tell you which effect dominates. If there is no way to tell in advance which one is stronger, consider phrasing two alternative hypotheses.

In your causal model some arrows are missing, even though the empirical association between the variables is positive. These are spurious relationships. When you work through your model, also explain why some arrows are missing, and why the relationship is spurious. The argument typically takes the form of an omitted variable. An omitted variable is a variable that has an effect on the outcome, but is not included in the model. It is sometimes represented by the letter Z. In the causal diagram in Figure 14 I use the letter U to remind you that it is an unmeasured factor.

Figure 14. Causal diagram example: omitted variable bias

In the most extreme case of omitted variable bias, the relationship between X and Y is due to the fact that X and Y are both the result of U, while there is no causal effect of X on Y. This happens in the diagram above: the relationship between X and Y is shown as a dotted line. In this case, U is called a confounder. In a weaker case, an omitted variable or a set of omitted variables is responsible for some of the relationship between X and Y, but not all of it. In this case, including the previously omitted variables in the analysis turns them into measured confounding variables or control variables. Adding them may reduce the relationship between X and Y without completely eliminating it. The result of adding omitted variables may even be that the relationship is not affected at all. In that case, the omitted variables were not actually confounding the relationship between X and Y.
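The extreme case and the weaker case can be illustrated with a small simulation. The following Python sketch is only an illustration under assumptions that are not part of the text above: U, X and Y are standard normal variables with unit effects, the sample size and seed are arbitrary, and the names `simulate` and `true_effect` are hypothetical. It compares the slope of Y on X when U is omitted with the slope when U is controlled for.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """OLS slope of y on x in a simple regression with an intercept."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def residuals(x, y):
    """Part of y that is left over after regressing y on x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def simulate(true_effect=0.0, n=20000, seed=1):
    """Return (naive slope, slope controlling for U) of Y on X.

    Assumed data-generating process (illustrative only):
      X = U + noise
      Y = true_effect * X + U + noise
    """
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(n)]
    x = [ui + rng.gauss(0, 1) for ui in u]
    y = [true_effect * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

    naive = slope(x, y)  # U omitted from the analysis
    # Controlling for U: remove U from both X and Y, then regress the
    # residuals on each other (equivalent to regressing Y on X and U).
    adjusted = slope(residuals(u, x), residuals(u, y))
    return naive, adjusted


for effect in (0.0, 0.3):
    naive, adjusted = simulate(true_effect=effect)
    print(f"true effect {effect:.1f}: U omitted {naive:.2f}, U included {adjusted:.2f}")
```

Under these assumptions, a true effect of 0.0 corresponds to the extreme case: the slope with U omitted should be clearly positive (around 0.5), and it should drop to about zero once U is included. A true effect of 0.3 corresponds to the weaker case: including U reduces the slope, but does not eliminate it.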

 

3.5.2. Formulating hypotheses

Examples of hypotheses displayed graphically in Figure 8 are:

  1. Protestants have a higher frequency of church attendance than Catholics. [Arrow A in Figure 8: a dichotomous independent variable and an ordinal dependent variable]
  2. The higher the frequency of church attendance, the higher the likelihood of volunteering. [Arrow B in Figure 8: an ordinal independent variable and a dichotomous dependent variable]
  3. Altruistic values increase with the frequency of church attendance. [Arrow C in Figure 8; two ordinal variables]

 

3.5.3. Pitfalls in formulations of hypotheses

Common pitfalls in the formulation of hypotheses are the following:

  1. The use of the words ‘important’ and ‘role’. If you say that gender plays an important role in charitable giving, it is unclear what your expectation is. When you find yourself using these words, reformulate your hypothesis such that it is clear to the reader how you are going to test it and what you expect the result of the test to be. For example, your hypothesis could be that women are more likely to give than men, particularly to health and basic needs.
  2. The failure to specify a direction: “Gender is related to giving”. A better formulation would be: “Women give more often than men but lower amounts”.
  3. The use of ‘data language’, such as variable labels: “PROT is positively associated with ATTEND” (representing Arrow A in Figure 8). A better formulation would be: “Protestants attend church more often than members of other religious groups”. By the way: the use of data language in the description of your results is also a bad idea.

Suppose you would like to support your hypothesis with the results of a lot of previous studies. One way to do this is to write: “As many previous studies have shown, …. ”. A better way to do this is to list a number of the previous studies after the first part of your sentence: “As many previous studies have shown (e.g., author 1 (year), author 2 (year)), …. ”.

 

Most certainly, yes. A hypothesis follows from a theory. It may have received support in previous research, but it is also possible that no one has tested it before. It is also possible that you’ve been searching for previous research using the wrong keywords, that is: using lay terms that are not used in academic research. If you’re sure that no one has tested your hypothesis before, you have an argument in favor of the scientific relevance of your study.