Finding and Using Health Statistics

2. Common Terms and Equations

Dependent and Independent Variables

In health research there are generally two types of variables. Independent variables are what we expect will influence dependent variables. A dependent variable is what happens as a result of the independent variable. Generally, the dependent variable is the disease or outcome of interest for the study, and the independent variables are the factors that may influence the outcome. For example, if we want to explore whether high concentrations of vehicle exhaust affect the incidence of asthma in children, the concentration of vehicle exhaust is the independent variable and asthma incidence is the dependent variable.

[Figure: graph showing the dependent variable on the Y-axis and the independent variable on the X-axis]

Confounding Variable

A confounding variable, or confounder, is a separate factor that is related to both the dependent and the independent variables. A confounder distorts the apparent relationship between the independent and dependent variables because the effects of the independent variable are mixed in with the effects of the confounder. A confounder may strengthen, weaken, or even eliminate the observed relationship between the independent and dependent variables. To avoid this issue, potential confounders can be controlled for either in the study design or in the analysis.¹

In the example of car exhaust and asthma, a confounding variable would be exposure to other factors that increase respiratory issues, such as cigarette smoke or pollution from nearby factories. Because it would be unethical to expose a randomized group of people to high levels of vehicle exhaust,² a study comparing two populations with different exposures to vehicle exhaust would rely on a natural experiment, that is, a situation in which the difference in exposure already exists for reasons outside the researchers’ control. In this natural experiment, a community exposed to higher concentrations of car exhaust may also live near polluting factories or have higher rates of smoking.

Another example is the relationship between birth order and Down syndrome, where the prevalence of Down syndrome increases with increasing birth order. In this case, birth order is mixed up with maternal age at the time of birth, because later-born children tend to have older mothers. Thus, the relationship between birth order and Down syndrome is not causal; it merely reflects the progressive relation between maternal age and the occurrence of Down syndrome.³
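The birth order example can be made concrete with a small calculation. The sketch below uses entirely hypothetical counts (they are not taken from any study) in which the risk of Down syndrome depends only on maternal age, yet later-born children are more often born to older mothers. The names `cells`, `risk`, and `risk_ratio` are illustrative choices, not part of any standard library.

```python
# Hypothetical counts for illustration only (not real data):
# (maternal age group, birth order, births, Down syndrome cases)
cells = [
    ("younger", "first", 8000, 8),
    ("younger", "later", 2000, 2),
    ("older", "first", 2000, 20),
    ("older", "later", 8000, 80),
]

def risk(birth_order, age_group=None):
    """Cases per birth for one birth order, optionally within one age group."""
    selected = [c for c in cells
                if c[1] == birth_order and (age_group is None or c[0] == age_group)]
    births = sum(c[2] for c in selected)
    cases = sum(c[3] for c in selected)
    return cases / births

def risk_ratio(age_group=None):
    """Risk in later-born children divided by risk in first-born children."""
    return risk("later", age_group) / risk("first", age_group)

# Crude comparison ignores maternal age; stratified comparison controls for it.
crude_rr = risk_ratio()
stratified_rr = {age: risk_ratio(age) for age in ("younger", "older")}

print(f"Crude risk ratio: {crude_rr:.2f}")
for age, rr in stratified_rr.items():
    print(f"Risk ratio among {age} mothers: {rr:.2f}")
```

With these made-up counts, the crude risk ratio is about 2.9, which suggests that later birth order nearly triples the risk. Within each maternal age group the risk ratio is 1.0, so the apparent effect of birth order disappears once maternal age is controlled for in the analysis.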

Bias

When conducting a study or analyzing data statistically, researchers try to remove or account for as many confounding variables as possible in their study design or analysis. Uncontrolled confounding variables lead to bias, producing estimates that differ from the true population value. Bias is a systematic error in study design, subject recruitment, data collection, or analysis that results in a mistaken estimate of the true population value.⁴

Although there are many types of bias, two common types are selection bias and information bias. Selection bias occurs when the procedures used to select subjects and other factors that influence participation in the study produce a result that is different from what would have been obtained if all members of the target population were included in the study.⁴ For example, a website that rates the quality of primary care physicians based on patients’ input may produce ratings that suffer from selection bias. This is because individuals who had a particularly bad (or good) experience with the physician may be more likely to go to the website and provide a rating than patients who had an average visit.
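The physician rating example can be sketched numerically. The figures below are hypothetical and chosen only for illustration: `all_patients` is an assumed distribution of how every patient would rate a physician, and `response_rate` is an assumed probability of posting a rating, set higher for very bad or very good visits.

```python
# Hypothetical numbers for illustration only (not real data).
# How 1,000 patients would rate a physician if every one of them responded:
all_patients = {1: 50, 2: 100, 3: 400, 4: 350, 5: 100}
# Assumed probability that a patient with each experience posts a rating;
# very bad or very good visits are assumed to prompt more ratings.
response_rate = {1: 0.60, 2: 0.30, 3: 0.05, 4: 0.10, 5: 0.30}

def mean_rating(counts):
    """Average star rating for a {stars: number of patients} mapping."""
    return sum(stars * n for stars, n in counts.items()) / sum(counts.values())

def extreme_share(counts):
    """Fraction of ratings that are 1 or 5 stars."""
    return (counts[1] + counts[5]) / sum(counts.values())

# Expected number of ratings actually posted to the website.
posted = {stars: n * response_rate[stars] for stars, n in all_patients.items()}

print(f"Mean rating, all patients: {mean_rating(all_patients):.2f}")
print(f"Mean rating, posted only:  {mean_rating(posted):.2f}")
print(f"1- or 5-star share, all patients: {extreme_share(all_patients):.0%}")
print(f"1- or 5-star share, posted only:  {extreme_share(posted):.0%}")
```

Under these assumptions the mean rating falls from 3.35 among all patients to about 3.03 among posted ratings, and the share of 1- or 5-star ratings rises from 15% to about 41%. The size and direction of the distortion depend entirely on the assumed response rates. The point is only that ratings from self-selected respondents need not match the target population.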

Information bias refers to a “systematic error due to inaccurate measurement or classification of disease, exposure, or other variables.”⁵ Recall bias, a type of information bias, occurs when study participants do not remember the information they report accurately or completely. For example, when asked about behaviors during pregnancy (e.g., food intake, medication, or illness), mothers of children who were born with an illness are likely to provide more detailed information than mothers of healthy children.

Confounding and bias relate to a larger discussion of the relationship between correlation and causation. Although two variables may be correlated, correlation alone does not imply that there is a causal relationship between them.
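The recall bias example above also shows how an association can appear without any causal relationship. The sketch below is a hypothetical case-control study, not real data. It assumes that medication use during pregnancy is equally common among mothers of affected children ("cases") and mothers of healthy children ("controls"), that mothers of affected children report 90% of their true exposures, and that mothers of healthy children report only 60%. It further assumes that no mother reports an exposure that did not happen.

```python
# Hypothetical case-control study for illustration only (not real data).
# Medication use is truly unrelated to the outcome: 200 of 1,000 mothers
# were exposed in each group.
truly_exposed = {"cases": 200, "controls": 200}
group_size = {"cases": 1000, "controls": 1000}
# Assumed fraction of true exposures that mothers remember and report.
recall = {"cases": 0.90, "controls": 0.60}

def odds_ratio(exposed):
    """Odds of exposure among cases divided by odds among controls."""
    odds = {g: exposed[g] / (group_size[g] - exposed[g]) for g in group_size}
    return odds["cases"] / odds["controls"]

# Number of exposures each group reports, given its assumed recall.
reported = {g: truly_exposed[g] * recall[g] for g in group_size}

true_or = odds_ratio(truly_exposed)
observed_or = odds_ratio(reported)
print(f"Odds ratio from true exposure:     {true_or:.2f}")
print(f"Odds ratio from reported exposure: {observed_or:.2f}")
```

With these assumed recall rates, the odds ratio based on true exposure is 1.00 (no association), while the odds ratio based on reported exposure is about 1.61. The difference comes only from how completely each group remembers its exposure.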

1. Pourhoseingholi, Mohamad Amin et al. “How to control confounding effects by statistical analysis.” Gastroenterology and Hepatology from Bed to Bench vol. 5,2 (2012): 79-83.

2. Trochim, W.M.K. “Establishing Cause and Effect.” Research Methods Knowledge Base, 10/20/2006. Web 1/24/2017.

3. LaMorte, Wayne. “Down Syndrome and Maternal Age.” PH717 Module 11 - Confounding and Effect Measure Modification, 11 Nov. 2021.

4. “Bias, Confounding and Effect Modification.” Stat 507, Epidemiological Research Methods, Penn State Eberly College of Science, 2017. Web 1/24/2017.

5. Aschengrau A. and G.R. Seage. (2014) Epidemiology in public health. 3rd ed. Burlington, MA: Jones & Bartlett Learning.