Major Confounding Factors
Maturation – Mainly concerns longitudinal studies (and children). As subjects grow older between the pre- and posttreatment tests, the aging itself may affect the results. Children, for instance, might become more sophisticated, more experienced, bigger, stronger, and so on, as they age. Natural maturation also happens in other subjects: adults in a new environment make predictable changes or adjustments over time, and diseases usually have predictable courses. As a result, observed changes over time may be due to maturation rather than the independent variable.
History – During the course of a study, events unrelated to the study can occur and affect the outcome. History is mainly a threat to internal validity when there is a long time between the pre- and posttest measurements.
Testing – Repeated testing of participants can threaten internal validity, because the participants might become more skilled through repeated practice on the measurement instrument.
Instrumentation – Findings can be due to changes in the measuring instrument over time rather than to the independent variable (IV).
Regression to the Mean – When subjects are selected because their scores on a measure are extremely high or low, they are usually not that extreme on a second testing. That is, their scores will regress toward the mean. The amount of regression depends on how much of the test performance is due to variable factors, e.g., the amount of study. The more the performance depends on variable factors, the more regression there will be.
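To make regression to the mean concrete, here is a minimal simulation sketch. It assumes each observed score is a stable "true" score plus a random, variable component (normally distributed); the function name, the default parameters, and the choice to select the top 10% are all hypothetical choices for illustration.

```python
import random
import statistics


def simulate_regression_to_mean(n=10_000, noise_sd=10.0, top_fraction=0.1, seed=42):
    """Return (mean on test 1, mean on test 2) for the top scorers on test 1.

    Assumption: observed score = stable true score + variable (random) part.
    """
    rng = random.Random(seed)
    # Stable component of each participant's performance
    true_scores = [rng.gauss(100, 10) for _ in range(n)]
    # Two testings: same stable part, fresh variable part each time
    test1 = [t + rng.gauss(0, noise_sd) for t in true_scores]
    test2 = [t + rng.gauss(0, noise_sd) for t in true_scores]
    # Select the participants with the most extreme (highest) test-1 scores
    ranked = sorted(range(n), key=lambda i: test1[i])
    selected = ranked[int(n * (1 - top_fraction)):]
    mean1 = statistics.fmean([test1[i] for i in selected])
    mean2 = statistics.fmean([test2[i] for i in selected])
    return mean1, mean2


if __name__ == "__main__":
    first, second = simulate_regression_to_mean()
    print(f"Selected group, test 1 mean: {first:.1f}")
    print(f"Selected group, test 2 mean: {second:.1f}")
```

In this sketch the selected group scores lower on the second testing even though nothing was done to them, and raising `noise_sd` (more variable factors) makes the drop larger.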
Selection – This confounding factor appears when, for instance, the groups being compared are not equivalent before the manipulation begins.
Attrition – Attrition becomes a confound when participants drop out of the study due to some biasing factor. For instance, if participants drop out from one group but not from another (or not as much), the sample can lose important characteristics. It is important not to create situations or use procedures that bias some participants against completing the study, thereby changing the outcome.
Diffusion of Treatment – If participants in different experimental conditions are able to talk with each other, some can reveal the procedures to others. Participants in the treatment group might talk to control participants, who might not be aware that they are in a control group. These types of information exchanges are called diffusion of treatment and can affect the data such that the differences between groups disappear.
Sequence Effects – Experiences with one condition might affect responses to later conditions. If the condition order is always ABC, systematic confounding can occur. For instance, performance in B and C might reflect both the effect of the condition and the effect of already having been exposed to A. To control for sequence effects, one uses more than one order.
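One way to "use more than one order" is complete counterbalancing, where every possible order of the conditions is used equally often. The sketch below assumes that approach (other schemes, such as Latin squares, exist); the function names are hypothetical.

```python
from itertools import permutations


def counterbalanced_orders(conditions):
    """Return every possible presentation order of the conditions."""
    return [list(order) for order in permutations(conditions)]


def assign_orders(participants, conditions):
    """Cycle through all orders so each order is used about equally often."""
    orders = counterbalanced_orders(conditions)
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}


if __name__ == "__main__":
    # Hypothetical participant IDs P1..P6 and conditions A, B, C
    plan = assign_orders([f"P{i}" for i in range(1, 7)], ["A", "B", "C"])
    for participant, order in plan.items():
        print(participant, "".join(order))
```

With three conditions there are six orders, so the number of participants should ideally be a multiple of six for the orders to be balanced.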
Subject and Experimenter Effects
Expectations and biases of both the experimenter and the subjects can systematically affect the results of a study in subtle ways, thus reducing validity of the study.
Subject Effects – Participants in an experiment are not completely naïve. That is, they will have understandings, ideas, and maybe misunderstandings about what to expect in the study. Different people have different reasons for participating. These reasons can be money, course credit, etc. Others might participate because they hope to learn something. Participants volunteer and carry out their role based on different motivations, understandings, expectations, and biases, all of which can affect the outcome of a study. An experimental setting is not natural. When being observed, people might behave differently than if they were not observed. This can lead to subject effects. Subject effects refer to any changes in behavior that are due to being part of an experiment rather than to the experimental variables. Demand characteristics are cues that tell participants how they are expected to behave (according to the hypotheses, etc.). Demand characteristics usually occur unintentionally. The placebo effect, a related phenomenon, occurs when participants are expecting a specific effect.
Experimenter Effects – Concern any biasing effects that are due to the actions of the researcher. Experimenter expectancies are the experimenter’s expectations about the outcome of the study. These expectations might cause researchers to bias results in many ways: the experimenter can influence the participants’ behavior in favor of the hypotheses, cherry-pick data and statistical methods, and interpret results in a biased manner.
Examples of ways the experimenter can influence the participant: presenting cues in the form of intonation, facial expressions, or changes in posture; verbally reinforcing some responses and not others; or incorrectly recording participants’ responses.
A pretest-posttest design with a control group controls for history and maturation.
- Systematic Between-Groups Variance
  - Difference between groups could be due to
    - (1) Effect of the independent variable (experimental variance, which is what we want!)
    - (2) Effects of confounding variables (extraneous variance)
    - (3) A combination of (1) and (2)
Natural variability due to sampling error will also increase the between-groups variability somewhat.
- Nonsystematic Within-Groups Variance
  - Error Variance – nonsystematic within-groups variability.
Due to random factors affecting some participants more than others within a group, rather than systematically affecting all members of a group. Error variance can be increased by factors that are not stable, such as a participant feeling ill or uncomfortable participating. Experimenter and equipment variations can also cause measurement errors for some participants.
“In experimentation, each study is designed so as to maximize experimental variance, control extraneous variance, and minimize error variance.”
Maximizing experimental variance. Experimental variance is due to the effect of the independent variables (IV) on the dependent variables (DV). At least two levels of the IV should be present in an experiment, and the experimental conditions need to be distinct! It can be useful to include a manipulation check to see that the manipulation had the planned effect on the participants. One way to check is to use ratings.
To control extraneous variables efficiently and minimize their possible differential effects on the groups, we must be sure that (1) the two groups (experimental and control) are as similar as possible, and (2) the groups are treated in exactly the same way EXCEPT for the IV manipulation.
Ways to control extraneous variance:
- Random assignment to groups decreases the probability that the groups will differ – best method
- Homogeneous sample
- Confounding variables can be built into the experiment as an additional IV
- Matching or within-subjects design
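Random assignment, the first method in the list above, can be sketched in a few lines. This is only an illustration: the function name, the group labels, and the choice of equal group sizes are assumptions, not part of any particular study.

```python
import random


def randomly_assign(participants, groups=("experimental", "control"), seed=None):
    """Shuffle the participants, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for i, participant in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(participant)
    return assignment


if __name__ == "__main__":
    # Hypothetical participant IDs 1..20
    result = randomly_assign(range(1, 21), seed=2024)
    for group, members in result.items():
        print(group, sorted(members))
```

Because chance alone decides who ends up where, no participant characteristic can systematically favor one group, although small chance differences between groups remain possible.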
Minimizing Error Variance.
Large error variance can hide differences between conditions that are due to the experimental manipulations. Measurement error is one source of error variance. If participants do not respond consistently from trial to trial, the instrument is unreliable. To minimize these sources of error variance, use carefully controlled conditions of measurement and reliable instruments. Another source of error variance is individual differences. This type of variance is minimized by within-subjects designs.
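The point about within-subjects designs can be illustrated with a small simulation. It is a sketch under stated assumptions: each participant has a stable individual baseline, the treatment adds a constant effect, and measurement error is normally distributed. All names and numbers are made up for illustration.

```python
import random
import statistics


def simulate_within_subjects(n=200, effect=5.0, individual_sd=15.0,
                             noise_sd=3.0, seed=1):
    """Return (variance of raw control scores, variance of difference scores)."""
    rng = random.Random(seed)
    # Stable individual differences between participants
    baseline = [rng.gauss(100, individual_sd) for _ in range(n)]
    # Each participant is measured in both conditions
    control = [b + rng.gauss(0, noise_sd) for b in baseline]
    treatment = [b + effect + rng.gauss(0, noise_sd) for b in baseline]
    # Within each participant the baseline cancels out of the difference
    differences = [t - c for t, c in zip(treatment, control)]
    return statistics.variance(control), statistics.variance(differences)


if __name__ == "__main__":
    raw_var, diff_var = simulate_within_subjects()
    print(f"Variance of raw scores:        {raw_var:.1f}")
    print(f"Variance of difference scores: {diff_var:.1f}")
```

In this sketch the difference scores vary far less than the raw scores, because each participant acts as their own control and the individual differences drop out.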
Experimental designs – Randomize when possible!
The four basic designs to test a single IV using independent groups:
- Randomized, posttest-only, control-group design
Here we have two groups: Group A and Group B. One group receives the treatment and the other serves as the control, and the groups are compared at the posttest only. This tests the hypothesis that the IV affects the dependent measures.
Random selection will protect external validity. Furthermore, threats from attrition and regression to the mean are reduced by random assignment of participants (i.e., both groups will have [roughly] the same number of extreme scorers). Threats to internal validity from instrumentation, history, and maturation are minimized by the inclusion of a control group.
- Randomized, pretest-posttest, control-group design
An improvement on the randomized, posttest-only, control-group design (the one above): a pretreatment test (pretest) is added.
- Multilevel, completely randomized, between-subjects design
- Solomon’s four-group design. Pretests can affect participants’ responses to the treatment or to the posttest. A pretest can also interact with the experimental manipulation, which produces confounding interaction effects. This design is meant to detect such pretest effects.
The t-test evaluates the size of the difference between the means of two groups. The difference between the two means is divided by an error term. The error term is a function of the variance of the scores within each group and the sample sizes. It is easily applied, common, and useful for testing differences between two groups.
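As a worked illustration of "difference between means divided by an error term", here is a minimal sketch of the independent-groups t statistic. It assumes the classic pooled-variance (Student's) version, which in turn assumes roughly equal variances in the two groups; the function name and the example scores are made up.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.fmean(group_a) - statistics.fmean(group_b)
    df = n_a + n_b - 2
    # Pooled variance: within-group variances weighted by their df
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    # Error term: depends on within-group variance and the sample sizes
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / standard_error, df


if __name__ == "__main__":
    # Hypothetical posttest scores
    experimental = [5, 6, 7, 8, 9]
    control = [1, 2, 3, 4, 5]
    t, df = pooled_t_statistic(experimental, control)
    print(f"t({df}) = {t:.2f}")
```

The resulting t value is then compared against the t distribution with the given degrees of freedom to judge statistical significance.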
Analysis of Variance (ANOVA)
For multilevel designs with more than two groups. One-way ANOVA – only one independent variable. ANOVA uses both the within-groups variance and the between-groups variance. Within-groups variance is a measure of nonsystematic variation within a group: error or chance variation among individual participants within a group, due to factors such as individual differences and measurement errors. Between-groups variance represents how variable the group means are. It is a measure of both systematic factors that affect the groups differently and variation due to sampling error. The systematic factors include experimental variance and extraneous variance. If the group means are approximately the same, the between-groups variance is small; if the differences between the group means are large, the between-groups variance is large.
The F-test is used to test for statistical significance in an ANOVA. The F-test involves the ratio of the between-groups mean square to the within-groups mean square.
F = mean square between groups / mean square within groups
The ratio can be increased either by increasing the between-groups mean square or by decreasing the within-groups mean square. The between-groups mean square increases by maximizing the differences between groups. The within-groups mean square is minimized by controlling as many potential sources of random error as possible. Maximization of experimental variance and minimization of error variance is what we want!
We cannot reject the hypothesis that there are no systematic differences between groups UNLESS the F-ratio is larger than we would expect by chance alone.
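To show where the two mean squares in the F-ratio come from, here is a from-scratch sketch for a one-way, between-subjects design. The function name and the example data are hypothetical, and in practice a statistics library (for instance `scipy.stats.f_oneway`) would normally be used and would also give the p-value.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df between, df within) for a list of groups of scores."""
    k = len(groups)
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_scores)
    group_means = [statistics.fmean(g) for g in groups]
    # Between-groups sum of squares: how far group means are from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups sum of squares: how far scores are from their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = len(all_scores) - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


if __name__ == "__main__":
    # Hypothetical scores for three levels of the IV
    data = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    f_ratio, df_b, df_w = one_way_anova_f(data)
    print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

Moving the group means further apart raises the numerator, while noisier scores within the groups raise the denominator, which matches the reasoning about maximizing experimental variance and minimizing error variance.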
UPDATE: I found an exceptional post on how to do one-way ANOVA using Python. In fact, there are 4 different Python methods for doing a Python ANOVA: One-Way ANOVA in Python.
Planned comparisons are done to probe possible significant differences between the means. The F-ratio only tells us that there IS a difference, not in which direction or between which groups. That is determined by means of a planned comparison (also called an a priori comparison or contrast).
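A planned comparison can be sketched as a weighted contrast of the group means. The example below assumes the common textbook form, in which the contrast weights sum to zero and the error term reuses the within-groups mean square from the ANOVA; the function name, the weights, and the data are made up for illustration.

```python
import math
import statistics


def planned_contrast_t(groups, weights):
    """Return (t, df within) for a contrast with one weight per group."""
    if len(groups) != len(weights):
        raise ValueError("need exactly one weight per group")
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to zero")
    means = [statistics.fmean(g) for g in groups]
    df_within = sum(len(g) for g in groups) - len(groups)
    # Within-groups mean square, the same error term used in the F-ratio
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_within = ss_within / df_within
    # Contrast value: weighted sum of the group means
    contrast = sum(w * m for w, m in zip(weights, means))
    standard_error = math.sqrt(
        ms_within * sum(w ** 2 / len(g) for w, g in zip(weights, groups)))
    return contrast / standard_error, df_within


if __name__ == "__main__":
    # Hypothetical question: do groups 1 and 2 together differ from group 3?
    data = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    t, df = planned_contrast_t(data, weights=(1, 1, -2))
    print(f"t({df}) = {t:.2f}")
```

The sign of t gives the direction of the difference, and the weights make explicit which groups are being compared, which is exactly what the overall F-ratio cannot tell us.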