The three (main) principles of experimental design
- Replication
Repeating an experiment lets you assess the consistency and reliability of any observed effects. If you observe the effect repeatedly, it is less likely that it occurred simply due to random variation. There are many ways replication can be included in an experiment. What you repeat (e.g., treatments or measurements) depends on what effect or source of variation you wish to investigate. The table below summarizes three of the most commonly employed types of replication.
Replication Type | Description | Why |
---|---|---|
Biological Replication | Each treatment is independently applied to several humans, animals, or plants. | To generalize results to the population. |
Technical Replication | Two or more samples from the same biological source independently processed. | Advantageous if processing steps introduce a lot of variation; increases precision in comparing relative abundances between treatments. |
Pseudo-replication | One sample from the same biological source divided into two or more aliquots independently measured. | Advantageous for noisy measuring instruments; increases precision in comparing relative abundances between treatments. |
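One way to see why the different types of replication help is to write down the variance of a treatment mean. As a sketch, assume a balanced nested design with \(n\) biological units per treatment, \(m\) independently processed samples per unit, and \(k\) aliquots measured per sample. Assume also that the biological, processing, and measurement contributions are independent, with variances \(\sigma^2_B\), \(\sigma^2_P\), and \(\sigma^2_M\). All of these symbols are introduced here for illustration and are not part of the original notes. Then

\[
\mathrm{Var}(\bar{y}) = \frac{\sigma^2_B}{n} + \frac{\sigma^2_P}{nm} + \frac{\sigma^2_M}{nmk}.
\]

Under these assumptions, increasing \(k\) only shrinks the measurement term, increasing \(m\) shrinks the processing and measurement terms, and only increasing \(n\) shrinks all three.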
- Randomization
Employing randomization in your experiment helps ensure the validity, reliability, and generalizability of your results. The main reason to randomize the allocation of treatments to experimental units is to protect against bias. We typically wish to plan the experiment in such a way that the variation caused by extraneous factors can all be combined under the general heading of “chance”. Doing so ensures that each treatment has the same probability of getting good (or bad) units and thus avoids systematic bias. Random allocation also means that any other possible causes of the experimental results are, on average, balanced between groups. Hence, by creating comparable treatment groups through random assignment we can minimize bias and increase the validity and generalizability of the results.
Statistical analysis typically assumes that observations are independent. This is almost never strictly true in practice, but randomization means that our estimates will behave as if they were based on independent observations. Randomization is also useful in situations where we may be unaware of all potential variables, as it tends to distribute the effects of unknown factors evenly across treatment groups. In addition, randomizing treatment allocation can aid in the ethical conduct of our research as it promotes fair distribution of benefits and risks among participants.
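Random allocation of treatments to experimental units can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function name `randomize_treatments`, the unit names, and the treatment labels are all hypothetical, and the sketch assumes (near-)equal group sizes are wanted.

```python
import random

def randomize_treatments(units, treatments, seed=None):
    """Randomly allocate treatments to units in (near-)equal group sizes."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    # Build one treatment label per unit, cycling through the treatments
    labels = [treatments[i % len(treatments)] for i in range(len(units))]
    # Shuffle the labels so every unit is equally likely to get each treatment
    rng.shuffle(labels)
    return dict(zip(units, labels))

# Hypothetical example: eight units, two treatments
units = [f"unit_{i}" for i in range(1, 9)]
allocation = randomize_treatments(units, ["control", "treatment"], seed=1)
for unit, label in allocation.items():
    print(unit, label)
```

Recording the seed alongside the results means the allocation can be audited later without the experimenter having had any say in which unit received which treatment.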
- Blocking
Blocking helps control variability by making treatment groups more alike. Experimental units are divided into subsets (called blocks) so that units within the same block are more similar (homogeneous) than units from different blocks. The experiment is then conducted separately within each block. Blocking is a technique for dealing with nuisance factors: factors that have some effect on the response but are of no interest in themselves. By grouping similar units together in blocks we reduce the effect of the nuisance factors, which can improve the precision of our estimates of the magnitude of effects. In addition, blocking helps prevent confounding by controlling the impact of known or suspected sources of variation.
Let’s briefly delve into the realms of fantasy[^6]. The fantastical example we consider is set at Basgiath College where students are trained to serve the Empire. Students are trained as either Riders or Scribes, entering the respective Quadrant. Along comes a new Professor, Professor Llewelyn, who thinks that their new dragon-riding coaching technique is guaranteed to result in huge improvements in a student’s dragon-riding ability. They conduct an experiment comparing the new and old coaching techniques.
To conduct their experiment they use a mix of students from the two different Quadrants: Riders (who have been riding dragons all their lives) and Scribes (who have never ridden a dragon before). Before any coaching is given, all students are asked to complete a timed dragon assault course whilst atop a dragon. Then, over a few months, each student is randomly assigned one of the coaching techniques (i.e., new or old). After this, the students are asked to complete another assault course of similar difficulty and their time is again recorded. Their improvement (if any) in dragon riding is measured by the difference between the two times.
However, the huge variation in baseline riding ability between these two groups (i.e., Riders are obviously going to be much better at riding dragons than Scribes) will obscure any improvement the new coaching technique may induce. Therefore, to find out if the new dragon-riding coaching technique works, Professor Llewelyn blocks by Quadrant, setting up the experiment as follows.
Component | Description |
---|---|
Objective of experiment | Examine the impact of a new dragon riding coaching technique |
Primary Variable of Interest | Improvement in dragon riding (difference between assault course completion times) |
Nuisance Factors Introducing Variability | Quadrant (i.e., Riders or Scribes) |
Blocking | We use blocking based on Quadrants to control for variability introduced by the differences among Riders and Scribes |
Experimental Units | Students are the experimental units |
Blocking Procedure | 1. Identify the Quadrant of each student (i.e., Rider or Scribe). 2. Create blocks: group students based on Quadrant. 3. Random assignment within blocks: within each Quadrant, randomly assign students to the coaching techniques (i.e., old and new). This ensures each technique is tested within every Quadrant. |
Experimental Setup | Each block (representing a Quadrant) contains a random assignment of students receiving different dragon riding coaching. |
Benefits of Blocking | By using blocking based on Quadrant, the experiment controls for potential variation in dragon riding improvement caused by the baseline differences in a student’s dragon riding ability due to their Quadrant. This enhances the precision of the experiment and allows for a more accurate assessment of the impact of the coaching on dragon riding skill. |
Example Outcome | After the experiment, researchers analyze dragon riding improvement separately for each Quadrant, drawing conclusions about the technique’s effectiveness within specific Quadrants while minimizing the influence of Quadrant on the overall results. |
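The blocking procedure in the table (group by Quadrant, then randomize within each group) can be sketched as follows. This is an illustrative sketch only: the function name `randomize_within_blocks`, the student identifiers, and the block size of six students per Quadrant are assumptions made for the example, not details from the experiment described above.

```python
import random
from collections import Counter

def randomize_within_blocks(block_of, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    # Step 1-2: group the experimental units by their block
    blocks = {}
    for unit, block in block_of.items():
        blocks.setdefault(block, []).append(unit)
    # Step 3: shuffle a balanced set of treatment labels inside each block
    allocation = {}
    for members in blocks.values():
        labels = [treatments[i % len(treatments)] for i in range(len(members))]
        rng.shuffle(labels)
        for unit, label in zip(members, labels):
            allocation[unit] = label
    return allocation

# Hypothetical roster: six Riders and six Scribes
students = {f"rider_{i}": "Rider" for i in range(1, 7)}
students.update({f"scribe_{i}": "Scribe" for i in range(1, 7)})

allocation = randomize_within_blocks(students, ["old", "new"], seed=42)
# Count how many students in each Quadrant receive each technique
counts = Counter((students[s], technique) for s, technique in allocation.items())
print(counts)
```

Because the shuffle happens inside each Quadrant, every Quadrant ends up with the same number of students on each technique, which a single shuffle across all students would not guarantee.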
A simple example: Charlotte’s coffee
Last Christmas I was given a gift set of three types of coffee beans. I want to know which makes the darkest coffee; to do this I measure the opacity after the coffee is made.
- The three types of coffee beans are Arabica, Liberica, and Robusta.
- 12 identical cups are chosen, and each of the three treatments is randomly allocated to a set of four cups.
- Four sets of each type of coffee bean are ground and cups made (in the same way), resulting in a total of 12 cups of coffee (see below).
- Samples are taken from each cup and the coffee colour is measured.
Experiment design:
Scientific question: Does coffee colour differ between types of coffee bean?
There are 3 treatments (types of coffee beans): Arabica, Liberica, and Robusta.
In this case the experimental unit is the coffee cup, as each cup is allocated a bean type (treatment).
The observational unit changes depending on the scenario (i.e., what and how samples are taken). For example,
- if a single ml of liquid is taken from each cup and one measurement is taken per ml \(\rightarrow\) the observational unit would be the cup;
- if a single ml of liquid is taken from each cup, two subsamples are taken from each ml, and one measurement is taken per subsample \(\rightarrow\) the observational unit would be the subsample;
- if four \(\times\) 1 ml of liquid are taken from each cup and a measurement is taken from each ml \(\rightarrow\) the observational unit would be each 1 ml sample.
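The three sampling scenarios can be laid out explicitly to show that the number of measurements changes while the number of experimental units does not. The sketch below is illustrative only: the function `coffee_layout`, the cup numbering, and the scenario labels are hypothetical names introduced for this example, and no colour measurements are generated.

```python
import random

BEANS = ["Arabica", "Liberica", "Robusta"]

def coffee_layout(samples_per_cup, subsamples_per_sample, seed=None):
    """Return one row per measurement for the 12-cup coffee experiment."""
    rng = random.Random(seed)
    # Four cups per bean type, randomly allocated across the 12 cups
    labels = BEANS * 4
    rng.shuffle(labels)
    rows = []
    for cup, bean in enumerate(labels, start=1):
        for sample in range(1, samples_per_cup + 1):
            for subsample in range(1, subsamples_per_sample + 1):
                rows.append({"cup": cup, "bean": bean,
                             "sample": sample, "subsample": subsample})
    return rows

# (samples per cup, subsamples per sample) for the three scenarios
scenarios = {
    "one 1 ml sample, one measurement": (1, 1),
    "one 1 ml sample, two subsamples": (1, 2),
    "four 1 ml samples, one measurement each": (4, 1),
}

for description, (samples, subsamples) in scenarios.items():
    rows = coffee_layout(samples, subsamples, seed=0)
    n_cups = len({row["cup"] for row in rows})
    print(f"{description}: {n_cups} experimental units, {len(rows)} measurements")
```

The scenarios give 12, 24, and 48 measurements respectively, but in every case there are still only 12 cups, so the extra measurements do not add independent replicates of the bean type.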
[^6]: A realm based on that in Fourth Wing.