Stratified sampling, also known as stratified random sampling, is a probability sampling technique that considers the different layers or strata characterizing a population and allows you to replicate those layers in the sample.
The stratification in stratified sampling is done based on shared characteristics of the population members such as age, gender, income, education, etc.
This research tutorial will look at stratified sampling, how and when to use it, and compare it with other probability sampling techniques.
Furthermore, we will conduct a step-by-step stratified sampling on a hypothetical population so that you can become familiar with the stratified random sampling technique as quickly as possible.
Without further ado, let’s get started.
What is Stratified Sampling?
The members of a population usually differ in characteristics and features [age, gender, income, education, etc.].
Unlike other probability sampling techniques, e.g., simple random sampling, stratified sampling first divides the population into groups of members who share the same characteristics and features, and only then samples from each group.
Definition: Stratification is the process of classifying the population into specific groups. The groups are referred to as layers or strata.
For example, a population can consist of members of different income levels. If your research goal is to determine if a product is overpriced, it makes perfect sense to stratify a population into different income groups [e.g., low, average, high income].
Why? Because members of different income levels may have different opinions about a product price. For instance, medium and high-income members may find the price affordable, while low-income members may find it overpriced.
How To Conduct Stratified Sampling?
There are four essential steps in taking a stratified sample from a population. In this section, we will discuss each step in detail.
Step 1: Define Your Population
The first step in sampling a population is to define the population itself.
For instance, if your study focuses on Gen Y purchasing behavior in online shopping, it makes perfect sense to define your population to include members born between 1981 and 1996.
Similarly, if you plan to study the perception of older adults towards nursing homes, it makes no sense to include students as part of your population.
Remember, the population you define in your research must be relevant to the context of your study.
Let’s assume we have a hypothetical population of 50 members. The first thing we need to do is to list the population members, as shown in the table below. As you can see, our population is a mix of male and female names.
Step 2: Stratification [grouping]
Once you define your population, you need to choose the characteristics for stratification.
This is a crucial step as one member of the population can only be assigned to one group. You should clearly define the criteria for assigning a member to a group in your paper.
You can stratify your population based on multiple characteristics as well, as long as you can exactly match one member to one group.
For instance, if you want to stratify for both gender [male, female] and income [low, average, high], you will have 2 x 3 = 6 groups. In this way, a male with an average income will be assigned only to one sub-group, namely the males with an average income sub-group.
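The 2 x 3 grouping can be sketched in a few lines of Python. This is only an illustration: the category labels come from the example above, but the income levels attached to the three names are made up for the demonstration.

```python
from itertools import product

# Category labels taken from the example in the text.
GENDERS = ["male", "female"]
INCOMES = ["low", "average", "high"]

# Every combination of the two characteristics is one stratum: 2 x 3 = 6.
strata = {combo: [] for combo in product(GENDERS, INCOMES)}

# Hypothetical members described as (name, gender, income).
# The income values are invented for this sketch.
members = [
    ("Dennis", "male", "average"),
    ("Nona", "female", "high"),
    ("Louis", "male", "low"),
]

for name, gender, income in members:
    # The (gender, income) pair is a single dictionary key, so each
    # member can only ever land in one stratum.
    strata[(gender, income)].append(name)

print(len(strata))                   # 6
print(strata[("male", "average")])   # ['Dennis']
```

Because the pair of characteristics acts as one key, the "one member, one group" rule is enforced by construction rather than by manual checking.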
To keep things simple and easier to understand, let’s assume we want to take a gender-based stratified sample of 10 from our hypothetical population.
Because we want to classify our population based on gender, we will separate our population members into two main groups: males and females [Table 2 and 3].
As you can see in the tables above, our hypothetical population consists of 20 males and 30 females. To preserve the random nature of stratified sampling, the names in the table above were not listed in alphabetical order.
Step 3: Random Selection
Now that our population is stratified, it is time to perform the sampling using random selection.
NOTE: In Stratified Sampling and any other probability sampling technique, each member of the population MUST have an equal and independent chance of being selected.
Equal because the researcher has no way of favoring one person over another, and independent because choosing one person does not make it more or less likely that another person is chosen. This is a crucial aspect of probability sampling.
We need to assign a number to each name in both the male and female tables, as seen below. You will see in a moment why.
|1. Dennis||11. Fred|
|2. Louis||12. Bruce|
|3. James||13. Stephen|
|4. Harry||14. Nicholas|
|5. Jeremy||15. Jerry|
|6. Leo||16. Peter|
|7. Michael||17. Douglas|
|8. Sean||18. John|
|9. Christopher||19. Russell|
|10. Paul||20. Terry|
|1. Nona||11. Louise||21. Marilyn|
|2. Luna||12. Ava||22. Stephanie|
|3. Heather||13. Aria||23. Rebecca|
|4. Helen||14. Ruby||24. Olivia|
|5. Natcha||15. Amelia||25. Patricia|
|6. Alice||16. Donna||26. Christina|
|7. Riley||17. Sara||27. Emma|
|8. Emily||18. Anna||28. Judith|
|9. Catherine||19. Judy||29. Mia|
|10. Sara||20. Debora||30. Sophia|
Next, we need to create a table of random numbers using Microsoft® Excel or any other spreadsheet application.
NOTE: If you don’t know how to do that, here is a step-by-step guide to generate random numbers using Excel. It takes less than 5 minutes.
Your table with random numbers should look like the one below:
So what’s with all those random numbers, and how are we going to use them?
In any probability sampling, avoiding any bias is probably the most important aspect to be considered. So instead of selecting a name, we will select a random number associated with a name. This way, every individual in our population has an equal and independent chance of being selected.
You can also see that the randomly generated numbers above are unique [do not repeat].
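If you prefer code to a spreadsheet, a table of unique random numbers can also be produced with Python's standard library. The sketch below is one possible approach, not the method used to build the table in this tutorial. The function name, the 10 x 5 table size, the five-digit range, and the seed are all assumptions chosen for illustration.

```python
import random

def random_number_table(rows, cols, seed=None):
    """Return a rows x cols table of unique five-digit random numbers."""
    rng = random.Random(seed)
    # sample() draws without replacement, so no number can repeat.
    # The range 10000-99999 is an assumption that keeps every entry
    # exactly five digits long.
    numbers = rng.sample(range(10000, 100000), rows * cols)
    return [numbers[r * cols:(r + 1) * cols] for r in range(rows)]

# Hypothetical table size and seed, chosen only for this example.
table = random_number_table(rows=10, cols=5, seed=1)
for row in table:
    print(" ".join(str(n) for n in row))
```

Passing a seed makes the table reproducible, which is handy if you need to document exactly how your sample was drawn.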
Now, let’s assume we want to take a sample of 10 from our hypothetical population of 50 males and females.
Because we want our sample to represent the population we are sampling, it should preserve the population’s ratio of 40% males to 60% females, that is, 4 males and 6 females.
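The arithmetic behind this split is a proportional allocation, and it can be checked with a short sketch. The function name is hypothetical. The plain rounding used here works because the shares come out as whole numbers in our example.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Split sample_size across strata in proportion to their sizes."""
    population = sum(stratum_sizes.values())
    # round() is enough here because the shares are whole numbers.
    # Awkward fractions would need an extra rule so that the parts
    # still add up to sample_size.
    return {
        name: round(sample_size * size / population)
        for name, size in stratum_sizes.items()
    }

sizes = {"male": 20, "female": 30}
allocation = proportional_allocation(sizes, 10)
print(allocation)   # {'male': 4, 'female': 6}

# Chance that any one member ends up in the sample, per stratum.
chances = {name: allocation[name] / sizes[name] for name in sizes}
print(chances)
```

Both strata give each member the same 1-in-5 chance of selection [4 out of 20 males, 6 out of 30 females], which is what keeps the sample in line with the equal-chance requirement.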
Step 4: Sampling
Let’s start first by sampling the male group. With the table of random numbers in front of you, close your eyes and point your finger [or pen] anywhere on the table. This represents the starting point in the sampling process.
For instance, on my first try I selected the number 90772 in the second column, third row from the top. Since the numbers assigned to our population members have at most two digits, we will use the last two digits of every random number we select.
The last two digits of 90772 are 72, which is outside the range of 01-20 assigned to the male population [Table 4]. In this case, we can use the next two digits in the same number, namely 07.
The number 07 is assigned to the name Michael. Note this name and repeat the process another 3 times until the male sample [4 subjects] is completed.
Then use the same procedure to sample the female stratum, repeating the process 6 times until the female sample [6 subjects] is completed.
Once you have finished, you should have a sample size of 10 consisting of 4 males and 6 females.
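The two-digit rule used for the male group can be written down as a small sketch. The male names and the number 90772 come from this tutorial. The other four random numbers are invented for the example. Skipping a number when none of its digit pairs fits, and skipping a member who was already picked, are my own assumptions about how to handle those cases.

```python
MALES = [
    "Dennis", "Louis", "James", "Harry", "Jeremy", "Leo", "Michael",
    "Sean", "Christopher", "Paul", "Fred", "Bruce", "Stephen",
    "Nicholas", "Jerry", "Peter", "Douglas", "John", "Russell", "Terry",
]

def pick_member_number(random_number, stratum_size):
    """Read digit pairs from right to left and return the first pair
    between 1 and stratum_size, or None if no pair fits."""
    digits = str(random_number)
    for end in range(len(digits), 1, -2):
        candidate = int(digits[end - 2:end])
        if 1 <= candidate <= stratum_size:
            return candidate
    return None

def sample_stratum(names, random_numbers, sample_size):
    """Walk through random_numbers until sample_size members are chosen."""
    chosen = []
    for number in random_numbers:
        position = pick_member_number(number, len(names))
        # Assumption: skip numbers with no usable pair and members
        # that were already selected.
        if position is None or position in chosen:
            continue
        chosen.append(position)
        if len(chosen) == sample_size:
            break
    return [names[p - 1] for p in chosen]

print(pick_member_number(90772, 20))   # 7

# Only 90772 is from the tutorial; the other numbers are made up.
print(sample_stratum(MALES, [90772, 41215, 33088, 60703, 12019], 4))
# ['Michael', 'Jerry', 'James', 'Russell']
```

The number 33088 is skipped in this run because neither 88 nor 30 falls within 01-20. The same two functions would work for the female group with a stratum size of 30 and a sample size of 6.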
Here is my random number selection for the male stratum using the stratified sampling technique explained so far:
Here is my table of male names corresponding with the numbers selected above in the order of selection:
Here is my random number selection for the female stratum using the stratified sampling technique explained so far:
Here is my table of female names corresponding with the numbers selected above in the order of selection:
There you go. You just took a stratified random sample using two groups.
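For larger populations, the whole procedure can be handed to a random number generator instead of a printed table. The sketch below is a shortcut under stated assumptions, not the manual method described above: the function name and the placeholder member IDs [M1, F1, ...] are hypothetical, and `random.sample` replaces the finger-on-the-table step.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Draw a proportionally allocated random sample from each stratum."""
    rng = random.Random(seed)
    population = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Each stratum contributes in proportion to its size.
        quota = round(sample_size * len(members) / population)
        # sample() picks without replacement, so nobody is chosen twice.
        sample[name] = rng.sample(members, quota)
    return sample

# Placeholder IDs standing in for the 20 male and 30 female names.
strata = {
    "male": [f"M{i}" for i in range(1, 21)],
    "female": [f"F{i}" for i in range(1, 31)],
}

result = stratified_sample(strata, sample_size=10, seed=42)
print(result)
```

The result contains 4 members from the male group and 6 from the female group, matching the 40/60 split of the population.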
Frequently Asked Questions
Often students are confused about why and when the stratified sampling technique should be used. Here are a few questions students frequently ask in a research class.
Q1: Why is stratified sampling used?
The most common reason for using stratified sampling is that it enables researchers to produce a sample that best reflects the total population being investigated while also ensuring that each subgroup of interest is represented.
Q2: Where is stratified random sampling used?
Stratified random sampling is used when a researcher seeks to understand any existing relationship between two or more groups in a given population. This technique allows even the smallest sub-group in a population to be represented in a study.
Q3: What are the advantages of stratified sampling?
Using stratified sampling provides a few advantages over other probability sampling techniques. For instance, it allows higher accuracy than a simple random sample of a similar size. Because it is more accurate, it is often less costly: it can reach the same precision in representing the larger population with a smaller sample.
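A small simulation can make the accuracy claim more concrete. Everything in the sketch below is hypothetical: the scores are invented so that the two groups differ a lot from each other while members within a group are similar, which is the situation where stratification tends to help most. With other data the gain may be smaller.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical scores: large gap between groups, small spread within.
males = [40 + i % 5 for i in range(20)]
females = [60 + i % 5 for i in range(30)]
population = males + females

def simple_random_mean():
    """Mean of a simple random sample of 10 from the whole population."""
    return statistics.mean(rng.sample(population, 10))

def stratified_mean():
    """Mean of a stratified sample: 4 of 20 males and 6 of 30 females."""
    return statistics.mean(rng.sample(males, 4) + rng.sample(females, 6))

# Repeat each sampling method many times and compare how much the
# estimated mean moves around from one sample to the next.
srs_means = [simple_random_mean() for _ in range(2000)]
strat_means = [stratified_mean() for _ in range(2000)]

print(statistics.pstdev(srs_means))    # spread of simple random estimates
print(statistics.pstdev(strat_means))  # spread of stratified estimates
```

In this setup the stratified estimates cluster much more tightly around the true population mean, because every sample is forced to contain the right mix of both groups.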
Q4: What is the difference between stratified sampling and simple random sampling?
The main difference is that stratified sampling divides the population into groups and sub-groups [also called strata or layers] before sampling, which typically provides greater accuracy. In contrast, simple random sampling draws directly from the whole population without any additional classification.
Stratified sampling is one of the most frequently used probability sampling techniques in research due to its accuracy and its ability to represent a population even when a small sample size is used.
It allows stratification of a population into groups [called strata or layers], so that even the smallest group has an equal chance of being represented in a study.
I hope this tutorial provides some value for your research. If so, take a second to share it with your colleagues and friends – they might need some help as well.