Simple Random Sampling in Python Programming

Hey, Python coder! This tutorial covers the most basic sampling technique, Simple Random Sampling, and how to implement it in Python. Before writing any code, let’s go through the terms and definitions behind the concept.
Introduction to Sampling
Let’s say you have a big packet of candies in different colors and want to know how they taste. Tasting every candy in the packet is impractical because it would take too much time and effort. Instead, you take a small group of candies, taste those, and judge the flavors from them. In the illustration below, the whole packet is the ‘population,’ and the small share of candies chosen is the ‘sample.’
In Machine Learning, Sampling works similarly. Instead of studying every item in a large group (population), you pick a smaller group (sample) to gather information from. This sample is selected carefully to give you an idea of the entire population without examining every member.
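As a rough sketch of this idea, the snippet below builds a made-up population of candy weights and compares its mean with the mean of a small random sample. The weights, the population size of 10,000, and the sample size of 200 are all assumptions chosen purely for illustration.

```python
import random
import statistics

# Hypothetical population: weights (in grams) of 10,000 candies.
rng = random.Random(42)  # seeded so the run is repeatable
population = [rng.gauss(5.0, 0.5) for _ in range(10_000)]

# Draw a small sample without replacement and compare the two means.
sample = rng.sample(population, 200)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean: {population_mean:.3f}")
print(f"Sample mean:     {sample_mean:.3f}")
```

The two printed values should be close to each other, even though the sample contains only 2% of the population.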
That’s what Sampling is. It can be done in several ways, such as simple random sampling, stratified sampling, and cluster sampling. This tutorial focuses on Simple Random Sampling.
Introduction to Simple Random Sampling
Now, instead of candies in general, let’s take Gummy Bears as a more specific example. In Simple Random Sampling, each Gummy Bear has an EQUAL chance of being selected for the sample. They all stand on the same level, and no bear is favored over another. The illustration below introduces some basic terminology for this example.
The key terms are:
- Big Group (Population): The whole packet of Gummy Bears, just as it came sealed from production.
- Equal Opportunity: In Simple Random Sampling, each gummy bear in the packet has an equal chance of being chosen.
- Random Selection: Now, you close your eyes, reach into the packet, and grab a gummy bear without looking. You’re not choosing based on color, shape, or specific criteria – it’s entirely random.
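The "equal opportunity" idea can be checked with a small experiment. The sketch below is only an illustration: the packet of 10 bears, the sample size of 3, and the 20,000 repetitions are assumed values. It counts how often each Gummy Bear lands in a sample and compares that with the theoretical chance, which is the sample size divided by the packet size.

```python
import random
from collections import Counter

rng = random.Random(0)  # seeded for repeatability

packet = [f"GB_{i}" for i in range(10)]  # small hypothetical packet
sample_size = 3
trials = 20_000

# Count how often each gummy bear ends up in a sample.
hits = Counter()
for _ in range(trials):
    hits.update(rng.sample(packet, sample_size))

expected = sample_size / len(packet)  # theoretical inclusion probability
for bear in packet:
    print(f"{bear}: {hits[bear] / trials:.3f} (expected {expected:.3f})")
```

Every Gummy Bear should show a frequency close to the expected value, which is what "no bear is favored" means in practice.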
Code Implementation for Simple Random Sampling
In this section, we will cover the implementation of Simple Random Sampling in a step-by-step manner. We will take the example of the Gummy Bear packet and apply random sampling to the packets we create.
Step 1 – Importing Modules.
We will use the following modules for this tutorial: NumPy, Matplotlib, and Python’s built-in random module.
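import random

import numpy as np
import matplotlib.pyplot as plt

# The 'seaborn' style was renamed 'seaborn-v0_8' in Matplotlib 3.6;
# on older Matplotlib versions, use plt.style.use('seaborn') instead.
plt.style.use('seaborn-v0_8')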
Step 2 – Create gummy bear packets.
Packets can be created with the code snippet below. It defines a simple function that takes a single parameter, the packet size, creates a list of Gummy Bears, and returns it.
def producePacket(packetSize):
    # Label each gummy bear with a unique ID: GB_0, GB_1, ...
    gummyBears = ["GB_" + str(i) for i in range(packetSize)]
    return gummyBears

packet_1 = producePacket(50)
This will result in a packet of 50 Gummy Bears. Using the code snippet below, let’s visualize the packet with a bar graph. We first count how many times each unique Gummy Bear appears and then plot those counts as a bar chart. Since every Gummy Bear has a unique label, each count is 1.
# Count how many times each unique gummy bear label appears
unique, counts = np.unique(packet_1, return_counts=True)

plt.figure(figsize=(20, 6))
plt.bar(unique, counts)
plt.title("Initial Data for Gummy Bears Packet")
plt.xlabel("Gummy Bears ->")
plt.ylabel("Count ->")
plt.xticks(rotation=45, ha='right')  # tilt labels so they do not overlap
plt.tight_layout()
plt.show()
The resulting plot is shown below:
Step 3 – Implement Random Sampling
To implement random sampling, we first check whether the requested sample size exceeds the packet size. Because the sample is drawn without replacement, it cannot contain more Gummy Bears than the packet holds. If the size is valid, we take a random sample using the random.sample function and plot the sampled data using the same approach as earlier.
def simpleRandomSampling(data, sampleSize):
    sizePacket = len(data)

    # A sample drawn without replacement cannot be larger than the packet
    if sampleSize > sizePacket:
        print("Sample size cannot be greater than packet size!")
        return None

    # Pick sampleSize distinct gummy bears, each equally likely
    sampledGB_1 = random.sample(data, sampleSize)

    # Plot the sampled gummy bears the same way as the full packet
    unique, counts = np.unique(sampledGB_1, return_counts=True)
    plt.figure(figsize=(4, 4))
    plt.bar(unique, counts)
    plt.title("Sampled Data for Gummy Bears Packet")
    plt.xlabel("Gummy Bears ->")
    plt.ylabel("Count ->")
    plt.xticks(rotation=45, ha='right')
    plt.tight_layout()
    plt.show()
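Calling the function with a sample size of 10 gummy bears, for example simpleRandomSampling(packet_1, 10), produces the plot of the sampled data shown below. Because the selection is random, your plot will most likely show a different set of Gummy Bears on each run.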
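If you want to inspect the sample itself or get the same sample on every run, one option is to separate the sampling step from the plotting. The sketch below is not part of the tutorial’s code: the helper name draw_sample and its seed parameter are hypothetical, and it raises a ValueError instead of printing a message.

```python
import random

def draw_sample(data, sample_size, seed=None):
    """Return a simple random sample (without replacement) from data."""
    if sample_size > len(data):
        raise ValueError("Sample size cannot be greater than packet size!")
    rng = random.Random(seed)  # seed=None gives a different draw each run
    return rng.sample(data, sample_size)

packet = [f"GB_{i}" for i in range(50)]

first = draw_sample(packet, 10, seed=7)
second = draw_sample(packet, 10, seed=7)

print(first)
print(first == second)        # same seed, same sample
print(len(set(first)) == 10)  # no gummy bear is picked twice

try:
    draw_sample(packet, 60)   # more bears than the packet holds
except ValueError as err:
    print("Error:", err)
```

Fixing the seed is handy while learning or debugging; leave it as None when you want a fresh random sample each time.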
I hope you now have a clear understanding of Simple Random Sampling and how to implement it in Python.
Happy Learning!