Get values from column that appear more than N times in Pandas

In this tutorial, we will learn how to get the values from a column that appear more than N times in Pandas.

DataFrames often contain repeated values in a particular column. In this tutorial, we will see how to find the values that occur more than a given number of times in a specific column of a Pandas DataFrame. The value_counts() method makes this easy: it returns the number of times each distinct value occurs in the column, and we can then keep only the values whose count exceeds a threshold.
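As a minimal sketch of what value_counts() returns, consider the small Series below. The name `s` and its contents are made-up sample data for illustration only.

```python
import pandas as pd

# Hypothetical sample data, used only to illustrate value_counts()
s = pd.Series(['x', 'y', 'x', 'z', 'x', 'y'])

# value_counts() returns a Series whose index holds the distinct values
# and whose data holds their counts, sorted from most to least frequent
counts = s.value_counts()

print(counts.to_dict())  # {'x': 3, 'y': 2, 'z': 1}
```

Because the result is an ordinary Series, it can be filtered with a boolean condition such as `counts > 1`.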

To get values from a column in a Pandas DataFrame that appear more than a certain number of times, you can use the value_counts() method in combination with a condition on the returned Pandas Series. Here is an example:

# Import the necessary modules
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({'col1': ['a', 'b', 'c', 'd', 'e', 'c', 'b', 'a', 'd'],
                   'col2': [1, 2, 3, 4, 5, 6, 7, 8, 9]})

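# Set the number of occurrences
n = 1

# Get the values from 'col1' that appear more than n times
values = df['col1'].value_counts()
values = values[values > n].index.tolist()

# Print the resulting values
print(values)

This will print the values from the ‘col1’ column that appear more than once. In the sample data, ‘a’, ‘b’, ‘c’ and ‘d’ each appear twice and ‘e’ appears only once, so the list contains ‘a’, ‘b’, ‘c’ and ‘d’ (the order of values with equal counts may vary):

['a', 'b', 'c', 'd']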
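Once the frequent values are known, a common next step is to keep only the rows that contain them. The sketch below is one possible way to do this with isin(), using the same sample data and a threshold of n = 1. The variable names `frequent` and `filtered` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', 'b', 'c', 'd', 'e', 'c', 'b', 'a', 'd'],
                   'col2': [1, 2, 3, 4, 5, 6, 7, 8, 9]})
n = 1

# Count how often each value occurs, then keep the values above the threshold
counts = df['col1'].value_counts()
frequent = counts[counts > n].index

# Keep only the rows whose 'col1' value is one of the frequent values
filtered = df[df['col1'].isin(frequent)]

print(filtered)
```

Here the row containing ‘e’ is dropped, because ‘e’ occurs only once, and the remaining eight rows keep their original order.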

Some more examples:

The example below shows the frequency of items in the column ‘Brand’ and prints the item that occurs more than once in the same column.

# Import the pandas library
import pandas as pd
# Create the data
sales_data = {'Devices': ['Laptop', 'iPhone', 'LED', 'LCD', 'Smart-Phone', 'Washing-Machine'],
              'Brand': ['Lenovo', 'Apple', 'Samsung', 'Samsung', 'Samsung', 'Whirlpool'],
              'Sales': [1000, 2000, 4000, 2000, 1000, 4000],
              'Profit': [500, 1000, 1000, 1500, 1000, 1500],
              'Pieces left': [5000, 4000, 4000, 5000, 5000, 1000]}
# Create the pandas DataFrame
df = pd.DataFrame(sales_data)
# Frequency of items in the column 'Brand'
Frequency = df.Brand.value_counts()
# Select the first item that occurs more than once
Frequent_product = Frequency[Frequency > 1].index[0]
# Display the items along with their frequency
# (print is used so the script also runs outside a notebook)
print(Frequency)
# Print the item that occurs more than once
print("This item appears more than once:", Frequent_product)

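Output:

The frequency table lists Samsung with a count of 3 and every other brand with a count of 1, so the item reported as appearing more than once is Samsung.

The next example applies the same approach to the ‘Sales’ column, which holds string values here.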

# Import the pandas module
import pandas as pd
# Create the data
sales_data = {'Devices': ['Laptop', 'iPhone', 'LED', 'LCD', 'Smart-Phone', 'Washing-Machine'],
              'Brand': ['Lenovo', 'Apple', 'Samsung', 'Samsung', 'Samsung', 'Whirlpool'],
              'Sales': ["1k", "2k", "4k", "4k", "1k", "4k"],
              'Profit': ["Five Hundred", "One Thousand", "One Thousand", "Four Thousand", "One Thousand", "Fifteen Hundred"],
              'Pieces left': [5000, 4000, 4000, 5000, 5000, 1000]}
# Create the pandas DataFrame
df = pd.DataFrame(sales_data)
# Frequency of each value in the column 'Sales'
Frequency = df.Sales.value_counts()
# Most frequent sales value among those that occur more than once
Frequent_sales = Frequency[Frequency > 1].index[0]
# Print the frequency of each sales value
print(Frequency)
# Print the most frequent sales value
print("Most frequent sales value:", Frequent_sales)

Output:

The frequency table shows ‘4k’ three times, ‘1k’ twice and ‘2k’ once, and the most frequent value printed is ‘4k’.

In this example, we obtain the frequency of each item in the column ‘Sales’ and also the most frequent item in the same column.
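The examples above use `.index[0]`, which returns only the first (most frequent) matching value. If every value above the threshold is needed, the pattern can be wrapped in a small helper. The function name `values_more_than` and the single-column DataFrame `sales` below are hypothetical, written here as a sketch rather than taken from Pandas itself.

```python
import pandas as pd

def values_more_than(df, column, n):
    """Return every value in `column` that occurs more than n times."""
    counts = df[column].value_counts()
    return counts[counts > n].index.tolist()

# Sample data reusing the 'Sales' values from the example above
sales = pd.DataFrame({'Sales': ["1k", "2k", "4k", "4k", "1k", "4k"]})

print(values_more_than(sales, 'Sales', 1))  # ['4k', '1k']
print(values_more_than(sales, 'Sales', 2))  # ['4k']
```

If no value exceeds the threshold, the helper returns an empty list, whereas `.index[0]` on an empty result would raise an IndexError.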
