Introduction to Apriori algorithm
This tutorial introduces the Apriori algorithm, one of the algorithms used in recommendation systems. E-commerce websites and platforms such as YouTube show us recommended content, and that content is produced by a recommendation system.
If you want to learn how to implement this algorithm in Python, see this tutorial: Apriori Algorithm in Python
To understand Apriori better, you should be familiar with recommendation systems. Apriori belongs to association rule mining, a data mining technique that recommendation systems can use. Basically, association rules are used to find useful relationships between items in large datasets. Recommendation systems are broadly of two kinds:
- One kind of recommendation system recommends items based on the buying behavior of users (collaborative filtering). For instance, if users with similar purchases frequently buy chocolates, it will recommend chocolates to the user.
- The second kind of recommendation system finds the feature that made the user buy an item and recommends other items possessing that feature (content-based filtering).
What is Apriori algorithm?
The Apriori algorithm is one of the algorithms used in recommendation systems. It is generally applied to transactional databases, i.e. data in which each record is a transaction listing the items bought together. The result is the set of frequent item sets, i.e. groups of items that are bought together at least a minimum number of times.
Apriori property: The support count of an item set is the number of transactions in which all of its items occur together. An item set is infrequent if its support count is less than the given minimum support count. If an item set is infrequent, all of its supersets are also infrequent, so their support does not need to be calculated. This reduces the search space.
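To make the Apriori property concrete, here is a minimal sketch in Python. The transactions, the helper name support_count and the threshold of 2 are all made up for illustration. It shows that once an item set falls below the minimum support count, every superset of it does too.

```python
# Hypothetical toy transactions; each one is the set of items bought together.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "jam"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)

min_support_count = 2  # assumed threshold for this example

jam_count = support_count({"jam"}, transactions)
milk_jam_count = support_count({"milk", "jam"}, transactions)

# "jam" is infrequent, so any super set such as {"milk", "jam"} cannot be
# frequent either: its count can never exceed the count of "jam" alone.
print(jam_count < min_support_count)   # True
print(milk_jam_count <= jam_count)     # True
```

Because of this, the algorithm never needs to count super sets of an item set it has already discarded.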
Steps of the Apriori algorithm:
- Start with one-item sets.
- Compare the support count of each one-item set with the given minimum support count. If it is greater than or equal to the minimum, keep the item; otherwise remove it from your item set.
- Now join the remaining items to generate two-item sets.
- As in step 2, compare the support count of each two-item set with the minimum support count. Keep the frequent ones and remove the infrequent ones.
- Keep generating three-item sets, four-item sets and so on from the frequent sets of the previous step. Terminate the algorithm when no new frequent item set can be generated.
- Once we have the frequent item sets, we generate rules from them. The rules that pass the threshold are the strong rules, and we use them to recommend items to the user.
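The item set generation steps above can be sketched in Python as follows. This is only an illustrative sketch, not an optimized implementation: the function names frequent_itemsets and support_count and the sample transactions are assumptions made for this example.

```python
from itertools import combinations

def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def frequent_itemsets(transactions, min_support_count):
    """Return {frozenset: support count} for every frequent item set."""
    transactions = [frozenset(t) for t in transactions]
    # Steps 1-2: keep the one-item sets that meet the minimum support count.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        candidate = frozenset([item])
        count = support_count(candidate, transactions)
        if count >= min_support_count:
            current[candidate] = count
    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Step 3: join frequent (k-1)-item sets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Apriori property: drop candidates that have an infrequent subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Step 4: keep only the candidates that meet the minimum support count.
        current = {}
        for c in candidates:
            count = support_count(c, transactions)
            if count >= min_support_count:
                current[c] = count
    return frequent

# Hypothetical transactions for illustration.
transactions = [
    {"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"},
]
result = frequent_itemsets(transactions, min_support_count=3)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With a minimum support count of 3, the three single items and the three pairs are frequent, while (A, B, C) occurs together only twice and is dropped, so the loop terminates there.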
For instance, suppose we now have the frequent item set (A, B, C). We generate the rules shown below:
- A -> B C (order doesn’t matter, i.e. same as A -> C B)
- B -> A C
- C -> A B
- A B -> C
- B C -> A
- C A -> B
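The same list of rules can be produced programmatically. The sketch below uses a hypothetical helper named candidate_rules, which splits an item set into every non-empty left-hand side and puts the remaining items on the right-hand side.

```python
from itertools import combinations

def candidate_rules(itemset):
    """Yield (lhs, rhs) tuples for every non-empty split of `itemset`."""
    items = sorted(itemset)
    for size in range(1, len(items)):
        for lhs in combinations(items, size):
            rhs = tuple(i for i in items if i not in lhs)
            yield lhs, rhs

rules = list(candidate_rules({"A", "B", "C"}))
for lhs, rhs in rules:
    print(" ".join(lhs), "->", " ".join(rhs))
```

For a three-item set this yields six rules, matching the list above (the items on each side are printed in sorted order, so C A -> B appears as A C -> B).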
These are the possible rules. Now we find the confidence of each rule, which we calculate as:
confidence(LHS -> RHS) = support(LHS, RHS) / support(LHS)
For example, the confidence of the 1st rule will be: confidence(A -> B C) = support(A, B, C) / support(A)
where support(A, B, C) means the support count of A, B and C occurring together in the database.
Similarly, we calculate the confidence of all the rules. Then we compare the confidence of each rule with the given minimum confidence. The rules that satisfy this criterion are considered strong, and we recommend items on the basis of these rules.
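As a rough sketch of how rules are scored, the snippet below uses the standard confidence measure from association rule mining: the support count of all the items in the rule divided by the support count of the left-hand side. The transactions, the function names and the minimum confidence of 0.6 are assumptions chosen for this example.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)

def confidence(lhs, rhs, transactions):
    """Support count of LHS and RHS together, divided by that of LHS."""
    lhs_count = support_count(lhs, transactions)
    if lhs_count == 0:
        return 0.0  # the rule never applies, so treat it as not strong
    return support_count(set(lhs) | set(rhs), transactions) / lhs_count

# Hypothetical transactions for illustration.
transactions = [
    {"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"},
]
min_confidence = 0.6  # assumed threshold

conf_a = confidence({"A"}, {"B", "C"}, transactions)    # 2 / 4
conf_ab = confidence({"A", "B"}, {"C"}, transactions)   # 2 / 3

print("A -> B C is strong:", conf_a >= min_confidence)
print("A B -> C is strong:", conf_ab >= min_confidence)
```

Here A appears in four transactions but A, B and C appear together in only two, so A -> B C falls below the threshold, while A B -> C passes it.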
Limitations of Apriori algorithm:
- Time consuming: Although this algorithm reduces the search space, it still scans the database once for every item set size, so it takes a long time on a huge transactional database.
- Faulty rules: With a huge transactional database, we usually have to reduce the minimum support count. Consequently, more rules satisfy the minimum support, and some of them will be faulty, i.e. they reflect coincidence rather than a real buying pattern.
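To see how lowering the minimum support count lets more item sets through, here is a small brute-force sketch on made-up transactions. The helper count_frequent is hypothetical and checks every possible item set, which is only feasible on toy data like this.

```python
from itertools import combinations

# Hypothetical transactions for illustration.
transactions = [
    {"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"},
]
items = sorted(set().union(*transactions))

def count_frequent(min_support_count):
    """Brute-force count of item sets that meet the threshold."""
    total = 0
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            count = sum(1 for t in transactions if set(combo) <= t)
            if count >= min_support_count:
                total += 1
    return total

for threshold in (4, 3, 2):
    print("min support count", threshold, "->", count_frequent(threshold), "frequent item sets")
```

On this data the number of frequent item sets grows from 3 to 6 to 7 as the threshold drops from 4 to 3 to 2, and every extra item set produces extra candidate rules.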
Applications of Recommendation systems:
- Location recommendation: We can suggest places that a person may visit in future on the basis of their past visits.
- E-commerce websites: Many websites that sell products online use this algorithm to suggest similar items that the user may buy in future.
- Social media platforms: Social media platforms like Facebook, Instagram etc. use these engines. Taking into account the friends we have recently added, the user gets to see new suggestions.
This is all about “Introduction to Apriori algorithm”. In the next tutorial, we will discuss its implementation in Python. If you have any doubts, feel free to post them in the comments section.