Fetch all email IDs from a text file in Python
Hi there. In this tutorial, we are going to look at how you can fetch all the email IDs present in a text file using the Python programming language. There are situations where you need to find every email ID in a document. For example, an admin may need the email IDs of the students who registered for a particular course in order to follow up with them, or you may want to pull out an email ID written in your own resume. So let’s jump straight into it and see how we can do that.
Python program to fetch all email IDs from a text file
To do this, we need three things in our basket:
- Python’s urllib package. It is part of the standard library, so there is nothing to install. (pip install urllib3 installs a separate third-party package that this tutorial does not use.)
- Python’s regular expression module, re. It is also part of the standard library. (pip install regex installs a different third-party module that is not required here.)
- And last but not least, a text document.
Once you have all three, let’s move on and look at how we can read email IDs from a text document.
First, import the two packages:
import urllib.request
import re
Remember, urllib.request is a Python module for fetching URLs, which means it interacts with the internet. If you only want to work on a local file stored on your computer, then you do not need urllib.request.
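If the text does live on the web, the urllib.request route could look roughly like the sketch below. The function name fetch_emails_from_url and its encoding parameter are assumptions made for illustration and are not part of the original program; the pattern is the same one used in the main program of this tutorial.

```python
import re
import urllib.request

# Same pattern as the tutorial's main program, compiled once.
EMAIL_PATTERN = re.compile(r'\b([a-z0-9-_.]+?@[a-z0-9-_.]+)\b', re.I)


def fetch_emails_from_url(url, encoding='utf-8'):
    """Download the resource at `url` and return the email-like strings in it.

    Hypothetical helper for illustration; it assumes the resource is
    plain text in the given encoding.
    """
    with urllib.request.urlopen(url) as response:
        # read() returns bytes, so decode before applying the pattern
        text = response.read().decode(encoding, errors='replace')
    return EMAIL_PATTERN.findall(text)
```

You would call it with the address of your own document, for example fetch_emails_from_url('https://example.com/contacts.txt'), where that URL is only a placeholder.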
The other module you just imported, re, provides regular expressions. A regular expression is a string of text that describes a pattern, which helps you match, locate, and manage text.
Let’s jump to the code now:
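import urllib.request  # only needed if the text comes from a URL
import re

# Open the file in read mode; the with block closes it when done
with open('text.txt', 'r') as f_input:
    # Print every email ID found, ignoring letter case (re.I)
    print(re.findall(r'\b([a-z0-9-_.]+?@[a-z0-9-_.]+)\b', f_input.read(), re.I))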
Understanding the code:
- Import the required packages.
- Since you need to read the content of a text document, you first need to open it. To open the text document, we use Python’s open() function, which takes two parameters here: the filename and the mode ('r' for reading).
- The next step is to create a pattern that recognizes the email IDs present in the text document. For this, we use the pattern r'\b([a-z0-9-_.]+?@[a-z0-9-_.]+)\b', and the findall() function returns all of its matches in the text. Here [0-9] matches any digit between 0 and 9, and [a-z] matches any letter between a and z; because the re.I flag is passed, it matches both lower case and upper case. The remaining characters (-, _ and .) are matched literally, @ separates the two halves of the address, and \b marks a word boundary on each side.
- Lastly, we pass the file’s contents to findall(), which returns a list of all the matches, and print that list.
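To see how the pattern behaves without touching a file, here is a small sketch that applies it to an in-memory string. The helper name extract_emails and the sample addresses are made up for illustration.

```python
import re

# The tutorial's pattern, compiled once with the case-insensitive flag.
EMAIL_PATTERN = re.compile(r'\b([a-z0-9-_.]+?@[a-z0-9-_.]+)\b', re.I)


def extract_emails(text):
    """Return every email-like string in `text`, in order of appearance."""
    return EMAIL_PATTERN.findall(text)


sample = "Write to Admin@Example.COM, or to jane.doe@mail.example.org."
print(extract_emails(sample))
# ['Admin@Example.COM', 'jane.doe@mail.example.org']
```

The upper-case address is found because of re.I, and the full stop that ends the sentence is left out because the closing \b has to sit next to a letter, digit, or underscore. Keep in mind that this is a loose pattern for spotting addresses, not a full validator of email syntax.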
The input text file we used in this tutorial looks something like this:
CodeSpeedy. A Place Where You Find Solutions In Coding And Programming. Contact us at email@example.com
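If you would like to try the program on this input without creating the file by hand, one possible sketch writes the sample text to a temporary file first and then reads it back. The temporary directory and the variable names are assumptions for illustration only.

```python
import re
import tempfile
from pathlib import Path

SAMPLE_TEXT = (
    "CodeSpeedy. A Place Where You Find Solutions In Coding And Programming. "
    "Contact us at email@example.com\n"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create text.txt inside a throwaway directory
    path = Path(tmp_dir) / "text.txt"
    path.write_text(SAMPLE_TEXT, encoding="utf-8")

    # Same steps as the tutorial: open, read, and search
    with open(path, "r", encoding="utf-8") as f_input:
        emails = re.findall(r'\b([a-z0-9-_.]+?@[a-z0-9-_.]+)\b', f_input.read(), re.I)

print(emails)  # ['email@example.com']
```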
Thank you for learning with CodeSpeedy. Feel free to post your doubts in the comments section, and make sure you check out more amazing programming solutions in Python at CodeSpeedy.