Regular Expression Operations in Python

In this article, we will look at the regular expression operations available in Python. Before that, we need to know what a regular expression is:

A regular expression, regex or regexp is a sequence of characters that define 
a search pattern.
                                                               - Wikipedia

You should understand how regular expressions work before using them in Python. If they are new to you, please review the basic concepts first.

Regular Expression Operations in Python:

Python supports regular expressions through the built-in re module, which provides several useful functions. Some of the basic ones are:

  • re.findall():
    The re.findall() method finds all the non-overlapping substrings that match the given regex pattern. The matched substrings are returned as a list. If no matches are found, it returns an empty list.
    Let’s look at an example:

    import re
    
    string = "abaacada"
    
    # finds all non-overlapping substrings that match the pattern
    matches = re.findall("a.a", string)
    
    print(matches)
    

    Output:

    ['aba', 'aca']

    The substring “ada” is not in the list because its first “a” was already consumed as the last character of the preceding match, “aca”.
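
If the overlapping matches are needed as well, one common workaround is to wrap the pattern in a zero-width lookahead with a capturing group. The sketch below is an illustration under that assumption, not part of the original example; the variable names are chosen here for clarity.

```python
import re

string = "abaacada"

# Default behaviour: each match consumes characters, so matches cannot overlap.
non_overlapping = re.findall("a.a", string)

# Assumption for illustration: a lookahead (?=...) consumes no characters,
# and the capturing group inside it makes findall() return the captured text
# for every position where the pattern could start.
overlapping = re.findall("(?=(a.a))", string)

print(non_overlapping)  # ['aba', 'aca']
print(overlapping)      # ['aba', 'aca', 'ada']
```

Because the lookahead itself matches an empty string, the capturing group is what carries the text back to the caller.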

  • re.search():
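    The re.search() method is similar to the findall() method, but it stops at the first match and returns it as a match object instead of a list of strings. Call group() on that object to get the matched text. If no match is found, it returns None, so calling group() directly on the result raises an AttributeError in that case. Now, let’s look at an example: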

    import re
    
    string = "abaacada"
    
    # finds the first substring that matches the regex pattern
    match = re.search("a.a", string)
    
    # re.search() returns None when nothing matches, so check before calling group()
    if match:
        print(match.group())
    

    Output:

    aba
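
Since re.search() can return None, it may help to wrap the check in a small helper. The following is a minimal sketch; first_match is a hypothetical name introduced here, and it also reports where the match was found using the match object's span() method.

```python
import re

def first_match(pattern, text):
    """Hypothetical helper: return (matched text, (start, end)) or None."""
    match = re.search(pattern, text)
    if match is None:
        # No match anywhere in the text.
        return None
    # group() is the matched substring; span() is its (start, end) position.
    return match.group(), match.span()

print(first_match("a.a", "abaacada"))  # ('aba', (0, 3))
print(first_match("x.x", "abaacada"))  # None
```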

  • re.split():
    The re.split() method splits the string at every regex pattern match and returns the pieces as a list of strings. The matched text itself is not included in the result. Let’s look at an example:

    import re
    
    string = "abcdefghij"
    
    # splits the string at every vowel; a character class lists its
    # characters directly, without quotes, commas or spaces
    matches = re.split("[aeiou]", string)
    
    print(matches)
    

    Output:

    ['', 'bcd', 'fgh', 'j']
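
Two variations on re.split() are often useful: limiting the number of splits with the maxsplit argument, and keeping the delimiters by putting the pattern in a capturing group. The sketch below illustrates both; the variable names are assumptions made for this example.

```python
import re

string = "abcdefghij"

# maxsplit=1 stops after the first split; the rest of the string is kept whole.
first_split_only = re.split("[aeiou]", string, maxsplit=1)

# A capturing group around the pattern makes re.split() include the
# matched delimiters in the returned list.
with_delimiters = re.split("([aeiou])", string)

print(first_split_only)  # ['', 'bcdefghij']
print(with_delimiters)   # ['', 'a', 'bcd', 'e', 'fgh', 'i', 'j']
```

The leading empty string appears in both results because the input starts with a delimiter.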
    
    
  • re.sub():
    The re.sub() method finds every match of the regex pattern in the string and replaces each matched substring with the specified string. It returns the new string and leaves the original unchanged. Let’s look at an example:

    import re
    
    string = "abcdefghij"
    
    # replaces every vowel with "1"; the character class lists its
    # characters directly, without quotes, commas or spaces
    string = re.sub("[aeiou]", "1", string)
    
    print(string)
    

    Output:

    1bcd1fgh1j
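
re.sub() also accepts a count argument to limit the number of replacements, and the replacement can be a function that receives each match object. The related re.subn() additionally reports how many replacements were made. The sketch below shows these options; the variable names are assumptions made for this example.

```python
import re

string = "abcdefghij"

# count=2 replaces only the first two matches.
first_two = re.sub("[aeiou]", "1", string, count=2)

# A function as the replacement is called once per match and
# returns the text to substitute; here each vowel is upper-cased.
upper_vowels = re.sub("[aeiou]", lambda m: m.group().upper(), string)

# re.subn() returns a tuple: (new string, number of replacements).
replaced, how_many = re.subn("[aeiou]", "1", string)

print(first_two)           # 1bcd1fghij
print(upper_vowels)        # AbcdEfghIj
print(replaced, how_many)  # 1bcd1fgh1j 3
```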

Finally, I hope you found this article helpful in understanding the regular expression operations in Python.
