Remove special characters from a string except space in Python
This tutorial shows how to remove all the special characters from a string except for spaces. It is an interesting task and a useful one in practice; I was asked this question in one of my placement drives.
Example: suppose we have the string “Hello! We% are %#$%^ trying ^&%$to remove^& special! characters%^ from*( a(^ string": except{} for_a space.”
After removing the special characters, the output is: “Hello We are trying to remove special characters from a string except for a space” (the final period is a special character too, so it is removed as well).
This can be done in Python using regex.
Regex is shorthand for regular expression, a compact way of describing text patterns. With it we can do many tasks, such as checking whether a string contains a specified pattern.
Python has a built-in module called re, which provides many functions such as findall(), search(), and split().
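As a quick illustration of the three functions just named, here is a minimal sketch. The sample text and variable names are made up for this example and are not part of the task itself.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "Order 66, aisle 7"

# findall() returns every non-overlapping match as a list of strings.
numbers = re.findall(r"[0-9]+", text)

# search() returns a match object for the first match, or None if there is none.
first = re.search(r"[0-9]+", text)
first_number = first.group() if first else None

# split() cuts the string wherever the pattern matches.
parts = re.split(r",\s*", text)

print(numbers)       # ['66', '7']
print(first_number)  # 66
print(parts)         # ['Order 66', 'aisle 7']
```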
Removing special characters from a string except space
We will implement this in two ways:
- Using findall()
- Using sub()
Method 1: Using findall()
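We are going to use the findall() function, which scans the string and returns a list of every non-overlapping match of a pattern. Here the pattern matches runs of letters and digits, so the special characters are simply left out of the result, and joining the matches with a space rebuilds the sentence.
Here is the implementation: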
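import re

string = '''Hello! We% are %#$%^ trying ^&%$to remove^& special! characters%^ from*( a(^ string": except{} for_a space.'''

# Process the text one line at a time.
for a in string.split("\n"):
    # Collect every run of letters and digits, then join the runs with single spaces.
    string1 = " ".join(re.findall(r"[a-zA-Z0-9]+", a))
    print(string1)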
Output:
Hello We are trying to remove special characters from a string except for a space
The first statement imports the regular expression module re.
The string is defined with some special characters that need to be removed. The for loop iterates over each line of the string. For each line, findall() collects every run of the characters a-z, A-Z, and 0-9, skipping everything else, and join() puts the runs back together with single spaces.
Finally, the cleaned string is printed.
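To make the intermediate step visible, the sketch below prints the list that findall() returns before it is joined. The helper name keep_alnum_and_spaces is hypothetical, chosen only for this illustration.

```python
import re

def keep_alnum_and_spaces(text):
    """Hypothetical helper: keep runs of letters and digits, joined by single spaces."""
    words = re.findall(r"[a-zA-Z0-9]+", text)
    return " ".join(words)

sample = "for_a space."
print(re.findall(r"[a-zA-Z0-9]+", sample))  # ['for', 'a', 'space']
print(keep_alnum_and_spaces(sample))         # for a space
```

Because the result is rebuilt from the matches, any run of spaces in the input collapses to one space, and leading or trailing spaces disappear.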
Method 2: Using sub()
The same problem can be solved with a different approach that uses sub().
sub() replaces every match of a pattern in the string with a replacement string and returns the result.
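import re

string = '''Hello! We% are %#$%^ trying ^&%$to remove^& special! characters%^ from*( a(^ string": except{} for_a space.'''

# Process the text one line at a time.
for a in string.split("\n"):
    # Replace each run of characters that are not letters or digits with one space.
    print(re.sub(r"[^a-zA-Z0-9]+", ' ', a))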
Output:
Hello We are trying to remove special characters from a string except for a space
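Here the pattern [^a-zA-Z0-9]+ matches every run of characters that are not letters or digits, and sub() replaces each run with a single space. Note that the final period also becomes a space, so this output ends with a trailing space, unlike the output of Method 1.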
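The sketch below shows the result of sub() with repr() so that any space at the edges is visible, and removes such spaces with strip(). The helper name replace_specials is hypothetical, and calling strip() is an addition for illustration, not part of the method above.

```python
import re

def replace_specials(text):
    """Hypothetical helper: replace each run of non-alphanumeric characters with one space."""
    return re.sub(r"[^a-zA-Z0-9]+", " ", text)

sample = "for_a space."
raw = replace_specials(sample)
print(repr(raw))          # 'for a space '
print(repr(raw.strip()))  # 'for a space'
```

If the cleaned text is going to be compared or stored, stripping the edges this way makes the result of sub() match the result of the findall() approach for this input.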
In this tutorial, we discussed two methods to remove special characters from a string while keeping spaces. I hope you found this helpful. You can also consult the official Python documentation for the full list of regex metacharacters.