How to remove all the special characters from a text file in Python
In this blog, we will see how to remove all the special and unwanted characters (including whitespace) from a text file in Python. There are multiple ways to do it, such as regex or built-in string methods. Since regex is more machinery than this task needs, we will use the built-in string method isalnum(), which checks whether all characters of a given string are alphanumeric.
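Before touching any file, it helps to see how isalnum() behaves on a few strings. The snippet below is a minimal sketch; the helper name keep_alnum is our own and is only there for illustration.

```python
# str.isalnum() returns True only if the string is non-empty and
# every character in it is a letter or a digit.
print("abc123".isalnum())   # True
print("abc 123".isalnum())  # False, because of the space
print("@".isalnum())        # False
print("".isalnum())         # False, an empty string has no characters


def keep_alnum(text):
    """Hypothetical helper: keep only the alphanumeric characters of text."""
    return "".join(ch for ch in text if ch.isalnum())


print(keep_alnum("coro%%na virus 19"))  # coronavirus19
```

Note that isalnum() also returns True for letters and digits outside ASCII (for example "é"), so such characters would be kept.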
We will also require some basic file handling using Python to fulfil our goal.
Opening and reading a text file:
We can open a .txt file by using the open() function and read the contents line by line.
Myfile = open("input.txt", "r")
# my text file is named input.txt
# 'r' along with the file name indicates that we want to read it
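A common alternative is to open the file in a with block, so that Python closes it for us even if something goes wrong while reading. The sketch below writes a small sample file first so that it runs on its own; the temporary directory and the file contents are assumptions made for the example.

```python
import os
import tempfile

# Create a throwaway sample file so the sketch is self-contained.
# (The contents here are illustrative only.)
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "input.txt")
    with open(path, "w") as f:
        f.write("first line\nsecond line\n")

    lines = []
    # The file is closed automatically when the with block ends.
    with open(path, "r") as my_file:
        for line in my_file:                  # iterate line by line
            lines.append(line.rstrip("\n"))   # drop the trailing newline

print(lines)
```

Iterating over the file object reads one line at a time, so this also works for files that are too large to load at once.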
Checking all characters of the text file:
Next, we check every character of each line for special characters or whitespace. We use the isalnum() method to keep only the alphanumeric characters, then print the cleaned content of the text file. The complete code looks like this:
Myfile = open("input.txt", "r")  # my text file is named input.txt; 'r' means we want to read it
for x in Myfile:  # read the file line by line
    alphanumeric = ""  # collects the characters we keep from this line
    for character in x:
        if character.isalnum():  # True only for letters and digits
            alphanumeric += character
    print(alphanumeric)
Myfile.close()  # release the file once we are done
Contents of the input.txt are shown below:
This is demo For checking ]][/';;'.%^ these chars @%^* to be removed $ ^ % %..;
i am not @^$^(*&happy%$%@$% about %%#$%@ coro%%na virus 19
i #@love**&^ codespeedy%^().
The output will look like this:
ThisisdemoForcheckingthesecharstoberemoved
iamnothappyaboutcoronavirus19
ilovecodespeedy
We can clearly see that the whitespaces and special characters have been eliminated successfully.
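For comparison, here is a rough sketch of the regex route mentioned at the start, next to the isalnum() approach. The function names are our own, and the pattern [^A-Za-z0-9] is an assumption that only ASCII letters and digits should be kept.

```python
import re


def remove_with_isalnum(text):
    """Keep only characters for which str.isalnum() is True."""
    return "".join(ch for ch in text if ch.isalnum())


def remove_with_regex(text):
    """Delete every character that is not an ASCII letter or digit."""
    return re.sub(r"[^A-Za-z0-9]", "", text)


# One line taken from the sample input above.
sample = "i am not @^$^(*&happy%$%@$% about %%#$%@ coro%%na virus 19"

print(remove_with_isalnum(sample))  # iamnothappyaboutcoronavirus19
print(remove_with_regex(sample))    # iamnothappyaboutcoronavirus19
```

The two agree on plain English text like this. They can differ on text with accented or non-Latin letters, which isalnum() keeps and this particular pattern removes.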