NumPy String Operations

NumPy is the core library for scientific computing in Python.

The numpy.char module provides a set of vectorized string operations for arrays of type numpy.str_ or numpy.bytes_. To use any of these operations, first import the NumPy library in one of the three ways listed below:

  1. import numpy
  2. import numpy as np
  3. from numpy import *

The second form is the most common convention, and the examples below use it. (Any of the three works, although from numpy import * is generally discouraged because it shadows built-ins such as sum and max.)

numpy.char provides the following string operations:

add(x,y)

This function performs element-wise string concatenation. It takes two arrays as input and returns the concatenation of their elements.

import numpy as np
x=["World "]
y=["Cup"]
print(np.char.add(x,y))

Output:

['World Cup']
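The example above uses one-element lists. As a small sketch of the element-wise behaviour (the array contents are made up for illustration), the following pairs up two longer arrays and also broadcasts a single string against an array:

```python
import numpy as np

first = np.array(["World ", "Test "])
second = np.array(["Cup", "Match"])

# Concatenation is applied pairwise: first[i] + second[i]
pairs = np.char.add(first, second)
print(pairs)   # ['World Cup' 'Test Match']

# A single string is broadcast against every element of the array
tagged = np.char.add(first, "2019")
print(tagged)  # ['World 2019' 'Test 2019']
```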

capitalize(x)

It returns a copy of the given array with the first character of each element capitalized and the remaining characters lowercased.

import numpy as np
a=["world","cup","2019"]
print(np.char.capitalize(a))

Output:

['World' 'Cup' '2019']
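Note that capitalize also lowercases everything after the first character, which the all-lowercase input above does not reveal. A quick sketch with mixed-case input (chosen only for illustration):

```python
import numpy as np

mixed = np.array(["wORLD", "cUP", "2019"])

# First character is upper-cased, the rest are lower-cased
result = np.char.capitalize(mixed)
print(result)  # ['World' 'Cup' '2019']
```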

center(x, width, fillchar)

This function takes an array, a width, and a fill character, and returns an array in which each element is centered in a string of length width, padded on the left and right with fillchar.

import numpy as np
print(np.char.center(["world","cup","2019"], 20,fillchar = '*'))

Output:

['*******world********' '********cup*********' '********2019********']

decode(x[, encoding, errors]), encode(x[, encoding, errors])

decode and encode are two separate functions in numpy.char. encode calls str.encode on each element, and decode calls bytes.decode on each element. The set of available codecs comes from the Python standard library and may be extended at runtime.

import numpy as np
x = np.array(['world', 'cup', '2019'])
e = np.char.encode(x, encoding='cp037')
print("Encoded as:", e)
d = np.char.decode(e, encoding='cp037')
print("Decoded back to:", d)

Output:

Encoded as: [b'\xa6\x96\x99\x93\x84' b'\x83\xa4\x97' b'\xf2\xf0\xf1\xf9']

Decoded back to: ['world' 'cup' '2019']
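cp037 is an EBCDIC codec and fairly uncommon in practice. As a sketch with the more usual UTF-8 codec (the sample words are just illustrative), a round trip through encode and decode looks like this:

```python
import numpy as np

names = np.array(["café", "naïve"])

# str -> bytes, element by element
encoded = np.char.encode(names, encoding="utf-8")
print(encoded)  # [b'caf\xc3\xa9' b'na\xc3\xafve']

# bytes -> str, element by element
decoded = np.char.decode(encoded, encoding="utf-8")
print(decoded)  # ['café' 'naïve']
```

The encoded array has a bytes dtype, and each non-ASCII character takes two bytes in UTF-8.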

expandtabs(x, tabsize)

It returns a copy of each string in which every '\t' is replaced by one or more spaces, enough to reach the next tab stop. Tab stops occur every tabsize columns.

import numpy as np
text = "Wow!!\tEngland won this tournament."
print(np.char.expandtabs(text, tabsize=8))

Output:

Wow!!   England won this tournament.
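The number of spaces inserted depends on where the tab falls in the line, not only on tabsize. A minimal sketch with tabsize=4 (the inputs are made up for illustration):

```python
import numpy as np

rows = np.array(["a\tb", "abc\td"])

# Each tab advances to the next multiple of 4 columns
expanded = np.char.expandtabs(rows, tabsize=4)
print(expanded)  # ['a   b' 'abc d']
```

In the first element the tab becomes three spaces, in the second only one, because both advance to column 4.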

join(sep,x)

For each element in x, it returns a string in which the characters of that element are joined by the corresponding separator in sep.

import numpy as np
print(np.char.join([':','-'],['CWC','2019']))

Output:

['C:W:C' '2-0-1-9']
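The separator does not have to be given per element. As a sketch, a single separator string is broadcast across the whole array:

```python
import numpy as np

# One separator, applied to every element
result = np.char.join("-", ["CWC", "2019"])
print(result)  # ['C-W-C' '2-0-1-9']
```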

ljust(x, width, fillchar)

It takes an array as input along with a width and a fillchar, and returns an array with the elements of x left-justified in a string of length width.

import numpy as np
print(np.char.ljust('CWC2019',20, fillchar = '*'))

Output:

CWC2019*************

In the above example, a string of length 7 is passed along with a width of 20 and a fillchar of *. The function returns a string of length 20 in which the input is left-justified and the remaining 13 positions are filled with *.

lower(x)

This function returns a copy of the given array with every character of each element converted to lower case.

import numpy as np 
print(np.char.lower(['Cwc', '2019', 'England']))

Output:

['cwc' '2019' 'england']

lstrip(x, chars)

For each element in x, it returns a copy of the string with leading characters removed. If chars is omitted, whitespace is removed; otherwise chars is treated as a set of characters to strip from the left side of the string.

import numpy as np
str1="      CWC 2019 England."
str2="****CWC 2019 England.****"
print(np.char.lstrip(str1))
print(np.char.lstrip(str2,"*"))

Output:

CWC 2019 England.

CWC 2019 England.****
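The chars argument is a set of characters, not a prefix. A short sketch (the input strings are invented for illustration) shows that any leading mix of the listed characters is removed, while the same characters at the end are left alone:

```python
import numpy as np

codes = np.array(["xxyCWC", "yxEngland", "CWCxy"])

# Strip any leading run made up of 'x' and 'y', in any order
stripped = np.char.lstrip(codes, "xy")
print(stripped)  # ['CWC' 'England' 'CWCxy']
```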

multiply(x,i)

This function performs repeated concatenation: each element of x is repeated i times.

import numpy as np
print(np.char.multiply('CWC2019 ',5))

Output:

CWC2019 CWC2019 CWC2019 CWC2019 CWC2019

mod(x,values)

This function returns (x % values), that is, printf-style string formatting applied element-wise.

import numpy as np
x=np.array([0, 19, 2019])
print(x)
print(np.char.mod('%d', x))

Output:

[   0   19 2019]

['0' '19' '2019']
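Any printf-style format specifier can be used, not just %d. As a sketch, assuming some made-up floating-point values, a fixed number of decimals can be requested like this:

```python
import numpy as np

values = np.array([1.5, 2.25, 10.0])  # illustrative data

# Format every number with exactly two decimal places
formatted = np.char.mod("%.2f", values)
print(formatted)  # ['1.50' '2.25' '10.00']
```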

partition(x,sep)

This function partitions each element of an array around the first occurrence of the specified separator, returning the part before the separator, the separator itself, and the part after it.

import numpy as np
x = "England won CWC2019."
print(np.char.partition(x, 'won'))

Output:

['England ' 'won' ' CWC2019.']
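When the input is an array rather than a single string, the result gains an extra dimension of length 3. A small sketch with invented sample strings:

```python
import numpy as np

results = np.array(["England won", "India lost"])

# Each element becomes a row of (before, separator, after)
parts = np.char.partition(results, " ")
print(parts.shape)  # (2, 3)
print(parts)
```

Indexing parts[:, 0] then gives the text before the separator for every element.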

replace(x,old,new,count)

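This function returns a copy of a given string with occurrences of the substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced; otherwise all of them are.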

import numpy as np
print(np.char.replace('Australia won CWC2019', 'Australia', 'England'))

Output:

England won CWC2019
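The count argument is not used in the example above. As a sketch (with a made-up input string), limiting the number of replacements looks like this:

```python
import numpy as np

titles = np.array(["CWC CWC CWC"])

# Replace only the first two occurrences in each element
limited = np.char.replace(titles, "CWC", "ICC", count=2)
print(limited)  # ['ICC ICC CWC']
```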

rjust(x, width, fillchar)

It takes an array as input along with a width and a fillchar, and returns an array with the elements of x right-justified in a string of length width.

import numpy as np 
print(np.char.rjust('CWC2019',20, fillchar = '*'))

Output:

*************CWC2019

In the above example, a string of length 7 is passed along with a width of 20 and a fillchar of *. The function returns a string of length 20 in which the input is right-justified and the first 13 positions are filled with *.

rpartition(x,sep)

For each element in x, split the element at the last occurrence of sep, returning the part before the separator, the separator itself, and the part after it. If the separator (sep) is not found, it returns two empty strings followed by the string itself.

import numpy as np
x = "England won CWC2019."
print(np.char.rpartition(x, 'won'))

Output:

['England ' 'won' ' CWC2019.']
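With a single occurrence of the separator, partition and rpartition give the same result. The sketch below (using a made-up string with two separators) shows where they differ, including the case where the separator is missing:

```python
import numpy as np

text = "a-b-c"

first = np.char.partition(text, "-")    # splits at the first '-'
last = np.char.rpartition(text, "-")    # splits at the last '-'
print(first)  # ['a' '-' 'b-c']
print(last)   # ['a-b' '-' 'c']

# Separator not present: the whole string ends up at opposite ends
missing_first = np.char.partition(text, "#")
missing_last = np.char.rpartition(text, "#")
print(missing_first)  # ['a-b-c' '' '']
print(missing_last)   # ['' '' 'a-b-c']
```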

rsplit(x, sep, maxsplit)

For each element in x, return a list of the words in the string, using sep as the separator string. If maxsplit is given, at most maxsplit splits are done, counting from the right.

import numpy as np
print(np.char.rsplit('CWC#2019#England', '#', maxsplit=11))

Output:

['CWC', '2019', 'England']
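With a large maxsplit, rsplit and split return the same list. The difference only shows once maxsplit is smaller than the number of separators, as in this sketch:

```python
import numpy as np

text = "CWC#2019#England"

# .item() extracts the Python list from the 0-d object array
from_left = np.char.split(text, "#", maxsplit=1).item()
from_right = np.char.rsplit(text, "#", maxsplit=1).item()

print(from_left)   # ['CWC', '2019#England']
print(from_right)  # ['CWC#2019', 'England']
```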

rstrip(x, chars)

For each element in x, it returns a copy of the string with trailing characters removed. If chars is omitted, whitespace is removed; otherwise chars is treated as a set of characters to strip from the right side of the string.

import numpy as np 
str1="CWC 2019 England.     " 
str2="****CWC 2019 England.****" 
print(np.char.rstrip(str1)) 
print(np.char.rstrip(str2,"*"))

Output:

CWC 2019 England.

****CWC 2019 England.

split(x, sep, maxsplit)

For each element in x, return a list of the words in the string, using sep as the delimiter string.

import numpy as np
print(np.char.split('CWC:2019:England', ':'))

Output:

['CWC', '2019', 'England']

splitlines(x, keepends)

This function returns a list of the lines in each element, breaking at line boundaries such as '\n' or '\r'. Line breaks are not included in the result unless keepends is given and true.

import numpy as np
print(np.char.splitlines('England \nWon \nCWC2019.'))

Output:

['England ', 'Won ', 'CWC2019.']
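To keep the line-break characters in the result, pass keepends=True. A minimal sketch using the same input as above:

```python
import numpy as np

text = "England \nWon \nCWC2019."

# .item() extracts the Python list from the 0-d object array
lines = np.char.splitlines(text, keepends=True).item()
print(lines)  # ['England \n', 'Won \n', 'CWC2019.']
```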

strip(x, chars)

This function returns a copy of the given array with the leading and trailing characters in chars removed from each element. If chars is omitted, whitespace is removed.

import numpy as np
print(np.char.strip(['icc','world','cup'],'c'))

Output:

['i' 'world' 'up']

swapcase(x)

Returns a copy of each element with the case swapped, i.e., uppercase characters converted to lowercase and lowercase characters converted to uppercase.

import numpy as np
print(np.char.swapcase(['icc','world','cup','2019']))

Output:

['ICC' 'WORLD' 'CUP' '2019']

title(x)

This function returns a title-cased version of each input string, with the first letter of each word capitalized and the remaining letters lowercased.

import numpy as np
print(np.char.title('england hosted cwc2019'))

Output:

England Hosted Cwc2019

translate(x, table, deletechars)

This function returns a copy of the string in which all characters occurring in the optional argument deletechars are removed and the remaining characters are mapped through the given translation table. In the example below, the table is built with Python's str.maketrans and maps the letter o to the digit 0.

import numpy as np
table = str.maketrans('o', '0')
print(np.char.translate('ICC World Cup 2019', table, deletechars=None))

Output:

ICC W0rld Cup 2019

upper(x)

This function returns a copy of the given array with every character of each element converted to upper case.

import numpy as np
print(np.char.upper(['cwc', '2019', 'england']))

Output:

['CWC' '2019' 'ENGLAND']

zfill(x, width)

This function returns each string left-filled with zeros up to the given width. The number of zeros depends directly on the width given.

(number of zeroes = width given – width of the string)

import numpy as np
print(np.char.zfill('CWC2019', 20))

Output:

0000000000000CWC2019
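zfill is mostly used on numeric strings, where a leading sign is kept in front of the padding. A brief sketch with illustrative values:

```python
import numpy as np

numbers = np.array(["42", "-7", "+3"])

# Zeros are inserted after the sign, if there is one
padded = np.char.zfill(numbers, 5)
print(padded)  # ['00042' '-0007' '+0003']
```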
