How To Remove Non-alphanumeric Characters At The Beginning Or End Of A String
Solution 1:
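def strip_nonalnum(word):
    if not word:
        return word  # nothing to strip
    # Find the index of the first alphanumeric character.
    for start, c in enumerate(word):
        if c.isalnum():
            break
    else:
        return ""  # no alphanumeric characters at all
    # Count the non-alphanumeric characters at the end.
    for end, c in enumerate(word[::-1]):
        if c.isalnum():
            break
    return word[start:len(word) - end]

thelist = ['cats5--', '#!cats-%', '--the#!cats-%', '--5cats-%', '--5!cats-%']
print([strip_nonalnum(s) for s in thelist])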
Or
import re
def strip_nonalnum_re(word):
    return re.sub(r"^\W+|\W+$", "", word)
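The two versions are not quite equivalent: str.isalnum() treats an underscore as non-alphanumeric, while \W does not match it, so they disagree on strings with leading or trailing underscores. A minimal sketch of the difference, using compact re-implementations (strip_loop and strip_regex are names made up for this illustration):

```python
import re

def strip_loop(word):
    # Indices of all alphanumeric characters; slice from first to last.
    idx = [i for i, c in enumerate(word) if c.isalnum()]
    return word[idx[0]:idx[-1] + 1] if idx else ""

def strip_regex(word):
    # \W excludes letters, digits AND the underscore.
    return re.sub(r"^\W+|\W+$", "", word)

sample = "_--cats5--_"
print(strip_loop(sample))   # cats5
print(strip_regex(sample))  # _--cats5--_
```

If underscores should also be stripped by the regex version, the pattern would need to mention them explicitly, for example with [\W_].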
Solution 2:
To remove one or more characters other than letters, digits and _ from both ends, you may use
re.sub(r'^\W+|\W+$', '', '??cats--') # => cats
Or, if _ is to be removed too, wrap \W in a character class and add _ to it:
re.sub(r'^[\W_]+|[\W_]+$', '', '_??cats--_')
See the Python demo:
import re
print(re.sub(r'^\W+|\W+$', '', '??cats--'))  # => cats
print(re.sub(r'^[\W_]+|[\W_]+$', '', '_??cats--_'))  # => cats
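When the same pattern is applied to many strings, it can be compiled once and reused. A small sketch, assuming a list of words similar to the ones used elsewhere on this page; the names EDGE_JUNK and strip_edges are made up for the example:

```python
import re

# Runs of non-alphanumeric characters (underscore included),
# anchored at the start or at the end of the string.
EDGE_JUNK = re.compile(r"^[\W_]+|[\W_]+$")

def strip_edges(word):
    return EDGE_JUNK.sub("", word)

words = ["cats5--", "#!cats-%", "--the#!cats-%", "_??cats--_"]
print([strip_edges(w) for w in words])
# ['cats5', 'cats', 'the#!cats', 'cats']
```

Characters in the middle of a string, such as the #! in the third item, are left alone because both alternatives are anchored.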
Solution 3:
You can use a regular expression. The function re.sub() takes three parameters:
- The regular expression pattern
- The replacement
- The string
Code:
import re
s = 'cats--'
output = re.sub("[^\\w]", "", s)
print(output)  # cats
Explanation:
- The part "\\w" matches any word character (a letter, a digit or an underscore). [^x] matches any character that is not x, so "[^\\w]" matches every non-word character.
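Because this pattern is not anchored, it removes non-word characters anywhere in the string, not only at the ends. The sketch below contrasts it with an anchored pattern on an input that has punctuation in the middle (the sample string is chosen only for illustration):

```python
import re

s = "--the#!cats-%"

# Unanchored: every non-word character is removed, interior ones too.
unanchored = re.sub(r"[^\w]", "", s)
print(unanchored)  # thecats

# Anchored with ^ and $: only leading and trailing runs are removed.
anchored = re.sub(r"^\W+|\W+$", "", s)
print(anchored)    # the#!cats
```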
Solution 4:
I believe that this is the shortest non-regex solution:
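text = "`23`12foo--=+"
# Drop leading characters until the first alphanumeric one.
while len(text) > 0 and not text[0].isalnum():
    text = text[1:]
# Drop trailing characters until the last alphanumeric one.
while len(text) > 0 and not text[-1].isalnum():
    text = text[:-1]
print(text)  # 23`12foo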
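Each slice in the loops above copies the remaining string, which can add up for long inputs with many characters to strip. A possible variant, sketched under the assumption that the same isalnum() test is wanted, moves two indices inward and slices only once (strip_edges_idx is a hypothetical name):

```python
def strip_edges_idx(text):
    start, end = 0, len(text)
    # Advance past leading non-alphanumeric characters.
    while start < end and not text[start].isalnum():
        start += 1
    # Retreat past trailing non-alphanumeric characters.
    while end > start and not text[end - 1].isalnum():
        end -= 1
    return text[start:end]

print(strip_edges_idx("`23`12foo--=+"))  # 23`12foo
```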
Solution 5:
With str.strip() you have to know in advance which characters are to be stripped.
>>> 'cats--'.strip('-')
'cats'
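If the unwanted characters are limited to ASCII punctuation, one option is to pass string.punctuation from the standard library to str.strip(), which treats its argument as a set of characters rather than as a substring. A short sketch; note that this set includes the underscore and does not cover whitespace or non-ASCII symbols:

```python
import string

words = ["cats5--", "#!cats-%", "--the#!cats-%"]

# strip() removes any leading/trailing character found in the argument.
stripped = [w.strip(string.punctuation) for w in words]
print(stripped)
# ['cats5', 'cats', 'the#!cats']
```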
You could use re to get rid of the non-alphanumeric characters, but IMO that would be using a cannon to shoot a mouse. With str.isalpha() you can test whether a character is alphabetic, so you only need to keep those that are:
>>> ''.join(char for char in '#!cats-%' if char.isalpha())
'cats'
>>> thelist = ['cats5--', '#!cats-%', '--the#!cats-%', '--5cats-%', '--5!cats-%']
>>> [''.join(c for c in e if c.isalpha()) for e in thelist]
['cats', 'cats', 'thecats', 'cats', 'cats']
You want to get rid of non-alphanumeric characters, so str.isalnum() is a better fit:
>>> [''.join(c for c in e if c.isalnum()) for e in thelist]
['cats5', 'cats', 'thecats', '5cats', '5cats']
For these inputs, this is the same result you would get with re (as in Christian's answer):
>>> import re
>>> [re.sub("[^\\w]", "", e) for e in thelist]
['cats5', 'cats', 'thecats', '5cats', '5cats']
However, if you want to strip non-alphanumeric characters only from the ends of the strings, you should use another pattern like this one (check the re documentation):
>>> [''.join(re.search(r'^\W*(.+)(?!\W*$)(.)', e).groups()) for e in thelist]
['cats5', 'cats', 'the#!cats', '5cats', '5!cats']
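One caveat with the search-based pattern: it needs at least two characters to match, so re.search() returns None for a one-character string and calling .groups() on it would raise AttributeError. A sketch of a more forgiving alternative that uses an anchored substitution instead (the extra samples 'a' and '--' are made up for the illustration):

```python
import re

thelist = ['cats5--', '--the#!cats-%', 'a', '--']

# The search-based pattern cannot match a single character.
match = re.search(r'^\W*(.+)(?!\W*$)(.)', 'a')
print(match)  # None

# Anchored substitution works for any length, including empty results.
cleaned = [re.sub(r'^\W+|\W+$', '', e) for e in thelist]
print(cleaned)
# ['cats5', 'the#!cats', 'a', '']
```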