Python Regex, Find And Replace Second Tab Character
I am trying to find and replace the second tab character in a string using regex.
booby = 'Joe Bloggs\tNULL\tNULL\tNULL\tNULL\tNULL\tNULL\tNULL\tNULL\r\n'
This works fine:
re.sub(
Solution 1:
You may be overthinking it a little.
>>> import re
>>> text = 'Joe Bloggs\tNULL\tNULL\tNULL\tNULL\tNULL\tNULL\tNULL\tNULL\r\n'
>>> re.sub(r'(\t[^\t]*)\t', r'\1###', text, count=1)
'Joe Bloggs\tNULL###NULL\tNULL\tNULL\tNULL\tNULL\tNULL\tNULL\r\n'
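Simply match the first tab, followed by any number of non-tab characters, followed by another tab. Capture everything except that final tab in a group, then write the group back with \1 followed by whatever you want in place of the second tab. count=1 stops after the first match, so later tabs are left alone.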
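The same idea extends to the n-th tab by repeating the "non-tabs then a tab" part. This is only a sketch: the helper name replace_nth_tab and its arguments are my own, not something from the question.

```python
import re

def replace_nth_tab(text, n, replacement):
    """Replace the n-th tab (1-based); return text unchanged if it has fewer than n tabs."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Group 1 captures everything before the n-th tab:
    # (n - 1) repetitions of "non-tabs followed by a tab", then one more run of non-tabs.
    pattern = r'^((?:[^\t]*\t){%d}[^\t]*)\t' % (n - 1)
    # A function replacement keeps backslashes in `replacement` literal.
    return re.sub(pattern, lambda m: m.group(1) + replacement, text, count=1)

text = 'Joe Bloggs\tNULL\tNULL\tNULL\r\n'
print(repr(replace_nth_tab(text, 2, '###')))
```

With n=2 this behaves like the pattern above, and if the string has fewer than n tabs nothing matches and the text comes back untouched.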
Solution 2:
>>> re.sub(r'^((?:(?!\t).)*\t(?:(?!\t).)*)\t',r'\1###', booby)
'Joe Bloggs\tNULL###NULL\tNULL\tNULL\tNULL\tNULL\tNULL\tNULL\r\n'
You are almost there: add \1 before ### in the replacement string, so the captured text is written back.
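To see why the \1 matters, here is a small comparison using the same pattern on a shortened version of the string (the variable names are just for illustration):

```python
import re

booby = 'Joe Bloggs\tNULL\tNULL\tNULL\r\n'
pattern = r'^((?:(?!\t).)*\t(?:(?!\t).)*)\t'

# Without \1 the whole match, including the captured prefix, is replaced.
without_backref = re.sub(pattern, '###', booby)

# With \1 the captured prefix is written back and only the second tab changes.
with_backref = re.sub(pattern, r'\1###', booby)

print(repr(without_backref))
print(repr(with_backref))
```

The first call loses 'Joe Bloggs\tNULL' because that text is part of the match; the second keeps it.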
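Because of the comments, here is another way to solve it without regex. It replaces the first two tabs with ### and then turns the first ### back into a tab, so it relies on the ### placeholder not colliding with text already in the string: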
>>> booby.replace("\t", "###",2).replace("###", "\t",1)
'Joe Bloggs\tNULL###NULL\tNULL\tNULL\tNULL\tNULL\tNULL\tNULL\r\n'
Solution 3:
With regex
This is the shortest regex I could find:
import re
booby = 'Joe Bloggs\tNULL\tNULL\tNULL\tNULL\tNULL\tNULL\tNULL\tNULL\r\n'
print(re.sub(r'(\t.*?)\t', r'\1###', booby, count=1))
It uses the non-greedy .*? to make sure it doesn't gobble up too many tabs.
It outputs:
Joe Bloggs NULL###NULL NULL NULL NULL NULL NULL NULL
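For comparison, here is what happens when the quantifier is greedy. This is a small illustration on a shortened string, not part of the original answer:

```python
import re

booby = 'Joe Bloggs\tNULL\tNULL\tNULL\r\n'

# Non-greedy: .*? stops at the earliest tab it can, so the second tab is replaced.
lazy = re.sub(r'(\t.*?)\t', r'\1###', booby, count=1)

# Greedy: .* runs as far as it can and backtracks, so the last tab is replaced.
greedy = re.sub(r'(\t.*)\t', r'\1###', booby, count=1)

print(repr(lazy))
print(repr(greedy))
```

The greedy version still matches once, but the tab it replaces is the last one on the line rather than the second.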
With split and join
The regex might get ugly if you need it for other indices. You could use split and join for the general case:
n = 2
sep = '\t'
cells = booby.split(sep)
print(sep.join(cells[:n]) + "###" + sep.join(cells[n:]))
It outputs:
Joe Bloggs NULL###NULL NULL NULL NULL NULL NULL NULL
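One thing to watch with the split and join version: if the string has fewer than n separators, the replacement text is simply appended at the end. A guarded variant might look like this (the function name replace_nth_sep and its parameters are hypothetical):

```python
def replace_nth_sep(text, n, replacement, sep='\t'):
    """Replace the n-th occurrence of sep (1-based); leave text alone if there are fewer."""
    if n < 1:
        raise ValueError("n must be at least 1")
    cells = text.split(sep)
    # k separators produce k + 1 cells, so the n-th separator exists only if len(cells) > n.
    if len(cells) <= n:
        return text
    return sep.join(cells[:n]) + replacement + sep.join(cells[n:])

booby = 'Joe Bloggs\tNULL\tNULL\tNULL\r\n'
print(repr(replace_nth_sep(booby, 2, '###')))
print(repr(replace_nth_sep(booby, 9, '###')))
```

The second call asks for a ninth tab that does not exist, so the string is returned unchanged instead of gaining a trailing ###.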