Python Regex Replace

Hey, I'm trying to figure out a regular expression to do the following. Here is my string:

Place,08/09/2010,'15,531','2,909',650

I need to split this string on the commas. Though

Solution 1:

new_string = re.sub(r'"(\d+),(\d+)"', r'\1.\2', original_string)

This replaces the comma inside each quoted number with a period and drops the surrounding quotes, so you can then just use the string's split method.
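As a quick check of the substitute-then-split idea, here is a minimal sketch. It assumes the double-quoted form of the string used in the other answers; `fields` is just an illustrative name.

```python
import re

# Double-quoted variant of the sample string (assumed, as in the other answers).
original_string = 'Place,08/09/2010,"15,531","2,909",650'

# Replace the comma inside each quoted number with a period; the quotes are
# part of the match but not of the replacement, so they disappear.
new_string = re.sub(r'"(\d+),(\d+)"', r'\1.\2', original_string)

# With no commas left inside fields, a plain split is enough.
fields = new_string.split(',')
print(fields)
# ['Place', '08/09/2010', '15.531', '2.909', '650']
```

Note that the pattern only matches quoted numbers with exactly one comma, so a value such as "1,234,567" would be left untouched and still split in the wrong places.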

Solution 2:

>>> from io import StringIO
>>> import csv
>>> r = csv.reader(StringIO('Place,08/09/2010,"15,531","2,909",650'))
>>> next(r)
['Place', '08/09/2010', '15,531', '2,909', '650']
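The string in the question uses single quotes rather than double quotes. If that is really how the data looks, a sketch along these lines should handle it by telling the csv module which quote character to expect (Python 3 assumed; `row` is an illustrative name).

```python
import csv
from io import StringIO

# The single-quoted form of the string, as written in the question.
line = "Place,08/09/2010,'15,531','2,909',650"

# quotechar tells the reader that ' (not the default ") wraps quoted fields.
reader = csv.reader(StringIO(line), quotechar="'")
row = next(reader)
print(row)
# ['Place', '08/09/2010', '15,531', '2,909', '650']
```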

Solution 3:

Another way of doing it using regex directly:

>>> import re
>>> data = "Place,08/09/2010,\"15,531\",\"2,909\",650"
>>> res = re.findall(r"(\w+),(\d{2}/\d{2}/\d{4}),\"([\d,]+)\",\"([\d,]+)\",(\d+)", data)
>>> res
[('Place', '08/09/2010', '15,531', '2,909', '650')]
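The pattern above is tied to this exact five-field layout. If the number of fields can vary, a looser pattern that matches either a quoted run or a run of non-comma characters may be enough. This is only a sketch: it assumes double quotes, no escaped quotes inside a field, and no empty fields (which it would silently skip).

```python
import re

data = 'Place,08/09/2010,"15,531","2,909",650'

# Either a double-quoted run (commas allowed inside) or a run of non-commas.
raw_fields = re.findall(r'"[^"]*"|[^,]+', data)

# Drop the surrounding quotes from the quoted fields.
fields = [f.strip('"') for f in raw_fields]
print(fields)
# ['Place', '08/09/2010', '15,531', '2,909', '650']
```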

Solution 4:

You could parse a string of that format using pyparsing:

import pyparsing as pp
import datetime as dt

st='Place,08/09/2010,"15,531","2,909",650'

def line_grammar():
    integer=pp.Word(pp.nums).setParseAction(lambda s,l,t: [int(t[0])])
    sep=pp.Suppress('/')
    date=(integer+sep+integer+sep+integer).setParseAction(
              lambda s,l,t: dt.date(t[2],t[1],t[0]))
    comma=pp.Suppress(',')
    quoted=pp.Regex(r'("|\').*?\1').setParseAction(
              lambda s,l,t: [int(e) for e in t[0].strip('\'"').split(',')])
    line=pp.Word(pp.alphas)+comma+date+comma+quoted+comma+quoted+comma+integer
    return line

line=line_grammar()
print(line.parseString(st))
# ['Place', datetime.date(2010, 9, 8), 15, 531, 2, 909, 650]

The advantage is that you parse, convert, and validate in a few lines. Note that the numeric fields are all converted to ints (each quoted field is split on its comma, so "15,531" becomes 15 and 531) and the date to a datetime.date object.
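If installing pyparsing is not an option, the same parse-convert-validate idea can be roughed out with the standard library. This is a sketch, not an equivalent of the grammar above: `parse_line` is a hypothetical helper, the date is assumed to be day/month/year (as in the pyparsing version), and the quoted numbers are assumed to be thousands-separated, so "15,531" becomes 15531 rather than two ints.

```python
import csv
import datetime as dt
from io import StringIO


def parse_line(text):
    """Hypothetical helper: parse one line into typed values."""
    row = next(csv.reader(StringIO(text)))
    # Validate the shape before converting anything.
    if len(row) != 5:
        raise ValueError("expected 5 fields, got %d" % len(row))
    place, date_text, first, second, count = row
    # strptime raises ValueError if the date is malformed.
    date = dt.datetime.strptime(date_text, "%d/%m/%Y").date()

    def to_int(s):
        # Assumption: the comma is a thousands separator.
        return int(s.replace(",", ""))

    return [place, date, to_int(first), to_int(second), int(count)]


st = 'Place,08/09/2010,"15,531","2,909",650'
parsed = parse_line(st)
print(parsed)
# ['Place', datetime.date(2010, 9, 8), 15531, 2909, 650]
```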

Solution 5:

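a = """Place,08/09/2010,"15,531","2,909",650""".split(',')
result = []
i = 0
while i < len(a):
    if '"' not in a[i]:
        # No quote in this piece: it is a complete field.
        result.append(a[i])
    else:
        # Opening quote found: keep joining pieces until the closing quote.
        string = a[i]
        i += 1
        while True:
            string += "," + a[i]
            if '"' in a[i]:
                break
            i += 1
        result.append(string)
    i += 1
print(result)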

Result: ['Place', '08/09/2010', '"15,531"', '"2,909"', '650']

I'm not a big fan of regular expressions unless you absolutely need them.
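The same regex-free idea can also be written as a single pass over the characters, tracking whether we are inside quotes. This is a sketch with a hypothetical function name (`split_outside_quotes`); it assumes quotes are balanced and never escaped.

```python
def split_outside_quotes(text, sep=",", quote='"'):
    """Split text on sep, ignoring separators that appear inside quotes."""
    fields = []
    current = []
    in_quotes = False
    for ch in text:
        if ch == quote:
            # Toggle the quoted state and keep the quote character.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == sep and not in_quotes:
            # A separator outside quotes ends the current field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


result = split_outside_quotes('Place,08/09/2010,"15,531","2,909",650')
print(result)
# ['Place', '08/09/2010', '"15,531"', '"2,909"', '650']
```

Unlike the loop above, this version also copes with a quoted field that contains no comma at all.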
