Trying To Find The Regex For This Particular Case? Also Can I Parse This Without Creating Groups?

May 29, 2023

The text to capture looks like this: Policy Number ABCD000012345 other text follows in same line.... My regex looks like this: regex value='(?i)(?:[P|p]olicy\s[N|n]o[|:|;|,][

Solution 1:

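You can use a simpler regex that matches from the beginning of the line, such as "[Pp]olicy\s*[Nn]umber\s*\b([A-Z]{4}\d+)\b.*", with word boundaries (\b) around the ID. Note that [P|p] also matches a literal |, because | is not an alternation operator inside a character class; [Pp] is what you want. The example below uses the looser [A-Z0-9]+ for the ID.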

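import re

# [Pp] matches either case; inside a character class, | is a literal pipe.
pattern = re.compile(r"[Pp]olicy\s*[Nn]umber\s*\b([A-Z0-9]+)\b.*")
line = "Policy Number    ABCD000012345    other text follows in same line...."
matches = pattern.match(line)
if matches:  # match() returns None when the line does not fit the pattern
    id_res = matches.group(1)
    print(id_res)  # ABCD000012345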

And if there are always two words before the ID, you can use (?:\w+\s+){2}\b([A-Z0-9]+)\b.*
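As a rough sketch of the two-word variant above (the helper name extract_id and the second sample line are made up for illustration), the same pattern picks up the ID no matter what the two leading words are:

```python
import re

# Skip exactly two leading words, then capture the ID that follows.
TWO_WORDS = re.compile(r"(?:\w+\s+){2}\b([A-Z0-9]+)\b.*")

def extract_id(line):
    """Return the captured ID, or None if the line does not match."""
    m = TWO_WORDS.match(line)
    return m.group(1) if m else None

print(extract_id("Policy Number ABCD000012345 other text"))
print(extract_id("Claim Reference WXYZ000067890 more text"))
```

Because the pattern no longer names the label, it only makes sense when the input is known to have exactly two words in front of the ID.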

Also, \s already stands for [\r\n\t\f\v ], so there is no need to repeat those characters: your [\n\r\s\t] is just \s.

Solution 2:

You don't need to specify both upper- and lowercase P and N, since you're already using the case-insensitive flag (?i).

Also \s already covers \n, \t and \r.

(?i)policy\s+number\s+([A-Z]{4}\d+)\b
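A minimal sketch of how this pattern might be used from Python (the function name find_policy_id and the second sample string are illustrative, not from the question). One thing to be aware of: with (?i) the class [A-Z] also matches lowercase letters, so a lowercase ID is captured too.

```python
import re

# (?i) makes the whole pattern case-insensitive, including [A-Z].
POLICY = re.compile(r"(?i)policy\s+number\s+([A-Z]{4}\d+)\b")

def find_policy_id(text):
    """Return the first policy ID found anywhere in text, or None."""
    m = POLICY.search(text)
    return m.group(1) if m else None

print(find_policy_id("Policy Number ABCD000012345 other text follows"))
print(find_policy_id("see POLICY   NUMBER\tabcd000012345, attached"))
```

search() is used here instead of match() so the label may appear anywhere in the line, not only at the start.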


Another Answer:

^[\s\w]+\b([A-Z]{4}\d+)\b
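To illustrate, here is a small sketch of this label-free pattern (the helper name generic_id and the sample lines other than the first are assumptions for the example). It relies only on the shape of the ID: four capital letters followed by digits.

```python
import re

# No fixed label: skip any run of word/space characters from the start
# of the line, then capture four capital letters followed by digits.
GENERIC = re.compile(r"^[\s\w]+\b([A-Z]{4}\d+)\b")

def generic_id(line):
    """Return the captured ID, or None if the line does not match."""
    m = GENERIC.match(line)
    return m.group(1) if m else None

print(generic_id("Policy Number ABCD000012345 other text"))
print(generic_id("Policy No ABCD000012345 other text"))
print(generic_id("Policy Number: ABCD000012345 other text"))
```

The last call returns None: [\s\w]+ cannot cross the colon, so this pattern only fits lines where nothing but words and whitespace precedes the ID.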


I like this one better, since it still works if the label in your text changes from "Policy Number" to something else.
