
Regex Pattern To Exclude Specific String

I have a string in this format: some text <br>septembar 1989<br>. I'm using this regex to find the month and year part:
(?!=b\.)(.*?\b\d{4}\b) and I

Solution 1:

Try this

<br/?>([^<]+)\d{4}

[^<] matches any character except <, so the match cannot run past the start of the next tag, which is what you want.
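A quick way to check this pattern in Python, as a sketch: the sample string is assumed from the question and the variable names are only illustrative. With \d{4} outside the parentheses, group 1 holds just the month; moving \d{4} inside the group is a small variation that captures month and year together.

```python
import re

# Sample input assumed from the question; names here are illustrative.
text = "some text <br>septembar 1989<br>"

# Pattern as given in the answer: the year sits outside the capture group.
as_given = re.search(r"<br/?>([^<]+)\d{4}", text)

# Variation: move \d{4} inside the group so month and year are captured together.
with_year = re.search(r"<br/?>([^<]+\d{4})", text)

print(repr(as_given.group(1)))
print(repr(with_year.group(1)))
```

The repr() calls make any trailing space in the captured text visible.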

Solution 2:

I wrote some simple code that you may find helpful to some extent:

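import re

def getDate(text):
    # [<br>]* is a character class (any of <, b, r, >), so it skips a leading "<br>";
    # the group captures the text after the next <br>, ending in a year 1000-2999.
    m = re.match(r"[<br>]*[\w\s]*<br>([\w\s]*[12][0-9]{3})", text)
    return m.group(1)

print(getDate("some text <br>dec 1989<br>"))
print(getDate("<br> some text <br>septembar 1989<br>"))
print(getDate("grijesh chuahan <br>feb 2009<br>"))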

Output:

dec 1989
septembar 1989
feb 2009
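One thing to keep in mind with this approach: re.match only matches at the start of the string and returns None when the pattern does not fit, so calling .group(1) on the result would raise an AttributeError. A minimal sketch of a guarded variant follows; the name get_date_or_none is hypothetical and not part of the original answer.

```python
import re

# Same pattern as in the answer, written as a raw string.
# get_date_or_none is a hypothetical helper name used for illustration.
PATTERN = re.compile(r"[<br>]*[\w\s]*<br>([\w\s]*[12][0-9]{3})")

def get_date_or_none(text):
    # re.match anchors at the start of the string and returns None on failure.
    m = PATTERN.match(text)
    return m.group(1) if m else None

print(get_date_or_none("some text <br>dec 1989<br>"))
print(get_date_or_none("no date here"))
```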

Solution 3:

import re

ss = 'dfgqeg<br>some text <br>septembar 1989<br>'

# Adjacent raw-string literals are concatenated into one pattern.
reg = re.compile(r'<br(?: /)?>'
                 r'(?!.+?<br(?: /)?>.+?<br(?: /)?>)'
                 r'(.+?\d{4})'
                 r'<br(?: /)?>')

print(reg.search(ss).group(1))

  • '<br(?: /)?>' catches <br> and <br /> occurrences

  • '(?!.+?<br(?: /)?>.+?<br(?: /)?>)' is a negative look-ahead assertion. It verifies that, from the position where it starts in the analyzed text, the following sequence does not occur:

    • .+? one or more characters of any kind; the ? makes the match lazy, so this portion stops as soon as <br> or <br /> is encountered
    • <br> or <br />
    • again, one or more characters of any kind, stopping before <br> or <br />
    • <br> or <br />

    In effect, the opening tag is accepted only if it is not followed by two more tags.
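The pieces described above can be assembled and tried on a couple of inputs. This is a sketch only: the BR constant, the last_date helper, and the second sample string are assumptions added for illustration and are not part of the original answer.

```python
import re

# Tag sub-pattern from Solution 3; BR and last_date are hypothetical names.
BR = r"<br(?: /)?>"

reg = re.compile(
    BR                                       # opening <br> or <br />
    + r"(?!.+?" + BR + r".+?" + BR + r")"    # fail if text, a tag, text, and another tag follow
    + r"(.+?\d{4})"                          # capture text ending in four digits
    + BR                                     # closing <br> or <br />
)

def last_date(text):
    # Return the captured segment, or None when nothing matches.
    m = reg.search(text)
    return m.group(1) if m else None

print(last_date("dfgqeg<br>some text <br>septembar 1989<br>"))
print(last_date("a<br />b<br />dec 1989<br />"))
```

In both inputs the first tag is rejected by the look-ahead because two more tags follow it, so the search moves on to the tag directly before the date.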
