Parse URL With Regex In Python

I want to extract the query parameter names and values from a URL and display them. For example, given url='http://host:port_num/file/path/file1.html?query1=value1&query2=value2', how do I parse out the query parameters?

Solution 1:

Don't use a regex! Use urlparse.

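>>> from urllib.parse import urlparse, parse_qs  # Python 3; in Python 2 both live in the urlparse module
>>> url = 'http://host:port_num/file/path/file1.html?query1=value1&query2=value2'
>>> parse_qs(urlparse(url).query)
{'query1': ['value1'], 'query2': ['value2']}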
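parse_qs returns a list for every name because a query parameter may appear more than once. If the goal is simply to display each name and value, parse_qsl may be a better fit. A minimal sketch, assuming Python 3; the helper name `query_pairs` is made up for illustration:

```python
from urllib.parse import urlsplit, parse_qsl

def query_pairs(url):
    """Return the query string as a list of (name, value) tuples, in URL order."""
    # parse_qsl keeps repeated names and percent-decodes names and values.
    return parse_qsl(urlsplit(url).query)

url = 'http://host:port_num/file/path/file1.html?query1=value1&query2=value2'
for name, value in query_pairs(url):
    print(name, '=', value)
```

Unlike the dict from parse_qs, the list of tuples keeps the parameters in the order they appear in the URL.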

Solution 2:

I agree that it's better to use urlparse than a regex, but here is my regex anyway. Tools like urlparse were written specifically to handle all kinds of URLs and are much more reliable than a regex, so use them if you can.

>>> import re
>>> x = 'http://www.example.com:8080/abcd/dir/file1.html?query1=value1&query2=value2'
>>> query_pattern = r'(query\d+)=(\w+)'
>>> # query_pattern = r'(\w+)=(\w+)'    a more general pattern
>>> re.findall(query_pattern, x)
[('query1', 'value1'), ('query2', 'value2')]
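To illustrate why the regex is less reliable, here is a rough comparison of the general `(\w+)=(\w+)` pattern with parse_qsl. It assumes Python 3, and the URL is a made-up example whose value contains a percent-encoded space and a hyphen:

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical URL: the value of 'q' contains '%20' (an encoded space) and a hyphen.
url = 'http://www.example.com/search?q=hello%20big-world&page=2'

# \w+ stops at the first non-word character, so the value of 'q' is cut short.
regex_pairs = re.findall(r'(\w+)=(\w+)', url)

# parse_qsl splits on '&' and '=' and percent-decodes each value.
parsed_pairs = parse_qsl(urlsplit(url).query)

print(regex_pairs)   # [('q', 'hello'), ('page', '2')]
print(parsed_pairs)  # [('q', 'hello big-world'), ('page', '2')]
```

The regex silently truncates the value at `%`, while the parser returns the full decoded string.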
