Using Minidom to Parse XML
Hi, I'm having trouble understanding the minidom module for Python. I have XML that looks like this: Dexter 7
Solution 1:
Each episode element has child elements, including a title element. Your code, however, is looking for attributes instead.
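To make the attribute/element distinction concrete, here is a small sketch. The fragment is hypothetical (the `number` attribute in particular is an assumption made for illustration, not something taken from the question's data); it only shows which minidom call reaches which kind of value.

```python
from xml.dom.minidom import parseString

# Hypothetical fragment: "number" is an attribute, <title> is a child element.
doc = parseString('<episode number="7"><title>Dexter</title></episode>')
episode = doc.documentElement

# Attributes are read with getAttribute().
number = episode.getAttribute('number')

# There is no "title" attribute, so minidom returns an empty string.
missing = episode.getAttribute('title')

# The title lives in a child element; its text is a separate text node.
title = episode.getElementsByTagName('title')[0].firstChild.data

print(number, repr(missing), title)
```

Looking up `title` as an attribute fails silently with an empty string, which is probably why code written that way prints nothing useful.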
To get text out of a minidom element, you need a helper function:
def getText(nodelist):
    rc = []
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            rc.append(node.data)
    return ''.join(rc)
And then you can more easily print all the titles:
for episode in xml.getElementsByTagName('episode'):
    for title in episode.getElementsByTagName('title'):
        # getText expects a node list, so pass the element's children
        print(getText(title.childNodes))
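For anyone who wants to try this without the original feed, the following is a minimal, self-contained sketch of the same approach. The sample document is an assumption: the question's XML did not survive, so the `Show`/`Episodelist`/`episode` layout and the `epnum` element are hypothetical. `get_text` and `episode_titles` are illustrative names, not part of minidom.

```python
from xml.dom.minidom import parseString

# Hypothetical sample; the real feed's layout may differ.
SAMPLE = """<Show>
  <Episodelist>
    <episode><epnum>1</epnum><title>Dexter</title></episode>
    <episode><epnum>2</epnum><title>Crocodile</title></episode>
  </Episodelist>
</Show>"""


def get_text(nodelist):
    """Concatenate the data of all text nodes in nodelist."""
    return ''.join(node.data for node in nodelist
                   if node.nodeType == node.TEXT_NODE)


def episode_titles(xml_text):
    """Return the title text of every <episode> element."""
    dom = parseString(xml_text)
    titles = []
    for episode in dom.getElementsByTagName('episode'):
        for title in episode.getElementsByTagName('title'):
            # Pass the element's children, not the element itself.
            titles.append(get_text(title.childNodes))
    return titles


print(episode_titles(SAMPLE))
```

Returning a list rather than printing inside the loop keeps the parsing separate from the output, which makes the function easier to reuse.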
Solution 2:
title is not an attribute, it's a tag (an element). An attribute is something like src in <img src="foo.jpg" />.
>>> from xml.dom.minidom import parseString
>>> parsed = parseString(s)  # s holds the XML text
>>> titles = [n.firstChild.data for n in parsed.getElementsByTagName('title')]
>>> titles
[u'Dexter', u'Crocodile', u'Popping Cherry']
You can extend the above to fetch other details. That said, lxml is better suited to this kind of task; as the snippet above shows, minidom is not that friendly.
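As one possible way to extend the minidom approach to other details, the sketch below collects several fields per episode into a dictionary. The field names (`epnum`, `airdate`), the sample values, and the helper `child_text` are assumptions made for illustration. The helper also guards against empty elements, where `firstChild` is `None` and `.data` would raise an error.

```python
from xml.dom.minidom import parseString

# Hypothetical sample; field names and values are illustrative only.
SAMPLE = """<Episodelist>
  <episode><epnum>1</epnum><airdate>2006-10-01</airdate><title>Dexter</title></episode>
  <episode><epnum>2</epnum><airdate/><title>Crocodile</title></episode>
</Episodelist>"""


def child_text(element, tag, default=''):
    """Text of the first <tag> descendant of element, or default."""
    matches = element.getElementsByTagName(tag)
    if not matches or matches[0].firstChild is None:
        return default
    return matches[0].firstChild.data


# One dictionary per <episode>, keyed by the field names we asked for.
episodes = [
    {field: child_text(ep, field) for field in ('epnum', 'airdate', 'title')}
    for ep in parseString(SAMPLE).getElementsByTagName('episode')
]

for ep in episodes:
    print(ep)
```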
Solution 3:
Thanks to Martijn Pieters, who pointed me to the ElementTree API, I solved this problem.
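import xml.etree.ElementTree as ET
from urllib.request import urlopen  # Python 2: from urllib2 import urlopen
xml = ET.parse(urlopen("http://services.tvrage.com/feeds/episode_list.php?sid=7296"))
print('xml fetched..')
for episode in xml.iter('episode'):
    print(episode.find('title').text)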
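The same ElementTree calls can be tried offline. This is a sketch only: the sample string below is a hypothetical stand-in for the downloaded feed, and `findtext` is used instead of `find(...).text` so that an episode without a title yields an empty string rather than an `AttributeError`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the downloaded feed.
SAMPLE = """<Show>
  <Episodelist>
    <episode><title>Dexter</title></episode>
    <episode><title>Crocodile</title></episode>
    <episode></episode>
  </Episodelist>
</Show>"""

root = ET.fromstring(SAMPLE)

# iter() walks all descendants named "episode", at any depth.
titles = [episode.findtext('title', default='')
          for episode in root.iter('episode')]

print(titles)
```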
Thanks