Get The Contents(full Of Text) From The Paragraph Beautiful Soup
I want to extract the contents (the full text) of the paragraphs from news webpages. I have a set of URLs, and only the content of the paragraphs should be extracted from each. When I use t
Solution 1:
This is because of the print p.read() line, which prints out the whole HTML page.
To get the article text, find the article by id and then find all paragraphs inside it.
Example using a CSS selector:
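from bs4 import BeautifulSoup

soup = BeautifulSoup(p, 'html.parser')  # p: the page object/HTML fetched earlier; parser named explicitly
print(''.join(par.text for par in soup.select('article#story p.story-content')))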
Prints:
ANKARA, Turkey — The Obama administration on Monday began the work of trying to determine
...
FYI, article#story p.story-content would match all p tags that have the story-content class inside the article tag with the story id.
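To make the selector's behaviour concrete, here is a minimal sketch run against a small inline HTML string. The markup, the byline class, and the paragraph texts are made up for illustration and are not taken from the real news page. It also shows the two-step lookup described above (find the article by id, then its paragraphs), which should give the same result.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded news page.
html = """
<html><body>
  <p class="story-content">Teaser outside the article.</p>
  <article id="story">
    <p class="byline">By A. Reporter</p>
    <p class="story-content">First paragraph.</p>
    <p class="story-content">Second paragraph.</p>
  </article>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS selector: p tags with class story-content inside article#story.
selected = [p.get_text() for p in soup.select("article#story p.story-content")]

# Two-step lookup: find the article by id, then its matching paragraphs.
article = soup.find("article", id="story")
found = [p.get_text() for p in article.find_all("p", class_="story-content")]

# Join with newlines so paragraph boundaries survive in the output.
text = "\n".join(selected)
print(text)
```

The teaser paragraph is skipped because it sits outside the article, and the byline is skipped because it lacks the story-content class. Joining with a newline instead of an empty string is a choice made here to keep the paragraphs separated.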