Pass Argument To Findall In Bs4 In Python

I need help using bs4 inside a function. If I pass the path to findAll (or find) as a function argument, it does not work. Please see the sample below. from bs4 import Beautifu

Solution 1:

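from bs4 import BeautifulSoup

data = '<h1 class="headline">Willkommen!</h1>'


def check_text(path, value):
    # path: dict of keyword arguments for findAll; value is kept from the question but not used here
    soup = BeautifulSoup(data, "lxml")  # the "lxml" parser needs the lxml package installed

    # Positional form: a (name, attrs) tuple has to be unpacked with *
    x1 = "h1", {"class": "headline"}
    print(type(x1), 'soup.findAll(*x1)===', soup.findAll(*x1))
    # Keyword form: a dict has to be unpacked with **
    print(type(path), 'soup.findAll(**path)===', soup.findAll(**path))

    for i in soup.findAll(*x1):
        print('x1, text=', i.getText())

    for i in soup.findAll(**path):
        print('path, text=', i.getText())


check_text({'name': "h1", 'attrs': {"class": "headline"}}, 'Willkommen!')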

Instead of passing the arguments as a single string, pass a dictionary and unpack it with ** so that its items become keyword arguments to the called function.
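The same idea can be wrapped so that the value argument is actually checked. This is only a sketch: the helper name has_text and the "html.parser" backend are assumptions made for illustration and are not part of the original answer.

```python
from bs4 import BeautifulSoup

data = '<h1 class="headline">Willkommen!</h1>'


def has_text(query, value):
    """Return True if any tag matched by `query` has exactly the text `value`.

    `query` is a dict of keyword arguments for find_all (hypothetical helper).
    """
    soup = BeautifulSoup(data, "html.parser")
    # ** unpacks the dict into keyword arguments: find_all(name=..., attrs=...)
    return any(tag.get_text() == value for tag in soup.find_all(**query))


query = {"name": "h1", "attrs": {"class": "headline"}}
print(has_text(query, "Willkommen!"))
print(has_text(query, "Goodbye!"))
```

Keeping the query as a dict means the caller decides which find_all keywords to use, and the helper does not need to know about them.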

Solution 2:

The findAll method takes a tag name as its first parameter, not a path. It returns every tag with a matching name among the descendants of the tag it is called on. It is not meant to receive a path; check the documentation for more details.
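To make the "descendants of the tag on which it is called" part concrete, here is a small sketch. The HTML snippet and the variable names are made up for the example, and find_all is used as the current spelling of findAll.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, only for illustration.
html = """
<div id="a"><p>one</p><section><p>two</p></section></div>
<div id="b"><p>three</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

# Called on the whole document: every <p> is a descendant of the soup.
all_p = [p.get_text() for p in soup.find_all("p")]

# Called on one tag: only <p> tags nested (at any depth) inside that tag.
first_div = soup.find("div", attrs={"id": "a"})
inner_p = [p.get_text() for p in first_div.find_all("p")]

print(all_p)    # ['one', 'two', 'three']
print(inner_p)  # ['one', 'two']
```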

Now, soup.findAll(path) will look for the tags whose name is path. Since path = '"h1", {"class": "headline"}', soup.findAll(path) looks for tags literally named <"h1", {"class": "headline"}> in the HTML string, and no such tag exists.
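A quick way to see this for yourself is to compare the two calls side by side. This is a minimal sketch using the sample markup from the question; the variable names as_string and as_arguments are only for illustration.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<h1 class="headline">Willkommen!</h1>', "html.parser")

# The whole string is treated as one tag name, so nothing matches.
path = '"h1", {"class": "headline"}'
as_string = soup.find_all(path)

# Passing the name and the attrs dict as separate arguments does match.
as_arguments = soup.find_all("h1", {"class": "headline"})

print(len(as_string), len(as_arguments))
```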

So basically, there's no such thing as a "path". Still, the syntax you're using suggests that you want the h1 tags whose class attribute is equal to "headline". The way to specify attributes for the findAll method is to pass them as a dictionary to the attrs argument. What you probably mean to do is:

soup.findAll('h1', attrs={'class': "headline"}, text="Willkommen!")
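One detail worth checking when filtering by text: a plain string has to equal the tag's text exactly, including case and punctuation. The sketch below uses string, which is the newer name for the text argument in recent Beautiful Soup releases. The regular expression is just one possible way to loosen the match, not something the original answer prescribes.

```python
import re

from bs4 import BeautifulSoup

soup = BeautifulSoup('<h1 class="headline">Willkommen!</h1>', "html.parser")

# An exact string must equal the tag's text, including case and punctuation.
exact = soup.find_all("h1", attrs={"class": "headline"}, string="Willkommen!")
wrong_case = soup.find_all("h1", attrs={"class": "headline"}, string="willkommen")

# A compiled pattern is matched with search(), so it can ignore case.
loose = soup.find_all("h1", attrs={"class": "headline"},
                      string=re.compile("willkommen", re.IGNORECASE))

print(len(exact), len(wrong_case), len(loose))
```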
