
Scrape Many Products With The Same Name Html. I Can Only Scrape One, Not All Of Them

I have successfully performed a little scraping from a site through Selenium. The data downloaded without problems. Good! On the site there are many products that, in Html, have th

Solution 1:

You need to find all matching elements for each kind of data you are looking for, which means using the plural find_elements methods. So, start with:

Product_Names = driver.find_elements_by_class_name("tablet-desktop-only")
Product_Descriptions = driver.find_elements_by_class_name("h-text-left")
Vendors = driver.find_elements_by_class_name("in-vendor")
Prices = driver.find_elements_by_class_name("h-text-center")
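A note for newer Selenium versions: as far as I know, the find_elements_by_* helpers were removed in Selenium 4.3, where the equivalent call is driver.find_elements(By.CLASS_NAME, ...). Below is a minimal sketch of the same four lookups in that style. The helper name find_columns and the COLUMN_CLASSES dict are made up for illustration. The locator string "class name" is written out so the snippet needs no imports; in a real script you would normally import By from selenium.webdriver.common.by and use By.CLASS_NAME, which is that same string.

```python
# Locator strategy; this is the value of selenium.webdriver.common.by.By.CLASS_NAME.
CLASS_NAME = "class name"

# Class names taken from the lookups above; the dict itself is illustrative.
COLUMN_CLASSES = {
    "name": "tablet-desktop-only",
    "description": "h-text-left",
    "vendor": "in-vendor",
    "price": "h-text-center",
}


def find_columns(driver):
    """Return {column: list of elements} via the find_elements(by, value) call."""
    return {
        column: driver.find_elements(CLASS_NAME, class_name)
        for column, class_name in COLUMN_CLASSES.items()
    }
```

With a live driver, columns = find_columns(driver) would give you columns["name"], columns["description"] and so on, which play the same role as the four lists above.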

You should now have 4 lists of elements (not strings). Ideally they are all the same length and list the products in the same order. To be safe, we only go as far as the length of the shortest list.

Num_Groups = min(len(Product_Names), len(Product_Descriptions), len(Vendors), len(Prices))

Then we loop over all 4 lists at the same time:

for i in range(Num_Groups):
    print(Product_Names[i].text)
    print(Product_Descriptions[i].text)
    print(Vendors[i].text)
    print(Prices[i].text)
    # You might want to print a blank line here, e.g. print()

Note we need .text here so we get the text of the element, not a description of the element itself. Also note the [i] to get that element in the list.
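As an alternative to indexing with [i], the same pairing can be sketched with zip(), which stops at the shortest list on its own. This is only an illustration: to_rows and fake are hypothetical helper names, and the SimpleNamespace objects are stand-ins for WebElements so the snippet runs without a browser.

```python
from types import SimpleNamespace


def to_rows(names, descriptions, vendors, prices):
    """Return one (name, description, vendor, price) tuple of strings per product."""
    # zip() stops at the shortest input, so no explicit min(len(...)) is needed.
    return [
        (n.text, d.text, v.text, p.text)
        for n, d, v, p in zip(names, descriptions, vendors, prices)
    ]


# Stand-ins for WebElements; with Selenium you would pass the four
# lists returned by the find_elements calls instead.
def fake(*texts):
    return [SimpleNamespace(text=t) for t in texts]


rows = to_rows(
    fake("Pen", "Ink"),
    fake("Blue pen", "Black ink"),
    fake("Acme", "Inkwell"),
    fake("1.50", "4.00", "orphan price"),  # extra item is ignored by zip()
)
for row in rows:
    print(" | ".join(row))
```

Collecting plain strings into tuples like this also keeps the scraping step separate from whatever you do with the data afterwards.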

Within this loop is where you would do your database inserts (though probably connect outside the loop), making sure to merge the .text into the SQL string, not the element's string representation.
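For the database step, here is a rough sketch using Python's built-in sqlite3 module. The question does not say which database is used, so sqlite3, the products table and the save_rows helper are all assumptions for illustration. Note that instead of merging the values into the SQL string, this sketch passes them as bound parameters (the ? placeholders), which avoids quoting problems when a product name contains an apostrophe.

```python
import sqlite3


def save_rows(conn, rows):
    """Insert (name, description, vendor, price) string tuples into a products table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products "
        "(name TEXT, description TEXT, vendor TEXT, price TEXT)"
    )
    # Values are bound to the ? placeholders rather than formatted into the SQL text.
    conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)", rows)
    conn.commit()


# Connect once, outside any loop; ":memory:" keeps this demo self-contained.
conn = sqlite3.connect(":memory:")
save_rows(
    conn,
    [
        ("O'Neill's Pen", "Blue pen", "Acme", "1.50"),
        ("Ink", "Black ink", "Inkwell", "4.00"),
    ],
)
print(conn.execute("SELECT name, price FROM products").fetchall())
```

In the scraping script, each tuple would be built from the .text values of one product, for example (Product_Names[i].text, Product_Descriptions[i].text, Vendors[i].text, Prices[i].text).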


Solution 2:

To find multiple elements with a specific class, we can use find_elements_by_class_name. The difference from the function you wrote is the plural: you should write elements instead of element. This function returns a list, from which we can select the desired element by index. Note that you cannot use .text on the list itself; you must use it on the individual items. Example:

elements = driver.find_elements_by_class_name('tablet-desktop-only')
print(elements[0].text)
# Or using a for loop:
for element in elements:
    print(element.text)

Solution 3:

For example, if tablet-desktop-only matches several product names, you should use find_elements, not find_element:

name = driver.find_elements_by_class_name("tablet-desktop-only")
for nme in name:
    print(nme.text)

You can easily replicate this for the other fields, such as Description, Vendor and Price.

Update 1:

Above, name is a Python list; similarly, you can build lists for Description, Vendor and Price.

Now that we have 4 lists, we can print the items one by one like this (all names first, then all descriptions, and so on):

for seq in name + Description + Vendor + Price:
    print(seq.text)
