
Why My Python Code Is Extracting The Same Data For All The Elements In My List?

My project is to build a competitive rate-watch table of hotel rates for an agency. Doing this by hand is painful, so I wanted to automate it. The code correctly extracts the name of hote

Solution 1:

  1. Your code is outdated. The HTML has changed, and elements such as the one with the ID boutonr no longer exist on the page.
  2. Your loop and order of execution are wrong, so the code keeps evaluating the same fields.
  3. Avoid time.sleep(), or at least keep it to an absolute minimum, because it wastes execution time. Use WebDriverWait(...) instead.
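The third point can be illustrated without a browser. The sketch below is a simplified stand-in for what an explicit wait does: it polls a condition until it returns something truthy or a timeout expires, so the script moves on as soon as the page is ready instead of always paying for a fixed sleep. The `wait_until` helper and its parameters are hypothetical names for illustration. This is not Selenium's implementation; in real code `WebDriverWait(driver, timeout).until(condition)` plays this role.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.05):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper that mimics the idea behind an explicit wait.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within {} s".format(timeout))
        time.sleep(poll_interval)


# Simulated "page" that becomes ready 0.2 s after the script starts.
ready_at = time.monotonic() + 0.2


def page_ready():
    return time.monotonic() >= ready_at


start = time.monotonic()
wait_until(page_ready, timeout=5.0)
elapsed = time.monotonic() - start
# The wait ends shortly after the page is ready, far below the 5 s ceiling.
print("waited {:.2f} s".format(elapsed))
```

The timeout is only an upper bound: a fast page costs almost nothing, while a slow page gets up to the full timeout before the script gives up.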

I don't speak French so I could not understand what you are after in your code, but this minimised example below should help you to understand the principle.

#!/usr/bin/env python
# coding: utf-8
import time
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait, Select
from selenium.common.exceptions import StaleElementReferenceException, NoSuchElementException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

driver = webdriver.Chrome(r"C:\chromedriver.exe")
driver.get('https://tn.tunisiebooking.com/')

# params to select
params = {  'destination': 'Nabeul',
            'date_from': '25/08/2021',
            'date_to': '26/08/2021',
            'bedroom': '1' }

# select destination
destination_select = Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.ID, 'ville_des'))))
destination_select.select_by_value(params['destination'])

# select bedroom
bedroom_select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'select_ch'))))
bedroom_select.select_by_value(params['bedroom'])

# select dates
script = f"document.getElementById('checkin').value ='{params['date_from']}';"
script += f"document.getElementById('checkout').value ='{params['date_to']}';"
script +=  f"document.getElementById('depart').value ='{params['date_from']}';"
script += f"document.getElementById('arrivee').value ='{params['date_to']}';"
driver.execute_script(script)

# submit form
btn_rechercher = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//div[@onclick="return submit_hotel_recherche()"]')))
btn_rechercher.click()

urls = []
hotels = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//div[starts-with(@id,'produit_affair')]")))

for hotel in hotels:
    link = hotel.find_element_by_xpath(".//span[@class='tittre_hotel']/a").get_attribute("href")
    urls.append(link)

for url in urls:
    driver.get(url)
    try:
        name = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//div[@class='bloc_titre_hotels']/h2"))).text
        arropt = driver.find_element_by_xpath("//div[contains(@class,'line_result')][1]")
        opt = arropt.find_element_by_tag_name("b").text
        num = len(arropt.find_elements_by_tag_name("option"))
        optiondata = {}
        achats = {}
        marges= {}

        for i in range(num):
            try:
                selection = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'arrangement')))).select_by_index(i)
                time.sleep(0.5)

                arr = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//select[@id='arrangement']/option[@selected='selected']"))).text
                prize = driver.find_element_by_id("prix_total").text

                optiondata[arr] = int(prize)

            except StaleElementReferenceException:
                pass

        print("{} : {} - {}".format(name, opt, optiondata))

    except NoSuchElementException:
        pass

driver.quit()

Result:

Byzance Nabeul : Chambre Double - {'All Inclusive soft': 93, 'Demi Pension': 38, 'Petit Dejeuner': 28, 'Pension Complete': 78}
Palmyra Club Nabeul Nabeul : Double Standard - {'All Inclusive soft': 92}
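
The first loop above stores each hotel's link as a plain string before any navigation happens. That ordering matters: a WebElement belongs to the page it was found on, so once `driver.get(url)` loads another page, using the old references typically raises `StaleElementReferenceException`, whereas strings stay valid. A minimal sketch of that first pass, using a hypothetical `FakeElement` stand-in so it runs without a browser:

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, href):
        self._attrs = {"href": href}

    def get_attribute(self, name):
        return self._attrs.get(name)


def collect_links(elements):
    """First pass: turn link elements into plain URL strings, skipping empties."""
    links = []
    for element in elements:
        href = element.get_attribute("href")
        if href:
            links.append(href)
    return links


# The example.com URLs are placeholders, not real hotel pages.
anchors = [
    FakeElement("https://example.com/hotel-a"),
    FakeElement(None),
    FakeElement("https://example.com/hotel-b"),
]
urls = collect_links(anchors)
print(urls)
```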

The following code goes to the payment page and extracts all the info there:

#!/usr/bin/env python
# coding: utf-8
import time
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait, Select
from selenium.common.exceptions import StaleElementReferenceException, NoSuchElementException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

driver = webdriver.Chrome("/usr/local/bin/chromedriver")
driver.get('https://tn.tunisiebooking.com/')

# params to select
params = {
    'destination': 'Nabeul',
    'date_from': '29/08/2021',
    'date_to': '30/08/2021',
    'bedroom': '1'
}

# select destination
destination_select = Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.ID, 'ville_des'))))
destination_select.select_by_value(params['destination'])

# select bedroom
bedroom_select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'select_ch'))))
bedroom_select.select_by_value(params['bedroom'])

# select dates
script = f"document.getElementById('checkin').value ='{params['date_from']}';"
script += f"document.getElementById('checkout').value ='{params['date_to']}';"
script +=  f"document.getElementById('depart').value ='{params['date_from']}';"
script += f"document.getElementById('arrivee').value ='{params['date_to']}';"
driver.execute_script(script)

# submit form
btn_rechercher = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//div[@onclick="return submit_hotel_recherche()"]')))
btn_rechercher.click()

urls = []
hotels = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//div[starts-with(@id,'produit_affair')]")))

for hotel in hotels:
    link = hotel.find_element_by_xpath(".//span[@class='tittre_hotel']/a").get_attribute("href")
    urls.append(link)

for url in urls:
    driver.get(url)
    try:
        name = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//div[@class='bloc_titre_hotels']/h2"))).text
        arropt = driver.find_element_by_xpath("//div[contains(@class,'line_result')][1]")
        opt = arropt.find_element_by_tag_name("b").text
        num = len(arropt.find_elements_by_tag_name("option"))
        optiondata = {}
        achats = {}
        marges= {}
        try:
            selection = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'arrangement'))))
            time.sleep(0.5)

            arr = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//select[@id='arrangement']/option[@selected='selected']"))).text
            prize = driver.find_element_by_id("prix_total").text

            optiondata[arr] = (int(prize))

            btn_passe = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'resa')))
            btn_passe.click()

            tot = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'montant_total_apres_code')))
            total = int(tot.text.replace(' €', ''))

            # params to select
            params = {
                        'civilite_acheteur': 'Mlle',
                        'prenom_acheteur': 'test',
                        'nom_acheteur': 'test',
                        'e_mail_acheteur': 'test@gmail.com',
                        'portable_acheteur': '22222222',
                        'ville_acheteur': 'Test',
                    }

            # select civilite
            civilite_acheteur = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.NAME, 'civilite_acheteur'))))
            civilite_acheteur.select_by_value(params['civilite_acheteur'])

            # saisir prenom 
            script  = f"document.getElementsByName('prenom_acheteur')[0].value ='{params['prenom_acheteur']}';"
            script += f"document.getElementsByName('nom_acheteur')[0].value ='{params['nom_acheteur']}';"
            script += f"document.getElementsByName('e_mail_acheteur')[0].value ='{params['e_mail_acheteur']}';"
            script += f"document.getElementsByName('portable_acheteur')[0].value ='{params['portable_acheteur']}';"
            script += f"document.getElementsByName('ville_acheteur')[0].value ='{params['ville_acheteur']}';"
            driver.execute_script(script)

            # submit form
            btn_agence = driver.find_element_by_class_name('continuez_resa')
            btn_agence.click()
            
            achat1 = int(driver.find_element_by_id('montant_a_payer').text.replace(' €', ''))
            achat = int(driver.find_element_by_id('montant_restant').text.replace(' €', ''))
            achat3 = float(driver.find_element_by_xpath('//div[@class="ligne_interne_total"]/div[3]/div[@class="prix_total1 text_shadow"]').text.replace(' TND', ''))
            achats[arr]=achat

            marge = int(((float(prize) - float(achat)) / float(achat)) * 100)
            marges[arr]=marge
            optiondata[arr]=prize,total,achat1,achat,achat3,marge

        except StaleElementReferenceException:
            pass

        print("{} : {} - {}".format(name, opt, optiondata))

    except NoSuchElementException:
        pass
    
driver.quit()

Output:

Byzance Nabeul : Chambre Double - {'Petit Dejeuner': (36, 41, 12, 29, 4.0, 24)}

Where:

36 = Prix Total
41 = Montant Total
12 = Montant de l'acompte
29 = Vous payerez le reste à votre arrivée à l'hôtel
4.0 = Total taxe de séjour à payer sur place à l'hôtel est
24 = Marges
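
The margin in the last position is derived from two of the other values: it is the difference between the total price and the amount left to pay at the hotel, relative to that amount, truncated to a whole percentage. A small sketch of the arithmetic, with `margin_percent` as a hypothetical helper name wrapping the same formula as the script:

```python
def margin_percent(prix_total, montant_restant):
    """Margin as a truncated percentage of the amount left to pay at the hotel."""
    return int((float(prix_total) - float(montant_restant)) / float(montant_restant) * 100)


# Values from the sample output: Prix Total = 36, amount left to pay = 29.
print(margin_percent(36, 29))  # (36 - 29) / 29 * 100 = 24.13..., truncated to 24
```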

Hotel page: [screenshot omitted]

Solution 2:

You are using sleeps to load the pages in your first example but not in your second one (the one that you state works just fine).

Fixed sleeps are typically not how you want to use Selenium, which leads me to believe that your timing is off.

This SO answer shows you how to use "Explicit Waits" on "expected_conditions" to not have "specific timings" which can/will fail.

You even create a wait object but never use it.

Use it in conjunction with expected_conditions and remove the specific timed sleeps and things will get better.

expected_conditions docs are here
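
As one way to replace a fixed sleep before reading a value, a wait condition can be any callable that takes the driver and returns a truthy value once the page is ready. The sketch below waits until the `prix_total` element holds a number. `price_is_numeric`, `FakeElement` and `FakeDriver` are hypothetical names for illustration; the fakes exist only so the example runs without a browser, and the code assumes the Selenium 4 style `find_element(by, value)` call, where `By.ID` is the string "id".

```python
def price_is_numeric(driver):
    """Wait condition: return the price text once it is a number, else False."""
    text = driver.find_element("id", "prix_total").text.strip()
    return text if text.isdigit() else False


class FakeElement:
    """Hypothetical stand-in for a WebElement exposing only `.text`."""

    def __init__(self, text):
        self.text = text


class FakeDriver:
    """Hypothetical stand-in whose price appears on the third lookup."""

    def __init__(self):
        self.calls = 0

    def find_element(self, by, value):
        self.calls += 1
        return FakeElement("93" if self.calls >= 3 else "")


driver = FakeDriver()
price = False
# With a real driver, WebDriverWait(driver, 10).until(price_is_numeric)
# would do this polling and also enforce the timeout.
while not price:
    price = price_is_numeric(driver)
print(price, driver.calls)
```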

Solution 3:

The problem was that the script could not access the element listing the arrangements for the rest of the hotels in the list. I added a function that tests for the presence of that element, and it worked.

def existsElement(element_id):
    # Return True if an element with this ID is present on the current page.
    try:
        driver.find_element_by_id(element_id)
    except NoSuchElementException:
        return False
    return True

for url in urls:
    driver.get(url)

    # The arrangements block is missing: click the control inside the moteur_rech form.
    if not existsElement('result_par_arrangement'):
        btn_t = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="moteur_rech"]/form/div/div[3]/div')))
        btn_t.click()
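
An alternative sketch of the same presence check: `find_elements` (plural) returns an empty list when nothing matches instead of raising, so no exception handling is needed. `element_exists` and the `FakeDriver` stand-in are hypothetical names; the fake exists only so the example runs without a browser and assumes the Selenium 4 style `find_elements(by, value)` signature.

```python
def element_exists(driver, element_id):
    """Return True if at least one element with this ID is on the current page."""
    return len(driver.find_elements("id", element_id)) > 0


class FakeDriver:
    """Hypothetical stand-in that knows a fixed set of element IDs."""

    def __init__(self, present_ids):
        self._present_ids = set(present_ids)

    def find_elements(self, by, value):
        return [object()] if by == "id" and value in self._present_ids else []


driver = FakeDriver(present_ids={"result_par_arrangement"})
print(element_exists(driver, "result_par_arrangement"))  # True
print(element_exists(driver, "arrangement"))  # False
```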
