Convert Data To Dataframe In Python
With the help of @JaSON, here is code that gets the table data from a local HTML file. The code uses Selenium:
from selenium import webdriver
driver = webdriver.Ch
Solution 1:
I have modified your code to produce a simple output. This is not very Pythonic, as it does not use vectorized creation of the DataFrame, but here is how it works. First, set up pandas. Second, set up a DataFrame (we don't know the columns yet). Then set up the columns on the first pass (this will cause problems if rows have a variable number of columns). Finally, put the values into the DataFrame.
import pandas as pd
from selenium import webdriver

driver = webdriver.Chrome("C:/chromedriver.exe")
driver.get('file:///C:/Users/Future/Desktop/local.html')
counter = len(driver.find_elements_by_id("Section3"))
xpath = "//div[@id='Section3']/following-sibling::div[count(preceding-sibling::div[@id='Section3'])={0} and count(following-sibling::div[@id='Section3'])={1}]"
print(counter)
df = pd.DataFrame()
for i in range(counter):
    print('\nRow #{} \n'.format(i + 1))
    _xpath = xpath.format(i + 1, counter - (i + 1))
    cells = driver.find_elements_by_xpath(_xpath)
    if i == 0:
        df = pd.DataFrame(columns=cells)  # fill the dataframe with the column names
    for cell in cells:
        value = cell.find_element_by_xpath(".//td").text
        # print(value)
        if value:  # check the string is not empty
            # always putting the value in the first item
            df.at[i, 0] = value  # put the value in the frame
df.to_csv('filename.txt')  # output the dataframe to a file
This could be improved by putting the items of each row into a dictionary and then putting the dictionaries into the DataFrame, but I am writing this on my phone, so I cannot test that.
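The dictionary idea can be sketched roughly as follows. This is only an illustration on made-up data: the scraped_rows values and the col0, col1, ... key names are assumptions, standing in for the cell texts that Selenium would return.

```python
import pandas as pd

# Hypothetical scraped values: each inner list stands in for the cell texts of one row.
scraped_rows = [
    ["Alice", "30", "Cairo"],
    ["Bob", "25"],  # a shorter row, to show how ragged input is handled
]

records = []
for cells in scraped_rows:
    # Key each value by its position so that every row becomes a dictionary.
    records.append({"col{}".format(j): value for j, value in enumerate(cells)})

# Build the frame in one call; keys missing from a row become NaN.
df = pd.DataFrame(records)
print(df)
```

Building the frame once from a list of dictionaries avoids growing it cell by cell, and rows with fewer values are padded with NaN rather than raising an error.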
Solution 2:
With the great help of @Paul Brennan, I was able to modify the code to get the desired final output:
import pandas as pd
from selenium import webdriver

driver = webdriver.Chrome("C:/chromedriver.exe")
driver.get('file:///C:/Users/Future/Desktop/local.html')
counter = len(driver.find_elements_by_id("Section3"))
xpath = "//div[@id='Section3']/following-sibling::div[count(preceding-sibling::div[@id='Section3'])={0} and count(following-sibling::div[@id='Section3'])={1}]"
finallist = []
for i in range(counter):
    # print('\nRow #{} \n'.format(i + 1))
    rowlist = []
    _xpath = xpath.format(i + 1, counter - (i + 1))
    cells = driver.find_elements_by_xpath(_xpath)
    # if i == 0:
    #     df = pd.DataFrame(columns=cells)  # fill the dataframe with the column names
    for cell in cells:
        try:
            value = cell.find_element_by_xpath(".//td").text
            rowlist.append(value)
        except Exception:
            # stop reading this row at the first cell that has no <td>
            break
    finallist.append(rowlist)
df = pd.DataFrame(finallist)
# reorder the columns by position and keep the result
df = df[df.columns[[2, 0, 1, 7, 9, 8, 3, 5, 6, 4]]]
The code works well now but it is too slow. Is there a way to make it faster?
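One likely reason for the slowness is that every find_element call and every .text read is a separate round trip to the browser driver. A minimal sketch of one possible alternative, assuming the page does not change after it loads, is to read the HTML once (for example from driver.page_source) and parse it locally. The SectionRowParser class and the sample_html string below are hypothetical, since the real page layout is not shown, and the sketch assumes each cell div holds a single td.

```python
from html.parser import HTMLParser


class SectionRowParser(HTMLParser):
    """Collect <td> texts, starting a new row at every <div id="Section3">."""

    def __init__(self):
        super().__init__()
        self.rows = []        # one list of cell texts per Section3 marker
        self._in_td = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "div" and dict(attrs).get("id") == "Section3":
            self.rows.append([])          # marker div: begin a new row
        elif tag == "td" and self.rows:
            self._in_td = True            # only collect cells after a marker
            self._parts = []

    def handle_data(self, data):
        if self._in_td:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_td:
            self.rows[-1].append("".join(self._parts).strip())
            self._in_td = False


# Hypothetical stand-in for driver.page_source.
sample_html = """
<div id="Section3"></div>
<div><table><tr><td>A1</td></tr></table></div>
<div><table><tr><td>B1</td></tr></table></div>
<div id="Section3"></div>
<div><table><tr><td>A2</td></tr></table></div>
<div><table><tr><td>B2</td></tr></table></div>
"""

parser = SectionRowParser()
parser.feed(sample_html)
parser.close()
finallist = parser.rows
print(finallist)
```

The resulting finallist has the same list-of-rows shape as in the code above, so it could be passed to pd.DataFrame(finallist) in the same way.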