How To Fetch / Grab Polymer SPA Webpage By Using Python With Headless Server And No GUI

I'm trying to grab the content of the following URL: https://docs-05-dot-polymer-project.appspot.com/0.5/articles/demos/spa/final.html My goal is to grab the content (source code)

Solution 1:

I think you are missing something from the Selenium WebDriver docs. You can get the content of a dynamic page, but you have to make sure that the element you are searching for is present and visible on the page:

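from selenium import webdriver
from selenium.webdriver.common.by import By

# PhantomJS support was deprecated in later Selenium 3 releases and removed in
# Selenium 4, so this needs an older Selenium release with PhantomJS installed.
browser = webdriver.PhantomJS()
browser.get('https://docs-05-dot-polymer-project.appspot.com/0.5/articles/demos/spa/final.html')

# Get the content of the first slide
res1 = browser.find_element(By.XPATH, '//*[@id="pages"]/section[1]/div')

# Save a screenshot so you can see why it is failing (if it is)
browser.save_screenshot('screen_test.png')

# Print the text within the div
print(res1.text)

# Shut down the headless browser process
browser.quit()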
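
On a dynamic page the element may not be rendered yet when you look it up, so it can help to poll until it is actually visible. Below is a minimal sketch of that idea, not the only way to do it: Selenium ships its own WebDriverWait helper for this purpose, and the loop here just spells the logic out by hand. The function name `wait_for_visible` and the `timeout` / `poll` parameters are my own, and it assumes `browser` exposes `find_elements(by, value)` the way a Selenium driver does.

```python
import time


def wait_for_visible(browser, xpath, timeout=10.0, poll=0.5):
    """Poll until an element matching `xpath` is displayed, then return it.

    `browser` is assumed to behave like a Selenium WebDriver; the name and
    the timeout/poll defaults are illustrative, not part of Selenium.
    """
    deadline = time.monotonic() + timeout
    while True:
        # find_elements returns an empty list instead of raising when
        # nothing matches; "xpath" is the value of By.XPATH.
        for element in browser.find_elements("xpath", xpath):
            if element.is_displayed():
                return element
        if time.monotonic() >= deadline:
            raise TimeoutError("no visible element for %r after %ss" % (xpath, timeout))
        time.sleep(poll)
```

With the browser from the snippet above, `wait_for_visible(browser, '//*[@id="pages"]/section[1]/div').text` could stand in for the direct lookup. Note that this sketch does not handle an element going stale if the page re-renders between the lookup and the visibility check.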

If you also need the text of the other slides, use the webdriver to click whatever control makes the second slide visible before getting the text from it, because Selenium only returns the text of elements that are visible.
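
Here is a rough sketch of that click-then-read step. I have not checked which control switches slides on that page, so `nav_xpath` is a placeholder you would have to fill in yourself; the slide XPath just follows the pattern used for the first slide. The helper name `read_slide` and its parameters are hypothetical, and `browser` is assumed to be a Selenium driver.

```python
import time

# Same pattern as the first-slide XPath, with the section index left open.
SLIDE_XPATH = '//*[@id="pages"]/section[{n}]/div'


def read_slide(browser, n, nav_xpath, timeout=10.0, poll=0.5):
    """Click the control at `nav_xpath`, wait for slide `n`, return its text.

    `nav_xpath` is a placeholder: which element switches slides is an
    assumption that has to be checked against the real page.
    """
    # "xpath" is the value of By.XPATH
    browser.find_element("xpath", nav_xpath).click()

    slide_xpath = SLIDE_XPATH.format(n=n)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        slides = browser.find_elements("xpath", slide_xpath)
        # Read the text only once the slide is actually shown
        if slides and slides[0].is_displayed():
            return slides[0].text
        time.sleep(poll)
    raise TimeoutError("slide %d did not become visible" % n)
```

The wait after the click matters because the page animates between slides, so the second slide may not be visible the instant the click returns.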
