
How To Browse A Whole Website Using Selenium?

Is it possible to go through all the URIs of a given URL (website) using Selenium? My aim is to launch the Firefox browser using Selenium with a given URL of my choice (I know how to …

Solution 1:

You can use a recursive method in a class such as the one given below to do this.

import java.util.ArrayList;
import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class RecursiveLinkTest {
    // list to save visited links
    static List<String> linkAlreadyVisited = new ArrayList<String>();
    WebDriver driver;

    public RecursiveLinkTest(WebDriver driver) {
        this.driver = driver;
    }

    public void linkTest() {
        // Collect the link texts first: the WebElement references go stale
        // as soon as we navigate away, so we re-find each link by its text.
        List<String> linkTexts = new ArrayList<String>();
        for (WebElement link : driver.findElements(By.tagName("a"))) {
            // only consider links that are displayed and have text
            if (link.isDisplayed() && !link.getText().isEmpty()) {
                linkTexts.add(link.getText());
            }
        }
        for (String text : linkTexts) {
            // skip links that were already visited
            if (!linkAlreadyVisited.contains(text)) {
                linkAlreadyVisited.add(text);
                System.out.println(text);
                // click the link; this opens a new page
                driver.findElement(By.linkText(text)).click();
                // recurse into the new page
                new RecursiveLinkTest(driver).linkTest();
                // go back to the page we were crawling
                driver.navigate().back();
            }
        }
    }

    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver();
        driver.get("http://newtours.demoaut.com/");
        // start the recursive link test
        new RecursiveLinkTest(driver).linkTest();
        driver.quit();
    }
}

Hope this helps you.


Solution 2:

As Khyati mentions, it is possible; however, Selenium is not a web crawler or a robot. You have to know where and what you are trying to test.

If you really want to go down that path, I would recommend hitting the page, pulling back all of its elements, and then looping through to click any element that corresponds to navigation functionality (i.e. an "//a" or hyperlink click).

If you go down this path and a page opens another page that links back, you would want to keep a list of all visited URLs and make sure you don't process a page twice.

This would work, but it requires a bit of logic to make it happen, and you might find yourself in an endless loop if you aren't careful.
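The visited-URL bookkeeping described above can be sketched in Java; a HashSet makes the "have I seen this?" check constant-time, and its add method conveniently returns false for a duplicate, which is exactly the signal needed to break the endless loop. The class and method names here are illustrative, not from the answer:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the visited-URL tracking that prevents endless loops.
// Set.add returns false when the URL was already present, so the
// caller can simply skip any URL for which markVisited is false.
public class VisitedTracker {
    private final Set<String> visited = new HashSet<String>();

    // returns true only the first time a given URL is offered
    public boolean markVisited(String url) {
        return visited.add(url);
    }

    public int count() {
        return visited.size();
    }
}
```

In a crawl loop you would call markVisited before navigating, and only follow the link when it returns true.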


Solution 3:

I know you asked for a Python example, but I was just in the middle of setting up a simple repo for Protractor testing, and the task you want to accomplish seems very easy to do with Protractor (which is just a wrapper around WebDriver).

Here is the code in JavaScript:

describe( 'stackoverflow scraping', function () {
  var ptor = protractor.getInstance();

  beforeEach(function () {
    browser.ignoreSynchronization = true;
  } );

  afterEach(function () {

  } );

  it( 'should find the number of links in a given url', function () {
    browser.get( 'http://stackoverflow.com/questions/24257802/how-to-browse-a-whole-website-using-selenium' );

    var script = function () {
      var cb = arguments[ 0 ];
      var nodes = document.querySelectorAll( 'a' );
      nodes = [].slice.call( nodes ).map(function ( a ) {
        return a.href;
      } );
      cb( nodes );
    };

    ptor.executeAsyncScript( script ).then(function ( res ) {
      var visit = function ( url ) {
        console.log( 'visiting url', url );
        browser.get( url );
        return ptor.sleep( 1000 );
      };

      var doVisit = function () {
        var url = res.pop();
        if ( url ) {
          visit( url ).then( doVisit );
        } else {
          console.log( 'done visiting pages' );
        }
      };

      doVisit();

    } );
  } );

} );

You can clone the repo from here

Note: I know Protractor is probably not the best tool for this, but it was so simple that I just gave it a try.

I tested this with Firefox (you can use the firefox-conf branch for that, though it requires starting webdriver manually) and Chrome. If you're using OS X, this should work with no problem (assuming you have Node.js installed).


Solution 4:

The Selenium API provides facilities for all kinds of operations: typing, clicking, goto, navigateTo, switching between frames, drag and drop, and so on. What you are aiming to do is, in simple terms, just browsing: clicking and supplying different URLs within the website, if I understood properly. Yes, you can definitely do it via Selenium WebDriver. For better ease and readability you can also create a property file in which you pass different properties such as URLs and a base URI, and then run the automated tests via Selenium WebDriver in different browsers.
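The property-file idea mentioned above can be sketched with the standard java.util.Properties class. The file name "config.properties" and the "baseUrl" key are assumptions for illustration, not something the answer specifies:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

// Minimal sketch: reading a base URL from a .properties file so the
// test code does not hard-code it. A matching config.properties file
// would contain a line like: baseUrl=http://newtours.demoaut.com/
public class ConfigLoader {
    public static String loadBaseUrl(String path) throws IOException {
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream(path)) {
            props.load(in);
        }
        return props.getProperty("baseUrl");
    }

    public static void main(String[] args) throws IOException {
        String baseUrl = loadBaseUrl("config.properties");
        System.out.println("Base URL: " + baseUrl);
        // A WebDriver test would then start with: driver.get(baseUrl);
    }
}
```

Swapping browsers or environments then only means editing the properties file, not the test code.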


Solution 5:

This is possible. I have implemented this using the Java WebDriver and java.net.URL; it was mainly created to identify broken links.

Once a page is open, use WebDriver's findElements with the anchor tag to collect all the links and save their "href" values.

Check the status of each link using Java's URL class, and push the links onto a stack.

Then pop a link from the stack and "get" it using WebDriver. From that page, collect all the links again, removing any duplicates that are already on the stack.

Loop until the stack is empty.

You can adapt this to your requirements, such as limiting the depth of traversal or excluding links that are not on the given website's domain.

Please comment if you are finding difficulty in the implementation.
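The stack-based traversal and status check described above can be sketched as follows. The status check uses java.net.HttpURLConnection, which is one common way to use Java's URL class for this; the WebDriver part is stubbed out in a comment, and the class name and start URL are illustrative:

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Sketch of the stack-based broken-link check described above.
public class LinkChecker {
    // returns the HTTP status code, or -1 if the URL is malformed
    // or unreachable
    public static int checkStatus(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("HEAD");
            return conn.getResponseCode();
        } catch (Exception e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        Deque<String> stack = new ArrayDeque<String>();
        Set<String> seen = new HashSet<String>();
        stack.push("http://newtours.demoaut.com/");
        seen.add("http://newtours.demoaut.com/");
        while (!stack.isEmpty()) {
            String url = stack.pop();
            System.out.println(url + " -> " + checkStatus(url));
            // With WebDriver: driver.get(url), then for each <a> element,
            // push its href onto the stack if seen.add(href) returns true,
            // which skips duplicates already collected.
        }
    }
}
```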

