
Spider Not Returning All Results After Changing My Item Pipelines To An If And Elif Statement

If you look here, I could not get two different spiders to automatically add their results to a MySQL database. Now I've added an if and elif statement and they work, but they miss ou

Solution 1:

I reproduced the issue with SQLite3, and the number of errors in the Scrapy log matched the number of missing entries. The errors were caused by unescaped single quotes in the BristolQualification item field, such as the one in d'Etudes in the snippet below (the Bath spider presumably suffers from the same problem):

Candidates holding a Dipl\xf4me de Technicien Sup\xe9rieur / Sciences Appliqu\xe9es with suitable grades or those with the Dipl\xf4me d'Etudes Universitaires G\xe9n\xe9rales (DEUG) with good grades in suitable subjects will be considered for appropriate undergraduate courses.
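The failure can be reproduced outside Scrapy. The sketch below is only an illustration: it uses an in-memory SQLite database and a hypothetical `demo` table, and it assumes the failing INSERT was built by interpolating the text into the SQL string, which is not shown in this post. It contrasts that approach with passing the value as a query parameter.

```python
import sqlite3

# Shortened sample text containing the problematic apostrophe.
text = u"Dipl\u00f4me d'Etudes Universitaires G\u00e9n\u00e9rales (DEUG)"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (qualification TEXT)")  # hypothetical table

# Interpolating the value into the SQL text: the apostrophe in d'Etudes
# ends the string literal early and the statement no longer parses.
try:
    conn.execute("INSERT INTO demo (qualification) VALUES ('%s')" % text)
    interpolated_ok = True
except sqlite3.OperationalError:
    interpolated_ok = False

# Passing the value as a parameter: the driver takes care of the quoting.
conn.execute("INSERT INTO demo (qualification) VALUES (?)", (text,))
conn.commit()

stored = [row[0] for row in conn.execute("SELECT qualification FROM demo")]
```

Only the parameterized insert stores a row, which would be consistent with each logged error corresponding to one missing entry.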

I got it working (at least with SQLite3) by splitting the join and the encoding of the qualification item field into separate steps. The code below should work, but please note that it is untested with MySQL. If any errors occur, check the errors in the Scrapy log and let me know.

import MySQLdb  # provides MySQLdb.Error, caught below

def process_item(self, item, spider):
    try:
        # '?' is the sqlite3 placeholder style; MySQLdb expects '%s' instead.
        if 'BristolQualification' in item:
            # Join the extracted fragments into a single string.
            qualification = ''.join(s for s in item['BristolQualification'])
            # encode() returns a new object; the result is not kept here, so the
            # joined text itself is what gets passed to the query.
            qualification.encode('utf8')
            self.cursor.execute("INSERT INTO Bristol(BristolCountry, BristolQualification) VALUES (?, ?)", (item['BristolCountry'], qualification))
        elif 'BathQualification' in item:
            qualification = ''.join(s for s in item['BathQualification'])
            qualification.encode('utf8')
            self.cursor.execute("INSERT INTO Bath(BathCountry, BathQualification) VALUES (?, ?)", (item['BathCountry'], qualification))
        self.conn.commit()
        return item

    except MySQLdb.Error as e:
        print("Error %d: %s" % (e.args[0], e.args[1]))
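To check that no rows go missing without running the spiders, the same if/elif logic can be exercised against an in-memory SQLite database. This is a minimal sketch, not the original pipeline: the class name `QualificationPipeline`, the two-column table schemas, and the sample items are all assumptions made for illustration, and plain dicts stand in for Scrapy items.

```python
import sqlite3


class QualificationPipeline(object):  # hypothetical name
    """Stand-in for the pipeline, backed by in-memory SQLite."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.cursor = self.conn.cursor()
        # Assumed schema: two text columns per table.
        self.cursor.execute(
            "CREATE TABLE Bristol (BristolCountry TEXT, BristolQualification TEXT)")
        self.cursor.execute(
            "CREATE TABLE Bath (BathCountry TEXT, BathQualification TEXT)")

    def process_item(self, item, spider):
        try:
            if 'BristolQualification' in item:
                qualification = u''.join(item['BristolQualification'])
                self.cursor.execute(
                    "INSERT INTO Bristol(BristolCountry, BristolQualification) VALUES (?, ?)",
                    (item['BristolCountry'], qualification))
            elif 'BathQualification' in item:
                qualification = u''.join(item['BathQualification'])
                self.cursor.execute(
                    "INSERT INTO Bath(BathCountry, BathQualification) VALUES (?, ?)",
                    (item['BathCountry'], qualification))
            self.conn.commit()
        except sqlite3.Error as e:
            # Report the failure instead of silently losing the row.
            print("Error: %s" % e)
        return item


# Made-up sample items; both contain characters that broke unescaped SQL.
items = [
    {'BristolCountry': u'France',
     'BristolQualification': [u"Dipl\u00f4me d'Etudes ", u"Universitaires"]},
    {'BathCountry': u'France',
     'BathQualification': [u"Baccalaur\u00e9at g\u00e9n\u00e9ral"]},
]

pipeline = QualificationPipeline()
for item in items:
    pipeline.process_item(item, spider=None)

bristol_rows = pipeline.cursor.execute(
    "SELECT BristolQualification FROM Bristol").fetchall()
bath_rows = pipeline.cursor.execute(
    "SELECT BathQualification FROM Bath").fetchall()
```

Comparing the row counts with the number of items fed in gives a quick check that nothing was dropped. For MySQLdb the placeholders would need to be `%s` rather than `?`.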
