
Python Pandas: Error Tokenizing Data. C Error: EOF Inside String Starting When Reading 1 GB CSV File

I'm reading a 1 GB CSV file in chunks of 10,000 rows. The file has 1106012 rows and 171 columns. Other, smaller files do not show any error and finish successfully, but wh

Solution 1:

If you are on Linux, try removing all non-printable characters from the file, then load the cleaned file:

tr -dc '[:print:]\n' < file > newfile
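For anyone not on Linux, the same cleanup can be sketched in Python. This is a rough equivalent of the `tr` command above, not something from the original answer: the function name `strip_nonprintable` and the `chunk_size` parameter are made up for illustration, and it assumes the C-locale meaning of `[:print:]` (ASCII 0x20 to 0x7E). Like the `tr` command, it also drops tabs, carriage returns, and every non-ASCII byte, so it is probably not suitable for tab-separated files or for data containing accented characters.

```python
# Bytes to keep: printable ASCII (0x20-0x7E) plus newline,
# mirroring tr's '[:print:]\n' set under the C locale (an assumption).
KEEP = bytes(range(0x20, 0x7F)) + b"\n"
DELETE = bytes(b for b in range(256) if b not in KEEP)


def strip_nonprintable(src_path, dst_path, chunk_size=1 << 20):
    """Copy src_path to dst_path, dropping every byte not in KEEP.

    The file is processed in fixed-size binary chunks, so a 1 GB input
    never has to fit in memory at once.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            # translate(None, DELETE) removes the listed bytes and keeps the rest.
            dst.write(chunk.translate(None, DELETE))
```

Calling `strip_nonprintable("file", "newfile")` would then play the role of the shell one-liner, after which `newfile` can be passed to `pd.read_csv`.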

Solution 2:

I tried many solutions. Some of them worked but affected the calculations, so I used this one, which skips the line that is causing the error:

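import pandas as pd

# on_bad_lines='skip' replaces error_bad_lines=False, which was removed in pandas 2.0 (use the old spelling before 1.3).
# engine='python' gave better results here than the default C engine.
pd.read_csv(file, engine='python', on_bad_lines='skip')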
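Since skipping lines changes the data that reaches the calculations, it may help to look at the suspect lines before dropping them. "EOF inside string" means the parser reached the end of the file while still inside a quoted field, which usually points to a quote that was opened and never closed. The sketch below is not from the answers above. It is a simple heuristic that assumes the default quote character `"` and UTF-8 text, and the name `find_unbalanced_quote_lines` and its `limit` parameter are hypothetical. It reports physical lines containing an odd number of quote characters.

```python
def find_unbalanced_quote_lines(path, quotechar='"', limit=20):
    """Return up to `limit` 1-based line numbers with an odd number of quotechar.

    Escaped quotes written as "" count twice and so do not trigger a hit.
    A quoted field that legitimately spans several lines will be reported
    on both its first and last line, so treat hits as candidates only.
    """
    hits = []
    # errors="replace" keeps the scan going even if some bytes are not valid UTF-8.
    with open(path, encoding="utf-8", errors="replace") as f:
        for line_number, line in enumerate(f, start=1):
            if line.count(quotechar) % 2 == 1:
                hits.append(line_number)
                if len(hits) >= limit:
                    break
    return hits
```

The returned numbers are physical line numbers in the file, which may not match the row number printed in the pandas error message if earlier fields contain embedded newlines.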
