
Python Read Csv File Columns Into Lists, Ignoring Headers

I have a file 'data.csv' that looks something like

ColA, ColB, ColC
1,2,3
4,5,6
7,8,9

I want to open and read the file columns into lists, with the 1st entry of that list omitted,

Solution 1:

Just to spell this out for people trying to solve a similar problem, perhaps without Pandas, here's a simple refactoring with comments.

import csv

dataA = []
dataB = []
dataC = []

# Open the file in text mode ('r', not 'rb'); newline='' leaves line endings to csv.
# The with block closes the file when reading is done.
with open('data.csv', 'r', newline='') as csv_file:
    # Read off and discard first line, to skip headers
    csv_file.readline()

    # Split columns while reading
    for a, b, c in csv.reader(csv_file, delimiter=','):
        # Append each variable to a separate list
        dataA.append(a)
        dataB.append(b)
        dataC.append(c)

This does nothing to convert the individual fields to numbers (use append(int(a)) etc if you want that) but should hopefully be explicit and flexible enough to show you how to adapt this to new requirements.
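If the file does not always have exactly three columns, the same loop can be written once for any width. The following is a minimal sketch rather than part of the original answer: `read_columns` is a hypothetical helper name, and it assumes every data field is an integer and that a blank may follow each comma in the header, as in the sample file.

```python
import csv


def read_columns(path):
    """Return {header_name: [int, ...]} for a CSV file whose first row is a header."""
    with open(path, 'r', newline='') as csv_file:
        # skipinitialspace drops the blank after each comma ('ColA, ColB')
        reader = csv.reader(csv_file, skipinitialspace=True)
        header = next(reader)                  # consume the header row
        columns = {name: [] for name in header}
        for row in reader:
            for name, field in zip(header, row):
                columns[name].append(int(field))   # assumes integer data
    return columns
```

With the sample file, `read_columns('data.csv')['ColA']` would be `[1, 4, 7]`.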


Solution 2:

Use Pandas:

import pandas as pd

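# DataFrame.from_csv was removed in pandas 1.0; read_csv replaces it.
# skipinitialspace strips the blank after each comma in the header.
df = pd.read_csv('data.csv', skipinitialspace=True)
# One list per column; the header row becomes the column names, not data
dataA, dataB, dataC = (df[name].tolist() for name in df.columns)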

Solution 3:

To skip the header, create the reader first and call next() on it once. Then, to convert from a list of rows to a list of columns, use zip():

import csv

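# Python 3: open in text mode with newline='' (not 'rb')
with open('data.csv', 'r', newline='') as f_input:
    csv_input = csv.reader(f_input)
    header = next(csv_input)
    # zip() is lazy in Python 3, so wrap it in list() to allow indexing
    data = list(zip(*[map(int, row) for row in csv_input]))

print(data)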

Giving you:

[(1, 4, 7), (2, 5, 8), (3, 6, 9)]

So if needed:

dataA = data[0]
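
All three columns can also be unpacked in one step. This is a small sketch, assuming exactly three columns and the sample contents from the question; the data is held in an in-memory `io.StringIO` only so the snippet runs without a file on disk.

```python
import csv
import io

# Stand-in for open('data.csv', newline=''); contents copied from the question
f_input = io.StringIO('ColA, ColB, ColC\n1,2,3\n4,5,6\n7,8,9\n')

csv_input = csv.reader(f_input)
header = next(csv_input)  # skip the header row

# zip(*rows) transposes rows into columns; list() turns each tuple into a list
int_rows = (map(int, row) for row in csv_input)
dataA, dataB, dataC = (list(col) for col in zip(*int_rows))

print(dataA)  # [1, 4, 7]
```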

Solution 4:

Seems like you have OSX line endings in your csv file. Try saving the csv file as "Windows Comma Separated (.csv)" format.
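
In Python 3, re-saving the file is often unnecessary: the csv module documentation recommends opening files with newline='', and the reader treats either '\r' or '\n' as the end of a row. Here is a sketch under that assumption; `mac_style.csv` is a made-up file name, written to a temporary directory only for the demonstration.

```python
import csv
import os
import tempfile

# Write a sample file with bare '\r' line endings (assumed for illustration)
path = os.path.join(tempfile.mkdtemp(), 'mac_style.csv')
with open(path, 'wb') as f:
    f.write(b'ColA,ColB,ColC\r1,2,3\r4,5,6\r7,8,9\r')

# newline='' passes the line endings to the csv module untranslated
with open(path, 'r', newline='') as csv_file:
    rows = list(csv.reader(csv_file))

print(rows[1])  # ['1', '2', '3']
```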

There are also easier ways to do what you're doing with the csv reader:

import csv

csv_array = []
with open('data.csv', 'r', newline='') as csv_file:
    reader = csv.reader(csv_file)
    # remove headers (reader.next() only exists in Python 2)
    next(reader)
    # loop over rows in the file, append them to your array. each row is already formatted as a list.
    for row in reader:
        csv_array.append(row)

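Note that csv_array[0] is the first data row, ['1', '2', '3'], not the first column. To get column A, set dataA = [row[0] for row in csv_array].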


Solution 5:

First, if you read the CSV file with csv.reader(csv_file, delimiter=','), you will still read the header unless you skip it explicitly.

csv_array[0] will be the header row -> ['ColA', ' ColB', ' ColC']
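
The leading spaces in ' ColB' and ' ColC' come from the blank after each comma in the header line. The sketch below shows how the reader's `skipinitialspace` option removes them. It uses the question's sample contents in an `io.StringIO`, an assumption made only to keep the snippet self-contained.

```python
import csv
import io

sample = 'ColA, ColB, ColC\n1,2,3\n4,5,6\n7,8,9\n'

# Default behaviour: the header is the first row and keeps the leading blanks
plain = list(csv.reader(io.StringIO(sample)))
print(plain[0])  # ['ColA', ' ColB', ' ColC']

# skipinitialspace=True ignores whitespace immediately after each delimiter
reader = csv.reader(io.StringIO(sample), skipinitialspace=True)
header = next(reader)     # ['ColA', 'ColB', 'ColC']
csv_array = list(reader)  # data rows only
print(csv_array[0])  # ['1', '2', '3']
```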

Also, if you're on a Mac, this issue is already covered here: CSV new-line character seen in unquoted field error

I would also recommend using pandas and NumPy instead if you plan to do more analysis on the data. pandas.read_csv reads the CSV file into a pandas DataFrame: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

