
Formatting Csv File With Python

I have a csv file with the following structure:

'txNomeParlamentar';'ideCadastro';'nuCarteiraParlamentar';'nuLegislatura';'sgUF'
'AVANTE;1;1;2015;PP'

I need it to look like this:

Solution 1:

First, create the reader with delimiter=";" and quoting=csv.QUOTE_NONE. This properly splits your second line, which is a single quoted string that contains the delimiter and that you want split into fields. We then strip the quotation marks from the data (otherwise the fields would still carry literal quote characters, like '"txNomeParlamentar"').

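import csv

# Read with quoting disabled so the quoted second line is split on ";"
with open('file.txt', newline='') as f:
    reader = csv.reader(f, delimiter=";", quoting=csv.QUOTE_NONE)
    # Strip the literal quote characters left in each field
    data = [[field.replace('"', '') for field in row] for row in reader]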

Then we write the file back out with delimiter=";" and quoting=csv.QUOTE_ALL, which ensures each item is enclosed in quotes:

with open('out.txt', 'w', newline='') as o:
    writer = csv.writer(o, delimiter=";", quoting=csv.QUOTE_ALL)
    writer.writerows(data)

Input:

"txNomeParlamentar";"ideCadastro";"nuCarteiraParlamentar";"nuLegislatura";"sgUF"
"AVANTE;1;1;2015;PP"

Output:

"txNomeParlamentar";"ideCadastro";"nuCarteiraParlamentar";"nuLegislatura";"sgUF"
"AVANTE";"1";"1";"2015";"PP"
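
If you want to try this approach without touching any files, here is a minimal in-memory sketch of the same two steps. The helper name requote and the use of io.StringIO are my own assumptions for illustration, not part of the answer above. It also sets lineterminator="\n", whereas csv.writer defaults to "\r\n".

```python
import csv
import io


def requote(text, delimiter=";"):
    """Split every line on the delimiter, drop stray quotes, quote each field."""
    # QUOTE_NONE makes the reader treat '"' as an ordinary character,
    # so the fully quoted data line is still split on the delimiter.
    reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                        quoting=csv.QUOTE_NONE)
    rows = [[field.replace('"', '') for field in row] for row in reader]

    out = io.StringIO()
    # QUOTE_ALL wraps every field in quotes on the way out.
    writer = csv.writer(out, delimiter=delimiter, quoting=csv.QUOTE_ALL,
                        lineterminator="\n")
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical stand-in for the contents of file.txt
sample = (
    '"txNomeParlamentar";"ideCadastro";"nuCarteiraParlamentar";"nuLegislatura";"sgUF"\n'
    '"AVANTE;1;1;2015;PP"\n'
)

result = requote(sample)
print(result)
```

Each field of the data line should come back individually quoted, matching the Output shown above.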

Solution 2:

A couple of things. First, strictly speaking you do not have a csv file, because csv stands for comma-separated values and your delimiter is a semicolon. I'm assuming you want the values in your data file to (1) remain separated by semicolons (why not fix that and use commas?) and (2) each be wrapped in quotation marks.

If so, I think this will work:

# data reader

in_file = 'data.txt'
out_file = 'fixed.txt'

output = open(out_file, 'w')
with open(in_file, 'r') as source:
    for line in source:
        # split by semicolon
        data = line.strip().split(';')             
        # remove all quotes found
        data = [t.replace('"','') for t in data]   
        for item in data[:-1]:
            output.write(''.join(['"', item, '"',';']))
        # write the last item separately, without the trailing ';'
        output.write(''.join(['"', data[-1], '"']))
        output.write('\n')
output.close()

If the consumer of this file is Python, you should consider replacing the semicolons with commas (the standard csv format) and dropping the quotes. By default, everything Python reads from a csv file comes in as a string anyhow.
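
To illustrate that last suggestion, here is a rough sketch that rewrites the data as plain comma-separated values and reads it back. The helper name to_comma_csv and the in-memory sample are assumptions made for the example. It relies on the csv module rather than the manual string handling above, so a field that itself contains a comma would still get quoted on output.

```python
import csv
import io


def to_comma_csv(text):
    """Convert semicolon-delimited, quote-laden text to plain comma-separated text."""
    # Quoting is disabled on input so every ';' acts as a separator.
    reader = csv.reader(io.StringIO(text), delimiter=";",
                        quoting=csv.QUOTE_NONE)
    out = io.StringIO()
    # Defaults: delimiter "," and quoting only where a field requires it.
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([field.replace('"', '') for field in row])
    return out.getvalue()


# Hypothetical stand-in for the contents of data.txt
sample = (
    '"txNomeParlamentar";"ideCadastro";"nuCarteiraParlamentar";"nuLegislatura";"sgUF"\n'
    '"AVANTE;1;1;2015;PP"\n'
)

converted = to_comma_csv(sample)
print(converted)

# Reading it back: every value arrives as a str, including the numbers
rows = list(csv.reader(io.StringIO(converted)))
print(rows[1])
```

If you need the numeric columns as integers, you would convert them yourself after reading, for example with int(rows[1][1]).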

Solution 3:

Using the csv module, for example:

import csv

with open(filename, newline="") as csvfile:
    reader = csv.reader(csvfile, delimiter=";")
    headers = next(reader)    # Read headers
    # Format data: drop the line ending and the outer quotes, then split
    data = [row.strip().strip('"').split(";") for row in csvfile]

with open(filename, "w", newline="") as csvfile_out:
    writer = csv.writer(csvfile_out, delimiter=";")
    writer.writerow(headers)   #Write Headers
    writer.writerows(data)     #Write data

Solution 4:

You could use the csv module to do it if you massage the input data a little first.

import csv


#input_csv = '/home/gustavo/Downloads/Redes/Despesas/csvfile.csv'
input_csv = 'gustavo_input.csv'
output_csv = 'dados_2018_1.csv'

with open(input_csv, 'r', newline='') as input_fd, \
     open(output_csv, 'w', newline='') as output_fd:

    reader = csv.DictReader(input_fd, delimiter=';')
    writer = csv.DictWriter(output_fd, delimiter=';',
                            fieldnames=reader.fieldnames,
                            quoting=csv.QUOTE_ALL)
    # Write the quoted header row; the loop below only writes data rows
    writer.writeheader()

    first_field = reader.fieldnames[0]
    for row in reader:
        fields = row[first_field].split(';')
        newrow = dict(zip(reader.fieldnames, fields))
        writer.writerow(newrow)

print('done')
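
In case it is not obvious why the split on the first field works, here is a small sketch of what DictReader hands back for the data line. The in-memory sample is an assumption standing in for the real input file.

```python
import csv
import io

# Hypothetical stand-in for the input file from the question
sample = (
    '"txNomeParlamentar";"ideCadastro";"nuCarteiraParlamentar";"nuLegislatura";"sgUF"\n'
    '"AVANTE;1;1;2015;PP"\n'
)

reader = csv.DictReader(io.StringIO(sample), delimiter=';')
row = next(reader)
first_field = reader.fieldnames[0]

# The quoted data line is parsed as ONE value, stored under the first
# field name; the remaining field names are filled with None.
print(row[first_field])
print(row['ideCadastro'])

# Splitting that value and zipping it with the field names rebuilds the record
newrow = dict(zip(reader.fieldnames, row[first_field].split(';')))
print(newrow)
```

Because the whole data line is wrapped in a single pair of quotes, the reader treats the semicolons inside it as ordinary text, which is the "massaging" that the loop above undoes.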
