Skip to content Skip to sidebar Skip to footer

How To Delete And Replace Columns In A Csv File By Comparing It To Other Csv Files In Python?

I am writing a python code to search,delete and replace columns in a csv file I have 3 files. Input.csv: aaaaaaaa,bbbbbb,cccccc,ddddddd eeeeeeee,ffffff,gggggg,hhhhhhh iiiiiiii,jjj

Solution 1:

I would suggest changing this up a bit:

  • read the things you want to replace in a dictionary
    • set key to what is in your datas 0th spot, set value as what to replace the 0th and 1st spot of your data with
  • read the things you want to delete into a set
    • if your data-row start with it: skip row else add it to the output.

Loop over your data and use the both lookups to "do the right thing".

I changed your data a bit to incorperate the mentioned "escaped" data including newlines:

File creation:

with open("i.csv","w") as f: 
    f.write("""
aaaaaaaa,bbbbbb,cccccc,ddddddd
eeeeeeee,ffffff,gggggg,hhhhhhh
iiiiiiii,jjjjjj,kkkkkk,lllllll
"mmmm
mmmm",nnnnnn,oooooo,ppppppp
qqqqqqqq,rrrrrr,ssssss,ttttttt
uuuuuuuu,vvvvvv,wwwwww,xxxxxxx""")

with open ("d.csv","w") as f: 
    f.write("""
aaaaaaaa
eeeeeeee
uuuuuuuu""")

with open ("r.csv","w") as f: 
    f.write("""
iiiiiiii,11111111,22222222
"mmmm
mmmm",33333333,44444444""")

Programm:

import csv

def read_file(fn):
    rows = [] 
    with open(fn) as f:
        reader = csv.reader(f, quotechar='"',delimiter=",")
        for row in reader:
            if row:                     # eliminate empty rows from data read
                rows.append(row)
    return rows 

# create a dict for the replace stuff        
replace = {x[0]:x[1:] for x in read_file("r.csv")}

# create a set for the delete stuff
delete = set( (row[0] for row in read_file("d.csv")) )  

# collect what we need to write back
result = []

# https://docs.python.org/3/library/csv.html
with open("i.csv") as f:
    reader = csv.reader(f, quotechar='"')
    for row in reader:
        if row:
            if row[0] in delete:
                continue                                   # skip data row
            elif row[0] in replace:
                # replace with mapping, add rest of row
                result.append(replace[row[0]] + row[2:])   # replace data
            else:
                result.append(row)                         # use as is

# write result back into file
with open ("done.csv", "w", newline="") as f:
    w = csv.writer(f,quotechar='"', delimiter= ",")
    w.writerows(result)

Check result:

with open ("done.csv") as f:
    print(f.read()) 

Output:

11111111,22222222,kkkkkk,lllllll
33333333,44444444,oooooo,ppppppp
qqqqqqqq,rrrrrr,ssssss,ttttttt

Doku:


Post a Comment for "How To Delete And Replace Columns In A Csv File By Comparing It To Other Csv Files In Python?"