
How To Iterate Over Arbitrary Number Of Files In Parallel In Python?

I have a list of file objects in a list called paths. I'd like to be able to go through and read the first line of each file, do something with this n-tuple of data, then move on to the second line of each file, and so on.

Solution 1:

import itertools

for line_tuple in itertools.izip(*files):
    # line_tuple holds the current line from each file
    whatever()

I'd use zip, but in Python 2 that would read the entire contents of the files into memory (in Python 3, the builtin zip is already lazy, so it behaves like izip). Note that files should be a list of file objects; I'm not sure what you mean by "list of file handlers".
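For reference, here is what the same pattern looks like in Python 3, where the builtin zip yields one tuple at a time; a minimal sketch, with "a.txt" and "b.txt" standing in for your actual paths:

# Python 3: zip is lazy, so this reads one line from each file per iteration
with open("a.txt") as f1, open("b.txt") as f2:
    for line_tuple in zip(f1, f2):
        print(line_tuple)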

Solution 2:

This depends on how "arbitrary" it actually is. As long as the number of files is below your OS's limit on simultaneously open files, itertools.izip should work just fine (or itertools.izip_longest, as appropriate).

import itertools

files = [open(f) for f in filenames]
for lines in itertools.izip(*files):
    pass  # do something with the tuple of corresponding lines

for f in files:
    f.close()
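As a side note, if an exception is raised mid-loop, the explicit f.close() calls above never run. One way to guarantee cleanup is contextlib.ExitStack (Python 3.3+); a sketch, again assuming filenames is a list of paths:

import itertools
from contextlib import ExitStack

with ExitStack() as stack:
    # enter_context registers each file, so ExitStack closes
    # all of them when the with-block exits, even on error
    files = [stack.enter_context(open(f)) for f in filenames]
    for lines in itertools.zip_longest(*files, fillvalue=""):
        pass  # do something with the tuple of lines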

If you can have more files than your OS will allow you to open, then you're out of luck (at least as far as an easy solution is concerned).
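On Unix-like systems you can at least check that limit up front; a sketch using the standard resource module (not available on Windows), with filenames assumed to be your list of paths:

import resource

# RLIMIT_NOFILE is the per-process limit on open file descriptors
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
if len(filenames) >= soft:
    raise RuntimeError("cannot open %d files at once (limit is %d)"
                       % (len(filenames), soft))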

Solution 3:

The first idea that popped into my mind is the following code; it seems straightforward enough.

fp_list = []
for path in path_array:
    fp = open(path)
    fp_list.append(fp)

line_list = []
for fp in fp_list:
    line = fp.readline()
    line_list.append(line)

# your code here: process line_list

for fp in fp_list:
    fp.close()
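If the goal is to walk through every line rather than just the first, the readline step above can be repeated until one of the files runs out; a sketch, still assuming fp_list holds the open file objects:

while True:
    line_list = [fp.readline() for fp in fp_list]
    # readline() returns "" at end of file, so stop once any file is exhausted
    if any(line == "" for line in line_list):
        break
    # process line_list here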
