
How Can I Traverse A File System With A Generator?

I'm trying to create a utility class for traversing all the files in a directory, including those within subdirectories and sub-subdirectories. I tried to use a generator because g

Solution 1:

Why reinvent the wheel when you can use os.walk?

import os

# path is the directory to start from
for root, dirs, files in os.walk(path):
    for name in files:
        print(os.path.join(root, name))

os.walk is a generator that yields a (root, dirs, files) tuple for each directory in the tree, walking the tree either top-down or bottom-up.
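
If you want a generator of your own rather than a bare loop, a minimal sketch could wrap os.walk like this. The name iter_files and the skip_hidden_dirs flag are my own inventions for illustration, not part of the os module:

```python
import os


def iter_files(top, topdown=True, skip_hidden_dirs=False):
    """Lazily yield the full path of every file under `top`."""
    # os.walk is itself a generator, so directories are only read on demand.
    for root, dirs, files in os.walk(top, topdown=topdown):
        if topdown and skip_hidden_dirs:
            # Editing `dirs` in place prunes the walk. This only has an
            # effect in top-down mode, because in bottom-up mode the
            # subdirectories have already been visited.
            dirs[:] = [d for d in dirs if not d.startswith(".")]
        for name in files:
            yield os.path.join(root, name)
```

Calling iter_files(".") returns immediately; the tree is only walked as you iterate over the result.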

Solution 2:

I agree with the os.walk solution.

Purely for the sake of pedantry: iterate over the generator object returned by the recursive call and yield each entry, instead of returning the generator directly:


import os


def grab_files(directory):
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        if os.path.isdir(full_path):
            # Re-yield every entry produced by the recursive generator
            for entry in grab_files(full_path):
                yield entry
        elif os.path.isfile(full_path):
            yield full_path
        else:
            print('Unidentified name %s. It could be a symbolic link' % full_path)

Solution 3:

As of Python 3.4, you can use the glob() method from the built-in pathlib module:

import pathlib
p = pathlib.Path('.')
list(p.glob('**/*'))    # lists all files and directories recursively
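
Since the question asks for a generator, it may be worth not wrapping the result in list(). A small sketch, assuming you only want regular files; iter_matching_files is a made-up name, while rglob() and is_file() are real pathlib methods:

```python
import pathlib


def iter_matching_files(top, pattern="*"):
    """Lazily yield regular files under `top` whose names match `pattern`."""
    # Path.rglob(pattern) is like Path.glob("**/" + pattern) and is lazy.
    for path in pathlib.Path(top).rglob(pattern):
        if path.is_file():  # skip directories, which the pattern also matches
            yield path
```

For example, iter_matching_files('.', '*.py') would walk the tree lazily and yield only the Python files.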

Solution 4:

Starting with Python 3.4, you can use the pathlib module:

In [48]: def alliter(p):
   ....:     yield p
   ....:     for sub in p.iterdir():
   ....:         if sub.is_dir():
   ....:             yield from alliter(sub)
   ....:         else:
   ....:             yield sub
   ....:

In [49]: g = alliter(pathlib.Path("."))

In [50]: [next(g) for _ in range(10)]
Out[50]: 
[PosixPath('.'),
 PosixPath('.pypirc'),
 PosixPath('.python_history'),
 PosixPath('lshw'),
 PosixPath('.gstreamer-0.10'),
 PosixPath('.gstreamer-0.10/registry.x86_64.bin'),
 PosixPath('.gconf'),
 PosixPath('.gconf/apps'),
 PosixPath('.gconf/apps/gnome-terminal'),
 PosixPath('.gconf/apps/gnome-terminal/%gconf.xml')]

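This is essentially the object-oriented version of sjthebats answer. Note that Path.glob with a bare "**" pattern returns only directories! (That is the behaviour before Python 3.13; from 3.13 on, a pattern ending in "**" returns files as well.)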

Solution 5:

os.scandir() is a function that "returns directory entries along with file attribute information, giving better performance [than os.listdir()] for many common use cases." It returns an iterator and does not use os.listdir() internally.
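
A recursive generator built on it might look roughly like the following. This is only a sketch: scan_files is a name I made up, and skipping symbolic links is my own choice (it avoids following link loops), so adjust follow_symlinks if you want different behaviour:

```python
import os


def scan_files(top):
    """Recursively yield the path of every regular file under `top`."""
    # The with-statement closes the scandir iterator promptly (Python 3.6+).
    with os.scandir(top) as entries:
        for entry in entries:
            # follow_symlinks=False means symbolic links match neither
            # branch below, so they are silently skipped.
            if entry.is_dir(follow_symlinks=False):
                yield from scan_files(entry.path)
            elif entry.is_file(follow_symlinks=False):
                yield entry.path
```

The is_dir() and is_file() checks on the entries usually do not need an extra system call, which is where the speed-up over os.listdir() plus os.path.isdir() comes from.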
