python - Matching filenames against a list in a loop, ignoring already "processed" files
What I want is to match a set of files and pick out the ones I want (by matching extension), while ignoring any I have already processed, using a list.
What I've come up with so far is:
    import fnmatch
    import os

    mylist = []
    extensions = ['*.txt', '*.foo', '*.bar']
    for dirpath, dirnames, filenames in os.walk(directory):
        skip = None
        for ext in extensions:
            for filename in fnmatch.filter(filenames, ext):
                for test in mylist:
                    if test == filename:
                        skip = True
                if not skip:
                    # do the thing
                    mylist.append(filename)

But it seems to be ignoring the if test statement. Am I going blind?
You are setting skip = True but never resetting it for the next filename, so once you have skipped one filename, the rest in that directory are skipped too. Moreover, a simple if filename not in mylist would have sufficed; there is no need for an explicit loop.
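As a minimal sketch of that simpler membership test, the snippet below keeps the list but drops the flag and the inner loop. The filenames are hypothetical stand-ins for what os.walk() might yield, so the example runs without touching the disk.

```python
import fnmatch

# Hypothetical filenames, including repeats, standing in for os.walk() output.
filenames = ['a.txt', 'b.foo', 'a.txt', 'c.bar', 'notes.md', 'b.foo']
extensions = ['*.txt', '*.foo', '*.bar']

mylist = []
for ext in extensions:
    for filename in fnmatch.filter(filenames, ext):
        # The membership test replaces both the explicit loop and the skip flag.
        if filename not in mylist:
            # do the thing
            mylist.append(filename)

print(mylist)
```

Each repeated name is handled only once, and there is no state left over from one filename to the next.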
However, you want to use a set here for fast membership testing, and you can simplify the logic in that case:
    import fnmatch
    import os

    seen = set()
    extensions = ['*.txt', '*.foo', '*.bar']
    for dirpath, dirnames, filenames in os.walk(directory):
        for ext in extensions:
            for filename in fnmatch.filter(filenames, ext):
                if filename not in seen:
                    # do the thing
                    seen.add(filename)

Next, you can get rid of the fnmatch.filter() call here; using .endswith() is going to be simpler and faster:
    import os

    seen = set()
    extensions = ('.txt', '.foo', '.bar')
    for dirpath, dirnames, filenames in os.walk(directory):
        for filename in filenames:
            if filename.endswith(extensions) and filename not in seen:
                # do the thing
                seen.add(filename)

The .endswith() method can take a tuple of strings to test for; in this case, the sequence of extensions.
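To illustrate the tuple form of .endswith() on its own, here is a small sketch using hypothetical names. A name matches if it ends with any one of the suffixes in the tuple.

```python
extensions = ('.txt', '.foo', '.bar')

# Hypothetical names for illustration only.
names = ['report.txt', 'image.png', 'data.foo', 'txt', 'archive.bar.gz']

# str.endswith() returns True if the string ends with any suffix in the tuple.
matches = [name for name in names if name.endswith(extensions)]
print(matches)
```

Note that the argument has to be a tuple (or a single string); if the extensions are held in a list, convert it first with tuple().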
If you want to consider filenames without the extension, remove the extension before testing against seen:
    import os

    seen = set()
    extensions = ('.txt', '.foo', '.bar')
    for dirpath, dirnames, filenames in os.walk(directory):
        for filename in filenames:
            if filename.endswith(extensions):
                root, ext = os.path.splitext(filename)
                if root in seen:
                    # we have already seen this filename without its extension
                    continue
                # do the thing
                seen.add(root)
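A quick way to check how this behaves is to run the same logic over a fixed list. The names and the processed list below are assumptions made for the demonstration; they are not part of the code above.

```python
import os

extensions = ('.txt', '.foo', '.bar')

# Hypothetical names: 'report' appears with two matching extensions.
names = ['report.txt', 'report.foo', 'summary.bar', 'report.png']

seen = set()
processed = []  # records which files would get "the thing" done to them
for filename in names:
    if filename.endswith(extensions):
        # os.path.splitext() splits off only the last extension.
        root, ext = os.path.splitext(filename)
        if root in seen:
            continue
        processed.append(filename)
        seen.add(root)

print(processed)
```

Here report.foo is skipped because report was already recorded from report.txt, and report.png is never considered because its extension is not in the tuple.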