Issue
I have 20 folders, each containing 50 .txt files, and I need to read all of them in order to compare the word counts of each folder. I know how to read multiple files in one folder, but doing it folder by folder as below is slow. Is there a more efficient way?
import re
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
import os
import glob
# 1. folder1
folder_path = '/home/runner/Final-Project/folder1'
for filename in glob.glob(os.path.join(folder_path, '*.txt')):
    with open(filename, 'r') as f:
        text = f.read()
    print(filename)
    print(len(text))

# 2. folder2
folder_path = '/home/runner/Final-Project/folder2'
for filename in glob.glob(os.path.join(folder_path, '*.txt')):
    with open(filename, 'r') as f:
        text = f.read()
    print(filename)
    print(len(text))
Solution
You can do something similar using glob like you have, but with the directory names included in the pattern.
folder_path = '/home/runner/Final-Project'
for filename in glob.glob(os.path.join(folder_path,'*','*.txt')):
# process your files
The first '*' in the os.path.join() call matches directories of any name, so calling glob.glob() like this will find every text file in every direct sub-directory of folder_path.
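Putting it together, a minimal sketch of the per-folder comparison might look like the following. The base path is taken from the question; the use of str.split() to count words (rather than len(text), which counts characters) is an assumption about what "word count" means here.

```python
import os
import glob
from collections import defaultdict

# Base path from the question; adjust to your own layout.
folder_path = '/home/runner/Final-Project'

# Accumulate a word count per sub-directory with a single glob pattern.
word_counts = defaultdict(int)
for filename in glob.glob(os.path.join(folder_path, '*', '*.txt')):
    # The parent directory name identifies which folder the file came from.
    folder = os.path.basename(os.path.dirname(filename))
    with open(filename, 'r') as f:
        # split() counts whitespace-separated words; len(f.read())
        # would count characters instead.
        word_counts[folder] += len(f.read().split())

for folder, count in sorted(word_counts.items()):
    print(folder, count)
```

This reads each file exactly once, so it is no slower than the per-folder loops; the main gain is that one pattern replaces twenty copies of the same code.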
Answered By - Nathan Roberts