Issue
I'm new to Python and trying to understand data manipulation.
I have a folder with several files, some of which are CSVs. I want to merge all of the CSVs (approximately 400 of them) into one single CSV, with all the data stacked.
For example, if the first CSV contains this dataframe:
transcript  confidence  from  to    speaker  Negative  Neutral  Positive  compound
thank you   0.85        1.39  1.65  0        0         0.754    0.246     0.7351
and the second contains this dataframe:
transcript  confidence  from  to    speaker  Negative  Neutral  Positive  compound
welcome     0.95        1.39  1.65  0        0         0.754    0.201     0.8351
I want the final df to look like this:
transcript  confidence  from  to    speaker  Negative  Neutral  Positive  compound
thank you   0.85        1.39  1.65  0        0         0.754    0.246     0.7351
welcome     0.95        1.39  1.65  0        0         0.754    0.201     0.8351
I tried:
import glob
import pandas as pd
# Folder containing the .csv files to merge
file_path = "C:\\Users\\Desktop"
# This pattern \\* selects all files in a directory
pattern = file_path + "\\*"
files = glob.glob(pattern)
# Import first file to initiate the dataframe
df = pd.read_csv(files[0],encoding = "utf-8", delimiter = ",")
# Append all the files as dataframes to the first one
for file in files[1:len(file_list)]:
    df_csv = pd.read_csv(file,encoding = "utf-8", delimiter = ",")
    df = df.append(df_csv)
But it did not work. How can I solve this issue?
Solution
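This should help. In your loop, file_list is never defined (the list is called files), the pattern \\* matches every file in the folder rather than only the CSVs, and DataFrame.append was deprecated in pandas 1.4 and removed in 2.0. Collecting the dataframes in a list and calling pd.concat once avoids all three problems: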
import pandas as pd
import glob
import os.path
file_path = "C:/Users/Desktop"
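# Read each CSV into its own dataframe and collect them in a list
data = []
for csvfile in glob.glob(os.path.join(file_path, "*.csv")):
    df = pd.read_csv(csvfile, encoding="utf-8", delimiter=",")
    data.append(df)
# Stack all dataframes once; ignore_index=True renumbers the rows from 0
data = pd.concat(data, ignore_index=True)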
Answered By - Corralien
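Since the question asks for one single CSV on disk, here is a minimal sketch of how the stacked result could also be written out. It is not part of the answer above: the helper name merge_csvs, the file names a.csv and b.csv, and the output name merged.csv are all made up for illustration, and the demo runs on two throwaway files built from the rows in the question rather than on the real folder.

```python
import glob
import os
import tempfile

import pandas as pd


def merge_csvs(folder, output_path):
    """Stack every .csv in `folder` and write the result to `output_path`.

    Hypothetical helper for illustration; it is not a pandas function.
    """
    # sorted() gives a reproducible row order; glob itself does not promise one
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    if not paths:
        raise FileNotFoundError(f"no .csv files found in {folder}")
    frames = [pd.read_csv(path, encoding="utf-8") for path in paths]
    merged = pd.concat(frames, ignore_index=True)
    # index=False keeps the row numbers out of the output file
    merged.to_csv(output_path, index=False)
    return merged


# Demo on two temporary files that reuse the columns and rows from the question
columns = ["transcript", "confidence", "from", "to", "speaker",
           "Negative", "Neutral", "Positive", "compound"]
rows = {
    "a.csv": ["thank you", 0.85, 1.39, 1.65, 0, 0, 0.754, 0.246, 0.7351],
    "b.csv": ["welcome", 0.95, 1.39, 1.65, 0, 0, 0.754, 0.201, 0.8351],
}

with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for name, row in rows.items():
        pd.DataFrame([row], columns=columns).to_csv(
            os.path.join(src, name), index=False)

    # The output goes to a different folder so that a second run
    # does not pick up merged.csv as one of its inputs
    out_path = os.path.join(dst, "merged.csv")
    merged = merge_csvs(src, out_path)
    round_trip = pd.read_csv(out_path)

print(merged)
```

For the real data, the call would be something like merge_csvs("C:/Users/Desktop", some_output_path), with the output path chosen outside the source folder for the reason given in the comment.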