Issue
I need help regarding the size of my DataFrame. Here is its shape:
df.shape
(946270, 65)
So 946270 * 65 is only 61,507,550 cells in total.
I opened it with pd.read_csv("file.csv", sep=";") and its size in memory is 5.43 GB.
Isn't that huge for this kind of DataFrame? Does anyone know why it is so large, and whether there is a way to reduce its size?
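To see where the bytes actually go, you can ask pandas to count the real memory held by each column. This is a small sketch with a hypothetical frame standing in for the 946270 x 65 CSV; `deep=True` is what makes pandas count the bytes inside Python string objects, which is usually where the bloat in an all-object frame comes from:

```python
import pandas as pd

# Hypothetical small frame standing in for the real data.
df = pd.DataFrame({
    "date": ["2020-01-01"] * 1000,   # stored as Python strings (object dtype)
    "value": [1.5] * 1000,           # float64
})

# deep=True counts the actual bytes held by string objects,
# not just the 8-byte pointers in the object column.
mem_bytes = df.memory_usage(deep=True).sum()
print(mem_bytes)
```

On the real frame, comparing `df.memory_usage(deep=True)` per column will typically show the string (object) columns dominating the total.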
Solution
Structure your data, since reading a .csv can leave values stored as strings when they are really dates, floats, ints, or booleans. Then convert your CSV file to Parquet:
import pandas as pd
df = pd.read_csv('file.csv', sep=';')
df.to_parquet('output.parquet')
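A sketch of the "structure your data" step, using a small inline CSV sample (hypothetical column names) in place of the real file: declaring dtypes up front and parsing dates keeps columns out of object dtype, which shrinks the frame before you even write Parquet:

```python
import io
import pandas as pd

# Hypothetical sample; the real file is read with sep=";".
csv = "date;value;flag\n2020-01-01;1.5;True\n2020-01-02;2.5;False\n"

# Letting read_csv infer everything can leave columns as object dtype.
df_raw = pd.read_csv(io.StringIO(csv), sep=";")

# Declaring dtypes and parsing dates stores each column compactly.
df_typed = pd.read_csv(
    io.StringIO(csv),
    sep=";",
    parse_dates=["date"],
    dtype={"value": "float32", "flag": "bool"},
)
print(df_typed.dtypes)
```

The typed frame uses 8-byte datetimes and 4-byte floats instead of Python string objects, so its `memory_usage(deep=True)` total is smaller than the raw read's.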
Other things you can do:
- Remove null and blank data
- Remove what you don't need
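Both cleanups above can be sketched with `dropna`, shown here on a hypothetical frame with one fully blank column and one missing row:

```python
import pandas as pd

# Hypothetical frame: "blank" is entirely empty, "keep" has one missing row.
df = pd.DataFrame({
    "keep": [1, 2, None, 4],
    "blank": [None, None, None, None],
})

# Drop columns that are entirely empty, then rows with any missing value.
df = df.dropna(axis=1, how="all")
df = df.dropna()
print(df.shape)  # → (3, 1)
```

Dropping dead columns before `to_parquet` also keeps the Parquet file itself smaller, since every column carried over is encoded and stored.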
Answered By - SakuraFreak