Thursday, November 11, 2021

[FIXED] How to append to a data frame from multiple loops

Labels: dataframe, numpy, pandas

Issue

I have code that reads prices from CSV files and computes a price difference; to keep it simple, I made the reproducible example below. I want to append each result to the end of a specific column. For example, the first pass of the loop handles size 1 and minute 1, so the results for file2, file3 and file4 should all be appended to column 1;1. The output should look like this:

1;1  1;2   1;3   2;1  2;2  2;3      
0    0     0       same below as for 1
0    0     0
2    2     2
2    2     2
4    4     4
4    4     4
5    5     5
0    0     0
0    0     0
0    0     0
2    2     2
2    2     2
4    4     4
4    4     4
6    6     6
6    6     6
0    0     0
0    0     0
0    0     0
2    2     2
2    2     2
4    4     4
4    4     4
6    6     6
7    7     7

I build the data frame's column names in a loop because, in my original code, the number of minutes, sizes and files is supplied by the user.

import numpy as np
import pandas as pd
file =[1,2,3,4,5,6,6,2]
file2=[1,2,3,4,5,6,7,8]
file3=[1,2,3,4,5,6,7,9]
file4=[1,2,1,2,1,2,1,2]
size=[1,2]
minutes=[1,2,3]
list1=[file,file2,file3]
data=pd.DataFrame(file)
data2=pd.DataFrame(file2)
data3=pd.DataFrame(file3)
list1=(data,data2,data3)
datas=pd.DataFrame(file4)
col_names = [str(sizer)+';'+str(number) for sizer in size for number in minutes]
datanew=pd.DataFrame(columns=col_names)


for sizes in size:
    for minute in minutes:
        for files in list1:
            pricediff=files-data
            datanew[str(sizes)+';'+str(minute)]=datanew[str(sizes)+';'+str(minute)].append(pricediff,ignore_index=True)
print(datanew)

Edit: When I try this line instead: datanew=datanew.append({str(sizes)+';'+str(minute): df['pricediff']},ignore_index=True) it appends the data, but the result isn't "clean".

With my original data, the result looks like this:

    111;5.0,1111;5.0
"0                                          4.5
1                                          0.5
2                                            8
3                                            8
4                                            8
                        ...                   
704                                        3.5
705                                        0.5
706                                       11.5
707                                        0.5
708                                        9.0
Name: pricediff, Length: 709, dtype: object",
"price    0.0
0        0.0
Name: pricediff, dtype: float64",
"0      6.5
1      6.5
2      3.5
3     13.0
Name: pricediff, Length: 4, dtype: float64",

Solution

What you are looking for is:

datanew=datanew.append({str(sizes)+';'+str(minute): pricediff}, ignore_index=True)

The original attempt does not work because you cannot change the length of a single column of a DataFrame without changing the length of the whole DataFrame.
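To make the length constraint concrete, here is a minimal sketch. The frame and the column names `c` and `d` are made up for illustration and are not from the question. It shows what pandas does when a column assignment has more values than the frame has rows.

```python
import pandas as pd

# Hypothetical three-row frame, used only for this illustration
df = pd.DataFrame({"a": list("xyz"), "b": [1, 3, 5]})

# A Series is aligned on the index, so the values at labels 3 and 4
# have no matching row and are dropped
df["c"] = pd.Series([10, 20, 30, 40, 50])
print(len(df))  # the frame still has 3 rows

# A plain list has no index to align on, so a length mismatch raises
try:
    df["d"] = [1, 2, 3, 4, 5]
    error = None
except ValueError as exc:
    error = exc
    print("ValueError:", exc)
```

Either way, the number of rows is a property of the whole frame, so growing one column means adding rows to all of them.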

Now consider the below as an example:

import pandas as pd

df=pd.DataFrame({"a": list("xyzpqr"), "b": [1,3,5,4,2,7], "c": list("pqrtuv")})

print(df)

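# This will fail, because "abc" is not a Series:
# df["c"] = df["c"].append("abc", ignore_index=True)
# print(df)

# What you can do instead (DataFrame.append was removed in pandas 2.0,
# so this needs an older pandas version):
df = df.append({"c": "abc"}, ignore_index=True)

print(df)

# You can even create a new column that way:
df = df.append({"x": "abc"}, ignore_index=True)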
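`DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the calls above raise an `AttributeError` on current versions. As a sketch of a possible equivalent (this is not part of the original answer), the same row-wise append can be written with `pd.concat` by wrapping each new row in a one-row DataFrame. The name `new_row` is introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": list("xyzpqr"), "b": [1, 3, 5, 4, 2, 7], "c": list("pqrtuv")})

# Wrap the new row in a one-row DataFrame and concatenate it;
# columns missing from the new row are filled with NaN
new_row = pd.DataFrame([{"c": "abc"}])
df = pd.concat([df, new_row], ignore_index=True)

# A key that is not an existing column creates that column
df = pd.concat([df, pd.DataFrame([{"x": "abc"}])], ignore_index=True)
print(df)
```

As with `append`, each call adds a full row to the frame, with missing values in every column the new row does not mention.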

Edit

To append a pd.Series, do exactly the same:

abc=pd.Series([-1,-2,-3], name="c")
df=df.append({"c": abc}, ignore_index=True)

print(df)

abc=pd.Series([-1,-2,-3], name="x")
df=df.append({"x": abc}, ignore_index=True)
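Note that passing a Series as a dict value stores the whole Series object in a single cell, which matches the nested output shown in the question's edit. If the goal is one value per row, one possible alternative (a sketch that is not part of the original answer) is to collect the price differences for each column, stack them with `pd.concat`, and build the frame once at the end. It reuses the question's sample lists; the names `base`, `frames` and `columns` are assumptions introduced for illustration.

```python
import pandas as pd

# Sample data from the question
file = [1, 2, 3, 4, 5, 6, 6, 2]
file2 = [1, 2, 3, 4, 5, 6, 7, 8]
file3 = [1, 2, 3, 4, 5, 6, 7, 9]
sizes = [1, 2]
minutes = [1, 2, 3]

base = pd.Series(file)  # reference prices that every file is compared to
frames = [pd.Series(file), pd.Series(file2), pd.Series(file3)]

columns = {}
for size in sizes:
    for minute in minutes:
        # One price-difference Series per file, stacked end to end
        diffs = [f - base for f in frames]
        columns[f"{size};{minute}"] = pd.concat(diffs, ignore_index=True)

# All columns have the same length, so they fit into one frame
datanew = pd.DataFrame(columns)
print(datanew)
```

With the sample data this gives a frame of 24 rows and 6 columns, where each column holds the three files' differences one after another. In the real code the difference would presumably depend on `size` and `minute`; here it does not, so all six columns are identical.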

Answered By - Grzegorz Skibinski

This answer was collected from Stack Overflow and tested by PythonFixing community admins; it is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
