Issue
My input is a dataframe :
df = pd.DataFrame({'node1': ['abc-1', 'xyz-1', 'abc-1', 'xyz-2', 'xyz-2', 'ghi-2'],
'p1': [1, 10, 3, 1, 2, 6],
'p2': [9, 2, 11, 4, 5, 3],
'node2': ['xyz-1', 'abc-1', 'xyz-1', 'def-2', 'def-2', 'xyz-1']})
print(df)
node1 p1 p2 node2
0 abc-1 1 9 xyz-1
1 xyz-1 10 2 abc-1
2 abc-1 3 11 xyz-1
3 xyz-2 1 4 def-2
4 xyz-2 2 5 def-2
5 ghi-2 6 3 xyz-1
I'm trying to sort the nodes so the ones with the same prefix go in the same side and thus the same column. But we need also to do the same with points p1
and p2
. For example row index 1
, the nodes xyz-1
and abc-1
will be reversed and we should do this at the same time with the points. So this row becomes : 1 | abc-1 | 2 | 10 | xyz-1
.
I was able to sort the nodes but really ignore how to do the same with the points.
final = df[['node1', 'node2']].apply(np.sort, axis=1).apply(pd.Series)
final.columns = ['node1', 'node2']
print(final)
node1 node2
0 abc-1 xyz-1
1 abc-1 xyz-1
2 abc-1 xyz-1
3 def-2 xyz-2
4 def-2 xyz-2
5 ghi-2 xyz-1
My expected output is the one below. It is one of the two possible outputs because the side is not important. For example, the second accepted output will be a dataframe where the abc
, def
and ghi
nodes are in the right side (column 2
).
node1 p1 p2 node2
0 abc-1 1 9 xyz-1
1 abc-1 2 10 xyz-1
2 abc-1 3 11 xyz-1
3 def-2 4 1 xyz-2
4 def-2 5 2 xyz-2
5 ghi-2 6 3 xyz-1
Solution
You can use a boolean mask:
m = final.eq(df[['node1', 'node2']]).all(axis=1)
final[['p1', 'p2']] = df[['p1', 'p2']].where(m, other=df[['p2', 'p1']].values)
final = final[df.columns]
Output:
>>> final
node1 p1 p2 node2
0 abc-1 1 9 xyz-1
1 abc-1 2 10 xyz-1
2 abc-1 3 11 xyz-1
3 def-2 4 1 xyz-2
4 def-2 5 2 xyz-2
5 ghi-2 6 3 xyz-1
Answered By - Corralien
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.