Issue
Let's assume that I have a numpy array like the one below. The sample has shape (5,3), but the real array has more than 1 million rows and is typed as a numpy object.
Sample array:
x = np.array([['A',1,10],['B',1,20],['C',2,80],['D',3,40],['E',2,50]])
I would like to achieve the following:
For each row, if its column Y value also appears in column Y of other row(s) anywhere in the array, then compare the column X values of those rows and select the rows whose column X value is not equal.
Col X Y Z
[['A' '1' '10'] ---> filter on value '1' across all values of column Y in the entire array
['B' '1' '20'] ---> filter on value '1' across all values of column Y in the entire array
['C' '2' '80'] ---> filter on value '2' across all values of column Y in the entire array
['D' '3' '40'] ---> filter on value '3' across all values of column Y in the entire array
['E' '2' '50']] ---> filter on value '2' across all values of column Y in the entire array
For example: while checking row 1, the column Y value is '1' and the column X value is 'A'. So first filter on the column Y value '1'; the following rows satisfy the condition.
[['A' '1' '10']
['B' '1' '20']]
Then apply the second filter: among the filtered rows, keep those whose column X value is not equal to row 1's column X value, 'A'.
So in this case, row 2 satisfies both conditions.
['B' '1' '20']
Note: This example shows two matching records, but in reality there can be one or more, and they can be at any row position.
Next, I would like to append the selected record(s) (in this case row 2) to row 1.
Please suggest an approach.
I tried this code but no luck:
import numpy as np
x = np.array([['A',1,10],['B',1,20],['C',2,80],['D',3,40],['E',2,50]])
y = x
print(x)
print("Result is:",x[np.where(x[:,1] == y[:,1], np.where(x[:,0] != y[:,0][::-1]),False)])
Output of print(x):
[['A' '1' '10']
['B' '1' '20']
['C' '2' '80']
['D' '3' '40']
['E' '2' '50']]
Result is: empty
Solution
Is your expected result actually something like this?
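import numpy as np

# Built as an object array so that arr[:, :2] and arr[list(inds)] work below.
arr = np.array([
    ["A", 1, 10], # 0
    ["B", 1, 20], # 1
    ["C", 2, 80], # 2
    ["D", 3, 40], # 3
    ["E", 2, 50], # 4
    ["F", 1, 30], # 5
    ["A", 1, 70], # 6
], dtype=object)
# For each row, the indices of rows with the same Y but a different X.
expected_indexes = [
    (1, 5),
    (0, 5, 6),
    (4,),
    (),
    (2,),
    (0, 1, 6),
    (1, 5),
]
# The rows those indices refer to (one-element tuples need a trailing comma).
expected = [
    (["B", 1, 20], ["F", 1, 30]),
    (["A", 1, 10], ["F", 1, 30], ["A", 1, 70]),
    (["E", 2, 50],),
    (),
    (["C", 2, 80],),
    (["A", 1, 10], ["B", 1, 20], ["A", 1, 70]),
    (["B", 1, 20], ["F", 1, 30]),
]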
If so, you can do the following:
X, Y = arr[:, :2].T
cond1 = Y[:, None] == Y[None, :]
cond2 = X[:, None] != X[None, :]
mask = cond1 & cond2
>>> cond1
array([[ True, True, False, False, False, True, True],
[ True, True, False, False, False, True, True],
[False, False, True, False, True, False, False],
[False, False, False, True, False, False, False],
[False, False, True, False, True, False, False],
[ True, True, False, False, False, True, True],
[ True, True, False, False, False, True, True]])
>>> cond2
array([[False, True, True, True, True, True, False],
[ True, False, True, True, True, True, True],
[ True, True, False, True, True, True, True],
[ True, True, True, False, True, True, True],
[ True, True, True, True, False, True, True],
[ True, True, True, True, True, False, True],
[False, True, True, True, True, True, False]])
>>> mask
array([[False, True, False, False, False, True, False],
[ True, False, False, False, False, True, True],
[False, False, False, False, True, False, False],
[False, False, False, False, False, False, False],
[False, False, True, False, False, False, False],
[ True, True, False, False, False, False, True],
[False, True, False, False, False, True, False]])
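As a minimal, self-contained sketch of the same broadcasting idea, here it is applied to the 5-row sample from the question. It assumes the array is built with dtype=object (matching the question's description) so the integers stay integers; the names same_y, diff_x and matches are only illustrative.

```python
import numpy as np

# Sample from the question; dtype=object keeps the mixed str/int values as-is.
x = np.array(
    [["A", 1, 10], ["B", 1, 20], ["C", 2, 80], ["D", 3, 40], ["E", 2, 50]],
    dtype=object,
)

X, Y = x[:, 0], x[:, 1]

# same_y[i, j] is True when rows i and j share the same Y value.
same_y = Y[:, None] == Y[None, :]
# diff_x[i, j] is True when rows i and j have different X values.
diff_x = X[:, None] != X[None, :]
mask = same_y & diff_x

# For each row, the indices of the rows that satisfy both conditions,
# e.g. row 0 ('A', 1) matches row 1 ('B', 1).
matches = [np.flatnonzero(row).tolist() for row in mask]
print(matches)  # [[1], [0], [4], [], [2]]
```

Row 3 ('D', 3) gets an empty list because no other row has Y equal to 3.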
And then work with the mask: the indices of the True values in each row correspond to the expected_indexes.
From here I don't see how you can work without for loops, but the heavy lifting is already done and you don't have structured arrays anymore anyway:
>>> indexes = [tuple(np.where(r)[0]) for r in mask]
>>> assert indexes == expected_indexes
>>>
>>> result = [tuple(arr[list(inds)]) for inds in indexes]
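The question also asks to append the selected records to the row being checked, and mentions more than 1 million rows. An N x N mask for 1 million rows would hold about 10^12 boolean entries, so one possible alternative is to group row indices by their Y value first and only compare rows within the same group. The sketch below is not the answer's method; matches_by_group and append_matches are hypothetical helper names, and it assumes X is column 0 and Y is column 1, as in the sample.

```python
from collections import defaultdict

import numpy as np


def matches_by_group(arr):
    """For each row, list indices of rows with the same Y but a different X."""
    groups = defaultdict(list)
    for i, y in enumerate(arr[:, 1]):
        groups[y].append(i)  # row indices sharing the same Y value

    X = arr[:, 0]
    result = []
    for i, y in enumerate(arr[:, 1]):
        # Only rows in the same Y group are candidates; drop those with equal X.
        result.append(tuple(j for j in groups[y] if X[j] != X[i]))
    return result


def append_matches(arr, indexes):
    """Concatenate each row with its matching rows into one flat list."""
    return [
        list(arr[i]) + [v for j in inds for v in arr[j]]
        for i, inds in enumerate(indexes)
    ]


arr = np.array(
    [["A", 1, 10], ["B", 1, 20], ["C", 2, 80], ["D", 3, 40], ["E", 2, 50]],
    dtype=object,
)
indexes = matches_by_group(arr)
print(indexes)  # [(1,), (0,), (4,), (), (2,)]
print(append_matches(arr, indexes)[0])  # ['A', 1, 10, 'B', 1, 20]
```

This still uses Python loops, and a single very large Y group remains quadratic within that group, so treat it as a starting point rather than a tuned solution.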
Answered By - paime