Issue
I am trying to loop through a data frame to extract specific values. I have a loop that states the conditions but I cant get around the indexer out of bounds error.
The loop I have is:
new_end = []
for i in range(size):
if fifo.Same_New.iloc[i] == "New" and fifo.Same_New.iloc[i-1] == "New":
new_end.append(fifo.EndTime.iloc[i])
else:
if fifo.Same_New.iloc[i] == 'Same' and fifo.Same_New.iloc[i+1] =="New":
new_end.append(fifo.EndTime.iloc[i])
else:
new_end.append('Null')
Sample DF:
StartTime EndTime Dura Diff S/N
05/01/2024 12:27 05/01/2024 13:04 2224 1036 New
06/01/2024 07:05 06/01/2024 07:06 60 1081 New
06/01/2024 07:06 06/01/2024 07:06 0 0 Same
08/01/2024 10:26 08/01/2024 10:27 32 200 New
08/01/2024 11:45 08/01/2024 11:46 38 78 New
09/01/2024 13:48 09/01/2024 13:51 172 122 New
09/01/2024 16:41 09/01/2024 16:42 60 170 New
09/01/2024 16:43 09/01/2024 17:37 3267 1 Same
09/01/2024 17:38 09/01/2024 17:41 189 1 Same
10/01/2024 10:06 10/01/2024 10:08 142 985 New
10/01/2024 10:51 10/01/2024 10:53 116 43 New
10/01/2024 11:00 10/01/2024 11:00 26 7 New
12/01/2024 12:17 12/01/2024 12:30 752 77 New
15/01/2024 08:01 15/01/2024 08:01 36 1171 New
16/01/2024 15:44 16/01/2024 15:53 577 463 New
16/01/2024 16:43 16/01/2024 17:16 2003 50 New
16/01/2024 17:16 16/01/2024 17:17 69 0 Same
17/01/2024 00:03 17/01/2024 00:04 60 406 New
17/01/2024 00:04 17/01/2024 00:20 943 0 Same
17/01/2024 06:08 17/01/2024 06:09 16 348 New
The logic is:
- If you have to row 1 and row 2 both have "Same" value in S/N column then return EndTime value from row 1. Example below:
05/01/2024 12:27 05/01/2024 13:04 2224 1036 New
06/01/2024 07:05 06/01/2024 07:06 60 1081 New
- If row 1 has "New" value in S/N column and row 2 has "Same" value in S/N column then return "Null". Example below:
09/01/2024 16:41 09/01/2024 16:42 60 170 New
09/01/2024 16:43 09/01/2024 17:37 3267 1 Same
- If row 1 has "Same" value in S/N column and row 2 has "Same" value in S/N column then return "Null". Example below:
09/01/2024 16:43 09/01/2024 17:37 3267 1 Same
09/01/2024 17:38 09/01/2024 17:41 189 1 Same
- If row 1 has "Same" value in S/N column and row 2 has "New" value in S/N column then return End time value from row 1. Example below:
16/01/2024 17:16 16/01/2024 17:17 69 0 Same
17/01/2024 00:03 17/01/2024 00:04 60 406 New
Expected Output would be additional column with those values:
StartTime EndTime Dura Diff S/N N_E
05/01/2024 12:27 05/01/2024 13:04 2224 1036 New 05/01/2024 13:04
06/01/2024 07:05 06/01/2024 07:06 60 1081 New Null
06/01/2024 07:06 06/01/2024 07:06 0 0 Same 06/01/2024 07:06
08/01/2024 10:26 08/01/2024 10:27 32 200 New 08/01/2024 10:27
08/01/2024 11:45 08/01/2024 11:46 38 78 New 08/01/2024 11:46
09/01/2024 13:48 09/01/2024 13:51 172 122 New 09/01/2024 13:51
09/01/2024 16:41 09/01/2024 16:42 60 170 New Null
09/01/2024 16:43 09/01/2024 17:37 3267 1 Same Null
09/01/2024 17:38 09/01/2024 17:41 189 1 Same 09/01/2024 17:41
10/01/2024 10:06 10/01/2024 10:08 142 985 New 10/01/2024 10:08
10/01/2024 10:51 10/01/2024 10:53 116 43 New 10/01/2024 10:53
10/01/2024 11:00 10/01/2024 11:00 26 7 New 10/01/2024 11:00
12/01/2024 12:17 12/01/2024 12:30 752 77 New 12/01/2024 12:30
15/01/2024 08:01 15/01/2024 08:01 36 1171 New 15/01/2024 08:01
16/01/2024 15:44 16/01/2024 15:53 577 463 New 16/01/2024 15:53
16/01/2024 16:43 16/01/2024 17:16 2003 50 New Null
16/01/2024 17:16 16/01/2024 17:17 69 0 Same 16/01/2024 17:17
17/01/2024 00:03 17/01/2024 00:04 60 406 New Null
17/01/2024 00:04 17/01/2024 00:20 943 0 Same 17/01/2024 00:20
17/01/2024 06:08 17/01/2024 06:09 16 348 New 17/01/2024 06:09
17/01/2024 06:09 17/01/2024 06:10 30 0 Same Null
Hope it clears it. Sorry for bad formatting It is my first post.
Thanks
Solution
You could simplify your four conditions into a single one:
- if the next row is a "New', then use the EndTime value, else "Null":
df['N_E'] = df['EndTime'].where(df['S/N'].shift(-1).eq('New'), 'Null')
Output:
StartTime EndTime Dura Diff S/N N_E
0 05/01/2024 12:27 05/01/2024 13:04 2224 1036 New 05/01/2024 13:04
1 06/01/2024 07:05 06/01/2024 07:06 60 1081 New Null
2 06/01/2024 07:06 06/01/2024 07:06 0 0 Same 06/01/2024 07:06
3 08/01/2024 10:26 08/01/2024 10:27 32 200 New 08/01/2024 10:27
4 08/01/2024 11:45 08/01/2024 11:46 38 78 New 08/01/2024 11:46
5 09/01/2024 13:48 09/01/2024 13:51 172 122 New 09/01/2024 13:51
6 09/01/2024 16:41 09/01/2024 16:42 60 170 New Null
7 09/01/2024 16:43 09/01/2024 17:37 3267 1 Same Null
8 09/01/2024 17:38 09/01/2024 17:41 189 1 Same 09/01/2024 17:41
9 10/01/2024 10:06 10/01/2024 10:08 142 985 New 10/01/2024 10:08
10 10/01/2024 10:51 10/01/2024 10:53 116 43 New 10/01/2024 10:53
11 10/01/2024 11:00 10/01/2024 11:00 26 7 New 10/01/2024 11:00
12 12/01/2024 12:17 12/01/2024 12:30 752 77 New 12/01/2024 12:30
13 15/01/2024 08:01 15/01/2024 08:01 36 1171 New 15/01/2024 08:01
14 16/01/2024 15:44 16/01/2024 15:53 577 463 New 16/01/2024 15:53
15 16/01/2024 16:43 16/01/2024 17:16 2003 50 New Null
16 16/01/2024 17:16 16/01/2024 17:17 69 0 Same 16/01/2024 17:17
17 17/01/2024 00:03 17/01/2024 00:04 60 406 New Null
18 17/01/2024 00:04 17/01/2024 00:20 943 0 Same 17/01/2024 00:20
19 17/01/2024 06:08 17/01/2024 06:09 16 348 New Null
A more generic method if you want to use different values depending on the conditions would be to use numpy.select
, but it's really not needed in this case:
curr_new = df['S/N'].eq('New')
curr_same = df['S/N'].eq('Same')
next_row = df['S/N'].shift(-1)
next_new = next_row.eq('New')
next_same = next_row.eq('Same')
df['N_E'] = np.select(
[curr_new&next_new,
curr_new&next_same,
curr_same&next_same,
curr_same&next_new,
],
[df['EndTime'],
'Null',
'Null',
df['EndTime'],
], 'undefined')
Answered By - mozway
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.