Issue
I have a data frame like this.
document_group
A12J3/381
A02J3/40
B12P4/2536
C10P234/3569
and I would like to get this:
document_group
A12J3/38
A02J3/40
B12P4/25
C10P234/35
I tried adapting a function that works on a single string:
def remove_str_start(s, start):
    return s[:start] + s[start]
and it works with this sample:
s='H02J3/381'
s.find('/')
remove_str_start(s,s.find('/')+2)
It returns 'H02J3/38', which is what I want. Here s would be the data frame column and start would be the position at which to cut, counted from the '/' character.
But when I tried it with the data frame
remove_str_start(df['document_group'],df['document_group'].str.find('/')+2)
it raises an error.
Could anyone help me with this kind of situation?
Solution
You can also use str.split to break each value at the '/',
remove the unwanted part, and put the pieces back together:
s = df.document_group.str.split('/')
df['document_group'] = s.str[0] + "/" + s.str[1].str[:2]
prints:
document_group
0 A12J3/38
1 A02J3/40
2 B12P4/25
3 C10P234/35
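For comparison, here is a minimal self-contained sketch of two other ways that might work for the same task, assuming every value should keep the '/' plus the two characters after it. The sample frame is rebuilt from the question, and the helper name keep_two_after_slash is hypothetical, not something from the original post.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame(
    {"document_group": ["A12J3/381", "A02J3/40", "B12P4/2536", "C10P234/3569"]}
)

# Option 1: regex. Capture the slash and the two characters after it,
# then drop whatever follows.
trimmed_regex = df["document_group"].str.replace(r"(/.{2}).*", r"\1", regex=True)

# Option 2: keep a plain-string helper and run it row by row with apply.
def keep_two_after_slash(s):  # hypothetical helper name
    pos = s.find("/")
    # Strings without a slash are returned unchanged.
    return s if pos == -1 else s[: pos + 3]

trimmed_apply = df["document_group"].apply(keep_two_after_slash)

print(trimmed_regex.tolist())
print(trimmed_apply.tolist())
```

The apply version is closest to the original single-string function: the helper receives one string at a time, so ordinary slicing works, whereas passing whole columns into the function does not. The vectorized .str methods are usually the better fit for large frames.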
Answered By - sophocles