Issue
I have a Pandas Series with the names of cities and districts:
London:Alpha
London
London:Beta
London:Delta
Paris
I want to append "_sub" to each city name (but not to the district names!) so that the converted series looks like this:
London_sub:Alpha
London_sub
London_sub:Beta
London_sub:Delta
Paris_sub
As far as I understand, I need to split each string, change the first part and rejoin the pieces, so I tried this chain:
names_df[0] = names_df[0] \
.str.split(':') \
.apply(lambda x: x[0] + '_sub') \
.str.join(':')
But this way I lose the district part (Alpha, Beta, Delta), and the result looks ugly:
L:o:n:d:o:n:_:s:u:b
I've tried another way:
names_df[0] = names_df[0]\
.str.split(':')\
.apply(lambda x: '_sub:'.join(x))
But it doesn't append '_sub' to rows without districts :(
What am I doing wrong?
Solution
Use a single regex substitution instead:
import pandas as pd
s = pd.Series(['London:Alpha', 'London', 'London:Beta', 'London:Delta', 'Paris'])
# Capture everything before the first ':' and write it back followed by '_sub'
s = s.str.replace(r'^([^:]+)', r'\1_sub', regex=True)
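^([^:]+) - starting from the start of the string (^), captures one or more characters other than the : character. The replacement \1_sub writes the captured text back, followed by _sub.
The resulting series: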
0 London_sub:Alpha
1 London_sub
2 London_sub:Beta
3 London_sub:Delta
4 Paris_sub
dtype: object
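As for what went wrong in the two attempts from the question: in the first one the lambda returns a plain string (x[0] + '_sub'), so the district is already discarded and .str.join(':') then puts a colon between the characters of that string. In the second one, '_sub:'.join(x) only inserts the separator between list elements, so a one-element list such as ['London'] comes back unchanged. If you prefer to keep the split-and-rejoin idea, a minimal sketch could look like the following. The helper name add_city_suffix is made up for illustration and is not part of pandas.

```python
import pandas as pd

s = pd.Series(['London:Alpha', 'London', 'London:Beta', 'London:Delta', 'Paris'])

def add_city_suffix(name, suffix='_sub'):
    # Hypothetical helper: split only on the first ':' so that the
    # district part (if there is one) stays in one piece.
    parts = name.split(':', 1)
    # The first element is always the city, with or without a district.
    parts[0] += suffix
    # Rejoin; a one-element list simply gives the city back.
    return ':'.join(parts)

result = s.apply(add_city_suffix)
print(result)
```

This should give the same series as the regex version. It runs a Python function per row, so the regex substitution is probably the better choice for large data.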
Answered By - RomanPerekhrest