Issue
I have a text file in the format shown below. Using Python, I want to convert this dataset into a table with six columns: Id, Cause, Code, Event Time, Severity and Severity Code.
Id = 0005 Cause = ERROR
Code = 307 Event Time = 2020-11-09 10:16:48
Severity = WARNING
Severity Code = 5 Id = 0006 Cause = FAILURE
Code = 517 Event Time = 2020-11-09 10:19:47
Severity = MINOR Severity Code = 4
Is it possible to convert the dataset above into the following?
Id    Cause    Code  Event Time           Severity  Severity Code
0005  ERROR    307   2020-11-09 10:16:48  WARNING   5
0006  FAILURE  517   2020-11-09 10:19:47  MINOR     4
Solution
Try this:
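import re

import pandas as pd

# Each field looks like "key = value". Fields are assumed to be separated by
# two or more whitespace characters, or to end at the end of the line.
pattern = re.compile(r"(.+?)=(.+?)(?:\s{2,}|$)")
data = []
item = {}
with open("data.txt") as fp:
    for line in fp:
        for m in pattern.finditer(line):
            key, value = [m.group(i).strip() for i in [1, 2]]
            if key == "Id":
                # "Id" starts a new record, so store the previous one first.
                if item:
                    data.append(item)
                item = {"Id": value}
            else:
                item[key] = value
# Store the last record, which no following "Id" has flushed.
if item:
    data.append(item)
df = pd.DataFrame(data)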
Answered By - Code Different
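If the fields in the file are separated by a single space only, as in the sample shown in the question, splitting on runs of whitespace will not work. A possible alternative, not part of the answer above, is to match on the six known field names instead. This is only a sketch: the FIELDS list, the parse_records helper and the inline sample text are assumptions made for illustration, and it presumes that no value ever contains one of the field names followed by "=".

```python
import re

import pandas as pd

# Field names taken from the question. Longer names come first so that
# "Severity Code" is tried before "Severity" and "Code".
FIELDS = ["Severity Code", "Event Time", "Severity", "Cause", "Code", "Id"]
KEYS = "|".join(re.escape(name) for name in FIELDS)

# A value runs lazily until the next "<field> =" or the end of the text.
pattern = re.compile(
    rf"\b({KEYS})\s*=\s*(.*?)\s*(?=\b(?:{KEYS})\s*=|$)",
    re.DOTALL,
)


def parse_records(text):
    """Group key/value pairs into records; each "Id" starts a new record."""
    records, current = [], {}
    for key, value in pattern.findall(text):
        if key == "Id" and current:
            records.append(current)
            current = {}
        current[key] = value
    if current:
        records.append(current)
    return records


# Sample text copied from the question (hypothetical stand-in for data.txt).
sample = (
    "Id = 0005 Cause = ERROR\n"
    "Code = 307 Event Time = 2020-11-09 10:16:48\n"
    "Severity = WARNING\n"
    "Severity Code = 5 Id = 0006 Cause = FAILURE\n"
    "Code = 517 Event Time = 2020-11-09 10:19:47\n"
    "Severity = MINOR Severity Code = 4\n"
)

records = parse_records(sample)
columns = ["Id", "Cause", "Code", "Event Time", "Severity", "Severity Code"]
df = pd.DataFrame(records, columns=columns)
print(df)
```

To use it on a real file, read the whole file into one string first, for example with open("data.txt") as fp: text = fp.read(), and pass that to parse_records. All values stay strings, so leading zeros in Id are preserved.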