Issue
I'm trying to pull transaction data from Plaid and load it into a data frame with clean columns. The "before" format is a list; one element is excerpted below.
My goal is an "after" format where the data frame has one column for each key in the excerpt (e.g., "account_id" or "amount"), so that I can parse the list and fill in the values for each column.
I'm new to Python. I'm fluent in R/dplyr, but the syntax is confusing me.
Thanks in advance!
{'account_id': 'nKllGzvJQeIpZwvxlv1Mhw98Zdo57Ec6ZNEm5',
'account_owner': None,
'amount': 4.33,
'authorized_date': datetime.date(2020, 12, 8),
'authorized_datetime': None,
'category': ['Food and Drink', 'Restaurants', 'Coffee Shop'],
'category_id': '13005043',
'check_number': None,
'date': datetime.date(2020, 12, 8),
'datetime': None,
'iso_currency_code': 'USD',
'location': {'address': None,
'city': None,
'country': None,
'lat': None,
'lon': None,
'postal_code': None,
'region': None,
'store_number': None},
'merchant_name': 'Starbucks',
'name': 'Starbucks',
'payment_channel': 'in store',
'payment_meta': {'by_order_of': None,
'payee': None,
'payer': None,
'payment_method': None,
'payment_processor': None,
'ppd_id': None,
'reason': None,
'reference_number': None},
'pending': False,
'pending_transaction_id': None,
'personal_finance_category': None,
'transaction_code': None,
'transaction_id': 'llvvGK61QjH1eX8yP8qZC8BxB3WegMFZrXRjr',
'transaction_type': 'place',
'unofficial_currency_code': None}
Solution
In this case, I would pass the data to the DataFrame constructor in list-of-records format: a list of dicts, where each dict becomes one row and each key becomes a column.
Example:
import datetime
import pandas as pd
transaction = {'account_id': 'nKllGzvJQeIpZwvxlv1Mhw98Zdo57Ec6ZNEm5',
'account_owner': None,
'amount': 4.33,
'authorized_date': datetime.date(2020, 12, 8),
'authorized_datetime': None,
'category': ['Food and Drink', 'Restaurants', 'Coffee Shop'],
'category_id': '13005043',
'check_number': None,
'date': datetime.date(2020, 12, 8),
'datetime': None,
'iso_currency_code': 'USD',
'location': {'address': None,
'city': None,
'country': None,
'lat': None,
'lon': None,
'postal_code': None,
'region': None,
'store_number': None},
'merchant_name': 'Starbucks',
'name': 'Starbucks',
'payment_channel': 'in store',
'payment_meta': {'by_order_of': None,
'payee': None,
'payer': None,
'payment_method': None,
'payment_processor': None,
'ppd_id': None,
'reason': None,
'reference_number': None},
'pending': False,
'pending_transaction_id': None,
'personal_finance_category': None,
'transaction_code': None,
'transaction_id': 'llvvGK61QjH1eX8yP8qZC8BxB3WegMFZrXRjr',
'transaction_type': 'place',
'unofficial_currency_code': None}
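# Wrap the single record in a list; each dict in the list becomes one row.
transactions = [transaction]
# The dict keys become the column names.
df = pd.DataFrame(transactions)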
Note that some columns hold nested objects rather than scalars: the category column contains a list, and the location and payment_meta columns contain dicts. You didn't say whether those should be handled specially or broken out into multiple columns, but it's worth considering.
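If you do want the nested dicts broken out, one option is pd.json_normalize, which expands each nested key into its own column. This is only a sketch on a trimmed-down copy of the record above; the sep="_" choice and the resulting column names (e.g. location_city) are my assumption about what you'd want, and the category list is left as-is.

```python
import pandas as pd

# Trimmed-down version of the transaction above (only a few keys kept).
transaction = {
    "account_id": "nKllGzvJQeIpZwvxlv1Mhw98Zdo57Ec6ZNEm5",
    "amount": 4.33,
    "category": ["Food and Drink", "Restaurants", "Coffee Shop"],
    "location": {"address": None, "city": None},
    "payment_meta": {"payee": None, "payer": None},
}

# json_normalize flattens nested dicts into columns named parent<sep>child.
# Lists (like "category") are not expanded and stay in a single column.
flat = pd.json_normalize([transaction], sep="_")
print(flat.columns.tolist())
```

The default separator is ".", which gives names like location.city; the underscore just makes the columns easier to use with attribute access.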
Also note that transactions is a list, so you can put multiple transactions in the same format into it and construct the dataframe all at once. (This is preferable to creating multiple dataframes and appending them together, for the same reason as in dplyr.)
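As a rough illustration of building the dataframe in one go, here is a sketch with two records. The second record and the shortened transaction_id values are made up for the example, and only a few of the keys are kept.

```python
import datetime
import pandas as pd

# Two records in the same format; the values here are illustrative only.
transactions = [
    {"transaction_id": "txn-1", "name": "Starbucks",
     "amount": 4.33, "date": datetime.date(2020, 12, 8)},
    {"transaction_id": "txn-2", "name": "Example Grocery",
     "amount": 25.10, "date": datetime.date(2020, 12, 9)},
]

# One constructor call builds all rows at once.
df = pd.DataFrame(transactions)

# datetime.date values land in an object column; convert them if you
# want pandas datetime functionality (roughly like as.Date in R).
df["date"] = pd.to_datetime(df["date"])
print(df.dtypes)
```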
Answered By - Nick ODell