Issue
I have this large table in MySQL incident_archive
that has millions of records, I want to sort the rows by created
column and keep the Top X rows and delete the rest, what is the most efficient way to do this.
So far I came up with this solution in Python:
def do_delete_archive(rowsToKeep):
if rowsToKeep > 0:
db_name = find_environment_value('DB_NAME', False, "dbname")
db_host = find_environment_value('DB_HOST', False, "host")
db_user = find_environment_value('DB_USER', False, "username")
db_pass = find_environment_value('DB_PASS', False, "password")
db = MySQLdb.connect(host=db_host,user=db_user,passwd=db_pass,db=db_name)
cursor = db.cursor()
sql = f"""DELETE FROM `incident_archive`
WHERE incident_id NOT IN
( SELECT incident_id FROM
( SELECT incident_id FROM `incident_archive` ORDER BY created DESC LIMIT {rowsToKeep}) foo) LIMIT 10000;"""
try:
rowcount = rowsToKeep+ 1
while rowcount > rowsToKeep:
cursor.execute(sql)
db.commit()
rowcount = cursor.rowcount
print(f"--- Affected Rows: {rowcount} ---")
except:
db.rollback()
The issue that I have here if rowsToKeep
has value more or equal 10000
this approach will not work, what is a better way to this process?
**Note: rowsToKeep value is dynamic, meaning it can change.
Solution
I came up with solution below:
NOTE: threshold is the variable contains number of max records we want to keep 1000 in my example
sqlCreate = f"""CREATE TABLE new_incident_archive LIKE incident_archive;"""
print(f"Running query is: {sqlCreate}")
cursor.execute(sqlCreate)
print(f"Done with: {sqlCreate}")
sqlInsert = f"""INSERT INTO new_incident_archive SELECT * FROM `incident_archive` ORDER BY created DESC LIMIT {threshold}"""
print(f"Running query is: {sqlInsert}")
cursor.execute(sqlInsert)
db.commit()
print(f"Done with: {sqlInsert}")
sqlDrop = f"""DROP TABLE incident_archive"""
print(f"Running query is: {sqlDrop}")
cursor.execute(sqlDrop)
print(f"Done with: {sqlDrop}")
sqlRename = f"""RENAME TABLE `new_incident_archive` TO `incident_archive`;"""
print(f"Running query is: {sqlRename}")
cursor.execute(sqlRename)
print(f"Done with: {sqlRename}")
Answered By - DeadlyDagger
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.