====== Pruning entries from SQLite database ======
{{tag>dev python sqlite}}
Hey hey! So, very quick/short article this time, because it's a bit of an urgent matter in fact: I have an SQLite database that I'm accessing in Python, and I write cryptocurrency prices into that database... for about 45 currencies so far... every minute... lol.
So, yeah... the idea was to collect enough data to be able to perform some advanced/precise analysis over about 7 days... except that I've been collecting data for a lot more than just 7 days now 😁 So my database is getting bigger and bigger every day! It's not //extraordinarily// big for sure, but it's already about 130MB for a single database file, so I really think I should try to keep this under control somehow.
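To put a rough number on that growth: with one row per currency per minute, that's about 45 × 60 × 24 = 64,800 new rows per day, or around 453,600 rows per week. So without some pruning mechanism, this file can only keep growing forever.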
===== Magic command to prune database records =====
The solution is relatively simple: every time I write a new set of price entries, I should then also delete the entries that are too old. => So how do we do that? [//... burning a few neurons here... but not too much really...//🤔]. And here we go! I just created a simple function in my **PriceDB** class:

<code python>
def deleteOldHighResRecords(self, ts):
    # First count how many rows are older than the given timestamp,
    # just so we can report it in the logs:
    cur = self.execute("SELECT COUNT(*) FROM highres WHERE timestamp < %d;" % ts)
    num = cur.fetchone()[0]
    logDEBUG("Deleting %d rows from highres dataset." % num)

    # Then actually delete those rows and commit the change:
    cur = self.execute("DELETE FROM highres WHERE timestamp < %d;" % ts, commit=True)
    cur.fetchall()
</code>
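For reference, here is a minimal standalone sketch of the same pruning pattern using Python's built-in **sqlite3** module directly (the table/column names match the ones above, but the connection handling is my own assumption here, since the **PriceDB** wrapper isn't shown in this article). Note that it uses a parameterized query (the ''?'' placeholder) instead of %-interpolation, which is the safer habit when building SQL statements:

<code python>
import sqlite3

def prune_old_records(db_path, cutoff_ts):
    """Delete all rows from 'highres' older than cutoff_ts (a Unix timestamp)."""
    conn = sqlite3.connect(db_path)
    try:
        # Count the rows we are about to remove:
        cur = conn.execute("SELECT COUNT(*) FROM highres WHERE timestamp < ?;",
                           (cutoff_ts,))
        num = cur.fetchone()[0]
        print("Deleting %d rows from highres dataset." % num)

        # Remove them and commit:
        conn.execute("DELETE FROM highres WHERE timestamp < ?;", (cutoff_ts,))
        conn.commit()
    finally:
        conn.close()
</code>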
And then I simply call this function on each 1-minute cycle, once I'm done adding the new rows into the database:

<code python>
# Here we should also remove the entries that are too old:
# We remove all the entries older than 8 days:
pdb.deleteOldHighResRecords(nowStamp - 8*24*3600)
</code>
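Just for context, the surrounding update cycle looks roughly like this (a sketch only: the loop structure and the ''updateAllDatasets'' helper name are my assumptions here, only the pruning call itself is taken from the actual code):

<code python>
import time

while True:
    nowStamp = int(time.time())

    # Write the new price rows for all currencies (hypothetical helper):
    updateAllDatasets(pdb, nowStamp)

    # Then prune everything older than 8 days:
    pdb.deleteOldHighResRecords(nowStamp - 8*24*3600)

    # And wait for the next 1-minute cycle:
    time.sleep(60)
</code>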
Then, checking the outputs I get in my **highres_price_dataset.log** file, we see that the first pass was a bit intensive (deleting more than 603k records), but after that it quickly stabilizes to deleting only 1 row per currency on each cycle:
<code>
2021-06-12T16:26:11.898338 [DEBUG] Updated all high res datasets in 8.081 seconds
2021-06-12T16:29:03.789121 [DEBUG] Deleting 603834 rows from highres dataset.
2021-06-12T16:30:09.625585 [DEBUG] Updated all high res datasets in 67.816 seconds
2021-06-12T16:31:06.311368 [DEBUG] Deleting 94 rows from highres dataset.
2021-06-12T16:31:06.633544 [DEBUG] Updated all high res datasets in 4.774 seconds
2021-06-12T16:32:03.574602 [DEBUG] Deleting 47 rows from highres dataset.
2021-06-12T16:32:10.712554 [DEBUG] Updated all high res datasets in 8.377 seconds
2021-06-12T16:33:11.018032 [DEBUG] Deleting 47 rows from highres dataset.
2021-06-12T16:33:13.725622 [DEBUG] Updated all high res datasets in 11.107 seconds
2021-06-12T16:34:03.695258 [DEBUG] Deleting 47 rows from highres dataset.
2021-06-12T16:34:03.885305 [DEBUG] Updated all high res datasets in 1.326 seconds
2021-06-12T16:35:08.617795 [DEBUG] Deleting 47 rows from highres dataset.
2021-06-12T16:35:12.484663 [DEBUG] Updated all high res datasets in 10.447 seconds
</code>
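One caveat worth mentioning here (a general SQLite behavior, not something visible in the logs above): DELETE alone doesn't shrink the database file, it only marks the freed pages as reusable. So the file should stop growing, but it won't get smaller on its own. To actually reclaim the disk space you'd need an occasional VACUUM, for instance:

<code python>
import sqlite3

# VACUUM rebuilds the database file, reclaiming the pages freed by DELETE.
# It can take a while and needs temporary disk space, so running it once
# in a while (not on every 1-minute cycle) seems like the reasonable approach.
conn = sqlite3.connect("prices.db")  # hypothetical database file name
conn.execute("VACUUM")
conn.close()
</code>

And if there isn't one already, an index on the ''timestamp'' column would probably also help speed up those DELETE scans (the first pass above took about 67 seconds after all).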
And well... that's it 😊! For once, I told you this was going to be a very short one, and I didn't lie in the end hi hi hi 🤣