====== Pruning entries from SQLite database ======

{{tag>dev python sqlite}}

Hey hey! A very quick/short article this time, because this is a bit urgent in fact: I have an SQLite database that I'm accessing in Python, and I write cryptocurrency prices into that database... for about 45 currencies so far... every minute... lol.

So, yeah... the idea was to collect enough data to be able to perform some advanced/precise analysis over about 7 days... except that I've been collecting data for a lot more than just 7 days now 😁 So my database is getting bigger and bigger every day! It's not //extraordinarily// big for sure, but it's already about 130MB for a single database file, so I really think I should try to keep this under control somehow.

===== Magic command to prune database records =====

The solution is relatively simple: every time I write a new set of price entries, I should then also delete the entries that are too old.

=> So how do we do that? [//... burning a few neurons here... but not too much really...// 🤔]

And here we go! I just created a simple function in my **PriceDB** class:

<code python>
def deleteOldHighResRecords(self, ts):
    # Count how many rows are older than the given timestamp, for logging:
    cur = self.execute("SELECT COUNT(*) from highres WHERE timestamp < %d;" % ts)
    num = cur.fetchone()
    logDEBUG("Deleting %d rows from highres dataset." % num)
    # Then actually delete those rows and commit the change:
    cur = self.execute("DELETE from highres WHERE timestamp < %d;" % ts, commit=True)
    cur.fetchall()
</code>

And then I simply call this function on each 1 minute cycle, once I'm done adding the new rows into the database:

<code python>
# Here we should also remove the too old entries:
# We remove all the entries older than 8 days:
pdb.deleteOldHighResRecords(nowStamp - 8*24*3600)
</code>

Then checking the outputs I get in my **highres_price_dataset.log** file, we see that the first pass was a bit intensive (deleting more than 603k records), but then it quickly stabilizes to deleting only 1 row per currency on each cycle:

<code>
2021-06-12T16:26:11.898338 [DEBUG] Updated all high res datasets in 8.081 seconds
2021-06-12T16:29:03.789121 [DEBUG] Deleting 603834 rows from highres dataset.
2021-06-12T16:30:09.625585 [DEBUG] Updated all high res datasets in 67.816 seconds
2021-06-12T16:31:06.311368 [DEBUG] Deleting 94 rows from highres dataset.
2021-06-12T16:31:06.633544 [DEBUG] Updated all high res datasets in 4.774 seconds
2021-06-12T16:32:03.574602 [DEBUG] Deleting 47 rows from highres dataset.
2021-06-12T16:32:10.712554 [DEBUG] Updated all high res datasets in 8.377 seconds
2021-06-12T16:33:11.018032 [DEBUG] Deleting 47 rows from highres dataset.
2021-06-12T16:33:13.725622 [DEBUG] Updated all high res datasets in 11.107 seconds
2021-06-12T16:34:03.695258 [DEBUG] Deleting 47 rows from highres dataset.
2021-06-12T16:34:03.885305 [DEBUG] Updated all high res datasets in 1.326 seconds
2021-06-12T16:35:08.617795 [DEBUG] Deleting 47 rows from highres dataset.
2021-06-12T16:35:12.484663 [DEBUG] Updated all high res datasets in 10.447 seconds
</code>

And well... this is it 😊! For once, I told you this was going to be a very short one, and I didn't lie in the end hi hi hi 🤣
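As a side note: building the SQL string with ''%'' formatting is fine here since the timestamp is an integer, but the standard ''sqlite3'' module also supports parameter binding with ''?'' placeholders, which is the safer habit. A minimal self-contained sketch of the same pruning logic (the ''highres'' schema here is just assumed for illustration, it is not my real table definition):

<code python>
import sqlite3
import time

# Throwaway in-memory database with an assumed "highres" schema:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE highres (timestamp INTEGER, symbol TEXT, price REAL)")

now = int(time.time())
rows = [(now - 10*24*3600, "BTC", 35000.0),  # 10 days old -> pruned
        (now - 9*24*3600,  "ETH", 2400.0),   # 9 days old  -> pruned
        (now - 3600,       "BTC", 35100.0)]  # 1 hour old  -> kept
conn.executemany("INSERT INTO highres VALUES (?, ?, ?)", rows)
conn.commit()

def delete_old_records(conn, cutoff_ts):
    # Parameter binding with "?" instead of string formatting; for a
    # DELETE, cursor.rowcount gives the number of rows removed, so no
    # separate SELECT COUNT(*) pass is needed:
    cur = conn.execute("DELETE FROM highres WHERE timestamp < ?", (cutoff_ts,))
    conn.commit()
    return cur.rowcount

deleted = delete_old_records(conn, now - 8*24*3600)
print(deleted)  # -> 2
</code>

Using ''rowcount'' also means the count and the delete can never disagree, which could happen with the two-query version if new rows land in between.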
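One more thing worth knowing about the file size: deleting rows does //not// shrink the database file, SQLite just marks the freed pages as reusable. To actually give the space back to the filesystem you have to run ''VACUUM'' once in a while. A tiny sketch of the effect (using an in-memory database as a stand-in for the real file, and ''PRAGMA page_count'' to observe the size):

<code python>
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real .db file
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10000)])
conn.execute("DELETE FROM t")  # frees pages, file size unchanged
conn.commit()

before = conn.execute("PRAGMA page_count").fetchone()[0]
conn.execute("VACUUM")  # rebuilds the database, releasing free pages
after = conn.execute("PRAGMA page_count").fetchone()[0]
print(before, after)  # page count drops after VACUUM
</code>

Since VACUUM rewrites the whole file it can be slow on a big database, so running it occasionally (e.g. daily) rather than on every 1 minute cycle seems like the reasonable choice.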