Deleting large volumes of things after a set date

For various reasons I am collecting analytics data within my application and storing it in my database. I only hold this for a period of 90 days, after which it can be deleted. I have two options: either schedule a workflow to delete the thing and the list of things associated with it, or run a scheduled workflow to delete all things after a given date. I'm fairly confident the former approach is optimal (doing a search is never good), but if anybody has any better suggestions or alternatives I would welcome the input.

Do you mean to retain a sliding window of 90 days? If so, you can keep the volume lower by matching the frequency of the data deletion with the frequency of the data creation.

Hi again Misha, not quite following re keeping the volume lower. How would that work? If today Person A does some stuff, the app creates the relevant records in the database, and in 90 days those records can be expunged. So the 90-day sliding window applies per visitor/viewer.

I was thinking of either:

A) Person A’s workflow that creates records also initiates deletion of old records… but adding additional delays for Person A would be pretty bad, especially in Bubble.

B) From the metrics you can work out how often items are created. For example, if volume is high daily between 7pm and 9pm, scheduling deletions daily at midnight and 5pm might keep the volume down without slowing things down during the busy times.

It’s hard to get around needing the search, but if it returns too many rows, you can break it up into several workflow steps with different date/time ranges.
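Splitting the search into smaller date/time ranges can be sketched like this (a minimal Python illustration; the window size and function name are assumptions, not anything from Bubble itself):

```python
from datetime import datetime, timedelta

def deletion_windows(cutoff, earliest, chunk_hours=6):
    """Split the range [earliest, cutoff) into smaller windows,
    so each deletion step only searches a bounded slice of records."""
    windows = []
    start = earliest
    step = timedelta(hours=chunk_hours)
    while start < cutoff:
        end = min(start + step, cutoff)
        windows.append((start, end))
        start = end
    return windows
```

Each `(start, end)` pair then becomes the date constraint for one workflow step, instead of a single search covering everything older than 90 days.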

Thanks Misha, very helpful, now I get it. I have been fiddling around with the Data API to try to take the load off Bubble to a point. Person A visits, then Person B and C and so on… stuff’s created in the DB, at which point I basically create a master thing, with a list of sub-things related to that visit.

Now I schedule a WF to action the master thing for deletion. 90 days later the scheduled WF fires and sends the master thing’s ID and all its sub-things’ IDs to a .php script I created, which recursively makes DELETE calls to the Data API endpoint. That all works, but I’m not sure about it at scale.
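A rough sketch of what such a cleanup script does, written here in Python for illustration (the app URL, type names, and the `send` callable are all hypothetical placeholders; Bubble’s Data API deletes one thing per `DELETE /api/1.1/obj/<type>/<unique_id>` call):

```python
# Hypothetical base URL of the Bubble app's Data API.
BASE = "https://yourapp.bubbleapps.io/api/1.1/obj"

def delete_visit(master_id, sub_ids, send):
    """Delete each sub-thing first, then the master thing.
    `send(method, url)` performs the actual HTTP call; it is
    injected so the logic can be exercised without a live app."""
    urls = [f"{BASE}/subthing/{sid}" for sid in sub_ids]
    urls.append(f"{BASE}/masterthing/{master_id}")
    for url in urls:
        send("DELETE", url)  # a real script would check the status code
    return urls
```

With the `requests` library, `send` could be something like `lambda m, u: requests.request(m, u, headers={"Authorization": "Bearer <token>"})`, with the token kept on the PHP server rather than in the browser.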

Wow, my mind boggles at how many schedules … it’d be worth checking with Bubble that it will handle the volume.

The PHP script will be running at around the time the workflow was scheduled, which would be around the same time of day that the user created the data, unless you add some offset. Do you find the Data API less database-intensive than deleting from a workflow?

Ohh… interesting point there Misha; it never occurred to me that setting scheduled workflows was any form of constraint or concern. I will check. I have really struggled in the past with deleting lists of things: I have found consistently that things in a list which I set to delete get missed, and I end up with annoying remnants, plus it slows everything down using Bubble to delete stuff. Using the Data API and making DELETE calls does not, from what I can tell, put any strain on the Bubble system, and it has proven consistent.

I’ve not considered adding an offset but could easily do so. Basically the Schedule API makes a POST to my own .php endpoint (on a separate server) with an array of unique IDs, and the .php script chugs away making DELETE calls. I will investigate.

If what you have will scale, then it’s all good, especially as you’ve already set it up.

If you want to change it, I’d go with …

  • Multiple scheduled batch deletions (which, for example, trigger a PHP script) for when you know the system isn’t busy.
  • Records in the batch not being dependent on something set up 90 days earlier, i.e. have it do a search for what is ready to delete.
  • For Data API deletion, run a Data API GET first to produce a candidate list (somewhat inefficient, as it returns all fields rather than just the IDs).
  • Batches limited to some reasonable number of records; for example, I think the Data API limits to 200 per call.
  • Don’t forget to handle rate-limit errors.
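The steps above can be sketched as a single batch run (a Python illustration with the search and delete calls injected; the 200-record cap, the 429 rate-limit status, and the backoff pauses are assumptions based on the points above, not documented Bubble behaviour):

```python
import time

def run_batch(fetch_candidates, delete_one,
              batch_limit=200, max_retries=3, pause=1.0):
    """One scheduled batch: search for things already past the
    retention window, then delete up to `batch_limit` of them,
    backing off and retrying when a call is rate limited (429)."""
    ids = fetch_candidates(limit=batch_limit)
    deleted = []
    for thing_id in ids:
        for attempt in range(max_retries):
            status = delete_one(thing_id)
            if status == 429:                # rate limited: back off, retry
                time.sleep(pause * 2 ** attempt)
                continue
            if status in (200, 204):         # delete succeeded
                deleted.append(thing_id)
            break                            # success or non-retryable error
    return deleted
```

Because `fetch_candidates` searches for whatever is currently past the window, a record missed by one batch (a failure, or a rate-limited call that exhausted its retries) simply shows up again in the next batch’s candidate list.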

The bonus of this approach is that if a batch fails, the next batch will cover the same records.

Edit: a second bonus is that it’s easy to change your parameters regarding what is in scope for deletion, for example changing to a 95-day window : )

Stroke of genius… I just would not have come up with that approach. Thanks Misha

Hi @Bubbleboy, I know this thread’s almost a year old, but it’s all I found on the forum that mentions executing a PHP script. How do you do it? On my side I keep getting a “cross-origin” problem.

Maybe @mishav, you could give me some advice?

Thanks in advance!

@julienallard1 for me the best route was to use the API Connector.

So I created a .php script which acts as an API endpoint; it expects a POST with a defined JSON body. I also implemented security… that was the gnarly bit.

In Bubble, using the API Connector, I make a POST to that .php script (it’s running on a standard PHP server) and pass it the list of things I need deleting.

The .php script then makes calls back into Bubble using the Data API and chugs away deleting stuff. No workflows, no impact on the Bubble app.
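The receiving endpoint’s job can be sketched like this (a Python stand-in for the PHP script; the shared-secret header and the `{"ids": [...]}` body shape are assumptions, since the original security scheme isn’t described in the thread):

```python
import hmac
import json

SHARED_SECRET = "replace-me"  # hypothetical shared secret, kept server-side

def handle_request(headers, raw_body):
    """Validate an incoming delete request: check the secret header,
    parse the JSON body, and return (status, ids). Mirrors the kind
    of checks the PHP endpoint would do before calling the Data API."""
    supplied = headers.get("X-Api-Secret", "")
    if not hmac.compare_digest(supplied, SHARED_SECRET):
        return (403, None)                      # reject unauthenticated calls
    try:
        ids = json.loads(raw_body)["ids"]
    except (ValueError, KeyError, TypeError):
        return (400, None)                      # malformed body
    if not isinstance(ids, list) or not all(isinstance(i, str) for i in ids):
        return (400, None)                      # ids must be a list of strings
    return (200, ids)
```

On a `(200, ids)` result the script would then loop over `ids`, making one Data API DELETE call per thing, as described above.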

I’m sure there are various third-party services that can do this. I decided on this approach because I control it and can do other things; for example, I can create a list and sequentially order the creation of things, which for my app is important.

@Bubbleboy, this is a brilliant solution.

On my end, I ended up executing a PHP script with custom data via a “Run javascript” action making an AJAX request.

I never got around to getting rid of the “cross-origin” issue when the PHP file was hosted on the same server as the Bubble app. Instead, I put the file on the remote server onto which I wanted to export the data. It now runs fine.

Thanks for answering back!
