Build a tool to check HTTP Status codes of a list of URLs

Hi,

I am looking for someone who can build me a tool to check the HTTP status codes of a list of URLs (around 10,000 URLs). For every link, it should store whether or not the response is a 404. Eventually, I want to trigger it automatically via a scheduled workflow every week or so.

Looking forward to hearing from you!

Hi Pascal,

I just saw your post and believe I would be a good fit for this role. Please contact me on Skype so that we can discuss further, or drop me an email about the task you want to accomplish. I can deliver the best quality at an affordable price.

Looking forward to hearing from you soon.

Regards,
Anna
Skype: cis.am2
anna.cis23@gmail.com

Are you wanting to trigger this on the same list every week, or have the ability to queue up a different list each time? I think this is doable depending on your preference for 3rd party software.

It will be an updated list every week. What 3rd party software would you think is feasible?

There will still be some manual work involved no matter how far you get with automation. This is due to limits on total number of rows you can run a workflow on in Bubble. You might be able to get around this with multiple workflows.

My first thought was Excel or Google Sheets. There are functions you could write that check HTTP status on a list/cell, in which case you would generate the status codes there and export as a CSV you then process with Bubble via CSV upload. My reasoning for recommending this route is mainly to put the bulk of the workload ‘offline’ or at least outside of your app, this way you’re not bogging down workflow runs. Running inside Bubble, assuming ~3 seconds per HTTP check, you’re looking at 8 hours of solid processing time. Using Excel or Sheets would be much faster if I’m thinking about this correctly.
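For reference, the 8-hour figure is simple arithmetic, sketched here in Python. The 3 seconds per check is the estimate from the post above; the `workers` parameter is a hypothetical addition to show how running checks in parallel outside Bubble would shrink the total, assuming the work spreads evenly.

```python
def estimated_hours(url_count: int, seconds_per_check: float, workers: int = 1) -> float:
    """Rough wall-clock estimate, assuming checks are spread evenly over `workers`."""
    return url_count * seconds_per_check / workers / 3600


# 10,000 URLs at ~3 s each, one after another
print(round(estimated_hours(10_000, 3), 1))      # 8.3

# Same list with 20 parallel workers (an assumed figure, not a measured one)
print(round(estimated_hours(10_000, 3, 20), 2))  # 0.42
```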

Makes sense. However, in my experience, uploading the CSV into Bubble also takes quite a while. Maybe it would make sense to split it up and only run it on 1/7 of the rows per day…
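One possible way to do the 1/7 split, as a rough Python sketch: assign each URL to a weekday using a stable hash, so a URL stays on the same day even when the list is updated. The function names are made up for illustration, and `zlib.crc32` is used only because Python's built-in `hash()` changes between runs.

```python
import zlib
from datetime import date


def urls_for_day(urls, weekday):
    """Return the URLs assigned to `weekday` (0 = Monday ... 6 = Sunday)."""
    return [u for u in urls if zlib.crc32(u.encode("utf-8")) % 7 == weekday]


def todays_batch(urls):
    """Return the slice of the list that should be checked today."""
    return urls_for_day(urls, date.today().weekday())
```

The slices will be roughly, not exactly, equal in size, and every URL is covered once over the course of a week.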

I think the best thing here would be to have a separate DB that you can then plug the Bubble front end into. The main reason is that storing each individual record in Bubble would cost a lot of CPU time. So you could do the HTTP checks with a simple Python or PHP script, record the results in a MySQL DB for example, and then create a nice Bubble front end for all of the views.

I tried this before by storing the data in Google Sheets and connecting via Blockspring. However, I experienced significantly longer loading times for overview pages (> 1 min). Do you have experience with external MySQL DBs? Does it affect the loading times?

There is already a Google Sheets add-on for this purpose - https://chrome.google.com/webstore/detail/seomango/oigecakbpjpkahfaeblingmgdhnjiknc?hl=en


I made a plugin to solve exactly this problem: “Broken Link Checker” (shameless self-promotion 🙂).
It doesn’t use a 3rd party service; it tests URLs from Bubble itself. Might fit your use case, might not.

You could use backend workflows to manage all the scheduling of the checks.
