Adding to two different databases

rtolmach · August 2, 2018, 5:59pm

Hello!

I currently have a whole database of schools from all over the country, and I am planning on adding new schools to that database by csv. Users can search schools by either City and State or by Zip Code. I was told by Bubble’s support that it would be more efficient to have separate tables for schools in each state, this way, when a user searches by City and State, Bubble can just call the state’s table and find schools in that city, rather than going through all of the schools to find the ones that are in the searched city and state.

I have an item called ‘School’ and another table with one field being the state and another field being the list of schools in that state. I am having trouble, however, telling Bubble to upload the csv of schools and into the table with all schools, and then adding each of the new schools to their respective lists in the other table.

Any thoughts or suggestions would be greatly appreciated! Thanks!

sridharan.s · August 2, 2018, 6:26pm

I’d be surprised if it’s much faster for users when the list is separated by state. Seems like that’d create a lot of additional complications for your app and for searching schools as well.

I’d suggest you start by putting it into a single table and then search that table. Have a field for the school name, another for the state, another for the city, etc. Bubble has pretty good search across a single table so long as you’re not referencing values on another table at the same time (which is when it gets slow in Bubble).

mente12 · August 2, 2018, 6:36pm

Strange approach!

So let me see if I understood your setup correctly:

You have a table A containing all the schools (school name, city, state, zip)
You have another table B of states (state name, list of schools from A)

Correct?

Let me see if I understand what you want to do:

Import CSV of schools into table A (this you can do easily)
Then some how ADD each school to the “list-of-school” field in the B table (this is where you are having trouble?)

Correct?

rtolmach · August 2, 2018, 8:57pm

Hi everyone,

Thanks to those who replied and to others who might read this.

Let me re-state and clarify the issue we are facing. I know this is a bit long, but I want to be super-clear.

Introduction / Request for a Nonprofit
I am interning with a nonprofit, www.ChangingThePresent.org, which helps generate donations for nonprofits and for classroom resources. We are new to bubble, and we are not tech-savvy, so we (and the nonprofits and children) would be most grateful for a bit of your time and guidance.

What We are Doing
We have a database of 135k schools with six fields for each (school name, city, state, ID#, zip code, and link). We will be adding additional schools several times a week.
We are building a tool to help visitors find any specific school. Most people will search by City and State. A few will search by Zip Code. When they search, they will see the info (school name, city, state, zip, and a link) on schools that match that parameter.

APPROACH #1

Description
The simple approach is obviously to have a single large data table with all of the school data, and people would simply search it. Easy to add more schools. Easy to search.

Problem
The problem with Approach #1 is that we expect a very large number of searches—certainly half a million over the next six months, and likely several million. Bubble has not been able or willing to give us any info on what, for instance, “2 units of reserved server capacity” translates to. However, as an emerging nonprofit, we desperately need to minimize our costs, and so we need to minimize the load on the Bubble server.

The following discusses some approaches and questions for each.

APPROACH #2

Description
Bubble said that we would put less of a load on the server (and spend less money) if we put the schools for each state into separate tables. That way, a person searching (by city and state) for a school in a given state would only cause the server to search the table for that state, which has an average of 2,700 schools, instead of searching the entire table of 135k schools. (We would still need one big table with all of the schools, in order to accommodate people who search by zip code).

Question:
Is that suggestion sound (as it seems to be): that searching a table with 2,700 schools would put less load on the server (and cost us less) than searching a table of 135k schools? Is the load on the server generally proportional to the number of records in the table?

Problem:
The challenge with 50 separate state data tables is importing new schools several times a week. Doing a separate CSV upload for each of 50 state data tables would be a burden. Indeed, even generating 50 separate CSV files would be a pain).

Question:
Is there a way that we can import (several times a week) a single CSV with the new schools from various states, and have each of the schools automatically added to the appropriate state table?

APPROACH #3

Description
We also considered the following setup:
• Table A contains all the data for all the schools (school name, city, state, zip, ID#, and link)
• Table B has two columns. One column contains the states. The other column has, for each state, the list of schools that are in that state.
• If someone searches for Houston + Texas, then the server would look up Texas in Table B; see the list of Texas schools; then retrieve from Table A all of info for all of the schools that are in the Texas list and that are in Houston.

Question:
Is this better or worse (load on server, and search speed) than the 50 separate state data tables that are described in APPROACH #2?

Problem:
The challenge is similar to that of APPROACH #2. In this case, when we import new schools several times a week, we would want to import a single CSV with new schools from various states, and have them automatically added BOTH to Table A and also to the appropriate state lists in Table B.

APPROACH #4

Description
This would be similar to APPROACH #2. However, instead of the system allocating each new school to the appropriate state table upon import, all of the new schools would be appended to a single big data table that holds all the schools. After each import, a batch event would populate (or re-populate) each of the 50 state data tables with the schools that are in that state.

Actually, it seems that a similar approach might be used as an alternative to APPROACH #3: After each import, a batch event would add populate or re-populate the list of schools for each state’s section of in Table A.

Thank You
Thank you very much for sharing your time and expertise. We could not be more appreciative!

mishav · August 2, 2018, 9:18pm

This is going to be unweildy on building the app … on every search you’ll need 52 alternative data sources conditional on state. Additionally, every use of the data will need the same conditionals. You may also find that slows the app down too much.

You could determine which state to look in with a preliminary search of a table which contains State, Zip.

APPROACH #98

Is there much more data than what you’ve outlined? How about keeping it stored in an external database, and accessing/searching it either with the Database Connector, or the API Connector, or from javascript on the page (for publicly available data).

I’d only suggest this if you have a very simple data structure, because its more fiddly than using the data within Bubble’s database.

keith · August 3, 2018, 2:15am

I’m kinda with @mishav on this. Don’t go too fiddly. “Do a search for…” on your database should be performant enough. In terms of optimization, I don’t know that I’d stress about this particular issue.

You can always complicate things later. That is, a workflow could be executed to split the db apart in the way you need. Not saying I know better than the bubble folks, but I feel like they are not anticipating side effects of more complex approaches. Also, if your searches grow in the way you predict, you may wind up just wanting higher performance everywhere and so adding capacity might not be the worst thing.

sridharan.s · August 3, 2018, 3:57am

I strongly advise keeping it simple and putting it all in a single table. More complexity isn’t going to save you server cost on this (databases can search large numbers of records quickly) and will simply slow your development down (which is a large cost itself).

Also FWIW, in my experience, a dedicated server can handle 50 concurrent users when you’re running lots of complex, hairy workflows for those users at the same time. And, probably something like 1,000 concurrent users when they’re just navigating your site (which is probably a lot more common for most websites). Of course, the devil is in the details with this, but these data points should give you some sense for it.

rtolmach · August 3, 2018, 3:23pm

Hi @mishav,

You win the gold star in comedy for “Approach #98” - thank you for making my day.

You get another gold star for your advice. We are not tech-savvy enough to deal with an external database connector at this time, but we will look into it as we grow.

Thank you for taking the time to share your expertise. We truly appreciate it.

rtolmach · August 3, 2018, 3:26pm

Hi @keith

Thank you for weighing in and offering good advice. The law of unintended consequences, eh? Solve a small problem by making a big one. We will take your advice and keep it simple.

Yes, adding capacity would be nice, but we do need to pinch our pennies. Life as a nonprofit.

Thanks again!

rtolmach · August 3, 2018, 3:29pm

Hi @sridharan.s

Thanks for the advice.

Is your reference to “a dedicated server can handle . . .” the same as what the bubble pricing page describes as “2 units of reserved server capacity”?

sridharan.s · August 3, 2018, 3:43pm

A dedicated server is Bubble’s “Production” plan on the pricing page.

Note - the numbers I gave are for the number of concurrent users. So, if you have 1 million users who each use your site every month for 10 mins. Then, during peak usage, you’d still have fewer than 1,000 concurrent users so long as people generally use it at different times of the day (which is the norm for website traffic patterns).

rtolmach · August 3, 2018, 10:46pm

Thank you, @sridharan.s !

ericterii · August 7, 2018, 5:08pm

These guys explain some very interesting stuff about B-tree and how it can “accelarate” queries
Using Simple Indexes to Optimize Complicated Sorts in Postgres

Dont know it that Structure is already implemented in Bubble Database system out of the box - or - if it would apply only to certain configuration case using external Databases…

rtolmach · August 9, 2018, 12:16am

Hi @ericterii
Thanks! That is all WAY over my head, but I will forward it to others here who know more than I do.

ericterii · August 13, 2018, 2:40pm

No problem, I believe that its a structural way data is arranged…still I m not sure it Bubble Data uses it out of the box or it has to be “activated” by each webmaster…