Bulk update data via upload

Hi All,

I’m trying to migrate data from our current site to our new bubble site but I’m having a lot of trouble migrating the data because we have so many Things that relate to one another. When I try to use the Upload function in the Database, it keeps throwing errors because of the relationships between Things.

Example:
User things has a Provider
Provider thing has a User and an Organization
Organization things has a Provider and List of Programs
Programs things has an Organization

When I try to upload a .csv of Users, it throws an error because it cannot find Provider thing in the db yet. Similarly, when I try to upload a .csv for Providers, it throws an error because it cannot find the Organization thing in the DB yet.

I’m uploading around 1,000 rows, so it is not very feasible to have to go into each Thing to manually update their relational data fields.

I’d be open to for example uploading thing User while ignoring Provider field, then uploading Provider thing while ignoring Organization field and then going back and doing an upload update to User thing to add in Provider after that thing has been created in the DB, but that doesn’t seem to be an option in Bubble.

Also, when I have a field that is a list in the .csv, it is not uploading correctly, even though in the csv the list in the given cell is [Category One , Category Two , Category Three]. In the DB it visually looks like the list was uploaded correctly, but when I preview the site, the new entries don’t show up in their respective Category and Subcategory.
04 PM

Can anyone please help? I’m not able to use my bubble site unless I can get this figured out. Many thanks in advance!!

1 Like

Are you migrating data from one Bubble app to another Bubble app, or from some other db to Bubble?

@eftomi I am migrating data from my custom build site which uses Python/Django, so I’m migrating the data via .csv files into Bubble.

Okay. The first consideration is the ‘primary keys’ - Bubble uses its own keys, stored in Unique_id field for each data type, you can see these in Data Tab, App Data. These values are used as foreign keys, too (for the relationships). As far as I know, you cannot generate these primary keys in advance by your own (i. e. outside of Bubble), but you can import these keys as ‘foreign keys’ - for instance, in your case User has a Provider, so you can import provider’s Unique_id in a field within User which relates to Provider thing. But this is true only if the Provider thing already exists with that particular Unique_id (as you mentioned).

As I researched during my migration project, the only way to do this kind of migration is to do it in steps:

  • firstly import all the data in all tables / data types, disregarding relationships (in this step, Bubble generates Unique_ids as primary keys)
  • secondly, set all the relationships with ‘foreign fields’ in tables with Unique_ids

The second step can be done in Bubble with bulk workflow, but it’s relatively slow. If you imported foreign keys from pre-migration database into Bubble, you have ‘relationships’ (but Bubble doesn’t respect them). You can then crate workflows with actions like Change User, Set field RelatedProvider = Search for Provider [Provider’s OldId = User’s RelatedProviderOldId].

The third step for me was to have bulk workflows for those relationships that I decided to be fields as lists of things.

I hope it’s clearly described. The process is quite cumbersome.

1 Like

@eftomi thank you so very much! It looks like I got it all to work following your steps - API workflows were the key.

One additional quick question that you might be able to shed light on - when I upload a .csv for my Programs thing, the first column, Status, gets uploaded with a space in front of it, causing me to manually go into the Bubble DB to adjust it so each program can be live on the site. I’ve cleared all formatting on the .csv file and have uploaded and re-uploaded several times with no luck. Any ideas?

1 Like

I’m glad it worked :slight_smile:

It looks like some newline characters are also there (\r, \n)? I suggest that you closely check the CSV. As I mentioned in some other post, Notepad++ is the right tool for this, since it understands different encodings and shows special characters (choose View | Show Symbol | Show All Characters in it). And you can then use its Find & replace functionality to clean the file.

1 Like

Thank you so very much for your help!!

Is it still the only/best way to upload a relational database via csv?
I’m trying to do it now for a simple app and it’s already a challenge…so I’m kinda freaking out just thinking about the other db I’m creating for another app with much more data and relationships in it.

If that’s the case would you be kind enough to detail a bit more the process please? not sure I can follow this one.

thanks

1 Like

As far as I know there’s nothing new about importing data to Bubble. I’m waiting for Action plugins: run custom javascript on the server (according to the roadmap), I hope that these plugins will have the functionality to create data types/tables, or at least that we could develop decent back-end APIs.

As for the above steps, where exactly do you have problems? Let’s say that you have two related tables:

  • Department (DepartmentID, Description)
  • Employee (EmployeeID, Name, Salary, HisDepartmentID)

The trouble with Bubble is that it uses it’s own internally generated IDs. So, if you import the above tables, you get:

  • Department (Unique_id, DepartmentID, Description)
  • Employee (Unique_id, EmployeeID, Name, Salary, HisDepartmentID)

To successfully make a connection from Employees to Departments, you have to create additional key in Employees table:

  • Employee (Unique_id, EmployeeID, Name, Salary, HisDepartmentID, HisDepartment)

of data type Department, which will then store the Unique_id of the related department - but it’s empty right now. As of yet, the information about the corresponding department is in HisDepartmentID, that’s why you have to create a workflow with actions:

Change Employee, Set field HisDepartment = Search for Department [Department’s DepartmentID = Employee’s HisDepartmentID].

You can then run the workflow for all the employees with bulk action (https://bubble.io/reference#Interface.DatabaseViewer.BulkAction).

The process is cumbersome because the import of large datasets is slow, and you have to do the above for any foreign key that you have in your db schema.

1 Like

Thanks for your quick reply and for the clarification on how relationships are stored.
I’m sorry but I’ve tried following your process to create an API workflow that’ll do exactly this (Just replace Employee and Department with Venue and City) but I just can’t make it work.

I don’t know how to properly set up the API workflow and every time I got something like ‘you’re trying to modify a list of things but it should be a thing’

EDIT: I think I need to re-upload my dataset as it was not corresponding to your exemple.

You mentioned the troubles with the list of things - Bubble can deal with related data in two ways; the first way is as I described above:

  • Department (DepartmentID, Description)
  • Employee (EmployeeID, Name, Salary, HisDepartment)

In this case, the field HisDepartment is of Department type. But you can do it also like this:

  • Department (DepartmentID, Description, EmployeesInDepartment)
  • Employee (EmployeeID, Name, Salary)

Here, EmployeesInDepartment is of Employee data type, as a list of things.

The easiest way to replicate classic entity-relationship model is the first option, although Bubble is sometimes faster if you do the database structure as shown in the latter case.

My example of the workflow in previous post is related to the first option.

1 Like

Ok I see. this is actually one of my early interrogations.
Because in my app I would like to display to what Department an Employee belongs to but also, let’s say on Department Page, how many employees there are in a Department.
So should I use only one of these two options or both?

Right now I’m re-uploading my dataset with a new column corresponding to HisDepartmentID (text) from your example populated with the corresponding Department name. I didn’t have that before.

So should I use only one of these two options or both?

It depends - your choice will influence a couple of things when developing the app (this is not about migration):

  • how many tasks should be performed when adding and updating data; you have to have all data in sync;
  • how the querying will be done.

For instance, if you have a “combined” structure like

  • Department (DepartmentID, Description, EmployeesInDepartment)
  • Employee (EmployeeID, Name, Salary, HisDepartment)

and there’s new employee to add into the database, you have to add a new employee to the list EmployeesInDepartment within his Department, and set the department in the HisDepartment field of this Employee.

In old times we called this data redundancy :slight_smile: because the information about the employee’s department is stored twice. In can also happen that the information in EmployeesInDepartment is not the same as in HisDepartment - in that case you have a problem.

And there’s also a querying part of the story: Bubble offers a couple of ways to query the data either way, some approaches are better and quicker than others. Let’s say that you need all the employees in the department; you can do this by:

Department's EmployeesInDepartment

or

Search for Employees [HisDepartment = <department>]

The first approach is quicker, since you already build the list of all employees in department when you put the data in DB. However, if you would like to speed up the data entry for some reason, and you don’t need that particular query, then you can drop the EmployeesInDepartment field and use just HisDepartment.

To be complete, you can find the employee’s department by:

Employee's HisDepartment

or

Search for Department [EmployeesInDepartment contains <employee>]

So, your choice should depend from the insert, update and query actions that you expect to have in the app. You can find some general ideas about apps optimization and “architecture” here.

2 Likes

Thanks a lot for that! This are things I wasn’t taking into consideration. I read about data redundancy and why we should avoid it but I didn’t know if it was applicable to Bubble. I’m still working on getting a full grasp of Bubble and its database.
This helps, thank you

1 Like