Create an API to find words in a text

Hello everyone,

Here is my deal

I want to know if some words are present in a text.

So, I have a CSV of 40,000 words, and when I send a text document, I want Bubble to return which of those words are present in it.

Given the size of the database and the time such an operation might take, I’m a bit lost about the right way to do it…

How do I upload such a large list to Bubble?
Is it better to call an API workflow to do the operation? And how long would it take?

Or alternatively, do you know of an external API service where I can upload my list and have the API scan my doc (I could then connect it to Bubble…)?

In advance, thanks for your help!

Matt

any thoughts? :wink:

Whole words ?

yes!

I mean I have a list with ‘banana, apple, orange,…’ and a text like ‘I ate a banana and an apple’, then I want ‘banana’ and ‘apple’ returned.

While responding to your question, I realized that I’m now stuck even with the normal method…

I uploaded the entire CSV.
I do a SET LIST with a Search for fruits, filtered by “recipe contains this fruit”.

Unfortunately, the fruit name is also returned when the string is contained in another word. For example, if there is pineapple in the recipe, the word apple is returned…

I can’t figure out how to restrict the filter to whole words only.
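
For anyone hitting the same problem outside Bubble, the difference between a substring match and a whole-word match can be sketched in Python. This is only an illustration of the idea, not how Bubble's `contains` operator is implemented; the function names and the sample recipe are made up for the example.

```python
import re

def substring_matches(words, text):
    """Naive 'contains' check: also matches inside longer words."""
    lowered = text.lower()
    return [w for w in words if w.lower() in lowered]

def whole_word_matches(words, text):
    """Match a word only when it stands alone, using \\b word boundaries."""
    found = []
    for w in words:
        # re.escape guards against list entries containing regex characters
        pattern = r"\b" + re.escape(w) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(w)
    return found

fruits = ["banana", "apple", "orange"]  # hypothetical word list
recipe = "I ate a banana and a pineapple"

print(substring_matches(fruits, recipe))   # 'apple' is found inside 'pineapple'
print(whole_word_matches(fruits, recipe))  # only standalone words are found
```

Note that this runs one regex per list entry, so it shows the matching rule rather than a fast way to check a very large list.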

Hey Matt,
I used Blockspring with Alchemy and Google sheets for text analysis external to Bubble, but you could probably access Alchemy via the Blockspring plugin in Bubble if you needed to.

Hi Nathan,

Thanks for your help. I have already used Alchemy for other tasks. But here my issue is to identify my own list of words in a text. I don’t think that with Alchemy I can set up my own entities/keywords list to be found in a text… what do you think?

M

I would be interested in how you got the partial search to work, as we haven’t got it to work that way in the past !

Hi Nigel,

Not sure if we are really thinking of the same thing, but I just did this (see snapshot). In return, I got the skills that are present in jobdescr, even when they appear within a larger string (which is something I don’t want).
I guess that in your case, you would rather get back the larger expressions in which a given string has been partially identified. Am I right?

Yes, that is it. So usually “contains” doesn’t return partial matches. Hence the thread I linked.

But your method seems to do what we didn’t think you could do !

Ahaha, Glad that my issue can appear as a solution for you guys ! :joy:

Well, I am not sure how you did it. But you can use a regex as described above to pull out words if you want.

Although you shouldn’t need to.

Thanks for the advice. I tried the regex method; unfortunately, with 40,000 words to check, it’s definitely too slow. I might have to find an external service where I can input my word list and my text and get results from there…

It looks simple as an API service, but I’ve been struggling for two days to find one! :confused:

Not sure that you will find an API that you can throw that amount of data at. You could store your data on something like ElasticSearch and then use that to do the search, as it will do all the indexing needed to make it performant.

Breaking your text up into words in Bubble, and then storing them for searching against your 40,000-word list, might take a while, that is for sure.
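
Outside Bubble, the "break the text into words, then check each against the list" idea can be sketched in a few lines of Python. This is a rough sketch under some assumptions: every list entry is a single word, matching is case-insensitive, and the names `load_word_set` and `find_listed_words` are hypothetical. An external service would still have to wrap something like this in an endpoint that Bubble can call.

```python
import re

def load_word_set(words):
    """Normalize the word list once and store it in a set for fast lookup."""
    return {w.strip().lower() for w in words if w.strip()}

def find_listed_words(text, word_set):
    """Split the text into whole words and keep those present in the set."""
    tokens = set(re.findall(r"\w+", text.lower()))
    # Set intersection only ever compares whole tokens, so 'pineapple'
    # can never match 'apple'.
    return sorted(tokens & word_set)

# In practice the 40,000 entries would be read from the CSV instead.
word_set = load_word_set(["banana", "apple", "orange"])
result = find_listed_words("I ate a banana and a pineapple, then an Apple.", word_set)
print(result)
```

Because each set lookup takes roughly constant time, the cost grows with the length of the document rather than with the size of the word list. Multi-word entries (for example "passion fruit") would need a different approach.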

Yes, I find that when used in a search constraint, ‘contains’ does not return partial matches (which might suit @mattmazzega), but when used in a list filter constraint, it does. So it would seem that Bubble is happy to do partial matches if the whole list has been returned first.

You’re right. Unfortunately for me, search constraints do not allow advanced commands the way filters do. So I cannot ask Bubble to constrain the search to things that ‘are contained’ by another thing…

Did you find a solution?

Hi @nomads32n, sorry, I never found a solution to this…