Developing an AI feature for free?

We’re producing a new version of an app that provides geospatial data and I want to explore adding a search feature that can turn a natural language query into structured data which contains the type of data (basically an enumeration) and/or a bounding box, plus info about date range, etc. I am under the impression that I need to create an embedding based on description and geometry.
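To make the target concrete, here is a rough sketch of the kind of structure I have in mind, and how a JSON answer could be checked before it is used. The category names, the field names (`data_type`, `bbox`, `start`, `end`), and the WGS 84 lon/lat ordering are all placeholders I made up for illustration, not the app's real schema.

```python
import json
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional, Tuple


class DataType(Enum):
    # Hypothetical categories; the real app would list its own.
    ELEVATION = "elevation"
    LAND_COVER = "land_cover"
    IMAGERY = "imagery"


@dataclass
class SearchQuery:
    data_type: Optional[DataType] = None
    # (min_lon, min_lat, max_lon, max_lat), assumed to be WGS 84 degrees
    bbox: Optional[Tuple[float, float, float, float]] = None
    start: Optional[date] = None
    end: Optional[date] = None


def parse_search_query(raw: str) -> SearchQuery:
    """Parse a JSON string into a SearchQuery; raise ValueError on bad input."""
    obj = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")

    # Enum lookup by value raises ValueError for unknown categories.
    data_type = DataType(obj["data_type"]) if obj.get("data_type") else None

    bbox = None
    if obj.get("bbox") is not None:
        values = obj["bbox"]
        if len(values) != 4:
            raise ValueError("bbox needs exactly four numbers")
        min_lon, min_lat, max_lon, max_lat = (float(v) for v in values)
        # Note: this simple check rejects boxes that cross the antimeridian.
        if not (-180 <= min_lon <= max_lon <= 180 and -90 <= min_lat <= max_lat <= 90):
            raise ValueError("bbox out of range or inverted")
        bbox = (min_lon, min_lat, max_lon, max_lat)

    start = date.fromisoformat(obj["start"]) if obj.get("start") else None
    end = date.fromisoformat(obj["end"]) if obj.get("end") else None
    if start and end and start > end:
        raise ValueError("start date is after end date")

    return SearchQuery(data_type=data_type, bbox=bbox, start=start, end=end)
```

Every field is optional, so a query that only mentions a category or only a place still produces a usable result.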

This is a side project and there is no budget for tooling, so I am trying to figure out what tools I need to use to be able to develop this for free. I might be able to spend a few bucks per month for an AWS service, but that’s about it. Does anyone have any thoughts?

Do you mean you want something that can turn a natural-language query like “coffee shops near here” into a PostGIS query?

If you have a relatively standard database, probably any LLM API can do this for you (your pick of OpenAI, Azure, AWS, etc.) without any additional training, as long as you specify your own requirements along with the context. For example, add “our database has these tables and fields, assume this projection, blah blah, return only the SQL and nothing else, double-check to make sure it’s valid PostGIS, etc.” to every query you send to the API so it’s part of the same context. You’d probably still want to check the returned query with some sort of non-LLM validator before running it, just to make sure it’s not the result of a prompt injection or jailbreak. There is always a risk of hallucination or malicious use, but that’s just LLMs for ya.
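A minimal sketch of both halves of that idea: prepending fixed schema context to every request, and running a crude non-LLM check on whatever SQL comes back. The `places` table, its columns, and the function names are invented for illustration. The keyword filter is deliberately conservative and is not a real SQL parser, so it is probably best paired with a read-only database role and a statement timeout.

```python
import re

# Hypothetical schema description; replace with the real tables and fields.
SCHEMA_CONTEXT = """You translate user requests into a single PostGIS SELECT statement.
Table places(id integer, name text, category text, geom geometry(Point, 4326)).
Return only the SQL and nothing else."""


def build_prompt(user_query: str) -> str:
    """Attach the same schema context to every request sent to the LLM."""
    return f"{SCHEMA_CONTEXT}\n\nUser request: {user_query}\nSQL:"


# Keywords that should never appear in a read-only query.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|revoke|copy|into)\b",
    re.IGNORECASE,
)


def is_safe_select(sql: str) -> bool:
    """Accept only a single SELECT statement with no comments or write keywords.

    This errs on the side of rejecting: a harmless query whose string literal
    contains a word like 'update' is refused too.
    """
    stmt = sql.strip().rstrip(";").strip()
    # A remaining semicolon means more than one statement; comments can hide text.
    if ";" in stmt or "--" in stmt or "/*" in stmt:
        return False
    if not re.match(r"select\b", stmt, re.IGNORECASE):
        return False
    return FORBIDDEN.search(stmt) is None
```

A query that fails the check would simply be discarded (or the user asked to rephrase) rather than sent to the database.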

If your backend isn’t a standard DB, you can also write your own MCP server to allow LLMs to better interact with it… that’s probably just a few hours/days of work.

If you want something more constrained, you can fine-tune an existing open-source model against training data of your own, real or synthetic, for better performance. You’d need hardware of your own or you can rent a cloud GPU for a few days/weeks…

But you probably can’t afford to train a model from scratch since you don’t have any resources…

Does that help at all? Or did you mean something else?

Also, it seems to me that this is the type of problem that’s easier to solve with a good filters GUI (let the user choose category, draw a bounding box, select a date range, etc.) so they don’t have to guess at wording. It’s also a lot easier to click on those than to type out a whole sentence. And there are many existing geospatial apps and APIs (Google Maps, Smarty, etc.) to draw inspiration from.

Develop locally using Ollama.
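Something like the following should be enough to experiment with. It is a sketch that assumes an Ollama server on its default local port and a model you have already pulled; `llama3.2` is only a placeholder, and the JSON field names in the prompt are made up. It uses Ollama's `/api/generate` endpoint with `"stream": false` and `"format": "json"`, so the model's answer arrives as one JSON string in the `response` field.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(user_query: str, model: str = "llama3.2") -> dict:
    """Build the request body; the model name is a placeholder assumption."""
    prompt = (
        "Extract search filters from the request below and answer with a JSON "
        'object using the keys "data_type", "bbox", "start" and "end". '
        "Use null for anything the request does not mention.\n\n"
        f"Request: {user_query}"
    )
    return {"model": model, "prompt": prompt, "stream": False, "format": "json"}


def extract_filters(response_body: str) -> dict:
    """Pull the model's JSON answer out of a non-streaming response body."""
    outer = json.loads(response_body)
    return json.loads(outer["response"])


def ask_ollama(user_query: str) -> dict:
    """Send the query to the local server (requires Ollama to be running)."""
    data = json.dumps(build_payload(user_query)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=60) as resp:
        return extract_filters(resp.read().decode("utf-8"))
```

With the server running, `ask_ollama("satellite imagery of Iceland from 2021")` returns whatever dictionary the model produced. Small local models can still emit wrong or missing keys, so the result needs validating before use.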

When it moves into production, someone has to foot the bill for the LLM usage.

You can either host Ollama yourself, ask users to provide an API key, or foot the bill yourself for a service (Amazon Bedrock is pence per call, but if it’s public-facing someone will abuse it).