For a specific task, we would generate embeddings for sample prompts based on function descriptions. Then:
Search vector database → Get the most suitable action to carry out → Use an LLM to get the input parameters → Execute the action
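Roughly, the flow looks like this. This is only a minimal sketch to show the shape of it: the embedding model, the action names, and the use of plain cosine similarity in place of a real vector database are all illustrative, not the actual implementation.

```python
import json
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; any embedding provider would do

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data], dtype=np.float32)

# 1) Index: one embedding per action, built from its description / sample prompts
actions = {
    "create_reminder": "Create a reminder at a given time with a message.",
    "send_email": "Send an email to a recipient with a subject and body.",
}
names = list(actions)
index = embed([f"{name}: {desc}" for name, desc in actions.items()])
index /= np.linalg.norm(index, axis=1, keepdims=True)

def run(query: str):
    # 2) Search: cosine similarity stands in for the vector database lookup
    q = embed([query])[0]
    q /= np.linalg.norm(q)
    action = names[int(np.argmax(index @ q))]

    # 3) The LLM is only used to fill in the parameters for the chosen action
    prompt = (
        f"Extract the input parameters for the action '{action}' "
        f"({actions[action]}) from this request: '{query}'. "
        "Respond with a JSON object only."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    params = json.loads(resp.choices[0].message.content)

    # 4) Execute: the codebase, not the LLM, dispatches the real function
    return action, params
```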
The goal is to make it simpler to automate tasks from natural-language queries. Unlike systems that rely on LLMs end to end, here the LLM mainly interprets the command, while the codebase handles the actual execution.
Do you have any suggestions for improvements, or things I should think about? Are there any features that could make this system even more helpful?
I’m having trouble understanding the reasoning behind this. No judgment, just trying to understand.
So, unlike other systems that fully rely on LLMs for everything, in this case, the LLM is mostly for interpreting commands, while the code handles the actual action execution.
What benefit do you get from partially removing the LLM? Using embeddings seems like it makes your product less effective because you’re not letting the LLM handle the full text interpretation from the start.
The savings from using embeddings make sense when you don’t need perfection and plan on doing this a lot. But GPT-4o mini is pretty affordable, and it’ll likely be more reliable than embeddings. Plus, you’re already using an LLM for an important part of this.
@Scout
The problem with LLMs is that they don’t remember function details between calls, so I’d have to resend the descriptions on every request, which isn’t efficient. I also want offline support for situations where I can’t reach the hosted models. Using embeddings and a local vector database lets the system work offline and reduces the reliance on LLMs. Open-source models, while not as strong in context or accuracy, are still good enough for extracting parameters.
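For the offline path, a small local embedding model covers the lookup. Something like the following would work (the model name and library are just examples of the idea, not what I’ve settled on):

```python
# Offline variant (sketch): a local embedding model replaces the hosted API,
# so the vector search needs no network access.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")  # small, runs fine on CPU

def embed(texts):
    # normalize_embeddings=True gives unit vectors, so a dot product is cosine similarity
    vecs = model.encode(texts, normalize_embeddings=True)
    return np.asarray(vecs, dtype=np.float32)

# The indexing and similarity lookup stay the same as in the online flow;
# a local open-source LLM then only has to extract the parameters.
```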
@Scout
I’m not the original poster, but since the AI is only being used to figure out the action stack, I’d want to keep the AI usage light for better performance.