What do you think about this way of combining function-calling with natural language?

For a specific task, we would generate embeddings for sample prompts based on function descriptions. Then:

Search vector database → Get the most suitable action to carry out → Use an LLM to get the input parameters → Execute the action

The goal is to make it simpler to automate tasks from natural language queries. Unlike systems that rely on LLMs end to end, here the LLM mainly interprets the command, while the codebase handles the actual action execution.
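The search-then-execute flow above can be sketched roughly like this. This is a toy illustration, not code from the linked repo: the action names and descriptions are made up, and a bag-of-words vector stands in for a real embedding model and vector database.

```python
import math

# Hypothetical action registry (illustrative names, not from text_to_action).
ACTIONS = {
    "send_email": "Send an email to a recipient with a subject and body",
    "create_event": "Create a calendar event at a given date and time",
    "set_reminder": "Set a reminder with a message and a time",
}

def embed(text: str) -> dict:
    """Toy bag-of-words 'embedding' standing in for a real embedding model."""
    vec = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Index step: embed each action description once (the "vector database").
INDEX = {name: embed(desc) for name, desc in ACTIONS.items()}

def best_action(query: str) -> str:
    """Search step: return the action whose description is nearest the query."""
    q = embed(query)
    return max(INDEX, key=lambda name: cosine(q, INDEX[name]))

action = best_action("please set a reminder to call mom")
# In the real pipeline, an LLM would now extract the parameters for `action`,
# and the codebase would execute the matching handler.
```

The key point of the design is that only the last two steps need an LLM at all; the action lookup itself is a plain nearest-neighbour search.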

Do you have any suggestions for improvements, or things I should think about? Are there any features that could make this system even more helpful?

https://github.com/sri0606/text_to_action


Looks pretty good! Do you plan to support TypeScript or JS?

Morgan said:
Looks pretty good! Do you plan to support TypeScript or JS?

Are you asking for that kind of support, or would you prefer native support?

Check out server.py and main.js here:

@Oli
Yes, I can try it that way when a use case comes up! Thanks

I’m having trouble understanding the reasoning behind this. No judgment, just trying to understand.

So, unlike other systems that fully rely on LLMs for everything, in this case, the LLM is mostly for interpreting commands, while the code handles the actual action execution.

What benefit do you get from partially removing the LLM? Using embeddings seems like it makes your product less effective because you’re not letting the LLM handle the full text interpretation from the start.

The savings from using embeddings make sense when you don't need perfect accuracy and plan on doing this at scale. But GPT-4o mini is pretty affordable, and it'll likely be more reliable than embeddings. Plus, you're already using an LLM for an important part of this.

@Scout
The problem with LLMs is that they don’t remember function details between calls, so I would have to send descriptions each time, which isn’t efficient. Also, I wanted offline support, where I can’t always access the latest models. Using embeddings and a local vector database allows the system to work offline and reduces the reliance on LLMs. Open-source models, while not as strong in context or accuracy, are still good enough for extracting parameters.

@Scout
I’m not the original poster, but since the AI is only being used to figure out the action stack, I want to keep the AI usage light for better performance.