I’m coming from a DevOps background, and with AI being everywhere, I thought I’d try something out myself.
I built a small app that uses ChatGPT (through an API key) to generate results based on user input and then show them in a UI.
For my case, is there a way to fine-tune ChatGPT (or any LLM)? I’m trying to make my app more specific to its domain.
Right now, I’m using prompts — I have a system prompt where I explain what the user input is about and the overall function of the tool. It works okay, but I want to make it more specialized. What should I do?
I’m pretty new to this area, so please go easy on me.
You could use LangChain with a vector database to create embeddings from your custom data, or use custom GPTs, though I’m not sure whether those are accessible through the API. You could automate feeding your data into a custom GPT, but it’s a bit of work.
Look into LangFlow too; it gives LangChain a visual interface and might help.
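To make the embeddings idea a bit more concrete, here’s a rough, untested sketch that skips LangChain and just calls the OpenAI Python client with an in-memory index (the model name and example docs are placeholders):

```python
# Rough sketch: embed your domain docs once, then retrieve the closest ones per query.
# Assumes the openai and numpy packages; "text-embedding-3-small" is just an example model.
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

docs = [
    "Internal doc: how our deployment pipeline works...",
    "Internal doc: troubleshooting guide for service X...",
]

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vectors = embed(docs)

def retrieve(query, k=2):
    q = embed([query])[0]
    # cosine similarity between the query and every doc
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

print(retrieve("How do I roll back a bad deployment?"))
```

A vector database (or LangChain’s wrappers) would replace the in-memory numpy part once you have more than a handful of documents.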
@Carter
Thanks! I watched a YouTube video (timestamp: https://youtu.be/5vvtohsuo6A?t=327) that talks about creating a new model using synthetic data. Is this a different method?
There’s also something called RAG (Retrieval-Augmented Generation). I think the video also shows how you can fine-tune directly through an OpenAI endpoint, so you might be able to do that too. But it’s a long video and I’m tired, sorry.
Marley said: @Carter
Oh, did you mean the video I shared? It’s only 6+ minutes long.
Sorry, I mixed it up. I rewatched it, and yes, it looks possible using that method, but the video uses GPT-3.5. Newer models may have different fine-tuning options.
Ask GPT for the latest info on fine-tuning newer models.
When improving your AI app, there are a few steps you might want to consider:
Prompting
RAG (Retrieval-Augmented Generation)
Fine-tuning
RAG (Retrieval-Augmented Generation)
RAG helps by adding relevant info to the model at request time, by:
Understanding the user’s query
Retrieving relevant data
Adding the retrieved info to the prompt so the model can use it in its answer.
It’s useful when you need accurate answers based on specific data, like company documents. You can use tools like embeddings and retrieval libraries for this.
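Here’s a minimal, untested sketch of that flow using the OpenAI Python client. The model name is just an example, and the retrieved docs would normally come from your own embeddings/vector search step:

```python
# Rough sketch of the RAG flow: retrieve relevant text, add it to the prompt, ask the model.
# Assumes the openai package; "gpt-4o-mini" is just an example model name.
from openai import OpenAI

client = OpenAI()

def answer_with_rag(user_query: str, retrieved_docs: list[str]) -> str:
    # retrieved_docs would come from your retrieval step (e.g. the embedding sketch
    # earlier in the thread, or a vector database); here it's simply passed in.
    context = "\n\n".join(retrieved_docs)
    messages = [
        {
            "role": "system",
            "content": "Answer using only the provided context. "
                       "If the context doesn't cover the question, say so.",
        },
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {user_query}"},
    ]
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return resp.choices[0].message.content

print(answer_with_rag(
    "How do I roll back a bad deployment?",
    ["Internal doc: to roll back, redeploy the previous image tag with ..."],
))
```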
Fine-tuning
Fine-tuning focuses on adjusting the LLM’s overall behavior. It doesn’t fetch data in real time like RAG; instead, it teaches the model to produce specific kinds of output.
Use fine-tuning when:
You need output in a particular structure
You want a consistent style or format
You have example prompts and responses to teach the model (see the sketch below).
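A rough sketch of what that looks like with the OpenAI fine-tuning API. The model name is just an example (check the docs for which models currently support fine-tuning), and in practice you’d want far more training examples than this:

```python
# Rough sketch of OpenAI fine-tuning: upload chat-format JSONL examples, then start a job.
# Assumes the openai package; "gpt-4o-mini-2024-07-18" is just an example of a tunable model.
import json
from openai import OpenAI

client = OpenAI()

# Each training example is one JSON line containing a full chat exchange.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a deployment assistant. Answer in the team's standard report format."},
            {"role": "user", "content": "Summarize the failed deploy of service X."},
            {"role": "assistant", "content": "Status: FAILED\nService: X\nCause: ...\nNext steps: ..."},
        ]
    },
    # ... ideally dozens of examples covering your domain
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

training_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",
)
print(job.id)  # poll this job until it finishes
```

Once the job completes, you swap the returned fine-tuned model name into your existing chat completion calls.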
When to Use Each
Use RAG for real-time info retrieval
Use fine-tuning for consistent, structured outputs.
Both methods are available through the OpenAI API, and the official docs cover them in detail.