I’ve tried coding and every one I’ve tried fails unless really, really basic small functions like what you learn as a newbie compared to say 4o mini that can spit out more sensible stuff that works.
I’ve tried explanations and they just regurgitate sentences that can be irrelevant, wrong, or get stuck in a loop.
So. what can I actually use a small LLM for? Which ones? I ask because I have an old laptop and the GPU can’t really handle anything above 4B in a timely manner. 8B is about 1 t/s!
No, what is it? How do I try it?
RAG is basically like telling an LLM “look here for more info before you answer” so it can check out local documents to give an answer that is more relevant to you.
You just search “open web ui rag” and find plenty kf explanations and tutorials
I think RAG will be surpassed by LLMs in a loop with tool calling (aka agents), with search being one of the tools.
LLMs that train LoRas on the fly then query themselves with the LoRa applied