How I use the recent llama.cpp native tools to do web RAG, a.k.a. web_fetch (or anything else for that matter), directly from inside llama-server's web UI
A user on Reddit's r/LocalLLaMA shared a detailed method for enabling retrieval-augmented generation (RAG) and other command-line functionality within the llama.cpp server's web UI. The approach has three parts: enabling native tools in llama-server, installing and configuring `firejail` for system-wide sandboxing, and creating a dedicated user with a virtual-machine container harness called `smolmachines`. The result is a multi-layered sandbox that lets the LLM safely execute commands, such as fetching web content with `wget`, directly from its interface.
AI IMPACT: Lets locally hosted LLMs perform web retrieval for RAG and run sandboxed commands directly from the llama-server web UI, making them more useful for multi-step tasks.
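To make the sandboxed fetch step concrete, here is a minimal sketch of how a `wget` call might be wrapped in `firejail` before its output is handed back to the model. It is not the original poster's configuration: the helper names `build_fetch_command` and `fetch`, the chosen `firejail` options, and the size cap are all assumptions for illustration. The llama-server tool setup and the `smolmachines` layer are left out because the summary does not describe their interfaces.

```python
import subprocess
from urllib.parse import urlparse

# Assumption: an illustrative hardening set, not the profile from the post.
FIREJAIL_OPTS = ["--quiet", "--private", "--noroot", "--caps.drop=all", "--seccomp"]


def build_fetch_command(url: str, timeout_s: int = 15) -> list[str]:
    """Return an argv list that runs wget inside firejail for one URL."""
    parsed = urlparse(url)
    # Only plain web URLs are allowed; file://, ftp:// and the like are rejected.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"refusing non-web URL: {url!r}")
    # -q: quiet, -O -: write the page to stdout, "--": end of wget options.
    wget = ["wget", "-q", "-O", "-", f"--timeout={timeout_s}", "--tries=1", "--", url]
    return ["firejail", *FIREJAIL_OPTS, *wget]


def fetch(url: str, max_bytes: int = 200_000) -> str:
    """Run the sandboxed fetch and return at most max_bytes of decoded text."""
    cmd = build_fetch_command(url)
    # Passing a list (no shell=True) keeps the URL from being shell-interpreted.
    result = subprocess.run(cmd, capture_output=True, timeout=60, check=True)
    return result.stdout[:max_bytes].decode("utf-8", errors="replace")
```

Calling `fetch("https://example.com/")` on a machine with both `firejail` and `wget` installed would return the page body as text. Building the command as an argument list rather than a shell string is a deliberate choice here, since the URL ultimately comes from model output and should never be interpreted by a shell.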