I’ve been learning AI for several years now, but not with paid models. I took a different path: I went local. Sure, I use Anthropic, OpenAI, and Copilot at work, but I wanted to understand both sides of things. This led me to try several local AI hosting options, including the popular Ollama and LM Studio.
I ultimately landed on llama.cpp. The work the developers are doing is truly amazing, and the number of parameters it lets you experiment with makes it a great learning tool. It helped me understand what a model’s temperature, context window, k-quantization, and KV-cache quantization do and why they matter. And the best part of using local models is that you can still tie them into harnesses: Claude Code, GitHub Copilot CLI, Qwen Code, you name it.
But enough of that; I’ll post more on local and Enterprise AI development in the future. This is about a different project I’ve been working on: Bandage. Bandage is a completely local multi-agent offensive or defensive penetration testing program. The architecture looks something like this (thanks, Google Gemini):
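To make the temperature point concrete, here is a minimal sketch of temperature-scaled softmax in plain Python. It is not llama.cpp's sampler code, and the logits are made up; it only shows the general idea that lower temperatures sharpen the token distribution and higher ones flatten it.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate tokens (illustrative only).
logits = [2.0, 1.0, 0.1]
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Running it shows the top token taking a larger share of the probability as the temperature drops, which is why low temperatures feel more deterministic.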

It consists of:
- Reconnaissance agent that leverages Playwright/Chromium to crawl websites, as well as directories, single files, or GitHub repositories.
- Code Audit agent for source code vulnerability analysis.
- Offensive Pentesting agent for web and API vulnerability analysis.
- Validation agent that reviews the findings and ensures accuracy.
- Reporting agent that summarizes the findings and drafts an executive report.
The offensive pentesting agent leverages Kali Linux on the backend through a direct SSH connection. All of the Kali Linux tools have been exported to a JSON file, which the agent ingests to understand what it has at its disposal.
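As a rough illustration of the tool-catalog idea, here is a minimal sketch. The JSON schema, the `load_catalog` and `build_ssh_command` helpers, and the host names are all hypothetical; Bandage's real format and SSH handling are not shown in this post. The sketch only builds the command line and does not open a connection.

```python
import json
import shlex

# Hypothetical catalog format, for illustration only.
CATALOG_JSON = """
[
  {"name": "nmap", "description": "Network and port scanner"},
  {"name": "gobuster", "description": "Directory and file brute-forcer"}
]
"""

def load_catalog(raw):
    """Index the tool catalog by tool name."""
    return {tool["name"]: tool for tool in json.loads(raw)}

def build_ssh_command(catalog, host, tool, args):
    """Return an argv list that would run a catalogued tool on the remote host."""
    if tool not in catalog:
        raise ValueError(f"tool not in catalog: {tool}")
    # Quote each argument so the remote shell sees it as a single word.
    remote = shlex.join([tool, *args])
    return ["ssh", host, remote]

catalog = load_catalog(CATALOG_JSON)
cmd = build_ssh_command(catalog, "kali@kali.example", "nmap", ["-sV", "target.example"])
print(cmd)
```

Checking the requested tool against the catalog and quoting the arguments are two cheap guardrails worth having whenever a model is the one choosing the command.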

What’s interesting about all of this is its ability to actually compete. I originally tested it against a few deliberately vulnerable targets: DVWA (Damn Vulnerable Web Application), DVAPI (Damn Vulnerable API), and DVMCP (Damn Vulnerable Model Context Protocol). Then I sat back and let it rip. The agent reviews the recon reports, develops a plan based on them, and begins to execute.
One of the coolest things to me is that the AI picks the tools and parameters, with the option for me to jump in at any point and give it direction should I choose. In the screenshot below, it first scans the host, runs gobuster, finds a login page, and begins trying to gain console access.
No guidance was given beyond the URL:
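The hand-off between agents can be sketched roughly like this. Everything here is an assumption for illustration: the stage order, the `Engagement` state object, and the stub agents, which stand in for real calls to a local model. It is not Bandage's actual orchestration code.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Engagement:
    """Shared state handed from one agent to the next."""
    target: str
    outputs: dict = field(default_factory=dict)  # stage name -> stage output

def make_stub(name: str) -> Callable[[Engagement], str]:
    """Stand-in for a real agent; a real one would prompt the local model."""
    def run(e: Engagement) -> str:
        seen = ", ".join(e.outputs) or "nothing yet"
        return f"{name} ran on {e.target} after: {seen}"
    return run

# The ordering is an assumption; the post does not spell it out.
STAGES = ("recon", "code_audit", "pentest", "validation", "reporting")
PIPELINE = [(name, make_stub(name)) for name in STAGES]

def run_pipeline(target: str) -> Engagement:
    e = Engagement(target)
    for name, agent in PIPELINE:
        e.outputs[name] = agent(e)  # each agent can read earlier outputs
    return e

result = run_pipeline("http://dvwa.example")
for name, text in result.outputs.items():
    print(f"{name}: {text}")
```

The point of the shared state object is that later stages, such as validation and reporting, work from what the earlier stages actually produced rather than from the original prompt alone.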

At one point the AI decided to write its own supporting Python code to aid in its offensive attacks. Super interesting stuff:

Interactive conversation during testing:

The other item that proves valuable here is the ability to chat with any of the agents in real time. This is useful for CTFs (Capture the Flag events), where an agent may need additional context. I’m also developing specifically formatted CTF capabilities. In initial testing against some Easy TryHackMe boxes, it was able to nail them. I also had some Security Operations guys throw me some CTF containers, and it nailed each one.
The beauty of local development is being able to see behind the curtain. It isn’t just tokens sent and received; it’s data that helps you make your prompts more efficient and more aware of how the model handles your context window. On the Enterprise side of things, it directly lets you develop in a way that keeps costs as low as possible. The sampler chain, the context checkpoints, the prompt processing: all of it shines a light on how much usage is being generated based on what is being requested of the model.
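Here is a minimal sketch of that kind of usage accounting. It assumes each response carries an OpenAI-style `usage` object with `prompt_tokens` and `completion_tokens`; whether your local server exposes exactly those fields is something to verify. The `summarize_usage` helper, the sample responses, and the per-million-token prices are made-up placeholders.

```python
def summarize_usage(responses, usd_per_million_in, usd_per_million_out):
    """Total prompt and completion tokens, with a cost estimate at the given prices."""
    prompt = sum(r["usage"]["prompt_tokens"] for r in responses)
    completion = sum(r["usage"]["completion_tokens"] for r in responses)
    cost = (prompt * usd_per_million_in + completion * usd_per_million_out) / 1_000_000
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "estimated_usd": cost,
    }

# Made-up responses in the assumed shape (illustrative only).
responses = [
    {"usage": {"prompt_tokens": 1200, "completion_tokens": 200}},
    {"usage": {"prompt_tokens": 300, "completion_tokens": 100}},
]

# Placeholder prices, not any vendor's real rates.
summary = summarize_usage(responses, usd_per_million_in=1.0, usd_per_million_out=2.0)
print(summary)
```

Tracking the prompt side separately is the useful part: a bloated system prompt or an ever-growing conversation history shows up there long before it shows up on an invoice.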

The use of AI agents offers a window into automating almost anything you can dream up. It is already changing the game, and it will continue to do so as it improves with each iteration. As a cybersecurity professional, to me this underscores the importance of introducing security guardrails across the entire AI stack, from the bare metal up to the command prompt. Truly fascinating times to be in the field.