← Back to blog posts

DeepKeep Launches Vibe AI Red Teaming: A New Approach to AI Security

Yossi Altevet

•

April 20, 2026

Securing highly autonomous AI systems is not simple. Until now, teams had to choose between two imperfect options.

Traditional manual red teaming gives you depth, but it is slow, expensive, and hard to scale. It depends heavily on expert time and effort, and by the time testing is complete, the system may already have changed. Fully automated testing solves the speed problem. It is fast, efficient, and easy to run continuously, but it follows predefined scripts and rarely adapts to unexpected behavior. That is the core issue. Attackers do not follow scripts.

DeepKeep is introducing Vibe AI Red Teaming, a new approach that combines human expertise with AI-driven execution. It is designed for dynamic, human-steered testing across models, applications, and autonomous agents, without forcing teams to choose between speed and depth.

How Vibe AI Red Teaming Works

Vibe Red Teaming keeps a human in the loop without slowing the process down. Instead of writing complex scripts or relying on rigid templates, your security team interacts with the system using natural language. You define the objective, and the platform executes.

For example, you can ask the system to check for sensitive data exposure, attempt to bypass safety controls, or explore how an agent behaves under specific conditions. The system begins testing immediately, without setup overhead.

At the center of this process is Reddy, DeepKeep's AI red teaming agent. Reddy acts as a co-pilot throughout the test. It generates attack paths, plans next steps, and adapts in real time based on how the system responds. Each action feeds into the next, creating a continuous feedback loop that evolves as the test progresses.

At key decision points, execution pauses. This is where human expertise becomes critical. Your team can review intermediate findings, adjust direction, introduce new attack ideas, or push deeper into a potential vulnerability. You are not watching a test run in the background. You are actively steering it.

This combination of human judgment and AI execution allows teams to move quickly while still exploring complex, real-world attack scenarios.

Automated vs. Vibe Red Teaming

DeepKeep still supports fully automated testing, and it remains an important part of a strong security strategy. Automated testing is ideal for routine checks, broad coverage, and continuous regression testing. It ensures that known risks are consistently validated across system updates.

However, automated approaches are limited by design. They follow predefined playbooks and cannot easily adapt when a system behaves in unexpected ways.

Vibe Red Teaming addresses this gap. It focuses on exploration rather than repetition, allowing teams to follow attack paths as they emerge instead of being confined to fixed scenarios. This shifts the role of the user from passively reviewing reports to actively guiding the testing process.

Here is how they differ in practice:

Interface

Pre-programmed playbooks, scheduled runs (CI/CD), no user input during execution.

Natural language commands (e.g., "Check if LLM exposes personal data").

Adaptation

Limited. Includes some orchestration.

Real-time AI adaptation to defenses during execution (feedback loop).

User Role

Set schedules once; minimal intervention, focuses on monitoring reports.

Teams steer via conversation, with AI agent as an assistant.

Output

Actionable reports on exposures; kill-chain success rates.

Dynamic, audience-tailored insights with live steering.

Use Case

Regression testing, coverage, compliance checklist.

Business-impact vulnerabilities.

Both approaches are necessary. Automation provides consistency and scale, while Vibe Red Teaming provides depth and adaptability.

Meeting the OWASP Standard

Security expectations for AI systems are evolving, and OWASP has introduced clear criteria for evaluating AI red teaming solutions. Two requirements stand out in particular: adversarial creativity and full workflow coverage.

Vibe AI Red Teaming is designed to meet both. Instead of relying on static jailbreak lists, the platform generates attack scenarios tailored to your environment. It uses system context and configurable inputs to create realistic, multi-step attack flows that evolve over time.

This includes testing how prompts change across interactions, how agents make decisions, and how different components behave together under pressure.

It also goes beyond testing a single model in isolation. Modern AI risk lives in the full system. DeepKeep evaluates the entire workflow, including tool usage, data access patterns, and privilege boundaries. It actively probes for issues such as unauthorized actions, excessive data exposure, denial-of-service behavior, and escalation paths.

This level of coverage is essential for meeting both OWASP expectations and real-world security needs.

What Changes with Vibe

Vibe AI Red Teaming is not just a faster way to run tests. It changes how security teams think about testing altogether.

Instead of defining everything upfront and waiting for results, testing becomes interactive. You can follow a promising attack path as it unfolds, adjust your approach in real time, and explore edge cases that were not part of the original plan.

This matters because many real risks are not obvious at the start. They appear only after a few steps, when different parts of the system interact in unexpected ways. Vibe makes it possible to chase those paths without restarting the entire process.

It also lowers the barrier to deep testing. Teams do not need to encode every idea into scripts. They can think, react, and iterate naturally, while Reddy handles execution and scale.

In practice, this means faster discovery of meaningful issues, fewer blind spots, and a much tighter loop between testing and understanding.

Where This Leaves You

AI systems are evolving fast, and security needs to keep up. Fully automated tools are not enough, and manual testing alone does not scale.

Vibe AI Red Teaming brings the two together in a way that reflects how real attacks happen. It allows teams to explore, adapt, and go deeper without losing speed.

With Reddy acting as a co-pilot and humans driving the strategy, DeepKeep gives organizations a practical way to test complex AI systems and keep control as they scale.

The EU AI Act Readiness is an Evidence Problem for AI Security Teams

See what evidence the EU AI Act expects from high-risk AI systems, and how testing, monitoring, logging, and runtime controls support readiness.

InkJect: The Visual Prompt Injection That Text Defenses Were Never Built to Stop

A hidden instruction inside an image. An LLM that follows it. InkJect is a new visual prompt injection vulnerability confirmed on OpenAI and Anthropic's latest models.

What is AI Red Teaming? A Practical Guide

Red teaming AI systems isn't the same as traditional pen testing. The attack surface is different, the methods are different, and a one-time exercise won't keep you safe. Here's what it actually involves.

What Is Prompt Injection? How It Works and How to Stop It

Prompt injection is the most exploited vulnerability in AI systems today, and one of the hardest to fully fix. Here's what it is, why it's structural, and how to build a defense that actually holds.

Agentic AI Security: The Attack Surface Nobody Mapped Yet

AI agents don't just answer questions. They act. That means the blast radius of a security failure has expanded dramatically. Here's the attack surface most teams haven't mapped yet.

DeepKeep Selected as EIC Accelerator Winner: Europe Bets on AI Security

DeepKeep has been awarded €2.5M in blended finance through the EIC Accelerator's October 2024 cut-off. The co-funded project: Multimodal Models with AI-Native Security and Trustworthiness - a recognition that securing AI across LLMs, computer vision, spatial sensing, and multimodal systems isn't a nice-to-have. It's infrastructure.

The 45-Minute AI Lobotomy: Why Built-In Guardrails Are Dead

With open-source tools like Heretic performing a 45-minute lobotomy to effortlessly erase an AI's built-in safety guardrails, organizations must abandon the illusion that models can police themselves.

The AI Red Teaming Reality Check: How DeepKeep Delivers on OWASP

The OWASP v1.0 AI Red Teaming standard is the new benchmark for enterprise resilience. Read how DeepKeep ditches static jailbreaks for dynamic, context-aware testing across your entire agentic workflow.

A Rotten Apple Spoils the Image Generation

Poisoned training samples can turn ControlNet into a hidden backdoor. From a security perspective, this is not a noisy exploit. It is a sleeper agent waiting for the right signal.

Why LLM-as-a-Judge Isn't Enough

Let one AI keep an eye on another AI feels like putting a referee in the game. In reality, LLM-as-a-judge isn’t the silver bullet some people wish it was.

Multimodal AI is Smarter. Unfortunately, so are The Attacks.

AI has gotten good at understanding not just what we type, but what we show. This shift has made AI more powerful. Unfortunately, it has also made it more vulnerable.

You Can’t “Detect” a Jailbreak. Here’s What to Do Instead

Everyone is looking for an efficient way to detect and block jailbreaks, but here’s the uncomfortable truth: you can’t reliably detect every jailbreak, and trying to chase them all is a losing game.

Two Smart AI Models. Zero Common Sense.

AI is no longer a one-trick tool. It writes reports, analyzes photos, answers complex questions, and even kicks off real-world actions. Most of this power comes from two areas working side by side: Generative AI and Computer Vision.

Top Three Scenarios for PII Leakage in GenAI

Comprehensive PII detection combines scanning of data, penetration testing and a real-time AI firewall

DeepKeep Launches GenAI Risk Assessment Module

Evaluating model resilience is paramount, particularly during its inference phase in order to provide insights into the model's ability to handle various scenarios effectively

DeepKeep Comes out of Stealth to Safeguard GenAI with AI-Native Security and Trustworthiness

DeepKeep offers AI-Native security and trustworthiness that secures AI throughout its entire lifecycle

Meta’s LlamaV2 7B LLM Suffers from Susceptibility to DoS and Data Leakage

DeepKeep's evaluation of LlamaV2 7B's security and trustworthiness found strengths in task performance and ethical commitment, with areas for improvement in handling complex transformations, addressing bias, and enhancing security against sophisticated threats

View all