Directed Fuzzing
[essay] 2 min read
Abstract
CodeMender (cm) automates bug patching using internal validation loops. But self-verification mechanisms are prone to bias and hallucination, so directed fuzzing is essential for rigorous, independent verification. In turn, cm can help address Multi-targeted Directed Fuzzing's fault-masking problem.
Body
DeepMind’s CodeMender (cm) is an AI agent designed to automate the entire remediation lifecycle, from bug discovery to verification.1 Once a faulty input is found via Big Sleep or OSS-Fuzz, internal multi-agent systems generate patches and validate them against regression tests or LLM-based judges.2
However, such internal validations are insufficient to guarantee security. An AI validator may hallucinate correctness, allowing a patch to pass internal criteria while leaving the vulnerability exposed to modified inputs. Relying on an AI’s self-assessment is inherently risky, as models cannot act as unbiased adversaries to their own creations. To deploy automated repairs safely, we require a validator that is independent of the generation process.
Directed fuzzing that targets the patched code can serve as this external verification engine. The original faulty input is a natural initial seed: mutating it probes for incomplete fixes that handle only the known proof of concept. Multi-targeted Directed Fuzzing (MDF) is a good fit here, since a single patch may span multiple locations in the code.
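As a toy illustration of this seeding strategy (not a real fuzzer harness; `patched_parse`, `fuzz_patch`, and the two-byte proof-of-concept input are all invented for the example), a mutation loop starting from the original crashing input quickly exposes a patch that only special-cases the known PoC:

```python
import random

# Hypothetical "patched" target: the original crash was triggered by b"\xffA",
# and the patch only special-cased that exact input, so nearby variants
# of the PoC still reach the underlying bug.
def patched_parse(data: bytes) -> None:
    if data == b"\xffA":          # narrow patch: handles only the known PoC
        return
    if data[:1] == b"\xff":       # residual bug: other 0xff-prefixed inputs
        raise ValueError("unhandled control byte")

def fuzz_patch(target, crash_seed: bytes, iters: int = 1000):
    """Mutate the original PoC input to probe for an incomplete fix."""
    rng = random.Random(0)
    for _ in range(iters):
        data = bytearray(crash_seed)
        data[rng.randrange(len(data))] = rng.randrange(256)  # 1-byte mutation
        try:
            target(bytes(data))
        except Exception:
            return bytes(data)    # a variant the narrow patch misses
    return None                   # no counterexample found within budget

variant = fuzz_patch(patched_parse, b"\xffA")
print(variant)
```

A real setup would hand the PoC to a coverage-guided fuzzer (e.g. as a seed-corpus entry) rather than a blind byte flipper, but the principle is the same: the crashing input is the cheapest, most targeted seed available.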
Conversely, MDF may benefit from cm by addressing the challenge of fault masking. Targeting multiple bugs at once can be inefficient when shallow bugs block the paths to deeper ones; each discovered bug must be patched immediately before fuzzing can make progress. One solution is a loop of single-target fuzzing and patching, but cm fits this role naturally: each time the binary is patched, MDF can immediately leverage its current seed pool against both the remaining targets and the patch itself. MDF can thus pursue multiple targets without wasted effort while validating each patch along the way.
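A minimal sketch of that loop, with everything hypothetical: a toy two-bug target, a hand-picked seed, and a `patched` set standing in for cm's patching step. The shallow bug crashes every input that reaches the deep code, so the deep bug only becomes discoverable once the shallow one is patched, and the existing seed pool is reused throughout:

```python
import random

def make_target(patched: set):
    """Toy two-bug program: 'shallow' crashes every input entering the
    \x00 branch, masking the 'deep' bug that sits behind it."""
    def target(data: bytes):
        if data[:1] == b"\x00":
            if "shallow" not in patched:
                raise RuntimeError("shallow")   # fires before deep code runs
            if "deep" not in patched and data[1:2] >= b"\x80":
                raise RuntimeError("deep")      # reachable only after patch
    return target

def mdf_with_patching(seed: bytes, iters: int = 2000):
    rng = random.Random(0)
    patched, pool, found = set(), [seed], []
    for _ in range(iters):
        data = bytearray(rng.choice(pool))
        data[rng.randrange(len(data))] = rng.randrange(256)  # 1-byte mutation
        data = bytes(data)
        try:
            make_target(patched)(data)
            pool.append(data)            # non-crashing input joins the pool
        except RuntimeError as bug:
            found.append(str(bug))
            patched.add(str(bug))        # "cm" patches the bug immediately
            pool.append(data)            # reuse the crasher as a seed
    return found

print(mdf_with_patching(b"\x00A"))
```

The key property is that nothing restarts after a patch: the same seed pool that reached the shallow bug carries the fuzzer straight past the patched code and into the deeper target.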
By enabling fuzzers to rapidly stress-test AI-generated patches, we provide the ground truth required to trust automated repairs. It would also ease the review burden imposed by the surge of AI-generated pull requests.
1. DeepMind. Introducing Codemender: An AI agent for code security, 2025. https://deepmind.google/blog/introducing-codemender-an-ai-agent-for-code-security/, Last accessed on 2025-11-28. ↩︎
2. Asif Razzaq. Google DeepMind Introduces CodeMender: A New AI Agent that Uses Gemini Deep Think to Automatically Patch Critical Software Vulnerabilities, 2025. https://www.marktechpost.com/2025/10/07/google-deepmind-introduces-codemender-a-new-ai-agent-that-uses-gemini-deep-think-to-automatically-patch-critical-software-vulnerabilities/, Last accessed on 2025-11-28. ↩︎