Park, Geon (re-st)

Multi-Targeted Fuzzing's Semantic Blindness

[essay] 2 min read

Abstract

Google’s Big Sleep exposes the semantic blindness of traditional fuzzing. This blindness causes significant inefficiency in multi-targeted greybox fuzzing, where the context shared between targets is currently ignored. The future of multi-targeted fuzzing lies in reducing this inefficiency.

Body

Google’s Project Zero demonstrated a paradigm shift with Big Sleep, an LLM-based agent.1 LLMs can now mimic the intuition of human researchers and overcome the blindness of coverage-guided fuzzing.

Traditional fuzzing relies heavily on random mutations and initial seeds. Tools like AFL fail if the harness (the code used to exercise the target program) lacks a path to the faulty function, or if specific keywords are missing from the initial seed pool or dictionary. Even with the relevant keywords, fuzzers lack the semantic understanding needed to navigate the state space; they instead rely on brute-force attempts guided by predefined metrics such as coverage.
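As a minimal sketch of this blindness (the harness and the planted bug below are hypothetical, not taken from Big Sleep or any real target), consider a libFuzzer-style entry point where the buggy code sits behind a four-byte keyword gate. Without "VULN" in the seeds or dictionary, a coverage-guided fuzzer has to stumble onto all four bytes by mutation before it can even begin exploring the faulty function:

```c
/* Hypothetical harness sketch: a keyword gate in front of a planted bug. */
#include <stdint.h>
#include <stddef.h>
#include <string.h>

static void parse_record(const uint8_t *data, size_t size) {
    char buf[16];
    /* Planted bug: unchecked copy once the keyword gate is passed,
     * overflows buf whenever size > 16. */
    memcpy(buf, data, size);
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    /* Keyword gate: semantically obvious to a human or an LLM reading the
     * code, but opaque to byte-level mutation without a dictionary entry. */
    if (size >= 4 && memcmp(data, "VULN", 4) == 0) {
        parse_record(data + 4, size - 4);
    }
    return 0;
}
```

A human reviewer spots the gate and the overflow in seconds; a coverage-guided fuzzer only sees that one branch stubbornly refuses to flip.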

In contrast, Google employs LLMs for their grasp of context and their memory of past bugs. Analysis shows that many zero-day exploits are variants of previously disclosed vulnerabilities.2 LLMs excel at grasping the semantics of those earlier bugs. Big Sleep acts like a developer: it analyzes commit history to hypothesize error states, deduces the semantic context, and generates targeted edge cases.

AFL’s semantic blindness creates additional inefficiency in multi-targeted scenarios. Ideally, the hard-earned progress made towards one target should benefit the others. In practice, current methods merely adjust energy schedules without understanding the targets’ semantic similarity; we lack a metric for how much exploring one target contributes to solving another.

The future of multi-targeted fuzzing lies in bridging this semantic gap by incorporating semantic context into the fuzzing loop. For instance, we could identify the bottlenecks that targets share and measure how much focusing on target A helps traverse diverse paths toward target B. The kind of insight LLMs supply in Big Sleep could help quantify these relationships, transforming accidental synergies into an intentional, coordinated discovery process.
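One possible shape for such a metric, sketched under assumptions of my own (the profile representation and the name `shared_bottleneck_score` are illustrative, not an existing AFL or Big Sleep interface): summarize each target by the set of basic blocks that must be traversed to reach it, and use the overlap of those sets to estimate how much progress toward one target should transfer to another.

```c
/* Sketch of a cross-target contribution metric. Each target is summarized
 * by a bitmap of the basic blocks that must be covered to reach it (its
 * "bottlenecks"); the Jaccard overlap of two bitmaps estimates how much
 * energy spent on target A should also count toward target B. */
#include <stdint.h>
#include <stddef.h>

#define MAP_WORDS 1024  /* bitmap over 64 * 1024 basic-block IDs */

typedef struct {
    uint64_t bottlenecks[MAP_WORDS];  /* blocks required to reach target */
} target_profile_t;

static size_t popcount64(uint64_t x) {
    size_t n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

/* Jaccard overlap of two bottleneck sets, in [0.0, 1.0]. */
double shared_bottleneck_score(const target_profile_t *a,
                               const target_profile_t *b) {
    size_t inter = 0, uni = 0;
    for (size_t i = 0; i < MAP_WORDS; i++) {
        inter += popcount64(a->bottlenecks[i] & b->bottlenecks[i]);
        uni   += popcount64(a->bottlenecks[i] | b->bottlenecks[i]);
    }
    return uni ? (double)inter / (double)uni : 0.0;
}
```

A scheduler could weight each target's energy by these pairwise scores, so that seeds which break through a shared bottleneck are credited to every target behind it rather than to one target by accident.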



  1. Google Project Zero, “From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code”, 2024. https://googleprojectzero.blogspot.com/2024/10/from-naptime-to-big-sleep.html. Last accessed 2025-11-22.

  2. Google Threat Analysis Group, “The ups and downs of 0-days”, 2023. https://blog.google/threat-analysis-group/0-days-exploited-wild-2022/. Last accessed 2025-11-22.

#Weekly-Writing 
