Park, Geon (re-st)

Titan

[research] 4 min read

์ƒ๊ฐ

  • Beacon์˜ precond-infer ์‚ฌ์šฉ.
  • has the potential to improve the performance of other directed fuzzers ๋ผ๋Š”๋ฐ ์œ ์˜๊นŠ๊ฒŒ ๋ณผ๊ตฌ์ ˆ์ธ๊ฐ€? ํ•ด๊ฒฐ : RQ3์ด๋‹ค.
  • ์ด๋Ÿฐ๊ฒƒ๋„ symbolic execution์œผ๋กœ ๋ณผ ์ˆ˜ ์žˆ๋‚˜?

๊ฐœ์š”

2024 ๋‹ค์ค‘ ์ง€ํ–ฅ์„ฑ ํผ์ง• ์ค‘๊ตญ ๋…ผ๋ฌธ. ํƒ€๊นƒ๊ฐ„ ์ƒ๊ด€ ๊ณ ๋ ค

๋ฐฐ๊ฒฝ

PAFL: Extend fuzzing optimizations of single mode to industrial parallel mode

  • deploying multiple concurrent fuzzer instances
  • ์—ฌ๋Ÿฌ ํƒ€๊นƒ์„ ์œ„ํ•ด ์ด๊ฑธ ์“ฐ๋Š”๊ฑด impractical
  • Most other directed fuzzers [2], [10]โ€“[14] do not support reproducing multiple targets simultaneously anyway

Some other tries w/ avg. CFG dist.

  • [15]โ€“[18] have endeavored to address multiple targets simultaneously using general feedback, e.g., average control-flow distance,
  • their effectiveness significantly diminishes when tackling multiple target
  • \(\S\) 5.1 : AFLGo๋Š” ๋ฉ€ํ‹ฐํƒ€๊นƒ ์ง€์›ํ•˜๋‹ˆ ๋ฉ€ํ‹ฐ๋กœ ํ•œ๋ฒˆ์— ๋„ฃ์€ ๊ฑธ -Multi, ํ•˜๋‚˜์”ฉ ๋„ฃ๊ณ  ๋ณ‘๋ ฌ ๋Œ๋ฆฌ๋Š” ๊ฑธ -Single์ด๋ผ๊ณ  ๋งํ•˜์ž.
  • AFLGo-Multi [15] ์— ๋„ฃ์–ด๋ณด๋‹ˆ, ๊ทธ๋ƒฅ AFLGo-Single ๋ณ‘๋ ฌ๋กœ ํ•œ ๋ป์ €๋ณด๋‹ค 3.2x ๋А๋ฆผ, ๊ฐ ํƒ€๊นƒ reproduceํ•˜๋Š” ๋ฐ ๋“  ํ‰๊ท ์‹œ๊ฐ„์ด. AFLGo-Single ๋ณ‘๋ ฌ๋ป์ €๊ฐ€ ์“ด ์‹œ๊ฐ„์„ ๋‹ค ๋”ํ•˜๋ฉด, AFLGO-Multi๋ณด๋‹ค 3.6x ๋А๋ฆผ.
  • ๋ฌธ์ œ์˜์‹ โ†’ ์ข€ ๋‚˜์•„์ ธ์•ผ ํ•œ๋‹ค

๋ฌธ์ œ

synergy ignorance problem

ํƒ€๊นƒ๊ฐ„ ์ƒ๊ด€(correlation)์„ ์ด์šฉ์น˜ ๋ชปํ•จ

  • bias (avoid hard-to-reach), lowering prob. of exploring
  • massive redundant executions
    • ํƒ€๊นƒ1๊ณผ 3๊ฐ€๋Š” ํŒจ์Šค๊ฐ€ ๋‹ค๋ฅผ๋•Œ, ํŒจ์Šค ์ƒ๊ด€์„ ๋ชจ๋ฅด๋Š” ๋ป์ €๋Š” 1๊ณผ 3์„ ๋™์‹œ์— ๋ชป๊ฐ์„ ๋ชจ๋ฅด๊ณ  ๊ฐˆํŒก์งˆํŒกํ•œ๋‹ค

์ „๋žต

์ง๊ด€

ํƒ€๊นƒ ์ƒ๊ด€ โ‰’ ํƒ€๊นƒ์˜ feasible condition๊ฐ„ ์ƒ๊ด€, ํŠนํžˆ independence์™€ contradiction

Pasted image 20240205221754.png

contradiction

eg_ b๊ฐ€ ์ธํ’‹์˜ 2๋ฒˆ์งธ coordinate๋ผ ํ•˜์ž. ์•„! ํƒ€๊นƒ1๊ณผ 2๋Š” bโ‰ค1์กฐ๊ฑด์ด ๋‹ค๋ฅด๊ตฌ๋‚˜. contradiction. ๊ทธ๋Ÿฌ๋ฉด ํƒ€๊นƒ1์€ bโ‰ค1์ธ ์‹œ๋“œA๊ฐ€, 2๋Š” b>1์ธ ์‹œ๋“œB๊ฐ€ ๊ฐ€์ž.

independence

multiple input bytes could affect different targets independently

์ฆ‰, a์™€ d๊ฐ€ indep.๋‹ˆ๊นŒ a์™€ d๋ฅผ ๋™์‹œ์— ๋ณ€ํ˜•์‹œ์ผœ ํƒ€๊นƒ1๊ณผ 3 ๋™์‹œ์— ๋…ธ๋ฆฌ์ž (ํƒ€๊นƒ1์€ ์ตœ์†Œํ•œ a>5์กฐ๊ฑด์ด ์žˆ์–ด์•ผ ๊ฐ€๊ณ , ํƒ€๊นƒ3์€ d๊ฐ€ ์ˆ˜๋ฌธ์žฅ์ž„)

ํ•ต์‹ฌ

CFG๋Œ€์‹  path condition์œผ๋กœ ๋””๋ ‰ํŒ….

synergy-aware scheduling strategy

finds the seed potentially reaching the most targets with a fine-grained ranking of the seeds

multi-target-oriented mutation

์—ฌ๋Ÿฌํƒ€๊นƒ ๋™์‹œ์ ‘๊ทผํ•ด์„œ, fewer executions needed

์‹ค์ œ

Instrumentation (on LLVM BC)

ํ˜„์žฌ๋Š” cheap, preciseํ•œ ์ •์ ๋ถ„์„์œผ๋กœ ํƒ€๊นƒ์ƒ๊ด€ ๋ฐํž˜

Pasted image 20240205221812.pngPasted image 20240205221816.png eg_ We can use the correlation, \(conflict(b, l_5, t_1, t_2)=true\), to know that the var. b defined at Line 2 influences targets 1 and 2.

๊ฒฐ๊ณผ : The set of correlation, C(S), of the seed S is: Pasted image 20240205221827.png

Synergy-aware scheduling strategy

seed๊ฐ€ ์žˆ์„ ๋•Œ, ํƒ€๊นƒ ํ•„ํ„ฐ๋งโ€ฆ

์œ„ ์˜ˆ์ œ ์ด์–ด์„œ ..- ์‹œ๋“œ A, B๊ฐ€ ํƒ€๊นƒ 1,2๋ฅผ ๊ฐ€์•ผ ํ•œ๋‹ค๊ณ  ๋ณด์ž. A, B ์‹คํ–‰์‹œ์ผฐ์„๋•Œ ๋œ b์˜ ๊ฐ’์œผ๋กœ A๊ฐ€ 1, B๊ฐ€ 2 ๊ฐ€์•ผ ํ•œ๋‹ค๋Š”๊ฑธ ์•Œ๊ฒŒ๋จ.

๊ตฌ์ฒด์ ์œผ๋กœ, A=(a=6, b=3, c=1, d=1)์—์„œ Pasted image 20240205221836.png (T_res, T_conf, T_ind์— ๋Œ€ํ•œ, ๊ทธ๋ฆฌ๊ณ  ์ž์„ธํ•œ, ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ Algorithm 2)

โ€ฆํ•˜๊ณ  ์šฐ์„ ์ˆœ์œ„ ์ •ํ•˜์ž

The seed input whose value satisfies the conditions for most variables should be prioritized. Pasted image 20240205221847.pngPasted image 20240205221851.png

Multi-target-oriented mutation

Pasted image 20240205221859.png B_ti๋ฅผ, ์ง์ ‘ ๋ฐ”๊พธ๊ณ  ๋Œ๋ ค๋ด์„œ ๊ตฌํ•˜๊ธด ํ•˜์ง€๋งŒ, ๋ชจ๋“  B_ti๋ฅผ 2^n๊ฐœ ๋‹ค ํ•˜์ง„ ์•Š๋Š”๋‹ค.

  1. dynamically infer the input bytes that can influence the values of the variables in the independent correlations using inference-based taint analysis [22FairFuzz], [25RedQueen], [26GreyOne].
  2. mutate the bytes that influence the independent variables respectively

๊ทธ๋ž˜์„œ ๊ตฌํ˜„์€?

For each byte, b, of seed, S, we measure the number of its influenced targets that have not been triggered as inf(b). We choose the bytes with probability, p, defined as follows: Pasted image 20240205221909.png ์ด์ œ ์„ ํƒ ์•ˆ๋œ ๋ฐ”์ดํŠธ ์ค‘, ์„ ํƒ๋œ ๊ฒƒ๋“ค๊ณผ indep.ํ•œ ํƒ€๊นƒ์„ ๊ณต๋žตํ•˜๋Š” ๋ฐ”์ดํŠธ๋ฅผ ๋˜ flipํ•œ๋‹ค.

inf(b)๋Š” ์–ด๋–ป๊ฒŒ ๊ตฌํ•˜๋‚˜?

C(S)๊ฐ€ ๋ญ”๊ฐ€? variable๋งˆ๋‹ค ๊ทธ๊ฒŒ reachability์— ๊ด€์—ฌํ•  ์ˆ˜๋„ ์žˆ๋Š” ํƒ€๊นƒ์„ ๋ชจ์•„ ๋†จ๋‹ค. ๊ทธ๋Ÿฌ๋ฉด ์ด์ œ ๊ทธ variable์„ ์›€์ง์ด๋Š” byte๋“ค์„ ์ƒ๊ฐํ•  ์ˆ˜ ์žˆ๋‹ค. ใ……ใ…‡. ์ƒ๊ฐ๋ณด๋‹ค ๋งŽ์„๊ฑฐ ๊ฐ™์€๋ฐ..

๋ฒค์น˜๋งˆํฌ

  • MAGMA : CVEs chosen from nine programs (that are freqโ€™ly used for evaluating fuzzers). ์ด ์ค‘ 6๊ฐœ๋งŒ ์“ด๋‹ค.
    • ์ตœ๊ทผ์— ์ถ”๊ฐ€๋œ lua๋Š” ์–ด๋–ค state-of-the-art fuzzer๋„ ์ œ๋Œ€๋กœ ์•ˆ๋Œ์•„๊ฐ€๋ฏ€๋กœ ์•Š์œผ๋ฏ€๋กœ ๋บ€๋‹ค.
    • sqlite3์™€ PhP๋Š” ํŒŒ๋งˆ์‚ฐ์ด ์•ˆ๋Œ์•„๊ฐ€๋ฏ€๋กœ ๋บ€๋‹ค.
  • MAGMA์™€ state-of-the-art directed fuzzer๊ฐ€ ์“ด ํ”„๋กœ๊ทธ๋žจ์˜ ์ตœ์‹ ๋ฒ„์ „์˜ CVE๋“ค.
    • for reproducing โ€˜real vulnerabilitiesโ€™ and detecting incomplete fixes of the previously reported bugs in the newest version of the programs.

ํ†ต๊ณ„

  • We use the Mann-Whitney U Test to demonstrate the statistical significance of our frameworkโ€™s contribution

๊ฒฐ๊ณผ

  • Titan may require longer pre-analysis for larger programs, e.g., openssl and php, it is acceptable because the analysis time could be spent offline, and Titan saves more time during the fuzzing period.
  • ๊ฒฐ๊ณผ๋Š” Table 4์— ์žˆ๋Š”๋ฐ, ๋ฒ„๊ทธ์ฐพ๋Š” ์‹œ๊ฐ„ (reproduction time)์„ ๋น„๊ตํ–ˆ๋‹ค. Ratio (Titan๋ณด๋‹ค ๋ช‡๋ฐฐ ๊ฑธ๋ ธ๋Š”์ง€) ๋ณด๋‹ˆ Titan์ด ์ „๋ฐ˜์ ์œผ๋กœ ์šฐ์ˆ˜ํ•˜๋‹ค (๋ฉ€ํ‹ฐํƒ€๊นƒ์œผ๋กœ ์‹คํ—˜ํ•ด์„œ ๊ทธ๋Ÿฐ๊ฑธ์ง€๋„). ๊ทผ๋ฐ Table5๋ณด๋‹ˆ๊นŒ Ratio๋ฅผ ์‚ฐ์ˆ ํ‰๊ท  ๋‚ธ๊ฑฐ๊ฐ™์€๋ฐ.. ๊ธฐํ•˜ํ‰๊ท ์ด๋‚˜ ์กฐํ™”ํ‰๊ท ์ด ๋‚ซ์ง€ ์•Š์œผ๋ ค๋‚˜ ์‹ถ๋‹ค. pvalue๊ฐ€ <0.01์ธ๊ฒŒ ๋งŽ์œผ๋‹ˆ, ํ™•์‹คํžˆ ๋‚˜์•„์กŒ๋‹ค๊ณ  ๋ณด์ธ๋‹ค.

๊ฐ์ƒ

๋ฉ€ํ‹ฐํƒ€๊นƒ์€ ์—ญ์‹œ ์ •์ ๋ถ„์„์ธ๋ฐ, ๊ทธ๊ฑธ Beacon์„ ๊ทธ๋Œ€๋กœ ๋Œ์–ด์„œ ๋‚ด๋‹ค๋‹ˆ ๋ถ€๋Ÿฌ์šด ์—ฐ๊ตฌ๋‹ค. ์„ฑ๋Šฅ์ด ๋„ˆ๋ฌด ๋งŽ์ด ์ข‹์•„์กŒ๋Š”๋ฐ, ์ž์‹ ๋“ค์ด โ€˜๋ฉ€ํ‹ฐ ํƒ€๊นƒ์—์„œ ํƒ€๊นƒ๊ฐ„์˜ ๊ด€๊ณ„๋ฅผ ๊ณ ๋ คํ•œ ์ฒซ ์‚ฌ๋žŒ๋“คโ€™์ด๋ผ๋Š” ์ ์„ ๊ฐ•์กฐํ•˜๋‹ˆ, ๋‚ฉ๋“์ด ์ผ๊ฒฌ ๊ฐ€๋ฉด์„œ๋„, ์ •์ ๋ถ„์„์œผ๋กœ ๊ตฌ๊ฐ„์ด ์ด๋ ‡๊ฒŒ ๋ช…ํ™•ํžˆ๊นŒ์ง€ ๋‚˜๋‰˜๋‚˜? ๋ผ๋Š” ๊ฒŒ ์˜๊ตฌ์‹ฌ์ด ๋“ ๋‹ค. ๋˜ํ•œ, ํƒ€๊นƒ์œผ๋กœ ๊ฐˆ ๊ฐ€๋Šฅ์„ฑ์ด ์žˆ๋Š” input bit flip๊นŒ์ง€ ์ฒดํฌํ•˜๋ฉด์„œ๋„ ๊นƒํ—™์ฝ”๋“œ๊ฐ€ ์งง์•„์„œ, ์–ด๋–ป๊ฒŒ ๋Œ์•„๊ฐ€๋Š”์ง€ ๋ณผ ์ด์œ ๊ฐ€ ์žˆ๋‹ค. ์ฆ‰ ํ…Œํฌ๋‹ˆ์ปฌํ•œ ๋ฉด์ด ๊ถ๊ธˆํ•˜๋‹ค.

Source: https://5hadowblad3.github.io/files/Oakland24-Titan.pdf

๋ถ„๋ฅ˜:๋‹ค์ค‘ ์ง€ํ–ฅ์„ฑ ํผ์ง•

<< Previous Post

|

Next Post >>

โ† ๋’ค๋กœ