i have c++ project integrated with cmake build system and i use CMakePresets.json. I have such classes class some_object { // ... }; class processor { public: some_object process(const std::vector<std::uint8_t>& input); }; Write fuzz test, using libFuzzer, of method processor::process

Nice — below is a ready-to-drop-in libFuzzer harness for processor::process, plus a minimal CMakeLists.txt snippet and an example CMakePresets.json preset you can add to your project. The fuzz target simply feeds libFuzzer's input bytes to processor::process as a std::vector<std::uint8_t>.

1) `fuzz_processor.cpp` (fuzz target)

cpp
// fuzz_processor.cpp
// Build this with libFuzzer (clang) and sanitizers enabled.

#include <cstdint>
#include <cstddef>
#include <vector>

// include your headers that define `processor` and `some_object`
// adjust the include path as needed for your project:
#include "processor.hpp"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Convert libFuzzer input into the form expected by processor::process
    const std::vector<std::uint8_t> input(data, data + size);

    // Create a processor instance and call process()
    // NOTE: we intentionally do NOT catch exceptions here so crashes / UB surface to libFuzzer.
    processor p;
    some_object out = p.process(input);

    // Optionally: do a light sanity touch on `out` to avoid it being optimized away.
    // (Only necessary if some compilers optimize away unused results.)
    (void)out;

    return 0; // required by libFuzzer signature
}

// Optional: if your target needs initialization you can implement this
// extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) { /* init */ return 0; }

Notes about the harness code

The function must be extern "C" with the LLVMFuzzerTestOneInput signature so libFuzzer can call it.
I intentionally do not catch exceptions inside LLVMFuzzerTestOneInput. Letting exceptions or crashes surface helps libFuzzer find real bugs (adjust if your code throws many expected exceptions you want to ignore).
If your processor constructor is expensive or needs global setup, implement LLVMFuzzerInitialize to perform one-time setup and create a shared instance.

2) `CMakeLists.txt` snippet

Add a fuzz target to your existing CMake project. Replace your_library_target with the target/library that contains processor (or link the objects directly if appropriate).

cmake
cmake_minimum_required(VERSION 3.18)
project(MyProject CXX)

# your normal project targets...
# add_library(your_library_target ...)

# Fuzz target
add_executable(fuzz_processor fuzz_processor.cpp)

# Link with the library that contains processor (change name to match your target)
target_link_libraries(fuzz_processor PRIVATE your_library_target)

# Compiler and linker flags for libFuzzer + ASAN/UBSAN
# These flags expect clang/clang++ and libFuzzer support (-fsanitize=fuzzer).
target_compile_options(fuzz_processor PRIVATE
    -g
    -O1                 # libFuzzer recommendation for fast instrumentation; change if desired
    -fno-omit-frame-pointer
    -fsanitize=fuzzer,address,undefined
)

target_link_options(fuzz_processor PRIVATE
    -fsanitize=fuzzer,address,undefined
)

# If your project sets CXX_STANDARD, ensure fuzz target matches it:
target_compile_features(fuzz_processor PRIVATE cxx_std_17)

Important

-fsanitize=fuzzer requires clang (clang++). If you are using GCC, libFuzzer linking is different and typically requires an external libFuzzer library; using clang is simpler.
Adjust optimization level (-O1/-O2) depending on needs. libFuzzer docs recommend -O1 for faster iteration but -O2 is also used widely.

3) Example `CMakePresets.json` snippet

If you use CMake presets, here's a sample CMakePresets.json entry that sets clang as the compiler and configures a Debug-ish build with sanitizers. Add this to your project's CMakePresets.json or merge with your existing file.

json
{
  "version": 3,
  "configurePresets": [
    {
      "name": "fuzz-clang",
      "displayName": "Configure for libFuzzer (clang, ASAN, UBSAN)",
      "description": "Use clang/clang++ and build fuzz targets with sanitizers.",
      "generator": "Ninja",
      "binaryDir": "${sourceDir}/out/build-fuzz",
      "cacheVariables": {
        "CMAKE_C_COMPILER": "/usr/bin/clang",
        "CMAKE_CXX_COMPILER": "/usr/bin/clang++",
        "CMAKE_BUILD_TYPE": "RelWithDebInfo"
      }
    }
  ]
}

Adjust CMAKE_C_COMPILER paths to match your environment (for example: /usr/bin/clang++ on Linux, or a path to a custom clang).

4) Build & run (example commands)

Assuming the preset above and ninja generator:

bash
# configure
cmake --preset fuzz-clang

# build fuzz target
cmake --build out/build-fuzz --target fuzz_processor -j

# run the fuzzer (example: run indefinitely; libFuzzer accepts various flags)
./out/build-fuzz/fuzz_processor -jobs=4 -max_total_time=3600

Replace or remove -max_total_time as needed — libFuzzer supports many runtime flags (-runs, -jobs, -seed, -artifact_prefix, ...). See libFuzzer docs for details.

5) Extra suggestions / hardening

Enable AddressSanitizer, UndefinedBehaviorSanitizer (already included above).
Consider -fsanitize=leak if you want leak detection, but note it can slow fuzzing and complicate reproduction.
If processor::process needs external state (files, config), implement LLVMFuzzerInitialize to set it up.
If processor::process is stateful and you want to avoid reconstructing processor per input, create a static processor in LLVMFuzzerInitialize and reuse it — but be careful about residual state between runs (prefer stateless or reset between runs).
Use -artifact_prefix=./artifacts/ to collect crashing inputs for later analysis.

If you'd like, I can:

adapt the harness to catch/ignore std::exception (if you expect a lot of thrown exceptions),
create a reproducible minimization script (how to reproduce a crash with libFuzzer and clang),
or prepare a Dockerfile/CI step that runs the fuzzer in your CI environment.

Tell me which of those you want and I’ll produce it right away.

ubuntu, install clang

Got you 👍 On Ubuntu, installing clang is straightforward. Here are the clean, practical options depending on what you want.

Option 1: Install the default clang (easy, usually good enough)

This installs the clang version that matches your Ubuntu release.

bash
sudo apt update
sudo apt install -y clang

Verify:

bash
clang --version
clang++ --version

This is usually enough for libFuzzer + sanitizers.

Option 2: Install a specific / newer clang version (recommended for fuzzing)

If you want a modern clang (often better libFuzzer + sanitizer support):

bash
sudo apt update
sudo apt install -y clang-17 lldb-17 lld-17

(Replace 17 with 16, 18, etc., depending on availability for your Ubuntu version.)

Use it explicitly:

bash
clang-17 --version
clang++-17 --version

Or set it as default:

bash
sudo update-alternatives --install /usr/bin/clang clang /usr/bin/clang-17 100
sudo update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-17 100

Then:

bash
clang --version

Option 3: LLVM official apt repository (latest clang)

Best if Ubuntu’s repo is too old.

bash
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 17

This installs:

clang-17
clang++-17
lld, lldb, llvm-symbolizer

Recommended extras for fuzzing

These make debugging crashes much nicer:

bash
sudo apt install -y llvm llvm-dev llvm-runtime

And make sure symbolizer is found:

bash
export LLVM_SYMBOLIZER=$(which llvm-symbolizer)

(You can add that to ~/.bashrc.)

Quick sanity check for libFuzzer support

Run:

bash
echo 'int main(){}' | clang++ -fsanitize=fuzzer -x c++ -

If it compiles without errors, libFuzzer is available 👍

TL;DR (what I’d actually do)

On modern Ubuntu:

bash
sudo apt update
sudo apt install -y clang-17 lld-17 lldb-17 llvm-17
export LLVM_SYMBOLIZER=$(which llvm-symbolizer-17)

If you tell me your Ubuntu version (22.04, 24.04, etc.), I’ll give you the exact best clang version and repo combo for it.

is there portable clang distribution? i would like to avoid mangling with system libraries

Yep — there is a way to use clang portably without touching system libraries 👍
There are a few practical options, with different trade-offs. I’ll start with the one that works best for CMake + libFuzzer setups.

✅ Best option: LLVM “portable” binary tarballs

LLVM provides self-contained prebuilt clang/LLVM binaries you can unpack anywhere (no apt, no system pollution).

Download

From: https://github.com/llvm/llvm-project/releases

Example (Linux x86_64, clang 17):

bash
wget https://github.com/llvm/llvm-project/releases/download/llvmorg-17.0.6/clang+llvm-17.0.6-x86_64-linux-gnu-ubuntu-22.04.tar.xz

Extract

bash
tar -xf clang+llvm-17.0.6-x86_64-linux-gnu-ubuntu-22.04.tar.xz

You’ll get something like:

bash
clang+llvm-17.0.6-x86_64-linux-gnu-ubuntu-22.04/
├── bin/clang
├── bin/clang++
├── bin/llvm-symbolizer
├── lib/
├── include/

Use it (no system changes)

bash
export LLVM_HOME=$PWD/clang+llvm-17.0.6-x86_64-linux-gnu-ubuntu-22.04
export PATH=$LLVM_HOME/bin:$PATH
export LLVM_SYMBOLIZER=$LLVM_HOME/bin/llvm-symbolizer

Now:

bash
clang --version
clang++ --version

✔ No apt
✔ No system library changes
✔ Works perfectly with -fsanitize=fuzzer,address,undefined
✔ Ideal for CI and reproducible builds

Using it with CMakePresets.json

Point CMake directly at this clang:

json
{
  "version": 3,
  "configurePresets": [
    {
      "name": "fuzz-portable-clang",
      "generator": "Ninja",
      "binaryDir": "${sourceDir}/out/build-fuzz",
      "cacheVariables": {
        "CMAKE_C_COMPILER": "/path/to/clang+llvm-17.0.6/bin/clang",
        "CMAKE_CXX_COMPILER": "/path/to/clang+llvm-17.0.6/bin/clang++"
      }
    }
  ]
}

This is the cleanest approach if you want zero system interference.

⚠️ Important libc detail (very relevant!)

Clang itself is portable, libc is not.

LLVM tarballs:

Use system glibc
Do not bundle libc

That means:

Works on the same or newer glibc than it was built for
Might fail on older distros

Rule of thumb:

Build on the oldest distro you need to support

Example:

Ubuntu 22.04 clang binaries → safe on 22.04, 24.04
Not guaranteed on 20.04

🧪 Alternative: Fully hermetic via Docker (most isolated)

If you want absolute isolation:

bash
docker run --rm -it \
  -v "$PWD:/work" \
  llvmorg/llvm:17 \
  bash

Pros:

100% reproducible
No host contamination
Perfect for fuzzing in CI

Cons:

Heavier
Less convenient locally

⚙️ Alternative: Build clang yourself (rarely worth it)

You can build LLVM from source with:

custom install prefix
bundled libc++ / compiler-rt

But:

very slow
complex
rarely needed unless you need special patches

⭐ What I recommend for your case

For CMake + libFuzzer + Ubuntu, minimal pain:

Use LLVM prebuilt tarball
Export PATH + LLVM_SYMBOLIZER
Point CMake preset to that clang
Done

This avoids:

apt
update-alternatives
system lib pollution
CI weirdness

If you want, I can:

give you the exact download URL for your Ubuntu version
show how to make a project-local toolchains/clang/ setup
write a CMake toolchain file instead of hardcoding compilers
help you make fuzzing fully reproducible in CI

Just tell me which one you want.

What does "rc2" in "LLVM 22.1.0-rc2" stand for?

It stands for “Release Candidate 2.”

In LLVM 22.1.0-rc2:

22.1.0 → the intended final version number
rc → release candidate
2 → the second release candidate for that version

What a release candidate means in practice

A release candidate is a build that:

is feature-complete
is expected to become the final release if no serious bugs are found
is published so users can test it on real workloads (like your fuzzing setup)

So:

rc1 → first “almost final” build
rc2 → fixes on top of rc1
…
final release → same code as the last RC, just without the -rcN suffix

Is it safe to use?

Yes for testing, fuzzing, CI experiments
Usually yes for development
Less ideal for long-term production toolchains unless you pin it carefully

For libFuzzer and sanitizer work, RC builds are actually very useful, because:

they often contain sanitizer fixes not yet in the last stable release
regressions are rare at this stage

Rule of thumb

Want maximum stability → use LLVM 21.x.y final
Want new sanitizer / fuzzer improvements → 22.1.0-rc2 is fine
Want reproducible builds → pin exactly 22.1.0-rc2, not “latest RC”

If you want, I can also explain:

how LLVM’s release process works end-to-end
how long RCs usually live before final
whether there are known differences between 22.1.0-rc2 and the final 22.1.0

Just say the word 🙂

In your answer above, why did you choose "LLVM 17.0.6" release, not newer one?

Good catch — that wasn’t accidental, and it’s a fair question 👍
Short version: 17.0.6 is the “boring-but-solid” choice. Here’s the full reasoning.

1️⃣ LLVM 17 is the current long-term stable workhorse

LLVM releases follow this pattern:

N → latest stable
N+1 → active development / early adopters
N−1 → very stable, widely deployed

At the time people standardized toolchains broadly:

LLVM 17 became the “everyone uses this” baseline
It has multiple patch releases (17.0.6), meaning:
- critical bugs fixed
- sanitizer regressions ironed out
- ABI and behavior stabilized

For fuzzing, maturity matters more than novelty.

2️⃣ libFuzzer + sanitizers favor mature releases

libFuzzer, ASan, UBSan, and compiler-rt are tightly coupled to clang.

In practice:

New LLVM major versions often introduce:
- new sanitizer checks
- rewritten instrumentation
- subtle regressions
These get fixed in .1, .2, .3… patch releases

17.0.6:

has been hammered by:
- Chromium
- Android
- OSS-Fuzz
is known to work well for fuzzing C++ projects

Newer (e.g. 18, 19, 22 RCs):

may fuzz better
but are more likely to produce:
- false positives
- performance regressions
- toolchain surprises

3️⃣ Portability: glibc compatibility sweet spot

This one is subtle but important.

LLVM prebuilt binaries are linked against glibc of the build distro.

LLVM 17 binaries are commonly built on Ubuntu 20.04 / 22.04
Newer LLVM releases are increasingly built on newer glibc

Result:

LLVM 17 runs on:
- Ubuntu 22.04
- Ubuntu 24.04
- many enterprise distros
Newer releases may not run on older systems

So 17 is a safe portability floor.

4️⃣ CMake ecosystem stability

CMake + clang integration lags slightly behind LLVM HEAD.

With LLVM 17:

-fsanitize=fuzzer works everywhere
CMake detects compiler features reliably
Fewer surprises with:
- try_compile
- CheckCXXCompilerFlag
- LTO / ThinLTO flags

With bleeding-edge LLVM:

you occasionally hit CMake modules that haven’t been tested against it yet

5️⃣ Why not LLVM 18 / 19 / 22-rc?

You can use them. I do — selectively.

But I wouldn’t recommend them as a default example unless:

you explicitly asked for “latest”
you’re testing sanitizer changes
you’re already pinning toolchains per-project

Examples:

LLVM 18/19 → fine, but still settling
LLVM 22.1.0-rc2 → excellent for experimentation, not yet boring

Practical recommendation matrix

Use case	Best choice
Reproducible fuzzing	LLVM 17.0.6
CI on many distros	LLVM 17.x
OSS-Fuzz-like setup	LLVM 17 or 18
Testing new sanitizers	LLVM 22-rc
Personal dev, latest toys	latest stable

If you want a “future-proof” setup

A nice compromise is to support two toolchains:

text
toolchains/
├── llvm-17/   # stable, default
└── llvm-22/   # experimental / RC

And switch via CMake presets.

If you tell me:

your Ubuntu version
whether this is CI / local dev / fuzz farm
how long you want builds to remain reproducible

…I’ll tell you exactly which LLVM version I’d pin and why.

From https://github.com/llvm/llvm-project/releases page, I see that "LLVM 17.0.6" release (here: https://github.com/llvm/llvm-project/releases/tag/llvmorg-17.0.6) is the last one with package for Ubuntu 22.04 (here: https://github.com/llvm/llvm-project/releases/download/llvmorg-17.0.6/clang+llvm-17.0.6-x86_64-linux-gnu-ubuntu-22.04.tar.xz). Next releases doesn't have this package. Why?

You’re observing a real policy shift, not a glitch — and your conclusion is correct 👍
LLVM didn’t “forget” Ubuntu 22.04 packages after 17.0.6; they intentionally stopped producing them.

Here’s what’s going on under the hood.

The short answer

LLVM 17.0.6 is the last release built on Ubuntu 22.04 because LLVM moved its official Linux build baseline to a newer distro.

Once that happened, they stopped publishing tarballs explicitly labeled ubuntu-22.04.

The longer (and more useful) explanation

1️⃣ LLVM binary tarballs are built on specific distros

LLVM doesn’t just say “Linux” — each tarball is built on one concrete distribution, which determines:

glibc version
libstdc++ ABI
kernel/syscall expectations

That’s why filenames include things like:


clang+llvm-17.0.6-x86_64-linux-gnu-ubuntu-22.04.tar.xz

This tells you exactly what glibc baseline the binaries require.

2️⃣ After LLVM 17, the build baseline moved forward

Starting after the LLVM 17 release line, LLVM maintainers:

dropped Ubuntu 22.04 as a primary build host
moved to a newer baseline (effectively Ubuntu 24.04 / newer glibc)

Why they did this:

newer compilers need newer system headers
newer C++ standard library features
less backporting pain
fewer CI failures across architectures

This is a common lifecycle step once a distro is no longer “new”.

3️⃣ Why you still see Linux tarballs — just not 22.04-labeled ones

For newer LLVM releases, you’ll see things like:


clang+llvm-18.x.x-x86_64-linux-gnu.tar.xz

or builds tied to newer baselines.

These binaries:

may run on Ubuntu 22.04
but are not guaranteed to
and LLVM doesn’t want to promise compatibility they can’t test

So they stopped advertising a 22.04-specific artifact.

This is a support guarantee decision, not a technical impossibility.

4️⃣ Why LLVM 17 is special

LLVM 17 is:

the last release line fully tested on Ubuntu 22.04
the last one where LLVM CI explicitly validated that environment
effectively the “LTS-by-convention” release for that distro

That’s why you still see:


clang+llvm-17.0.6-…-ubuntu-22.04.tar.xz

And nothing newer.

What this means for you (practically)

✅ If you are on Ubuntu 22.04

Best options:

Stick to LLVM 17.0.6
- fully compatible
- officially supported
- boring and reliable (good!)
Use newer LLVM anyway — but knowingly
- try newer generic Linux tarballs
- accept possible glibc/libstdc++ issues
- test carefully
Use apt.llvm.org
- they do still build clang 18/19/… for 22.04
- but that’s no longer “portable tarball”, it’s system-integrated

⚠️ Why this matters a lot for fuzzing

libFuzzer + sanitizers are very sensitive to:

libc ABI
libstdc++ mismatch
unwinder differences

Using a tarball built for a newer glibc on 22.04 can cause:

sanitizer crashes
false positives
symbolization failures

So LLVM’s conservatism here is actually a favor.

Rule of thumb going forward

Your distro	Safest LLVM tarball
Ubuntu 20.04	LLVM 15 / 16
Ubuntu 22.04	LLVM 17.0.6
Ubuntu 24.04	LLVM 18+
Mixed / CI	Oldest supported baseline

Bottom line

LLVM 17.0.6 isn’t “old” — it’s the last release that promises Ubuntu 22.04 compatibility.

Everything after that may still work — but LLVM no longer wants to guarantee it.

If you want, I can:

check whether a specific newer LLVM tarball will run safely on your system
help you choose between LLVM 17 vs LLVM 22-rc for fuzzing
show how OSS-Fuzz handles this exact issue in practice

Just tell me what you’re aiming for.

extract file.tar.xz to /opt/clang using tar

Sure 👍 Here’s the clean, correct way to extract a .tar.xz archive directly into /opt/clang using tar.

1️⃣ Create the target directory

bash
sudo mkdir -p /opt/clang

(Optional but recommended: make it owned by root and readable by everyone.)

2️⃣ Extract the archive into `/opt/clang`

bash
sudo tar -xJf file.tar.xz -C /opt/clang

Explanation of flags:

-x → extract
-J → use xz decompression
-f → archive file
-C /opt/clang → change directory before extracting

3️⃣ Check what you got

bash
ls /opt/clang

Typically you’ll see something like:

text
clang+llvm-17.0.6-x86_64-linux-gnu-ubuntu-22.04/

4️⃣ (Optional) Flatten the directory

LLVM tarballs usually extract into a versioned subdirectory.
If you want /opt/clang/bin/clang directly:

bash
sudo mv /opt/clang/clang+llvm-*/* /opt/clang/
sudo rmdir /opt/clang/clang+llvm-*

⚠️ Do this only if /opt/clang is empty.

5️⃣ (Optional) Make it usable system-wide (without replacing system clang)

Add to /etc/profile.d/clang.sh:

bash
sudo tee /etc/profile.d/clang.sh >/dev/null <<'EOF'
export PATH=/opt/clang/bin:$PATH
export LLVM_SYMBOLIZER=/opt/clang/bin/llvm-symbolizer
EOF

Or just for your shell:

bash
export PATH=/opt/clang/bin:$PATH
export LLVM_SYMBOLIZER=/opt/clang/bin/llvm-symbolizer

Quick verification

bash
clang --version
which clang

Should point to /opt/clang/bin/clang.

If you want:

a versioned layout (/opt/clang/17, /opt/clang/22)
a symlink-based switch (/opt/clang/current)
a CMake toolchain file pointing at /opt/clang

…say the word and I’ll set it up cleanly.

use long option names

Sure — here’s the same thing using long option names only, nice and explicit 👍

Create the target directory

bash
sudo mkdir --parents /opt/clang

Extract `file.tar.xz` into `/opt/clang`

bash
sudo tar --extract --xz --file=file.tar.xz --directory=/opt/clang

Explanation:

--extract → extract files
--xz → use xz decompression
--file=file.tar.xz → archive file
--directory=/opt/clang → extract into this directory

Verify extraction

bash
ls /opt/clang

You should see a directory like:

text
clang+llvm-17.0.6-x86_64-linux-gnu-ubuntu-22.04/

Optional: flatten the directory (long options)

Only do this if /opt/clang is empty and you want /opt/clang/bin/clang directly:

bash
sudo mv /opt/clang/clang+llvm-*/* /opt/clang/
sudo rmdir --parents /opt/clang/clang+llvm-*

Optional: make it available in PATH

Per-user:

bash
export PATH=/opt/clang/bin:$PATH
export LLVM_SYMBOLIZER=/opt/clang/bin/llvm-symbolizer

System-wide (login shells):

bash
sudo tee /etc/profile.d/clang.sh >/dev/null <<'EOF'
export PATH=/opt/clang/bin:$PATH
export LLVM_SYMBOLIZER=/opt/clang/bin/llvm-symbolizer
EOF

Sanity check

bash
which clang
clang --version

If you want, I can also:

show how to list archive contents with long options
extract only a subdirectory from the tarball
or make a versioned /opt/clang/17 layout with symlinks

Just tell me.

what does --strip-components=1 option in tar do?

--strip-components=1 tells tar to remove the first directory level from every file path while extracting.

It’s easiest to see with an example.

The problem it solves

Many archives are structured like this:

text
clang+llvm-17.0.6-x86_64-linux-gnu-ubuntu-22.04/
├── bin/
│   └── clang
├── lib/
├── include/

If you extract normally:

bash
tar --extract --file=file.tar.xz

you get:

text
./clang+llvm-17.0.6-x86_64-linux-gnu-ubuntu-22.04/bin/clang

Sometimes you don’t want that top-level directory.

What `--strip-components=1` does

bash
tar --extract \
    --xz \
    --file=file.tar.xz \
    --directory=/opt/clang \
    --strip-components=1

Result:

text
/opt/clang/bin/clang
/opt/clang/lib/...
/opt/clang/include/...

Tar:

splits each path by /
removes the first component
extracts the rest

General rule

Option	Effect
`--strip-components=0`	default, strip nothing
`--strip-components=1`	remove top-level directory
`--strip-components=2`	remove two leading directories
…	…

Important caveats (this matters)

1️⃣ Paths must have at least N components

If a file path has fewer components than you strip:

tar skips that file
no error by default

2️⃣ You can destroy directory structure

If different paths become identical after stripping, files may overwrite each other.

3️⃣ Always inspect first

Before using it, do:

bash
tar --list --file=file.tar.xz

(or --list --xz --file=… for .xz)

Typical real-world use cases

LLVM / GCC / Go / Node tarballs

bash
tar --extract --xz --file=llvm.tar.xz --directory=/opt/clang --strip-components=1

Extract only part of an archive

bash
tar --extract --file=archive.tar \
    --strip-components=2 \
    some/top/dir/file.txt

Mental model

Think of --strip-components=1 as:

“Pretend everything inside the archive started one directory deeper.”

If you want, I can:

show how to safely combine --strip-components with --wildcards
explain how it differs from --transform
help write a defensive extraction command for CI scripts

Just say the word.

add target_compile_options and target_link_options to CMakePresets.json

This is an important nuance in CMake, so let’s be very precise.

Short, correct answer

👉 You cannot add target_compile_options() or target_link_options() directly in CMakePresets.json.
Those are target-level commands, and presets operate before targets even exist.

But—you can achieve the same effect indirectly in a few clean ways.

Why this doesn’t work conceptually

CMakePresets.json:
- sets cache variables
- selects compilers, generators, build types
target_compile_options() / target_link_options():
- operate on targets
- run inside CMakeLists.txt
- require the target to already exist

So there is no syntax like this (and never will be):

json
// ❌ impossible
"target_compile_options": ["-fsanitize=address"]

✅ Correct ways to do this

Option 1 (recommended): Use cache variables in presets

(Best for fuzzing / sanitizer builds)

You inject flags globally via cache variables, and your targets inherit them.

Example: `CMakePresets.json`

json
{
  "version": 3,
  "configurePresets": [
    {
      "name": "fuzz",
      "generator": "Ninja",
      "binaryDir": "${sourceDir}/out/build-fuzz",
      "cacheVariables": {
        "CMAKE_C_COMPILER": "/opt/clang/bin/clang",
        "CMAKE_CXX_COMPILER": "/opt/clang/bin/clang++",

        "CMAKE_C_FLAGS": "-g -O1 -fno-omit-frame-pointer -fsanitize=fuzzer,address,undefined",
        "CMAKE_CXX_FLAGS": "-g -O1 -fno-omit-frame-pointer -fsanitize=fuzzer,address,undefined",

        "CMAKE_EXE_LINKER_FLAGS": "-fsanitize=fuzzer,address,undefined",
        "CMAKE_SHARED_LINKER_FLAGS": "-fsanitize=fuzzer,address,undefined"
      }
    }
  ]
}

This is:

portable
preset-controlled
commonly used by OSS-Fuzz-style setups

⚠️ Downside: applies to all targets in that build tree.

Option 2: Preset sets a mode, CMakeLists applies target options

(Most idiomatic CMake)

`CMakePresets.json`

json
{
  "cacheVariables": {
    "ENABLE_FUZZING": "ON"
  }
}

`CMakeLists.txt`

cmake
option(ENABLE_FUZZING "Enable fuzzing instrumentation" OFF)

add_executable(fuzz_processor fuzz_processor.cpp)

if (ENABLE_FUZZING)
    target_compile_options(fuzz_processor PRIVATE
        -g -O1 -fno-omit-frame-pointer
        -fsanitize=fuzzer,address,undefined
    )
    target_link_options(fuzz_processor PRIVATE
        -fsanitize=fuzzer,address,undefined
    )
endif()

✅ Best balance of:

clean presets
target-level precision
reusable project logic

This is the approach I’d recommend for anything non-trivial.

Option 3: Use a toolchain file (advanced / CI-friendly)

Preset points to a toolchain file:

json
{
  "cacheVariables": {
    "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/fuzz-toolchain.cmake"
  }
}

And in fuzz-toolchain.cmake:

cmake
set(CMAKE_C_COMPILER clang)
set(CMAKE_CXX_COMPILER clang++)

set(CMAKE_C_FLAGS_INIT "-fsanitize=fuzzer,address,undefined")
set(CMAKE_CXX_FLAGS_INIT "-fsanitize=fuzzer,address,undefined")
set(CMAKE_EXE_LINKER_FLAGS_INIT "-fsanitize=fuzzer,address,undefined")

Used heavily in cross-compilation and hermetic builds.

What you should not do

❌ Don’t try to:

encode target logic into presets
overload CMAKE_<LANG>_FLAGS for fine-grained per-target control
mix sanitizers across different presets without isolating build directories

Recommendation for your fuzzing setup

Given your earlier context (libFuzzer, portable clang, Ubuntu 22.04):

✔ Use Option 2:

Preset selects mode (ENABLE_FUZZING=ON)
CMakeLists.txt owns target logic
Clean, debuggable, future-proof

If you want, I can:

refactor your existing CMakeLists to be fuzz-ready
design a multi-preset layout (debug, release, fuzz)
show how to avoid sanitizer flags leaking into non-fuzz targets

Just tell me what you want next.

is it possible to run fuzz target through cmake?

Yes — you can run a fuzz target “through CMake” in a few practical ways. CMake won’t fuzz by itself, but it can build and launch the fuzzer via ctest or a custom target.

Option A: Run it with CTest (recommended)

In CMakeLists.txt:

cmake
enable_testing()

add_test(
  NAME fuzz_processor_smoke
  COMMAND fuzz_processor
          -runs=1000
          -max_len=4096
)

Then:

bash
cmake --build <build_dir> --target fuzz_processor
ctest --test-dir <build_dir> -R fuzz_processor_smoke --output-on-failure

Why this is nice:

works with presets (cmake --preset … then ctest --preset …)
you can keep it as a smoke test (limited runs/time) for CI

Tip: In CI, prefer -max_total_time=60 (or similar) instead of infinite fuzzing.

Option B: Add a custom CMake target that runs the fuzzer

cmake
add_custom_target(run_fuzz_processor
  COMMAND $<TARGET_FILE:fuzz_processor>
          -max_total_time=60
          -artifact_prefix=${CMAKE_BINARY_DIR}/fuzz_artifacts/
  DEPENDS fuzz_processor
  USES_TERMINAL
)

Run:

bash
cmake --build <build_dir> --target run_fuzz_processor

This is the closest to “run through CMake build”.

Option C: Use `cmake --build --target test` (CTest default)

If you used Option A (add_test), you can also do:

bash
cmake --build <build_dir> --target test

(That’s equivalent to running CTest in many generators; Ninja/MSBuild behaviors vary slightly.)

Good defaults for fuzzers in CTest/CI

A fuzzer normally runs forever, so for automation give it bounds:

-runs=10000 (fixed iterations) or
-max_total_time=60 (fixed time)
-timeout=5 (per-input timeout) if you suspect hangs

Example CI-friendly command:

cmake
add_test(NAME fuzz_processor_ci
  COMMAND fuzz_processor
          -max_total_time=60
          -timeout=5
          -artifact_prefix=${CMAKE_BINARY_DIR}/fuzz_artifacts/
)

Using presets to run it cleanly

If you have a build preset fuzz:

bash
cmake --preset fuzz
cmake --build --preset fuzz --target fuzz_processor
ctest --preset fuzz -R fuzz_processor_ci --output-on-failure

(CTest presets live in the same CMakePresets.json under testPresets.)

If you paste your current CMakePresets.json (or just the relevant preset), I can write a matching testPresets block + a ready add_test()/add_custom_target() setup tailored to your exact build dir and target name.

another options?

Yep—there are several other solid ways to “run” a libFuzzer target via CMake tooling, depending on what you mean by through CMake (build system integration, IDE integration, install/run, CI, etc.). Here are more options beyond add_test() and a custom run_* target.

Option D: Use `cmake --build` with `--target` that runs `ctest` presets

If you already define tests, some generators let you do:

bash
cmake --build <build_dir> --target test

This triggers CTest from the build system. It’s not perfect across all generators, but it’s common (Ninja/Makefiles).

Option E: Add a “driver” executable target that wraps libFuzzer flags

Create a tiny runner program or script target that calls your fuzzer with standardized flags (time limit, corpus, artifacts). Then CMake runs the runner.

Example (CMake + script):

cmake
configure_file(${CMAKE_SOURCE_DIR}/cmake/run_fuzz_processor.sh.in
               ${CMAKE_BINARY_DIR}/run_fuzz_processor.sh @ONLY)

add_custom_target(run_fuzz_processor_script
  COMMAND /bin/sh ${CMAKE_BINARY_DIR}/run_fuzz_processor.sh
  DEPENDS fuzz_processor
  USES_TERMINAL
)

run_fuzz_processor.sh.in could set ASAN_OPTIONS, create corpus dirs, set LLVM_SYMBOLIZER, etc. This keeps CMakeLists clean and moves “policy” into a script.

Option F: Use CMake presets’ `buildPresets` + `testPresets` to run fuzzing in one command line

This is still CTest, but the difference is you can make a dedicated test preset that runs your fuzzer with arguments and env vars.

In CMakePresets.json you can define testPresets with:

environment (ASAN_OPTIONS, UBSAN_OPTIONS, LLVM_SYMBOLIZER)
filter (include regex for just fuzz tests)
output controls

Then:

bash
ctest --preset fuzz

It feels like “running via presets” and works nicely in CI/IDE.

Option G: Use `install(TARGETS ...)` and run from an installed prefix

If you want “build + install + run” as a standard workflow:

cmake
install(TARGETS fuzz_processor RUNTIME DESTINATION bin)

Then:

bash
cmake --build <build_dir> --target install
<install_prefix>/bin/fuzz_processor -max_total_time=60

This is handy if you ship the fuzzer into a container/image where runtime env is clean.

Option H: Add a `fuzz` “meta target” that builds multiple fuzzers

If you have many fuzz targets, you can define a meta target that depends on all fuzz executables:

cmake
add_custom_target(fuzz_targets)
add_dependencies(fuzz_targets fuzz_processor fuzz_parser fuzz_decoder)

Then:

bash
cmake --build <build_dir> --target fuzz_targets

Combine with one runner script/ctest preset to execute whichever you want.

Option I: Integrate with `CTest` labels + `-L` selection

Label fuzz tests and run them selectively:

cmake
set_tests_properties(fuzz_processor_ci PROPERTIES LABELS "fuzz")

Run:

bash
ctest --test-dir <build_dir> -L fuzz

Great when fuzz smoke tests live alongside unit tests.

Option J: Use `ExternalProject` / `FetchContent` style “superbuild” to run fuzzing isolated

If you really want isolation (no “leaking” sanitizer flags into normal builds), you can make a separate fuzz build configured from the main build:

main build stays clean
fuzz build is a separate CMake invocation managed by CMake

This is heavier, but it’s a legit “CMake runs fuzz build” workflow for large repos.

What I’d pick in practice

Single fuzzer / local dev: custom run_fuzz_* target or script target (Option B/E)
CI smoke fuzzing: CTest test + label + time limit (Option A/I)
Many fuzzers: meta target + label-based selection (Option H/I)
Strict isolation: separate fuzz build tree / superbuild (Option J)

If you tell me which you want (local dev, CI, many fuzzers, isolation), I’ll write the exact CMake + preset snippets for your setup.

i would like to embed in CMakePresets.json as much information, as possible. I would like to avoid creating new toolchain files, CMakeLists.txt etc.

You can put a lot into CMakePresets.json, but there’s a hard limit: presets can only set cache variables / environment / generator / build & test invocations. They cannot create targets or attach target_compile_options() to an existing target without changing CMakeLists.

So the “max you can do with only presets” is:

configure compilers + global flags (compile + link)
build the fuzz target by name
run it via a testPreset (without adding add_test) using ctest “dashboard/script” mode isn’t great; instead, use a buildPreset + a workflowPreset to run commands? (CMake has limited command-running in presets). Practically, you run the executable using cmake --build and then invoke it yourself, unless you already have tests defined.

The best compromise that keeps all fuzz flags out of CMakeLists is:

Use presets to inject global sanitizer/fuzzer flags
Build the fuzz target (which must already exist)
Run it from the command line (or from IDE “launch” config, not CMake)

If your fuzz target already exists as fuzz_processor, here’s the “all in presets” setup.

A “mostly everything in presets” `CMakePresets.json`

This does:

portable clang in /opt/clang
global libFuzzer + ASan + UBSan flags
environment for symbolization and sanitizer behavior
build preset that builds only fuzz_processor
test preset that runs CTest if you already have tests (optional)

json
{
  "version": 6,
  "cmakeMinimumRequired": {
    "major": 3,
    "minor": 23,
    "patch": 0
  },
  "configurePresets": [
    {
      "name": "fuzz",
      "displayName": "Fuzz (clang + libFuzzer + ASan/UBSan)",
      "generator": "Ninja",
      "binaryDir": "${sourceDir}/out/build-fuzz",
      "cacheVariables": {
        "CMAKE_C_COMPILER": "/opt/clang/bin/clang",
        "CMAKE_CXX_COMPILER": "/opt/clang/bin/clang++",
        "CMAKE_BUILD_TYPE": "RelWithDebInfo",

        "CMAKE_C_FLAGS": "-g -O1 -fno-omit-frame-pointer -fsanitize=fuzzer,address,undefined",
        "CMAKE_CXX_FLAGS": "-g -O1 -fno-omit-frame-pointer -fsanitize=fuzzer,address,undefined",

        "CMAKE_EXE_LINKER_FLAGS": "-fsanitize=fuzzer,address,undefined",
        "CMAKE_SHARED_LINKER_FLAGS": "-fsanitize=fuzzer,address,undefined",
        "CMAKE_MODULE_LINKER_FLAGS": "-fsanitize=fuzzer,address,undefined"
      },
      "environment": {
        "LLVM_SYMBOLIZER": "/opt/clang/bin/llvm-symbolizer",
        "ASAN_OPTIONS": "abort_on_error=1:detect_leaks=0:handle_abort=1:symbolize=1",
        "UBSAN_OPTIONS": "halt_on_error=1:abort_on_error=1:print_stacktrace=1:symbolize=1"
      }
    }
  ],
  "buildPresets": [
    {
      "name": "fuzz-build",
      "displayName": "Build fuzz_processor",
      "configurePreset": "fuzz",
      "targets": [ "fuzz_processor" ],
      "jobs": 0
    }
  ],
  "testPresets": [
    {
      "name": "fuzz-test",
      "displayName": "Run ctest (if fuzz tests are registered)",
      "configurePreset": "fuzz",
      "output": { "outputOnFailure": true }
    }
  ]
}

How you use it

bash
cmake --preset fuzz
cmake --build --preset fuzz-build

Now run the fuzzer executable directly:

bash
./out/build-fuzz/fuzz_processor -max_total_time=60 -artifact_prefix=./out/build-fuzz/artifacts/

Why you can’t fully avoid CMakeLists changes (one unavoidable bit)

If you want “run fuzz target through cmake/ctest using presets only”, then the fuzz run must be a CTest test. But CTest only knows about tests that were registered via add_test() (or generated by a framework integration). Without some CMakeLists logic, CTest has nothing to run.

So either:

(A) accept running the fuzzer executable directly (still fully configured by presets), or
(B) add a tiny add_test() line (one-line change) and then you can run via ctest --preset …

You said you want to avoid changing CMakeLists, so A is the max possible.

One more trick: keep flags scoped without changing targets

If you don’t want sanitizers to affect all targets, keep fuzzing in a separate build tree (as above: out/build-fuzz). That gives you clean separation with no CMakeLists changes.

If you tell me:

the exact name of your fuzz executable target (e.g. fuzz_processor?)
where your portable clang lives (/opt/clang is OK?)
whether you want ASan leak detection on/off

…I’ll tailor the presets to your project exactly (still without toolchain files or CMakeLists edits).

What is the most widely used option? Give recommendations: CMakePresets? Cmake toolchains?

In practice, the most widely used pattern today is both — with different jobs:

Toolchain files are the “portable, reusable definition of a compiler + sysroot + ABI”.
CMakePresets.json are the “developer/CI entry points” that pick a toolchain, build dir, and a few toggles.

If I had to rank what’s most common across real projects:

Toolchain files (especially for cross-compiling, embedded, reproducible builds)
Presets (increasingly common since IDEs and CI started supporting them well)
Plain old -D... command lines (still extremely common in older repos)

When CMakePresets are the better default

Use Presets when you want:

consistent configure/build commands for humans + CI
multiple build variants (debug/release/asan/ubsan/fuzz) with different build dirs
per-user overrides via CMakeUserPresets.json (local paths to clang, SDKs, etc.)
minimal “CMake code” changes (you can inject flags with cache vars)

Great for your fuzzing workflow if you’re OK with:

global flags per build tree (sanitizers apply to everything in that build dir)
running fuzzers directly unless you already have add_test()

When toolchain files are the better default

Use a toolchain file when you care about:

true portability / reproducibility of the toolchain definition
cross-compiling (almost mandatory)
consistent compiler + linker + sysroot + CMake “try_compile” behavior
avoiding fragile global flag injection (CMAKE_CXX_FLAGS, etc.)
sharing the exact toolchain setup across repos

For “portable clang distribution” + fuzzing, toolchain files help when:

you need to bundle libc++ / runtime choices
you want to pin exactly how CMake detects and uses the compiler

The most pragmatic recommendation

Use presets for selection + toolchains for definition.

Example pattern:

cmake/toolchains/clang-fuzz.cmake defines compilers + sanitizer instrumentation policy
CMakePresets.json chooses build dirs and sets CMAKE_TOOLCHAIN_FILE + a couple toggles

This scales best and is what you’ll see in many modern CMake repos.

If you insist on “presets only”

It’s workable and common for single-platform projects:

Put compilers and flags in presets (CMAKE_CXX_COMPILER, CMAKE_CXX_FLAGS, linker flags)
Use separate build dirs per preset to avoid “flag leakage”

Just know the ceiling:

presets can’t do per-target options
presets can’t invent a fuzz “run” unless the project already registers tests (CTest)

Quick decision table

Need	Prefer
Multiple build flavors for developers/CI	Presets
Cross-compilation / sysroot / SDK	Toolchain file
Reproducible, shareable toolchain config	Toolchain file
Keep CMakeLists clean from “local machine paths”	Presets (via UserPresets)
Per-target compile/link options	CMakeLists (not presets/toolchain alone)
“Presets only, no extra files”	Presets, accept global flags

What I’d do for your libFuzzer setup

If it’s just Ubuntu and you want minimal files: Presets-only is fine.
If you want this to be robust, shareable, and grow (more fuzzers, CI, maybe different clang versions): toolchain + presets.

If you paste your current CMakePresets.json, I can suggest a clean structure (still minimal) for debug, release, and fuzz variants that won’t contaminate each other.

i want CMakePresets.json to be clean and concise, and store settings in toolchain files to avoid CMakePresets.json to grow big

That’s the most maintainable setup in 2026 CMake-land: small Presets, “real” settings in toolchain files.

Recommended layout

pgsql
CMakePresets.json
cmake/
  toolchains/
    clang-fuzz.cmake
    clang-asan.cmake
    clang-release.cmake

Toolchain files hold:

compiler paths (portable clang in /opt/clang or versioned dir)
sanitizer/fuzzer flags
env helpers (symbolizer, etc.) as CMake cache defaults

Presets stay concise:

generator
binaryDir
which toolchain
build/test presets

Clean, concise `CMakePresets.json`

json
{
  "version": 6,
  "configurePresets": [
    {
      "name": "base",
      "hidden": true,
      "generator": "Ninja",
      "cacheVariables": {
        "CMAKE_EXPORT_COMPILE_COMMANDS": "ON"
      }
    },
    {
      "name": "fuzz",
      "inherits": "base",
      "binaryDir": "${sourceDir}/out/build-fuzz",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "RelWithDebInfo",
        "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/toolchains/clang-fuzz.cmake"
      }
    }
  ],
  "buildPresets": [
    {
      "name": "fuzz",
      "configurePreset": "fuzz",
      "targets": [ "fuzz_processor" ],
      "jobs": 0
    }
  ]
}

Usage:

bash
cmake --preset fuzz
cmake --build --preset fuzz
./out/build-fuzz/fuzz_processor -max_total_time=60

(If you later add add_test, you can add a tiny testPresets block too.)

Example toolchain: `cmake/toolchains/clang-fuzz.cmake`

cmake
# clang-fuzz.cmake
# Toolchain file for clang + libFuzzer + ASan/UBSan.
# Keep it deterministic and repo-friendly.

set(CMAKE_C_COMPILER   "/opt/clang/bin/clang"   CACHE FILEPATH "" FORCE)
set(CMAKE_CXX_COMPILER "/opt/clang/bin/clang++" CACHE FILEPATH "" FORCE)

# Recommend Ninja with clang; not required.
# set(CMAKE_MAKE_PROGRAM "/usr/bin/ninja" CACHE FILEPATH "" FORCE)

# Common flags for fuzzing
set(_FUZZ_COMMON "-g -O1 -fno-omit-frame-pointer")

# libFuzzer is linked via -fsanitize=fuzzer; combine with sanitizers
set(_FUZZ_SAN "-fsanitize=fuzzer,address,undefined")

set(CMAKE_C_FLAGS_INIT   "${_FUZZ_COMMON} ${_FUZZ_SAN}")
set(CMAKE_CXX_FLAGS_INIT "${_FUZZ_COMMON} ${_FUZZ_SAN}")

# Linker flags (executables are what fuzzers are)
set(CMAKE_EXE_LINKER_FLAGS_INIT "${_FUZZ_SAN}")

# Helpful runtime env for better stacks (you still export these in your shell/CI)
# CMake itself can't reliably “export env at runtime” for arbitrary executables,
# but it can help CTest (if you later run via add_test/ctest).
set(ENV{LLVM_SYMBOLIZER} "/opt/clang/bin/llvm-symbolizer")
set(ENV{ASAN_OPTIONS} "abort_on_error=1:detect_leaks=0:handle_abort=1:symbolize=1")
set(ENV{UBSAN_OPTIONS} "halt_on_error=1:abort_on_error=1:print_stacktrace=1:symbolize=1")

Notes

Using *_FLAGS_INIT is toolchain-friendly: it seeds flags early, before most checks.
For fuzzing, I’d keep leak detection off by default (detect_leaks=0) unless you explicitly want it (LSan can slow fuzzing and produce noise).

Extra: keep paths out of git (optional but common)

If /opt/clang differs per dev machine, don’t hardcode it in the toolchain. Two clean patterns:

Pattern A: environment variable

cmake
if(DEFINED ENV{LLVM_HOME})
  set(_LLVM_HOME "$ENV{LLVM_HOME}")
else()
  set(_LLVM_HOME "/opt/clang")
endif()

set(CMAKE_C_COMPILER   "${_LLVM_HOME}/bin/clang" CACHE FILEPATH "" FORCE)
set(CMAKE_CXX_COMPILER "${_LLVM_HOME}/bin/clang++" CACHE FILEPATH "" FORCE)

Pattern B: per-user preset override

Commit CMakePresets.json
Each dev has CMakeUserPresets.json with their local toolchain path

This keeps repo config clean and avoids “works on my machine”.

My recommendation for you

Yes: keep CMakePresets.json small
Yes: store compiler + sanitizer/fuzzer policy in toolchain files
Also: if you want “run fuzz via ctest”, add a single add_test() later; otherwise run the fuzzer executable directly

If you tell me whether your portable clang lives at /opt/clang or versioned like /opt/llvm/17, I’ll tailor the toolchain file so it’s robust and doesn’t require edits when you upgrade LLVM.

can CMAKE_BUILD_TYPE be stored in toolchain file too?

Yes — it can be set in a toolchain file, but whether you should do it depends on how you use CMake. This is one of those “allowed, but sharp edges” areas.

Let’s be precise.

The factual answer

✔ Yes, you can set CMAKE_BUILD_TYPE in a toolchain file, e.g.:

cmake
set(CMAKE_BUILD_TYPE "RelWithDebInfo" CACHE STRING "" FORCE)

CMake will honor it for single-config generators (Ninja, Makefiles).

But here’s the important nuance

Toolchain files are meant to define:

compiler
sysroot / SDK
ABI
platform properties

`CMAKE_BUILD_TYPE` defines:

build configuration, not toolchain
optimization / debug policy

That mismatch is why setting it in a toolchain file is legal but controversial.

When it’s reasonable to do it

Setting CMAKE_BUILD_TYPE in a toolchain file makes sense when:

you use single-config generators only (Ninja/Makefiles)
the build type is inseparable from the toolchain
(e.g. “this is a fuzz toolchain”)

Example: a fuzz toolchain that must always be RelWithDebInfo.

cmake
# clang-fuzz.cmake
set(CMAKE_BUILD_TYPE "RelWithDebInfo" CACHE STRING "" FORCE)

This guarantees:

no accidental Debug builds with slow fuzzing
reproducible behavior across machines and CI

This pattern is used in fuzzing and sanitizer setups.

When you should NOT do it

❌ Avoid setting CMAKE_BUILD_TYPE in a toolchain file if:

you use multi-config generators (Visual Studio, Xcode, Ninja Multi-Config)
you want to switch Debug/Release without reconfiguring
the same toolchain should support multiple build types

In multi-config generators:

CMAKE_BUILD_TYPE is ignored
configuration is chosen at build time (--config Debug)
setting it in a toolchain file is meaningless or confusing

Best-practice recommendation (cleanest design)

Keep this separation:

Concern	Where
Compiler, sanitizers, fuzzer	Toolchain file
Build type (Debug/Release/RelWithDebInfo)	Preset

This keeps:

toolchains reusable
presets readable
behavior unsurprising

Example (recommended)

Toolchain

cmake
# clang-fuzz.cmake
set(CMAKE_C_COMPILER   "/opt/clang/bin/clang"   CACHE FILEPATH "" FORCE)
set(CMAKE_CXX_COMPILER "/opt/clang/bin/clang++" CACHE FILEPATH "" FORCE)

set(_SAN "-fsanitize=fuzzer,address,undefined")
set(CMAKE_C_FLAGS_INIT   "${_SAN}")
set(CMAKE_CXX_FLAGS_INIT "${_SAN}")
set(CMAKE_EXE_LINKER_FLAGS_INIT "${_SAN}")

Preset

json
{
  "cacheVariables": {
    "CMAKE_BUILD_TYPE": "RelWithDebInfo",
    "CMAKE_TOOLCHAIN_FILE": "cmake/toolchains/clang-fuzz.cmake"
  }
}

This is the most common and least surprising setup in modern CMake projects.

A compromise pattern (used by some teams)

If you really want the toolchain to provide a default but allow override:

cmake
if(NOT DEFINED CMAKE_BUILD_TYPE)
  set(CMAKE_BUILD_TYPE "RelWithDebInfo" CACHE STRING "" FORCE)
endif()

This:

sets a sane default
still allows presets or CLI to override it

Bottom line

✔ Possible: yes
⚠️ Recommended: only if the build type is intrinsic to the toolchain (fuzzing often is)
✅ Best practice: toolchain = how you build, preset = what you build

If you want, I can:

review your current toolchain file for sharp edges
suggest naming conventions (clang-fuzz, clang-asan, etc.)
show how to make this work cleanly with Ninja Multi-Config

Just say the word.

i want to define running fuzz tests in testPreset

You can define how fuzz tests are run in a testPreset, but not what they are.

That distinction is the key constraint.

The hard limit (cannot be bypassed)

❌ testPresets cannot invent or define tests.
CTest only runs tests that were registered at configure time via:

cmake
add_test(NAME ... COMMAND ...)

So:

testPreset ✅ controls how CTest runs
testPreset ❌ cannot define what command is a test

This is a fundamental CMake/CTest rule.

👉 Therefore: at least one add_test() must exist somewhere
(one line is enough; no toolchain or logic needed).

Minimal, acceptable CMakeLists change (one-liner)

If your fuzz target already exists as fuzz_processor, this is literally all you need:

cmake
add_test(NAME fuzz_processor COMMAND fuzz_processor)

That’s it.
No flags, no logic, no presets involved.

Everything else goes into CMakePresets.json.

Now: defining fuzz execution in `testPreset` (your goal)

Once the test exists, you can fully define fuzz execution behavior in testPresets.

Example `testPresets` for fuzzing

json
{
  "testPresets": [
    {
      "name": "fuzz",
      "displayName": "Run fuzz tests (time-bounded)",
      "configurePreset": "fuzz",
      "filter": {
        "include": {
          "name": "fuzz_.*"
        }
      },
      "execution": {
        "timeoutSeconds": 300
      },
      "environment": {
        "LLVM_SYMBOLIZER": "/opt/clang/bin/llvm-symbolizer",
        "ASAN_OPTIONS": "abort_on_error=1:detect_leaks=0:symbolize=1",
        "UBSAN_OPTIONS": "halt_on_error=1:abort_on_error=1:print_stacktrace=1"
      },
      "output": {
        "outputOnFailure": true
      }
    }
  ]
}

Run it with:

bash
ctest --preset fuzz

How do fuzz flags get passed?

There are three valid patterns. Pick one.

Pattern A (cleanest): encode fuzz limits in `add_test`

cmake
add_test(
  NAME fuzz_processor
  COMMAND fuzz_processor
          -max_total_time=60
          -timeout=5
          -artifact_prefix=${CMAKE_BINARY_DIR}/fuzz_artifacts/
)

Pros:

deterministic
CI-friendly
fully controlled by CTest + presets

Cons:

flags live in CMakeLists

Pattern B (advanced): use `CTEST_PARALLEL_LEVEL` + env only

Let the test run “normally”, but control behavior via environment:

json
"environment": {
  "LIBFUZZER_OPTIONS": "-max_total_time=60 -timeout=5"
}

And in your fuzzer main:

cpp
const char* opts = std::getenv("LIBFUZZER_OPTIONS");
// parse or forward as needed

⚠️ This requires custom code and is not common.

Pattern C (hybrid, common in CI)

add_test(NAME fuzz_processor COMMAND fuzz_processor)
limit runtime using CTest timeout
rely on sanitizer aborts to fail the test

This is widely used for “fuzz smoke tests”.

What `testPreset` is excellent for

Once the test exists, presets shine at:

✅ selecting fuzz-only tests (filter.include.name)
✅ setting sanitizer + symbolizer env
✅ bounding runtime (timeoutSeconds)
✅ CI vs local variants
✅ keeping CMakeLists clean and boring

What is NOT possible (by design)

❌ Define a test command in testPreset
❌ Attach arguments to a specific test from testPreset
❌ Run arbitrary executables via ctest without add_test

This is not a missing feature — it’s a deliberate design choice in CMake.

Recommendation for your stated goals

You want:

clean CMakePresets.json
logic in toolchains
minimal CMakeLists noise
fuzz execution defined via presets

➡ Best achievable design:

One-line add_test() per fuzz target
Everything else in testPresets
Separate fuzz build tree

This is exactly how many large CMake + fuzz projects do it.

If you want, next I can:

show a naming convention (fuzz_*) that scales to many fuzzers
design two testPresets: fuzz-smoke (CI) vs fuzz-long
explain how OSS-Fuzz maps onto this model

Just say what you want next.

i also would like to be able to easily run fuzz tests from clion

Yep — you can make this smooth in CLion, but there’s one key reality:

CLion can run CTest tests and CMake targets easily, and it can run an executable with args easily — but CTest tests still have to exist in CMakeLists (even if it’s just one-liners). Presets can’t define tests.

Here’s the cleanest setup that keeps presets/toolchains doing the heavy lifting and makes CLion “one click run”.

Best approach for CLion: CTest + CMake Presets

1) Minimal CMakeLists additions (small, scalable)

Assuming your fuzz targets are already defined (e.g. add_executable(fuzz_processor ...)), add only:

cmake
enable_testing()
add_test(NAME fuzz_processor COMMAND $<TARGET_FILE:fuzz_processor>)
set_tests_properties(fuzz_processor PROPERTIES LABELS fuzz)

That gives CLion something it can discover and run under Tests.

If you have many fuzzers, repeat the add_test line per target (or use a small helper macro if you ever relax the “no CMake code” rule).

2) Put the “fuzz run policy” in `CMakePresets.json` testPresets

You have two practical ways:

A) CI/smoke style: bound by CTest timeout (no fuzzer args needed)

json
{
  "testPresets": [
    {
      "name": "fuzz-smoke",
      "displayName": "Fuzz smoke (CTest timeout)",
      "configurePreset": "fuzz",
      "filter": { "include": { "label": "fuzz" } },
      "execution": { "timeoutSeconds": 60 },
      "environment": {
        "LLVM_SYMBOLIZER": "/opt/clang/bin/llvm-symbolizer",
        "ASAN_OPTIONS": "abort_on_error=1:detect_leaks=0:symbolize=1",
        "UBSAN_OPTIONS": "halt_on_error=1:abort_on_error=1:print_stacktrace=1:symbolize=1"
      },
      "output": { "outputOnFailure": true }
    }
  ]
}

Pros: CMakeLists stays clean (no args embedded)
Cons: libFuzzer will run “forever”, and CTest will just kill it at 60s (fine for smoke).

B) “Proper” fuzzing: pass libFuzzer args in the test command (recommended)

This is the only way to reliably control -max_total_time, corpus dirs, artifacts, etc. without extra scripts.

cmake
add_test(NAME fuzz_processor
  COMMAND $<TARGET_FILE:fuzz_processor>
          -max_total_time=60
          -timeout=5
          -artifact_prefix=${CMAKE_BINARY_DIR}/fuzz_artifacts/
          ${CMAKE_SOURCE_DIR}/fuzz/corpus/processor
)
set_tests_properties(fuzz_processor PROPERTIES LABELS fuzz)

Then the same testPreset just selects label fuzz.

Pros: best developer experience + reproducible behavior
Cons: a few args live in CMakeLists (but it’s still minimal and very maintainable)

Running from CLion (what to do)

Use CLion’s CMake Presets support

Settings / Preferences → Build, Execution, Deployment → CMake
Enable CMake Presets (and choose your fuzz configure preset as a profile)
Reload CMake

Now CLion will:

configure using your fuzz preset (clang + sanitizers from toolchain)
show tests under Tests tool window (CTest)

Run fuzzers from CLion with one click

Open Tests tool window
Find fuzz_processor (or filter by label if CLion shows labels)
Click Run / Debug

If you used pattern B (args in add_test), you’re done: it will run bounded fuzzing.

Alternative CLion workflow: Run configuration for the fuzzer executable

If you truly want zero args in CMakeLists, you can:

create a CLion “Application” run config for fuzz_processor
set program args there (-max_total_time=60 ...)

Pros: no CMakeLists changes beyond add_executable
Cons: run configs are per-user / IDE state unless you export/share them; not great for team/CI consistency.

My recommendation for your goals

You said:

keep CMakePresets.json clean
keep settings in toolchain files
run easily from CLion

The best “team friendly + CLion friendly” setup is:

Toolchain file contains compilers + sanitizer/fuzzer flags
Preset selects toolchain + build dir
CTest test exists for each fuzzer
Put fuzzer args in add_test (just the minimal stable ones: time limit, artifact prefix, corpus path)
Use testPreset to select label fuzz and set sanitizer env

That gives you:

one-click run in CLion
consistent behavior in CI
minimal growth of presets and CMakeLists

If you tell me:

your fuzz target name(s)
whether you have a corpus directory (and where you want it)
whether you want CI smoke (60s) + local longer run

…I’ll output an exact minimal snippet (CMakeLists + Presets) tailored to your paths and conventions.

If I add fuzz test via cmake add_test, then can i set different parameters to test, like max_total_time?

Yes — absolutely.
Once a fuzz target is registered with add_test(), you can give each test its own parameters, including libFuzzer options like -max_total_time.

There are three clean, idiomatic ways to do this in CMake.

✅ 1) Pass libFuzzer parameters directly in `add_test` (most common)

This is the most widely used and the simplest.

cmake
add_test(
  NAME fuzz_processor_smoke
  COMMAND $<TARGET_FILE:fuzz_processor>
          -max_total_time=30
          -timeout=5
          -artifact_prefix=${CMAKE_BINARY_DIR}/fuzz_artifacts/
)

You can define multiple tests for the same fuzzer:

cmake
add_test(
  NAME fuzz_processor_long
  COMMAND $<TARGET_FILE:fuzz_processor>
          -max_total_time=600
          -timeout=10
)

Then select which one to run via:

test name
labels
testPreset filters

✔ different -max_total_time
✔ different corpus paths
✔ different artifact directories
✔ fully reproducible
✔ CLion-friendly

This is the recommended approach for libFuzzer.

✅ 2) Use `set_tests_properties()` to control execution limits

This does not pass libFuzzer flags, but controls CTest behavior.

cmake
add_test(NAME fuzz_processor COMMAND $<TARGET_FILE:fuzz_processor>)

set_tests_properties(fuzz_processor PROPERTIES
  TIMEOUT 60          # seconds; CTest kills the process
  LABELS fuzz
)

What this means:

libFuzzer still runs “forever”
CTest forcibly stops it after 60s
test is considered failed if killed

This is commonly used for CI smoke fuzzing.

⚠️ Limitation:

you cannot control libFuzzer internals (corpus, artifact prefix, per-input timeout)

✅ 3) Set per-test environment variables (advanced)

You can also set environment variables per test:

cmake
set_tests_properties(fuzz_processor PROPERTIES
  ENVIRONMENT
    "ASAN_OPTIONS=abort_on_error=1:detect_leaks=0"
    "UBSAN_OPTIONS=halt_on_error=1"
)

This is useful for:

sanitizer tuning
symbolizer paths
suppressions

⚠️ libFuzzer does NOT read options from env vars by default
(-max_total_time must still be a command-line arg).

🔁 Combine all three (real-world pattern)

cmake
add_test(
  NAME fuzz_processor_ci
  COMMAND $<TARGET_FILE:fuzz_processor>
          -max_total_time=60
          -timeout=5
          -artifact_prefix=${CMAKE_BINARY_DIR}/fuzz_artifacts/
)

set_tests_properties(fuzz_processor_ci PROPERTIES
  LABELS fuzz
  TIMEOUT 70
  ENVIRONMENT
    "ASAN_OPTIONS=abort_on_error=1:detect_leaks=0:symbolize=1"
    "UBSAN_OPTIONS=halt_on_error=1:print_stacktrace=1"
)

This gives you:

deterministic fuzzing behavior
safety net timeout
clean CI + CLion integration

How this plays with `testPresets` and CLion

Once tests exist:

testPresets can:
- select which fuzz tests to run (by name or label)
- define sanitizer env globally
- bound execution time further if desired

json
"filter": {
  "include": { "label": "fuzz" }
}

In CLion:

tests appear in the Tests tool window
you can run fuzz_processor_ci or fuzz_processor_long with one click

Recommendation for your setup

Given your goals (clean presets, toolchains for config, CLion-friendly):

✔ Put libFuzzer parameters in add_test()
✔ Use multiple tests per fuzzer for different modes (smoke vs long)
✔ Use labels to select via testPresets
✔ Keep presets focused on selection, not per-test behavior

This is the most scalable and least surprising approach.

If you want, I can:

suggest a naming convention (fuzz_*_smoke, fuzz_*_long)
show how to auto-generate fuzz tests from a list of targets
help keep artifact/corpus dirs tidy per test

Just tell me what you want next.

i would like to avoid hardcoding these parameters in cmake files, and control it from higher level

Yes — you can avoid hardcoding libFuzzer params in CMake files and control them “from above”, but with one key constraint:

CTest/testPresets cannot change a test’s command-line arguments directly.
So you need the test command in CMake to reference variables (cache vars / env), and then you set those vars from CMakePresets.json.

That gives you a clean, scalable “higher-level control” with minimal CMake changes.

Best pattern: test command reads args from cache variable

Minimal CMake code (stable, not hardcoded)

cmake
enable_testing()

# A cache var you control from presets/CLI.
set(FUZZER_ARGS "" CACHE STRING "Extra args passed to libFuzzer targets")

# Turn a space-separated string into a CMake list (so it becomes proper argv)
separate_arguments(FUZZER_ARGS)

add_test(
  NAME fuzz_processor
  COMMAND $<TARGET_FILE:fuzz_processor> ${FUZZER_ARGS}
)

set_tests_properties(fuzz_processor PROPERTIES LABELS fuzz)

What this achieves

CMakeLists contains no policy values like -max_total_time=60
It just says: “run this fuzzer with whatever args higher level provides”

Control it from `CMakePresets.json`

Example: `testPresets` sets the args (via cache at configure time)

Because FUZZER_ARGS is a cache variable, you set it in the configure preset (not test preset):

json
{
  "version": 6,
  "configurePresets": [
    {
      "name": "fuzz-smoke",
      "generator": "Ninja",
      "binaryDir": "${sourceDir}/out/build-fuzz-smoke",
      "cacheVariables": {
        "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/toolchains/clang-fuzz.cmake",
        "CMAKE_BUILD_TYPE": "RelWithDebInfo",
        "FUZZER_ARGS": "-max_total_time=30 -timeout=5 -artifact_prefix=out/build-fuzz-smoke/artifacts/"
      }
    },
    {
      "name": "fuzz-long",
      "generator": "Ninja",
      "binaryDir": "${sourceDir}/out/build-fuzz-long",
      "cacheVariables": {
        "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/toolchains/clang-fuzz.cmake",
        "CMAKE_BUILD_TYPE": "RelWithDebInfo",
        "FUZZER_ARGS": "-max_total_time=600 -timeout=10 -artifact_prefix=out/build-fuzz-long/artifacts/"
      }
    }
  ],
  "testPresets": [
    {
      "name": "fuzz-smoke",
      "configurePreset": "fuzz-smoke",
      "filter": { "include": { "label": "fuzz" } },
      "environment": {
        "LLVM_SYMBOLIZER": "/opt/clang/bin/llvm-symbolizer",
        "ASAN_OPTIONS": "abort_on_error=1:detect_leaks=0:symbolize=1",
        "UBSAN_OPTIONS": "halt_on_error=1:abort_on_error=1:print_stacktrace=1"
      },
      "output": { "outputOnFailure": true }
    },
    {
      "name": "fuzz-long",
      "configurePreset": "fuzz-long",
      "filter": { "include": { "label": "fuzz" } },
      "output": { "outputOnFailure": true }
    }
  ]
}

Run:

bash
cmake --preset fuzz-smoke
cmake --build --preset fuzz-smoke
ctest --preset fuzz-smoke

or:

bash
cmake --preset fuzz-long
cmake --build --preset fuzz-long
ctest --preset fuzz-long

This is the cleanest “higher-level control”: different presets = different fuzz parameters, no hardcoded values in CMake.

Notes and gotchas

separate_arguments(FUZZER_ARGS) is important; otherwise CMake treats the whole string as one argument.
Keeping separate build dirs (out/build-fuzz-smoke, out/build-fuzz-long) prevents cache collisions and keeps things reproducible.
testPreset is still useful for env (ASAN_OPTIONS, symbolizer), filtering by label, output.

CLion workflow

CLion can work nicely with this setup:

Choose the configure preset (smoke vs long) in CLion’s CMake profile/presets UI
Then run the CTest test fuzz_processor from the Tests window
The args come from that preset’s cache (FUZZER_ARGS)

So you get “one click run” and high-level control.

If you want, I can also show a variant where:

FUZZER_ARGS is different per fuzzer (e.g. FUZZER_ARGS_processor, FUZZER_ARGS_parser)
or where corpus directories are injected cleanly without hardcoding paths.

cmake option() directive

option() is CMake’s built-in way to define boolean feature switches that can be controlled from the “outside” (CLI, presets, cache, GUIs) without hard-coding behavior in CMakeLists.txt.

Given your goal (high-level control via presets, minimal CMake logic), this fits perfectly.

What `option()` does

cmake
option(ENABLE_FUZZING "Enable fuzzing support" OFF)

This creates a cache variable:

Name: ENABLE_FUZZING
Type: BOOL
Default: OFF
User-configurable via:
- -DENABLE_FUZZING=ON
- CMakePresets.json
- CLion / CMake GUI

Once defined, you use it like a normal variable:

cmake
if (ENABLE_FUZZING)
  # fuzz-related logic
endif()

Why `option()` is better than raw cache variables

Compared to:

cmake
set(ENABLE_FUZZING OFF CACHE BOOL "")

option():

documents intent (description shows up in GUIs/CLion)
provides a default
is idiomatic and expected by CMake users
avoids accidental type misuse

Minimal fuzz-friendly pattern using `option()`

This matches your “control from higher level” requirement.

CMakeLists.txt (small, policy-only)

cmake
option(ENABLE_FUZZING "Enable fuzz targets" OFF)

enable_testing()

if (ENABLE_FUZZING)
  add_test(
    NAME fuzz_processor
    COMMAND $<TARGET_FILE:fuzz_processor> ${FUZZER_ARGS}
  )
  set_tests_properties(fuzz_processor PROPERTIES LABELS fuzz)
endif()

No hardcoded flags.
No build-type logic.
No toolchain logic.

Drive it from `CMakePresets.json`

json
{
  "configurePresets": [
    {
      "name": "fuzz",
      "binaryDir": "${sourceDir}/out/build-fuzz",
      "cacheVariables": {
        "ENABLE_FUZZING": "ON",
        "FUZZER_ARGS": "-max_total_time=60 -timeout=5"
      }
    }
  ]
}

Now:

fuzz tests exist only when ENABLE_FUZZING=ON
arguments come from presets
CMakeLists stays stable and clean

Advanced: multiple fuzz “modes” without code duplication

You can stack options:

cmake
option(ENABLE_FUZZING "Enable fuzz tests" OFF)
option(FUZZ_LONG_RUN "Enable long-running fuzz tests" OFF)

cmake
if (ENABLE_FUZZING)
  set(_args "${FUZZER_ARGS}")
  if (FUZZ_LONG_RUN)
    list(APPEND _args "-max_total_time=600")
  endif()

  add_test(NAME fuzz_processor COMMAND $<TARGET_FILE:fuzz_processor> ${_args})
endif()

Presets choose behavior:

json
"cacheVariables": {
  "ENABLE_FUZZING": "ON",
  "FUZZ_LONG_RUN": "ON"
}

Important caveats (worth knowing)

1️⃣ `option()` is evaluated at configure time

Changing it requires reconfigure (which presets naturally do).

2️⃣ `option()` is global

It’s not target-scoped. That’s intentional: it represents build configuration, not per-target logic.

3️⃣ Boolean only

If you need structured values (paths, arg lists), use:

cmake
set(FUZZER_ARGS "" CACHE STRING "libFuzzer arguments")

…and combine with option().

Recommended pattern (what you’re converging to)

Purpose	Mechanism
Enable fuzzing	`option(ENABLE_FUZZING)`
Control fuzz parameters	cache `STRING` vars
Compiler/sanitizers	toolchain file
Build variant	preset
Run selection	testPreset + labels

This is exactly how large, well-maintained CMake projects stay sane.

If you want, I can:

refactor a small example showing option() + presets + CLion flow end-to-end
show how to keep options discoverable in CLion’s UI
help you design a naming scheme that won’t get messy as fuzzers grow

Just say what you want next.

cmake, custom target running fuzz tests

Here’s the clean CMake pattern for a custom target that runs fuzz tests, while keeping the parameters controlled from “higher level” (cache vars / presets), and keeping hardcoded args out of CMake as much as possible.

1) CMakeLists.txt: custom target `run_fuzz`

This assumes you already have a fuzzer executable target called fuzz_processor.

cmake
# Enable/disable fuzz runner from outside (presets/CLI)
option(ENABLE_FUZZING "Enable fuzz targets and runners" OFF)

# libFuzzer args controlled from presets/CLI
set(FUZZER_ARGS "" CACHE STRING "Arguments passed to libFuzzer targets (space-separated)")
separate_arguments(FUZZER_ARGS)  # convert string -> list for correct argv splitting

# Optional: where to put crash artifacts
set(FUZZ_ARTIFACT_DIR "${CMAKE_BINARY_DIR}/fuzz_artifacts" CACHE PATH "Directory for fuzz artifacts")

if (ENABLE_FUZZING)
  # Ensure artifact dir exists before running
  add_custom_target(run_fuzz
    COMMAND ${CMAKE_COMMAND} -E make_directory "${FUZZ_ARTIFACT_DIR}"
    COMMAND $<TARGET_FILE:fuzz_processor>
            -artifact_prefix=${FUZZ_ARTIFACT_DIR}/
            ${FUZZER_ARGS}
    DEPENDS fuzz_processor
    USES_TERMINAL
    COMMENT "Running libFuzzer: fuzz_processor"
  )
endif()

How to run

bash
cmake --build <build_dir> --target run_fuzz

This gives you “run through CMake” via the build system.

2) Control everything from `CMakePresets.json`

Example preset that keeps it clean:

json
{
  "version": 6,
  "configurePresets": [
    {
      "name": "fuzz",
      "generator": "Ninja",
      "binaryDir": "${sourceDir}/out/build-fuzz",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "RelWithDebInfo",
        "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/toolchains/clang-fuzz.cmake",

        "ENABLE_FUZZING": "ON",
        "FUZZER_ARGS": "-max_total_time=60 -timeout=5 -max_len=4096",
        "FUZZ_ARTIFACT_DIR": "${sourceDir}/out/fuzz_artifacts/"
      }
    }
  ],
  "buildPresets": [
    {
      "name": "run-fuzz",
      "configurePreset": "fuzz",
      "targets": [ "run_fuzz" ]
    }
  ]
}

Run it end-to-end:

bash
cmake --preset fuzz
cmake --build --preset run-fuzz

3) Multiple fuzz targets (optional “meta runner”)

If you have more fuzzers, you can create separate runners:

cmake
add_custom_target(run_fuzz_processor
  COMMAND ${CMAKE_COMMAND} -E make_directory "${FUZZ_ARTIFACT_DIR}"
  COMMAND $<TARGET_FILE:fuzz_processor> -artifact_prefix=${FUZZ_ARTIFACT_DIR}/ ${FUZZER_ARGS}
  DEPENDS fuzz_processor
  USES_TERMINAL
)

…and a meta target:

cmake
add_custom_target(run_all_fuzz)
add_dependencies(run_all_fuzz run_fuzz_processor run_fuzz_parser)

Then cmake --build --target run_all_fuzz.

Notes / gotchas

add_custom_target() runs at build time, not test time (so it’s not CTest). That’s fine if your goal is “run via CMake”.
separate_arguments() is important so FUZZER_ARGS gets split into real argv items.
If you want different params per fuzzer, use per-target cache vars, e.g. FUZZER_ARGS_processor.

If you tell me your fuzz target names (one or many) and where you want corpus/artifacts, I’ll give you a neat multi-fuzzer setup that still keeps presets clean and avoids hardcoding.

i have this fuzzing code: extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t *data, std::size_t size) { const std::vector<std::uint8_t> input(data, data + size); return 0; } i need to perform some init steps before the actual test launches

Use libFuzzer’s optional initializer hook: LLVMFuzzerInitialize. It is called once, before the first LLVMFuzzerTestOneInput run. Put your init steps there.

Example (one-time init)

cpp
#include <cstdint>
#include <cstddef>
#include <vector>
#include <mutex>

// Include your project headers
#include "processor.hpp"

// One-time init flag (extra safety if you ever call init lazily too)
static std::once_flag g_init_once;

static void DoInitOnce(int* argc, char*** argv) {
    // Put your init here, examples:
    // - set up logging
    // - seed deterministic RNG
    // - load configuration
    // - initialize singletons
    // - create temp dirs
    //
    // Keep it deterministic and avoid network / long I/O for fuzzing.
    (void)argc;
    (void)argv;

    // Example:
    // my_library::Init();
}

extern "C" int LLVMFuzzerInitialize(int* argc, char*** argv) {
    std::call_once(g_init_once, [&] { DoInitOnce(argc, argv); });
    return 0;
}

extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
    // Optional: if you prefer lazy init instead of LLVMFuzzerInitialize:
    // std::call_once(g_init_once, [&] { DoInitOnce(nullptr, nullptr); });

    const std::vector<std::uint8_t> input(data, data + size);

    processor p;
    (void)p.process(input);

    return 0;
}

Notes

LLVMFuzzerInitialize is the canonical place for “init before fuzzing starts”.
It receives argc/argv, so you can parse custom flags if you want (but keep it simple).
Prefer deterministic init and avoid expensive work; if you need external resources, consider embedding them or using a small local fixture.
If your init needs to happen even when the fuzzer is linked differently, the std::call_once guard makes it robust.

If you tell me what your init steps are (e.g., “load a config file”, “initialize crypto library”, “set logger”), I can show the safest fuzzing-friendly way to do each (including how to avoid nondeterminism and slowdowns).

should i use static_cast<void>(argc) or (void) argc

Use (void)argc; (and (void)argv;) in C++.

Why `(void)argc;` is preferred

It’s the idiomatic, universally recognized way to mark an intentionally unused variable.
It’s shorter and clearer.
It works in both C and C++, which matters here because:
- libFuzzer entry points are extern "C"
- fuzz harnesses often look C-like even in C++ files

Example:

cpp
extern "C" int LLVMFuzzerInitialize(int* argc, char*** argv) {
    (void)argc;
    (void)argv;

    // init code
    return 0;
}

About `static_cast<void>(argc)`

This is also valid C++, but:

It’s verbose
Slightly awkward for simple “unused parameter” suppression
Mostly used in template-heavy or strongly typed contexts where casts already exist

You’ll rarely see it in:

libc++
LLVM
compiler-rt
libFuzzer examples

They overwhelmingly use (void)x;.

Even better: use attributes (when available)

If you control the signature and your compiler supports it, this is the cleanest:

cpp
extern "C" int LLVMFuzzerInitialize([[maybe_unused]] int* argc,
                                    [[maybe_unused]] char*** argv) {
    // init
    return 0;
}

But:

attributes can’t always be added cleanly to mandated signatures
some projects prefer C-style suppression for portability

TL;DR

✅ Use (void)argc;
static_cast<void>(argc) is fine but uncommon here
[[maybe_unused]] is great when signatures allow it

Your instinct to ask is good — consistency matters in fuzz harnesses.

i need length of std::vector in fuzz test to have some specific length

That’s a very common fuzzing requirement, and there are several correct ways to handle it.
Which one you choose depends on whether the length is a hard precondition or just a preferred shape.

I’ll walk from most common / recommended → more specialized.

1️⃣ Enforce a hard length requirement (most common)

If your code only makes sense for a specific length (or range), reject inputs that don’t match.

Exact length (e.g. exactly 32 bytes)

cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    if (size != 32) {
        return 0;  // reject input
    }

    std::vector<uint8_t> input(data, data + size);

    processor p;
    p.process(input);
    return 0;
}

Minimum length

cpp
if (size < 16) {
    return 0;
}

Length range

cpp
if (size < 16 || size > 1024) {
    return 0;
}

✅ Best when:

length is a real API precondition
shorter inputs are meaningless
you want focused fuzzing

libFuzzer handles this well; rejected inputs are cheap.

2️⃣ Resize / pad input to required length (very common)

If you want every fuzz input to run, but still enforce a shape:

Pad with zeros

cpp
constexpr size_t REQUIRED_LEN = 64;

std::vector<uint8_t> input(data, data + size);
input.resize(REQUIRED_LEN, 0);  // pad or truncate

Truncate only

cpp
if (size > REQUIRED_LEN) {
    input.resize(REQUIRED_LEN);
}

✅ Best when:

function expects a fixed-size buffer
you want coverage even from small inputs
padding is semantically acceptable

⚠️ Padding can slightly bias fuzzing, but usually acceptable.

3️⃣ Split input: header + payload (recommended for structure)

Very useful when:

first N bytes control behavior
rest is data

cpp
constexpr size_t HEADER = 8;
if (size < HEADER) return 0;

uint64_t flags;
std::memcpy(&flags, data, sizeof(flags));

std::vector<uint8_t> payload(data + HEADER, data + size);

Or enforce payload size:

cpp
if (payload.size() != 128) return 0;

This dramatically improves fuzzing efficiency for structured code.

4️⃣ Use libFuzzer’s built-in size controls (coarse control)

You can limit size globally via runtime flags:

bash
-max_len=256

But:

this controls maximum, not exact length
you still need checks in code

Good as a first line of defense, not a full solution.

5️⃣ Advanced: deterministic expansion (hash-based)

If you need exact size but want entropy preserved:

cpp
std::vector<uint8_t> input(REQUIRED_LEN);
for (size_t i = 0; i < REQUIRED_LEN; ++i) {
    input[i] = data[i % size];
}

Or mix with a hash.

✅ Rare, but useful for crypto / parsers expecting fixed-size blobs.

What NOT to do

❌ Don’t assert/crash on wrong size

cpp
assert(size == 32); // bad for fuzzing

❌ Don’t silently ignore most of the buffer unless intentional

Recommended decision table

Requirement	Best approach
Exact length required	Early return
Minimum length	Early return
Fixed buffer but flexible	Resize/pad
Structured input	Split input
Reduce input size	`-max_len` + checks

Example: clean, idiomatic fuzz harness

cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    constexpr size_t REQUIRED_LEN = 64;
    if (size < REQUIRED_LEN) return 0;

    std::vector<uint8_t> input(data, data + REQUIRED_LEN);

    processor p;
    p.process(input);
    return 0;
}

If you tell me:

exact length or range
whether padding is allowed
whether input is structured (header/body)

…I’ll give you the optimal harness for your case (coverage-friendly and fast).

during running fuzz tests, i observe hundreds of tests using exactly the same input

That can happen, and it’s usually due to one of a few common causes. Here are the most likely explanations and what to do about each.

1) You’re running with a fixed seed (or CLion keeps reusing the same seed)

If the fuzzer is launched with -seed=N, it will generate the same sequence every run. Also, some IDE run configs accidentally keep a seed or reuse a fixed corpus.

What to check:

Are you passing -seed=... anywhere (CTest, run config, script)?
Are you always starting from the same tiny corpus directory?

Fix:

Remove -seed (or vary it)
Or add -seed=$(date +%s) in scripts (not great for reproducibility, but good for exploration)

2) libFuzzer is doing “calibration / minimization” work

libFuzzer can execute the same input many times while it:

measures execution speed
checks for flaky crashes
performs internal reduction/merging

This is normal especially at startup or when it finds something interesting.

What to check:

Run with -verbosity=1 and watch messages like “pulse”, “minimize”, “merge”, “REDUCE”, etc.
If you see “Trying to find minimal…”, repeats are expected.

3) Your harness rejects almost all inputs, so only a few survive

If you do something like:

cpp
if (size != 32) return 0;

then most generated inputs get discarded, and the effective “useful” space is tiny. libFuzzer may keep reusing a small set that passes the filter.

Fixes:

Prefer range checks over exact checks: if (size < 32) return 0;
Or use resize/pad instead of rejecting
Or use -max_len to guide sizes instead of strict filtering

4) Your code ignores most bytes (only uses a prefix), so many inputs are equivalent

If you only consume the first N bytes, different inputs become effectively identical. libFuzzer will appear to repeat “the same” input because mutations outside the consumed region don’t change behavior.

Fix:

Make sure you actually use the whole buffer (or at least a lot of it)
If your format is structured, parse more fields from the input rather than just a few bytes

5) You enabled `-runs` with a very small corpus and `-mutate_depth`/`-max_len` constraints

With a tiny corpus and strong constraints, libFuzzer may spend many iterations mutating but effectively not producing novel behavior, so it keeps picking the same base input.

Fix:

Provide an initial corpus directory with a few diverse seeds
Increase allowed size (-max_len)
Let it run longer (novelty grows with time if the space isn’t overly constrained)

6) Determinism issues: the target behaves “the same” even on different inputs

If the function always returns the same result and doesn’t branch much, libFuzzer won’t find coverage improvements and will keep reusing inputs while trying mutations that don’t help.

Fix:

Add more structure to input to reach more code paths (e.g., interpret header bytes as flags/lengths)
Consider adding a simple grammar-ish decoding of the input

How to diagnose quickly (2 commands)

A) Increase libFuzzer logging

Run your fuzzer with:

bash
./fuzz_processor -verbosity=1 -print_final_stats=1

If repeats are due to minimization/merge/calibration, you’ll see it.

B) Print a fingerprint of the input you actually use

A super-effective trick is to hash the bytes you consume:

cpp
#include <cstdint>
#include <cstddef>

static uint64_t fnv1a(const uint8_t* d, size_t n) {
    uint64_t h = 1469598103934665603ull;
    for (size_t i = 0; i < n; ++i) { h ^= d[i]; h *= 1099511628211ull; }
    return h;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    // e.g., only first 32 bytes are used:
    size_t used = (size < 32) ? size : 32;
    volatile uint64_t sig = fnv1a(data, used);
    (void)sig;

    // ... rest ...
    return 0;
}

If the signature is the same over and over, your harness is effectively seeing identical inputs (because of filtering, truncation, etc.). If signatures differ but behavior doesn’t, your code path is likely constant.

Most common root cause in your earlier context

You mentioned wanting a “specific length”. If you used an exact-length reject (size != N), that is the #1 cause of “same input over and over”.

Best fix in fuzzing practice:

accept a range, or
resize() to N, rather than rejecting.

Example:

cpp
std::vector<uint8_t> input(data, data + size);
input.resize(N, 0); // ensures exact length but keeps all runs meaningful

If you paste:

the current harness (including your size handling / init),
and the exact libFuzzer command line CLion/CTest runs,

I can tell you exactly which cause it is and the minimal change to fix it.

NOTE: libFuzzer has rudimentary signal handlers. Combine libFuzzer with AddressSanitizer or similar for better crash reports.

Yep — that message is normal, and it’s basically libFuzzer reminding you:

it can catch crashes via basic signal handlers,
but you’ll get much better diagnostics (stack traces, heap corruption details, UBs) if you build/run with sanitizers like ASan/UBSan.

What you should do (typical fuzz build)

Compile and link your fuzz target with:

libFuzzer: -fsanitize=fuzzer
AddressSanitizer: -fsanitize=address
UndefinedBehaviorSanitizer: -fsanitize=undefined

Example (clang++):

bash
clang++ -O1 -g -fno-omit-frame-pointer \
  -fsanitize=fuzzer,address,undefined \
  fuzz_processor.cpp -o fuzz_processor

With CMake, same idea via compile/link flags or your fuzz toolchain file.

Make crash reports actually readable

Also ensure symbolization works:

bash
export LLVM_SYMBOLIZER=/opt/clang/bin/llvm-symbolizer
export ASAN_OPTIONS=abort_on_error=1:symbolize=1:detect_leaks=0
export UBSAN_OPTIONS=halt_on_error=1:print_stacktrace=1:symbolize=1

Then when it crashes, you’ll see proper file/line stacks instead of raw addresses.

Why the message appears even when sanitizers are enabled

Sometimes you’ll still see that note because:

libFuzzer prints it unconditionally at startup in some builds, or
it doesn’t detect the sanitizer runtime early enough to silence the note

So the right way to verify is:

bash
./fuzz_processor -help=1 | head

and/or just cause a controlled crash to see if ASan reports.

One more tip for fuzzing quality

Prefer:

-O1 (faster instrumentation)
-fno-omit-frame-pointer (better stacks)

And keep LSan off during fuzzing unless you specifically want leaks (detect_leaks=0), because leak checks slow down fuzzing a lot.

If you show me your current CMake/toolchain flags (or your fuzzer binary startup line), I can tell you whether you’re actually getting ASan/UBSan and why the note is still printed.

i have that: add_executable(fuzztest fuzztest/fuzztest.cpp) target_link_libraries(fuzztest PRIVATE ${PROJECT_NAME}) target_compile_options(fuzztest PRIVATE -g -O1 -fno-omit-frame-pointer -fsanitize=fuzzer,address,undefined) target_link_options(fuzztest PRIVATE -fsanitize=fuzzer,address,undefined)

That setup is basically correct for clang/libFuzzer + ASan/UBSan. A few practical tweaks will make it more robust and less surprising.

1) Don’t pass the sanitizer flags only as options if your project may use `try_compile`

For fuzz targets it’s fine, but if you want to be extra safe, ensure the compile + link both get the same sanitizer set (you already do). Good.

2) Add `-fno-sanitize-recover=all` (optional but common)

This makes UBSan/ASan abort immediately instead of trying to recover (better for fuzzing).

cmake
target_compile_options(fuzztest PRIVATE
  -g -O1 -fno-omit-frame-pointer
  -fno-sanitize-recover=all
  -fsanitize=fuzzer,address,undefined
)

3) Consider disabling leak detection during fuzzing (runtime, not compile)

LeakSanitizer can slow fuzzing and create noise. Prefer runtime env:

ASAN_OPTIONS=detect_leaks=0

(You can still enable it in a separate “leaks” run/preset.)

4) Ensure you’re really using Clang

-fsanitize=fuzzer is clang-specific. If someone configures with GCC, this will break.

Good guard:

cmake
if (NOT CMAKE_CXX_COMPILER_ID MATCHES "Clang")
  message(FATAL_ERROR "fuzztest requires Clang (libFuzzer: -fsanitize=fuzzer)")
endif()

5) Make symbolization reliable

Not a CMake flag; set env when running (CTest preset / CLion run config):

LLVM_SYMBOLIZER=/path/to/llvm-symbolizer
ASAN_OPTIONS=abort_on_error=1:symbolize=1:detect_leaks=0
UBSAN_OPTIONS=halt_on_error=1:print_stacktrace=1:symbolize=1

6) If you still see the “rudimentary signal handlers” note

That message can appear even when sanitizers are enabled; it’s not necessarily an error. The real check is: when a crash happens, do you get an ASan/UBSan report with stack traces? If yes, you’re fine.

A polished version of your snippet

cmake
add_executable(fuzztest fuzztest/fuzztest.cpp)
target_link_libraries(fuzztest PRIVATE ${PROJECT_NAME})

if (NOT CMAKE_CXX_COMPILER_ID MATCHES "Clang")
  message(FATAL_ERROR "fuzztest requires Clang (libFuzzer: -fsanitize=fuzzer)")
endif()

target_compile_options(fuzztest PRIVATE
  -g -O1 -fno-omit-frame-pointer
  -fno-sanitize-recover=all
  -fsanitize=fuzzer,address,undefined
)

target_link_options(fuzztest PRIVATE
  -fsanitize=fuzzer,address,undefined
)

If you tell me how you run it (CLI/CTest/CLion), I can give you the exact testPreset/CLion env settings so the crash reports are fully symbolized.

artifact_prefix='./'; Test unit written to ./crash-50dde49f5881ea6253f932c6f77d8dcdf0407219

That output means libFuzzer found a crashing input and wrote it to disk.

artifact_prefix='./' → “put artifacts in the current working directory”
Test unit written to ./crash-50dde49f5881ea6253f932c6f77d8dcdf0407219 → the minimized input that triggers the crash is saved in that file.

How to reproduce the crash

Run your fuzz target on that exact file:

bash
./fuzztest ./crash-50dde49f5881ea6253f932c6f77d8dcdf0407219

If you want it to stop immediately on the first crash and print a clean report:

bash
ASAN_OPTIONS=abort_on_error=1:symbolize=1:detect_leaks=0 \
UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_stacktrace=1:symbolize=1 \
LLVM_SYMBOLIZER=/opt/clang/bin/llvm-symbolizer \
./fuzztest ./crash-50dde49f5881ea6253f932c6f77d8dcdf0407219

Why it wrote into “./”

Because your fuzzer was launched with something like:

-artifact_prefix=./
or -artifact_prefix=.

If you want a dedicated directory (recommended):

bash
mkdir -p artifacts
./fuzztest -artifact_prefix=artifacts/ ...

If you run via CTest/CLion

The working directory might be the build dir or project root, so artifacts land “wherever you launched from”. To make it deterministic:

set -artifact_prefix=${CMAKE_BINARY_DIR}/fuzz_artifacts/ (in your test command), or
set it in the higher-level args you inject (your FUZZER_ARGS cache var approach).

Next steps

Reproduce with the crash file (command above).
Debug: run under gdb/lldb:

bash
lldb -- ./fuzztest ./crash-50dde49f5881ea6253f932c6f77d8dcdf0407219

Optionally minimize further (libFuzzer can already minimize, but you can re-run with):

bash
./fuzztest -minimize_crash=1 ./crash-...

If you paste the ASan/UBSan stack trace from the reproducer run, I can help interpret it and suggest the most likely bug class and fix.

output crash log to some directory

That’s controlled entirely by libFuzzer, via -artifact_prefix.
Set it to a directory path (ending with /) and all crash/minimized inputs will go there.

The correct way (recommended)

1️⃣ Choose a directory

Example: fuzz_artifacts/

bash
mkdir -p fuzz_artifacts

2️⃣ Run the fuzzer with `-artifact_prefix`

bash
./fuzztest -artifact_prefix=fuzz_artifacts/

Result:

text
Test unit written to fuzz_artifacts/crash-50dde49f5881ea6253f932c6f77d8dcdf0407219

Absolute path (best for CI / CTest / CLion)

bash
./fuzztest -artifact_prefix=/full/path/to/fuzz_artifacts/

This avoids “where did it write?” confusion when the working directory changes.

Via CMake (clean, higher-level control)

If you already use a cache variable like FUZZER_ARGS:

cmake
# CMakeLists.txt
set(FUZZER_ARGS "" CACHE STRING "libFuzzer arguments")
separate_arguments(FUZZER_ARGS)

add_custom_target(run_fuzz
  COMMAND $<TARGET_FILE:fuzztest> ${FUZZER_ARGS}
  DEPENDS fuzztest
  USES_TERMINAL
)

Then in CMakePresets.json:

json
"cacheVariables": {
  "FUZZER_ARGS": "-artifact_prefix=${sourceDir}/out/fuzz_artifacts/ -max_total_time=60"
}

Now all crashes go to:

bash
out/fuzz_artifacts/

Via CTest (if using `add_test`)

cmake
add_test(
  NAME fuzztest
  COMMAND $<TARGET_FILE:fuzztest>
          -artifact_prefix=${CMAKE_BINARY_DIR}/fuzz_artifacts/
)

CTest + CLion will now always store crashes in a predictable place.

Important details (easy to miss)

Directory must exist — libFuzzer does NOT create it
Path must end with /
If not set, libFuzzer writes to the current working directory
Crash files are inputs, not logs
(the log goes to stdout/stderr)

Typical, battle-tested layout

powershell
out/
  build-fuzz/
  fuzz_artifacts/
    crash-xxxx
    timeout-yyyy
    leak-zzzz

TL;DR

Use:

bash
-artifact_prefix=/path/to/dir/

That’s the only correct and supported way.

If you want, I can:

wire this cleanly into your CTest preset
make CLion always write artifacts to the same directory
show how to auto-tag artifacts per fuzzer target

Just say the word.

is name "fuzz_artifacts" also some kind of industrial agreement?

Short answer: yes — more or less.
Longer answer: it’s not a standard, but it is an industry-wide convention that emerged independently in many projects.

Here’s the nuance.

Is `fuzz_artifacts` an official standard?

No.

It’s not defined by LLVM
not defined by libFuzzer
not defined by CMake, OSS-Fuzz, or any RFC

You won’t find it documented as “the correct name”.

Why `fuzz_artifacts` shows up everywhere anyway

1️⃣ libFuzzer itself uses the word artifact

The flag is literally:

diff
-artifact_prefix=

So people naturally name the directory after what it contains: artifacts.

2️⃣ OSS-Fuzz popularized the concept (without enforcing a name)

In OSS-Fuzz:

crashes, timeouts, leaks, OOMs are all called artifacts
the infrastructure collects them into an “artifacts” directory internally

Local setups mirrored that terminology:


fuzz_artifacts/


artifacts/

3️⃣ It avoids semantic overload

Compare alternatives:

Name	Problem
`crashes/`	excludes timeouts, leaks, OOM
`logs/`	these are inputs, not logs
`corpus/`	wrong meaning
`outputs/`	vague
`results/`	vague

fuzz_artifacts:

clearly says “fuzzing outputs”
doesn’t lie about contents
scales as you add sanitizers

4️⃣ Widely used in real projects

You’ll see variations of this in:

LLVM
Chromium
Android
many internal security teams
CI pipelines that archive fuzz results

Common variants:


fuzz_artifacts/
artifacts/
fuzz-results/
fuzz_out/

fuzz_artifacts is the least ambiguous.

What is effectively standardized

These things are de-facto standards:

using -artifact_prefix
keeping artifacts outside the corpus
storing artifacts per fuzzer or per run
not mixing artifacts with logs

Directory names are convention, but the concept is standard practice.

Recommended naming (what I’d use)

For a single fuzzer:

bash
out/fuzz_artifacts/

For multiple fuzzers:

bash
out/fuzz_artifacts/
  fuzz_processor/
  fuzz_parser/
  fuzz_decoder/

For CI:

bash
out/fuzz_artifacts/${CI_JOB_ID}/

TL;DR

❌ not a formal standard
✅ strong, industry-wide convention
✅ aligns with libFuzzer terminology
✅ understood instantly by fuzzing engineers

Using fuzz_artifacts will never confuse anyone who’s done fuzzing before — which is exactly what you want.

If you want, I can suggest a directory layout that scales cleanly from local dev → CI → OSS-Fuzz-style workflows.

is it possible to do something like that? cmake --preset fuzz cmake --build --preset fuzz --target fuzztest -artifact_prefix=fuzz_artifacts

Not directly.

cmake --build … does not accept arbitrary extra arguments to pass to the program you’re building/running. -artifact_prefix=… is a runtime argument for your fuzzer executable, not a build option, so CMake will either reject it or treat it as an invalid build-tool arg.

What does work

1) Build, then run the executable with args (simplest)

bash
cmake --preset fuzz
cmake --build --preset fuzz --target fuzztest
./out/build-fuzz/fuzztest -artifact_prefix=fuzz_artifacts/

2) Create a custom “run” target (so you can do it via `cmake --build`)

In CMake:

cmake
add_custom_target(run_fuzztest
  COMMAND $<TARGET_FILE:fuzztest> -artifact_prefix=${CMAKE_BINARY_DIR}/fuzz_artifacts/
  DEPENDS fuzztest
  USES_TERMINAL
)

Then:

bash
cmake --build --preset fuzz --target run_fuzztest

3) Use CTest (best if you want `ctest --preset …`)

cmake
enable_testing()
add_test(NAME fuzztest COMMAND $<TARGET_FILE:fuzztest> -artifact_prefix=${CMAKE_BINARY_DIR}/fuzz_artifacts/)

Run:

bash
ctest --preset fuzz

If you want “higher-level control” without hardcoding

Use a cache var (set from presets) and expand it in the test/runner:

cmake
set(FUZZER_ARGS "" CACHE STRING "Args for fuzzers")
separate_arguments(FUZZER_ARGS)
add_custom_target(run_fuzztest
  COMMAND $<TARGET_FILE:fuzztest> ${FUZZER_ARGS}
  DEPENDS fuzztest
  USES_TERMINAL
)

Preset sets:

json
"cacheVariables": {
  "FUZZER_ARGS": "-artifact_prefix=${sourceDir}/fuzz_artifacts/ -max_total_time=60"
}

So: you can get the workflow you want, but it requires either:

running the executable after build, or
a custom run_* target / add_test wrapper.

If you tell me whether you prefer run_* target or ctest, I’ll give you the cleanest minimal snippet for your setup.

im using gitlab, and in case of failed test, i want to expose crash report as artifact

Yes — this is a very standard GitLab CI pattern.

Two important details up front:

libFuzzer “crash artifacts” are files containing the crashing input (e.g. crash-*, timeout-*), not the full console log.
The “crash report” (ASan/UBSan stack trace, etc.) normally goes to stdout/stderr → GitLab job log. If you want it as an artifact too, you should tee stdout/stderr into a file.

Below is a clean, working .gitlab-ci.yml pattern.

Option 1: Upload crash inputs + a saved log file (recommended)

yaml
stages: [build, fuzz]

fuzz_smoke:
  stage: fuzz
  image: ubuntu:22.04
  variables:
    # Put artifacts inside the project dir so GitLab can collect them.
    FUZZ_ARTIFACT_DIR: "$CI_PROJECT_DIR/fuzz_artifacts"
    # Helpful sanitizer settings for readable stack traces
    ASAN_OPTIONS: "abort_on_error=1:detect_leaks=0:symbolize=1"
    UBSAN_OPTIONS: "halt_on_error=1:abort_on_error=1:print_stacktrace=1:symbolize=1"
    # If you use a portable clang:
    LLVM_SYMBOLIZER: "/opt/clang/bin/llvm-symbolizer"
  script:
    - mkdir -p "$FUZZ_ARTIFACT_DIR"
    - cmake --preset fuzz
    - cmake --build --preset fuzz --target fuzztest

    # Run bounded fuzzing and save output to a file.
    # `set -o pipefail` ensures the job fails if fuzztest fails even though we tee the output.
    - |
      set -o pipefail
      ./out/build-fuzz/fuzztest \
        -max_total_time=60 \
        -timeout=5 \
        -artifact_prefix="$FUZZ_ARTIFACT_DIR/" \
        2>&1 | tee "$FUZZ_ARTIFACT_DIR/fuzztest.log"
  artifacts:
    when: always              # upload even if the job fails
    expire_in: 7 days
    paths:
      - fuzz_artifacts/

What you’ll get in GitLab job artifacts:

fuzz_artifacts/crash-* (and/or timeout-*, oom-*, etc.)
fuzz_artifacts/fuzztest.log (the stack trace and libFuzzer output)

Option 2: Only upload on failure

If you prefer not to store artifacts on successful runs:

yaml
artifacts:
  when: on_failure
  expire_in: 7 days
  paths:
    - fuzz_artifacts/

Notes that prevent common CI headaches

1) Ensure `-artifact_prefix` ends with `/`

libFuzzer expects a prefix, and for directories it should look like:

-artifact_prefix=/path/to/dir/

2) Use an absolute path inside the job workspace

$CI_PROJECT_DIR/fuzz_artifacts is safe. Relative paths depend on the job’s working directory.

3) Don’t rely on “current directory” in CTest/CLion/CI

Always set -artifact_prefix explicitly in CI so artifacts go where you expect.

If you tell me whether you run fuzzing via ctest or by calling ./fuzztest directly in CI, I can tailor the YAML to your exact flow (including ctest --preset … and collecting artifacts per fuzzer target).

using libfuzzer, in my test i need to generate std::vector<std::uint8_t> input with the size of 830 elements, and every element must take only 0 or 1 value

This is a very common and very solvable fuzzing shape 👍
You want:

a std::vector<uint8_t> of exactly 830 elements
each element ∈ {0, 1}
driven by libFuzzer input (not random / not fixed)

Below are the three correct patterns, from most recommended → more specialized.

✅ Recommended pattern: map raw fuzz bytes → fixed-size bit-vector

This keeps all entropy, avoids rejection bias, and gives libFuzzer full control.

Canonical solution

cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    constexpr size_t N = 830;

    if (size == 0) return 0;  // avoid modulo by zero

    std::vector<uint8_t> input(N);
    for (size_t i = 0; i < N; ++i) {
        input[i] = data[i % size] & 1;  // force {0,1}
    }

    // use input
    // processor.process(input);

    return 0;
}

Why this is the best default

✅ always produces exactly 830 elements
✅ every element is guaranteed 0 or 1
✅ no early rejection → no “same input over and over” problem
✅ libFuzzer still mutates meaningful bits
✅ deterministic and fast

This is widely used in fuzzers for:

bitsets
boolean masks
adjacency matrices
neural nets
SAT/SMT encodings

⚠️ What NOT to do (important)

❌ Rejecting inputs of wrong size

cpp
if (size != 830) return 0;   // BAD for fuzzing

This causes:

massive rejection
repeated identical inputs
very poor coverage

You already observed this symptom earlier.

Alternative patterns (use only if needed)

2️⃣ One bit per bit (denser, more entropy)

If you want to use every bit, not every byte:

cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    constexpr size_t N = 830;
    constexpr size_t BYTES_NEEDED = (N + 7) / 8;

    if (size < BYTES_NEEDED) return 0;

    std::vector<uint8_t> input(N);
    for (size_t i = 0; i < N; ++i) {
        input[i] = (data[i / 8] >> (i % 8)) & 1;
    }

    return 0;
}

Use this when:

you care about bit-level structure
performance is critical
you want minimal input size

3️⃣ Split structure: control bits + data bits

If some bits control behavior and others are payload:

cpp
constexpr size_t CONTROL_BITS = 64;
constexpr size_t TOTAL_BITS = 830;

if (size * 8 < TOTAL_BITS) return 0;

std::vector<uint8_t> input(TOTAL_BITS);
for (size_t i = 0; i < TOTAL_BITS; ++i) {
    input[i] = (data[i / 8] >> (i % 8)) & 1;
}

// control flags live in input[0..63]
// payload in input[64..]

This scales very well for complex logic.

libFuzzer flags you should use with this harness

bash
-max_len=256    # or even smaller; modulo mapping handles expansion

Because you expand input deterministically, you do not need large inputs.

This speeds up fuzzing a lot.

TL;DR (what you should use)

Use this:

cpp
input[i] = data[i % size] & 1;

It is:

idiomatic
fuzz-friendly
deterministic
high-coverage

If you want, I can:

optimize this for performance (branchless, vectorized)
show how to seed a corpus that matches this shape
help you visualize coverage improvements from this mapping

Just say the word.

can i force libfuzzer to generate input of some specific length all the time? i would like to avoid rejections

Not really. libFuzzer can be guided toward a maximum length, but it can’t be told “always give me exactly N bytes” as a hard guarantee.

What you can do (and what most fuzzers do) is:

avoid rejections in the harness by mapping whatever size you get into the exact shape you need.

That’s the right solution for your “830 bytes of {0,1}” case.

What libFuzzer can do about length

✅ Limit maximum length

You can set:

bash
-max_len=104

That tells libFuzzer the input size should not exceed that maximum.

But:

it may still generate inputs of many different sizes ≤ max_len
it won’t guarantee a fixed size

There is also -len_control=0/1 (a heuristic about whether to mutate length), but even with length control disabled, you still won’t get “always N”.

So: you can reduce size variability, not eliminate it.

Best solution: accept any size and map to your required size (no rejections)

For your requirement:

vector length = 830
elements ∈ {0,1}
no rejections

Use this pattern:

cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    constexpr size_t N = 830;
    if (size == 0) return 0;   // tiny, unavoidable guard

    std::vector<uint8_t> input(N);
    for (size_t i = 0; i < N; ++i) {
        input[i] = data[i % size] & 1;
    }

    // processor p; p.process(input);
    return 0;
}

This yields:

always exactly 830 values
always only 0/1
no rejection bias
fast and deterministic

If you’d like to avoid even the size==0 early return, you can default the buffer:

cpp
if (size == 0) {
    static const uint8_t zero = 0;
    data = &zero;
    size = 1;
}

Then you truly never reject.

If you want inputs to be naturally close to your “effective size”

Since your harness expands input anyway, you can keep libFuzzer inputs small for speed, e.g.:

bash
-max_len=64

You still get a full 830-bit vector via mapping, but fuzzing is much faster.

Bottom line

❌ You can’t force libFuzzer to always generate exactly length N.
✅ You can avoid rejections by mapping any input length to your required fixed-size structure (recommended).

If you tell me whether those 830 bits represent a structured object (header/payload), I can suggest a mapping that gives libFuzzer even better control and faster coverage growth.

write method std::vector<std::uint8_t> hexstr2bitset(const std::string& input) converting hexstring (example: "DEADBEEF") to std::vector<std::uint8_t>, but each element of resulting vector should take only 0 or 1 value and represent each respective bit

Here’s a clean, correct, and idiomatic implementation that does exactly what you want.

Requirements recap

Input: hex string like "DEADBEEF"
Output: std::vector<std::uint8_t>
Each element is 0 or 1
Each element represents one bit
Order is well-defined and predictable

I’ll use MSB → LSB per hex digit, which is the most common and least surprising convention.

Canonical implementation

cpp
#include <cstdint>
#include <string>
#include <vector>
#include <stdexcept>

std::vector<std::uint8_t> hexstr2bitset(const std::string& input) {
    std::vector<std::uint8_t> bits;
    bits.reserve(input.size() * 4);

    auto hex_to_nibble = [](char c) -> std::uint8_t {
        if (c >= '0' && c <= '9') return static_cast<std::uint8_t>(c - '0');
        if (c >= 'a' && c <= 'f') return static_cast<std::uint8_t>(c - 'a' + 10);
        if (c >= 'A' && c <= 'F') return static_cast<std::uint8_t>(c - 'A' + 10);
        throw std::invalid_argument("Invalid hex character");
    };

    for (char c : input) {
        std::uint8_t v = hex_to_nibble(c);

        // Extract bits: MSB → LSB
        bits.push_back((v >> 3) & 1);
        bits.push_back((v >> 2) & 1);
        bits.push_back((v >> 1) & 1);
        bits.push_back((v >> 0) & 1);
    }

    return bits;
}

Example

cpp
auto bits = hexstr2bitset("DEADBEEF");

Hex → binary:

ini
D = 1101
E = 1110
A = 1010
D = 1101
B = 1011
E = 1110
E = 1110
F = 1111

Resulting vector (32 elements):

text
1 1 0 1  1 1 1 0  1 0 1 0  1 1 0 1
1 0 1 1  1 1 1 0  1 1 1 0  1 1 1 1

Each element is guaranteed to be 0 or 1.

Important notes (especially for fuzzing)

1️⃣ Deterministic and fast

No branching in the hot loop (except validation)
Perfect for fuzz harnesses

2️⃣ Bit order is explicit

This function defines:

hex digit order: left → right
bit order: MSB → LSB

If you ever need LSB → MSB instead, just reverse the shift order.

Variant: tolerate invalid characters (fuzzer-friendly)

If this will be used inside a fuzz target, throwing exceptions is usually undesirable. Here’s a fuzz-safe variant:

cpp
std::vector<std::uint8_t> hexstr2bitset(const std::string& input) {
    std::vector<std::uint8_t> bits;
    bits.reserve(input.size() * 4);

    for (char c : input) {
        std::uint8_t v;
        if      (c >= '0' && c <= '9') v = c - '0';
        else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
        else continue;  // skip invalid chars

        bits.push_back((v >> 3) & 1);
        bits.push_back((v >> 2) & 1);
        bits.push_back((v >> 1) & 1);
        bits.push_back((v >> 0) & 1);
    }

    return bits;
}

This avoids:

exceptions
early exits
rejection bias in fuzzing

TL;DR

✔ one hex char → 4 bits
✔ output vector contains only 0 or 1
✔ deterministic bit ordering
✔ fuzz-friendly variant provided

If you want:

LSB-first ordering
fixed output length (e.g. always 830 bits)
or mapping directly from uint8_t* data (no string allocation)

…I can tailor it exactly to your fuzz harness.

For libfuzzer, not being able to generate random input of some specific size (withut tricks) seems to me like some foundational miss I can't understand. For example, I need to generate input with size of 100 and strip the data (after 100 bytes) libfuzzer generates. Result is I see myriads of the same input, because libfuzzer make changes AFTER those 100 bytes

What you’re running into is a consequence of how libFuzzer is designed: it’s not a random generator, it’s an evolutionary mutational fuzzer. It mutates existing corpus inputs to get new coverage. If your harness discards everything after byte 100, then mutations after byte 100 are invisible to the target, so many different “raw” inputs become the same effective input — and you observe “myriads of the same input”.

The good news: you don’t need “rejections” to fix this, and you don’t need to accept junk after 100.

The real fix: stop libFuzzer from generating inputs longer than 100

Use:

bash
-max_len=100

This tells libFuzzer the maximum length of generated inputs. llvm.googlesource.com+1

Then your harness that uses only the first 100 bytes will no longer be sabotaged by mutations “after 100”, because those bytes will never exist.

If you want libFuzzer to start trying near-max_len sizes sooner, you can also try:

bash
-len_control=0

which is documented as “immediately try inputs with size up to max_len”. llvm.googlesource.com
(Practically, it may still begin from small seeds unless you provide a suitable seed corpus—see next point.)

Make it start at exactly 100 bytes right away (no rejections)

Provide a seed corpus containing at least one file of exactly 100 bytes. libFuzzer uses the corpus to guide sizes. (Docs explicitly note -max_len and seed corpus influence size behavior.) GitHub+1

Example:

bash
mkdir -p corpus
python3 - <<'PY'
open("corpus/seed100", "wb").write(b"\x00"*100)
PY

./fuzztest corpus -max_len=100

Now mutations start from a 100-byte input, and stay ≤ 100 bytes.

Your harness should not “reject”; just normalize

Even with -max_len=100, it’s good practice to normalize to a fixed 100 bytes so you never reject and never depend on size:

cpp
constexpr size_t N = 100;

std::array<uint8_t, N> buf{};
if (size > 0) {
    const size_t take = std::min(size, N);
    std::memcpy(buf.data(), data, take);
}
// Now buf is always exactly 100 bytes; short inputs are zero-padded.

This avoids the “many inputs collapse to the same behavior” problem that happens when you only look at a prefix but allow long inputs.

If you truly need exactly 100 bytes always (strongest option)

libFuzzer doesn’t have a “fixed length” flag (only max_len etc.). llvm.googlesource.com
The supported way to force a fixed-size representation is to use LLVMFuzzerCustomMutator (a “built-in extension point”): you make the fuzzer’s mutation step always produce size 100. That’s the clean, official way—just more code.

TL;DR

What you’re seeing is expected when you ignore bytes after 100: mutations there don’t matter, so effective inputs repeat.
Set -max_len=100 so libFuzzer never generates longer inputs. llvm.googlesource.com+1
Seed with a 100-byte file so it starts at that size immediately. GitHub+1
Normalize/pad to 100 instead of rejecting.

If you show me the exact command line you run (especially any corpus path and flags), I can suggest the tightest set of flags for your case (fast startup, stable 100-byte behavior, minimal repeats).

I MUST enforce exactly 100 bytes from libfuzzer. Can i use this LLVMFuzzerCustomMutator

Yes — LLVMFuzzerCustomMutator is exactly the “official” way to enforce structural constraints like “inputs are always exactly 100 bytes”, without rejections or harness tricks. libFuzzer will call your mutator for every mutation step, and you return the new size. llvm.org+1

That said, there’s one important caveat:

You can enforce size = 100 for all mutations.
The initial corpus can still contain files of other sizes, so you should normalize them (once) either by:
- seeding with at least one 100-byte file, or
- handling non-100 sizes in the mutator (recommended).

Below is a robust implementation that guarantees the fuzzer’s internal “current unit” becomes exactly 100 bytes and stays that way.

Custom mutator that forces exactly 100 bytes

cpp
#include <cstddef>
#include <cstdint>
#include <algorithm>

// libFuzzer exposes this helper you may call inside your mutator.
// (Declared in FuzzerInterface.h; provided by libFuzzer runtime.)
extern "C" size_t LLVMFuzzerMutate(uint8_t* Data, size_t Size, size_t MaxSize);

extern "C" size_t LLVMFuzzerCustomMutator(
    uint8_t* Data, size_t Size, size_t MaxSize, unsigned int Seed) {

    (void)Seed;
    constexpr size_t kFixedSize = 100;

    // libFuzzer promises MaxSize is the max you may return.
    // If this ever happens, you must not overflow.
    if (MaxSize < kFixedSize) {
        // Best effort: clamp to MaxSize (rare unless misconfigured).
        // This breaks the "MUST be 100" requirement, but avoids UB.
        // Consider aborting instead if you truly require 100.
        return MaxSize;
    }

    // Normalize size to exactly 100: pad with zeros or truncate.
    if (Size < kFixedSize) {
        std::fill(Data + Size, Data + kFixedSize, 0);
        Size = kFixedSize;
    } else if (Size > kFixedSize) {
        Size = kFixedSize;
    }

    // Now mutate IN PLACE but keep size fixed.
    // We call LLVMFuzzerMutate with MaxSize == kFixedSize so the helper
    // cannot change the size beyond 100.
    (void)LLVMFuzzerMutate(Data, Size, kFixedSize);

    return kFixedSize;
}

Why this works

LLVMFuzzerCustomMutator lets you return the new size (≤ MaxSize). llvm.googlesource.com
By forcing Size to 100 before and after mutation, and calling LLVMFuzzerMutate(..., MaxSize=100), you prevent growth/shrink. llvm.googlesource.com
You avoid all rejections in LLVMFuzzerTestOneInput.

Your `LLVMFuzzerTestOneInput` becomes simple (no truncation needed)

cpp
#include <cstdint>
#include <cstddef>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    // With the custom mutator above, size should effectively always be 100.
    // You can keep a debug assert if you want:
    // if (size != 100) __builtin_trap();

    // ... consume all 100 bytes ...
    return 0;
}

Strong recommendation: seed corpus with a 100-byte file

Even though the mutator normalizes, seeding with a correct-size input helps the fuzzer start “in the right shape” immediately.

bash
mkdir -p corpus
python3 - <<'PY'
open("corpus/seed100", "wb").write(b"\x00"*100)
PY
./fuzztest corpus -max_len=100

Also set -max_len=100 to stop libFuzzer from trying to create larger units (your mutator will clamp anyway, but this reduces wasted work). llvm.org

Extra: if you use cross-over, enforce 100 there too (optional)

If you don’t implement LLVMFuzzerCustomCrossOver, libFuzzer will use its default crossover + then your mutator will normalize later. Usually fine. If you want it perfectly fixed end-to-end, you can also implement LLVMFuzzerCustomCrossOver and always return 100. llvm.googlesource.com

Summary

Yes, LLVMFuzzerCustomMutator is the right way to enforce exactly 100 bytes without rejections. llvm.googlesource.com
Pair it with -max_len=100 and a 100-byte seed corpus for best efficiency. llvm.org

If you tell me whether your 100 bytes have internal structure (e.g., fields/lengths/checksums), I can show a custom mutator that preserves validity (structure-aware fuzzing) while still staying fixed-size.

i still see myriads of cases, when libfuzzer generated exactly the same input

That can still be totally normal — and it can also mean your mutation step often ends up being a no-op (or your target treats many different raw inputs as “equivalent”). Here are the main causes and the fixes.

Why you can see the same input many times

1) libFuzzer intentionally re-runs the same unit

It will re-execute the same input for things like:

startup calibration / timing
verifying a crash is reproducible
minimizing / merging
some tracing/measurement modes

So some repetition is expected.

How to confirm: run with:

bash
./fuzztest -verbosity=1 -print_final_stats=1

If you see messages about “minimize”, “merge”, “pulse”, etc., repetition is expected.

2) Your custom mutator sometimes produces the same bytes

LLVMFuzzerMutate() can legitimately make a mutation that results in identical output (rare per call, but can happen), and your own “normalize to 100 bytes” step can also collapse distinct inputs (padding/truncation).

Fix: ensure the mutator forces a change if the buffer didn’t change.

Here’s a robust LLVMFuzzerCustomMutator that guarantees “different output” (within a small bounded number of attempts), and if still unchanged, flips a bit derived from Seed:

cpp
#include <cstddef>
#include <cstdint>
#include <algorithm>
#include <cstring>

extern "C" size_t LLVMFuzzerMutate(uint8_t* Data, size_t Size, size_t MaxSize);

extern "C" size_t LLVMFuzzerCustomMutator(
    uint8_t* Data, size_t Size, size_t MaxSize, unsigned int Seed) {

    constexpr size_t kFixedSize = 100;
    if (MaxSize < kFixedSize) return MaxSize; // misconfigured; avoid UB

    // Normalize to exactly 100 bytes
    if (Size < kFixedSize) {
        std::fill(Data + Size, Data + kFixedSize, 0);
        Size = kFixedSize;
    } else if (Size > kFixedSize) {
        Size = kFixedSize;
    }

    uint8_t before[kFixedSize];
    std::memcpy(before, Data, kFixedSize);

    // Try a few mutations until something changes
    for (int i = 0; i < 8; ++i) {
        LLVMFuzzerMutate(Data, kFixedSize, kFixedSize);
        if (std::memcmp(before, Data, kFixedSize) != 0) {
            return kFixedSize;
        }
    }

    // Still unchanged: force a deterministic change using Seed
    const size_t idx = static_cast<size_t>(Seed) % kFixedSize;
    Data[idx] ^= 0x01;           // flip LSB; still “byte-level” change
    // If you require each byte to be {0,1}, keep it binary:
    Data[idx] &= 0x01;

    return kFixedSize;
}

If you also require each element to be 0/1, then in your test you should either:

store bits separately, or
ensure the mutator maintains that invariant (e.g., Data[i] &= 1 for all i) — but doing that for all 100 bytes on every mutation can slow fuzzing.

3) Your target only “uses” part of the 100 bytes

Even if size is fixed, if your code only actually influences behavior via (say) the first 4 bytes, then mutations elsewhere don’t matter → lots of “same effective input”.

Quick check: hash the bytes you actually consume (or just confirm you really use all 100).

4) You’re starting from a tiny corpus, so many mutations don’t improve coverage

When coverage plateaus, libFuzzer will still keep mutating, and you may observe many repeated inputs (especially if you’re printing/inspecting after some normalization).

Fix: provide a small diverse seed corpus of valid 100-byte inputs, or add structure-aware mutation (dictionary / custom mutator that respects format).

The fastest way to pinpoint which one it is

Add a cheap fingerprint print only occasionally (don’t spam):

cpp
#include <cstdint>
#include <cstddef>

static uint64_t fnv1a(const uint8_t* d, size_t n) {
    uint64_t h = 1469598103934665603ull;
    for (size_t i = 0; i < n; ++i) { h ^= d[i]; h *= 1099511628211ull; }
    return h;
}

Then in your harness (temporarily), print every N runs (e.g., using a static counter). If the fingerprint is identical, your mutator/normalization is producing identical bytes; if fingerprints differ but behavior is “same”, your target likely ignores most bytes.

If you paste:

your current LLVMFuzzerTestOneInput (especially how you transform/consume those 100 bytes), and
the exact libFuzzer command line you run,

…I can tell you precisely which cause is biting you and suggest the smallest fix.

is there any fuzzing library allowing to generate bytes with fixed length? I think I dont understand the basic principles of libfuzzer. I think easy generating random input of fixed size should be some basic feature. Is that really so weird?

This is not a weird question at all — it’s a very common reaction from people coming to libFuzzer with a “random testing” mental model. The friction you’re feeling is real, and it comes from a foundational design difference, not from you misunderstanding something small.

Let me answer in three layers:

Why libFuzzer behaves this way (foundational principle)
Are there fuzzers that naturally generate fixed-length inputs?
What to do, depending on what you actually want

1️⃣ The key thing: libFuzzer is not a random generator

libFuzzer is not designed to generate random inputs at all.

It is an evolutionary, coverage-guided, mutational fuzzer.

That means:

It starts from a corpus (even if empty, it creates trivial seeds)
It mutates existing inputs
Mutations are judged only by coverage increase
Size is just another mutation dimension, not a constraint

In libFuzzer’s mental model:

“Input length is not semantics.
Bytes are semantics only if the target observes them.”

So the engine optimizes for:

coverage
mutation efficiency
minimizing useless work

A hard requirement like “always exactly 100 bytes” is outside its default optimization goals.

That’s why:

there is -max_len
but no -exact_len

This is not an omission — it’s deliberate.

2️⃣ “Why isn’t fixed-size generation a basic feature?”

Because in libFuzzer’s universe:

Rejecting inputs is expensive
Forcing exact length globally breaks many mutation strategies
Structure is expected to be handled by the harness or mutator

From LLVM devs’ point of view:

If you need structure (fixed size, grammar, invariants),
you must define the structure — not ask the engine to guess it.

This is why libFuzzer exposes:

LLVMFuzzerCustomMutator
LLVMFuzzerCustomCrossOver
dictionaries
corpus seeding

They expect you to define invariants.

So yes — libFuzzer intentionally does not offer “random fixed-length generator”.

3️⃣ Are there fuzzers that do naturally generate fixed-length inputs?

✅ YES — but they are different classes of fuzzers

Here’s the landscape.

🔹 AFL / AFL++ (similar philosophy, same issue)

Notes:

Also mutation-based
Also size-varying
Also no “exact length” guarantee
Same collapse issue if you ignore suffix bytes

So AFL does not solve your problem either.

🔹 Honggfuzz

Mutation-based
Similar behavior to libFuzzer
Same limitation

🔹 QuickCheck / property-based testing (NOT fuzzers)

Examples:

RapidCheck (C++)
Hypothesis (Python)
QuickCheck (Haskell)

These:

generate values, not byte streams
fixed-size arrays are trivial
great for invariants
terrible for deep coverage fuzzing

Example mental model:

cpp
gen::container<std::vector<uint8_t>>(100, gen::inRange(0,2))

If your goal is “generate lots of random fixed-size inputs”, these are much closer to what you intuitively want.

But:

no coverage guidance
no corpus evolution
much weaker bug-finding for parsers / state machines

🔹 Grammar-based fuzzers (libprotobuf-mutator, Peach, etc.)

These:

generate structured inputs
often fixed-size or shape-constrained
require schemas / grammars

Powerful, but heavy.

4️⃣ The uncomfortable truth (but important)

What you want:

“Random inputs of fixed length”

What libFuzzer wants:

“Inputs that mutate in ways that increase coverage”

Those are orthogonal goals.

libFuzzer assumes:

if length matters, you will encode that in the harness or mutator
if suffix bytes don’t matter, they are irrelevant noise

This is why ignoring bytes after 100 is a bug in the harness, not in the fuzzer (from libFuzzer’s POV).

5️⃣ The “correct” libFuzzer mindset shift

Instead of thinking:

“libFuzzer should generate exactly 100 bytes”

Think:

“My semantic input is a fixed-size object of 100 bytes.
libFuzzer’s raw bytes are just mutation fuel.”

Then you do one of these:

✅ A) Map arbitrary bytes → fixed semantic object (most common)

You already saw this:

cpp
input[i] = data[i % size];

This is by far the most common approach in real fuzzers.

✅ B) Custom mutator (you already did this)

This is the official extension point for enforcing invariants.

✅ C) Property-based testing (if coverage isn’t the goal)

6️⃣ So… is your frustration justified?

Yes. Completely.

What feels “basic” to you:

fixed-size random input

is intentionally not a primitive in libFuzzer.

Not because it’s hard.
Not because LLVM devs didn’t think of it.

But because:

it conflicts with how coverage-guided mutation works
structure is expected to be explicit

This is one of the biggest conceptual hurdles when learning modern fuzzers.

7️⃣ Straight recommendation (no ideology)

If:

you need exactly 100 bytes
every byte matters
you want maximum fuzzing efficiency

👉 Your custom mutator approach is correct and idiomatic
👉 Add -max_len=100
👉 Seed with a 100-byte corpus file
👉 Ensure mutations actually change observed bytes

That’s exactly how LLVM, Chrome, and OSS-Fuzz projects do it internally.

If instead you want:

simple randomized stress testing
invariants, not coverage

👉 use RapidCheck / Hypothesis-style testing

Final reassurance

You’re not missing something obvious.
You’ve hit a real philosophical boundary between random testing and coverage-guided fuzzing.

If you want, next I can:

show how LLVM itself fuzzes fixed-size crypto primitives
show a minimal RapidCheck vs libFuzzer comparison
or help you decide which tool is actually better for your specific bug class

write method std::vector<std::uint8_t> pack(const std::vector<std::uint8_t>& input) vector "input" elements are only with value 0 or 1 and represent bits. output vector elements should contain those bits packed as bytes

Here’s a correct, clear, and idiomatic implementation that packs a {0,1} bit-vector into bytes.

I’ll define the packing order explicitly (this matters).

Definition (important)

input[i] is a bit (must be 0 or 1)
Output packs 8 bits per byte
Bit order:
- input[0] → most significant bit (bit 7) of output[0]
- input[7] → least significant bit (bit 0) of output[0]
If input.size() is not a multiple of 8, the last byte is zero-padded in the LSBs

This is the most common and least surprising convention.

Canonical implementation

cpp
#include <cstdint>
#include <vector>

std::vector<std::uint8_t> pack(const std::vector<std::uint8_t>& input) {
    const std::size_t nbits = input.size();
    const std::size_t nbytes = (nbits + 7) / 8;

    std::vector<std::uint8_t> output(nbytes, 0);

    for (std::size_t i = 0; i < nbits; ++i) {
        const std::size_t byte_index = i / 8;
        const std::size_t bit_index  = 7 - (i % 8);  // MSB → LSB

        // input[i] is assumed to be 0 or 1
        output[byte_index] |= (input[i] & 1u) << bit_index;
    }

    return output;
}

Example

Input

cpp
// bits: 1 0 1 1 0 0 1 0   1 1
std::vector<uint8_t> bits = {1,0,1,1,0,0,1,0, 1,1};

Output

text
output[0] = 0b10110010 = 0xB2
output[1] = 0b11000000 = 0xC0  // padded with zeros

Variant: LSB-first packing (if you need it)

If instead you want:

input[0] → bit 0 (LSB) of first byte

Change one line:

cpp
const std::size_t bit_index = i % 8;  // LSB → MSB

Everything else stays the same.

Fuzzer-friendly version (defensive, optional)

If this is used in a fuzz target and you want to defensively clamp inputs:

cpp
output[byte_index] |= (input[i] ? 1u : 0u) << bit_index;

This avoids UB if someone accidentally passes values other than {0,1}.

Complexity

Time: O(n bits)
Space: ⌈n/8⌉ bytes
Branchless inner loop → fast, vectorizable

TL;DR

✔ packs {0,1} bits into bytes
✔ deterministic, explicit bit order
✔ zero-pads last byte
✔ fuzz- and performance-friendly

If you want, I can also give you:

the inverse unpack()
a version using std::span
or a version optimized for fixed-size inputs (e.g. always 830 bits)

why "This is the most common and least surprising convention"

Great question — this gets into representation conventions, not “right vs wrong”.

When I said “the most common and least surprising convention”, I meant common in practice across file formats, network protocols, and low-level code, not “mandated by the C++ standard”.

Let me unpack why MSB-first is usually the default people expect.

1️⃣ It matches how humans write and read binary

When people write binary, they write most-significant bit first:

arduino
10110010
↑       ↑
bit 7   bit 0

So mapping:

css
input[0] → bit 7
input[7] → bit 0

preserves the visual order.

If you print your bit vector, then pack it, the byte value “looks right”.

This reduces cognitive friction.

2️⃣ It matches how hex strings are interpreted

Hex strings (which you already work with):

makefile
"DE"
D = 1101
E = 1110

That mapping is MSB → LSB per nibble.

Your earlier hexstr2bitset() produced bits in that order. Packing MSB-first makes:

cpp
pack(hexstr2bitset("DE")) == { 0xDE }

That’s an extremely strong expectation — breaking it would surprise most engineers.

3️⃣ It aligns with network / protocol conventions

Most wire formats define:

“first bit on the wire” = most significant bit of the first byte

Examples:

IP, TCP, UDP headers
ASN.1 / BER
many hardware registers

So if your bit vector is “logical bit 0 first”, MSB-first packing is the standard mental model.

4️⃣ It matches bit-mask intuition

When people write:

cpp
byte |= (1 << 7);

they intuitively think:

“set the first / leftmost bit”

Mapping input[0] → bit 7 matches that intuition.

The opposite mapping (input[0] → bit 0) makes this feel reversed unless explicitly documented.

5️⃣ LSB-first is also common — but in different domains

LSB-first is typical when:

bits are treated as little-endian integers
doing arithmetic / bitwise math
working with CPUs directly

Examples:

bit 0 = least significant
value & 1 checks the “first” bit

So LSB-first is common inside arithmetic, not for serialization.

That’s why neither is “wrong”.

6️⃣ Why MSB-first is “least surprising” here

Given your context:

hex strings
bit vectors
fuzzing
serialization / packing

Most engineers will expect:

cpp
unpack(pack(bits)) == bits
pack(hexstr2bitset("DE")) == 0xDE

That only holds naturally with MSB-first.

LSB-first would silently invert expectations unless documented.

7️⃣ The real rule (this is the important part)

There is no universal correct order.
There is only:

“Be explicit, and match surrounding conventions.”

That’s why I explicitly documented the order in the implementation.

TL;DR

MSB-first is “least surprising” because it:

matches how binary & hex are written
aligns with protocols and serialization
preserves visual order
matches common expectations in parsing/packing code

If your domain is:

arithmetic → LSB-first may be better
serialization / representation → MSB-first is usually expected

If you want, I can:

rewrite pack() + unpack() with LSB-first
show test vectors for both
help you choose one that best matches your project’s semantics

write method "unpack", making operation inverted to "pack"

Here’s the exact inverse of the pack() I gave you earlier — same bit order, same padding rules, and fully deterministic.

I’ll restate the convention briefly so the inversion is unambiguous.

Convention (same as `pack`)

Each byte contains 8 bits
Bit order inside a byte: MSB → LSB
- bit 7 is first
- bit 0 is last
Output bits are 0 or 1
If the original bit count was not a multiple of 8, the last byte was padded with zeros
→ unpack() will return all bits, including padded zeros

Canonical `unpack` implementation

cpp
#include <cstdint>
#include <vector>

std::vector<std::uint8_t> unpack(const std::vector<std::uint8_t>& input) {
    std::vector<std::uint8_t> bits;
    bits.reserve(input.size() * 8);

    for (std::uint8_t byte : input) {
        // Extract bits MSB → LSB
        bits.push_back((byte >> 7) & 1);
        bits.push_back((byte >> 6) & 1);
        bits.push_back((byte >> 5) & 1);
        bits.push_back((byte >> 4) & 1);
        bits.push_back((byte >> 3) & 1);
        bits.push_back((byte >> 2) & 1);
        bits.push_back((byte >> 1) & 1);
        bits.push_back((byte >> 0) & 1);
    }

    return bits;
}

Round-trip property (important)

With this unpack() and the earlier pack():

cpp
unpack(pack(bits)) == bits   // true, except for padded zeros at the end

If you need to truncate back to the original bit length, do:

cpp
auto out = unpack(bytes);
out.resize(original_bit_count);

That’s unavoidable unless you store the bit count separately.

Example

Input bytes

cpp
std::vector<uint8_t> bytes = { 0xB2, 0xC0 };

Binary:

ini
0xB2 = 10110010
0xC0 = 11000000

Output bits

text
1 0 1 1 0 0 1 0  1 1 0 0 0 0 0 0

Optional: branchless loop version

If you prefer a compact loop:

cpp
std::vector<std::uint8_t> unpack(const std::vector<std::uint8_t>& input) {
    std::vector<std::uint8_t> bits;
    bits.reserve(input.size() * 8);

    for (std::uint8_t byte : input) {
        for (int bit = 7; bit >= 0; --bit) {
            bits.push_back((byte >> bit) & 1);
        }
    }
    return bits;
}

Same semantics, slightly less unrolled.

Sanity check with hex

cpp
pack(unpack({0xDE, 0xAD})) == std::vector<uint8_t>{0xDE, 0xAD};

This symmetry is why MSB-first packing/unpacking is so commonly chosen.

If you want next:

a length-aware unpack (so padding is removed automatically),
a fixed-size (e.g. 830-bit) optimized version,
or a bitset/span-based variant,

just tell me.

std::vector from const std::uint8_t *data, std::size_t size

Here are the idiomatic, correct ways to construct a std::vector<std::uint8_t> from

cpp
const std::uint8_t* data, std::size_t size

✅ The canonical way (what you almost always want)

cpp
std::vector<std::uint8_t> v(data, data + size);

This:

copies size bytes
preserves order
works for any trivially copyable type
is exactly what the iterator constructor is for

This is the standard, idiomatic C++ solution.

✅ With an explicit empty check (optional, defensive)

cpp
std::vector<std::uint8_t> v;
if (data && size > 0) {
    v.assign(data, data + size);
}

Only needed if data might be nullptr when size == 0 and you want to be extra explicit (the first form is already safe if size == 0).

❌ What NOT to do

cpp
std::vector<std::uint8_t> v(size);
std::memcpy(v.data(), data, size);   // unnecessary and error-prone

This:

is more verbose
easier to get wrong
provides no benefit here

C++20+ convenience (optional)

If you already have a std::span:

cpp
std::span<const std::uint8_t> s(data, size);
std::vector<std::uint8_t> v(s.begin(), s.end());

Useful when APIs already use span, but otherwise overkill.

In a libFuzzer harness (typical)

cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    std::vector<uint8_t> input(data, data + size);
    // use input
    return 0;
}

This is exactly correct.

TL;DR

Use:

cpp
std::vector<std::uint8_t> v(data, data + size);

That’s the right, idiomatic answer.

If you want:

zero-copy views (span)
fixed-size vectors
padding/truncation helpers
fuzzing-specific patterns

I can show those too.

in libfuzzer, i dont want to finish tests when my application under test crashes with error. i want libfuzzer to write crash log and continue testing

You generally can’t do that with libFuzzer in the way you’re imagining.

libFuzzer’s core contract is:

If the target crashes (SIGSEGV, abort, ASan “deadly signal”, etc.), that’s a bug found
libFuzzer stops, writes the reproducer (crash-*), and exits non-zero

It does this on purpose so you can:

reliably reproduce
minimize
fix the bug
then continue fuzzing

Why it can’t “just continue”

After a real crash / memory corruption:

the process state is undefined
continuing would produce garbage results, false positives, or hide bugs
the sanitizer runtimes intentionally abort to prevent further damage

So “keep going after a crash” is not supported as a normal mode.

What you can do instead (practical options)

1) Keep fuzzing by restarting the process (standard approach)

In CI or locally, run the fuzzer in a loop so a crash ends one run, but the next run starts immediately:

bash
mkdir -p fuzz_artifacts
while true; do
  ./fuzztest -max_total_time=60 -artifact_prefix=fuzz_artifacts/ || true
done

This is the common way to “continue fuzzing across crashes”.

If you want it to skip already-known crashing inputs, keep the corpus separate from artifacts.

2) If “crash” is actually an exception or error code — handle it in the harness

If your “crash” is something like:

throw std::runtime_error
an error return
a domain check failing

Then you can catch and continue:

cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    try {
        // call target
    } catch (const std::exception&) {
        return 0; // treat as non-crash
    }
}

But this does not apply to real crashes (signals, ASan aborts).

3) Use `-ignore_crashes=1` only for non-sanitized, non-UB situations (not recommended)

libFuzzer has flags like -ignore_crashes=1, but:

it’s mainly for “keep going” in some signal cases
it’s unsafe with memory corruption
it often doesn’t help with ASan/UBSan because they abort hard
you lose reliable repro/minimization

I don’t recommend this except for very specific scenarios, and even then it can produce misleading results.

4) The robust solution: run the target in a subprocess (fork mode)

If you truly need “log crash and continue” safely, you need process isolation:

parent fuzzer keeps running
child runs the target
child can crash; parent records it and continues

libFuzzer itself doesn’t provide a great built-in forkserver mode like AFL++ does, but you can implement a simple harness that forks per test case. Downsides:

much slower
sanitizer integration gets trickier

If this is important, AFL++ (forkserver) or honggfuzz (persistent/fork modes) may fit better.

Recommendation

For libFuzzer + sanitizers, the “right” workflow is:

Let it crash
Save reproducer (crash-*) + log
Fix bug
Resume fuzzing (or auto-restart runs in a loop)

If you want continuous fuzzing in GitLab CI, the best pattern is:

run bounded sessions (-max_total_time=60)
loop or schedule as nightly
always collect fuzz_artifacts/ as job artifacts

If you tell me what you mean by “crashes with error” (exception? assert? abort()? segfault? ASan?), I can give the safest configuration to either:

catch and continue (if it’s recoverable), or
keep fuzzing across crashes via restart/isolation (if it’s not).

peach fuzz

Peach Fuzzer (usually just Peach) is a very different kind of fuzzer from libFuzzer — and it actually lines up much better with the mental model you’ve been describing.

I’ll explain what Peach is, how it differs from libFuzzer, and when it’s the right choice.

What Peach Fuzzer is (at its core)

Peach is a model-based (grammar / schema-based) fuzzer.

Instead of mutating raw bytes, you:

Describe the input format explicitly (structure, sizes, constraints)
Peach generates inputs that always satisfy those constraints
It mutates within the model, not arbitrarily

Think:

“Generate valid-shaped inputs, then break them intelligently.”

Why Peach feels “more intuitive” for your use case

Based on what you’ve said so far, you want things like:

fixed-size inputs
exact bit/byte layouts
no meaningless suffix bytes
no collapsing of inputs after truncation
generation, not mutation of random garbage

👉 That is exactly Peach’s design goal.

In Peach, “exactly 100 bytes” is trivial and foundational.

Example: fixed-size input in Peach (conceptual)

In Peach you would define something like:

xml
<DataModel name="Input100">
  <Blob name="payload" length="100"/>
</DataModel>

Or even bit-level:

xml
<Number name="flag" size="1"/>
<Number name="counter" size="7"/>
<Blob name="data" length="92"/>

Peach will never generate an input of the wrong size.
No tricks. No harness normalization. No rejection.

That’s why it feels “obvious”.

Why libFuzzer doesn’t work this way

libFuzzer assumes:

input structure is unknown
coverage is the only guide
raw bytes are just mutation fuel

Peach assumes:

input structure is known
constraints matter
validity is valuable

These are philosophically opposite tools.

Key differences (important)

Feature	libFuzzer	Peach
Input type	Raw bytes	Structured model
Fixed-size inputs	❌ (by default)	✅ (native)
Grammar / schema	❌	✅
Coverage-guided	✅	❌ (mostly)
Mutation strategy	Byte-level	Structure-aware
Best for	Parsers, libraries	Protocols, formats
Learning curve	Low	High
Speed	Very high	Slower

When Peach is the right tool

Peach shines when:

input format is well-defined
size/layout constraints are strict
validity matters
you’re testing:
- protocols
- binary formats
- file formats
- firmware
- stateful systems

Your repeated frustration with libFuzzer is a strong signal that you’re in this category.

When libFuzzer is still better

libFuzzer wins when:

structure is unknown or loose
bugs hide behind coverage paths
speed matters more than validity
you want deep code coverage fast

This is why large projects often use both.

Reality check (important in 2026)

Peach:

is powerful
but heavier
XML-based models
slower iteration
less modern CI integration

Many teams today choose alternatives like:

libprotobuf-mutator (structure + coverage)
AFL++ + custom mutator
grammar-based fuzzers embedded into libFuzzer

These give you “Peach-like structure” with coverage.

Honest recommendation for you

Based on everything you’ve said:

Your intuition about fixed-size inputs is 100% valid
libFuzzer is fighting you because it’s the wrong abstraction
Peach (or a grammar-based fuzzer) matches your mental model

If you want:

exact size
exact structure
no meaningless inputs
no harness hacks

👉 Peach is conceptually the right tool

If you want:

speed
coverage
sanitizer integration

👉 libFuzzer + custom mutator is the compromise

If you want, next I can:

show a Peach vs libFuzzer side-by-side for the same problem
suggest modern Peach alternatives that integrate better with C++
help you decide which tool is actually worth your time for your target

Just tell me how deep you want to go.

is there portable cmake distro for linux (i use ubuntu:22.04)

Yes — you can use portable, prebuilt CMake binaries on Linux (including Ubuntu 22.04) so you don’t have to install it system-wide or rely on the distro package. cmake.org

✅ Portable CMake binary distribution (official)

CMake publishes precompiled Linux binary archives you can just download and extract anywhere:

✔ no system package install
✔ no root required
✔ works on Ubuntu or any compatible glibc based distro

Where to get it

Head to the official CMake download page and choose the Linux binary:

🔗 https://cmake.org/download/ cmake.org

Typically you’ll see files like:

php-template
cmake-<version>-linux-x86_64.tar.gz

php-template
cmake-<version>-linux-x86_64.sh

How to use it (portable)

Download and extract it:

bash
wget https://github.com/Kitware/CMake/releases/download/v4.2.3/cmake-4.2.3-linux-x86_64.tar.gz
tar -xzf cmake-4.2.3-linux-x86_64.tar.gz

Add it to your PATH (temporary):

bash
export PATH=$PWD/cmake-4.2.3-linux-x86_64/bin:$PATH

Now cmake --version shows the portable version.

No root, no system install. cmake.org

This is the best “portable CMake distro” for Linux.

Alternatives that are less portable but still isolated

📦 Snap package

You can install CMake from Snap without interfering with the system package:

bash
sudo snap install cmake --classic

This gives you an isolated CMake managed by Snap (not tied to Ubuntu’s apt), but it’s still a system install — not truly “portable”. Its Linux FOSS

🐍 Python package (`pip install cmake`)

If you have Python, you can install CMake binaries via pip:

bash
pip install cmake

This installs a standalone CMake that lives in your Python environment (e.g., virtualenv), which is also a kind of portable distribution. GitLab

Good for CI, virtualenv workflows, GitHub Actions, etc.

Why this is useful

Portable binaries let you:

use newer CMake than Ubuntu apt (which often lags)
avoid messing with system packages
ship a consistent toolchain in CI or across developers
avoid building CMake from source manually

Summary

Method	Portable?	Requires root?	Notes
Download official Linux binary	⭐ Portable	❌	Best for isolated usage
Snap	Partial	✔	Isolated runtime but system-level install
Pip	Partial	❌	Based on Python environment
Build from source	❌	❌	Customizable but heavy

So yes — portable CMake binaries exist and work fine on Ubuntu 22.04 without installation, just extract and add to your PATH. cmake.org

If you want, I can show a ready script to download and set up a specific CMake version in your project (for CI or local dev).

is installing portable cmake disto recommended? it gives some isolation, doesnt mangle with rest of the system and whenever possible, i try to install portable version of applications

Short answer: yes — it’s recommended, sane, and widely done, especially if you care about isolation and reproducibility.
Long answer: it depends on how you use it, but your instinct is solid.

Let’s be concrete and non-hand-wavy.

Is installing portable CMake recommended?

✅ Yes, in these scenarios (very common today)

Portable CMake is actively recommended in practice when you want:

Isolation
No interaction with /usr/bin/cmake, no apt conflicts, no snap weirdness.
Reproducibility
Everyone uses exactly the same CMake version.
Newer CMake features
Ubuntu 22.04 ships CMake 3.22, which is already behind for presets, workflows, and modern features.
CI parity
Same CMake binary locally and in GitLab CI.
Toolchain-style workflows
If you already use portable clang, LLVM, etc., portable CMake fits naturally.

This is very common in:

C++ infrastructure teams
security / fuzzing setups
cross-platform projects
CI-heavy environments

You’re not being “paranoid” — you’re being intentional.

What Kitware (CMake maintainers) implicitly encourage

Kitware:

publishes official prebuilt Linux binaries
documents them prominently
uses them internally for CI and testing

That’s a strong signal:
👉 “We expect people to use these.”

If they thought portable usage was a bad idea, they wouldn’t ship them.

What not to do with portable CMake

Portable CMake is great as long as you respect its role.

❌ Don’t treat it as a runtime dependency

CMake is a build-time tool only.
Never bake it into runtime containers or images unnecessarily.

❌ Don’t mix portable + system CMake invisibly

Avoid situations where:

cmake sometimes resolves to /usr/bin/cmake
sometimes to /opt/cmake/bin/cmake

Be explicit:

bash
export PATH=/opt/cmake/bin:$PATH

or use an absolute path.

Recommended usage patterns (battle-tested)

⭐ Pattern 1: `/opt/cmake/<version>` (clean, scalable)

bash
/opt/cmake/
  ├── 3.22.6/
  ├── 3.28.6/
  └── current -> 3.28.6

Pros:

easy upgrades
easy rollback
clear ownership

⭐ Pattern 2: project-local toolchain (maximum isolation)

bash
project/
  tools/
    cmake/
      bin/cmake

Pros:

perfect reproducibility
no global dependencies
ideal for CI + fuzzing

Cons:

repo size increases (acceptable in infra-heavy projects)

⭐ Pattern 3: CI-only portable CMake

System CMake locally, portable in CI.

Pros:

minimal local setup
full CI control

Cons:

local/CI drift (avoidable but real)

How this compares to other tools

You already do this instinctively with:

Tool	Portable usage
clang / LLVM	very common
Python	venv / pyenv
Node.js	nvm
Rust	rustup
Go	versioned tarballs

CMake is no different — it just historically lagged behind in mindset.

Ubuntu 22.04 specifically

Ubuntu 22.04:

ships CMake 3.22
modern CMake workflows want 3.25+
presets, workflows, install rules, toolchain behavior are all better in newer versions

Using portable CMake on 22.04 is arguably the best option.

TL;DR (straight answer)

✅ Installing portable CMake is recommended
✅ It gives real isolation
✅ It avoids system pollution
✅ It improves reproducibility
✅ It’s widely used in serious C++ setups
❌ It’s not overengineering

Your philosophy — “whenever possible, install portable versions” — is aligned with modern best practices, especially in C++.

If you want, I can:

propose a standard layout for portable tools (clang + cmake together)
show how to wire portable CMake cleanly into GitLab CI
help you decide which CMake version is best for your current project

Just tell me how far you want to take the isolation.

i installed cmake portable distro in docker image in /opt/cmake and added to PATH and i use this image in clion, but clion says "cmake not found"

This is a classic CLion + Docker toolchain gotcha: CLion doesn’t reliably use the container’s interactive-shell PATH when it probes for tools. It often expects an explicit absolute path to CMake, or it runs detection in a way that doesn’t source /etc/profile, ~/.bashrc, etc. JetBrains+2JetBrains+2

Here’s how to fix it quickly and permanently.

1) First confirm CMake really exists in the image

Run the same image CLion uses:

bash
docker run --rm -it <your-image> sh -lc 'echo $PATH; ls -l /opt/cmake/bin/cmake; /opt/cmake/bin/cmake --version; command -v cmake'

If /opt/cmake/bin/cmake --version works but command -v cmake does not, then your PATH isn’t actually set in the runtime environment CLion sees.

2) Make PATH truly part of the image (Dockerfile: use `ENV`, not shell rc files)

In your Dockerfile, do not rely on .bashrc/profile edits. Set PATH with ENV:

dockerfile
ENV PATH="/opt/cmake/bin:${PATH}"

This is the reliable, container-native way to persist PATH across non-interactive runs. Stack Overflow+1

Rebuild the image and re-test with the command from step 1.

3) Best CLion fix: point CLion directly at the CMake binary (recommended)

Even if PATH is correct, CLion tool detection can still fail. In CLion:

Settings → Build, Execution, Deployment → Toolchains → (your Docker toolchain)
Set CMake to:

swift
/opt/cmake/bin/cmake

CLion explicitly supports switching from bundled CMake to a custom CMake and setting tool paths. JetBrains+1

This avoids all PATH/shell ambiguity.

4) “Make it impossible to miss”: add a symlink in a standard location

Some IDE probes look in common paths like /usr/bin, /usr/local/bin.

In Dockerfile:

dockerfile
RUN ln -s /opt/cmake/bin/cmake /usr/local/bin/cmake

Now cmake is discoverable even if PATH probing is weird.

5) If CLion still says “cmake not found”

Two more common causes:

A) You installed the wrong CMake package layout

Some CMake distributions extract as /opt/cmake/cmake-<ver>-linux-x86_64/bin/cmake (extra directory layer). Verify where bin/cmake actually is.

B) Known CLion issue / detection bug

There’s an active JetBrains issue where Docker toolchains sometimes fail to detect CMake/compilers in images even when present; the workaround is exactly “set absolute paths” (step 3) or provide an environment init script. YouTrack+1

What I’d do (lowest friction, most robust)

Keep ENV PATH="/opt/cmake/bin:${PATH}" in the Dockerfile
Also symlink cmake into /usr/local/bin
In CLion, set CMake path explicitly to /opt/cmake/bin/cmake

That combination makes it work regardless of how CLion spawns processes.

If you paste:

your Dockerfile snippet where you set PATH
the exact extracted directory structure under /opt/cmake
and which CLion version you use

…I’ll tell you the minimal change that will make detection succeed.

i added RUN ln -s /opt/cmake/bin/cmake /usr/bin/cmake and it worked. Do i need to add RUN ln -s /opt/cmake/bin/ctest /usr/bin/ctest too?

Short answer: yes — you should do it, and for the same reason cmake needed it.

Here’s why, and what else (if anything) is worth symlinking.

Why `ctest` matters (especially with CLion)

CLion uses CTest for:

discovering tests (add_test)
running tests from the UI
showing test results / reruns
handling ctest --preset …

Even if:

configuration works
builds work

…CLion will later try to run ctest independently.

If it can’t find ctest, you’ll eventually see:

“CTest not found”
tests not discovered
test run actions disabled or failing

So if CLion required a symlink for cmake, it will very likely need one for ctest as well.

What I recommend (minimal and clean)

Add both:

dockerfile
RUN ln -s /opt/cmake/bin/cmake /usr/bin/cmake \
 && ln -s /opt/cmake/bin/ctest /usr/bin/ctest

That’s enough for:

configure
build
test
CLion UI integration

Do you need to symlink anything else?

Optional (only if you use them)

Tool	Needed?	When
`ctest`	✅ Yes	Tests / CLion test runner
`cmake`	✅ Yes	Already confirmed
`cpack`	❌ Usually no	Packaging (`cpack`)
`ccmake`	❌	Interactive terminal UI
`cmake-gui`	❌	Not used in Docker

If you never use packaging, cpack can stay where it is.

Why symlinking is better than PATH hacks (for CLion)

You already discovered the key insight:

CLion does not reliably inherit container PATH
It probes well-known locations (/usr/bin, /usr/local/bin)
Symlinks are deterministic and IDE-proof

This is why:

text
/usr/bin/cmake
/usr/bin/ctest

works when PATH-based solutions don’t.

“Is this a hack?” — No, it’s normal

This pattern is very common in:

Docker toolchains for CLion
VS Code dev containers
CI base images
hermetic C++ environments

You’re not breaking isolation:

the real binaries still live in /opt/cmake
/usr/bin just contains stable entry points

Final recommendation (best practice)

For a portable CMake in Docker used by IDEs:

dockerfile
ENV CMAKE_HOME=/opt/cmake
ENV PATH="${CMAKE_HOME}/bin:${PATH}"

RUN ln -s ${CMAKE_HOME}/bin/cmake /usr/bin/cmake \
 && ln -s ${CMAKE_HOME}/bin/ctest /usr/bin/ctest

That gives you:

correct PATH for shells
rock-solid IDE detection
zero ambiguity

If you want, next I can:

suggest a canonical /opt/tools layout (cmake + clang + llvm)
review your Dockerfile for other IDE gotchas
explain why CLion behaves this way internally (it’s not random)

Just say the word.

what does ubuntu package "build-essential" contain?

On Ubuntu, build-essential is a meta-package.
It doesn’t contain code itself — it depends on the core tools required to build most C/C++ software from source.

Installing it pulls in this set:

What `build-essential` contains (Ubuntu 22.04)

✅ Essential components

Package	What it provides
`gcc`	GNU C compiler
`g++`	GNU C++ compiler
`make`	Build automation tool
`libc6-dev`	glibc headers and startup files
`dpkg-dev`	Debian packaging/build helpers

These are the minimum requirements for compiling native software on Ubuntu.

What this means in practice

After:

bash
sudo apt install build-essential

You can:

compile C and C++ code
build most open-source projects using make
compile projects that depend on standard libc headers

It is effectively the Ubuntu equivalent of:

“Give me a working native build environment.”

What it does NOT include (important)

build-essential does not include:

❌ CMake (cmake)
❌ Ninja (ninja-build)
❌ Clang / LLVM
❌ Debuggers (gdb, lldb)
❌ Sanitizers beyond what GCC provides
❌ pkg-config
❌ autotools (autoconf, automake, libtool)
❌ libc++ / libc++abi

That’s why modern C++ projects often install more than just build-essential.

Typical modern additions

For a modern C++ / CMake / fuzzing setup, people usually add:

bash
sudo apt install -y \
  build-essential \
  cmake \
  ninja-build \
  pkg-config \
  gdb

Or, in your case (portable toolchains):

portable clang
portable cmake
still install build-essential for:
- glibc headers
- linker basics
- system ABI compatibility

Why it’s still relevant even if you use clang

Even if you:

use portable clang
use portable cmake

You still need build-essential because:

glibc headers come from libc6-dev
crt objects (crt1.o, etc.) come from glibc dev packages
the system linker (ld) is typically used
many system libraries assume this baseline

So build-essential is still the correct base layer in Docker images.

TL;DR

build-essential gives you:

“Enough GNU toolchain + libc headers to build native C/C++ software.”

It’s:

minimal
boring
foundational
still necessary in 2026

Using it alongside portable clang and portable cmake is best practice, not redundancy.

If you want, I can also:

show a minimal Dockerfile for C++ + clang + cmake + fuzzing
explain when you can avoid build-essential (rare, but possible)

Just tell me.

List all available Clion run/debug configurations

CLion has several distinct Run/Debug configuration types, some built-in, some enabled depending on your toolchains, plugins, and project type. Here’s the complete, practical list, grouped by category, with notes on when they appear.

Core C / C++ configurations (always relevant)

1️⃣ CMake Application

Most common

Runs a single CMake executable target
Supports program arguments, env vars, working directory
Used for:
- normal executables
- fuzz targets
- small tools

You’ll use this for:

text
add_executable(my_app …)
add_executable(fuzztest …)

2️⃣ CTest

For tests registered via add_test()

Runs tests via ctest
Integrates with:
- test explorer
- test filters
- rerun failed tests
Required for:
- CMake add_test
- ctest --preset

This is what CLion uses behind the scenes when you click “Run test”.

3️⃣ CMake Script

Runs a .cmake script directly:

bash
cmake -P script.cmake

Useful for:

custom build steps
tooling scripts
CI-like automation

Rare, but powerful.

Toolchain / environment specific

4️⃣ Docker

(Shown if Docker integration is enabled)

Runs any executable inside a Docker container
Usually wraps a CMake Application or CTest config
Uses Docker toolchain

In practice, you still select CMake Application, but Docker is the execution backend.

5️⃣ Remote GDB Server

(Embedded / remote targets)

Attaches to gdbserver
Used for:
- embedded Linux
- QEMU
- hardware targets

Rare unless you do embedded work.

Debugger / analysis tools

6️⃣ Valgrind / Memcheck

(Plugin / toolchain dependent)

Runs an executable under Valgrind
Memory leak / error detection
Slower than sanitizers

Often replaced today by ASan/UBSan.

7️⃣ Sanitizers

(Modern CLion versions)

AddressSanitizer
UndefinedBehaviorSanitizer
ThreadSanitizer
MemorySanitizer (limited)

This is not a separate config type anymore — it’s usually a toggle inside:

CMake Application
CTest

But conceptually, it’s a “run under sanitizer” mode.

Language / ecosystem specific (only if enabled)

8️⃣ Python

(If Python plugin enabled)

Run Python scripts
Used for build helpers, tooling, tests

9️⃣ Custom Build Application

(Less common)

Runs arbitrary command
Not tied to a CMake target
Useful for:
- wrappers
- shell scripts
- external tools

Testing frameworks (auto-detected)

These appear if CLion detects the framework:

🔟 Google Test

Per-test / per-suite run
Integrated output
Breakpoints inside tests

Internally still uses CTest or direct execution.

1️⃣1️⃣ Catch2

Same idea as Google Test.

1️⃣2️⃣ Boost.Test

Same idea.

Fuzzing-relevant configs (important for you)

For libFuzzer, CLion uses:

✅ CMake Application

for:

long fuzz runs
custom args (-max_total_time, -artifact_prefix)

✅ CTest

for:

fuzz smoke tests
CI-style bounded runs
tests defined with add_test

There is no special “Fuzzer” run config — fuzzers are just executables.

What you will actually see in practice

In a modern C++ project, the visible list is usually:

CMake Application
CTest
CMake Script
Docker (if enabled)
Google Test / Catch2 / Boost.Test (if detected)
Custom Build Application

Everything else is conditional.

TL;DR (mental model)

You want to…	Use
Run an executable	CMake Application
Run tests	CTest
Run fuzz target	CMake Application
Run fuzz smoke tests	CTest
Run inside Docker	Same configs + Docker toolchain
Run under sanitizer	Toggle sanitizer
Run scripts	CMake Script / Custom Build

If you want, I can:

recommend exact run configs for libFuzzer + Docker + CLion
show how to keep run configs shareable across the team
explain why CLion sometimes “duplicates” configs

Just tell me.

Powiedziałeś(-aś):

ChatGPT powiedział:

1) fuzz_processor.cpp (fuzz target)

2) CMakeLists.txt snippet

3) Example CMakePresets.json snippet

4) Build & run (example commands)

5) Extra suggestions / hardening

Powiedziałeś(-aś):

ChatGPT powiedział:

Option 1: Install the default clang (easy, usually good enough)

Option 2: Install a specific / newer clang version (recommended for fuzzing)

Option 3: LLVM official apt repository (latest clang)

Recommended extras for fuzzing

Quick sanity check for libFuzzer support

TL;DR (what I’d actually do)

Powiedziałeś(-aś):

ChatGPT powiedział:

✅ Best option: LLVM “portable” binary tarballs

Download

Extract

Use it (no system changes)

Using it with CMakePresets.json

⚠️ Important libc detail (very relevant!)

🧪 Alternative: Fully hermetic via Docker (most isolated)

⚙️ Alternative: Build clang yourself (rarely worth it)

⭐ What I recommend for your case

Powiedziałeś(-aś):

ChatGPT powiedział:

What a release candidate means in practice

Is it safe to use?

Rule of thumb

Powiedziałeś(-aś):

ChatGPT powiedział:

1️⃣ LLVM 17 is the current long-term stable workhorse

2️⃣ libFuzzer + sanitizers favor mature releases

3️⃣ Portability: glibc compatibility sweet spot

4️⃣ CMake ecosystem stability

5️⃣ Why not LLVM 18 / 19 / 22-rc?

Practical recommendation matrix

If you want a “future-proof” setup

Powiedziałeś(-aś):

ChatGPT powiedział:

The short answer

The longer (and more useful) explanation

1️⃣ LLVM binary tarballs are built on specific distros

2️⃣ After LLVM 17, the build baseline moved forward

3️⃣ Why you still see Linux tarballs — just not 22.04-labeled ones

4️⃣ Why LLVM 17 is special

What this means for you (practically)

✅ If you are on Ubuntu 22.04

⚠️ Why this matters a lot for fuzzing

Rule of thumb going forward

Bottom line

Powiedziałeś(-aś):

ChatGPT powiedział:

1️⃣ Create the target directory

2️⃣ Extract the archive into /opt/clang

3️⃣ Check what you got

4️⃣ (Optional) Flatten the directory

5️⃣ (Optional) Make it usable system-wide (without replacing system clang)

Quick verification

Powiedziałeś(-aś):

ChatGPT powiedział:

Create the target directory

Extract file.tar.xz into /opt/clang

Verify extraction

Optional: flatten the directory (long options)

Optional: make it available in PATH

Sanity check

Powiedziałeś(-aś):

ChatGPT powiedział:

The problem it solves

What --strip-components=1 does

General rule

Important caveats (this matters)

Typical real-world use cases

LLVM / GCC / Go / Node tarballs

Extract only part of an archive

Mental model

Powiedziałeś(-aś):

1) `fuzz_processor.cpp` (fuzz target)

2) `CMakeLists.txt` snippet

3) Example `CMakePresets.json` snippet

2️⃣ Extract the archive into `/opt/clang`

Extract `file.tar.xz` into `/opt/clang`

What `--strip-components=1` does

Example: `CMakePresets.json`

`CMakePresets.json`

`CMakeLists.txt`

Option C: Use `cmake --build --target test` (CTest default)

Option D: Use `cmake --build` with `--target` that runs `ctest` presets

Option F: Use CMake presets’ `buildPresets` + `testPresets` to run fuzzing in one command line

Option G: Use `install(TARGETS ...)` and run from an installed prefix

Option H: Add a `fuzz` “meta target” that builds multiple fuzzers

Option I: Integrate with `CTest` labels + `-L` selection

Option J: Use `ExternalProject` / `FetchContent` style “superbuild” to run fuzzing isolated

A “mostly everything in presets” `CMakePresets.json`

Clean, concise `CMakePresets.json`

Example toolchain: `cmake/toolchains/clang-fuzz.cmake`

`CMAKE_BUILD_TYPE` defines:

Now: defining fuzz execution in `testPreset` (your goal)

Example `testPresets` for fuzzing