OSS-Fuzz guides

This section shows how to use Fuzz Introspector with OSS-Fuzz. Fuzz Introspector is integrated into OSS-Fuzz, so the OSS-Fuzz infrastructure provides a set of features for running it on any OSS-Fuzz project. The goal is to make it easier for maintainers of projects on OSS-Fuzz to assess the completeness of their fuzzing setup.

Running introspector on an OSS-Fuzz project

Running Fuzz Introspector by way of OSS-Fuzz is beneficial in that OSS-Fuzz abstracts away many of the tasks needed, e.g. generating corpus, generating coverage, compiling fuzzers in various ways and finally running Fuzz Introspector on the generated data. In fact, using the OSS-Fuzz environment makes it possible to run Fuzz Introspector with a single command, as shown in the following example:

# Clone oss-fuzz
git clone https://github.com/google/oss-fuzz
cd oss-fuzz

# Build a project using introspector
python3 infra/helper.py introspector libdwarf --seconds=30

On success, the last output you see should be something along the lines of:

INFO:root:Introspector run complete. Report in /home/dav/code/oss-fuzz/build/out/libdwarf/introspector-report/inspector
INFO:root:To browse the report, run: python3 -m http.server 8008 --directory /home/dav/code/oss-fuzz/build/out/libdwarf/introspector-report/inspector and navigate to localhost:8008/fuzz_report.html in your browser

You can then launch a simple web server following the description:

# View the generated HTML report
python3 -m http.server 8008 \
  --directory build/out/libdwarf/introspector-report/inspector

# Navigate to http://localhost:8008/fuzz_report.html to view the report.

Generate Fuzz Introspector report with latest public corpus

Runtime code coverage is a central theme in Fuzz Introspector. It helps us understand the actual code executed by our fuzzers, and is a core part of ensuring the fuzzers analyse the code we want them to analyse.

OSS-Fuzz builds up a corpus of test case inputs over time for each fuzzer. A 30-day-old version of this corpus is publicly available. We can use this corpus to get an almost-up-to-date picture of how much of the code OSS-Fuzz has analysed.

To use the latest public corpus, we just need to add --public-corpora to the introspector command. The following example shows how to do this:

# Clone oss-fuzz
git clone https://github.com/google/oss-fuzz
cd oss-fuzz

# Build a project using introspector
python3 infra/helper.py introspector libdwarf --public-corpora

Programmatically use OSS-Fuzz Fuzz Introspector data

Fuzz Introspector provides tooling support for accessing data generated by OSS-Fuzz. This makes it convenient to generate a Fuzz Introspector report from the same data as OSS-Fuzz, without running builds, fuzzers, code coverage generation and so on.

The only task that is needed is to rerun the post-processing parts of Fuzz Introspector, which additionally makes it convenient for testing new features and analyses.

The tooling support comes in the form of a library that makes it easy to fetch relevant data and run Fuzz Introspector on the data. This library is accessible here: oss-fuzz-scanner.

Find most complex functions with no coverage

To show the use of the OSS-Fuzz toolkit, this section goes through a short example of how to extract all functions in a given project that have no code coverage and then rank them by complexity. This is useful, for example, when looking for interesting new fuzzing targets in a project.

The following code snippet implements the tool described in the paragraph above:

import sys
import scanner

def print_function_details(project_name):
    # Scan for Fuzz Introspector reports in the last 100 days
    report_generator = scanner.get_all_reports([project_name], 100, 1)

    # Get the first report and run fuzz introspector on it.
    project, date_as_str, introspector_project = next(report_generator)

    # Get dictionary of all functions
    all_functions = introspector_project.proj_profile.get_all_functions()

    # Create list of names of functions with 0% code coverage.
    not_hit = []
    for function_name in all_functions:
        cov_percentage = introspector_project.proj_profile.get_func_hit_percentage(
            function_name)
        if cov_percentage == 0.0:
            # We rank the functions by complexity at end so extract this
            # data here as well.
            function_profile = all_functions[function_name]
            not_hit.append(
                (function_name, function_profile.cyclomatic_complexity))

    print("Stats as of %s-%s-%s" %
          (date_as_str[0:4], date_as_str[4:6], date_as_str[6:]))
    print("Functions with 0 coverage: %d" % (len(not_hit)))
    print("Most complex functions with no code coverage:")
    not_hit.sort(key=lambda e: e[1], reverse=True)
    for i in range(min(len(not_hit), 10)):
        func_name, complexity = not_hit[i]
        print("- %s, %s" % (func_name, complexity))


if __name__ == "__main__":
    project_name = sys.argv[1]
    print_function_details(project_name)

To test this tool, ensure you’re in an environment with the relevant requirements.txt installed, then place the above script in the folder oss-fuzz-scanner and name it find-uncovered-functions.py. It’s a simple command-line tool that can be run as follows:

$ python3 ./find-uncovered-functions.py libssh
Stats as of 2023-04-05
Functions with 0 coverage: 685
Most complex functions with no code coverage:
- ssh_execute_server_request, 72
- ssh_userauth_agent, 41
- ssh_pki_openssh_privkey_export, 39
- ssh_userauth_publickey_auto, 37
- ssh_channel_select, 37
- ssh_agent_sign_data, 36
- ssh_options_copy, 34
- ssh_pki_openssh_import, 34
- ssh_options_getopt, 33
- channel_write_common, 28

$ python3 ./find-uncovered-functions.py libpng
Stats as of 2023-04-05
Functions with 0 coverage: 250
Most complex functions with no code coverage:
- png_do_compose, 91
- png_image_read_colormap, 70
- OSS_FUZZ_png_set_quantize, 58
- png_image_read_direct, 51
- OSS_FUZZ_png_ascii_from_fp, 41
- png_image_read_background, 27
- png_do_rgb_to_gray, 27
- OSS_FUZZ_png_colorspace_set_rgb_coefficients, 24
- png_XYZ_normalize, 23
- OSS_FUZZ_png_set_keep_unknown_chunks, 23

$ python3 ./find-uncovered-functions.py c-ares
Stats as of 2023-04-05
Functions with 0 coverage: 149
Most complex functions with no code coverage:
- ares__get_hostent, 61
- inet_net_pton_ipv4, 51
- ares__readaddrinfo, 49
- inet_net_pton_ipv6, 29
- fake_addrinfo, 26
- process_answer, 24
- ares_getaddrinfo, 24
- write_tcp_data, 23
- read_tcp_data, 22
- get_precedence, 22

The scanner module from the OSS-Fuzz toolkit is relatively short, so the recommendation as of now is to study that module in order to understand the details of the above script. However, the most important lines of the code are:

report_generator = scanner.get_all_reports([project_name], 100, 1) searches for successful Fuzz Introspector reports generated by OSS-Fuzz for each project in the list given as the first argument. The second argument specifies the maximum number of days to analyse and the third argument the interval, in days, between them. For example, get_all_reports([proj1], 12, 31) will scan for successful runs on 12 days spaced 31 days apart, counting back from today, so it covers roughly the last year.
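To make the day and interval arguments concrete, the sketch below lists the dates such a scan would consider. It is only an illustration of the arithmetic, not the scanner's own code: the helper name sample_dates is hypothetical, and it assumes that the scan starts at the current day and steps back one interval at a time. The YYYYMMDD date format is taken from how the script above slices date_as_str.

```python
import datetime


def sample_dates(days_to_analyse, interval_size, today=None):
    """Return the YYYYMMDD dates a scan would consider, newest first.

    Hypothetical helper: assumes the scan starts at `today` and steps
    back `interval_size` days for each of the `days_to_analyse` samples.
    """
    if today is None:
        today = datetime.date.today()
    return [
        (today - datetime.timedelta(days=i * interval_size)).strftime("%Y%m%d")
        for i in range(days_to_analyse)
    ]


# 12 samples, 31 days apart, counted back from a fixed reference date.
dates = sample_dates(12, 31, today=datetime.date(2023, 4, 5))
for date_as_str in dates:
    print(date_as_str)
```

Under these assumptions, the 12 samples span 11 intervals of 31 days, i.e. 341 days, which is why the combination approximates a year of history.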

project, date_as_str, introspector_project = next(report_generator) gets the first report from the generator as a triplet, namely the project name, the date of the data and then a Fuzz Introspector report. Extracting an element from the generator will run Fuzz Introspector on the data, so this step may take some time.
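Because the analysis runs only when an element is pulled from the generator, the number of reports processed can be controlled by how many elements are consumed. The sketch below illustrates this with a stand-in generator: fake_get_all_reports, its parameters and the placeholder dictionary are hypothetical and do not reflect the real scanner's signature or the order in which it yields reports.

```python
import itertools

analysed = []  # records which reports have actually been processed


def fake_get_all_reports(project_names, dates):
    """Hypothetical stand-in for scanner.get_all_reports; yields lazily."""
    for project in project_names:
        for date_as_str in dates:
            # In the real scanner, the expensive Fuzz Introspector run
            # would happen here, once per element pulled.
            analysed.append((project, date_as_str))
            yield project, date_as_str, {"placeholder": True}


report_generator = fake_get_all_reports(
    ["libssh"], ["20230405", "20230404", "20230403"])

# Pulling one element triggers exactly one analysis.
project, date_as_str, introspector_project = next(report_generator)
print(project, date_as_str, len(analysed))

# itertools.islice caps how many further reports are processed.
for project, date_as_str, _ in itertools.islice(report_generator, 1):
    print(project, date_as_str, len(analysed))
```

The third date is never analysed in this sketch, since nothing pulls it from the generator.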

The central object in the above triplet is introspector_project. This is an instance of the IntrospectionProject class, which gives access to the analysis Fuzz Introspector offers.
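The filtering and ranking step of the script does not depend on the scanner itself, so it can be tried out on plain data. The following sketch separates that step into a function, assuming the coverage percentage and cyclomatic complexity of each function have already been read from introspector_project.proj_profile. The name rank_uncovered and the sample function names and numbers are hypothetical.

```python
def rank_uncovered(functions, limit=10):
    """Return up to `limit` uncovered functions, most complex first.

    `functions` maps a function name to a pair of
    (coverage percentage, cyclomatic complexity).
    """
    not_hit = [
        (name, complexity)
        for name, (cov_percentage, complexity) in functions.items()
        if cov_percentage == 0.0
    ]
    not_hit.sort(key=lambda e: e[1], reverse=True)
    return not_hit[:limit]


# Made-up sample data standing in for a real project profile.
sample = {
    "parse_header": (0.0, 12),
    "read_block": (85.5, 40),
    "write_block": (0.0, 33),
    "init": (100.0, 2),
}

top = rank_uncovered(sample)
for func_name, complexity in top:
    print("- %s, %s" % (func_name, complexity))
```

Keeping the ranking in a separate function makes it straightforward to swap in a different metric, or to check the logic without fetching any OSS-Fuzz data.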