..

OSS-Fuzz guides
===============

In this section we will go through how to use Fuzz Introspector with OSS-Fuzz.
Fuzz Introspector is integrated into
`OSS-Fuzz <https://github.com/google/oss-fuzz>`_. This means that the OSS-Fuzz
infrastructure provides a set of features for running Introspector on an
arbitrary OSS-Fuzz project. The goal of this is to make it easier for
maintainers of projects on OSS-Fuzz to assess the completeness of their fuzzing
setup.

Running introspector on an OSS-Fuzz project
-------------------------------------------

Running Fuzz Introspector through OSS-Fuzz is convenient because OSS-Fuzz
abstracts away many of the required tasks, e.g. generating a corpus, collecting
coverage, compiling fuzzers in various ways and finally running Fuzz
Introspector on the generated data. In fact, using the OSS-Fuzz environment
makes it possible to run Fuzz Introspector with a single command, as shown
in the following example:

.. code-block:: bash

   # Clone oss-fuzz
   git clone https://github.com/google/oss-fuzz
   cd oss-fuzz

   # Build a project using introspector
   python3 infra/helper.py introspector libdwarf --seconds=30

On success, the last output you see should be something along the lines of:

.. code-block:: bash

   INFO:root:Introspector run complete. Report in /home/dav/code/oss-fuzz/build/out/libdwarf/introspector-report/inspector
   INFO:root:To browse the report, run: python3 -m http.server 8008 --directory /home/dav/code/oss-fuzz/build/out/libdwarf/introspector-report/inspector and navigate to localhost:8008/fuzz_report.html in your browser


You can then launch a simple web server as described in that output:

.. code-block:: bash

   # View the generated HTML report
   python3 -m http.server 8008 \
     --directory build/out/libdwarf/introspector-report/inspector

   # Navigate to http://localhost:8008/fuzz_report.html to view the report.


Generate Fuzz Introspector report with latest public corpus
-----------------------------------------------------------

Runtime code coverage is a central theme in Fuzz Introspector. It helps us
understand the actual code executed by our fuzzers, and is a core part of
ensuring the fuzzers analyse the code we want them to analyse.

OSS-Fuzz builds up a corpus of test case inputs over time for each fuzzer.
A 30-day old version of this corpus is publicly available. We can use this
corpus to generate an almost-up-to-date understanding of how much of the code
OSS-Fuzz has analysed.

To use the latest corpus, we just need to add ``--public-corpora`` to the
``introspector`` command. The following example shows how to do this:

.. code-block:: bash

   # Clone oss-fuzz
   git clone https://github.com/google/oss-fuzz
   cd oss-fuzz

   # Build a project using introspector
   python3 infra/helper.py introspector libdwarf --public-corpora


Programmatically use OSS-Fuzz Fuzz Introspector data
----------------------------------------------------

Fuzz Introspector provides tooling support for accessing data generated by
OSS-Fuzz. This makes it possible to generate a Fuzz Introspector report from
the same data as OSS-Fuzz without running builds, fuzzers, code coverage
generation and so on.

The only task needed is to rerun the post-processing parts of Fuzz
Introspector, which also makes this approach convenient for testing new
features and analyses.

The tooling support comes in the form of a library that makes it easy to fetch
relevant data and run Fuzz Introspector on the data. This library is accessible
here:
`oss-fuzz-scanner <https://github.com/ossf/fuzz-introspector/tree/main/tools/oss-fuzz-scanner>`_.


Find most complex functions with no coverage
############################################

To show the use of the OSS-Fuzz toolkit, this section walks through a short
example that extracts all functions in a given project that have no code
coverage and ranks them by complexity. This is useful, for example, when
looking for new interesting targets in a project.

The following code snippet implements this:


.. code-block:: python

    import sys
    import scanner

    def print_function_details(project_name):
        # Scan for Fuzz Introspector reports in the last 100 days
        report_generator = scanner.get_all_reports([project_name], 100, 1)

        # Get the first report and run fuzz introspector on it.
        project, date_as_str, introspector_project = next(report_generator)

        # Get dictionary of all functions
        all_functions = introspector_project.proj_profile.get_all_functions()

        # Create list of names of functions with 0% code coverage.
        not_hit = []
        for function_name in all_functions:
            cov_percentage = introspector_project.proj_profile.get_func_hit_percentage(
                function_name)
            if cov_percentage == 0.0:
                # We rank the functions by complexity at end so extract this
                # data here as well.
                function_profile = all_functions[function_name]
                not_hit.append(
                    (function_name, function_profile.cyclomatic_complexity))

        print("Stats as of %s-%s-%s" %
              (date_as_str[0:4], date_as_str[4:6], date_as_str[6:]))
        print("Functions with 0 coverage: %d" % (len(not_hit)))
        print("Most complex functions with no code coverage:")
        not_hit.sort(key=lambda e: e[1], reverse=True)
        for i in range(min(len(not_hit), 10)):
            func_name, complexity = not_hit[i]
            print("- %s, %s" % (func_name, complexity))


    if __name__ == "__main__":
        project_name = sys.argv[1]
        print_function_details(project_name)


To test this tool, ensure you're in an environment with the relevant `requirements.txt <https://github.com/ossf/fuzz-introspector/blob/main/requirements.txt>`_
installed, then place the above script in the folder
`oss-fuzz-scanner <https://github.com/ossf/fuzz-introspector/tree/main/tools/oss-fuzz-scanner>`_
and name it ``find-uncovered-functions.py``. It's a simple command-line
tool that can be run as follows:


.. code-block:: bash

    $ python3 ./find-uncovered-functions.py libssh
    Stats as of 2023-04-05
    Functions with 0 coverage: 685
    Most complex functions with no code coverage:
    - ssh_execute_server_request, 72
    - ssh_userauth_agent, 41
    - ssh_pki_openssh_privkey_export, 39
    - ssh_userauth_publickey_auto, 37
    - ssh_channel_select, 37
    - ssh_agent_sign_data, 36
    - ssh_options_copy, 34
    - ssh_pki_openssh_import, 34
    - ssh_options_getopt, 33
    - channel_write_common, 28

    $ python3 ./find-uncovered-functions.py libpng
    Stats as of 2023-04-05
    Functions with 0 coverage: 250
    Most complex functions with no code coverage:
    - png_do_compose, 91
    - png_image_read_colormap, 70
    - OSS_FUZZ_png_set_quantize, 58
    - png_image_read_direct, 51
    - OSS_FUZZ_png_ascii_from_fp, 41
    - png_image_read_background, 27
    - png_do_rgb_to_gray, 27
    - OSS_FUZZ_png_colorspace_set_rgb_coefficients, 24
    - png_XYZ_normalize, 23
    - OSS_FUZZ_png_set_keep_unknown_chunks, 23

    $ python3 ./find-uncovered-functions.py c-ares
    Stats as of 2023-04-05
    Functions with 0 coverage: 149
    Most complex functions with no code coverage:
    - ares__get_hostent, 61
    - inet_net_pton_ipv4, 51
    - ares__readaddrinfo, 49
    - inet_net_pton_ipv6, 29
    - fake_addrinfo, 26
    - process_answer, 24
    - ares_getaddrinfo, 24
    - write_tcp_data, 23
    - read_tcp_data, 22
    - get_precedence, 22


The ``scanner`` module from the OSS-Fuzz toolkit is relatively short, so
the current recommendation is to study that module in order to understand the
details of the above script. The most important lines of the code are:

``report_generator = scanner.get_all_reports([project_name], 100, 1)`` searches
for successful Fuzz Introspector reports generated by OSS-Fuzz for each project in the
list given as the first argument. The second argument specifies the maximum
number of days to analyse and the third argument the interval, in days, between
each analysed day. For example, ``get_all_reports([proj1], 12, 31)`` will scan
for successful runs on 12 days spaced 31 days apart, counting back from today,
so it scans for reports over roughly the last year.
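
To make the day and interval arithmetic concrete, the following minimal sketch
lists the dates such a scan would visit. It is based only on the description
above and does not call ``scanner``. The helper name ``sample_dates`` is
hypothetical, and starting the window at today's date is an assumption rather
than confirmed behaviour of ``get_all_reports``.

.. code-block:: python

    import datetime


    def sample_dates(num_days, interval, today=None):
        """Return the dates a scan would visit, newest first.

        Hypothetical helper: assumes the scan starts at `today` and steps
        back `interval` days at a time, `num_days` times in total.
        """
        if today is None:
            today = datetime.date.today()
        return [
            today - datetime.timedelta(days=step * interval)
            for step in range(num_days)
        ]


    # 12 sample days, 31 days apart, counted back from a fixed date.
    dates = sample_dates(12, 31, today=datetime.date(2023, 4, 5))
    span_in_days = (dates[0] - dates[-1]).days
    print("Samples: %d, window covered: %d days" % (len(dates), span_in_days))

With 12 samples spaced 31 days apart, the window between the newest and oldest
sample is 11 intervals long, which is where the "roughly a year" figure comes
from.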

``project, date_as_str, introspector_project = next(report_generator)`` gets
the first report from the generator as a triplet: the project name, the date
of the data and a Fuzz Introspector report. Extracting an element from the
generator runs Fuzz Introspector on the data, so this step may take some time.
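
A bare ``next()`` call raises ``StopIteration`` when a generator yields nothing,
which could happen if no successful report exists in the scanned window. The
sketch below shows one defensive way to take the first triplet. The helper
``first_report`` and the stand-in generator ``_fake_reports`` are hypothetical
and are only there so the example runs without downloading any data. In real
use the generator would come from ``scanner.get_all_reports``.

.. code-block:: python

    def first_report(report_generator):
        """Return the first (project, date, report) triplet, or None."""
        # The second argument to next() is returned instead of raising
        # StopIteration when the generator is exhausted.
        return next(report_generator, None)


    def _fake_reports(triplets):
        # Hypothetical stand-in for scanner.get_all_reports(...): yields
        # pre-built triplets instead of fetching and analysing real data.
        for triplet in triplets:
            yield triplet


    empty = first_report(_fake_reports([]))
    found = first_report(_fake_reports([("libdwarf", "20230405", object())]))

    if empty is None:
        print("No report found in the scanned window")

    project, date_as_str, _ = found
    print("First report: %s (%s)" % (project, date_as_str))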

The central object in the above triplet is ``introspector_project``. This is
an instance of the `IntrospectionProject <https://fuzz-introspector.readthedocs.io/en/latest/development/core.html#fuzz_introspector.analysis.IntrospectionProject>`_
class, which gives access to the analyses Fuzz Introspector offers.
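
As a further illustration, the two accessors used in the script above,
``get_all_functions()`` and ``get_func_hit_percentage()``, are enough to
compute a short coverage summary. The sketch below is not part of the toolkit:
``coverage_summary`` is a hypothetical helper and ``_FakeProfile`` is a
hypothetical stub that mimics only those two methods so the example runs on
its own. In real use you would pass ``introspector_project.proj_profile``
instead.

.. code-block:: python

    def coverage_summary(proj_profile):
        """Return (total functions, uncovered functions, uncovered share in %)."""
        all_functions = proj_profile.get_all_functions()
        uncovered = [
            name for name in all_functions
            if proj_profile.get_func_hit_percentage(name) == 0.0
        ]
        total = len(all_functions)
        share = (100.0 * len(uncovered) / total) if total else 0.0
        return total, len(uncovered), share


    class _FakeProfile:
        # Hypothetical stub: mimics only the two accessors used above.
        def __init__(self, hit_percentages):
            self._hits = hit_percentages

        def get_all_functions(self):
            # The real method returns function profiles as values; the
            # summary only needs the function names, so values are None.
            return dict.fromkeys(self._hits)

        def get_func_hit_percentage(self, name):
            return self._hits[name]


    summary = coverage_summary(
        _FakeProfile({"parse": 80.0, "dump": 0.0, "init": 0.0, "free": 12.5}))
    print("Functions: %d, uncovered: %d (%.1f%%)" % summary)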