A mission statement and social contract for GNU

2019 was a difficult year for the Free Software Community with lots of questions about the future of GNU. It is hard to come up with good answers unless you know which shared principles you all value. After a very long discussion we finally have a first GNU Social Contract DRAFT and a new public wiki for GNU maintainers to share public discussion documents like this.

Update: Carlos wrote a nice introduction for the glibc community.

Proposals for the new GNU/FSF relationship

To: fsf-and-gnu@fsf.org
CC: gnu-misc-discuss@gnu.org

As volunteers for the GNU Project we are happy that the FSF provides GNU with services like fiscal sponsorship, technical infrastructure, promotion, copyright assignment, and volunteer management. And we note that the FSF is looking for feedback on this relationship going forward:

FSF and GNU: https://www.fsf.org/news/fsf-and-gnu

To that end we have held discussions with other GNU maintainers, developers and other contributors, drafting a GNU mission statement and social contract, identifying stakeholders, delegation models and consensus based decision making. We would like to share some of the things we believe should happen to improve the shared understanding of the relationship for the future of the Free Software Foundation and the GNU Project.

leadership

We believe GNU leadership includes the GNU maintainers who should have this discussion together with the FSF. That way, the FSF can support the GNU Project as a whole.

More generally, we think it is time for the GNU Project to collectively define its governance structure, in a way that includes all stakeholders, and that the FSF should facilitate this process.

responsibility and delegation

We recognize that as the fiscal sponsor of the GNU project, the FSF is ultimately responsible for how GNU uses FSF-supported resources: the web site, mailing lists, earmarked financial accounts, sysadmin support, and so on. However, as the FSF is an organization that is more focused on advocacy than the day-to-day details of software production, we expect that the FSF delegates its responsibility for how GNU resources are allocated and used to the GNU project. The FSF should be supportive of requests from GNU maintainers for FSF-supported resources.

web site

The web site at gnu.org is currently run by a team of volunteers, the GNU webmasters, according to the procedures described at https://www.gnu.org/server/standards/README.webmastering.html.

We would like to increase transparency on how the web site is run and changed, notably by using a publicly visible tool rather than the currently-used private ticket system.

We understand that traditionally the FSF has used the gnu.org domain not just for the GNU project, but also to facilitate other programs of the FSF. We would like to see a better separation between pages maintained by GNU volunteers and FSF staff maintained pages for other FSF programs.

Specifically, we would like to give pages maintained by the FSF Licensing and Compliance Lab, such as the Free Software Definition, License List, Free System Distribution Guidelines and other FSF Compliance programs on which the larger Free Software community relies, a special status or redirect them to the fsf.org domain.

We would also like to discuss which (historical) FSF/GNU Philosophy pages could be better maintained by the FSF License Education program or the FSF Education and Outreach program.

domain names

The gnu.org domain and its sub-domains are administrated by FSF employees. Upon request, they can delegate sub-domains, such as hurd.gnu.org, guix.gnu.org, gcc.gnu.org, etc.

We think the procedure for GNU maintainers to make use of sub-domains should be documented.

allocation of resources

The FSF provides hardware and sysadmin work time to support the GNU Project. We would like to have a clear procedure to request such support—e.g., a procedure by which developers of a GNU package could ask for a virtual machine (VM).

collecting donations on behalf of GNU

The FSF acts as fiscal sponsor of several GNU packages as part of its Working Together for Free Software Fund. We think this action is very welcome and should continue and we would appreciate a clear, documented and transparent procedure for GNU packages to join the Working Together for Free Software Fund.

trademark

We hope that the FSF, as the holder of the GNU trademark, will continue to use this trademark in a responsible manner in support of the GNU Project and GNU packages. We would like to see a public trademark policy guideline for GNU.

transparency

We think the FSF and the GNU Project should take advantage of recent changes in the FSF leadership to clarify their relation. Furthermore, we would like that relation to be as transparent as possible: procedures should be publicly documented, requests and replies should be logged and visible at least to members of the GNU Project, ownership of the various GNU resources should also be publicly documented.

Sincerely,

Ludovic Courtès, Andy Wingo, Carlos O’Donell, Andreas Enge and Mark Wielaard

Software Freedom Conservancy Interview

As part of the Software Freedom Conservancy Donor Match they did a little interview with me. Please read it and get inspired to Donate or join the Conservancy as a Supporter.

Software Freedom Conservancy Donor Match

I decided to be part of the Software Freedom Conservancy Donor Match this year because I believe many more free software communities deserve to have a home for their project at the Conservancy. In their own words:

Software Freedom Conservancy is a not-for-profit organization that helps promote, improve, develop, and defend Free, Libre, and Open Source Software (FLOSS) projects. Conservancy provides a non-profit home and infrastructure for FLOSS projects. This allows FLOSS developers to focus on what they do best — writing and improving FLOSS for the general public — while Conservancy takes care of the projects’ needs that do not relate directly to software development and documentation.

Please support the Software Freedom Conservancy by donating so they will be able to provide a home to many more communities. A donation of 10 US dollars a month will make you an official sponsor. Donations will be matched and so count double. And new Supporters will even have their donations tripled!

Software Freedom Conservancy Member Projects

A public discussion about GNU

New GNU Governance

There is now a public discussion about GNU governance issues as described in this LWN article: Rethinking the governance of the GNU Project. We have had private discussions about GNU governance issues for the last couple of decades between GNU maintainers, but that never resulted in actual change. And recent events made things a bit more urgent, since the Chief GNUisance is no longer the president of the FSF. The FSF is now asking for feedback on how their relationship with the GNU project should go forward with respect to fiscal sponsorship, technical infrastructure, promotion, copyright assignment, and volunteer management. So we need to answer a lot of questions.

Mentoring and apprenticeship

We started with a description of how various GNU projects handle mentoring and apprenticeship. Once a GNU maintainer is assigned as the FSF steward of a project/package there are lots of documents on coding standards and what it means for a project to be GNU and Free Software. But there is no core guideline and a GNU maintainer has almost complete freedom interpreting whether any guidelines are or aren’t applicable to their project. This results in GNU maintainers reinventing a lot of project maintenance, governance and delegation of tasks. It would be good to document the various (consensus based) development models that are the result.

GNU membership

The mentoring and apprenticeship discussion focused on the GNU maintainers as being the core of the GNU project. But as was pointed out there are also webmasters, translators, infrastructure maintainers (partially paid FSF staff and volunteers), education and conference organizers, etc. All these people are GNU stakeholders. And how we organize governance of the GNU project should also involve them. There are also already some committees to evaluate new GNU packages and give feedback on the GNU coding standards. But given these committees are advisory only and are sometimes ignored or overruled people have been demotivated to join them or don’t see them as legitimate. It isn’t clear who is actually a GNU member, or whether the FSF recognizes just the GNU maintainers or also other GNU volunteers as stakeholders.

FSF Philosophy or GNU Policy

Both the GNU membership and the new GNU governance discussions try to answer the question “What is GNU?”. The easy answer is “GNU is an operating system that is free software, put together by people working together for the freedom of all software users to control their computing”. That still leaves a lot to define. What is in an Operating System, who are these people that do all this work and how do we coordinate all that work?

But looking at gnu.org it is much more complex than that. As you would expect there is a people section and a software section. But then there are a lot of sections that blur the lines between the FSF and GNU. Most of that is simply historical. GNU used to be the only program the FSF ran. And some of these pages now have their own on fsf.org. The FSF now has a long list of programs besides GNU it runs. But things like the Free Software License List, Free Software Definition and Free System Distribution Guidelines are still maintained on gnu.org. It would be good to agree on who defines what.

And looking at the Philosophy of the GNU project page one could ask whether GNU is fundamentally about producing coherent, empowering free software systems, or whether it is fundamentally about developing and propagating an inspiring, liberatory philosophy? Or maybe it is both? And which Philosophy articles actually define Policy for the project and which are just personal opinions or preferences of the authors? How we are going to maintain these pages in the future (or maybe we are just going to mark them as historic?) depends on answers to these questions.

Resources

The FSF manages a lot of resources for the GNU project. It holds the trademark, it is entrusted with some of the copyrights, does fundraising and uses the money for technical infrastructure that GNU volunteers can use. Crucially it maintains the infrastructure for www.gnu.org, lists.gnu.org, ftp.gnu.org, savannah.gnu.org and fencepost.gnu.org for GNU projects to publish their work and coordinate development. But this infrastructure doesn’t currently scale and several GNU projects have to maintain their own infrastructure. Some projects have their own (earmarked) funds through the FSF Working Together for Free Software program (or sometimes through other foundations like Software in the Public Interest). It would be nice if the FSF could provide a place to have a discussion about the use of FSF resources by all the GNU volunteers (meta.gnu.org maybe) to help with these discussions and to make it more clear who can speak for GNU and which volunteers can use which mailinglists for what purposes.

GNU Social Contract

All the above discussions will be easier if we could agree on some guidelines that everybody would follow when acting on behalf of GNU. A mission statement about what it means to be GNU and what the values are that the GNU community respects when working together. Condensed to something that is easy to comprehend and follow by anybody who wishes to associate with GNU. Ludo posted a first (annotated) draft based on the idea of the Debian Social Contract. And after some discussion, Andreas posted a preliminary version of the GNU Social Contract based on four core principles:

  • The GNU Project respects users’ freedoms
  • The GNU Project provides a consistent system
  • The GNU Project collaborates with the broader free software community
  • The GNU Project welcomes contributions from all and everyone

If you are working on and/or participating in a GNU project we would love to hear your feedback on the proposed GNU Social Contract, the relation of the GNU project and the FSF, governance, membership and any of the other topics that we have been discussing. Together we can make sure that the GNU project will keep empowering all users to control their computing.

Software does not, by itself, change the world

Andy Wingo wrote some thoughts on rms and gnu. Although I don’t agree with the description of RMS as doing nothing for GNU, the part describing GNU itself is spot on:

Software does not, by itself, change the world; it lacks agency. It is the people that maintain, grow, adapt, and build the software that are the heart of the GNU project — the maintainers of and contributors to the GNU packages. They are the GNU of whom I speak and of whom I form a part.

Go GNU!

FSF and GNU

the FSF is now working with GNU leadership on a shared understanding of the relationship for the future.

Joint statement on the GNU Project

The GNU Project we want to build is one that everyone can trust to defend their freedom.

elfutils 0.177 released with eu-elfclassify

elfutils 0.177 was released with various bug fixes (if you ever had issues updating > 2GB ELF files using libelf, this release is for you!) and some new features. One of the features is eu-elfclassify, a utility by Florian Weimer to analyze ELF objects.

People use various tricks to construct ELF files that might make it non-trivial to determine what kind of ELF file you might be dealing with. Even a simple question like “is this a program executable or shared library?” might be tricky given the fact that (static) PIE executables look a lot like shared libraries. And some “shared libraries” are also “program executables”. E.g. Qt likes to provide some information about how the files have been built. So you can link against it as a shared library, but you can also execute it as if it was a program:

$ /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
This is the QtCore library version Qt 5.11.3
(x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 8.3.0)
Installation prefix: /usr
Library path: lib/x86_64-linux-gnu
Include path: include/x86_64-linux-gnu/qt5
Processor features: sse3 sse2[required] ssse3 fma cmpxchg16b sse4.1 sse4.2 movbe popcnt aes avx f16c rdrand bmi avx2 bmi2 rdseed

glibc does the same thing for its shared libraries. Which is nice if you just quickly need to know what libc version is installed on a system, but might make it tricky to determine what kind of ELF file something really is.

eu-elfclassify has a mode that will tell you whether such a file is primarily a shared library or primarily a program executable. And of course it is able to classify it as both a library and a program. Hopefully eu-elfclassify can replace the usage of the file(1) utility in various tools, with a more precise way to classify ELF files.
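To illustrate why the ELF header alone cannot answer the "program or library?" question, here is a minimal C sketch. It is not how eu-elfclassify is implemented (the tool inspects much more than the header); the enum and function names are made up for this example, and it assumes a Linux system providing <elf.h>, a 64-bit ELF file and native byte order.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch only. */
enum elf_guess {
    GUESS_NOT_ELF,            /* no ELF magic, or too short */
    GUESS_PROGRAM,            /* ET_EXEC: a non-PIE program executable */
    GUESS_PROGRAM_OR_LIBRARY, /* ET_DYN: PIE, shared library, or both */
    GUESS_OTHER_ELF           /* relocatable, core, 32-bit, ... */
};

/* Guess the kind of ELF file from the first bytes of the file.
   Assumption: 64-bit ELF in the byte order of the running machine. */
enum elf_guess guess_from_header(const unsigned char *buf, size_t len)
{
    Elf64_Ehdr ehdr;

    assert(buf != NULL);
    if (len < sizeof ehdr || memcmp(buf, ELFMAG, SELFMAG) != 0)
        return GUESS_NOT_ELF;
    if (buf[EI_CLASS] != ELFCLASS64)
        return GUESS_OTHER_ELF; /* this sketch does not handle 32-bit */

    memcpy(&ehdr, buf, sizeof ehdr);
    switch (ehdr.e_type) {
    case ET_EXEC:
        return GUESS_PROGRAM;
    case ET_DYN:
        /* The header type is the same for PIE executables and shared
           libraries, so e_type cannot tell them apart. */
        return GUESS_PROGRAM_OR_LIBRARY;
    default:
        return GUESS_OTHER_ELF;
    }
}
```

Both a PIE executable and libQt5Core.so.5 would end up in the ET_DYN branch here, which is exactly the ambiguity that the --executable/--shared and --program/--library options are meant to resolve.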

Usage: elfclassify [OPTION...] FILE...
Determine the type of an ELF file.

All of the classification options must apply at the same time to a particular
file.  Classification options can be negated using a "--not-" prefix.

Since modern ELF does not clearly distinguish between programs and dynamic
shared objects, you should normally use either --executable or --shared to
identify the primary purpose of a file.  Only one of the --shared and
--executable checks can pass for a file.

If you want to know whether an ELF object might a program or a shared library
(but could be both), then use --program or --library. Some ELF files will
classify as both a program and a library.

If you just want to know whether an ELF file is loadable (as program or
library) use --loadable.  Note that files that only contain (separate) debug
information (--debug-only) are never --loadable (even though they might contain
program headers).  Linux kernel modules are also not --loadable (in the normal
sense).

Without any of the --print options, the program exits with status 0 if the
requested checks pass for all input files, with 1 if a check fails for any
file, and 2 if there is an environmental issue (such as a file read error or a
memory allocation error).

When printing file names, the program exits with status 0 even if no file names
are printed, and exits with status 2 if there is an environmental issue.

On usage error (e.g. a bad option was given), the program exits with a status
code larger than 2.

The --quiet or -q option suppresses some error warning output, but doesn't
change the exit status.

 Classification options
      --core                 File is an ELF core dump file
      --debug-only           File is a debug only ELF file (separate .debug,
                             .dwo or dwz multi-file)
      --elf                  File looks like an ELF object or archive/static
                             library (default)
      --elf-archive          File is an ELF archive or static library
      --elf-file             File is an regular ELF object (not an
                             archive/static library)
      --executable           File is (primarily) an ELF program executable (not
                             primarily a DSO)
      --library              File is an ELF shared object (DSO) (might also be
                             an executable)
      --linux-kernel-module  File is a linux kernel module
      --loadable             File is a loadable ELF object (program or shared
                             object)
      --program              File is an ELF program executable (might also be a
                             DSO)
      --shared               File is (primarily) an ELF shared object (DSO)
                             (not primarily an executable)
      --unstripped           File is an ELF file with symbol table or .debug_*
                             sections and can be stripped further

 Input flags
  -f, --file                 Only classify regular (not symlink nor special
                             device) files
      --no-stdin             Do not read files from standard input (default)
      --stdin                Also read file names to process from standard
                             input, separated by newlines
      --stdin0               Also read file names to process from standard
                             input, separated by ASCII NUL bytes
  -z, --compressed           Try to open compressed files or embedded (kernel)
                             ELF images

 Output flags
      --matching             If printing file names, print matching files
                             (default)
      --no-print             Do not output file names
      --not-matching         If printing file names, print files that do not
                             match
      --print                Output names of files, separated by newline
      --print0               Output names of files, separated by ASCII NUL

 Additional flags
  -q, --quiet                Suppress some error output (counterpart to
                             --verbose)
  -v, --verbose              Output additional information (can be specified
                             multiple times)

  -?, --help                 Give this help list
      --usage                Give a short usage message
  -V, --version              Print program version

Report bugs to https://sourceware.org/bugzilla.

bzip2 and the CVE that wasn’t

Compiling with the GCC sanitizers and then fuzzing the resulting binaries might find real bugs. But not all such bugs are security issues. When a CVE is filed there is some pressure to treat such an issue with urgency and push out a fix as soon as possible. But taking your time and making sure an issue can be replicated/exploited without the binary being instrumented by the sanitizer is often better.

This was the case for CVE-2019-12900 “BZ2_decompress in decompress.c in bzip2 through 1.0.6 has an out-of-bounds write when there are many selectors”.

The bzip2 project had lost the domain which it had used for the last 15 years. And it hadn’t seen an official release since 2010. The bzip2 project homepage, documentation and downloads had already been moved back to sourceware.org. And a new bug tracker, development mailing list and git repository had been set up. But we were still in the middle of a code cleanup (removing references to the old homepage, updating the manual and adding various cleanups that distros had made to the code) when the CVE was filed.

The issue reported was discovered by a fuzzer run against a bzip2 binary compiled with gcc -fsanitize=undefined, which produced the following error:

decompress.c:299:10: runtime error: index 18002 out of bounds for type 'UChar [18002]'

The DState struct given to the BZ2_decompress function has a field defined as UChar selectorMtf[BZ_MAX_SELECTORS]; where BZ_MAX_SELECTORS is 18002. So the patch that came with the security report looked totally reasonable.

--- a/decompress.c
+++ b/decompress.c
@@ -284,15 +284,15 @@ Int32 BZ2_decompress ( DState* s )
284      /*--- Now the selectors ---*/
285      GET_BITS(BZ_X_SELECTOR_1, nGroups, 3);
286      if (nGroups < 2 || nGroups > 6) RETURN(BZ_DATA_ERROR);
287      GET_BITS(BZ_X_SELECTOR_2, nSelectors, 15);
288 -    if (nSelectors < 1) RETURN(BZ_DATA_ERROR);
    +    if (nSelectors < 1 || nSelectors > BZ_MAX_SELECTORS) RETURN(BZ_DATA_ERROR);
289      for (i = 0; i < nSelectors; i++) {
290         j = 0;
293         while (True) {
294            GET_BIT(BZ_X_SELECTOR_3, uc);
295            if (uc == 0) break;
296            j++;
297            if (j >= nGroups) RETURN(BZ_DATA_ERROR);
298         }
299         s->selectorMtf[i] = j; /* array overrun! */
300      }

Without the new nSelectors > BZ_MAX_SELECTORS guard the code could write beyond the selectorMtf array, which is undefined behavior. The undefined behavior in this case would be writing to memory addresses after the array. Given that an attacker could define nSelectors as big as they want, they would be able to overwrite any memory after the array. This seemed urgent enough to do a new release quickly with this fix.

bzip2 1.0.7 was released. But the next day we already got bug reports that the fix broke decompression of some existing .bz2 files. This didn’t really make sense at first. BZ_MAX_SELECTORS was the theoretical maximum number of selectors that could validly be used in a .bz2 file. But some testing did confirm that these files did define a handful more selectors than were actually used. It turned out that some alternative bzip2 implementations used a slightly bigger maximum for the number of selectors (rounded up to a multiple of 8) which they might define, but didn’t expect to be used.

Julian Seward came up with a fix that split the max number of selectors in two: the original theoretical max that bzip2 would encode, and a bigger (rounded up to a multiple of 8) max that would be accepted when decompressing. This seemed to fix the issue for real, while still accepting some slightly “wrong” .bz2 files. The original code had worked for these because the array overwrite was only a few bytes, and the DState struct has extra state right after the selectorMtf array: the UChar len[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE] array (6 * 258 = 1548 bytes), which was only written to after the selectors were read. So the memory overwrite was almost immediately corrected and didn’t do any harm because it was just such a small amount. The new code would still protect against real “too big” nSelectors values.
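The idea of using two different maxima can be sketched in a few lines of C. This is only an illustration, not the actual patch: BZ_MAX_SELECTORS and its value 18002 come from the text above, while round_up_8, max_selectors_decode and accept_when_decoding are hypothetical names invented for this example.

```c
#include <assert.h>

#define BZ_MAX_SELECTORS 18002 /* theoretical max the encoder produces */

/* Round a non-negative number up to the next multiple of 8. */
int round_up_8(int n)
{
    assert(n >= 0);
    return (n + 7) & ~7;
}

/* Hypothetical: the larger limit tolerated when decompressing,
   to accept files from implementations that round up. */
int max_selectors_decode(void)
{
    return round_up_8(BZ_MAX_SELECTORS);
}

/* Returns 1 if a block header with this selector count would be
   accepted by a decoder using the relaxed limit, 0 otherwise. */
int accept_when_decoding(int nSelectors)
{
    return nSelectors >= 1 && nSelectors <= max_selectors_decode();
}
```

With these numbers the relaxed limit is 18008, so a file declaring a handful of extra selectors is accepted while a much larger count is still rejected.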

But we still didn’t feel completely confident we had fixed things correctly. One issue was that bzip2 never had a really good testsuite. Testing was mostly done ad-hoc by developers on a random collection of .bz2 files that they happened to have around. Luckily some alternative bzip2 implementations had created more formal testsuites. The .bz2 testfiles of those projects were collected and a testframe was created that ran bzip2 on both correct and known bad .bz2 files (optionally using valgrind to catch bad memory usage). This was a really good thing. The testsuite was added to the bzip2 buildbot. Which immediately flagged one testcase (32767.bz2) as BAD!

The 32767.bz2 testcase has the max number of selectors that the file format allows (2^15 - 1 = 32767). The .bz2 file format reserves 15 bits for the number of selectors used in a block, because the max of 18002 selectors needs 15 bits to be expressed. That testcase could be decompressed correctly by bzip2 1.0.6 (or earlier), but not by the new bzip2 version that checked the number of selectors was “sane”. When the original bzip2 1.0.6 code was compiled with gcc -fsanitize=undefined the selectorMtf array overwrite was (correctly) reported. But surprisingly when run under valgrind memcheck no bad memory usage was reported.

Some more investigation revealed that although this was an example of the most extreme possible selectorMtf array overwrite, it still only wrote over already allocated memory and that memory was not used before being assigned correct values. The selectorMtf array could hold 18002 bytes, so 32767 - 18002 = 14765 bytes could be overwritten after the array. But the DState struct had 3 more arrays after the selectorMtf and len arrays. Each defined as UInt32 [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE], which is 3 * 4 * 6 * 258 = 18576 bytes. And all state after the selectorMtf array in the DState struct would be assigned values right after reading the selectors. And none of the excess selector values would ever be used. So even though there really was an array overwrite, it was completely harmless!
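The arithmetic can be double-checked with a small mock of the relevant part of the struct. This is a sketch, not the real DState definition: only the array sizes and the selectorMtf and len names are taken from the text, the names table_a, table_b and table_c are placeholders, and the real struct contains more fields before and after these.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BZ_MAX_SELECTORS     18002
#define BZ_N_GROUPS          6
#define BZ_MAX_ALPHA_SIZE    258
#define FORMAT_MAX_SELECTORS 32767 /* 2^15 - 1, the 15 bit field maximum */

/* Mock of the arrays that follow each other in the decoder state.
   table_a/b/c are hypothetical names for the three UInt32 arrays. */
struct dstate_tail {
    unsigned char selectorMtf[BZ_MAX_SELECTORS];
    unsigned char len[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
    uint32_t table_a[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
    uint32_t table_b[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
    uint32_t table_c[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
};

/* Bytes written past selectorMtf in the worst case. */
size_t worst_case_excess(void)
{
    return FORMAT_MAX_SELECTORS - BZ_MAX_SELECTORS;
}

/* Bytes of the mock struct that come after selectorMtf
   (at least 1548 + 18576, possibly plus padding). */
size_t bytes_after_selector_mtf(void)
{
    return sizeof(struct dstate_tail) - offsetof(struct dstate_tail, len);
}

/* 1 if even the worst-case overwrite stays inside the struct. */
int overwrite_stays_inside(void)
{
    return worst_case_excess() <= bytes_after_selector_mtf();
}

/* Abort if the assumption made in the text does not hold. */
void check_layout(void)
{
    assert(overwrite_stays_inside());
}
```

The worst case writes 14765 bytes past the array, while at least 1548 + 18576 = 20124 bytes of the struct follow it, so the overwrite lands entirely in memory that is re-initialized afterwards.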

That knowledge allowed us to write a much simpler patch that just skipped over the extra selectors without storing them, and to release bzip2 1.0.8, which decompresses all the same files that 1.0.6 and earlier could.
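A simplified model of the "skip, don't store" approach could look like this in C. It is a sketch under assumptions, not the code from the release: the real decoder reads selector values bit by bit from the compressed stream, whereas the hypothetical store_selectors function below takes them from an already decoded input array.

```c
#include <assert.h>
#include <stddef.h>

#define BZ_MAX_SELECTORS 18002

/* Hypothetical stand-in for the selector loop in the decoder.
   Consumes nSelectors values from decoded[], stores at most
   BZ_MAX_SELECTORS of them in selectorMtf[], and returns the
   number of selectors that will actually be used (or -1 if the
   count is invalid). */
int store_selectors(unsigned char *selectorMtf,
                    const unsigned char *decoded,
                    int nSelectors)
{
    assert(selectorMtf != NULL && decoded != NULL);
    if (nSelectors < 1)
        return -1;

    for (int i = 0; i < nSelectors; i++) {
        /* Every value is still consumed from the input so the
           stream position stays correct ... */
        unsigned char j = decoded[i];
        /* ... but only values that fit in the array are kept. */
        if (i < BZ_MAX_SELECTORS)
            selectorMtf[i] = j;
    }

    /* Extra selectors are never used, so pretend they were not there. */
    if (nSelectors > BZ_MAX_SELECTORS)
        nSelectors = BZ_MAX_SELECTORS;
    return nSelectors;
}
```

Unlike a hard limit, this accepts any count the 15 bit field can express while never writing outside the array.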

In the end it was good for the bzip2 project to have a bit of an emergency. It brought people together who cared deeply about making sure bzip2 survives as a project, it got us automated release scripts, a new testsuite, buildbots, various other fixes upstreamed from distros and bzip2 is now part of oss-fuzz (so we might get earlier warnings about similar issues in the future) and there is now a kind of roadmap for how to move forward.

But part of the panic was also completely unnecessary. Yes, there was a way to trigger undefined behavior, but with any current compiler that behavior was actually defined: it would write over known (bounded) memory, memory that otherwise was correctly used and defined. We should have insisted on having a real reproducer that could be triggered under valgrind memcheck. The instrumentation of the undefined sanitizer was not enough to show a real issue. We were lucky: it could certainly have been, or become, a real issue if the DState structure layout had been different, if some constants were larger or smaller or if the compiler was smarter (it could have decided that writing after the array could never happen and so “optimize” the program assuming some loops were bounded). So fixing the bug was certainly the right thing to do. But in practice it never was a real security issue and we placed too much value in the fact that a CVE was assigned to it.

bzip2 1.0.8

We are happy to announce the release of bzip2 1.0.8.

This is a fixup release because the CVE-2019-12900 fix in bzip2 1.0.7 was too strict and might have prevented decompression of some files that earlier bzip2 versions could decompress. And it contains a few more patches from various distros and forks.

bzip2 1.0.8 contains the following fixes:

  • Accept as many selectors as the file format allows. This relaxes the fix for CVE-2019-12900 from 1.0.7 so that bzip2 allows decompression of bz2 files that use (too) many selectors again.
  • Fix handling of large (> 4GB) files on Windows.
  • Cleanup of bzdiff and bzgrep scripts so they don’t use any bash extensions and handle multiple archives correctly.
  • There is now a bz2-files testsuite at https://sourceware.org/git/bzip2-tests.git

Patches by Joshua Watt, Mark Wielaard, Phil Ross, Vincent Lefevre, Led and Kristýna Streitová.

This release also finalizes the move of bzip2 to a community maintained project at https://sourceware.org/bzip2/

Thanks to Bhargava Shastry bzip2 is now also part of oss-fuzz to catch fuzzing issues early and (hopefully not) often.