New services for sourceware projects
====================================

Sourceware
----------

sourceware.org is one of the original free software project hosting
services.  It hosts dozens of projects, including some GNU projects.

It is community operated, (mostly) Red Hat subsidized infrastructure.

<img src="./sourceware.png" width="50%">

Presenter Notes
---------------

This is about infrastructure that projects use on Sourceware. And it
is about how you want to use it, and maybe about which policies projects
want to adopt that the infrastructure makes possible. Setting project
policies isn't the job of the sourceware overseers, but we love to help
make sure the infrastructure is ready.

You can think about having signed commits or authenticated patches,
tracking reviewers of patches, CI tests that always need to be green,
tracker bugs, etc. Some of these are popularly known as supply chain
security policies, but they can also be seen as simply making sure we
track everything we do.

---

Projects
--------

Historically sourceware.cygnus.com (1998), sources.redhat.com (2000),
now sourceware.org (since 2001).

Also known as cygwin.com, gcc.gnu.org, valgrind.org, elfutils.org.

Host to ~25 projects: annobin, binutils, builder, bunsen, bzip2, cgen,
cygwin, debugedit, dwz, elfutils, insight, gcc, gccrs, gdb, glibc,
gnu-gabi, kawa, libabigail, libffi, lvm2, newlib, poke, sid,
systemtap, valgrind, ...

Plus another ~40 dormant or moved projects.

Worry-free, friendly, developer controlled, home for Free Software
projects.

---

Services
--------

- mailman based mailing lists
  - plus public-inbox (*)
- git hosting
- project websites, also git based
- bug tracking using bugzilla
- patch tracking via patchwork (*)
- integration testing using buildbot (*)
  - bunsen test results database (*)
- others, just ask! 

Trying to provide zero maintenance infrastructure. Prefer packaged
software for easy automated (security) updates. Otherwise virtual
environments.

Various projects have admin groups to set up their own services.

(*) New or improved - this talk!

---

Goal and vision
---------------

Email is awesome, but how can we combine our discussion based patch
reviews with patch tracking and test automation?

Provide infrastructure for tracking and automating patches, testing,
and analyzing test results.

And how do we keep improving and innovating as a community run free
software infrastructure project?

Sourceware was started in 1998. What about the next 24 years?

Presenter Notes
---------------

The younger generation has trouble contributing to projects that
primarily use email. For elfutils we got some new contributors who
struggle with providing patches. The gccrs frontend is a project on
github, but wants to integrate with gcc.

Forges provide two main things that make people using them productive:
automated testing and easy patch tracking. So the focus was on new
services which help with that.

Finally, while setting up the new services this year, we started
discussing a good structure for making sure Sourceware stays around
for another 24 years. It is best to have this discussion while there
are no known problems or issues.

---

builder.sourceware.org
----------------------

An installation of Buildbot (python), along with a local community,
mailinglist, git repo, container files, self-configuring.

Includes various compute resources:

- virtual machines, bare metal, containers
- distros (fedora, debian, centos, ubuntu, opensuse)
- architectures (i386, x86_64, ppc64le, s390x, ppc64, armhf, arm64)
- Thanks: Brno University, Marist University, Thomas Fitzsimmons, Mark
  Wielaard, Frank Eigler, IBM, The Works on Arm initiative and OSUOSL.

Presenter Notes
---------------

Show https://builder.sourceware.org/

---

Three kinds of builders
-----------------------

- Try builders - "pre-commit" - reports to author only.
  (binutils, elfutils, gdb and libabigail)
- CI builders - "quick regression tests", reports to author and list
  on regression.
  (bzip2, dwz, debugedit, libabigail, elfutils, glibc, gcc, gccrs,
   valgrind, gdb, binutils)
- Full builders - "all (slow) tests, not all commits, not all arches"
  (gcc x86_64/arm64/armhf, glibc x86_64/arm64, gdb-binutils x86_64,
   gccrust-bootstrap x86_64/arm64)

All results, reported or not, go into bunsen.
15,000+ builds a month.

Presenter Notes
---------------

Try builders normally run the same tests as the CI builders. Full
builders are only needed for those projects that have heavy/slow tests.

---

Try Example
-----------

- git checkout -b frob
- hack, hack, hack... OK, looks good to submit
- git commit -a -m "Awesome hack"
- git push origin frob:users/mark/try-frob
- ... wait for the emails to come in or watch buildbot logs ...
- Send in patches and mention what the try bot reported

After your patches have been accepted you can delete the branch again:

- git push origin :users/mark/try-frob

Currently enabled for binutils, elfutils, gdb and libabigail.

Presenter Notes
---------------

My favorite kind of builder!

Really, really useful. Does need "quick" tests if enabled on all
arches. Currently sends email to the patch author, who might not be
the committer.
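
The branch naming used in the try example can be sketched in a few lines of
Python. This is only an illustration: the pattern `users/<name>/try-<topic>`
is inferred from the `users/mark/try-frob` example on the slide, and the
helper names `parse_try_branch` and `push_refspec` are hypothetical, not part
of the builder configuration.

```python
import re

# Assumed convention, inferred from the slide's example branch
# "users/mark/try-frob": users/<name>/try-<topic>.
TRY_BRANCH_RE = re.compile(r"^users/(?P<user>[^/]+)/try-(?P<topic>.+)$")


def parse_try_branch(branch):
    """Return (user, topic) for a try branch, or None for any other branch."""
    match = TRY_BRANCH_RE.match(branch)
    if match is None:
        return None
    return match.group("user"), match.group("topic")


def push_refspec(local_branch, user):
    """Build the refspec passed to `git push origin <refspec>`."""
    return f"{local_branch}:users/{user}/try-{local_branch}"


print(parse_try_branch("users/mark/try-frob"))  # ('mark', 'frob')
print(parse_try_branch("master"))               # None
print(push_refspec("frob", "mark"))             # frob:users/mark/try-frob
```

A scheduler could use a check like this to decide whether a pushed branch
should trigger the try builders or be ignored.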

---

bunsen testrun storage & analysis
---------------------------------

An installation of bunsen (git://sourceware.org/git/bunsen.git)
(python), under rapid development.  irc: #bunsen on irc.libera.chat.

Stores all test-related logs + metadata from each testrun into one commit
of an ordinary dedicated git repo (git://sourceware.org/git/bunsendb.git).

Small python analysis scripts parse new log files from git into an
sqlite database.  Analysis passes build on each other.  Testruns are
clustered by metadata similarities.

Small command line reporting tools browse/search git + sqlite data,
show diffs/regressions.  More coming.

Includes a simple web frontend (https://builder.sourceware.org/testruns/).

Easy to run your own!  https://sourceware.org/git/?p=bunsen.git;a=blob;f=README

Presenter Notes
---------------

Sourceware testrun logs include numerous & huge dejagnu, automake,
autoconf log files, and more.  Bunsen stores them in git.

Git compresses amazingly: 1TB of raw text from all 45,000 buildbot
runs fits into 10GB.  Use your knowledge of ordinary git to replicate,
subset, garbage-collect, authenticate.
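
The idea of parsing log files into an sqlite database can be sketched as
follows. This is not bunsen's actual code or schema: the `results` table, the
`load_testrun` helper and the sample log are made up for illustration, and
only a few automake-style result names are recognized.

```python
import re
import sqlite3

# Automake-style result lines look like "PASS: run-readelf.sh".
RESULT_RE = re.compile(r"^(PASS|FAIL|XPASS|XFAIL|SKIP|ERROR): (.+)$")


def load_testrun(db, testrun_id, log_text):
    """Parse result lines from one log and store them; return the row count."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(testrun TEXT, name TEXT, outcome TEXT)"
    )
    rows = []
    for line in log_text.splitlines():
        match = RESULT_RE.match(line.strip())
        if match:
            rows.append((testrun_id, match.group(2), match.group(1)))
    db.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)
    return len(rows)


# Hypothetical log excerpt; a real testrun has many and much larger logs.
sample_log = """\
Running tests ...
PASS: run-readelf.sh
FAIL: run-strip.sh
SKIP: run-large-elf-file.sh
"""

db = sqlite3.connect(":memory:")
count = load_testrun(db, "abc123", sample_log)
summary = dict(db.execute(
    "SELECT outcome, COUNT(*) FROM results GROUP BY outcome"
))
print(count, summary)
```

Keeping the raw logs in git and only derived data in sqlite means the
database can always be rebuilt from the repository.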

Let's flip through some screenshots of the web interface.

---

bunsen web: list testruns
-------------------------

<img src="./bunsenweb-testruns.png" width="80%">

Presenter Notes
---------------

Each testrun uses its git commit hash as identifier.

---

bunsen web: testrun metadata
----------------------------

<img src="./bunsenweb-testrun-metadata.png" width="80%">

Presenter Notes
---------------

Here we see the metadata for this particular testrun.  

From the "filelist" and other tabs, one can explore what the system
has found out about this testrun: all the files, all the test cases.

The table's right columns let one navigate to related testruns that
share given metadata values, or else precede or follow it.  Let's
compare this testrun to one that came from the previous elfutils
source version: "source.gitdescribe", "prev", "delta".

---

bunsen web: compare two testruns
--------------------------------

<img src="./bunsenweb-testrun-diff.png" width="80%">

Presenter Notes
---------------

Here we see the differences in test results between the two builds.
Nothing too worrisome.  Notice the autoconf values are also
differenced.  We can click on each log file to instantly see the exact
line context of each test case.

The regression mode limits differences to only pass->fail type
transitions, omitting skipped results and other such less serious
results.
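
The difference between a full diff and the regression mode can be sketched as
below. This is a simplified model, not bunsen's implementation: testruns are
plain dictionaries from test name to outcome, and only a PASS to FAIL
transition is treated as a regression, which is an assumption about what
"pass->fail type transitions" covers.

```python
def diff_testruns(before, after):
    """Return {name: (old, new)} for every test whose outcome changed."""
    changed = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed


def regressions(before, after):
    """Return the names of tests that passed before and fail now."""
    return sorted(
        name
        for name, outcome in after.items()
        if outcome == "FAIL" and before.get(name) == "PASS"
    )


# Hypothetical results for two testruns.
prev = {"a": "PASS", "b": "PASS", "c": "SKIP", "d": "FAIL"}
curr = {"a": "PASS", "b": "FAIL", "c": "PASS", "d": "FAIL", "e": "FAIL"}

print(diff_testruns(prev, curr))
print(regressions(prev, curr))  # ['b']
```

The full diff also reports the skipped-to-passing test and the new failing
test, while the regression view keeps only the one that went from PASS to FAIL.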

---

bunsen web: see details of one test case
----------------------------------------

<img src="./bunsenweb-testrun-testcase.png" width="80%">

Presenter Notes
---------------

Here we see the specific logfile region that accounts for the skipped
run-large-elf-file.sh.trs case.

There is much more to see, clicking around.  Please excuse the plain
looks, it's old school HTML with simple self-contained server tech.

---

patchwork.sourceware.org
------------------------

An installation of patchwork (python/django).

git pw is awesome

patchwork plus CI/CD - Let's use those buildbot workers too

Presenter Notes
---------------

I use it as a personal TODO for elfutils, not really a team effort.
gdb used it but abandoned it, and is now restarting the effort.
glibc does seem to use it as a group, with weekly patch reviews.
Some gcc hackers use it?
Why does patchwork work for some, but not others?

Using it in an automated fashion really requires committing exactly
what was submitted.

DJ has built a framework around patchwork for glibc that can trigger a
test run. Needs authentication to use on "random" buildbots.

---

inbox.sourceware.org
--------------------

An installation of public-inbox (perl).

A better mail archive and so much more....

Allows people to "subscribe" to the list through atom, nntp, imap

Easy way to have mailinglist mirrors, including local (git like) mirror

Try out piem (public-inbox emacs mode) or b4 tools.

Presenter Notes
---------------

All based on Message-IDs.

Make email lists like git.

Not a push, but a pull resource.
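
Because everything is keyed by Message-ID, the URLs are easy to construct. The
sketch below reproduces the lookup and thread URLs that appear in the b4
example output, using the same `midmask` value; the helper names
`message_url` and `thread_mbox_url` are hypothetical.

```python
from urllib.parse import quote

# Same mask as the b4 midmask setting for the dwz list.
MIDMASK = "https://inbox.sourceware.org/dwz/%s"


def message_url(msgid, midmask=MIDMASK):
    """Return the archive URL for a Message-ID, percent-encoding the '@'."""
    msgid = msgid.strip().strip("<>")
    return midmask % quote(msgid, safe="")


def thread_mbox_url(msgid, midmask=MIDMASK):
    """Return the URL of the gzipped mbox holding the whole thread."""
    return message_url(msgid, midmask) + "/t.mbox.gz"


msgid = "<a2bf576d-3598-385d-2139-cae0d4f11074@suse.cz>"
print(message_url(msgid))
print(thread_mbox_url(msgid))
```

Any tool that knows a Message-ID can pull the message or thread this way,
without being subscribed to the list.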

---

b4 example
----------

<pre>
.git/config

[b4]
    midmask = https://inbox.sourceware.org/dwz/%s
    linkmask = https://inbox.sourceware.org/dwz/%s
</pre>

<pre>
 $ b4 am a2bf576d-3598-385d-2139-cae0d4f11074@suse.cz
 Looking up https://inbox.sourceware.org/dwz/a2bf576d-3598-385d-2139-cae0d4f11074%40suse.cz
 Grabbing thread from inbox.sourceware.org/dwz/a2bf576d-3598-385d-2139-cae0d4f11074%40suse.cz/t.mbox.gz
 Analyzing 1 messages in the thread
 Checking attestation on all messages, may take a moment...
 ---
   ✓ [PATCH] Use grep -E instead of egrep.
   ---
   ✓ Signed: DKIM/suse.cz
 ---
Total patches: 1
---
 Link: https://inbox.sourceware.org/dwz/a2bf576d-3598-385d-2139-cae0d4f11074@suse.cz
 Base: applies clean to current tree
       git am ./20220907_mliska_use_grep_e_instead_of_egrep.mbx
</pre>

---

b4 example (cont)
-----------------

<pre>
$ b4 ty --auto
Auto-thankanating commits in master
Found 2 of your commits since 1.week
Calculating patch hashes, may take a moment...
  Located: [PATCH] Use grep -E instead of egrep.
  Located: [PATCH] Fix executable stack warning from linker
---
Generating 2 thank-you letters
  Writing: ./mliska_suse_cz_patch_use_grep_e_instead_of_egrep_.thanks
  Writing: ./mliska_suse_cz_patch_fix_executable_stack_warning_from_linker.thanks
---
You can now run:
  git send-email ./*.thanks
</pre>

---

Patch attestation
-----------------

Either:

- Awesome way to "close" the secure software supply chain
- Security "theater" that will exclude people and reduce code
  reviews to "has the submitter jumped through code signing hoops"

Presenter Notes
---------------

The example showed DKIM verification, but this can also be done for gpg
signed email messages or special patatt "signed" patch emails.

---

Experiment: sourcehut
---------------------

https://sr.ht/~sourceware/

A more webby git workflow alternative

git send-email without the email

<img src="./sr.ht.png" width="80%">


Presenter Notes
---------------

Closes the "mail loop", you can submit without using "real" email.

But not many people seem to use it; it might not be known well enough.
It was tried for submitting gccrs patches, but that wasn't a success?

If it doesn't work out we can still keep the git mirrors.

---

Sourceware as Conservancy member
--------------------------------

First off: things are fine and stable.

The next 24 years of Sourceware?

Why?  In case of future financial or organizational needs.

Reached out to Software Freedom Conservancy a few months ago. Wrote up
an application a couple of months ago, reached out to "all" sourceware
users. SFC offered project membership to Sourceware.

What is a fiscal sponsor?

Independent from any guest projects!

In particular the GNU toolchain projects have the FSF as fiscal
sponsor, nothing changes about that.

Continue to have public discussions on overseers mailing list and
public video chats with Conservancy.

Presenter Notes
---------------

We reached out to the Conservancy because they are trusted as
guardians of Free Software projects. Also because they are really
concerned with having free, not company-controlled or proprietary,
infrastructure for free software projects. And we can provide
that. So partners.

There currently isn't really a need, so now is a good time to set
something up. This is just about sourceware providing free software
project hosting for the next few decades. Projects still decide how or
if they want to use the services.

We don't want to decide any policy for any guest project. Talked to
FSF to make sure this isn't in conflict with the FSF providing fiscal
sponsorship to the GNU Toolchain projects or a way to undermine GNU
projects leadership.

We believe this is a non-event. But we want to be as transparent as
possible. If this isn't a non-event we are doing it wrong. We want to
know if any of this is controversial.

They also turned out to be somewhat critical. What if the machines
are gone next week? How are you making sure you are really independent
of Red Hat? We talked to Red Hat legal, talked to other overseers, and
talked to FSF. They forced us to be as public as possible, with no
surprises for the community. We held public chats, etc.

So all discussions are open and public on the overseers
mailinglist. Video chats with Conservancy have been open to all on
BBB.

---

The FSF tech-team
-----------------

The various GNU projects hosted on sourceware also get support from the
FSF tech-team, which hosts some websites, releases, etc.

They have a paid staff and can also provide backups, mirrors and have
offered technical assistance to sourceware.

They run lists.gnu.org, from which we are learning some tricks.

---

Future plans
------------

bugzilla sourceware infrastructure

Offsite backup/system, BBB instance, Software Heritage, gitolite,
cgit, signed release upload, dkim preserving lists...

Do this every 4 months?

Presenter Notes
---------------

We are normally a little shy. We do our best work being
invisible. Infrastructure should just work. But it might be good to be
a little bit more visible so people know where to go
(overseers@sourceware.org).

We now have a sourceware infrastructure bugzilla component where we
try to track requests and ideas.

It would be good to do an infrastructure presentation/talk/discussion
every couple of months. Also to check if our goals are still
achievable. If something didn't happen after 4 months, do we need
extra help? Or maybe even hire someone to do it?