Systemtap 0.9 – Cast away

Another nice feature for Systemtap 0.9 was added by Josh Stone. Systemtap can collect data from any variable in scope at a probe point using the DWARF debug info. You can even dereference pointers, access struct members, array elements, etc. This is very powerful when collecting data during a trace and the systemtap runtime makes sure all data access is safe. But there were two issues making this less powerful than it could be.

First, to keep the tracing language simple, systemtap only supports basic types (integers and strings), associative arrays and aggregates in stap scripts. This means that you could not easily pass program data around to a helper function to manipulate or format it. Second, programs sometimes “hide” the real type of a variable, or use a void * pointer that gets cast to the right type later on. You could work around this in the past by using embedded C and guru mode, but that wasn’t very nice, and made your script potentially unsafe.

So to make sure you can do this safely Josh added a @cast construct. This allows you to pass around a pointer to program data and interpret it as if it was any type described in the DWARF debuginfo for the program. All accesses are of course still checked for safety by the runtime.

A nice example of this feature in action is the following simple stap script to print the number of incoming connections for an executable by port number. We want to probe the kernel and get the inet_sock from the inet_csk_accept function when it returns successfully. Although this function handles inet sockets (it is part of inet_connection_sock.c), it passes around sock pointers. It can do this because an inet_sock struct starts with a sock struct, so later it can cast the pointer to a full-featured inet_sock pointer. So we do the same in our script:

global ports;

probe kernel.function("inet_csk_accept").return
{
  sock = $return
  if (sock != 0)
    {
      port = @cast(sock, "inet_sock")->num;
      ports[execname(), port]++;
    }
}

probe timer.s(30), end
{
  printf("Connections on ports: %s\n", ctime(gettimeofday_s()));
  foreach ([exec, port] in ports-)
    printf("%12s %4d: %4d\n", exec, port, ports[exec, port]);
  delete ports;
}

$ stap ports.stp
Connections on ports: Sat Feb 28 22:49:10 2009
       httpd   80:  172
       spamd  783:   30
        exim   25:   27
     portmap  111:   11
  imap-login  993:    8
      ypserv  818:    7
        sshd   22:    2
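
To see why reinterpreting the pointer is valid at all, here is a minimal Python ctypes sketch of the same layout trick. The Sock and InetSock classes are made-up, heavily simplified stand-ins (the real kernel structs have many more fields), and local_port is a hypothetical helper. The only point is that a struct whose first member is another struct shares its address with that member.

```python
import ctypes


class Sock(ctypes.Structure):
    # Hypothetical, heavily simplified stand-in for the kernel's struct sock.
    _fields_ = [("family", ctypes.c_ushort), ("state", ctypes.c_ubyte)]


class InetSock(ctypes.Structure):
    # Simplified stand-in for struct inet_sock: the generic sock comes first.
    _fields_ = [("sk", Sock), ("num", ctypes.c_ushort)]


def local_port(sock_ptr):
    """Reinterpret a pointer to Sock as a pointer to InetSock and read num."""
    inet_ptr = ctypes.cast(sock_ptr, ctypes.POINTER(InetSock))
    return inet_ptr.contents.num


# Build a full InetSock, then hand out only a pointer to its embedded Sock,
# the way a function declared to return a generic sock pointer would.
inet = InetSock(Sock(2, 1), 80)
generic = ctypes.pointer(inet.sk)

print(local_port(generic))
```

The cast is only meaningful because the pointer really does point at the start of an InetSock. Nothing in this sketch checks that, which is the kind of checking the stap runtime adds for @cast accesses.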

There are some more exciting network tracing examples in the Systemtap Examples collection.

Systemtap 0.9 – Markers everywhere

We recently released Systemtap 0.9 and one of the nice new features is the user space markers that Stan Cox has been working on. They were designed to be compatible with dtrace static user space markers, so you can immediately take advantage of them if your program already has those included. It even comes with a little dtrace python script wrapper that automagically does the right thing during the build. A nice example of that is postgresql, which has a set of markers to observe transactions, database locks, etc. All you have to do is recompile postgresql with --enable-dtrace and tada, out roll systemtap enabled markers that you can use for getting some high level events from your database. For example, postgresql-transactions.stp shows how many transactions there were and how long they took:

$ stap -x `pgrep -n postgres` postgresql-transactions.stp
committed transactions:
transaction id: time
            34: 1593213ns
            36: 2817146ns
            37: 1463901ns
            38: 1427854ns

aborted transactions: 4

We are trying to get some of these static markers activated in packages compiled for Fedora as the SystemtapStaticProbes F11 Feature, so you can use them out of the box.

But you can also add your own markers to existing code. Daniel Tralamazza has been experimenting with a small glibc patch to add markers around various pthread mutexes, for doing userland synchronization primitives analysis. Then you can easily get things like the top 10 most shared locks.

Some Fosdem pictures

Sarah has been publishing sets of pictures from the Fosdem Free Java Meeting.

IcedTea 1.4 with XRender support

IcedTea 1.4 got released this week. And while it is full of new exciting stuff, you really should check out the XRender support by Clemens Eisserer. Especially if you often use java through remote X. It just flies! Trying it out is easy as soon as the new IcedTea hits a distro near you:

java -Dsun.java2d.xrender=True my.fancy.gui.HelloWorld

Also check out the JGears2 benchmark to compare your results.

So this speeds up the rendering pipeline to X enormously. Now the next step will be optimizing or rewriting the actual Render backend (pisces at this time) which seems to be the next bottleneck, at least for anti-aliased operations.

Clemens will give a talk about his work at Fosdem. Hope to see you all there.

What are you working on? – Systemtap!

Sometimes people ask me what Red Hat actually pays me for. Aren’t you working on something java related? Although my manager has been very generous and allows me to spend some (but not too much!) time on helping out the free java efforts, my main job is in the engineering tools group (gcc, gdb/Archer [formerly Frysk], binutils, elfutils, oprofile, etc.). Currently I am hacking on Systemtap which has been a lot of (low-level) fun.

LWN just published my article “A Systemtap update” which gives a high level overview of the project through some small examples (all work out of the box on Fedora 10 of course). It is tucked away on the LWN kernel page, but I tried to show how Systemtap moved beyond the kernel and now provides complete system observability. At the end of the article I point out some of the other work that is being done around debugging and tracing (elfutils dwarf framework, gcc vta branch, froggy/archer, the user-space breakpoint support layer) to show it is all connected to provide a better debugging and tracing environment for GNU/Linux.

Posters for Libre Java Fosdem meeting

I am going to Fosdem

Looking for a handy reference of all the talks in the Free Java developer room at Fosdem 2009 in Brussels, Saturday 7 and Sunday 8 February? Look no further! PDF Poster and ODG Poster.

Free Java Poster

Free Java Meeting at Fosdem

Our libre-java meeting at Fosdem is approaching quickly. February 7 and 8 in Brussels, Belgium. If you want to give a talk, do a demo or have some general presentation/hack-session in our room, please act quickly so we can still schedule it (the deadline is end of this week!).

This is what will be at our disposal:

  • the room “AW1.120” with a capacity of 74 seats (in the building “AW”)
    • on Saturday 2009-02-07 from 12:00 to 18:00
    • on Sunday 2009-02-08 from 09:00 to 17:00
  • a video projector with VGA cable
  • Internet connectivity (wifi A and B only, no wired)

This is what we need from you:

Also please add your name to the wiki even if you don’t want to present something so we know roughly how many people to expect.

planet.classpath.org moved

planet.classpath.org moved servers and, if done correctly, nobody will notice (except for the new server having a totally sweet favicon). But if you do happen to notice anything odd with the planet after the move, then please do yell and scream.

Observe, systemtap and oprofile updates

Without much fanfare systemtap 0.8 was released a little while ago. There is one little tidbit in the release notes that does warrant some excitement though:

User space probing is supported at a prototype level, for kernels built with the utrace patches.

So what does that mean? Take for example the para-callgraph.stp script:

$ stap para-callgraph.stp 'process("/bin/ls").function("*")' -c /bin/ls
0 ls(12631):->main argc=0x1 argv=0x7fff1ec3b038
276 ls(12631): ->human_options spec=0x0 opts=0x61a28c block_size=0x61a290
365 ls(12631): <-human_options return=0x0
496 ls(12631): ->clone_quoting_options o=0x0
657 ls(12631):  ->xmemdup p=0x61a600 s=0x28
815 ls(12631):   ->xmalloc n=0x28
908 ls(12631):   <-xmalloc return=0x1efe540
950 ls(12631):  <-xmemdup return=0x1efe540
990 ls(12631): <-clone_quoting_options return=0x1efe540
1030 ls(12631): ->get_quoting_style o=0x1efe540
[...]
650290 ls(12631):  <-print_current_files
650330 ls(12631): <-print_dir
650456 ls(12631): ->free_pending_ent p=0x1f02d90
650539 ls(12631): <-free_pending_ent
650660 ls(12631): ->close_stdout
650821 ls(12631):  ->close_stream stream=0x376db6c780
650966 ls(12631):  <-close_stream return=0x0
651082 ls(12631):  ->close_stream stream=0x376db6c860
651164 ls(12631):  <-close_stream return=0x0
651205 ls(12631): <-close_stdout

Each line shows a timestamp, the process name, the tid, and a function entry or exit with its parameters or return value. Currently it relies on having the debuginfo files available, so make sure you install the coreutils-debuginfo package (if you want to trace /bin/ls and friends). Systemtap 0.8 should be in a distro near you soon. Fedora 10 already has it.
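
Because every line of the call graph has the same shape, the output is easy to post-process. The Python sketch below is only an illustration and not part of systemtap: the regular expression and the inclusive_times helper are hypothetical names, and it assumes the layout is exactly as in the ls trace shown earlier (timestamp, name(tid):, an arrow, then the function name). It pairs each entry with its matching exit per thread and sums the difference, in whatever units the timestamps use.

```python
import re
from collections import defaultdict

# timestamp, process name, tid, direction arrow, function name
LINE = re.compile(r"^\s*(\d+)\s+(\S+)\((\d+)\):\s*(->|<-)(\w+)")


def inclusive_times(lines):
    """Sum the time between entry and exit for each function, callees included."""
    stacks = defaultdict(list)  # tid -> stack of (function, entry timestamp)
    totals = defaultdict(int)
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # skip "[...]" and anything else that is not a trace line
        stamp, _name, tid, arrow, func = match.groups()
        stamp = int(stamp)
        if arrow == "->":
            stacks[tid].append((func, stamp))
        elif stacks[tid] and stacks[tid][-1][0] == func:
            _, start = stacks[tid].pop()
            totals[func] += stamp - start
        # an exit that does not match the innermost entry is ignored
    return dict(totals)


SAMPLE = [
    "276 ls(12631): ->human_options spec=0x0 opts=0x61a28c block_size=0x61a290",
    "365 ls(12631): <-human_options return=0x0",
    "496 ls(12631): ->clone_quoting_options o=0x0",
    "657 ls(12631):  ->xmemdup p=0x61a600 s=0x28",
    "815 ls(12631):   ->xmalloc n=0x28",
    "908 ls(12631):   <-xmalloc return=0x1efe540",
    "950 ls(12631):  <-xmemdup return=0x1efe540",
    "990 ls(12631): <-clone_quoting_options return=0x1efe540",
]

for func, spent in sorted(inclusive_times(SAMPLE).items()):
    print(func, spent)
```

Keeping one stack per tid means interleaved output from several threads would not be mixed up, although the ls trace here only has a single thread.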

Another nice thing added in Fedora 10 is oprofile-jit, which enhances the system profiler with java support (for runtimes supporting jvmti/jvmpi; gcj native code was obviously already supported). Just add -agentlib:jvmti_oprofile to your java invocation, and then opreport can give you stuff like:

samples  %        linenr info                 image name   app name    symbol name
136220   20.3345  (no location information)   21010.jo     java        Interpreter
15176     2.2654  indexSet.cpp:528            libjvm.so    libjvm.so   IndexSetIterator::advance_and_next()
12273     1.8321  (no location information)   21010.jo     java        int[] java.math.BigInteger.montReduce(int[], int[], int, int)
11129     1.6613  (no location information)   21010.jo     java        int java.text.CollationElementIterator.next()
9932      1.4826  (no location information)   21010.jo     java        java.lang.String com.sun.javatest.finder.JavaCommentStream.readComment()~1
9731      1.4526  (no location information)   21010.jo     java        java.nio.charset.CoderResult sun.nio.cs.UTF_8$Decoder.decodeArrayLoop(java.nio.ByteBuffer, java.nio.CharBuffer)
9239      1.3792  reg_split.cpp:409           libjvm.so    libjvm.so   PhaseChaitin::Split(unsigned int)
8617      1.2863  ifg.cpp:464                 libjvm.so    libjvm.so   PhaseChaitin::build_ifg_physical(ResourceArea*)

Note how you can see the percentages of time spent in the interpreter, compiled methods, hotspot (and, if I had enabled it, libc, kernel, etc). The above is clearly a somewhat short run (of the jtreg crypto tests) and you can see that most of the time is spent in the Interpreter because the methods haven’t been compiled yet.
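
If you want those percentages grouped rather than listed per symbol, a few lines of Python will do. This is a rough sketch under stated assumptions: percent_by_image is a hypothetical helper, and it assumes the columns of the opreport listing are separated by at least two spaces, as they are in the output above. It adds up the % column per image name, so the 21010.jo image (the interpreter and the compiled java methods in this run) can be compared with libjvm.so.

```python
import re
from collections import defaultdict


def percent_by_image(report_lines):
    """Add up the % column of an opreport symbol listing per image name."""
    totals = defaultdict(float)
    for line in report_lines:
        # Columns: samples, %, linenr info, image name, app name, symbol name.
        # maxsplit keeps symbol names that contain wide gaps in one piece.
        fields = re.split(r"\s{2,}", line.strip(), maxsplit=5)
        if len(fields) < 6 or not fields[0].isdigit():
            continue  # header line or something unexpected
        totals[fields[3]] += float(fields[1])
    return dict(totals)


SAMPLE = [
    "samples  %        linenr info                 image name   app name    symbol name",
    "136220   20.3345  (no location information)   21010.jo     java        Interpreter",
    "15176     2.2654  indexSet.cpp:528            libjvm.so    libjvm.so   IndexSetIterator::advance_and_next()",
    "9239      1.3792  reg_split.cpp:409           libjvm.so    libjvm.so   PhaseChaitin::Split(unsigned int)",
]

for image, percent in percent_by_image(SAMPLE).items():
    print(image, round(percent, 4))
```

Splitting on runs of whitespace is fragile if a column ever holds a wide gap of its own, so treat this as a quick way to eyeball a report, not a robust parser.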

Paul Frields made me smile

Paul wrote about “Fedora 10 around the corner!” and said something really nice:

Last, but certainly not least, I want to thank you, the reader, if I haven’t already. You’re part of our community too, and without you we would be diminished. Free software isn’t just about bits and bytes, it’s about people, about doing something real, something tangible, something lasting for your fellow human beings. And with your help, the Fedora Project has been able to lead in free software innovation for over five years and ten releases now. Each and every one of you — pat yourself on the back for a job well done.

Thanks Paul, very well said.