# How to start hacking on Valgrind --- ## Easy hacks for valgrind Mark Wielaard ------------- .notes: First show next slide, so people can fetch sources, then talk. .notes: Is it really easy? .notes: No. Experience. Recognizing what has been done before. .notes: Contains lies by omission. Looks easy, till you try yourself. .notes: Better be called "cleanup stuff we could use some help with". --- ## Get and build the sources svn co svn://svn.valgrind.org/valgrind/trunk valgrind cd valgrind ./autogen.sh ./configure --prefix=/opt/local/install make -j4 ## And lets build and run the testsuite make check -j4 make regtest --- # Easy Hack ## (Not really) > Figure out why there are failing tests. Hard because depends on: - architecture/instruction set - kernel version - glibc version - random stuff But you learn a lot and there are setups where we get zero fail. == 709 tests, 0 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures == x86_64 Xeon E5-1620 Linux 3.10.0 glibc-2.17 --- # Help is all around you. Please ask! - There will be a hack session after this talk. - There is an irc channel `#valgrind-dev` on `irc.freenode.net` - Mailinglist `valgrind-developers@lists.sourceforge.net` - bugzilla: `https://bugs.kde.org/buglist.cgi?product=valgrind` - Please, please, please file issues for everything. --- # Get your git You will need a git repro (or mercurial) to easily find commits that fixed similar issues. Valgrind uses svn submodules (`VEX` inside `valgrind`): git svn clone svn://svn.valgrind.org/valgrind/trunk valgrind cd valgrind git svn clone svn://svn.valgrind.org/vex VEX **The above takes forever, don't do it right now!** Make sure to update them together: git svn fetch \ && git svn rebase \ && cd VEX \ && git svn fetch \ && git svn rebase \ && cd .. --- # Easy Hack ## (More like a full blown infrastructure project) > Create the canonical git mirror, git stitch `valgrind` and `VEX` together, convince project to adopt it to maintain sources. --- # `true` $ ./vg-in-place /bin/true ==8633== Memcheck, a memory error detector ==8633== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al. ==8633== Using Valgrind-3.11.0.SVN and LibVEX; rerun with -h for copyright ==8633== Command: /bin/true ==8633== ==8633== ==8633== HEAP SUMMARY: ==8633== in use at exit: 0 bytes in 0 blocks ==8633== total heap usage: 0 allocs, 0 frees, 0 bytes allocated ==8633== ==8633== All heap blocks were freed -- no leaks are possible ==8633== ==8633== For counts of detected and suppressed errors, rerun with: -v ==8633== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2) Perfect! Or... --- # I am being suppressed! For counts of detected and suppressed errors, rerun with: -v ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2) $ ./vg-in-place -v /bin/true [...] --9318-- used_suppression: 2 dl-hack3-cond-1 /home/mark/src/valgrind/./.in_place/default.supp:1206 What is in the default suppressions? { dl-hack3-cond-1 Memcheck:Cond obj:*/lib*/ld-2.17*.so* obj:*/lib*/ld-2.17*.so* obj:*/lib*/ld-2.17*.so* } --- # Easy Hack ## (A history project) > Clean up the default suppressions Configure will tell you which ones are used: Default supp files: exp-sgcheck.supp xfree-3.supp xfree-4.supp glibc-2.X-drd.supp glibc-2.34567-NPTL-helgrind.supp glibc-2.X.supp Some comments in those files: # Errors to suppress by default with XFree86 4.1.0) # *** And a bunch of other stuff which is completely unrelated # to X. The default suppressions are a bit of a mess and could do # with a good tidying up. # Resulting from R H 8.0 # MontaVista Linux 4.0.1 on ppc32 # Ubuntu 10.04 on ARM (Thumb). Not sure why this is necessary. http://valgrind.org/docs/manual/manual-core.html#manual-core.suppress --- # What is going on? $ ./vg-in-place --default-suppressions=no /bin/true ==19444== ==19444== Conditional jump or move depends on uninitialised value(s) ==19444== at 0x401856B: index (strchr.S:58) ==19444== by 0x40079D3: expand_dynamic_string_token (dl-load.c:429) ==19444== by 0x40084F5: _dl_map_object (dl-load.c:2311) ==19444== by 0x400160D: map_doit (rtld.c:624) ==19444== by 0x400F313: _dl_catch_error (dl-error.c:177) ==19444== by 0x4000E8D: do_preload (rtld.c:813) ==19444== by 0x4004296: dl_main (rtld.c:1629) ==19444== by 0x4016244: _dl_sysdep_start (dl-sysdep.c:244) ==19444== by 0x4004DA3: _dl_start_final (rtld.c:329) ==19444== by 0x4004DA3: _dl_start (rtld.c:555) ==19444== by 0x4001427: ??? (in /usr/lib64/ld-2.17.so) `index (strchr)` is an implementation using sse instructions in `/lib64/ld-linux-x86-64.so.2 -> ld-2.17.so`. --- # Why does this happen? - Real bug in program, bad use of unadressable or undefined memory. - Bug in valgrind/VEX instruction emulation or memcheck. But most likely, see `shared/vg_replace_strmem.c` We have our own versions of these functions for two reasons: (a) it allows us to do overlap checking (b) some of the normal versions are hyper-optimised, which fools Memcheck and cause spurious value warnings. Our versions are simpler. (c) the glibc SSE-variants can read past the end of the input data ranges. This can cause false-positive Memcheck / Helgrind / DRD reports. Note that sometimes that last point may be worked around with `--partial-loads-ok=yes`. That allows "large" naturally aligned loads with some 'not addressabele' bytes. --- # Easy Hack ## (but not in this case) > Provide more vg_preload functions. Check different arches. --- # What does vg_preload do? `shared/vg_replace_strmem.c`: !c #define STRCHR(soname, fnname) \ char* VG_REPLACE_FUNCTION_EZU(20020,soname,fnname) ( const char* s, int c ); \ char* VG_REPLACE_FUNCTION_EZU(20020,soname,fnname) ( const char* s, int c ) \ { \ HChar ch = (HChar)c ; \ const HChar* p = s; \ while (True) { \ if (*p == ch) return (HChar *)p; \ if (*p == 0) return NULL; \ p++; \ } \ } // Apparently index() is the same thing as strchr() #if defined(VGO_linux) STRCHR(VG_Z_LIBC_SONAME, strchr) STRCHR(VG_Z_LIBC_SONAME, __GI_strchr) STRCHR(VG_Z_LIBC_SONAME, __strchr_sse2) STRCHR(VG_Z_LIBC_SONAME, __strchr_sse2_no_bsf) STRCHR(VG_Z_LIBC_SONAME, index) # if !defined(VGP_x86_linux) STRCHR(VG_Z_LD_LINUX_SO_2, strchr) STRCHR(VG_Z_LD_LINUX_SO_2, index) STRCHR(VG_Z_LD_LINUX_X86_64_SO_2, strchr) STRCHR(VG_Z_LD_LINUX_X86_64_SO_2, index) # endif --- # But why doesn't it work? Hint. Look at the backtrace: ==19444== Conditional jump or move depends on uninitialised value(s) ==19444== at 0x401856B: index (strchr.S:58) ==19444== by 0x40079D3: expand_dynamic_string_token (dl-load.c:429) ==19444== by 0x40084F5: _dl_map_object (dl-load.c:2311) ==19444== by 0x400160D: map_doit (rtld.c:624) ==19444== by 0x400F313: _dl_catch_error (dl-error.c:177) ==19444== by 0x4000E8D: do_preload (rtld.c:813) ==19444== by 0x4004296: dl_main (rtld.c:1629) ==19444== by 0x4016244: _dl_sysdep_start (dl-sysdep.c:244) ==19444== by 0x4004DA3: _dl_start_final (rtld.c:329) ==19444== by 0x4004DA3: _dl_start (rtld.c:555) ==19444== by 0x4001427: ??? (in /usr/lib64/ld-2.17.so) Drat, it is doing the actual preloading... --- # Why don't we need this for x86? Now your git repo comes in handy... `valgrind r11992`. x86-linux: don't add redirections for strchr/index in ld.so since they are already hardwiredly-redirected at startup, and so these are redundant. See `coregrind/m_trampoline.S` /* ------------------ SIMULATED CPU HELPERS ------------------ */ --- # What do we do for x86? `coregrind/m_trampoline.S`: /* There's no particular reason that this needs to be handwritten assembly, but since that's what this file contains, here's a simple index implementation (written in C and compiled by gcc.) unsigned char* REDIR_FOR_index ( const char* s, int c ) { unsigned char ch = (unsigned char)((unsigned int)c); unsigned char* p = (unsigned char*)s; while (1) { if (*p == ch) return p; if (*p == 0) return 0; p++; } } */ .global VG_(x86_linux_REDIR_FOR_index) .type VG_(x86_linux_REDIR_FOR_index), @function VG_(x86_linux_REDIR_FOR_index): pushl %ebp movl %esp, %ebp movl 8(%ebp), %eax movzbl 12(%ebp), %ecx movzbl (%eax), %edx cmpb %dl, %cl jne .L9 jmp .L2 .L11: addl $1, %eax movzbl (%eax), %edx cmpb %dl, %cl je .L2 .L9: testb %dl, %dl jne .L11 xorl %eax, %eax .L2: popl %ebp ret .size VG_(x86_linux_REDIR_FOR_index), .-VG_(x86_linux_REDIR_FOR_index) --- # Lets patch amd64 to do the same diff --git a/coregrind/m_redir.c b/coregrind/m_redir.c index d87110b..6e5f1f3 100644 --- a/coregrind/m_redir.c +++ b/coregrind/m_redir.c @@ -1275,6 +1275,9 @@ void VG_(redir_initialise) ( void ) if (0==VG_(strcmp)("Memcheck", VG_(details).name)) { add_hardwired_spec( + "ld-linux-x86-64.so.2", "index", + (Addr)&VG_(amd64_linux_REDIR_FOR_index), NULL); + add_hardwired_spec( "ld-linux-x86-64.so.2", "strlen", (Addr)&VG_(amd64_linux_REDIR_FOR_strlen), # ifndef GLIBC_MANDATORY_STRLEN_REDIRECT diff --git a/coregrind/m_trampoline.S b/coregrind/m_trampoline.S index 3d2be09..a3ff66c 100644 --- a/coregrind/m_trampoline.S +++ b/coregrind/m_trampoline.S @@ -220,6 +220,30 @@ VG_(amd64_linux_REDIR_FOR_strlen): .LfnE5: .size VG_(amd64_linux_REDIR_FOR_strlen), .-VG_(amd64_linux_REDIR_FOR_strlen) +.global VG_(amd64_linux_REDIR_FOR_index) +.type VG_(amd64_linux_REDIR_FOR_index), @function +VG_(amd64_linux_REDIR_FOR_index): + movzbl (%rdi), %eax + movl %esi, %edx + cmpb %sil, %al + jne .L4 + jmp .L5 +.L10: + addq $1, %rdi + movzbl (%rdi), %eax + cmpb %dl, %al + je .L5 +.L4: + testb %al, %al + jne .L10 + xorl %eax, %eax + ret +.L5: + movq %rdi, %rax + ret +.size VG_(amd64_linux_REDIR_FOR_index), .-VG_(amd64_linux_REDIR_FOR_strlen) diff --git a/coregrind/pub_core_trampoline.h b/coregrind/pub_core_trampoline.h index 411a23c..eba3798 100644 --- a/coregrind/pub_core_trampoline.h +++ b/coregrind/pub_core_trampoline.h @@ -71,6 +71,7 @@ extern Addr VG_(amd64_linux_REDIR_FOR_vgettimeofday); extern Addr VG_(amd64_linux_REDIR_FOR_vtime); extern Addr VG_(amd64_linux_REDIR_FOR_vgetcpu); extern UInt VG_(amd64_linux_REDIR_FOR_strlen)( void* ); +extern Char* VG_(amd64_linux_REDIR_FOR_index) ( const Char*, Int ); #endif #if defined(VGP_ppc32_linux) --- # Victory! $ ./vg-in-place --default-suppressions=no /bin/true ==16308== Memcheck, a memory error detector ==16308== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al. ==16308== Using Valgrind-3.11.0.SVN and LibVEX; rerun with -h for copyright ==16308== Command: /bin/true ==16308== ==16308== ==16308== HEAP SUMMARY: ==16308== in use at exit: 0 bytes in 0 blocks ==16308== total heap usage: 0 allocs, 0 frees, 0 bytes allocated ==16308== ==16308== All heap blocks were freed -- no leaks are possible ==16308== ==16308== For counts of detected and suppressed errors, rerun with: -v ==16308== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) --- ## Easy Hack > Write normal C functions to be used with `add_hardwired_spec`. There really is nothing special about them and we now have multiple architectures wanting to hardwire the same implementation. Having to copy/paste the output of gcc -S is somewhat silly. --- ## Is there are drawback to hardwiring? # Yes It is less flexible than the LD_PRELOAD mechanism and it needs the symbol table of the file to replace a function in. We used to avoid having hardwires because of that. But these days we need some hardwires anyway (see patch above, there is already `strlen`). Some distros however strip everything, even the symbol table of ld.so. Unpleasant user experience --- ## Easy Hack > Convince distros (Debian) of the importance of `README_PACKAGERS`: Do not ship your Linux distro with a completely stripped /lib/ld.so. At least leave the debugging symbol names on -- line number info isn't necessary. If you don't want to leave symbols on ld.so, alternatively you can have your distro install ld.so's debuginfo package by default, or make ld.so.debuginfo be a requirement of your Valgrind RPM/DEB/whatever. Reason for this is that Valgrind's Memcheck tool needs to intercept calls to, and provide replacements for, some symbols in ld.so at startup (most importantly strlen). If it cannot do that, Memcheck shows a large number of false positives due to the highly optimised strlen (etc) routines in ld.so. This has caused some trouble in the past. As of version 3.3.0, on some targets (ppc32-linux, ppc64-linux), Memcheck will simply stop at startup (and print an error message) if such symbols are not present, because it is infeasible to continue. It's not like this is going to cost you much space. We only need the symbols for ld.so (a few K at most). Not the debug info and not any debuginfo or extra symbols for any other libraries. --- # Easy Hack > Find and implement missing syscalls (or ioctls). `README_MISSING_SYSCALL_OR_IOCTL` is a really good tutorial. `man syscalls` has a list of syscalls per kernel version. `http://kernelnewbies.org/LinuxVersions` background on new syscalls. Nice example is `memfd_create` added in valgrind svn r14875. `http://bugs.kde.org/343012` --- # syscall bonus for some arches Some arches have diverged and just need to catch up the `coregrind/m_syswrap/syswrap--linux.c` table. Bug 340922 arm64: unhandled getgroups/setgroups syscalls. --- a/coregrind/m_syswrap/syswrap-arm64-linux.c +++ b/coregrind/m_syswrap/syswrap-arm64-linux.c @@ -968,6 +968,8 @@ static SyscallTableEntry syscall_main_table[] = { GENX_(__NR_getpgid, sys_getpgid), // 155 GENX_(__NR_getsid, sys_getsid), // 156 GENX_(__NR_setsid, sys_setsid), // 157 + GENXY(__NR_getgroups, sys_getgroups), // 158 + GENX_(__NR_setgroups, sys_setgroups), // 159 GENXY(__NR_uname, sys_newuname), // 160 GENXY(__NR_getrlimit, sys_old_getrlimit), // 163 GENX_(__NR_setrlimit, sys_setrlimit), // 164 @@ -1237,8 +1239,6 @@ static SyscallTableEntry syscall_main_table[] = { //ZZ GENX_(__NR_setreuid32, sys_setreuid), // 203 //ZZ GENX_(__NR_setregid32, sys_setregid), // 204 //ZZ -//ZZ GENXY(__NR_getgroups32, sys_getgroups), // 205 -//ZZ GENX_(__NR_setgroups32, sys_setgroups), // 206 //ZZ LINX_(__NR_setresuid32, sys_setresuid), // 208 //ZZ LINXY(__NR_getresuid32, sys_getresuid), // 209 --- # Easy Hack > Find and fix bad syscall wrappers. LTP. Linux Test Project http://linux-test-project.github.io/ > The LTP testsuite contains a collection of tools for testing the Linux kernel and related features. Our goal is to improve the Linux kernel and system libraries by bringing test automation to the testing effort. git clone git://github.com/linux-test-project/ltp cd ltp make autotools ./configure make -j4 --- # A crash is easy to find... $ valgrind ./testcases/kernel/syscalls/getsockname/getsockname01 --32037-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting --32037-- si_code=1; Faulting address: 0x1; sp: 0x802c99ce0 valgrind: the 'impossible' happened: Killed by fatal signal host stacktrace: ==32037== at 0x380FEB16: deref_UInt (syswrap-generic.c:1145) ==32037== by 0x380FEB16: vgModuleLocal_buf_and_len_pre_check (syswrap-generic.c:1152) ==32037== by 0x380FC750: vgPlain_client_syscall (syswrap-main.c:1586) ==32037== by 0x380F902A: handle_syscall (scheduler.c:1103) ==32037== by 0x380FA706: vgPlain_scheduler (scheduler.c:1416) ==32037== by 0x38109F90: thread_wrapper (syswrap-linux.c:103) ==32037== by 0x38109F90: run_a_thread_NORETURN (syswrap-linux.c:156) sched status: running_tid=1 Thread 1: status = VgTs_Runnable ==32037== at 0x4F36577: getsockname (syscall-template.S:81) ==32037== by 0x401406: main (getsockname01.c:124) --- # `getsockname` `int getsockname(int sockfd, struct sockaddr *addr, socklen_t *addrlen);` The addrlen argument should be initialized to indicate the amount of space (in bytes) pointed to by addr. On return it contains the actual size of the socket address. `coregrind/m_syswrap/syswrap-generic.c`: static UInt deref_UInt ( ThreadId tid, Addr a, const HChar* s ) { UInt* a_p = (UInt*)a; PRE_MEM_READ( s, (Addr)a_p, sizeof(UInt) ); if (a_p == NULL) return 0; else return *a_p; } void ML_(buf_and_len_pre_check) ( ThreadId tid, Addr buf_p, Addr buflen_p, const HChar* buf_s, const HChar* buflen_s ) { if (VG_(tdict).track_pre_mem_write) { UInt buflen_in = deref_UInt( tid, buflen_p, buflen_s); if (buflen_in > 0) { VG_(tdict).track_pre_mem_write( Vg_CoreSysCall, tid, buf_s, buf_p, buflen_in ); } } } --- # Patch --- a/coregrind/m_syswrap/syswrap-generic.c +++ b/coregrind/m_syswrap/syswrap-generic.c @@ -1138,7 +1138,7 @@ static UInt deref_UInt ( ThreadId tid, Addr a, const HChar* s ) { UInt* a_p = (UInt*)a; PRE_MEM_READ( s, (Addr)a_p, sizeof(UInt) ); - if (a_p == NULL) + if (a_p == NULL || ! ML_(safe_to_deref) (a_p, sizeof(UInt))) return 0; else return *a_p; --- ## Easy Hacks - Figure out why there are failing tests. - Create the canonical git mirror. - Clean up the default suppressions. - Check vg_preload functions for different arches. - Write normal C functions for `add_hardwired_spec`. - Convince distros of `README_PACKAGERS` (ld.so symtab) - Find and implement missing syscalls (or ioctls). `README_MISSING_SYSCALL_OR_IOCTL` - Find and fix bad syscall wrappers. (Wrap LTP)