From Fedora Project Wiki

Revision as of 18:17, 3 December 2009

Understanding the (Proposed) Change to DSO Linking

Basics

By default, ld currently lets a program use symbols from a shared object that was never named on the link line, as long as that object is listed as a dependency of one that was. This is dangerous: if the listed object is ever changed to drop the dependency your program was relying on, your program breaks without any change to your code.

For example :

libxml2.so has:

 NEEDED            Shared library: [libdl.so.2]
 NEEDED            Shared library: [libz.so.1]

Under the old system, a program that links with libxml2 and uses dlopen does not have to link with libdl explicitly, and a program that links with libxml2 and uses gzopen does not have to link with libz explicitly. While these programs will work, they will break if libxml2 is ever changed to omit the dependency on libdl/libz.
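
To see which libraries a shared object lists as needed, the dynamic section can be dumped with readelf and filtered. The following is a minimal sketch: the helper name needed_libs is made up for illustration, and it assumes the "Shared library: [...]" wording shown in the entries above.

```shell
#!/bin/sh
# Print the sonames from the NEEDED entries of `readelf -d` output on stdin.
# needed_libs is a hypothetical helper name, not part of any package.
needed_libs() {
    sed -n 's/.*NEEDED.*Shared library: \[\(.*\)\].*/\1/p'
}

# Demonstration using the two entries quoted above.
printf '%s\n' \
    ' NEEDED            Shared library: [libdl.so.2]' \
    ' NEEDED            Shared library: [libz.so.1]' | needed_libs
```

On a real system the input would come from something like `readelf -d /path/to/libxml2.so | needed_libs`, where the path is a placeholder for wherever the library is installed.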

What's the difference?

For example (courtesy Roland McGrath):

 ==> foo1.c <==
 #include <stdio.h>
 extern int foo ();
 int
 main ()
 {
   printf ("%d\n", foo ());
 }
 ==> foo2.c <==
 extern int foo ();
 int bar () { return foo (); }
 ==> foo3.c <==
 int foo () { return 0; }


Prepare position-independent code:

gcc -g -fPIC -c foo1.c foo2.c foo3.c

Generate foo3.so:

gcc -shared -o foo3.so foo3.o

Generate foo2.so, linking foo3.so:

gcc -shared -o foo2.so foo2.o foo3.so

The proposed change affects the next step: creating foo1.

Current

A call to gcc will succeed quietly, even though the link to foo3.so is only implicit.

 gcc -o foo1 foo1.o foo2.so -Wl,--rpath-link=.

Proposed

The call to gcc will fail, prompting the user to explicitly link the required shared object.

 gcc -o foo1 foo1.o foo2.so -Wl,--rpath-link=.
/usr/bin/ld: foo1.o: undefined reference to symbol 'foo'
/usr/bin/ld: note: 'foo' is defined in DSO ./foo3.so so try adding it to the linker command line


So, the difference is whether you can refer to a symbol in a DSO that you didn't list explicitly on your link line, but that is a DT_NEEDED dependency of one that you did list (or, recursively, a dependency of one of those dependencies).

The big difference is that with the proposed change in place, ld will no longer resolve symbols through such indirect dependencies by default. Under the current default, a library that your program uses directly can be left off the link line if another library you link against lists it as needed. In abstract terms, if libA is needed by libB and your program requires both libA and libB, your program may get away with linking only to libB. If a later version of libB no longer lists libA as a needed library, rebuilding your program will mysteriously fail.
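
The libA/libB scenario can be acted out with a toy model. This is only a sketch of the lookup rule, not of how ld is implemented. The library names, the symbols a_func and b_func, and the policy labels "current" and "proposed" are all made up for illustration.

```shell
#!/bin/sh
# Toy model only: it mimics the symbol lookup rule, not ld itself.
# Hypothetical setup: libB needs libA; libA defines a_func, libB defines b_func.

# Print the libraries that library $1 lists as needed.
needed_by() {
    case "$1" in
        libB) echo "libA" ;;
        *) ;;
    esac
}

# Succeed if library $1 defines symbol $2.
defines() {
    case "$1:$2" in
        libA:a_func|libB:b_func) return 0 ;;
        *) return 1 ;;
    esac
}

# resolve POLICY SYMBOL LIB...
# "current" also searches the needed libraries of each listed library,
# recursively; "proposed" searches only the libraries that were listed.
resolve() {
    policy=$1; symbol=$2; shift 2
    for lib in "$@"; do
        if defines "$lib" "$symbol"; then return 0; fi
        if [ "$policy" = current ]; then
            # Unquoted on purpose: an empty result must expand to no arguments.
            if resolve current "$symbol" $(needed_by "$lib"); then return 0; fi
        fi
    done
    return 1
}

# The program uses a_func but only lists libB on its link line.
for p in current proposed; do
    if resolve "$p" a_func libB; then
        echo "$p: a_func resolved"
    else
        echo "$p: a_func undefined"
    fi
done
```

In this model, a_func resolves under the current rule and is undefined under the proposed one. Listing libA as well (`resolve proposed a_func libB libA`) succeeds under both.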

What do I do?

If you encounter this error, the error message will prompt you to explicitly link to the DSO that you need. From the foo example, adding foo3.so will get rid of the error:

gcc -o foo1 foo1.o foo2.so foo3.so -Wl,--rpath-link=.

Example: deltarpm

I checked out a 'devel' version of deltarpm from:

:pserver:anonymous@cvs.fedproject.org:/cvs/pkgs

and ran 'make srpm' to produce a source rpm for a mock build. I ran the build with the new binutils rpm (http://roland.fedorapeople.org/ld-test/) so that it would use the version of ld with the proposed changes. The build failed with the following error in the log file:

RPM build errors:
/usr/bin/ld.bfd: rpmdumpheader.o: undefined reference to symbol 'Fopen'
/usr/bin/ld.bfd: note: 'Fopen' is defined in DSO /usr/lib/librpmio.so.0 so try adding it to the linker command line
/usr/lib/librpmio.so.0: could not read symbols: Invalid operation
*** /usr/bin/ld: ld behavior mismatch! ***
*** /usr/bin/ld.bfd succeeeded ***
*** /usr/bin/ld.bfd --no-add-needed exits 1 ***
*** arguments: --eh-frame-hdr --build-id -m elf_i386 --hash-style=gnu -dynamic-linker /lib/ld-linux.so.2 -o rpmdumpheader
/usr/lib/gcc/i686-redhat-linux/4.4.2/../../../crt1.o /usr/lib/gcc/i686-redhat-linux/4.4.2/../../../crti.o /usr/lib/gcc/i686-redhat-linux/4.4.2/crtbegin.o
-L/usr/lib/gcc/i686-redhat-linux/4.4.2 -L/usr/lib/gcc/i686-redhat-linux/4.4.2 -L/usr/lib/gcc/i686-redhat-linux/4.4.2/../../.. rpmdumpheader.o -lrpm -lgcc 
--as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/i686-redhat-linux/4.4.2/crtend.o
/usr/lib/gcc/i686-redhat-linux/4.4.2/../../../crtn.o
collect2: ld returned 1 exit status
make: *** [rpmdumpheader] Error 1


This indicates that the version of deltarpm used was calling into /usr/lib/librpmio.so.0 without explicitly linking to it; the symbol was presumably being resolved only through the -lrpm already on the link line. The solution is to ensure that every binary that uses librpmio.so has -lrpmio on its link line.

Making the following change to the Makefile fixes this problem:

rpmdumpheader: rpmdumpheader.o
-       $(CC) $(LDFLAGS) $^ -lrpm -o $@
+       $(CC) $(LDFLAGS) $^ -lrpm -lrpmio -o $@
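
When a build log contains several such notes, turning each named DSO into a link argument can be scripted. The following is a rough sketch: the function name suggest_link_args is made up, and it assumes the "is defined in DSO ... so try adding it" wording shown in the log above.

```shell
#!/bin/sh
# Read a build log on stdin and print one suggested link argument for each
# DSO named in an "is defined in DSO" note.
# suggest_link_args is a hypothetical helper name.
suggest_link_args() {
    sed -n 's/.*is defined in DSO \([^ ]*\) so try adding it.*/\1/p' |
    while read -r dso; do
        name=${dso##*/}                    # drop the directory: librpmio.so.0
        case "$name" in
            lib*.so*)
                name=${name#lib}           # rpmio.so.0
                echo "-l${name%%.so*}" ;;  # -lrpmio
            *)
                echo "$dso" ;;             # not a lib*.so name: keep the path
        esac
    done | sort -u
}

# Demonstration using the note from the deltarpm log above.
printf '%s\n' \
    "/usr/bin/ld.bfd: note: 'Fopen' is defined in DSO /usr/lib/librpmio.so.0 so try adding it to the linker command line" |
    suggest_link_args
```

For a real build the input would be the log file, for example `suggest_link_args < build.log` (the file name is a placeholder). The suggestions still need to be checked by hand: a -l flag only works if the library's development symlink is on the linker search path.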