20050623

More on Building Python for Solaris 10 on SPARC

Continuing the building Python saga.

Surprisingly, Python and g++ don't play well together on Solaris 10. For example, this simple file:

#include "Python.h"
#include
Will not compile with g++, spewing out errors like:
$ g++ -m64 -I/usr/local/64/include/python2.4 -c test.cc
In file included from /usr/local/64/include/python2.4/Python.h:8,
from test.cc:1:
/usr/local/64/include/python2.4/pyconfig.h:844:1: warning: "_XOPEN_SOURCE" redefined
:84:1: warning: this is the location of the previous definition
In file included from test.cc:2:
[...]../include/c++/3.4.4/cwchar:145: error: `::btowc' has not been declared
[...]../include/c++/3.4.4/cwchar:150: error: `::fwide' has not been declared
...

This is a result of g++ requiring _XOPEN_SOURCE to be 500 (bugzilla; it looks like this won't be fixed), whereas Python defines it to be 600. This is corrected in Python's configure script on Solaris 8 and 9, but not 10.

Unfortunately, Python's configure also defines _XOPEN_SOURCE_EXTENDED, which Solaris 10's feature-test header (/usr/include/sys/feature_tests.h) uses to force the ABI version backwards. Anyway, apply the patches listed here (which should be in the next versions of Python), then re-configure, re-build and re-install Python, and everything is happy again!
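To see which feature-test macros an installed pyconfig.h actually sets, something like the following could be used. It is only a sketch: the helper name feature_macros and the sample header text are made up for illustration, and a real pyconfig.h is much longer.

```python
import re

# Hypothetical helper, not part of Python's build system.
# Matches "#define _XOPEN_SOURCE ..." and "#define _XOPEN_SOURCE_EXTENDED ...".
FEATURE_MACRO = re.compile(
    r"^\s*#\s*define\s+(_XOPEN_SOURCE(?:_EXTENDED)?)\b[ \t]*(.*)$",
    re.MULTILINE,
)

def feature_macros(header_text):
    """Return {macro name: value string} for the _XOPEN_SOURCE* defines found."""
    return {name: value.strip() for name, value in FEATURE_MACRO.findall(header_text)}

# Illustrative input only, modelled on the values described above.
sample = """\
#define _XOPEN_SOURCE 600
#define _XOPEN_SOURCE_EXTENDED 1
"""

print(feature_macros(sample))
```

To run it against the real header, pass in the contents of the file named in the g++ warning (/usr/local/64/include/python2.4/pyconfig.h) instead of the sample string.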

20050615

Building Python for Solaris 10 on SPARC

Building Python seems to be harder than it ought to be, mainly because, although it ships with an autoconf-generated configure script, the build does not use the configured values consistently.

I want both 32- and 64-bit builds of Python, I want them to co-exist happily in /usr/local, and I want shared library versions. Is that too much to ask? Possibly.

Hopefully these notes will help someone else in the future.

Here's the configure command I had to use to get Python 2.4.1 to compile using the Sun Studio 9 compilers:

$ bash ../Python-2.4.1/configure CCSHARED="-KPIC" \
LDSHARED="cc -xtarget=native -G" LDFLAGS="-xtarget=native" \
CC="cc" CPP="cc -xtarget=native -E" BASECFLAGS="-xtarget=native" \
OPT="-xO5" CFLAGS="-xtarget=native" CXX="CC -xtarget=native" \
--prefix=$HOME/python-32-2.4.1 --enable-shared --without-gcc \
--disable-ipv6
And that was just the 32-bit version.

  • I have to run configure from bash rather than the default shell, since it relies on test -e, which doesn't exist in Solaris sh.
  • CCSHARED is required to make it build position-independent code. Why configure didn't detect this, I don't know.
  • There appears to be a bug in Sun Studio 9 that requires a plethora of -xc99=none flags to work around problems with configure and the Solaris 10 headers. Using Sun Studio 10 fixes that.
  • Why BASECFLAGS? I don't know. It's undocumented, but it seems to be required (it appears to be used by the module builder, so surely it should be documented!).
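Once a build is installed, the values that configure recorded can be read back from the interpreter itself, which is a quick way to confirm that settings such as CCSHARED and BASECFLAGS really took effect. This is a minimal sketch, assuming a recent Python with the standard sysconfig module. On Python 2.4 the equivalent call is distutils.sysconfig.get_config_var. The helper name recorded_build_settings is made up for illustration.

```python
import sysconfig

# Variable names taken from the configure command line above.
NAMES = ("CC", "CCSHARED", "LDSHARED", "BASECFLAGS", "OPT")

def recorded_build_settings(names=NAMES):
    """Map each name to the value recorded at build time (None if not recorded)."""
    return {name: sysconfig.get_config_var(name) for name in names}

for name, value in recorded_build_settings().items():
    print("%-10s %r" % (name, value))
```

A value of None means the variable was not recorded for that build, which is itself useful to know.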

Then it's the usual:

$ gmake -j 6
$ gmake test
$ gmake install
to build and install (getpwd fails the test for some reason).

To get a 64-bit version to build, here's the configure command:

$ bash ../Python-2.4.1/configure CCSHARED="-KPIC" \
LDSHARED="cc -xtarget=native64 -G" LDFLAGS="-xtarget=native64" \
CC="cc" CPP="cc -xtarget=native64 -E" \
BASECFLAGS="-xtarget=native64" OPT="-xO5" \
CFLAGS="-xtarget=native64" CXX="CC -xtarget=native64" \
--prefix="$HOME/python-64-2.4.1/64" --enable-shared \
--without-gcc --disable-ipv6
Note the similarity with the previous 32-bit configure, except:
  • I've used a different prefix, so the 32-bit and 64-bit versions can be tested separately before being combined with stow.
  • In fact, the prefix is now completely independent of all 32-bit programs. Try as I might, several times, over several days, I simply could not make a multi-arch build that would live together happily (e.g., using the --with-suffix=-64, --libdir and --includedir options, setting LIBDIR etc. manually).
  • I don't use --exec-prefix since the Python Makefile has hard-coded expansions for some variables. For example, the pyconfig.h header and the libpython2.4.so library are installed in precisely the wrong place, instead of paying attention to the libdir and includedir arguments given to configure. This happens because Makefile.pre.in contains the following:
    LIBDIR=  $(exec_prefix)/lib
    MANDIR= @mandir@
    INCLUDEDIR= @includedir@
    CONFINCLUDEDIR= $(exec_prefix)/include
    SCRIPTDIR= $(prefix)/lib
    So you cannot override the locations of LIBDIR, CONFINCLUDEDIR, and SCRIPTDIR to be precisely where they should be.

There is also a bug in the build scripts, with an easy fix:

--- Lib/distutils/command/build_scripts.py~     2004-11-11 09:23:15.000000000 +1100
+++ Lib/distutils/command/build_scripts.py 2005-06-15 12:50:51.925373000 +1000
@@ -104,7 +104,7 @@
                         outf.write("#!%s%s\n" %
                                    (os.path.join(
                             sysconfig.get_config_var("BINDIR"),
-                            "python" + sysconfig.get_config_var("EXE")),
+                            "python" + str(sysconfig.get_config_var("EXE"))),
                                     post_interp))
                     outf.writelines(f.readlines())
                     outf.close()
This is probably because the suffix -64 is interpreted as a number, so it cannot be concatenated to the string "python" without the str() call. This doesn't really matter to me anymore, since I now install in a completely separate prefix which is independent of all 32-bit programs.
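The failure mode can be reproduced in isolation. The sketch below assumes, as guessed above, that numeric-looking configuration values are converted to integers when they are loaded. The function makefile_value is a stand-in written for this example, not the actual distutils code.

```python
def makefile_value(raw):
    """Stand-in (an assumption) for a config loader that turns numeric-looking strings into ints."""
    try:
        return int(raw)
    except ValueError:
        return raw

exe = makefile_value("-64")   # the suffix comes back as a number, not a string

try:
    interpreter = "python" + exe        # what the unpatched line effectively does
except TypeError as err:
    print("unpatched:", err)

interpreter = "python" + str(exe)       # the patched form
print("patched:", interpreter)
```

A suffix such as ".exe" is left as a string by the same conversion, which would explain why the problem only shows up with a suffix that looks like a number.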

The build, test and install steps also required some rejigging, because the (undocumented) RUNSHARED variable is hard-coded rather than configurable, so I override it on the gmake command line.

$ gmake -j 6 RUNSHARED="LD_LIBRARY_PATH_64=`pwd`"
$ gmake RUNSHARED="LD_LIBRARY_PATH_64=`pwd`" test
$ gmake RUNSHARED="LD_LIBRARY_PATH_64=`pwd`" install
LD_LIBRARY_PATH_64 is the Solaris 64-bit linker library path.

I then create symlinks so the 64-bit binaries are available:

$ cd $HOME/python-64-2.4.1
$ mkdir bin
$ cd bin
$ ln -s ../64/bin/python python-64
$ ln -s ../64/bin/python2.4 python2.4-64
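A quick way to confirm that a given binary (for example, the one behind the python-64 symlink) really is the 64-bit build is to ask it for its pointer size. This is a small sketch, and the function name pointer_bits is made up for illustration.

```python
import struct
import sys

def pointer_bits():
    """Width of a C pointer in the running interpreter, in bits."""
    return struct.calcsize("P") * 8

print("%s is a %d-bit build" % (sys.executable, pointer_bits()))
```

Run with each interpreter in turn, this should report 32 for the first build and 64 for the second.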

Both the 32- and 64-bit packages are then stow-ed into /usr/local.

Since the 64-bit package puts the shared libraries in a new directory, we have to add this directory to the runtime linking path:

# crle -64 -u -l /usr/local/64/lib

20050609

Permission weirdness in Solaris 10

I recently upgraded our two Sun machines from Solaris 9 to Solaris 10. That is a story in itself, which I probably won't bother to tell. In short, one of the machines upgraded fine using lu(1M), while the other had no end of problems; I finally upgraded it by doing a full install into a new partition and copying across the various changes we had made (e.g., NIS+ user database, NFS exports, sendmail, NTP, backups, cron, mailman, etc.).

Anyway, it seems pretty happy now, which is good.

I was doing some testing, and I had the following behaviour:

$ cd /opt/tmp
$ mkdir test
$ rmdir test
$ mkdir test
$ rm -r test
rm: cannot determine if this is an ancestor of the current working directory
tmp
$
That is, I could remove a directory using rmdir but not using rm -r (this was as an unprivileged user).

I did a truss(1) and got the following:

...
lstat64("tmp", 0xFFBFF6C8) = 0
resolvepath("tmp", "tmp", 1024) = 3
getcwd("/opt/tmp", 1024) = 0
open64(".", O_RDONLY) = 3
stat64(".", 0xFFBFF1D0) = 0
chdir("..") = 0
lstat64(".", 0xFFBFF1D0) = 0
chdir("..") Err#13 EACCES [file_dac_search]
...

After doing some googling I decided it was probably a mount-related problem: /opt is a separate partition to /. Perhaps the permissions of the /opt mount point in the / partition were bad.

The visible permissions of /opt were fine:

$ ls -ald /opt
drwxr-xr-x 24 root sys 512 May 6 16:05 /opt
$

But how to check and/or change the permissions of a mount point without unmounting the target? This would have meant stopping a whole heap of services that are running from /opt.

Jason suggested using NFS to mount / by itself in some other directory and checking it there:

# share -F nfs -o rw=localhost,root=localhost /
# mount -F nfs -o vers=3 localhost:/ /mnt
The -o vers=3 is to get around some permission weirdness with NFS version 4 (the default version in Solaris 10). Perhaps I need to set NFS_MAPID_DOMAIN to something sensible (like anu.edu.au) in /etc/default/nfs.
# ls -ld /mnt/opt
drwx------ 2 root other 512 May 4 11:48 /mnt/opt
Sure enough, the permissions for /mnt/opt were wrong. To fix:
# chmod go+rx /mnt/opt
# umount /mnt
# unshare /

And everything is fine, I can now do rm -r within /opt.
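The same check can be scripted, which is handy if there are several mount points to inspect through the temporary NFS mount. This is a minimal sketch. The function names are made up for illustration, and the two modes tested are the ones from the ls output above.

```python
import os
import stat

def blocks_other_users(mode):
    """True if a directory mode lacks read or search permission for group or others."""
    wanted = stat.S_IRGRP | stat.S_IXGRP | stat.S_IROTH | stat.S_IXOTH
    return (stat.S_IMODE(mode) & wanted) != wanted

def path_blocks_other_users(path):
    """Apply the same test to a real path, e.g. a mount point seen via /mnt."""
    return blocks_other_users(os.stat(path).st_mode)

print(blocks_other_users(0o40700))  # drwx------, the hidden /mnt/opt
print(blocks_other_users(0o40755))  # drwxr-xr-x, the visible /opt
```

The first call reports a problem and the second does not, matching what the chmod go+rx fixed.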

How did this happen, and why didn't I notice it before? This was on the machine that was live-upgraded, so perhaps the permissions have been "wrong" for quite some time and it didn't matter in Solaris 9 (i.e., this particular permissions corner case wasn't checked by the kernel), or perhaps there is a bug in live-upgrade itself that causes mount points to get weird permissions.