The Linux package upgrade nightmare, part 1


We all spend far too much of our time doing sysadmin. I'm upgrading and, as usual, it's far more work than it should be. I have a long-term plan for this, but right now I want to talk about one of Linux's greatest flaws: the dependencies in the major distributions.

When Unix/Linux began, installing free software consisted of downloading it, getting it to compile on your machine, and then installing it, hopefully with its install scripts. This can always be made to work, but much can go wrong. It's also lots of work and it's too disconnected a process. Linux distributions, starting with Red Hat, moved to the idea of precompiled binary packages and a package manager. That was later developed into an automated system where you can just say, "I want package X" and, with a single command, it downloads and installs that program and everything else it needs to run. When it works, it "just works," which is great.

When you have a fresh, recent OS, that is. Because when packagers build packages, they usually do so on a recent machine, typically fully updated. And the package tools then decide the new package "depends" on the latest version of all the libraries and other tools it uses. You can't install it without upgrading all the other tools, if you can do this at all.

This would make sense if the packages really depended on the very latest libraries. Sometimes they do, but more often they don't. However, nobody wants to test extensively with old libraries, and serious developers don't want to run old distributions, so this is what you get.
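To make the gap between a recorded dependency and a real one concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the library "libfoo", the version numbers, and the function names are invented for illustration and do not come from any real package tool, which would use far more elaborate version rules.

```python
# Hypothetical illustration: none of these names come from a real package tool.

def parse_version(text):
    """Turn '2.3.6' into (2, 3, 6) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def can_install(installed, required_minimum):
    """A package installs only if the installed library is new enough."""
    return parse_version(installed) >= parse_version(required_minimum)


# The (made-up) library version on an aging system.
installed_libfoo = "2.3.6"

# What the program truly needs, versus what the build machine happened to have.
truly_needed = "2.1.0"
recorded_by_packager = "2.5.0"

print(can_install(installed_libfoo, truly_needed))          # True
print(can_install(installed_libfoo, recorded_by_packager))  # False
```

The program would run fine on the old library, but the recorded dependency says otherwise, so the package manager refuses.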

So as your system ages, if you don't keep it fully up to date, you run into a serious problem. At first you will find that if you want to install some new software, or upgrade to the latest version to get a fix, you also have to upgrade a lot of other stuff that you don't know much about. Most of the time, this works. But sometimes the other upgrades are hard, or hit a problem you don't have time to deal with.

However, as your system ages more, it gets worse. Once you are no longer running the most recent distribution release, nobody is even compiling for your old release any more. If you need the latest release of a program you care about, in order to fix a bug or get a new feature, the package system will no longer help you. Running that new release or program requires a much more serious update of your computer, with major libraries and more -- in many ways the entire system. And so you do that, but you need to be careful. This often goes wrong in one way or another, so you must only do it at a time when you would be OK not having your system for a day, and taking a day or more to work on things. No, it doesn't usually take a day -- but it might. And you have to be ready for that rare contingency. Just to get the latest version of a program you care about.

Compare this to Windows. By and large, most binary software packages for Windows will install on very old versions of Windows. Quite often they will still run on Windows 95, long ago abandoned by Microsoft. Win98 is still supported. Of late, it has been more common to get packages that insist on 7-year-old Windows 2000. It's fairly rare to get something that insists on 5-year-old Windows XP, except from Microsoft itself, which wants everybody to need to buy upgrades.

Getting a new program for your 5-year-old Linux is very unlikely. This is tolerated because Linux is free. There is no financial reason not to have the latest version of any package. Windows coders won't make their program demand Windows XP because they don't want to force you to buy a whole new OS just to run their program. Linux coders forget that the price of the OS is often a fairly small part of the cost of an upgrade.

Systems have gotten better at automatic upgrades over time, but still most people I know don't trust them. Actively used systems acquire bit-rot over time, and things start going wrong. If they're really wrong you fix them, but after a while the legacy problems pile up. In many cases a fresh install is the best solution, even though a fresh install means a lot of work recreating your old environment. Windows fresh installs are terrible, and only recently got better.

Linux has been much better at the incremental upgrade, but even there fresh installs are called for from time to time. Debian and its children, in theory, should be able to just upgrade forever, but in practice only a few people are that lucky.

One of the big curses (one I hope to have a fix for) is the configuration file. Programs all have their configuration files. However, most software authors pre-load the configuration file with helpful comments and default configurations. The user, after installing, edits the configuration file to get things as they like, either by hand, or with a GUI in the program. When a new version of the program comes along, there is a new version of the "default" configuration file, with new comments, and new default configuration. Often it's wrong to keep running with your old version of the file, or doing so will slowly build more bit-rot, so your setup doesn't operate as nicely as a fresh one. You have to go in and manually merge the two files.

Some of the better software packages have realized they must divide the configuration -- and even the comments -- made by the package author or the OS distribution editor from the local changes made by the user. Better programs have their configuration file "include" a normally empty local file, or even better all files in a local directory. This does not allow comments but it's a start.
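The "include a local directory" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file name example.conf, the directory example.conf.d, and the helper load_config are all invented here, and the INI-style format is just what Python's standard configparser module happens to read. Real programs each have their own syntax for this.

```python
import configparser
import tempfile
from pathlib import Path


def load_config(default_file, local_dir):
    """Read the packaged defaults, then let each *.conf file in local_dir override them."""
    parser = configparser.ConfigParser()
    # Files are read in order, so values in later files replace earlier ones.
    local_files = sorted(Path(local_dir).glob("*.conf"))
    parser.read([str(default_file)] + [str(path) for path in local_files])
    return parser


# Demonstration with throwaway files (all names are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # The file the package author ships; an upgrade may replace it freely.
    defaults = root / "example.conf"
    defaults.write_text("[server]\nport = 80\nlog_level = info\n")

    # The user's changes live apart from it and are never touched by an upgrade.
    local_dir = root / "example.conf.d"
    local_dir.mkdir()
    (local_dir / "10-local.conf").write_text("[server]\nport = 8080\n")

    config = load_config(defaults, local_dir)
    print(config["server"]["port"])       # 8080
    print(config["server"]["log_level"])  # info
```

Because the user never edits the shipped file, a new version of the package can overwrite it without any manual merge.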

Unfortunately the programs that do this are few, and so any major upgrade can be scary. And unfortunately, the more you hold off on upgrading the scarier it will be. Most individual package upgrades go smoothly, most of the time. But if you leave it so you need to upgrade 200 packages at once, the odds of some problem that diverts you increase, and eventually they become close to 100%.
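A rough back-of-the-envelope calculation shows how fast those odds climb. The sketch below assumes, purely for illustration, that each package upgrade independently has a 1% chance of hitting a problem. Both the figure and the independence are my assumptions, not measurements.

```python
def chance_of_trouble(per_package_failure, package_count):
    """Probability that at least one upgrade hits a problem, assuming independence."""
    return 1 - (1 - per_package_failure) ** package_count


# Assumed 1% chance of a problem per package (an illustrative guess).
for count in (1, 10, 50, 200, 500):
    print(count, round(chance_of_trouble(0.01, count), 3))
```

Under those assumptions, a single upgrade almost never bites you, but 200 at once gives roughly an 87% chance that something goes wrong.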

Ubuntu, which is probably my favourite distribution, has announced that their "Dapper Drake" distribution, from mid-2006, will be supported for desktop use for 3 years, and 5 years for server use. I presume that means they will keep compiling new packages to run on the older base of Dapper, and test all upgrades. This is great, but it's thanks to the generosity of Mark Shuttleworth, who uses his internet wealth to be a fabulous sugar daddy to the Linux and Ubuntu movements. Already the next release, "Edgy," is out, and it's newer and better than Dapper, but with half the support promise. It will be interesting to see what people choose.

When it comes to hardware, Linux is even worse. Each driver works with precisely the one kernel it is compiled for. Woe unto you once you decide to support some non-standard hardware in your Linux box that needs a special driver. Compiling a new driver once isn't hard, until you realize you must do it all again any time you would like to upgrade your kernel even slightly. Most users simply don't upgrade their kernels unless they face a screaming need, like fixing a major bug, or buying some new hardware. Linux kernels come out every couple of weeks for the eager, but few are so eager.

As I get older, I find I don't have the time to compile everything from source, or to sysadmin every piece of software I want to use. I think there are solutions to some of these problems, and a simple first one will be talked about in the next installment, namely an analog of Service Packs.


This is one of the SEVERAL reasons I would never use
Linux, but prefer VMS. (VMS is free for non-commercial use;
for commercial use, it does cost money, but a) you get what
you pay for and b) it more than pays for itself since one needs
far fewer people to keep it running.) I have NEVER had to do
a fresh install. OS upgrades or patches are out-of-the-box.
It just works. I'm even running the latest version of VMS on
hardware which is 15 years old, and of course an executable
produced 15 or 20 years ago will still run on the most modern
hardware, with any version of the OS.

I'm fortunate to have a VMS day job, but I also have VMS at home.
No need to worry about viruses! True, it's not a popular virus
target, but even if targeted it is difficult or impossible to
write a virus.

Some folks might have used VMS several years ago and think it
is somewhat old-fashioned. Actually, it is still being developed
and was recently ported to Itanium hardware.

I am not sure people would switch to VMS to fix this, unless it did what they need it to do. Or to Windows, for that matter.

This is something that Linux and most other systems have to improve. Sysadmin cost is the highest cost of any OS, and the fact that Linux is free-as-in-beer is really a pretty minor part of the cost equation right now.

To my mind, unless you do want to do a lot of things yourself, you need to pick a popular distribution. There is strength in numbers in any OS. With a popular distro (Ubuntu/Debian, Fedora, SuSE) what you're going to get is a large body of people who have been through your problems, and already solved them. That's part of what people like about the package systems: on a big, popular distro, if there is some software you want, it is more likely to be in the package system, so somebody has done the work of configuring and testing it on your system.

But we can do better.

One aspect that I appreciate (at least theoretically; I run Debian Sid) about Ubuntu (and its K- and Edu- sisters) is the completeness of the system install: most users don't want to choose between XFree86 and X.Org, for example. Let the advanced users worry about specialized configurations, rather than leaving it up to everyone to make each choice. And newer machines have enough disk space that installing a "complete" set of libraries and system tools isn't an issue.

If we were further willing to "version" those libraries (i.e. in the file tree, not just within the libraries), we might be able to let applications compile against newer libraries without affecting which version of each particular library other applications used. (And tracking the reverse links could allow an eventual trimming of the library versions. I'd like to see improved support for such capabilities in the file system, but that might cause additional problems, especially in the near term.)

As for kernel upgrades, I have never understood why the "default" binary driver packages didn't simply do the compile and install within the installation script, rather than making me execute such a script manually, line by line - often needing to consult pieces of multiple diverse sources where a script could determine the environment automatically.

I think making custom kernels is still considered a wizard thing in Linux. There are some automatic tools for certain drivers that will do all the work for you.

But even if the package is smart enough to compile with the headers of your current kernel, you need it built against all the kernels you run, and it needs to know that when you upgrade your kernel, all the special drivers must be recompiled for the new kernel. That is a risky step, because they probably have not been tested against that kernel, and the whole idea of a binary package is that you're getting something that's been built and tested at least minimally against what it depends on.
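The bookkeeping involved can be sketched in Python. This is a hypothetical illustration only: the driver names, the kernel version strings, and the function missing_builds are invented, and a real tool would also have to do the actual compiling and testing.

```python
def missing_builds(installed_kernels, drivers, built):
    """List every (driver, kernel) pair that has no compiled module yet."""
    return [
        (driver, kernel)
        for kernel in installed_kernels
        for driver in drivers
        if (driver, kernel) not in built
    ]


# Hypothetical system: an old kernel plus one that was just installed.
kernels = ["2.6.15-27", "2.6.17-10"]
drivers = ["examplecam", "examplewifi"]

# Modules compiled so far, all of them for the old kernel only.
built = {("examplecam", "2.6.15-27"), ("examplewifi", "2.6.15-27")}

print(missing_builds(kernels, drivers, built))
# [('examplecam', '2.6.17-10'), ('examplewifi', '2.6.17-10')]
```

Every new kernel adds a whole row of untested builds, which is exactly the risk described above.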

I don't know what the plans are for the 2.8 Linux kernel, but a more modular driver architecture would be nice.
