A few people have asked about this in #slackware lately so I thought I'd put down a few notes on how I got UTF-8 up and running. This howto applies to Slackware but you should be able to follow these directions on any *nix with a few path adjustments.
In these days of high powered X desktop environments with all their flashy composite bells and whistles it may seem strange to worry about how the plain Linux console behaves outside of X, but we need to remember that there are some very nice applications that live in the console and can be just as productive as (or even more productive than) their X counterparts. I'm thinking of screen, irssi, mutt, elinks, vim/vi, mplayer (using svga), links, zgv and slrn to name but a few. And of course we have to mention Slackware tools like pkgtool, slackpkg and sbopkg. Also, if you are running any sort of headless box you will probably need to ssh into it and use some of these cli tools. And once you get hooked on using the console it is hard to go back. Anyway here we go ...
The first thing to do is configure the console. There is a kernel parameter, vt.default_utf8=1, that we can add to /etc/lilo.conf to do this.
In fact, when you install Slackware you are asked if you want a unicode enabled terminal, and if you confirm then this is the parameter that gets added to lilo.conf.
Here's an example from mine:
# Linux bootable partition config begins
image = /boot/vmlinuz-188.8.131.52-custom
  root = /dev/hda1
  label = Linux-custom
  append="vt.default_utf8=1"
  read-only
# Linux bootable partition config ends
Once you have saved lilo.conf and run lilo, a quick reboot is needed.
For grub, append the parameter to the end of the kernel line in grub.conf or menu.lst.
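After the reboot it is worth checking that the kernel really picked the parameter up. Here is a rough sketch of how I would do that by looking at /proc/cmdline. The function name utf8_param_status is just something made up for this example, not a standard tool.

```shell
#!/bin/sh
# utf8_param_status is a hypothetical helper (not part of any package).
# It prints "set" if the given kernel command line contains the exact
# word vt.default_utf8=1, otherwise it prints "not set".
utf8_param_status() {
    case " $1 " in
        *" vt.default_utf8=1 "*) echo "set" ;;
        *) echo "not set" ;;
    esac
}

# Check the command line of the running kernel.
utf8_param_status "$(cat /proc/cmdline 2>/dev/null)"
```

If it reports "not set", check that lilo was re-run after editing lilo.conf and that you booted the right label.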
After that we need to find a font that actually contains the characters you want to see. My 'font of the moment' is lat9w-16. This contains the British £ (pound) and € (euro) symbols that I need, plus a lot of useful accented characters. It is also important to have the correct drawing characters so that curses programs like pkgtool and sbopkg are printed on the screen correctly. If you experiment with the setconsolefont command you will find that some fonts draw curses box borders as squares or question marks, so check out some fonts and see which ones work and which don't. The setconsolefont command will put your chosen font into /etc/rc.d/rc.font so it will load at bootup.
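To compare fonts quickly, something like the following little test card can help. It is only a sketch: show_glyph_test is a made-up name, and the characters are simply the ones I care about (£, €, a couple of accented letters and the box borders that curses programs draw), so swap in your own.

```shell
#!/bin/sh
# show_glyph_test is a hypothetical helper: it prints a few symbols and a
# small box so you can see at a glance whether the current console font
# has the glyphs (missing ones usually show up as squares or question marks).
show_glyph_test() {
    printf '%s\n' 'Symbols:  £ € è é'
    printf '%s\n' '┌──────┐'
    printf '%s\n' '│ box  │'
    printf '%s\n' '└──────┘'
}

show_glyph_test
```

You can try a font temporarily with setfont from the kbd package (for example setfont lat9w-16, assuming that font is installed), run show_glyph_test, and only then make the choice permanent with setconsolefont.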
Ok, great, you can now read characters correctly, but you need the correct keymap to be able to type them. That's a little harder. If you can find a UTF-8 keymap for your locale/hardware then that's fine. Personally I didn't find any of the installed keymaps suitable so I looked around for an alternative. I found a uk-utf8 keymap on the intertubes, but this was unfortunately missing all the Ctrl+[a-z] keys so I edited it and added my own. I also added some AltGr+[a-z] for accented and other characters. You may find this keymap useful as a starting point to create your own if you cannot find a more suitable one. Have a look through the comments and the codes and you should pick up the method. To add your own combinations you will need to know the correct codes for the characters you need. You can find a table of UTF-8 characters here:
Look up the character's code point in that table and convert it to hex. E.g. è is 232 in decimal, E8 in hex; é is 233 in decimal, E9 in hex, so the entry in your keymap for 'e' would be:
# key      = normal shifted AltGr+e AltGr+shift+e
keycode 18 = e E U+00E8 U+00E9    # prints: e E è é
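If you don't fancy doing the decimal to hex conversion by hand, printf can do it for you. A minimal sketch, where to_keysym is just a name I picked for the example:

```shell
#!/bin/sh
# to_keysym is a hypothetical helper: it turns a decimal code point into
# the U+XXXX form used in the keymap entry above.
to_keysym() {
    printf 'U+%04X\n' "$1"
}

to_keysym 232   # prints U+00E8 (è)
to_keysym 233   # prints U+00E9 (é)
```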
You can find the keycodes with the showkey program. Once edited you can load the map with loadkeys. Loadkeys will update /etc/rc.d/rc.keymap but you will need to copy the keymap to /usr/share/kbd/keymaps/i386/qwerty/ so it is accessible during bootup (assuming you are using a qwerty keymap). Here is a link to my updated uk-utf8 map:
You can load this ungzipped but it's better to gzip it back after editing to keep things consistent.
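The gzip and copy step could be wrapped up roughly like this. Treat it as a sketch under a few assumptions: install_keymap is a made-up helper, uk-utf8.map is only my guess at what you called your edited file, and the default directory is the qwerty path mentioned above.

```shell
#!/bin/sh
# install_keymap FILE [DIR] is a hypothetical helper.
# It writes a gzipped copy of FILE into DIR (default: the qwerty keymap
# directory), leaves the original untouched, and prints the new path.
install_keymap() {
    src=$1
    dest_dir=${2:-/usr/share/kbd/keymaps/i386/qwerty}
    dest="$dest_dir/$(basename "$src").gz"
    gzip -c "$src" > "$dest" && echo "$dest"
}

# Typical use, as root (the file name is an assumption):
#   loadkeys ./uk-utf8.map && install_keymap ./uk-utf8.map
```

Loading the map with loadkeys first means a typo in the keymap shows up before anything lands in the system directory.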
By now you should have a fully usable keyboard/console correctly printing unicode characters. You will probably need to tell some programs like mutt and irssi that you are using a UTF-8 system.
Note: There exist two shell scripts, unicode_start and unicode_stop. Typing unicode_start <font> will load the required font and set up the keyboard correctly for unicode input.
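Many programs work out the character set from the locale environment variables, so before digging into per-program settings it is worth checking those. This is only a rough sketch and an assumption on my part about where to look first: locale_is_utf8 is a made-up helper, and it only inspects LC_ALL, LC_CTYPE and LANG (in that order of precedence) rather than asking the C library.

```shell
#!/bin/sh
# locale_is_utf8 [VALUE] is a hypothetical helper.
# With an argument it tests that string; without one it tests the
# effective locale setting (LC_ALL, then LC_CTYPE, then LANG).
# It prints "yes" if the value names a UTF-8 locale, otherwise "no".
locale_is_utf8() {
    effective=${1:-${LC_ALL:-${LC_CTYPE:-${LANG:-}}}}
    case "$effective" in
        *[Uu][Tt][Ff]-8*|*[Uu][Tt][Ff]8*) echo "yes" ;;
        *) echo "no" ;;
    esac
}

locale_is_utf8
```

If this prints "no", setting something like LANG=en_GB.UTF-8 (pick your own locale) is usually the first thing to try; programs that still misbehave may need their own charset option set as well.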