Tuesday, May 19, 2015

Issues with Terminator and GNU Screen

One of the first Linux distros I used was Crunchbang Linux based on Ubuntu 9.04. The default terminal was terminator, and I thought it was so cool how you could split terminal windows vertically and horizontally multiple times using the hotkeys C-S-o (horizontal split) and C-S-e (vertical split). I have continued using terminator on all my personal machines and even in some VMs, where terminator's ability to split a window into multiple sub-windows is very helpful for making the most of limited screen real estate.

When I was a Linux hobbyist, I almost never used ssh or telnet, let alone a serial console cable to connect to other machines, but now as a Linux engineer I often connect remotely to dozens of different machines in one session. Using gnome-terminal with tabs or terminator with split windows has its limits when working with more than a handful of remote machines. I started using GNU Screen as my terminal multiplexer of choice because unlike tmux (which many people prefer to Screen) GNU Screen supports serial console connections with much less fuss than minicom and with much more robustness than putty.

Over the last several months I have launched GNU Screen from within terminator, which does a passable job of supporting screen for simple terminal tasks. The problem, however, is that if you try to use vi/vim in a screen session that is itself launched inside terminator, upon exiting from your vim editing session, GNU Screen's output will only scroll in the top half of the terminator window!

Another weird occurrence when using GNU Screen from within terminator is that after several thousand lines are stored in terminator's scrollback buffer (which I have set to 'unlimited') during a serial console session, Screen will start returning gibberish even though the console speed is set to the correct setting for your hardware.

Check out this totally broken login screen (which is supposed to be a standard /etc/issue banner):

#####################################################################
#  T␤␋⎽ ⎽≤⎽├␊└ ␋⎽ °⎺⎼ ├␤␊ ┤⎽␊ ⎺° ▒┤├␤⎺⎼␋≥␊␍ ┤⎽␊⎼⎽ ⎺┼┌≤.             #
#  I┼␍␋┴␋␍┤▒┌⎽ ┤⎽␋┼± ├␤␋⎽ ␌⎺└⎻┤├␊⎼ ⎽≤⎽├␊└ ┬␋├␤⎺┤├ ▒┤├␤⎺⎼␋├≤, ⎺⎼ ␋┼  #
#  ␊│␌␊⎽⎽ ⎺° ├␤␊␋⎼ ▒┤├␤⎺⎼␋├≤, ▒⎼␊ ⎽┤␉┘␊␌├ ├⎺ ␤▒┴␋┼± ▒┌┌ ⎺° ├␤␊␋⎼    #
#  ▒␌├␋┴␋├␋␊⎽ ⎺┼ ├␤␋⎽ ⎽≤⎽├␊└ └⎺┼␋├⎺⎼␊␍ ▒┼␍ ⎼␊␌⎺⎼␍␊␍ ␉≤ ⎽≤⎽├␊└       #
#  ⎻␊⎼⎽⎺┼┼␊┌.                                                       #
#                                                                   #
#  I┼ ├␤␊ ␌⎺┤⎼⎽␊ ⎺° └⎺┼␋├⎺⎼␋┼± ␋┼␍␋┴␋␍┤▒┌⎽ ␋└⎻⎼⎺⎻␊⎼┌≤ ┤⎽␋┼± ├␤␋⎽    #

The blocks are control characters. The banner should look something like:

#####################################################################
#  This system is for the use of authorized users only.             #
#  Individuals using this computer system without authority, or in  #
#  excess of their authority, are subject to having all of their    #
#  activities on this system monitored and recorded by system       #
#  personnel.
...

This problem can be solved by detaching the screen session with C-a d, exiting the current terminator session, opening an entirely new terminator session and attaching to the detached GNU Screen session with screen -r.

What is interesting is that I have not run into the scrolling bug (in which text only scrolls in the top half of the terminal window) when using vi/vim within a screen session launched from lxterminal. (Note: scrolling issues after invoking vim in a GNU Screen session can be mostly fixed by adding the line altscreen on to your ~/.screenrc.)
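The altscreen setting mentioned in the note above can be added by hand, or with a small helper like the one below. This is only a sketch: the function name enable_altscreen is made up for illustration, and it assumes your screenrc lives at ~/.screenrc unless you pass another path.

```shell
# enable_altscreen [FILE]
# Hypothetical helper: append "altscreen on" to a screenrc file,
# but only if that exact line is not already present.
enable_altscreen() {
    rc_file=${1:-$HOME/.screenrc}
    # Create the file if it does not exist yet so grep has something to read.
    touch "$rc_file"
    # -q: quiet, -x: match the whole line only.
    grep -qx 'altscreen on' "$rc_file" || echo 'altscreen on' >> "$rc_file"
}
```

Running enable_altscreen with no arguments edits ~/.screenrc. The setting only takes effect in Screen sessions started after the change.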

Although I still enjoy using terminator for day-to-day CLI tasks on my local machine, for remote sessions I now use GNU Screen with terminal emulators other than terminator. I have tested GNU Screen with xfce4-terminal and lxterminal, and I suspect that screen will also work well with gnome-terminal without the bugs I experienced with terminator.

Postscript 2015-10-11

After several months of using various terminals with GNU Screen over serial console, I have learned that the serial console text corruption issue, which often occurs after a remote machine reboots while the scrollback buffer is quite full, affects all the vte-dependent terminals I've used, including terminator, lxterminal, and xfce4-terminal.

Thursday, May 14, 2015

ksvalidator - linter for kickstart automated install config files

Unattended installations with anaconda and Kickstart files are a must when installing RHEL/CentOS on multiple machines via PXE. One problem, however, is that you often don't know if you've made a typo or some syntax error within your Kickstart file until the anaconda installer tells you something is wrong with the configuration parameters. Once such an error occurs, you have no choice but to reboot.

This is a big headache when working on servers that have super-long reboot times, like the HP Proliant DL980 Gen 8 and 9 machines that take 10 minutes or more to reboot. Several reboots caused by invalid Kickstart files will easily eat up an hour or two during tight maintenance windows at night.

Thankfully, there is a Python-based linter for Kickstart files called ksvalidator, which is available from the pykickstart package on RHEL and CentOS or the python2-pykickstart package from the AUR on Arch Linux.

The most important option flag is -v (--version), which takes the argument RHELX, where X is a number denoting the RHEL version, e.g. RHEL5, RHEL6, or RHEL7.

If you don't specify a version, it will use the highest current RHEL version by default, which is RHEL7 (as of May 2015).

Here is some sample output:

[archjun@arch pxe]$ ksvalidator ks5_sk_20140701.cfg
The following problem occurred on line 8 of the kickstart file:
Unknown command: key
The following problem occurred on line 24 of the kickstart file:
Unknown command: interactive
The following problem occurred on line 204 of the kickstart file:
Section %packages does not end with %end

Running ksvalidator on a kickstart file for RHEL5 shows the errors above, but these would only be errors according to the kickstart syntax for RHEL7! The command above with no option flags is equivalent to executing ksvalidator -v RHEL7 ...

If we re-run ksvalidator on the same file, this time specifying version RHEL5, no syntax errors are found:

[archjun@arch pxe]$ ksvalidator -v RHEL5 ks5_sk_20140701.cfg
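If you keep several kickstart files for the same release, it may be handy to lint them all in one pass before a maintenance window. The following is a rough sketch, not something that ships with pykickstart: the function name validate_kickstarts is made up, and it assumes that ksvalidator exits with a non-zero status when it reports a problem, which is worth confirming on your version.

```shell
# validate_kickstarts VERSION FILE...
# Hypothetical wrapper: run ksvalidator with an explicit syntax version
# against every file given, and return 1 if any file fails.
# Assumption: ksvalidator exits non-zero when it finds a problem.
validate_kickstarts() {
    version=$1
    shift
    failed=0
    for ks in "$@"; do
        if ! ksvalidator -v "$version" "$ks"; then
            echo "FAILED: $ks" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

For example, validate_kickstarts RHEL5 ks5_*.cfg would check every matching file against the RHEL5 syntax instead of the default version.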

This is all well and good, but you should note that ksvalidator cannot catch logic errors, for example, trying to format a partition with an unsupported filesystem type. For instance, RHEL5.X on Linux kernel 2.6.18-X does not support ext4, but if you try to format a partition as ext4 in a RHEL5.X kickstart file, the linter will not catch the error! ksvalidator also cannot catch the incorrect use of an MBR partition table on a system using UEFI instead of legacy BIOS. In such a case, the kickstart file would have to specify a GPT partition table.

For these types of mistakes, you must create your own error-checking.

Let's look at an excerpt from a problematic kickstart file for RHEL5 containing the following lines:

...
clearpart --initlabel --all
zerombr # no prompt when deleting all partitions
part /boot --fstype ext4 --size=20482 --ondisk=cciss/c0d0 --asprimary
#part /usr/local --fstype ext3 --size=30720 --ondisk=cciss/c0d0
part /usr --fstype ext4 --size=30720 --ondisk=cciss/c0d0
part /var --fstype ext4 --size=18432 --ondisk=cciss/c0d0
part / --fstype ext5 --size=10240 --ondisk=cciss/c0d0 --asprimary
part swap --size=8192 --ondisk=cciss/c0d0 --asprimary
part /workspace --fstype ext3 --size=2048 --ondisk=cciss/c0d0
firstboot --disable
...

There are several problems above. As mentioned earlier, RHEL5.X does not support ext4, so trying to format /boot, /usr and /var as ext4 will cause the anaconda installer to terminate. Another problem is the typo ext5, which is a non-existent filesystem type.

You could do RHEL5.X kickstart filesystem error checking with the following one-liner:

[archjun@arch pxe]$ grep -wi "part" ks5-err-example.cfg | grep -v "ext3"
part /boot --fstype ext4 --size=20482 --ondisk=cciss/c0d0 --asprimary
part /usr --fstype ext4 --size=30720 --ondisk=cciss/c0d0
part /var --fstype ext4 --size=18432 --ondisk=cciss/c0d0
part / --fstype ext5 --size=10240 --ondisk=cciss/c0d0 --asprimary
part swap --size=8192 --ondisk=cciss/c0d0 --asprimary

This returns every part line that does not mention ext3, which covers all the problematic lines (the valid swap line is listed too, since it has no --fstype option). It would be great if ksvalidator added functionality for catching some obvious partitioning errors.
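A slightly stricter variant of the grep one-liner is sketched below. It looks at the actual --fstype value on each part line, so commented-out lines and the swap line are not reported. The function name check_rhel5_fstypes is made up, and the allowed list (ext2, ext3, swap, vfat) is my assumption about what a RHEL5.X install accepts; adjust it for your environment.

```shell
# check_rhel5_fstypes FILE
# Hypothetical helper: print every "part" line whose --fstype value is not
# in the allowed list, and exit 1 if any such line was found.
# Assumption: ext2, ext3, swap and vfat are the acceptable values on RHEL5.X.
check_rhel5_fstypes() {
    awk '
        $1 == "part" || $1 == "partition" {
            fstype = ""
            for (i = 2; i <= NF; i++) {
                # Accept both "--fstype ext3" and "--fstype=ext3".
                if ($i == "--fstype") { fstype = $(i + 1) }
                else if ($i ~ /^--fstype=/) { fstype = substr($i, 10) }
            }
            gsub(/"/, "", fstype)   # strip optional double quotes
            if (fstype != "" && fstype !~ /^(ext2|ext3|swap|vfat)$/) {
                printf "line %d: unsupported fstype %s: %s\n", NR, fstype, $0
                bad = 1
            }
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Run against the excerpt above (check_rhel5_fstypes ks5-err-example.cfg), this flags the /boot, /usr, /var and / lines and skips the swap and /workspace lines. Like the grep version, it knows nothing about disk sizes or device names.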

Thursday, May 7, 2015

Some observations about glibc GHOST vulnerability patching in the field

I currently work as a Linux System Engineer for an open source software service company that provides manpower for various Managed Service Providers (MSPs) serving the Korean telecom industry. I spend most of my days out of the office on service calls ranging from basic server installation and system monitoring to troubleshooting. In the normal course of my work I come into contact with sysadmins with varying levels of Linux familiarity. About 50% of the time the sysadmins I meet are more familiar with Windows or various flavors of Unix like HP-UX, AIX, Solaris, etc. Of course, there are some sysadmins who are quite adept at Linux, too.

Thanks to an itinerant work arrangement in which I visit different sites every day, I have some perspective on the diverse ways that IT staff maintain their Linux servers. 2014 and early 2015 have been a busy time for patching servers with bugs found in Bash, OpenSSL and glibc. Although the glibc GHOST vulnerability was announced in Jan 2015, some of our clients still haven't completed patching all of their servers. Several sysadmins have asked me about the proper rpm commands for upgrading glibc-related packages on RHEL or CentOS. I was a bit worried when they said they used commands like

rpm -Uvh --nodeps pkgName

rpm -i --force --nodeps pkgName

as hacky workarounds for dependency errors that popped up when they tried to update glibc.

For the record, at my employer Growin, we recommend the following steps for upgrading glibc.

1) First identify relevant glibc packages installed on your RHEL/CentOS system

rpm -qa | grep -E "glibc|nscd" | grep -v "compat"

This will return a list of glibc-related packages on your system, excluding the compat- packages, which don't need to be upgraded. For why compat-glibc doesn't need to be patched, see the Red Hat Solutions post Is compat-glibc affected by GHOST, glibc vulnerability CVE-2015-0235? (login required). I quote,

The dynamic libraries provided by the compat-glibc package are not vulnerable because they do not provide runtime code...

2) Although there are more than 16 packages related to glibc that you could install, you most probably don't have the glibc-debug* or glibc-utils packages installed on your system. If these packages did not appear in the results from step 1, do not try to install them (doing so can lead to dependency errors)! Place only the updated RPMs that you need in a separate directory, change into it, and then

rpm -Uvh glibc*

should do the trick. If your system has nscd installed, the above command would become:

rpm -Uvh glibc* nscd*

It is also possible to skip step 1 if you use the rpm -F flag (--freshen), which, according to man rpm:

will upgrade packages, but only ones for which an earlier version is installed
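The two steps can be wrapped up roughly as follows. This is a sketch rather than our official procedure: the function names are made up, and upgrade_glibc assumes the updated RPMs have been copied into the directory you pass in. It uses rpm's --test option for a dry run first, so dependency problems show up before anything is changed.

```shell
# list_glibc_pkgs: filter a package list on stdin (e.g. from `rpm -qa`)
# down to glibc-related packages, leaving out the compat- packages.
list_glibc_pkgs() {
    grep -E 'glibc|nscd' | grep -v 'compat'
}

# upgrade_glibc DIR
# Hypothetical helper: dry-run a freshen of every RPM in DIR, and only
# perform the real upgrade if the dry run reports no problems.
# Assumption: DIR holds only the updated glibc/nscd RPMs you prepared.
upgrade_glibc() {
    dir=$1
    rpm -Fvh --test "$dir"/*.rpm && rpm -Fvh "$dir"/*.rpm
}
```

Typical use would be rpm -qa | list_glibc_pkgs to see what is installed, then upgrade_glibc /path/to/updated-rpms as root. Because -F only touches packages that already have an earlier version installed, stray RPMs in the directory are skipped rather than installed.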

Desktop Linux users of Fedora, CentOS, or other rpm-based distros might wonder why sysadmins go to all the trouble of manually upgrading packages using rpm -Uvh pkgName when they could just do a yum update and upgrade all packages in one go.

There are several good reasons for not using yum update in an enterprise environment. First, some servers are only connected to an internal network. Second, custom applications created by developers may be compiled against a certain version of the C libraries in glibc, so doing a yum update runs the risk of breaking applications. Mainly for the second reason, it is not uncommon to see production servers running really old kernels like 2.6.18 (RHEL 5.X), and I have heard horror stories from coworkers about companies that use even older kernels!