On commenting code: external vs internal documentation

In my day job, I sometimes have to write scripts in order to automate things that either happen too often to be handled manually or need to be handled in the shortest amount of time possible.But more often, I am tasked with modifying the behavior of an already existing script and/or adding new functionalities.Needless to say, some scripts are way longer than they should be, and they do things that are not supposed to do in a shell-scripting language (think of doing stuff on/to an XML document, using only the Bash shell and tools from the coreutils package). You might think that stuff like that is crazy, and yes, it is a bit crazy but it works and quite frankly, I got to see some rather clever solutions and approaches to unusual shell-scripting problems, and some very creative uses of more or less known features of Bash.

Call it masochism, but I am quite fond of reading code.

Sometimes though, reading scripts is not so pleasant. As shell script is still programming, I see a recurring pattern of poor programming in the form of poor (or non-existant) comments in the code.

I have come to think that code should really be interspersed with two kinds of documentation, and I like to call them “external” and “internal” documentation.

External documentation is supposed to answer the following questions:

  • what does this script/function do?
  • what are, if any, its side effects?
  • what parameters are needed?
  • can you make an example of the use of this script/function ?

Long story short, external documentation is supposed to help the user of your code use correctly, in particular when using your function the first time. Documenting parameters is particularly important: in dynamic languages like python, the name of a parameter says very little about its type but the type, and in the bash language you don’t even have to list formal parameters: whether you pass two or five of them, they will all be available via their position using variables like $1, $5 and so on.

Internal documentation on the other hand, should really address and describe the internals and the details of the implementation. We could say internal documentation answers the following questions:

  • how is the problem being approached?
  • what is the general strategy ?
  • what’s the meaning/the semantics of the various magic number and magic strings that are in the code?

If you think magic strings are bad, you probably haven’t had the pleasure of writing that many shell scripts. Magic strings are things like utility-specific format strings. Think of the date utility and its many available format strings, or the ps utility (see the “STANDARD FORMAT SPECIFIERS” section in the manpage). Sometimes magic strings can be weird and particular bash syntax.

You don’t see magic strings when you’re writing code because you have just looked it up and it’s obvious to you, but it might not be the case for someone else.

As an example: when formatting a timestamp using the date command, it is a good idea to add a comment with an example output of said format string. Are you writing

date '+%D @ %T'

This is more effective:

date '+%D @ %T'  ## 09/02/17 @ 10:20:53

And this is better commenting:

## date: '+%D @ %T' => '09/02/17 @ 10:20:53'

No need to look it up. You can immediately see what generates what.

 

In the end, as text processing is a significant part of pretty much every system administration task, you can go very far using the standard basic tools from the coreutils package. Once you get to know them, they can be super effective and reasonaby fast for most tasks.

But shell scripting is sometimes underappreciated, and this usually leads to many issue of poor documentation.

Annunci

XDM, fingerprint readers and ecryptfs

how to setup pam/xdm to work correctly with fingerprint login and ecryptfs-encrypted filesystem

I got myself a new toy last week, it’s a ThinkPad X200s. I bought it home for less than 60 euros and for 24 more i got a compatible battery. It’s a nice toy, definitely slow, but I now have something light I can carry around without worrying too much.

Of course, I installed GNU/Linux on it, and to keep things simple I chose Debian with a combination of light applications: XDM as login manager, i3 as window manager, xterm as terminal emulator, GNU Emacs as editor, claws-mail as mail client. The only fat cow on this laptop is Firefox.

As I started saving passwords and stuff on its disk, I decided to encrypt my home folder and my choice for this “simple” setup is ecryptfs. It’s secure enough, simple to setup and integrates very well with other Debian stuff.

This ThinkPad also has a fingerprint reader, correctly recognized by the operating system and I enrolled one of my fingerprints. It’s very comfortable, and it allows me to “sudo su” to root even with people around without fearing shoulder-surfing.

The first login though, it still requires me to input my password as it is needed to unwrap the ecrypts passphrase. And here is where the problem arises: first login is usually done via display manager, namely XDM.

As far as I know XDM, being an old-times thing, has no clue about fingerprint readers and stuff, but the underlying PAM does. And since XDM relies on PAM for authentication… It works, kinda. Basically you either swipe your fingerprint and log in without all your files (you didn’t input your password ⇒ your passphrase was not unwrapped ⇒ your private data was not mounted in your home) or wait for the fingerprint-based login to timeout and XDM to prompt you fro your password.

So if you are okay with waiting 10-15 seconds before being asked for your password then you can stop reading, otherwise keep reading to see how to fix it.

Long story short, PAM in Debian (my guess is that it works pretty much the same for other distributions too) has some nicely formatted, application-specific configuration files under /etc/pam.d . One of those, is /etc/pam.d/xdm and defines how PAM should behave when it’s interacting with XDM.

If you open it, you’ll see it actually does nothing particularly fancy: it just imports some common definition and uses the same settings every application uses that is, try fingerprint first, then fall back on password if it fails or times out.

Such behaviour is defined in /etc/pam.d/common-auth and it is just fine for all other application. For XDM though, it’s advisable in my opinion to ask right out for password and just don’t ask for fingerprint swipe.

My fix for this problem is then to:

  1. open /etc/pam.d/xdm
  2. replace the line importing /etc/pam.d/common-auth with the content of such file
  3. alter the newly added content fo ask right away for the password

tl;dr:

pam-xdm-right

Now XDM is going to ask for the password straight away.

Learn how to use GNU info

Recently I’ve been digging a lot into GNU/Linux system administration and as part of this, I have finally taken some time to google about that mysterious info command that has been sitting here in my GNU/Linux systems, unused for years.

Well, I can tell you, it has been a life-changing experience.

Texinfo-based documentation is awesome.

In this article, I want to share why is info documentation cool and why you should read its documentation if you didn’t already.

First, some terminology.

  • info: the command-line tool you use to read documents written using the texinfo format.
  • Texinfo is a document format and a software package providing tools to manipulate texinfo documents.

Texinfo was invented by the Free Software Foundation to overcome the limitations of manpages such as “un-browseability” (man pages are really documents supposed to be printed to paper but rendered to the terminal instead) and the lack of hyperlinking capabilities in the format.

GNU info was designed as a third level in documentation. If you take a typical program from the free software foundation, you can expect it to have three level of documentations:

  • the –help option: quick, one- or two-screenful of documentation
  • the man page
  • the info manual

Try and do “ls –help”, “man ls” and see the difference.

So info documents are documents that can be divided into sections, browsed only in parts, can have links to other pages of the documentations and can have links to other pieces of documentation as well! Also, they can be viewed with different viewers.

How do I learn to use info ?

Well, if the right way to learn about man is “man man”, the right way to learn about info is “info info”, and indeed such command will teach how to use the info tool.

Basically, you can go browser documents going forwards and backwards between nodes using n/] or p/[. Scrolling happens with space or backspace.

That is really the basic usage. Now go and type “info info” in your terminal.

The game-changer

As I said earlier, Texinfo Documents can be viewed outside the terminal too, while retaining all of their capabilities. I you have ever read some documentation on the webside of the Free Software Foundation then congrats, you have been reading a texinfo document translated to HTML.

For me, the game changer has been reading the GNU Emacs manual (a texinfo document) using GNU Emacs itself! They keystrokes are pretty much the same as the terminal ones, but you get variable-sized fonts and different colors for, say, hyperlinks and stuff like this.

Being able to read the Emacs manual inside Emacs is a game-changer for me because every time I don’t know something about Emacs, I can just start the manual and look it up. Clean and fast, no browser required (my CPUs are thanking me a lot)

Writing Texinfo documents

Here is where, sadly, the story takes an ugly turn.

I didn’t dig this topic very deep, but as far as I’ve seen, Texinfo documents are a major pain to write. The syntax looks quirky, but it seems worth it as Texinfo documents can be exported to HTML and PDF too.

There should be an org-mode plugin to export to Texinfo, but I couldn’t get it to work.

Again, I have to dig this topic a bit more, but it seems quite worth it.