OCR for your old scanner/printer’s E-mails

Yesterday I explained how to make a scrap computer do OCR on your scanner/printer’s scanned PDFs in case you have a SMB file share (a Windows file share) where the printer will write to.

I also promised I would make the E-Mail feature of the printer send E-mails with the PDFs in that E-mail being OCR scanned.

I had earlier explained how you can make your old scanner/printer support modern SMTP servers that have TLS, by introducing a scrap computer running Postfix to forward the E-mails for you. This article depends on that, of course. As we will let the scrap computer now do the OCR part. If you have not yet done that, first do it before continuing here.

I looked at Xavier Merten‘s CockooMX, and decided to massacre it until it would do what I want it to do. Namely call ocrmypdf on the application/pdf attachments and then add the resulting PDF/A (which will have OCR text) to the E-mail.

First install some extra software: apt-get install libmime-tools-perl . It will provide you with MIME::Tools, we will use MIME::Parser and MIME::Entity.

Create a Perl script called /usr/local/bin/ocrpdf.pl (chmod 755 it) that looks like this (which is Xavier’s CockooMX massacred and reduced to what I need – Sorry Xavier. Maybe we could try to make CockooMX have a plugin like infrastructure? But writing what looks suspicious to a database ain’t what I’m aiming for here):


#!/usr/bin/perl

# Copyright note

use Digest::MD5;
use File::Path qw(make_path remove_tree);
use File::Temp;
use MIME::Parser;
use Sys::Syslog;
use strict;
use warnings;

use constant EX_TEMPFAIL => 75; # Mail sent to the deferred queue (retry)
use constant EX_UNAVAILABLE => 69; # Mail bounced to the sender (undeliverable)

my $syslogProgram	= "ocrpdf";
my $sendmailPath	= "/usr/sbin/sendmail";
my $syslogFacility	= "mail";
my $outputDir		= "/var/ocrpdf";
my $ocrmypdf		= "/usr/bin/ocrmypdf";

# Create our working directory
$outputDir = $outputDir . '/' . $$;
if (! -d $outputDir && !make_path("$outputDir", { mode => 0700 })) {
  syslogOutput("mkdir($outputDir) failed: $!");
  exit EX_TEMPFAIL;
}

# Save the mail from STDIN
if (!open(OUT, ">$outputDir/content.tmp")) {
  syslogOutput("Write to \"$outputDir/content.tmp\" failed: $!");
  exit EX_TEMPFAIL;
}
while() {
  print OUT $_;
}
close(OUT);

# Save the sender & recipients passed by Postfix
if (!open(OUT, ">$outputDir/args.tmp")) {
  syslogOutput("Write to \"$outputDir/args.tmp\" failed: $!");
  exit EX_TEMPFAIL;
}
foreach my $arg (@ARGV) {
  print OUT $arg . " ";
}
close(OUT);

# Extract MIME types from the message
my $parser = new MIME::Parser;
$parser->output_dir($outputDir);
my $entity = $parser->parse_open("$outputDir/content.tmp");

# Extract sender and recipient(s)
my $headers = $entity->head;
my $from = $headers->get('From');
my $to = $headers->get('To');
my $subject = $headers->get('Subject');
chomp($from);
chomp($subject);

syslogOutput("Processing mail from: $from ($subject)");

processMIMEParts($entity);
deliverMail($entity);
remove_tree($outputDir) or syslogOuput("Cannot delete \"$outputDir\": $!");

exit 0;

sub processMIMEParts
{
  my $entity = shift || return;
  for my $part ($entity->parts) {
    if($part->mime_type eq 'multipart/alternative' ||
       $part->mime_type eq 'multipart/related' ||
       $part->mime_type eq 'multipart/mixed' ||
       $part->mime_type eq 'multipart/signed' ||
       $part->mime_type eq 'multipart/report' ||
       $part->mime_type eq 'message/rfc822' ) {
         # Recursively process the message
         processMIMEParts($part);
     } else {
       if( $part->mime_type eq 'application/pdf' ) {
         my $type = lc  $part->mime_type;
         my $bh = $part->bodyhandle;
         syslogOutput("OCR for: \"" . $bh->{MB_Path} . "\" (" . $type . ") to \"" . $bh->{MB_Path} . ".ocr.pdf" . "\"" );
         # Perform the OCR scan, output to a new file
         system($ocrmypdf, $bh->{MB_Path}, $bh->{MB_Path} . ".ocr.pdf");
         # Add the new file as attachment
         $entity->attach(Path   => $bh->{MB_Path} . ".ocr.pdf",
                         Type   => "application/pdf",
                         Encoding => "base64");
      }
     }
   }
   return;
}

#
# deliverMail - Send the mail back
#
sub deliverMail {
  my $entity = shift || return;

  # Write the changed entity to a temporary file
  if (! open(FH, '>', "$outputDir/outfile.tmp")) {
    syslogOutput("deliverMail: cannot write $outputDir/outfile.tmp: $!");
    exit EX_UNAVAILABLE;
  }
  $entity->print(\*FH);
  close(FH);

  # Read saved arguments
  if (! open(IN, "<$outputDir/args.tmp")) {
    syslogOutput("deliverMail: Cannot read $outputDir/args.tmp: $!");
    exit EX_TEMPFAIL;
  }
  my $sendmailArgs = ;
  close(IN);
	
  # Read mail content from temporary file of changed entity
  if (! open(IN, "<$outputDir/outfile.tmp")) {
    syslogOutput("deliverMail: Cannot read $outputDir/content.txt: $!");
    exit EX_UNAVAILABLE;
  }
	
  # Spawn a sendmail process
  syslogOutput("Spawn=$sendmailPath -G -i $sendmailArgs");
  if (! open(SENDMAIL, "|$sendmailPath -G -i $sendmailArgs")) {
    syslogOutput("deliverMail: Cannot spawn: $sendmailPath $sendmailArgs: $!");
    exit EX_TEMPFAIL;
  }
  while() {
    print SENDMAIL $_;
  }
  close(IN);
  close(SENDMAIL);
}

#
# Send Syslog message using the defined facility
#
sub syslogOutput {
  my $msg = shift or return(0);
  openlog($syslogProgram, 'pid', $syslogFacility);
  syslog('info', '%s', $msg);
  closelog();
}

Now we just do what Xavier’s CockooMX documentation also tells you to do: add it to master.cf:

Create a UNIX user: adduser ocrpdf

Change the smtp service:

smtp      inet  n       -       -       -       -       smtpd
-o content_filter=ocrpdf

Create a new service

ocrpdf  unix  -       n       n       -       -       pipe
user=ocrpdf argv=/usr/local/bin/ocrpdf.pl -f ${sender} ${recipient}

OCR for your old printer/scanner

Modern printers can do OCR on your scans. But as we talked about last time, aren’t all printers or scanners modern.

We have a scrap computer that is (already) catching all E-mails on a badly configured local SMTP server, to then forward it to a well configured SMTP server that has TLS. Now we also want to do OCR on the scanned PDFs.

My printer has a so called Network Scan function that scans to a SMB file share (that’s a Windows share). The scrap computer is configured to share /var/scan using Samba as ‘share’, of course. The printer is configured to use that share. Note that you might need in smb.conf this for very old printers:

client min protocol = LANMAN1
server min protocol = LANMAN1
client lanman auth = yes
client ntlmv2 auth = no
client plaintext auth = yes
ntlm auth = yes
security = share

And of course also something like this:

[scan]
path = /var/scan
writable = yes
browsable = yes
guest ok = yes
public = yes
create mask = 0777

First install software: apt-get install ocrmypdf inotify-tools screen bash

We need a script to perform OCR scan on a PDF. We’ll here use it in another script that monitors /var/scan for changes. Later in another post I’ll explain how to use it from Postfix’s master.cf on the attachments of an E-mail. Here is /usr/local/bin/fixpdf.sh:

! /bin/sh
a=$1
TMP=`mktemp -d -t XXXXX`
DIR=/var/scan
mkdir -p $DIR/ocr
cd $DIR
TIMESTAMP=`stat -c %Y "$a"`
ocrmypdf --force-ocr "$a" "$TMP/OCR-$a"
mv -f "$TMP/OCR-$a" "$DIR/ocr/$TIMESTAMP-$a"
chmod 777 "$DIR/ocr/$TIMESTAMP-$a"
cd /tmp
rm -rf $TMP

Note that I prepend the filename with a timestamp. That’s because my printer has no way to give the scanned files a good filename that I can use for my archiving purposes. You can of course do this different.

Now we want a script that monitors /var/scan and launches that fixpdf.sh script in the background each time a file is created.

My Xerox WorkCentre 7232 uses a directory called SCANFILE.LCK/ for its own file locking. When it is finished with a SCANFILE.PDF it deletes that LCK directory.

Being bad software developers the Xerox people didn’t use a POSIX rename for SCANFILE.PDF to do an atomic write operation at the end.

It looks like this:

inotifywait -r -m  /var/scan | 
while read file_path file_event file_name; do
echo ${file_path}${file_name} event: ${file_event}
done
Setting up watches. Beware: since -r was given, this may take a while!
Watches established.
/var/scan/ event: OPEN,ISDIR
/var/scan/ event: ACCESS,ISDIR
/var/scan/ event: ACCESS,ISDIR
/var/scan/ event: CLOSE_NOWRITE,CLOSE,ISDIR
/var/scan/ event: OPEN,ISDIR
/var/scan/ event: ACCESS,ISDIR
/var/scan/ event: ACCESS,ISDIR
/var/scan/ event: CLOSE_NOWRITE,CLOSE,ISDIR
/var/scan/XEROXSCAN003.LCK event: CREATE,ISDIR
/var/scan/XEROXSCAN003.LCK event: OPEN,ISDIR
/var/scan/XEROXSCAN003.LCK event: ACCESS,ISDIR
/var/scan/XEROXSCAN003.LCK event: CLOSE_NOWRITE,CLOSE,ISDIR
/var/scan/ event: OPEN,ISDIR
/var/scan/ event: ACCESS,ISDIR
/var/scan/ event: ACCESS,ISDIR
/var/scan/ event: CLOSE_NOWRITE,CLOSE,ISDIR
/var/scan/ event: OPEN,ISDIR
/var/scan/ event: ACCESS,ISDIR
/var/scan/ event: CLOSE_NOWRITE,CLOSE,ISDIR
/var/scan/XEROXSCAN003.PDF event: CREATE
/var/scan/XEROXSCAN003.PDF event: OPEN
/var/scan/XEROXSCAN003.PDF event: MODIFY
/var/scan/XEROXSCAN003.PDF event: MODIFY
...
/var/scan/XEROXSCAN003.PDF event: MODIFY
/var/scan/XEROXSCAN003.PDF event: MODIFY
/var/scan/XEROXSCAN003.PDF event: CLOSE_WRITE,CLOSE
/var/scan/XEROXSCAN003.PDF event: ATTRIB
/var/scan/XEROXSCAN003.LCK event: OPEN,ISDIR
/var/scan/XEROXSCAN003.LCK/ event: OPEN,ISDIR
/var/scan/XEROXSCAN003.LCK event: ACCESS,ISDIR
/var/scan/XEROXSCAN003.LCK/ event: ACCESS,ISDIR
/var/scan/XEROXSCAN003.LCK event: ACCESS,ISDIR
/var/scan/XEROXSCAN003.LCK/ event: ACCESS,ISDIR
/var/scan/XEROXSCAN003.LCK event: CLOSE_NOWRITE,CLOSE,ISDIR
/var/scan/XEROXSCAN003.LCK/ event: CLOSE_NOWRITE,CLOSE,ISDIR
/var/scan/XEROXSCAN003.LCK/ event: DELETE_SELF
/var/scan/XEROXSCAN003.LCK event: DELETE,ISDIR

The printer deleting that SCANFILE.LCK/ directory is a good moment to start our OCR script (call it for example /usr/local/bin/monitorscan.sh):

! /bin/bash
inotifywait -r -m -e DELETE,ISDIR /var/scan |
while read file_path file_event file_name; do
if [ ${file_event} = "DELETE,ISDIR" ]; then
if [[ ${file_name} == *"LCK" ]]; then
suffix=".LCK"
filename=`echo ${file_name} | sed -e "s/$suffix$//"`.PDF
/usr/local/bin/fixpdf.sh $filename &
fi
fi
done

Give both scripts 755 permissions with chmod and now you just run screen /usr/local/bin/monitorscan.sh

When your printer was written by good software developers, it will do POSIX rename. That looks like this (yes, also when done over a SMB network share):

inotifywait -r -m  /var/scan | 
while read file_path file_event file_name; do
echo ${file_path}${file_name} event: ${file_event}
done
Setting up watches. Beware: since -r was given, this may take a while!
Watches established.
/var/scan/ event: OPEN,ISDIR
/var/scan/ event: ACCESS,ISDIR
/var/scan/ event: CLOSE_NOWRITE,CLOSE,ISDIR
/var/scan/ event: OPEN,ISDIR
/var/scan/ event: ACCESS,ISDIR
/var/scan/ event: CLOSE_NOWRITE,CLOSE,ISDIR
/var/scan/ event: OPEN,ISDIR
/var/scan/ event: ACCESS,ISDIR
/var/scan/ event: CLOSE_NOWRITE,CLOSE,ISDIR
/var/scan/ event: OPEN,ISDIR
/var/scan/ event: ACCESS,ISDIR
/var/scan/ event: CLOSE_NOWRITE,CLOSE,ISDIR
/var/scan/.tmp123.GOODBRANDSCAN-123.PDF event: CREATE
/var/scan/.tmp123.GOODBRANDSCAN-123.PDF event: OPEN
/var/scan/.tmp123.GOODBRANDSCAN-123.PDF event: MODIFY
...
/var/scan/.tmp123.GOODBRANDSCAN-123.PDF event: MOVED_FROM
/var/scan/GOODBRANDSCAN-123.PDF event: MOVED_TO

That means that your parameters for inotifywait could be -r -m -e MOVED_TO and in ${file_name} you’ll have that GOODBRANDSCAN-123.PDF. This is of course better than Xerox’s way with their not invented here LCK things that probably also wouldn’t be necessary with a POSIX rename call.

I will document how to do this to the E-mail feature of the printer with Postfix later.

I first need a moment in my life where I actually need this hard enough that I will start figuring out how to extract certain attachment MIME parts from an E-mail with Posix’s master.cf. I guess I will have to look into CockooMX by Xavier Mertens for that. Update: that article is available now.

.

Wrong SMTP config so that your old scanner/printer can send mails

This is a warning: only do this at home, instead of don’t do this at home. On your local home network that is behind a firewall and/or gateway.

Edit: Philip Paeps sent me a few suggestions to restrict things even more. I have adapted this blog item to mention all of them.

Unfortunately there are companies that make and made printers and then don’t provide firmware upgrades for it.

You end up with a scanner/printer that works perfectly fine. Except you can’t find any SMTP servers without TLS anymore. Because rightfully so has more or less every E-mail provider turned off plain text SMTP.

Now your printer cannot send the scanned documents over E-mail anymore. Firmware upgrade of your scanner/printer? What if there is none?

What to do? Run a unencrypted postfix smtpd on some scrap computer on your local network, that relays your mails with its smtp client to a TLS enabled SMTP server.

apt-get install postfix # and select Internet site

I didn’t want fake security by having authentication on the smtpd side, as there will be no encryption between printer and our local postfix. When somebody listens on your local network they would not only have the PDFs that you scanned with your scanner/printer, they will also have those authentication tokens.

I suppose you can add authentication, but then at least don’t be silly and use usernames and passwords that you use somewhere else. Note that some really old scanners/printers also can’t do SMTP with authentication.

I used these relay_restrictions for smtpd. We will place the printer’s IP address in/within mynetworks, so we will relay for it through permit_mynetworks.

smtpd_relay_restrictions = permit_mynetworks reject_unauth_destination

The mydestination should be as restricted as possible so we’ll just empty it (no local delivery at all):

mydestination =
local_recipient_maps =
local_transport = error:local delivery is disabled

You can now also remove the local delivery agent from master.cf if you want. We restrict the mynetworks of course too. The scanner/printer is with DHCP on 192.168.0.0/24, so add that:

mydestination =
mynetworks = 127.0.0.0/8 192.168.0.0/24

Even better is to have your scanner/printer on a fixed IP address and then use that one IP address:

mynetworks = 127.0.0.0/8 192.168.0.11

In fact, when the scrap computer has a fixed IP address then you can further restrict things by using inet_interfaces of course:

inet_interfaces = 192.168.0.14, 127.0.0.1

And now we configure the relayhost, we will relay to our TLS enabled SMTP(s) server:

relayhost = [smtps.server.org]:587
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
smtp_tls_security_level = encrypt
header_size_limit = 4096000
message_size_limit = 524288000

Now make a file /etc/postfix/sasl_passwd to add the username / password. After that of course run postmap /etc/postfix/sasl_passwd

[smtps.server.org]:587 username:password

We now on the printer configure the scrap computer’s local network IP address or DHCP hostname as SMTP server.

Other security considerations that are possible is to place printer and scrap computer on a VLAN or let there be a crossed UTP cable between them. But if you are going to do those things then you also know how to do it yourself. Such things do make sense: your easily hackable LED lamps and even more easily hackable Smart TV don’t need to talk to your printer nor this scrap computer. VLAN everything!

Going around media bans

For the people who want to know how to get around the ongoing media bans:

Do a Google search and type in ‘free ssh country‘. Use as ‘country‘ the country where the media is that you want to reach. Take the first hit. Create the SSH tunnel account that is freely available.

In a console (if you don’t have SSH, you can install it with for example Git Bash, Cygwin, etc on Windows. You can also use Putty of course – make a SOCKS tunnel). Note that the username and hostname will be different (the website will tell you).

ssh -D 1337 -q -C -N user-vpnthathost.com@countryNr.thathost.com

For Firefox: Settings->Network Settings [Settings]->Manual Proxy configuration. Type next to SOCKS Host: localhost, and next to Port: 1337. Next check ‘Proxy DNS when using SOCKS v5’. You have equivalent settings in Chrome and Chromium among other browsers.

There are many options as VPN service. You can also search with Tor-browser if you think the secret services want to kill you or something. It’s not very likely, though. But if you are paranoid, then I guess sure.

Tor-browser itself might by the way also work just fine.

How to make a movie from the history of wikipedia images

Install some software

apt-get install imagej wget imagemagick bash

Get a JSON with all the versions of a image on Wikipedia

Let’s assume you want to watch how the invasion in Ukraine took place. That’s this one:

https://en.wikipedia.org/w/api.php?action=query&titles=File%3A2022_Russian_invasion_of_Ukraine.svg&prop=imageinfo&iilimit=999&iiprop=%7Curl&format=json

We only need the URL, so iiprop=|url. If you need the timestamp (when was the image file made) then use iiprop=|url|timestamp. A list of them you can find here.

You can use wget or in your browser just do save file as (use RAW data at the top then, for example in Firefox).

Fetch the images

#! /bin/bash

mkdir svg
cd svg
ITEMS=`cat api-result.json | jq '.query.pages."-1".imageinfo | .[] | .url'`
for a in $ITEMS
do
wget "$a"
sleep 1
done

Convert them to PNGs:

Normally they are already in the right order. So no renaming should be needed. Else you have to add to the iiprop of the query ‘timestamp’ and then with jg you extract that from the JSON to for example add it somehow to the filenames.

Or just use touch to change the file’s last modification date after wget fetched it and then here you use UNIX find to loop in the right order, and write PNG files like 0001.png, 0002.png, etc.

#! /bin/bash
cd ..
mkdir png
for a in svg/*
do
# You can come up with a better translation for the filename
b=`echo $a | sed s/svg//g`
convert -density 250 -size 1546x1038 $a png/$b.png
done

Convert PNGs to a movie

  • Start imagej
  • File > Import > Image Sequence
  • Select the png directory where the images where converted to
  • File -> Save As -> AVI -> fill in 2 frames per second -> movie.avi

Convert into fake-news, propaganda, etc

  • Use another software to insert dramatic background music
  • Upload to youtube for fame and fortune or whatever
  • Buy likes, become a hipster, convert from hipster to influencer

Improving Qt

We are a few years further. A few years in which we all tried to make a difference.

I’m incredibly proud of my achievement of QTBUG-61928. At the time I thought I could never convince the Qt development team of changing their APIs. They did and today in Qt6 it’s all very much part of the package.

I want to thank Thiago and others. But I also think it’s a team effort. It might not be because of just me. But I still feel a little bit proud of having pushed this team just enough to make the changes.

I am now at a new Qt bug report. This time it’s about int64_t. I think that QModelIndex should be completely supporting it. Again, I think a lot. And I have a lot of opinions. But I anyway filed QTBUG-99312 for this.

Avast, Qt6 announcing new QPromise and QFuture APIs

Qt published its New_Features in Qt 6.0.

Some noteworthy items in their list:

  • QPromise allows setting values, progress and exceptions to QFuture
  • QFuture supports attaching continuations

I like to think I had my pirate-hook in it at least a little bit with QTBUG-61928.

I need to print this out and put it above my bed:

Thiago Macieira added a comment – 
You’re right
Philip Van Hoof added a comment – 
Damn, and I was worried the entire morning that I had been ranting again.
Thiago Macieira added a comment – 
oh, you were ranting. Doesn’t mean you’re wrong.

Thanks for prioritizing this Thiago.

Versionering van je product, dit opbergen in een versie controle systeem

Het komt uiteraard van één of ander forum. Maar ik wilde dit dus toch eventjes vereeuwigen in één van mijn fameuze ontzettend belangrijke blog posts. Zodat de ganse wereld het voor goed zou kunnen naslagen.

Het is ook belangrijk voor stel dat je me onder contract neemt. Zo weet je te minste wat deze zagevent continu in je organisatie zal zeggen tegen iedereen als advies hierover.

Dat doe ik dan vooral omdat ik best wel wat ervaring heb in wat niet werkt (dat vooral) en ook in wat wel werkt. Het is me opgevallen dat er erg veel geprobeerd wordt met alles waar ik ervaring in heb en waarvan ik weet dat het niet werkt.

Maarja. Iedere project manager wil bewijzen dat hij of zij dat kan bestieren dat wat bewezen is niet te werken. Op zichzelf is dat prima. Dan factureer ik als freelancer gewoon meer en langer geld uit zijn project-budget. Maar niettemin zal ik dus altijd het volgende adviseren:

Ze zouden misschien beter semver.org hanteren voor hun versie nummers. Dat maakt het voor andere techneuten eenvoudiger om te volgen:

0.0.z wil zeggen dat het eigenlijk nog niet gereleased is, maar nog volledig in ontwikkeling ligt. Die z ophogingen zijn incrementele stappen voor de ontwikkelaars zelf.

0.y.z wil net hetzelfde zeggen. Maar de ontwikkelaars zijn begonnen met het beoefenen van versionering. Je zou ook kunnen zeggen dat iedere y ophoging betekent dat er een test is gebeurd.

1.0.0 wil zeggen dat mensen buiten het eigen ontwikkel en test-team het product in gebruik (kunnen) nemen. Dat het getest is. Dat het werkt. Dat het stabiel is.

1.0.1 wil zeggen dat er 1 enkele bugfix was op 1.0.0 en dat er enkel die bugfix in zit.

1.0.2 wil zeggen dat er 2 zulke bugfixes zijn gedaan. Nadat 1.0.1 uitgebracht was.

1.1.0 wil zeggen dat er 1 extra feature is toegevoegd aan 1.0.0.

1.1.1 wil zeggen dat er 1 extra feature is toegevoegd aan 1.0.0 en dat er in die feature een bug zat. Of dat er een oud probleem in 1.0.0 zat en dat dat hersteld is. Maar dan brengt men naast de 1.1.1 ook een 1.0.3 uit. Die 1.0.3 heeft niet die 1 extra feature van 1.1.0 maar heeft dan wel de fix die 1.1.1 heeft, gebackport voor 1.0.0 (en eigenlijk voor 1.0.2)

2.0.0 wil zeggen dat er iets gewijzigd is aan de 1.y.z reeks dat projecten die afhankelijk er van zijn kan breken. Bv. een breaking API change (of een breaking ABI). Of men heeft iets weggehaald. Er is een grote wijziging geweest.

2.0.1 wil zeggen dat er 1 bug is gefixed in 2.0.0. 2.1.0 wil zeggen 1 feature toegevoegd aan 2.0.z.

Wil je dat nu in een systeem voor versiecontrole allemaal netjes bijhouden waarbij je de wijzigingen commit per commit en branch per branch en versie per versie en auteur per auteur wil kunnen vergelijken en opvragen, voor om het even welk soort documenten, dan gebruik je gitflow.

Dat er geen gevaar is

Na een zware periode waar ik zelf angst heb gehad, bedacht ik me een paar uur geleden:

Niets blijkt moeilijker te zijn dan te accepteren dat er geen gevaar is.

Ik heb besloten dat dit de nieuwe ondertoon van deze blog wordt. Wat dat juist wil zeggen? Dat een tiental van de komende blog artikels dat als lijn gaan aanhouden.

Clowns to the left of me are prevaricating (uitereraard een verwijzing naar de song die in Reservoir Dogs aanwezig was), is geschiedenis.

Wat was het vorige? Toen dacht ik er vast nog niet zo hard over na. Misschien zou ik dat nu beter ook niet doen? Ik denk te veel over ongeveer alles na.

Dus, daarom de nieuwe ondertitel:

Accepteer, dat er geen gevaar is.

Ik heb hem Nederlands gemaakt. Want de enige groepen die zich in mijn blog interesseren zijn a) jullie of b) misschien staatsveiligheid. Die laatste heeft in dat geval een budget om één en ander te laten vertalen en jullie spreken al Nederlands.

Goed ja. Er is wel wat gevaar natuurlijk. Maar we hebben het eigenlijk erg goed onder controle.

Live backups with VMWare ESXi, and no vSphere

To make live backups of a guest on an ESXi host’s SSH UNIX shell, you need to utilize the fact that when a snapshot of a VMDK file gets made, the original VMDK file turns so called read-only. Releasing the locks that would otherwise withhold vmkfstools from creating a clone.

This means that if you make a snapshot that you can use vmkfstools of the non-snapshot VMDK files from which the snapshot was made.

Let’s get started scripting this.

GUEST="GUESTNAME"
DISKS="$GUEST EXTRADISK"
SRC=/vmfs/volumes/STORAGE/$GUEST
DST=/vmfs/volumes/STORAGE/backup/$GUEST

First get the VmId:

VMID=`vim-cmd vmsvc/getallvms | grep $GUEST | cut -d " " -f -1`

Create a poor man’s backup snapshot on $GUEST:

vim-cmd vmsvc/snapshot.create $VMID backup poor-mans-backup 0 0

Create the clones of the non-snapshot VMDK files (the one without numbers after $DISK)

mkdir -p $DST
for DISK in $DISKS; do
   vmkfstools -i $SRC/$DISK.vmdk $DST/$DISK.vmdk -d sesparse
done

Now remove the snapshots from $GUEST:

vim-cmd vmsvc/snapshot.removeall $VMID

Now, copy the VMX file:

cp $SRC/$GUEST.vmx $DST/$GUEST.vmx

Alternatively you can use ghettoVCB which is a little program that does the same thing.

Automated provisioning with VMWare ESXi

For a Jenkins environment I had to automate the creation of a lot of identical build agents. Identical up until of course the network configuration. Sure I could have used Docker or what not. But the organization standardized on VMWare ESXi. So I had to work with the tools I got.

A neat trick that you can do with VMWare is to write so called guestinfo variables in the VMX file of your guests.

You can get SSH access to the UNIX-like environment of a VMWare ESXi host. In that environment you can do typical UNIX scripting.

First we prepare a template that has VMWare guest tools installed. We punch the zeros of the vmdk file and all that stuff. So that it’s nicely packaged and quick to make clones from. On the guest you do:

dd if=/dev/zero of=/largefile bs=10M ; rm /largefile

On the ESXi host you do:

vmkfstools --punchzero /vmfs/volumes/STORAGE/template/DISK.vmdk

Now you can for example do this (on the ESXi host’s UNIX environment):

SRC=/vmfs/volumes/STORAGE/template
DST=/vmfs/volumes/STORAGE/auto
mkdir -p $DST/$1

# Don't use cp to make copies of vmdk files. It'll just
# take ages longer as it will copy 0x0 bytes too.
# vmkfstools is what you should use instead
vmkfstools -i $SRC/DISK.vmdk $DST/$1/DISK.vmdk -d thin

# Replace some values in the destination VMX file
cat $SRC/TEMPLATE.vmx | sed s/TEMPLATE/$1/g > $DST/$1/$1.vmx

And now of course you add the guestinfo variables:

echo "guestinfo.HOSTN=$1" >> $DST/$1/$1.vmx
echo "guestinfo.EXTRA=$2" >> $DST/$1/$1.vmx

Now when the guest boots, you can make a script to read those guestinfo things out and let it for example configure itself (on the guest):

#! /bin/sh
HOSTN=`vmtoolsd --cmd "info-get guestinfo.HOSTN"`
EXTRA=`vmtoolsd --cmd "info-get guestinfo.EXTRA"`
if test "$EXTRA" = "provision"; then
   echo $HOSTN > /etc/hostname
   reboot
fi

Some other useful VMWare ESXi commands:

# Register the VMX as a new virtual machine
VIMID=`vim-cmd /solo/register $DST/$1/$1.vmx`

# Turn it on
vim-cmd /vmsvc/power.on $VIMID &

# Answer 'Copied' on the question whether it got
# copied or moved
sleep 2
VMMSG=`vim-cmd /vmsvc/message $VIMID | grep "Virtual machine message" | cut -d : -f -1 | cut -d " " -f 4`
if [ ! -z $VMMSG ]; then
    vim-cmd /vmsvc/message $VIMID $VMMSG 2
fi

That should be all you need. I’m sure we can adapt the $1.vmx file such that the question doesn’t get asked. But my solution with answering the question also worked for me.

Next thing we know you’re putting a loop around this and you just ‘programmed’ creating a few hundred Jenkins build agents on some powerful piece of ESXi equipment. Imagine that. Bread on the table and the entire flock of programmers of your customer happy.

But! Please don’t hire me to do your DevOps. I’ve been there before several times. It sucks. You get to herd brogrammers. They suck the blood out of you with their massive ignorance on almost all really simple standard things (like versioning, building, packaging, branching, etc. Anything that must not be invented here). Instead of people who take the time to be professional about their job and read five lines of documentation, they’ll waste your time with their nonsense self invented crap. Which you end up having to automate. Which they always make utterly impossible and (of course) non-standard. While the standard techniques are ten million times better and more easy.

Doing It Right examples on autotools, qmake, cmake and meson

About

I finished my earlier work on build environment examples. Illustrating how to do versioning on shared object files right with autotools, qmake, cmake and meson. You can find it here.

The DIR examples are examples for various build environments on how to create a good project structure that will build libraries that are versioned with libtool or have versioning that is equivalent to what libtool would deliver, have a pkg-config file and have a so called API version in the library’s name.

What is right?

Information on this can be found in the autotools mythbuster docs, the libtool docs on versioning and freeBSD’s chapter on shared libraries. I tried to ensure that what is written here works with all of the build environments in the examples.

libpackage-4.3.so.2.1.0, what is what?

You’ll notice that a library called ‘package’ will in your LIBDIR often be called something like libpackage-4.3.so.2.1.0

We call the 4.3 part the APIVERSION, and the 2.1.0 part the VERSION (the ABI version).

I will explain these examples using semantic versioning as APIVERSION and either libtool’s current:revision:age or a semantic versioning alternative as field for VERSION (like in FreeBSD and for build environments where compatibility with libtool’s -version-info feature ain’t a requirement).

Noting that with libtool’s -version-info feature the values that you fill in for current, age and revision will not necessarily be identical to what ends up as suffix of the soname in LIBDIR. The formula to form the filename’s suffix is, for libtool, “(current – age).age.revision”. This means that for soname libpackage-APIVERSION.so.2.1.0, you would need current=3, revision=0 and age=1.

The VERSION part

In case you want compatibility with or use libtool’s -version-info feature, the document libtool/version.html on autotools.io states:

The rules of thumb, when dealing with these values are:

  • Increase the current value whenever an interface has been added, removed or changed.
  • Always increase the revision value.
  • Increase the age value only if the changes made to the ABI are backward compatible.

The libtool’s -version-info feature‘s updating-version-info part of libtool’s docs states:

  1. Start with version information of ‘0:0:0’ for each libtool library.
  2. Update the version information only immediately before a public release of your software. More frequent updates are unnecessary, and only guarantee that the current interface number gets larger faster.
  3. If the library source code has changed at all since the last update, then increment revision (‘c:r:a’ becomes ‘c:r+1:a’).
  4. If any interfaces have been added, removed, or changed since the last update, increment current, and set revision to 0.
  5. If any interfaces have been added since the last public release, then increment age.
  6. If any interfaces have been removed or changed since the last public release, then set age to 0.

When you don’t care about compatibility with libtool’s -version-info feature, then you can take the following simplified rules for VERSION:

  • SOVERSION = Major version
  • Major version: increase it if you break ABI compatibility
  • Minor version: increase it if you add ABI compatible features
  • Patch version: increase it for bug fix releases.

Examples when these simplified rules are or can be applicable is in build environments like cmake, meson and qmake. When you use autotools you will be using libtool and then they ain’t applicable.

The APIVERSION part

For the API version I will use the rules from semver.org. You can also use the semver rules for your package’s version:

Given a version number MAJOR.MINOR.PATCH, increment the:

  1. MAJOR version when you make incompatible API changes,
  2. MINOR version when you add functionality in a backwards-compatible manner, and
  3. PATCH version when you make backwards-compatible bug fixes.

When you have an API, that API can change over time. You typically want to version those API changes so that the users of your library can adopt to newer versions of the API while at the same time other users still use older versions of your API. For this we can follow section 4.3. called “multiple libraries versions” of the autotools mythbuster documentation. It states:

In this situation, the best option is to append part of the library’s version information to the library’s name, which is exemplified by Glib’s libglib-2.0.so.0 > soname. To do so, the declaration in the Makefile.am has to be like this:

lib_LTLIBRARIES = libtest-1.0.la

libtest_1_0_la_LDFLAGS = -version-info 0:0:0

The pkg-config file

Many people use many build environments (autotools, qmake, cmake, meson, you name it). Nowadays almost all of those build environments support pkg-config out of the box. Both for generating the file as for consuming the file for getting information about dependencies.

I consider it a necessity to ship with a useful and correct pkg-config .pc file. The filename should be /usr/lib/pkgconfig/package-APIVERSION.pc for soname libpackage-APIVERSION.so.VERSION. In our example that means /usr/lib/pkgconfig/package-4.3.pc. We’d use the command pkg-config package-4.3 –cflags –libs, for example.

Examples are GLib’s pkg-config file, located at /usr/lib/pkgconfig/glib-2.0.pc

The include path

I consider it a necessity to ship API headers in a per API-version different location (like for example GLib’s, at /usr/include/glib-2.0). This means that your API version number must be part of the include-path.

For example using earlier mentioned API-version 4.3, /usr/include/package-4.3 for /usr/lib/libpackage-4.3.so(.2.1.0) having /usr/lib/pkg-config/package-4.3.pc

What will the linker typically link with?

The linker will for -lpackage-4.3 typically link with /usr/lib/libpackage-4.3.so.2 or with libpackage-APIVERSION.so.(current – age). Noting that the part that is calculated as (current – age) in this example is often, for example in cmake and meson, referred to as the SOVERSION. With SOVERSION the soname template in LIBDIR is libpackage-APIVERSION.so.SOVERSION.

What is wrong?

Not doing any versioning

Without versioning you can’t make any API or ABI changes that wont break all your users’ code in a way that could be managable for them. If you do decide not to do any versioning, then at least also don’t put anything behind the .so part of your so’s filename. That way, at least you wont break things in spectacular ways.

Coming up with your own versioning scheme

Knowing it better than the rest of the world will in spectacular ways make everything you do break with what the entire rest of the world does. You shouldn’t congratulate yourself with that. The only thing that can be said about it is that it probably makes little sense, and that others will probably start ignoring your work. Your mileage may vary. Keep in mind that without a correct SOVERSION, certain things will simply not work correct.

In case of libtool: using your package’s (semver) release numbering for current, revision, age

This is similarly wrong to ‘Coming up with your own versioning scheme’.

The Libtool documentation on updating version info is clear about this:

Never try to set the interface numbers so that they correspond to the release number of your package. This is an abuse that only fosters misunderstanding of the purpose of library versions.

This basically means that once you are using libtool, also use libtool’s versioning rules.

Refusing or forgetting to increase the current and/or SOVERSION on breaking ABI changes

The current part of the VERSION (current, revision and age) minus age, or, SOVERSION is/are the most significant field(s). The current and age are usually involved in forming the so called SOVERSION, which in turn is used by the linker to know with which ABI version to link. That makes it … damn important.

Some people think ‘all this is just too complicated for me’, ‘I will just refuse to do anything and always release using the same version numbers’. That goes spectacularly wrong whenever you made ABI incompatible changes. It’s similarly wrong to ‘Coming up with your own versioning scheme’.

That way, all programs that link with your shared library can after your shared library gets updated easily crash, can corrupt data and might or might not work.

By updating the current and age, or, SOVERSION you will basically trigger people who manage packages and their tooling to rebuild programs that link with your shared library. You actually want that the moment you made breaking ABI changes in a newer version of it.

When you don’t want to care about libtool’s -version-info feature, then there is also a set of more simple to follow rules. Those rules are for VERSION:

  • SOVERSION = Major version (with these simplified set of rules, no subtracting of current with age is needed)
  • Major version: increase it if you break ABI compatibility
  • Minor version: increase it if you add ABI compatible features
  • Patch version: increase it for bug fix releases.

What isn’t wrong?

Not using libtool (but nonetheless doing ABI versioning right)

GNU libtool was made to make certain things more easy. Nowadays many popular build environments also make things more easy. Meanwhile has GNU libtool been around for a long time. And its versioning rules, commonly known as the current:revision:age field as parameter for -verison-info, got widely adopted.

What GNU libtool did was, however, not really a standard. It’s is one interpretation of how to do it. And a rather complicated one, at that.

Please let it be crystal clear that not using libtool does not mean that you can do ABI versioning wrong. Because very often people seem to think that they can, and think they’ll still get out safely while doing ABI versioning completely wrong. This is not the case.

Not having a APIVERSION at all

It isn’t wrong not to have an APIVERSION in the soname. It however means that you promise to not ever break API. Because the moment you break API, you disallow your users to stay on the old API for a little longer. They might both have programs that use the old and that use the new API. Now what?

When you have an APIVERSION then you can allow the introduction of a new version of the API while simultaneously the old API remains available on a user’s system.

Using a different naming-scheme for APIVERSION

I used the MAJOR.MINOR version numbers from semver to form the APIVERSION. I did this because only the MAJOR and the MINOR are technically involved in API changes (unless you are doing semantic versioning wrong – in which case see ‘Coming up with your own versioning scheme’).

Some projects only use MAJOR. Examples are Qt which puts the MAJOR number behind the Qt part. For example libQt5Core.so.VERSION (so that’s “Qt” + MAJOR + Module). The GLib world, however, uses “g” + Module + “-” + MAJOR + “.0” as they have releases like 2.2, 2.3, 2.4 that are all called libglib-2.0.so.VERSION. I guess they figured that maybe someday in their 2.x series, they could use that MINOR field?

DBus seems to be using a similar thing to GLib, but then without the MINOR suffix: libdbus-1.so.VERSION. For their GLib integration they also use it as libdbus-glib-1.so.VERSION.

Who is right, who is wrong? It doesn’t matter too much for your APIVERSION naming scheme. As long as there is a way to differentiate the API in a) the include path, b) the pkg-config filename and c) the library that will be linked with (the -l parameter during linking/compiling). Maybe someday a standard will be defined? Let’s hope so.

Differences in interpretation per platform

FreeBSD

FreeBSD’s Shared Libraries of Chapter 5. Source Tree Guidelines and Policies states:

The three principles of shared library building are:

  1. Start from 1.0
  2. If there is a change that is backwards compatible, bump minor number (note that ELF systems ignore the minor number)
  3. If there is an incompatible change, bump major number

For instance, added functions and bugfixes result in the minor version number being bumped, while deleted functions, changed function call syntax, etc. will force the major version number to change.

I think that when using libtool on a FreeBSD (when you use autotools), that the platform will provide a variant of libtool’s scripts that will convert earlier mentioned current, revision and age rules to FreeBSD’s. The same goes for the VERSION variable in cmake and qmake. Meaning that with those tree build environments, you can just use the rules for GNU libtool’s -version-info.

I could be wrong on this, but I did find mailing list E-mails from ~ 2011 stating that this SNAFU is dealt with. Besides, the *BSD porters otherwise know what to do and you could of course always ask them about it.

Note that FreeBSD’s rules are or seem to be compatible with the rules for VERSION when you don’t want to care about libtool’s -version-info compatibility. However, when you are porting from a libtoolized project, then of course you don’t want to let newer releases break against releases that have already happened.

Modern Linux distributions

Nowadays you sometimes see things like /usr/lib/$ARCH/libpackage-APIVERSION.so linking to /lib/$ARCH/libpackage-APIVERSION.so.VERSION. I have no idea how this mechanism works. I suppose this is being done by packagers of various Linux distributions? I also don’t know if there is a standard for this.

I will update the examples and this document the moment I know more and/or if upstream developers need to worry about it. I think that using GNUInstallDirs in cmake, for example, makes everything go right. I have not found much for this in qmake, meson seems to be doing this by default and in autotools you always use platform variables for such paths.

As usual, I hope standards will be made and that the build environment and packaging community gets to their senses and stops leaving this into the hands of developers. I especially think about qmake, which seems to not have much at all to state that standardized installation paths must be used (not even a proper way to define a prefix).

Questions that I can imagine already exist

Why is there there a difference between APIVERSION and VERSION?

The API version is the version of your programmable interfaces. This means the version of your header files (if your programming language has such header files), the version of your pkgconfig file, the version of your documentation. The API is what software developers need to utilize your library.

The ABI version can definitely be different and it is what programs that are compiled and installable need to utilize your library.

An API breaks when recompiling the program without any changes, that consumes a libpackage-4.3.so.2, is not going to succeed at compile time. The API got broken the moment any possible way package’s API was used, wont compile. Yes, any way. It means that a libpackage-5.0.so.0 should be started.

An ABI breaks when without recompiling the program, replacing a libpackage-4.3.so.2.1.0 with a libpackage-4.3.so.2.2.0 or a libpackage-4.3.so.2.1.1 (or later) as libpackage-4.3.so.2 is not going to succeed at runtime. For example because it would crash, or because the results would be wrong (in any way). It implies that libpackage-4.3.so.2 shouldn’t be overwritten, but libpackage-4.3.so.3 should be started.

For example when you change the parameter of a function in C to be a floating point from a integer (and/or the other way around), then that’s an ABI change but not neccesarily an API change.

What is this SOVERSION about?

In most projects that got ported from an environment that uses GNU libtool (for example autotools) to for example cmake or meson, and in the rare cases that they did anything at all in a qmake based project, I saw people converting the current, revision and age parameters that they passed to the -version-info option of libtool to a string concatenated together using (current – age), age, revision as VERSION, and (current – age) as SOVERSION.

I wanted to use the exact same rules for versioning for all these examples, including autotools and GNU libtool. When you don’t have to (or want to) care about libtool’s set of (for some people, needlessly complicated) -version-info rules, then it should be fine using just SOVERSION and VERSION using these rules:

  • SOVERSION = Major version
  • Major version: increase it if you break ABI compatibility
  • Minor version: increase it if you add ABI compatible features
  • Patch version: increase it for bug fix releases.

I, however, also sometimes saw variations that are incomprehensible with little explanation and magic foo invented on the spot. Those variations are probably wrong.

In the example I made it so that in the root build file of the project you can change the numbers and calculation for the numbers. However. Do follow the rules for those correctly, as this versioning is about ABI compatibility. Doing this wrong can make things blow up in spectacular ways.

The examples

qmake in the qmake-example

Note that the VERSION variable must be filled in as “(current – age).age.revision” for qmake (to get 2.1.0 at the end, you need VERSION=2.1.0 when current=3, revision=0 and age=1)

To try this example out, go to the qmake-example directory and type

$ cd qmake-example
$ mkdir=_test
$ qmake PREFIX=$PWD/_test
$ make
$ make install

This should give you this:

$ find _test/
_test/
├── include
│   └── qmake-example-4.3
│       └── qmake-example.h
└── lib
    ├── libqmake-example-4.3.so -> libqmake-example-4.3.so.2.1.0
    ├── libqmake-example-4.3.so.2 -> libqmake-example-4.3.so.2.1.0
    ├── libqmake-example-4.3.so.2.1 -> libqmake-example-4.3.so.2.1.0
    ├── libqmake-example-4.3.so.2.1.0
    ├── libqmake-example-4.la
    └── pkgconfig
        └── qmake-example-4.3.pc

When you now use pkg-config, you get a nice CFLAGS and LIBS line back (I’m replacing the current path with $PWD in the output each time):

$ export PKG_CONFIG_PATH=$PWD/_test/lib/pkgconfig
$ pkg-config qmake-example-4.3 --cflags
-I$PWD/_test/include/qmake-example-4.3
$ pkg-config qmake-example-4.3 --libs
-L$PWD/_test/lib -lqmake-example-4.3

And it means that you can do things like this now (and people who know about pkg-config will now be happy to know that they can use your library in their own favorite build environment).

$ export LD_LIBRARY_PATH=$PWD/_test/lib
$ echo -en "#include <qmake-example.h>\nmain() {} " > test.cpp
$ g++ -fPIC test.cpp -o test.o `pkg-config qmake-example-4.3 --libs --cflags`

You can see that it got linked to libqmake-example-4.3.so.2, where that 2 at the end is (current – age).

$ ldd test.o 
    linux-gate.so.1 (0xb77b0000)
    libqmake-example-4.3.so.2 => $PWD/_test/lib/libqmake-example-4.3.so.2 (0xb77a6000)
    libstdc++.so.6 => /usr/lib/i386-linux-gnu/libstdc++.so.6 (0xb75f5000)
    libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xb759e000)
    libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xb7580000)
    libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb73c9000)
    /lib/ld-linux.so.2 (0xb77b2000)

cmake in the cmake-example

Note that the VERSION property on your library target must be filled in with “(current – age).age.revision” for cmake (to get 2.1.0 at the end, you need VERSION=2.1.0 when current=3, revision=0 and age=1. Note that in cmake you must also fill in the SOVERSION property as (current – age), so SOVERSION=2 when current=3 and age=1).

To try this example out, go to the cmake-example directory and do

$ cd cmake-example
$ mkdir _test
$ cmake -DCMAKE_INSTALL_PREFIX:PATH=$PWD/_test
-- Configuring done
-- Generating done
-- Build files have been written to: .
$ make
[ 50%] Building CXX object src/libs/cmake-example/CMakeFiles/cmake-example.dir/cmake-example.cpp.o
[100%] Linking CXX shared library libcmake-example-4.3.so
[100%] Built target cmake-example
$ make install
[100%] Built target cmake-example
Install the project...
-- Install configuration: ""
-- Installing: $PWD/_test/lib/libcmake-example-4.3.so.2.1.0
-- Up-to-date: $PWD/_test/lib/libcmake-example-4.3.so.2
-- Up-to-date: $PWD/_test/lib/libcmake-example-4.3.so
-- Up-to-date: $PWD/_test/include/cmake-example-4.3/cmake-example.h
-- Up-to-date: $PWD/_test/lib/pkgconfig/cmake-example-4.3.pc

This should give you this:

$ tree _test/
_test/
├── include
│   └── cmake-example-4.3
│       └── cmake-example.h
└── lib
    ├── libcmake-example-4.3.so -> libcmake-example-4.3.so.2
    ├── libcmake-example-4.3.so.2 -> libcmake-example-4.3.so.2.1.0
    ├── libcmake-example-4.3.so.2.1.0
    └── pkgconfig
        └── cmake-example-4.3.pc

When you now use pkg-config, you get a nice CFLAGS and LIBS line back (I’m replacing the current path with $PWD in the output each time):

$ pkg-config cmake-example-4.3 --cflags
-I$PWD/_test/include/cmake-example-4.3
$ pkg-config cmake-example-4.3 --libs
-L$PWD/_test/lib -lcmake-example-4.3

And it means that you can do things like this now (and people who know about pkg-config will now be happy to know that they can use your library in their own favorite build environment):

$ echo -en "#include <cmake-example.h>\nmain() {} " > test.cpp
$ g++ -fPIC test.cpp -o test.o `pkg-config cmake-example-4.3 --libs --cflags`

You can see that it got linked to libcmake-example-4.3.so.2, where that 2 at the end is the SOVERSION. This is (current – age).

$ ldd test.o
    linux-gate.so.1 (0xb7729000)
    libcmake-example-4.3.so.2 => $PWD/_test/lib/libcmake-example-4.3.so.2 (0xb771f000)
    libstdc++.so.6 => /usr/lib/i386-linux-gnu/libstdc++.so.6 (0xb756e000)
    libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xb7517000)
    libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xb74f9000)
    libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7342000)
    /lib/ld-linux.so.2 (0xb772b000)

autotools in the autotools-example

Note that you pass -version-info current:revision:age directly with autotools. The libtool will translate that to (current – age).age.revision to form the so’s filename (to get 2.1.0 at the end, you need current=3, revision=0, age=1).

To try this example out, go to the autotools-example directory and do

$ cd autotools-example
$ mkdir _test
$ libtoolize
$ aclocal
$ autoheader
$ autoconf
$ automake --add-missing
$ ./configure --prefix=$PWD/_test
$ make
$ make install

This should give you this:

$ tree _test/
_test/
├── include
│   └── autotools-example-4.3
│       └── autotools-example.h
└── lib
    ├── libautotools-example-4.3.a
    ├── libautotools-example-4.3.la
    ├── libautotools-example-4.3.so -> libautotools-example-4.3.so.2.1.0
    ├── libautotools-example-4.3.so.2 -> libautotools-example-4.3.so.2.1.0
    ├── libautotools-example-4.3.so.2.1.0
    └── pkgconfig
        └── autotools-example-4.3.pc

When you now use pkg-config, you get a nice CFLAGS and LIBS line back (I’m replacing the current path with $PWD in the output each time):

$ export PKG_CONFIG_PATH=$PWD/_test/lib/pkgconfig
$ pkg-config autotools-example-4.3 --cflags
-I$PWD/_test/include/autotools-example-4.3
$ pkg-config autotools-example-4.3 --libs
-L$PWD/_test/lib -lautotools-example-4.3

And it means that you can do things like this now (and people who know about pkg-config will now be happy to know that they can use your library in their own favorite build environment):

$ echo -en "#include <autotools-example.h>\nmain() {} " > test.cpp
$ export LD_LIBRARY_PATH=$PWD/_test/lib
$ g++ -fPIC test.cpp -o test.o `pkg-config autotools-example-4.3 --libs --cflags`

You can see that it got linked to libautotools-example-4.3.so.2, where that 2 at the end is (current – age).

$ ldd test.o 
    linux-gate.so.1 (0xb778d000)
    libautotools-example-4.3.so.2 => $PWD/_test/lib/libautotools-example-4.3.so.2 (0xb7783000)
    libstdc++.so.6 => /usr/lib/i386-linux-gnu/libstdc++.so.6 (0xb75d2000)
    libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xb757b000)
    libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xb755d000)
    libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb73a6000)
    /lib/ld-linux.so.2 (0xb778f000)

meson in the meson-example

Note that the version property on your library target must be filled in with “(current – age).age.revision” for meson (to get 2.1.0 at the end, you need version=2.1.0 when current=3, revision=0 and age=1. Note that in meson you must also fill in the soversion property as (current – age), so soversion=2 when current=3 and age=1).

To try this example out, go to the meson-example directory and do

$ cd meson-example
$ mkdir -p _build/_test
$ cd _build
$ meson .. --prefix=$PWD/_test
$ ninja
$ ninja install

This should give you this:

$ tree _test/
_test/
├── include
│   └── meson-example-4.3
│       └── meson-example.h
└── lib
    └── i386-linux-gnu
        ├── libmeson-example-4.3.so -> libmeson-example-4.3.so.2.1.0
        ├── libmeson-example-4.3.so.2 -> libmeson-example-4.3.so.2.1.0
        ├── libmeson-example-4.3.so.2.1.0
        └── pkgconfig
            └── meson-example-4.3.pc

When you now use pkg-config, you get a nice CFLAGS and LIBS line back (I’m replacing the current path with $PWD in the output each time):

$ export PKG_CONFIG_PATH=$PWD/_test/lib/i386-linux-gnu/pkgconfig
$ pkg-config meson-example-4.3 --cflags
-I$PWD/_test/include/meson-example-4.3
$ pkg-config meson-example-4.3 --libs
-L$PWD/_test/lib -lmeson-example-4.3

And it means that you can do things like this now (and people who know about pkg-config will now be happy to know that they can use your library in their own favorite build environment):

$ echo -en "#include <meson-example.h>\nmain() {} " > test.cpp
$ export LD_LIBRARY_PATH=$PWD/_test/lib/i386-linux-gnu
$ g++ -fPIC test.cpp -o test.o `pkg-config meson-example-4.3 --libs --cflags`

You can see that it got linked to libmeson-example-4.3.so.2, where that 2 at the end is the soversion. This is (current – age).

$ ldd test.o 
    linux-gate.so.1 (0xb772e000)
    libmeson-example-4.3.so.2 => $PWD/_test/lib/i386-linux-gnu/libmeson-example-4.3.so.2 (0xb7724000)
    libstdc++.so.6 => /usr/lib/i386-linux-gnu/libstdc++.so.6 (0xb7573000)
    libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xb751c000)
    libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xb74fe000)
    libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7347000)
    /lib/ld-linux.so.2 (0xb7730000)

Doing it right, making libraries using popular build environments

Enough with the political posts!

Making libraries that are both API and libtool versioned with qmake, how do they do it?

I started a project on github that will collect what I will call “doing it right” project structures for various build environments.

With right I mean that the library will have a API version in its Library name, that the library will be libtoolized and that a pkg-config .pc file gets installed for it.

I have in mind, for example, autotools, cmake, meson, qmake and plain make. First example that I have finished is one for qmake.

Let’s get started working on a libqmake-example-3.2.so.3.2.1

We get the PREFIX, MAJOR_VERSION, MINOR_VERSION and PATCH_VERSION from a project-wide include

include(../../../qmake-example.pri)

We will use the standard lib template of qmake

TEMPLATE = lib

We need to set VERSION to a semver.org version for compile_libtool (in reality it should use what is called current, revision and age to form an API and ABI version number. In the actual example it’s explained in the comments, as this is too much for a small blog post).

VERSION = $${MAJOR_VERSION}"."$${MINOR_VERSION}"."$${PATCH_VERSION}

According section 4.3 of Autotools’ mythbusters we should have as target-name the API version in the library’s name

TARGET = qmake-example-$${MAJOR_VERSION}"."$${MINOR_VERSION}

We will write a define in config.h for access to the semver.org version as a double quoted string

QMAKE_SUBSTITUTES += config.h.in

Our example happens to use QDebug, so we need QtCore here

QT = core

This is of course optional

CONFIG += c++14

We will be using libtool style libraries

CONFIG += compile_libtool
CONFIG += create_libtool

These will create a pkg-config .pc file for us

CONFIG += create_pc create_prl no_install_prl

Project sources

SOURCES = qmake-example.cpp

Project’s public and private headers

HEADERS = qmake-example.h

We will install the headers in a API specific include path

headers.path = $${PREFIX}/include/qmake-example-$${MAJOR_VERSION}"."$${MINOR_VERSION}

Here put only the publicly installed headers

headers.files = $${HEADERS}

Here we will install the library to

target.path = $${PREFIX}/lib

This is the configuration for generating the pkg-config file

QMAKE_PKGCONFIG_NAME = $${TARGET}
QMAKE_PKGCONFIG_DESCRIPTION = An example that illustrates how to do it right with qmake
# This is our libdir
QMAKE_PKGCONFIG_LIBDIR = $$target.path
# This is where our API specific headers are
QMAKE_PKGCONFIG_INCDIR = $$headers.path
QMAKE_PKGCONFIG_DESTDIR = pkgconfig
QMAKE_PKGCONFIG_PREFIX = $${PREFIX}
QMAKE_PKGCONFIG_VERSION = $$VERSION
# These are dependencies that our library needs
QMAKE_PKGCONFIG_REQUIRES = Qt5Core

Installation targets (the pkg-config seems to install automatically)

INSTALLS += headers target

This will be the result after make-install

├── include
│   └── qmake-example-3.2
│       └── qmake-example.h
└── lib
    ├── libqmake-example-3.2.so -> libqmake-example-3.2.so.3.2.1
    ├── libqmake-example-3.2.so.3 -> libqmake-example-3.2.so.3.2.1
    ├── libqmake-example-3.2.so.3.2 -> libqmake-example-3.2.so.3.2.1
    ├── libqmake-example-3.2.so.3.2.1
    ├── libqmake-example-3.la
    └── pkgconfig
        └── qmake-example-3.pc

ps. Dear friends working at their own customers: when I visit your customer, I no longer want to see that you produced completely stupid wrong qmake based projects for them. Libtoolize it all, get an API version in your Library’s so-name and do distribute a pkg-config .pc file. That’s the very least to pass your exam. Also read this document (and stop pretending that you don’t need to know this when at the same time you charge them real money pretending that you know something about modern UNIX software development).

The upcoming NATO top

I said it before, we shouldn’t finance the US’s war-industry any longer. It’s not a reliable partner.

I’m sticking to my guns on this one,

Let’s build ourselves a European army, utilizing European technology. Build, engineered and manufactured by Europeans.

We engineers are ready. Let us do it.

PGP voor militaire zaken, nee?

Wordt het eens geen tijd dat ons centrum voor cybersecurity overheidsdiensten zoals het Belgisch leger oplegt om steeds a.d.h.v. met bv. PGP (minimaal) getekende (en hopelijk ook geëncrypteerde) E-mails te communiceren? Ja ja. We kunnen ze zelfs encrypteren. Hightech at Belgium. Stel je dat maar eens voor. Waanzin!

Stel je voor. Men zou zowel de E-mail (de content, het bericht zelf) kunnen verifiëren, als de afzender als dat men tijdens de transit én opslag van het bericht de inhoud zou kunnen encrypteren. Bij een eventueel “onafhankelijk” onderzoek zouden we (wiskundige) garanties hebben dat één en ander nu exact is zoals hoe het toen verstuurd werd.

Allemaal zaken die erg handig zouden geweest zijn in de saga over de E-mails over of onze F-16 vliegtuigen langer kunnen vliegen of niet.

Bij de ICT diensten van de oppositiepartijen zou men dan een opleiding van een halfuurtje kunnen krijgen over hoe ze met PGP in de hand één en ander cryptografisch kunnen verifiëren.

ps. Ik weet ook wel dat, in het wereldje waar het over gaat, nu net het feit dat bepaalde zaken achteraf niet meer te achterhalen zijn als waardevolle feature gezien wordt.

Wij hebben in Leuven de beste cryptografen van de wereld zitten. Maar ons Belgisch leger kan dit niet implementeren voor hun E-mails?

Clowns to the right of me

Wat ontbreekt* in de aanpassingen van het voorstel voor de aftapwet van de Nederlandse overheid is een rechterlijke toetsing van de proportionaliteit om al dan niet over te gaan tot een digitale zoeking. Zo’n zoeking is wat mij betreft gelijkaardig aan een huiszoeking.

Dit is onontbeerlijk in een verlichte samenleving waar de drie machten gescheiden zijn.

Spinoza, dé bodemvoorbereider voor de bodem waarop de verlichtingsfilosofie werd gebaseerd, was een Amsterdammer. Het is dus een aardshock voor zij die zich met filosofie bezig houden mee te maken dat Nederland niet meer mee doet.

Noot* dat de toetsingscommissie bestaat uit twee Nederlandse rechters. Twee zulke rechters kunnen nooit een degelijk proportionaliteitsonderzoek uitvoeren voor alle aanvragen.