Truly huge files and the problem of contiguous virtual address space

As we all know, mmap (or, even worse, CreateFileMapping on Windows) needs contiguous virtual address space for a given mapping size. That can become a problem when you want to load a gigabyte-sized file with mmap.

The solution is of course to mmap the big file using multiple mappings. For example, by adapting yesterday’s demo this way:

void FileModel::setFileName(const QString &fileName)
{
    ...
    if (m_file->open(QIODevice::ReadOnly)) {
        if (m_file->size() > MAX_MAP_SIZE) {
            m_mapSize = MAX_MAP_SIZE;
            m_file_maps.resize(1 + m_file->size() / MAX_MAP_SIZE, nullptr);
        } else {
            m_mapSize = static_cast<int>(m_file->size());
            m_file_maps.resize(1, nullptr);
        }
        ...
    } else {
        m_index->open(QFile::ReadOnly);
        m_rowCount = m_index->size() / 4;
    }
    m_file_maps[0] = m_file->map(0, m_mapSize, QFileDevice::NoOptions);
    qDebug() << "Done loading " << m_rowCount << " lines";
    map_index = m_index->map(0, m_index->size(), QFileDevice::NoOptions);

    beginResetModel();
    endResetModel();
    emit fileNameChanged();
}

And in the data() function:

QVariant FileModel::data( const QModelIndex& index, int role ) const
{
    QVariant ret;
    ...
    quint32 mapIndex = pos_i / MAX_MAP_SIZE;
    quint32 map_pos_i = pos_i % MAX_MAP_SIZE;
    quint32 map_end_i = end_i % MAX_MAP_SIZE;
    uchar* map_file = m_file_maps[mapIndex];
    if (map_file == nullptr)
        map_file = m_file_maps[mapIndex] = m_file->map(mapIndex * m_mapSize, m_mapSize, QFileDevice::NoOptions);
    position = m_file_maps[mapIndex] + map_pos_i;
    if (position) {
            const int length = static_cast<int>(end_i - pos_i);
            char *buffer = (char*) alloca(length+1);
            if (map_end_i >= map_pos_i)
                strncpy (buffer, (char*) position, length);
            else {
                const uchar *position2 = m_file_maps[mapIndex+1];
                if (position2 == nullptr) {
                    position2 = m_file_maps[mapIndex+1] = m_file->map((mapIndex+1) *
                         m_mapSize, m_mapSize, QFileDevice::NoOptions);
                }
                strncpy (buffer, (char*) position, MAX_MAP_SIZE - map_pos_i);
                strncpy (buffer + (MAX_MAP_SIZE - map_pos_i), (char*) position2, map_end_i);
            }
            buffer[length] = 0;
            ret = QVariant(QString(buffer));
        }
    }
    return ret;
}
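
The index arithmetic in data() is easy to get wrong at the mapping boundaries, so here is a minimal sketch of just that part in plain C++, without Qt. The names ChunkSlice and splitAcrossChunks are made up for this illustration, and chunkSize stands in for MAX_MAP_SIZE. It splits a byte range into the pieces that live in each mapping:

```cpp
#include <cstdint>
#include <vector>

// One contiguous piece of a byte range that lies inside a single mapping.
// (Hypothetical helper type, not part of the demo.)
struct ChunkSlice {
    std::uint64_t chunk;   // which mapping: file offset / chunkSize
    std::uint64_t offset;  // start offset inside that mapping
    std::uint64_t length;  // number of bytes to copy from it
};

// Split the file range [pos, end) into per-mapping slices.
// chunkSize plays the role of MAX_MAP_SIZE and must be non-zero.
std::vector<ChunkSlice> splitAcrossChunks(std::uint64_t pos, std::uint64_t end,
                                          std::uint64_t chunkSize)
{
    std::vector<ChunkSlice> slices;
    while (pos < end) {
        const std::uint64_t chunk = pos / chunkSize;
        const std::uint64_t offset = pos % chunkSize;
        const std::uint64_t room = chunkSize - offset;  // bytes left in this mapping
        const std::uint64_t length = (end - pos < room) ? end - pos : room;
        slices.push_back({chunk, offset, length});
        pos += length;
    }
    return slices;
}
```

With a chunk size of 100, the range from 90 to 130 yields two slices: ten bytes at offset 90 of mapping 0 and thirty bytes at offset 0 of mapping 1. A loop like this also copes with a line that spans more than two mappings, which the two-branch version above does not.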

You could also skip mmap for the very big source text file and use m_file->seek(pos_i) and m_file->read(buffer, length) instead. The most important mapping is of course the index one, as reading the individual lines can also be done fast enough with normal read() calls (as long as you don’t have to do it for each and every line of the very big file, and as long as you know in an O(1) way where the data for the QAbstractListModel’s index.row() is).

But you already knew that. Right?

Loading truly truly huge text files with a QAbstractListModel

Sometimes people want to do crazy stuff like loading a gigabyte-sized plain text file into a Qt view that can handle a QAbstractListModel, like for example a QML ListView. You know, the kind of files you generate with this command:

base64 /dev/urandom | head -c 100000000 > /tmp/file.txt

But, how do they do it?

FileModel.h

So we will make a custom QAbstractListModel. Its private member fields I will explain later:

#ifndef FILEMODEL_H
#define FILEMODEL_H

#include <QObject>
#include <QVariant>
#include <QAbstractListModel>
#include <QFile>

class FileModel: public QAbstractListModel {
    Q_OBJECT

    Q_PROPERTY(QString fileName READ fileName WRITE setFileName NOTIFY fileNameChanged )
public:
    explicit FileModel( QObject* a_parent = nullptr );
    virtual ~FileModel();

    int columnCount(const QModelIndex &parent) const;
    int rowCount( const QModelIndex& parent =  QModelIndex() ) const Q_DECL_OVERRIDE;
    QVariant data( const QModelIndex& index, int role = Qt::DisplayRole ) const  Q_DECL_OVERRIDE;
    QVariant headerData( int section, Qt::Orientation orientation,
                         int role = Qt::DisplayRole ) const  Q_DECL_OVERRIDE;
    void setFileName(const QString &fileName);
    QString fileName () const
        { return m_file ? m_file->fileName() : QString(); } // m_file is null until setFileName() ran
signals:
    void fileNameChanged();
private:
    QFile *m_file, *m_index;
    uchar *map_file;
    uchar *map_index;
    int m_rowCount;
    void clear();
};

#endif// FILEMODEL_H

FileModel.cpp

We will basically scan the very big source text file for newline characters. We’ll write the offsets of those to a file suffixed with “.mmap”. We’ll use that new file as a sort of “partition table” for the very big source text file, in the data() function of QAbstractListModel. But instead of sectors and files, it points to newlines.
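
To make the “partition table” idea concrete, here is a small in-memory sketch in plain C++. It is not the demo’s code: the helper names are made up, and as a deliberate variation it stores the offset just after each newline (the start of each line) so that a row never begins with the newline character. The offsets are written as big-endian 32-bit values, four bytes per row:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Append a 32-bit offset in big-endian byte order.
void appendUint32(std::vector<unsigned char> &out, std::uint32_t value)
{
    out.push_back(static_cast<unsigned char>((value >> 24) & 0xff));
    out.push_back(static_cast<unsigned char>((value >> 16) & 0xff));
    out.push_back(static_cast<unsigned char>((value >> 8) & 0xff));
    out.push_back(static_cast<unsigned char>(value & 0xff));
}

// Decode a big-endian 32-bit offset.
std::uint32_t readUint32(const unsigned char *data)
{
    return (static_cast<std::uint32_t>(data[0]) << 24) |
           (static_cast<std::uint32_t>(data[1]) << 16) |
           (static_cast<std::uint32_t>(data[2]) << 8) |
            static_cast<std::uint32_t>(data[3]);
}

// Build the index: one entry per line, holding the offset where it starts.
std::vector<unsigned char> buildIndex(const std::string &text)
{
    std::vector<unsigned char> index;
    if (text.empty())
        return index;
    appendUint32(index, 0);
    const char *begin = text.data();
    const char *end = begin + text.size();
    const char *cursor = begin;
    while (cursor < end) {
        // memchr takes a length, so the buffer needs no NUL terminator.
        const void *hit = std::memchr(cursor, '\n', static_cast<std::size_t>(end - cursor));
        if (hit == nullptr)
            break;
        cursor = static_cast<const char *>(hit) + 1;
        if (cursor < end)
            appendUint32(index, static_cast<std::uint32_t>(cursor - begin));
    }
    return index;
}

// Return row `row` without its trailing newline; `row` must be a valid row.
std::string rowAt(const std::string &text, const std::vector<unsigned char> &index,
                  std::size_t row)
{
    const std::size_t entries = index.size() / 4;
    const std::size_t pos = readUint32(index.data() + 4 * row);
    const std::size_t end = (row + 1 < entries)
        ? readUint32(index.data() + 4 * (row + 1)) : text.size();
    std::string line = text.substr(pos, end - pos);
    if (!line.empty() && line[line.size() - 1] == '\n')
        line.erase(line.size() - 1);
    return line;
}
```

Looking up a row is two reads from the index and one substring, which is what makes the model O(1) per row no matter how big the text is. The 32-bit offsets do limit the source file to 4 GB.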

The reason why the scanner itself isn’t using the mmap’s address space is that reading blocks of 4 KB was apparently faster than reading each and every byte from the mmap in search of \n characters. Or at least on my hardware it was.

You should probably do the scanning in small QEventLoop iterations (make sure to use nonblocking reads, then) or in a thread, as your very big source text file can be on an unreliable or slow I/O device. Plus it’s very big, else you wouldn’t be doing this (please promise me to just read the entire text file into memory unless it’s hundreds of megabytes in size: don’t micro-optimize your silly homework notepad.exe clone).

Note that this is demo code with a lot of bugs like not checking for \r and god knows what memory leaks and stuff was remaining when it suddenly worked. I leave it to the reader to improve this. An example is that you should check for validity of the “.mmap” file: your very big source text file might have changed since the newline partition table was made.

Knowing that I’ll soon find this all over the place without any of its bugs fixed, here it comes ..

#include "FileModel.h"

#include <QDebug>

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

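FileModel::FileModel( QObject* a_parent )
    : QAbstractListModel( a_parent )
    , m_file (nullptr)
    , m_index(nullptr)
    , map_file (nullptr)   // clear() tests these, so don't leave them uninitialized
    , map_index (nullptr)
    , m_rowCount ( 0 ) { }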

FileModel::~FileModel() { clear(); }

void FileModel::clear()
{
    if (m_file) {
        if (m_file->isOpen() && map_file != nullptr)
            m_file->unmap(map_file);
        delete m_file;
    }
    if (m_index) {
        if (m_index->isOpen() && map_index != nullptr)
            m_index->unmap(map_index);
        delete m_index;
    }
}

void FileModel::setFileName(const QString &fileName)
{
   clear();
   m_rowCount = 0;
   m_file = new QFile(fileName);
   int cur = 0;
   m_index = new QFile(m_file->fileName() + ".mmap");
   if (m_file->open(QIODevice::ReadOnly)) {
       if (!m_index->exists()) {
           char rbuffer[4096];
           m_index->open(QIODevice::WriteOnly);
           char nulbuffer[4];
           int idxnul = 0;
           memset( nulbuffer +0, idxnul >> 24 & 0xff, 1 );
           memset( nulbuffer +1, idxnul >> 16 & 0xff, 1 );
           memset( nulbuffer +2, idxnul >>  8 & 0xff, 1 );
           memset( nulbuffer +3, idxnul >>  0 & 0xff, 1 );
           m_index->write( nulbuffer, sizeof(quint32));
           qDebug() << "Indexing to" << m_index->fileName();
           while (!m_file->atEnd()) {
               int in = m_file->read(rbuffer, 4096);
               if (in == -1)
                   break;
               char *newline = (char*) 1;
               char *last = rbuffer;
               while (newline != 0) {
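                   // rbuffer is not NUL-terminated, so bound the search to the bytes read
                   newline = (char*) memchr ( last, '\n', in - (last - rbuffer) );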
                   if (newline != 0) {
                     char buffer[4];
                     int idx = cur + (newline - rbuffer);
                     memset( buffer +0, idx >> 24 & 0xff, 1 );
                     memset( buffer +1, idx >> 16 & 0xff, 1 );
                     memset( buffer +2, idx >>  8 & 0xff, 1 );
                     memset( buffer +3, idx >>  0 & 0xff, 1 );
                     m_index->write( buffer, sizeof(quint32));
                     m_rowCount++;
                     last = newline + 1;
                  }
               }
               cur += in;
           }
           m_index->close();
           m_index->open(QFile::ReadOnly);
           qDebug() << "done";
       } else {
           m_index->open(QFile::ReadOnly);
           m_rowCount = m_index->size() / 4;
       }
       map_file= m_file->map(0, m_file->size(), QFileDevice::NoOptions);
       qDebug() << "Done loading " << m_rowCount << " lines";
       map_index = m_index->map(0, m_index->size(), QFileDevice::NoOptions);
   }
   beginResetModel();
   endResetModel();
   emit fileNameChanged();
}

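static quint32
read_uint32 (const quint8 *data)
{
    // Cast before shifting so the high byte never ends up in an int's sign bit
    return static_cast<quint32>(data[0]) << 24 |
           static_cast<quint32>(data[1]) << 16 |
           static_cast<quint32>(data[2]) << 8 |
           static_cast<quint32>(data[3]);
}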

int FileModel::rowCount( const QModelIndex& parent ) const
{
    Q_UNUSED( parent );
    return m_rowCount;
}

int FileModel::columnCount(const QModelIndex &parent) const
{
    Q_UNUSED( parent );
    return 1;
}

QVariant FileModel::data( const QModelIndex& index, int role ) const
{
    if( !index.isValid() )
        return QVariant();
    if (role == Qt::DisplayRole) {
        QVariant ret;
        quint32 pos_i = read_uint32(map_index + ( 4 * index.row() ) );
        quint32 end_i;
        if ( index.row() == m_rowCount-1 )
            end_i = m_file->size();
        else
            end_i = read_uint32(map_index + ( 4 * (index.row()+1) ) );
        uchar *position;
        position = map_file +  pos_i;
        uchar *end = map_file + end_i;
        int length = end - position;
        char *buffer = (char*) alloca(length +1);
        memset (buffer, 0, length+1);
        strncpy (buffer, (char*) position, length);
        ret = QVariant(QString(buffer));
        return ret;
    }
    return QVariant();
}

QVariant FileModel::headerData( int section, Qt::Orientation orientation, int role ) const
{
    Q_UNUSED(section);
    Q_UNUSED(orientation);
    if (role != Qt::DisplayRole)
           return QVariant();
    return QString("header");
}

main.cpp

#include <QGuiApplication>
#include <QQmlApplicationEngine>
#include <QtQml>// qmlRegisterType

#include "FileModel.h"

int main(int argc, char *argv[])
{
    QGuiApplication app(argc, argv);
    qmlRegisterType<FileModel>( "FileModel", 1, 0, "FileModel" );
    QQmlApplicationEngine engine;
    engine.load(QUrl(QStringLiteral("qrc:/main.qml")));
    return app.exec();
}

main.qml

import QtQuick 2.3
import QtQuick.Window 2.2
import FileModel 1.0

Window {
    visible: true

    FileModel { id: fileModel }
    ListView {
        id: list
        anchors.fill: parent
        delegate: Text { text: display }
        MouseArea {
            anchors.fill: parent
            onClicked: {
                list.model = fileModel
                fileModel.fileName = "/tmp/file.txt"
            }
        }
    }
}

profile.pro

TEMPLATE = app
QT += qml quick
CONFIG += c++11
SOURCES += main.cpp \
    FileModel.cpp
RESOURCES += qml.qrc
HEADERS += \
    FileModel.h

qml.qrc

<RCC>
    <qresource prefix="/">
        <file>main.qml</file>
    </qresource>
</RCC>

Judge Dredds spotted in the country

Apparently there are once again a few detectives who believe that the rule of law does not apply to them; that, like Judge Dredd, they can get to work wiretapping hospitals, doctors and care providers such as psychologists.

Anything goes to support their parallel constructions. To them the law is not in the way. Who needs that anyway? Laws? Pfuh. Dredd doesn’t take part in that. Judge Dredd is the law. What’s that now.

Did they have any indication that the doctor was part of the conspiracy? No, there was none. Because why then was the Order of Physicians not informed? It was simply utterly illegal to wiretap that hospital.

I sincerely hope that these detectives get a heavy prison sentence and are, finally, never again allowed to practice the profession of detective.

We don’t need that here. Go play policeman in the UK. As long as it still exists. Bunch of bunglers.

Composition and aggregation with QObject

Consider these rather simple relationships between classes

Continuing on this subject, here are some code examples.

Class1 & Class2: Composition
An instance of Class1 can not exist without an instance of Class2.

A typical example of composition is a Bicycle and its Wheels, Saddle and HandleBar: without these the Bicycle is no longer a Bicycle but just a Frame.

It can no longer function as a Bicycle. You can stop deliberating about composition versus aggregation as soon as you catch yourself saying: without the other thing, the first thing can’t work in our software.

Note that you must consider this in the context of Class1. You use aggregation or composition based on how Class2 exists in relation to Class1.

Class1 with QScopedPointer:

#ifndef CLASS1_H
#define CLASS1_H

#include <QObject>
#include <QScopedPointer>
#include <Class2.h>

class Class1: public QObject
{
    Q_OBJECT
    Q_PROPERTY( Class2* class2 READ class2 WRITE setClass2 NOTIFY class2Changed)
public:
    Class1( QObject *a_parent = nullptr )
        : QObject ( a_parent) {
        // Don't use QObject parenting on top here
        m_class2.reset (new Class2() );
    }
    Class2* class2() {
        return m_class2.data();
    }
    void setClass2 ( Class2 *a_class2 ) {
        Q_ASSERT (a_class2 != nullptr); // Composition can't set a nullptr!
        if ( m_class2.data() != a_class2 ) {
            m_class2.reset( a_class2 );
            emit class2Changed();
        }
    }
signals:
    void class2Changed();
private:
    QScopedPointer<Class2> m_class2;
};

#endif// CLASS1_H

Class1 with QObject parenting:

#ifndef CLASS1_H
#define CLASS1_H

#include <QObject>
#include <Class2.h>

class Class1: public QObject
{
    Q_OBJECT
    Q_PROPERTY( Class2* class2 READ class2 WRITE setClass2 NOTIFY class2Changed)
public:
    Class1( QObject *a_parent = nullptr )
        : QObject ( a_parent )
        , m_class2 ( nullptr ) {
        // Make sure to use QObject parenting here
        m_class2 = new Class2( this );
    }
    Class2* class2() {
        return m_class2;
    }
    void setClass2 ( Class2 *a_class2 ) {
         Q_ASSERT (a_class2 != nullptr); // Composition can't set a nullptr!
         if ( m_class2 != a_class2 ) {
             // Make sure to use QObject parenting here
             a_class2->setParent ( this );
             delete m_class2; // Composition can never be nullptr
             m_class2 = a_class2;
             emit class2Changed();
         }
    }
signals:
    void class2Changed();
private:
    Class2 *m_class2;
};

#endif// CLASS1_H

Class1 with RAII:

#ifndef CLASS1_H
#define CLASS1_H

#include <QObject>
#include <QScopedPointer>

#include <Class2.h>

class Class1: public QObject
{
    Q_OBJECT
    Q_PROPERTY( Class2* class2 READ class2 CONSTANT)
public:
    Class1( QObject *a_parent = nullptr )
        : QObject ( a_parent ) { }
    Class2* class2()
        { return &m_class2; }
private:
    Class2 m_class2;
};
#endif// CLASS1_H

Class3 & Class4: Aggregation

An instance of Class3 can exist without an instance of Class4. A typical example of aggregation is a Bicycle and its Driver or Passenger: without the Driver or Passenger it is still a Bicycle. It can function as a Bicycle.

You can stop deliberating about composition versus aggregation as soon as you catch yourself saying: without the other thing, the first thing can still work in our software.

Class3:

#ifndef CLASS3_H
#define CLASS3_H

#include <QObject>

#include <QPointer>
#include <Class4.h>

class Class3: public QObject
{
    Q_OBJECT
    Q_PROPERTY( Class4* class4 READ class4 WRITE setClass4 NOTIFY class4Changed)
public:
    Class3( QObject *a_parent = nullptr );
    Class4* class4() {
        return m_class4.data();
    }
    void setClass4 (Class4 *a_class4) {
         if ( m_class4 != a_class4 ) {
             m_class4 = a_class4;
             emit class4Changed();
         }
    }
signals:
    void class4Changed();
private:
    QPointer<Class4> m_class4;
};
#endif// CLASS3_H

Class5, Class6 & Class7: Shared composition
An instance of Class5 and/or an instance of Class6 cannot exist without an instance of Class7 shared by Class5 and Class6. When one of Class5 or Class6 can exist without the shared instance and the other cannot, use QWeakPointer at the one that can.

Class5:

#ifndef CLASS5_H
#define CLASS5_H

#include <QObject>
#include <QSharedPointer>

#include <Class7.h>

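class Class5: public QObject
{
    Q_OBJECT
    Q_PROPERTY( Class7* class7 READ class7 CONSTANT)
public:
    // Take the QSharedPointer itself: building separate QSharedPointers from
    // one raw pointer in Class5 and Class6 would delete Class7 twice.
    Class5( const QSharedPointer<Class7> &a_class7, QObject *a_parent = nullptr )
        : QObject ( a_parent )
        , m_class7 ( a_class7 ) { }
    Class7* class7()
        { return m_class7.data(); }
private:
    QSharedPointer<Class7> m_class7;
};
#endif// CLASS5_H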

Class6:

#ifndef CLASS6_H
#define CLASS6_H

#include <QObject>
#include <QSharedPointer>

#include <Class7.h>

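class Class6: public QObject
{
    Q_OBJECT
    Q_PROPERTY( Class7* class7 READ class7 CONSTANT)
public:
    // Take the QSharedPointer itself: building separate QSharedPointers from
    // one raw pointer in Class5 and Class6 would delete Class7 twice.
    Class6( const QSharedPointer<Class7> &a_class7, QObject *a_parent = nullptr )
        : QObject ( a_parent )
        , m_class7 ( a_class7 ) { }
    Class7* class7()
        { return m_class7.data(); }
private:
    QSharedPointer<Class7> m_class7;
};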
#endif// CLASS6_H

Interfaces with QObject

FlyBehavior:

#ifndef FLYBEHAVIOR_H
#define FLYBEHAVIOR_H
#include <QObject>
// Don't inherit QObject here (you'll break multiple-implements)
class FlyBehavior {
    public:
        virtual ~FlyBehavior() {} // implementations may be deleted through the interface
        Q_INVOKABLE virtual void fly() = 0;
};
Q_DECLARE_INTERFACE(FlyBehavior, "be.codeminded.Flying.FlyBehavior/1.0")
#endif// FLYBEHAVIOR_H

FlyWithWings:

#ifndef FLY_WITH_WINGS_H
#define FLY_WITH_WINGS_H
#include <QObject>  
#include <Flying/FlyBehavior.h>
// Do inherit QObject here (this is a concrete class)
class FlyWithWings: public QObject, public FlyBehavior
{
    Q_OBJECT
    Q_INTERFACES( FlyBehavior )
public:
    explicit FlyWithWings( QObject *a_parent = nullptr ): QObject ( a_parent ) {}
    ~FlyWithWings() {}

    virtual void fly() Q_DECL_OVERRIDE;
};
#endif// FLY_WITH_WINGS_H

MVVM, Model View ViewModel, with Qt and QML

In the XAML world it’s very common to use the MVVM pattern. I will explain how to use the technique in a similar way with Qt and QML.

The idea is to not have too much code in the view component. Instead we have declarative bindings and move most if not all of our view code to a so called ViewModel. The ViewModel will sit in between the actual model and the view. The ViewModel typically has one to one properties for everything that the view displays. Manipulating the properties of the ViewModel alters the view through bindings. You typically don’t alter the view directly.

In our example we have two list-models, two texts and one button: available-items, accepted-items, available-count, accepted-count and a button. Pressing the button moves stuff from available to accepted. Should be a simple example.

First the ViewModel.h file. The class will have a property for ~ everything the view displays:

#ifndef VIEWMODEL_H
#define VIEWMODEL_H

#include <QAbstractListModel>
#include <QObject>

class ViewModel : public QObject
{
	Q_OBJECT

	Q_PROPERTY(QAbstractListModel* availableItems READ availableItems NOTIFY availableItemsChanged )
	Q_PROPERTY(QAbstractListModel* acceptedItems READ acceptedItems NOTIFY acceptedItemsChanged )
	Q_PROPERTY(int available READ available NOTIFY availableChanged )
	Q_PROPERTY(int accepted READ accepted NOTIFY acceptedChanged )
public:

	ViewModel( QObject *parent = 0 );
	~ViewModel() { }

	QAbstractListModel* availableItems()
		{ return m_availableItems; }

	QAbstractListModel* acceptedItems()
		{ return m_acceptedItems; }

	int available ()
		{ return m_availableItems->rowCount(); }

	int accepted ()
		{ return m_acceptedItems->rowCount(); }

	Q_INVOKABLE void onButtonClicked( int availableRow );

signals:
	void availableItemsChanged();
	void acceptedItemsChanged();
	void availableChanged();
	void acceptedChanged();

private:
	QAbstractListModel* m_availableItems;
	QAbstractListModel* m_acceptedItems;
};

#endif

The ViewModel.cpp implementation of the ViewModel. This is of course a simple example. The idea is that ViewModels can be quite complicated while the view.qml remains simple:

#include <QStringListModel>

#include "ViewModel.h"

ViewModel::ViewModel( QObject *parent ) : QObject ( parent )
{
	QStringList available;
	QStringList accepted;

	available << "Two" << "Three" << "Four" << "Five";
	accepted << "One";

	m_availableItems = new QStringListModel( available, this );
	emit availableItemsChanged();

	m_acceptedItems = new QStringListModel( accepted, this );
	emit acceptedItemsChanged();
}

void ViewModel::onButtonClicked(int availableRow)
{
	QModelIndex availableIndex = m_availableItems->index( availableRow, 0, QModelIndex() );
	QVariant availableItem = m_availableItems->data( availableIndex, Qt::DisplayRole );

	int acceptedRow = m_acceptedItems->rowCount();

	m_acceptedItems->insertRows( acceptedRow, 1 );

	QModelIndex acceptedIndex = m_acceptedItems->index( acceptedRow, 0, QModelIndex() );
	m_acceptedItems->setData( acceptedIndex, availableItem );
	emit acceptedChanged();

	m_availableItems->removeRows ( availableRow, 1, QModelIndex() );
	emit availableChanged();
}

The view.qml. We’ll try to have as little JavaScript code as possible; the idea is that the coding itself is done in the ViewModel. The view should only be view code (styling, UI, animations, etc). The import URL and version are defined by the use of qmlRegisterType in the main.cpp file, further down:

import QtQuick 2.0
import QtQuick.Controls 1.2

import be.codeminded.ViewModelExample 1.0

Rectangle {
    id: root
    width: 640; height: 320

	property var viewModel: ViewModel { }

	Rectangle {
		id: left
		anchors.left: parent.left
		anchors.top: parent.top
		anchors.bottom: button.top
		width: parent.width / 2
		ListView {
		    id: leftView
			anchors.left: parent.left
			anchors.right: parent.right
			anchors.top: parent.top
			anchors.bottom: leftText.top

			delegate: rowDelegate
		        model: viewModel.availableItems
		}
		Text {
			id: leftText
			anchors.left: parent.left
			anchors.right: parent.right
			anchors.bottom: parent.bottom
			height: 20
			text: viewModel.available
		}
	}

	Rectangle {
		id: right
		anchors.left: left.right
		anchors.right: parent.right
		anchors.top: parent.top
		anchors.bottom: button.top
		ListView {
		    id: rightView
			anchors.left: parent.left
			anchors.right: parent.right
			anchors.top: parent.top
			anchors.bottom: rightText.top

			delegate: rowDelegate
		        model: viewModel.acceptedItems
		}
		Text {
			id: rightText
			anchors.left: parent.left
			anchors.right: parent.right
			anchors.bottom: parent.bottom
			height: 20
			text: viewModel.accepted
		}
	}

	Component {
		id: rowDelegate
		Rectangle {
			width: parent.width
			height: 20
			color: ListView.view.currentIndex == index ? "red" : "white"
			Text { text: 'Name:' + display }
			MouseArea {
				anchors.fill: parent
				onClicked: parent.ListView.view.currentIndex = index
			}
		}
	}

	Button {
		id: button
		anchors.left: parent.left
		anchors.right: parent.right
		anchors.bottom: parent.bottom
		height: 20
	        text: "Accept item"
		onClicked: viewModel.onButtonClicked( leftView.currentIndex );
	}
}

A main.cpp example. The qmlRegisterType defines the url to import in the view.qml file:

#include <QGuiApplication>
#include <QQuickView>
#include <QtQml>
#include <QAbstractListModel>

#include "ViewModel.h"

int main(int argc, char *argv[])
{
	QGuiApplication app(argc, argv);
	QQuickView view;
	qRegisterMetaType<QAbstractListModel*>("QAbstractListModel*");
	qmlRegisterType<ViewModel>("be.codeminded.ViewModelExample", 1, 0, "ViewModel");
	view.setSource(QUrl("qrc:/view.qml"));
	view.show();
	return app.exec();
}

A project.pro file. Obviously you should use CMake nowadays. But oh well:

TEMPLATE = app
QT += quick
SOURCES += ViewModel.cpp main.cpp
HEADERS += ViewModel.h
RESOURCES += project.qrc

And a project.qrc file:

<!DOCTYPE RCC>
<RCC version="1.0">
<qresource prefix="/">
    <file>view.qml</file>
</qresource>
</RCC>

Give shape

We are good. We show that by combining our respect for privacy and security. Knowledge is indispensable for that. I argue for investing in techies who master both.

Our government should not invest everything in millions for fighting computer intrusion; it should also invest in better software.

Belgian companies sometimes make software. They must be encouraged, steered, to do the right thing.

I would like to see our Centre for Cybersecurity encourage companies to make good and therefore secure software. We must also commit to repression. But we must commit just as much to high quality.

We sometimes think that, oh well, we are too small. But that is not true. If we decide that here, in Belgium, software must be good, then that creates a market that will adapt to what we want. It is a matter of being steadfast.

When we say that a – b is welcome here, or not, we give shape to technology.

I expect no less from my country. Give shape.

QML coding conventions checker that uses QML parser’s own abstract syntax tree

My colleague Henk Van Der Laak made an interesting tool that checks your code against the QML coding conventions. It uses the internal parser’s abstract syntax tree of Qt 5.6 and a visitor design.

It has a command line, but being developers ourselves we want an API too of course. Then we can integrate it in our development environments without having to use popen!

So this is how to use that API:

// Parse the code
QQmlJS::Engine engine;
QQmlJS::Lexer lexer(&engine);
QQmlJS::Parser parser(&engine);

QFileInfo info(a_filename);
bool isJavaScript = info.suffix().toLower() == QLatin1String("js");
lexer.setCode(code,  1, !isJavaScript);
bool success = isJavaScript ? parser.parseProgram() : parser.parse();
if (success) {
    // Check the code
    QQmlJS::AST::UiProgram *program = parser.ast();
    CheckingVisitor checkingVisitor(a_filename);
    program->accept(&checkingVisitor);
    foreach (const QString &warning, checkingVisitor.getWarnings()) {
        qWarning() << qPrintable(warning);
    }
}

Item isChild of another Item in QML

Damned, QML is inconsistent! Things have a content, data or children. And apparently they can all mean the same thing. So how do we know if something is a child of something else?

After a failed stackoverflow search I gave up on copy-paste coding and invented the damn thing myself.

function isChild( a_child, a_parent ) {
	if ( a_parent === null ) {
		return false
	}

	var tmp = ( a_parent.hasOwnProperty("content") ? a_parent.content
		: ( a_parent.hasOwnProperty("children") ? a_parent.children : a_parent.data ) )

	if ( tmp === null || tmp === undefined ) {
		return false
	}

	for (var i = 0; i < tmp.length; ++i) {

		if ( tmp[i] === a_child ) {
			return true
		} else {
			if ( isChild ( a_child, tmp[i] ) ) {
				return true
			}
		}
	}
	return false
}

Composition and aggregation to choose memory types in Qt

As we all know, Qt has types like QPointer and QSharedPointer, and we know about its object trees. So when do we use what?

Let’s first go back to school, and remember the difference between composition and aggregation. Most of you probably remember drawings like this?

It taught us when to use composition, and when to use aggregation:

  • Use composition when the user can’t exist without the dependency. For example a Human can’t exist without a Head unless it ceases to be a human. You could also model Arm, Hand, Finger and Leg as aggregates but it might not make sense in your model (for a patient in a hospital perhaps it does?)
  • Use aggregation when the user can exist without the dependency: a car without a passenger is still a car in most models.

This model in the picture will for example tell us that a car’s passenger must have ten fingers.

But what does this have to do with QPointer, QSharedPointer and Qt’s object trees?

First situation is a shared composition. Both Owner1 and Owner2 can’t survive without Shared (composition, filled up diamonds). For this situation you would typically use a QSharedPointer<Shared> at Owner1 and Owner2:

If there is no other owner, then it’s probably better to just use Qt’s object trees and setParent() instead. Note that for example QML’s GC is not very well aware of QSharedPointer, but does seem to understand Qt’s object trees.

Second situation are shared users. User1 and User2 can stay alive when Shared goes away (aggregation, empty diamonds). In this situation you typically use a QPointer<Shared> at User1 and at User2. You want to be aware when Shared goes away. QPointer<Shared>’s isNull() will become true after that happened.

Third situation is a mixed one. In this case you could at Owner use a QSharedPointer<Shared> or a parented raw QObject pointer (using setParent()), but a QPointer<Shared> at User. When Owner goes away and its destructor (due to the parenting) deletes Shared, User can check for it using the previously mentioned isNull check.

Finally if you have a typical object tree, then use QObject’s infrastructure for this.
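
The ownership rules above can be tried out without Qt. This is only a standard-library analogy under stated assumptions: std::shared_ptr stands in for QSharedPointer, std::weak_ptr stands in for the observing role that QPointer plays for QObject-derived types, and Shared, Owner and User are the hypothetical names from the situations above:

```cpp
#include <memory>
#include <utility>

// The dependency that is either owned (composition) or observed (aggregation).
struct Shared {
    int value = 42;
};

// Shared composition: every Owner keeps Shared alive.
struct Owner {
    explicit Owner(std::shared_ptr<Shared> a_shared)
        : m_shared(std::move(a_shared)) {}
    std::shared_ptr<Shared> m_shared;
};

// Aggregation: a User only observes Shared and must cope with it going away.
struct User {
    explicit User(const std::shared_ptr<Shared> &a_shared)
        : m_shared(a_shared) {}
    // Plays the role of the isNull() check on a QPointer.
    bool sharedIsGone() const { return m_shared.expired(); }
    std::weak_ptr<Shared> m_shared;
};
```

As long as one Owner is alive, sharedIsGone() stays false; once the last Owner is destroyed it becomes true, which is the third (mixed) situation in miniature. Unlike std::weak_ptr, a QPointer notices the deletion of its QObject no matter who deletes it, so the analogy only covers the observing side.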

 

 

Visitor for Klartext

Felt good about explaining my work last time. For no reason. I guess I’m happy, or I no longer feel PGO’s pressure or something. Having to be politically correct all the time sucks. Making technically and architecturally good solutions is what drives me.

Today I explained the visitor pattern. We want to parse Klartext in such a way that we can present its structure in an editing component. It’s the same component for which I utilized an LRU last week. We want to visualize significant lines like tool changes, but also make cycles foldable like SciTE does with source code, and a whole lot of other stuff that I can’t tell you because of teh secretz. Meanwhile these files are, especially when generated using CAD/CAM software, amazingly huge.

Today I had some success with explaining visitor using the Louvre as that what is “visitable” (the AST) and a Japanese guy who wants to collect state (photos) as a visitor of fine arts. Hoping my good-taste solutions (not my words, it’s how Matthias Hasselmann describes my work at Nokia) will once again yield a certain amount of success.
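
For readers who want the Louvre analogy in code, here is a minimal visitor sketch in plain C++. The node types ToolChange and Cycle and the OutlineCollector are hypothetical names picked for this illustration; they are not the real Klartext parser:

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct ToolChange;
struct Cycle;

// The visitor (the tourist): one visit() overload per node type.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const ToolChange &node) = 0;
    virtual void visit(const Cycle &node) = 0;
};

// The visitable side (the Louvre): every AST node accepts a visitor.
struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor &visitor) const = 0;
};

struct ToolChange : Node {
    explicit ToolChange(int a_tool) : tool(a_tool) {}
    void accept(Visitor &visitor) const override { visitor.visit(*this); }
    int tool;
};

struct Cycle : Node {
    explicit Cycle(std::string a_name) : name(std::move(a_name)) {}
    void accept(Visitor &visitor) const override
    {
        visitor.visit(*this);
        // A cycle is foldable: it contains further nodes that get visited too.
        for (const auto &child : children)
            child->accept(visitor);
    }
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
};

// Collects state (the photos): one outline line per visited node.
struct OutlineCollector : Visitor {
    void visit(const ToolChange &node) override
        { lines.push_back("tool " + std::to_string(node.tool)); }
    void visit(const Cycle &node) override
        { lines.push_back("cycle " + node.name); }
    std::vector<std::string> lines;
};
```

The nodes know nothing about outlines or folding; a different visitor can collect something else from the same tree without touching the node classes.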

ps. I made sure that all the politically correcting categories are added to this post. So if you’d have filtered away the condescending and controversial posts from my blog, you could have protected yourself from being in total shock now (because I used the sexually tinted word “sucks”, earlier). Guess you didn’t. Those categories have been in place on my blog’s infrastructure since many years. They are like the Körperwelten (Bodyworlds) exhibitions; you don’t have to visit them.

Putting an LRU in your code

For the ones who didn’t find the LRU in Tracker’s code (and for the ones who were lazy).

Let’s say we will have instances of something called a Statement. Each of those instances is relatively big. But we can recreate them relatively cheaply. We will have a huge amount of them. But at any time we only need a handful of them.

The ones that are most recently used are most likely to be needed again soon.
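
Before walking through the hand-rolled ring, it may help to see the same policy in a compact form. This sketch swaps in a different technique than the one used in the rest of this post: a std::list ordered by recency plus a std::unordered_map for lookup, instead of next and prev pointers inside Statement. The Statement stand-in, the int key and the StatementCache name are assumptions made for the example:

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a big but cheap-to-recreate Statement.
struct Statement {
    explicit Statement(int a_key) : text("statement " + std::to_string(a_key)) {}
    std::string text;
};

class StatementCache {
public:
    // a_max is expected to be at least 1.
    explicit StatementCache(std::size_t a_max) : m_max(a_max) {}

    // Return the statement for a_key. A miss creates it and, when the cache
    // is full, first evicts the least recently used entry.
    const Statement &use(int a_key)
    {
        auto found = m_index.find(a_key);
        if (found != m_index.end()) {
            // Hit: move the entry to the front (most recently used).
            m_order.splice(m_order.begin(), m_order, found->second);
            return found->second->second;
        }
        if (!m_order.empty() && m_order.size() >= m_max) {
            m_index.erase(m_order.back().first);
            m_order.pop_back();
        }
        m_order.emplace_front(a_key, Statement(a_key));
        m_index[a_key] = m_order.begin();
        return m_order.front().second;
    }

    bool contains(int a_key) const { return m_index.count(a_key) != 0; }
    std::size_t size() const { return m_order.size(); }

private:
    typedef std::list<std::pair<int, Statement>> Order;
    std::size_t m_max;
    Order m_order;  // front = most recently used, back = least recently used
    std::unordered_map<int, Order::iterator> m_index;
};
```

With a cache of size two, using keys 1, 2, 1 and then 3 evicts key 2, because key 1 was touched more recently. The intrusive ring below avoids the separate list nodes, at the cost of more pointer bookkeeping.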

First you make a structure that will hold some administration of the LRU:

typedef struct {
	Statement *head;
	Statement *tail;
	unsigned int size;
	unsigned int max;
} StatementLru;

Then we make the user of a Statement (a view or a model). I’ll be using a Map here. You can in Qt for example use QMap for this. Usually I want relatively fast access based on a key. You could also each time loop the stmt_lru to find the instance you want in useStatement based on something in Statement itself. That would rid yourself of the overhead of a map.

class StatementUser
{
public:
	StatementUser();
	~StatementUser();
	void useStatement(KeyType key);
private:
	StatementLru stmt_lru;
	Map<KeyType, Statement*> stmts;
	StatementFactory stmt_factory;
};

Then we will add to the private fields of the Statement class the members prev and next: We’ll make a circular doubly linked list.

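class Statement: public QObject {
	Q_OBJECT
    ...
private:
	/* StatementUser maintains the ring, so it needs access to the links */
	friend class StatementUser;
	Statement *next;
	Statement *prev;
};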

Next we initialize the LRU:

StatementUser::StatementUser()
{
	stmt_lru.head = nullptr;
	stmt_lru.tail = nullptr;
	stmt_lru.max = 500;
	stmt_lru.size = 0;
}

Then we implement using the statements:

void StatementUser::useStatement(KeyType key)
{
	Statement *stmt;

	if (!stmts.get (key, &stmt)) {

		stmt = stmt_factory.createStatement(key);

		stmts.insert (key, stmt);

		/* So the ring looks a bit like this: *
		 *                                    *
		 *    .--tail  .--head                *
		 *    |        |                      *
		 *  [p-n] -> [p-n] -> [p-n] -> [p-n]  *
		 *    ^                          |    *
		 *    `- [n-p] <- [n-p] <--------'    */

		if (stmt_lru.size >= stmt_lru.max) {
			Statement *new_head;

		/* We reached max-size of the LRU stmt cache. Destroy current
		 * least recently used (stmt_lru.head) and assign head->next
		 * as new head. The insertion further down rewires tail->next
		 * and new_head->prev, and that closes the ring again. */

			new_head = stmt_lru.head->next;
			/* Assumption: find() is a lookup by value here. With QMap
			 * you would write stmts.remove (stmts.key (stmt_lru.head)) */
			auto to_del = stmts.find (stmt_lru.head);
			stmts.remove (to_del);
			delete stmt_lru.head;
			stmt_lru.size--;
			stmt_lru.head = new_head;
		} else {
			if (stmt_lru.size == 0) {
				stmt_lru.head = stmt;
				stmt_lru.tail = stmt;
			}
		}

	/* Set the current stmt (which is always new here) as the new tail
	 * (new most recently used). We insert current stmt between head and
	 * current tail, and we set tail to current stmt. */

		stmt_lru.size++;
		stmt->next = stmt_lru.head;
		stmt_lru.head->prev = stmt;

		stmt_lru.tail->next = stmt;
		stmt->prev = stmt_lru.tail;
		stmt_lru.tail = stmt;

	} else {
		if (stmt == stmt_lru.head) {

		/* Current stmt is least recently used, shift head and tail
		 * of the ring to efficiently make it most recently used. */

			stmt_lru.head = stmt_lru.head->next;
			stmt_lru.tail = stmt_lru.tail->next;
		} else if (stmt != stmt_lru.tail) {

		/* Current statement isn't most recently used, make it most
		 * recently used now (less efficient way than above). */

		/* Take stmt out of the list and close the ring */
			stmt->prev->next = stmt->next;
			stmt->next->prev = stmt->prev;

		/* Put stmt as tail (most recently used) */
			stmt->next = stmt_lru.head;
			stmt_lru.head->prev = stmt;
			stmt->prev = stmt_lru.tail;
			stmt_lru.tail->next = stmt;
			stmt_lru.tail = stmt;
		}

	/* if (stmt == tail), it's already the most recently used in the
	 * ring, so in this case we do nothing of course */
	}

	/* Use stmt */

	return;
}
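
For those who want to play with the ring without Qt or the rest of the model, here is a minimal standalone sketch of the same idea. It is not the code from the editor or from Tracker: Entry stands in for Statement, LruRing for StatementUser, the key is a plain int and std::unordered_map replaces the Map. All of these names are assumptions for the sake of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for Statement: only the key and the ring links.
struct Entry {
    int key;
    Entry *next;
    Entry *prev;
};

class LruRing {
public:
    explicit LruRing(std::size_t max) : max_(max) { assert(max > 0); }
    ~LruRing() {
        for (auto &pair : entries_)
            delete pair.second;
    }
    LruRing(const LruRing &) = delete;
    LruRing &operator=(const LruRing &) = delete;

    // Marks key as most recently used, creating its entry on a miss.
    Entry *use(int key) {
        auto it = entries_.find(key);
        if (it == entries_.end()) {
            if (entries_.size() >= max_)
                evictHead();
            Entry *e = new Entry{key, nullptr, nullptr};
            entries_.emplace(key, e);
            linkAsTail(e);
            return e;
        }
        Entry *e = it->second;
        if (e == head_) {
            // Least recently used: rotating head and tail by one is enough.
            head_ = head_->next;
            tail_ = tail_->next;
        } else if (e != tail_) {
            // Somewhere in the middle: unlink it, then re-link it as tail.
            e->prev->next = e->next;
            e->next->prev = e->prev;
            linkAsTail(e);
        }
        return e;
    }

    bool contains(int key) const { return entries_.count(key) != 0; }
    std::size_t size() const { return entries_.size(); }
    int leastRecentKey() const { assert(head_); return head_->key; }
    int mostRecentKey() const { assert(tail_); return tail_->key; }

private:
    // Removes the head (least recently used) and closes the ring.
    void evictHead() {
        Entry *old = head_;
        if (old->next == old) {
            head_ = tail_ = nullptr;  // it was the only entry
        } else {
            old->prev->next = old->next;
            old->next->prev = old->prev;
            head_ = old->next;
        }
        entries_.erase(old->key);
        delete old;
    }

    // Inserts e between the current tail and head and makes it the tail.
    void linkAsTail(Entry *e) {
        if (head_ == nullptr) {
            head_ = tail_ = e;
            e->next = e->prev = e;
            return;
        }
        e->next = head_;
        e->prev = tail_;
        head_->prev = e;
        tail_->next = e;
        tail_ = e;
    }

    std::size_t max_;
    Entry *head_ = nullptr;
    Entry *tail_ = nullptr;
    std::unordered_map<int, Entry *> entries_;
};
```

With a ring of max three, using keys 1, 2, 3 and then 1 again makes 2 the least recently used, so a following use of 4 throws out 2. Unlike the listing above, this sketch closes the ring during eviction itself, which also keeps a ring of max one working.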

In case StatementUser and Statement form a composition (StatementUser owns Statement, which is what makes most sense), don’t forget to delete the instances in the destructor of StatementUser. In the example’s case we used heap objects. You can loop the stmt_lru or the map here.

StatementUser::~StatementUser()
{
	Map<KeyType, Statement*>::iterator i;
	for (i = stmts.begin(); i != stmts.end(); ++i) {
		delete i.value();
	}
}
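
Looping the ring instead of the map could look roughly like the sketch below. RingNode, destroyRing and makeRing are hypothetical names; RingNode only carries the two links that Statement has. The trick is to cut the ring open first, so the walk has a normal end and never compares against an already deleted node.

```cpp
#include <cstddef>

// Hypothetical stand-in for Statement: just the ring links.
struct RingNode {
    RingNode *next;
    RingNode *prev;
};

// Deletes every node of a circular doubly linked list and returns how many
// nodes were deleted. head may be nullptr for an empty ring.
inline std::size_t destroyRing(RingNode *head) {
    if (head == nullptr)
        return 0;
    head->prev->next = nullptr;  // cut the ring open so the walk terminates
    std::size_t count = 0;
    while (head != nullptr) {
        RingNode *next = head->next;  // read the link before deleting
        delete head;
        ++count;
        head = next;
    }
    return count;
}

// Helper for trying it out: builds a ring of n nodes, nullptr when n is 0.
inline RingNode *makeRing(std::size_t n) {
    RingNode *head = nullptr;
    for (std::size_t i = 0; i < n; ++i) {
        RingNode *node = new RingNode{nullptr, nullptr};
        if (head == nullptr) {
            node->next = node->prev = node;
            head = node;
        } else {
            node->next = head;
            node->prev = head->prev;
            head->prev->next = node;
            head->prev = node;
        }
    }
    return head;
}
```

If you delete through the ring, remember that the map then still holds dangling pointers; clear it as well, or simply pick one of the two loops.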

Secretly reusing my own LRU code

Last week, I secretly reused my own LRU code in the model of the editor of a CNC machine (has truly huge files, needs a statement editor). I rewrote my own code, of course. It’s Qt based, not GLib. Wouldn’t work in original form anyway. But the same principle. Don’t tell Jürg who helped me write that, back then.

Extra points and free beer for people who can find it in Tracker’s code.

When it’s good, we say so too

The Belgian Marechaussee did good work. The riffraff has largely been rounded up. That must have taken detective work. Yet there turned out to be few infringements, and the population gave up few freedoms.

In other words: the work was targeted.

The higher the result and the fewer freedoms surrendered, the higher the quality of our services.

We are going to keep this little country free.

RE: Tourists, please give up an extra bit of privacy (opinion piece by the CEO of Thomas Cook, De Tijd)

Tourists, please give up an extra bit of privacy, so that we at Thomas Cook can track you and your buying behaviour even more and even better.

We tend to be less flexible when it comes to our privacy. But we will have no choice but to give up part of our privacy. For our own safety, so that governments can screen who is on which flight. We already saw that evolution after the attacks of 11 September 2001 in New York.

How exactly does it help that Thomas Cook knows who will be on which flight? The security services already know that today, by the way. How does this help when someone sets off a bomb in the departure hall, long before any identity check whatsoever? And how has that security theater after 9/11 prevented any attack at all? What evidence is there that it helped? What evidence is there that the extreme increase in surveillance helps?

This smells strongly of cheap political exploitation of a piece of human misery. Jan Dekeyser, CEO of Thomas Cook Belgium, wants to be able to track you and your buying behaviour. So that he can make a lot of money with your privacy. Even more money. He calls for this now that people are afraid and willing to give up their privacy. Talk about timing.

According to the CEO of Thomas Cook we now also have to hand over our Facebook profiles to him and his marketing department. He says so literally in his opinion piece:

That can change. Meanwhile just about everyone has a smartphone, a Facebook profile or a Twitter account. It is therefore perfectly possible to take away travellers’ concerns through personal communication and to inform them directly about how their trip will proceed.

So you will probably soon no longer be able to be a customer of Thomas Cook unless you have a Facebook profile that he and his company have full access to. Will we even still be allowed into the airport unless he and his companies know everything about you, so that he can fully map you and your buying behaviour?

When will it finally become clear that no amount of surveillance on top of what already exists will prevent attacks? On the contrary, needlessly casting suspicion on innocent citizens will only make them angry and distrustful toward their government.

So no, Jan Dekeyser, CEO of Thomas Cook Belgium, I will not give up an extra bit of privacy to you and your marketing people. Our security services already know perfectly well who is on which flight. Those databases already exist.

Give the actors responsible for passing information on to the right services and people better training, increase manpower and budgets, and work with the information that is already there. In other words: as a civil servant in a security function, do your job.

But a Thomas Cook really does not need to know even more about our privacy.

It never ends

First of all, after Brussels I stand by this:

[on encryption] First of all, as long as there is no need, nothing should be done. Because I don’t immediately see a real necessity to be able to crack everybody’s electronics. That wasn’t needed in the past; today there are no more and no fewer dangerous lunatics, and electronics add only little to their arsenal. So it is not needed now.

I also stand behind this:

Oh, and to our security services: well done catching those guys from Molenbeek and Vorst. Good job

Don’t get stressed. Like all people, you perform badly when you are stressed. Hold the line. Molenbeek and Vorst was splendid work, by the way. Well done.

We will find that balance.

Up next: PADI master diver

A year or so after my PADI rescue, my diving club has convinced me to wake up about twelve times early in the morning on Saturday, to get me trained to become a PADI master diver.

The jokers in my club told me it’s not as hard as PADI Rescue training. The only hard part is Saturday morning. We’ll see, I wonder. heh.

Oh, and to our security services: well done catching those guys from Molenbeek and Vorst. Good job!

Oh, I got invited to NLGG to talk about Tracker in Utrecht tomorrow.

RE: President Obama stresses the need for an encryption backdoor

This is about

In Belgium we already place quite a lot of responsibility for our possessions with the notary. He drew up the title deed of your house, the deeds on the easements on your land, your shares and the articles of incorporation of your company. He also settles inheritances. That works fairly well here. It will go wrong now and then, but not often. I think that is because the code of professional ethics of the notarial profession in Belgium is fairly strict. You also don’t become a notary just like that. The training and the exams for it appear to be quite hard.

First of all, as long as there is no need, nothing should be done. Because I don’t immediately see a real necessity to be able to crack everybody’s electronics. That wasn’t needed in the past; today there are no more and no fewer dangerous lunatics, and electronics add only little to their arsenal. So it is not needed now.

But to help Obama a bit with setting up his surveillance state, in which iPhones must be decryptable by the government, I would like to propose a system or concept in which a kind of master key is physically stored at the notary (so you first need a decent notarial profession, but we already have that).

The notary gets a code of ethics that boils down to him or her having to examine proportionality whenever government services request a copy of that master key. The copy of that key could be made such that only certain documents can be decrypted with it, instead of everything.

A citizen who loses his own keys can then, for a typical notary fee, have a new key made from the key held at the notary. I would have the government pay that fee too. Otherwise the notaries probably won’t want that authority (and such a proportionality examination requires legal knowledge and time: the two most precious things in a notary’s life).

I do think our cryptographers can put something like that together.

Above all, don’t make it all too difficult or complex: no (extra) backdoors in the software or hardware, because then the day after tomorrow the government of every country, and a whole gang of criminals too, will be inside the Facebook pass-through hatch, I mean the smartphones and tablets, of all our politicians and business leaders. What they carelessly and publicly dump on there themselves is already bad enough (while many, very naively, think that everything that happens in that Cloud mess is private).

The Internet of Crap

At Belgian companies I have filed bugs where a buffer coming from an Angular HTTP POST session is simply copied with memcpy into a stack buffer of 1024 bytes, with of course strlen on the origin buffer as the size_t n parameter. That even led to a row, because the brilliant programmer felt it wasn’t a bug, since you needed the admin credentials to reproduce it. The problem was that the buffer could reach the memcpy even without logging in.
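
To make the class of bug concrete: the length of the copy must be bounded by the destination, never by the untrusted source. The sketch below is not the code from that bug report. The names copyBody and kBufferSize are made up for the example, and rejecting oversized input is just one possible policy.

```cpp
#include <cstddef>
#include <cstring>

// Size of the destination buffer, matching the 1024 bytes from the story.
constexpr std::size_t kBufferSize = 1024;

// Copies bodyLen bytes of body into dst and terminates the result.
// Returns false, and leaves dst untouched, when the input plus its
// terminating zero would not fit. The broken pattern would instead do
// memcpy(dst, body, strlen(body)) with no check at all.
inline bool copyBody(char (&dst)[kBufferSize], const char *body,
                     std::size_t bodyLen) {
    if (body == nullptr || bodyLen >= kBufferSize)
        return false;
    std::memcpy(dst, body, bodyLen);
    dst[bodyLen] = '\0';
    return true;
}
```

Taking the destination as a reference to an array of exactly kBufferSize chars lets the compiler refuse a call with a smaller buffer. Input that is too long gets refused at run time instead of overwriting the stack.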

It is also no exception that, when you let the strings tool loose on the binaries of many routers, you simply get backdoors and hardcoded passwords. You come across programmers who just put them in there. They consider that normal. Raw anti-security arrogance. And they are proud of their unprofessionalism too.

So this era is not the era of the Internet of Things, but of The Internet of Crap. Made by bunglers. Even the backdoors are shoddy work.

Things are in a very bad state. I feel ashamed of many of my colleagues and wish to distance myself.