We make ‘null’ mistakes

Wherever software is created, mistakes are made. Software providers often presume their products are bug-free, but software of that kind doesn’t exist. Our departments work hard to prevent bugs, yet new ones can still be introduced in our HIquality Life Cycle, even in the oldest modules that have been in use for over 25 years.

HIquality bug cycle

Usually our customers are satisfied with our product suite. At customer support, however, I never receive information about the successful implementations. I got to know our software through the problems that occur, and in almost 15 years of acceptance testing and customer support, I’ve seen all kinds of bugs pass by.
Software crashes and never-ending loops are nasty. Worse are the bugs that are barely visible at first but keep growing over time.
Recently we caught such a bug in our longest-standing product, HIquality Identify. Continue reading ‘We make ‘null’ mistakes’

First time right? Let your data decide!

Data quality consultants will tell you that collecting data correctly, getting it right first time, is essential. In contrast, almost every organisation actually puts most of its budget and labour into attempting to cleanse data after collection.

The proactive versus reactive debate rages, but in fact data quality must be both a proactive and a reactive process. The data will dictate which to use, or whether both are required. Continue reading ‘First time right? Let your data decide!’

An Easy mashup of ETL and DQ

Today I saw how easy it can be to make a mashup from ETL and data quality tools. More and more ETL vendors see the need not only to extract, transform and load data, but at the same time also to enhance the data by hand with data quality tools. Most of them stick to so-called tick-mark data quality: mainstream, easy-to-get enhancements. The results are mostly experienced as disappointing, or average at best. Building ETL solutions is a different ball game from building data quality solutions. You need to mash these worlds together.
Together with Pentaho, we at Human Inference are creating a mashup of their Kettle ETL tool and our HIquality Data Quality solutions. The nice thing is that the data quality solutions can be used both in the cloud and on-premise.
It’s almost finished now, and as a teaser I just want to show you a hot screenshot of it. It will soon be available as an add-on from our easyDQ website, followed by inclusion in the coming Pentaho release. If you need it right away, please contact us directly.

Komerc in Croatia

People find many ways to be unique, including in their choice of names and how they are written. Common names may be written in any number of ways (Zachery, Zaccari, Zakarey and so on) and in any number of forms (Za’Korey, zaKori). This variation, and the importance that the customer attaches to it, reinforces the importance of first time right when collecting information about a person’s name.

I was reminded recently that this rule also applies to company names when reviewing a directory of Croatian companies. The directory showed a great variation in words that at first glance would seem ideal candidates for correction and standardization. For example, many company names contained strings like these:

Commerc, Comerce, Comerc, Kommerce, Kommerc, Komerce, Komerc

There are many other examples which had me scratching my head: Compani, Konsulting, Konzalting, Konsalting and so on.

Why the variance in spelling? Are these companies with the English word commerce in their names, where that word has been typed as heard by call centre workers with a limited knowledge of English? Are they typos of a valid Croatian word? Are they accurate representations of a valid Croatian word as rendered in different dialects? Is it a mixture of all these factors?
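Before correcting anything, it can help to see which spellings cluster together. The following is a minimal sketch, not part of any HIquality product: the function names and the three folding rules (k to c, doubled letters collapsed, trailing e dropped) are assumptions made up for this illustration. It only groups candidate variants for human review; it does not decide which spelling, if any, is the correct one.

```python
import re
from collections import defaultdict


def variant_key(word: str) -> str:
    """Fold a word to a rough comparison key (hypothetical rules)."""
    key = word.lower()
    key = key.replace("k", "c")           # treat k and c as the same letter
    key = re.sub(r"(.)\1+", r"\1", key)   # collapse doubled letters: mm -> m
    if key.endswith("e"):                 # ignore a trailing e
        key = key[:-1]
    return key


def group_variants(words):
    """Group words that share the same comparison key."""
    groups = defaultdict(list)
    for word in words:
        groups[variant_key(word)].append(word)
    return dict(groups)


if __name__ == "__main__":
    seen = ["Commerc", "Comerce", "Comerc", "Kommerce",
            "Kommerc", "Komerce", "Komerc", "Konsulting", "Konzalting"]
    for key, members in group_variants(seen).items():
        print(key, members)
```

Under these rules the seven Komerc-like strings fall into one group, while Konsulting and Konzalting stay apart. Whether a group should be standardized at all remains the open question raised above.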

Continue reading ‘Komerc in Croatia’

Ask Me is linked with Any Body and relates with Walther Von Stolzing

Weird subject, isn’t it? As will be obvious to everybody, ‘Ask Me’ and ‘Any Body’ are artificial names. They will never belong to a real person. How they relate to ‘Walther von Stolzing’ will follow.

For over 25 years Human Inference has collected reference data, for instance on persons. Because of our reference set we immediately recognize that ‘Ask Me’ and ‘Any Body’ are fake names. People are using these either in test situations or to hide their actual names.

In the old days we only needed to test for ‘Test Test’, but in more recent years we have seen great inventiveness in these fake names. A brief example can be seen in the following list.

Alpha Beta
Any Body
Ask Me
Best Friend
Blue Sky
Cool Dude
Dress Code
El Comandante
Guess Who
In Cognito

If you cannot rely on reference data and interpretation, you need to provide a checklist. Providing it is one thing, but since users tend to be really creative, maintaining it is essential. Continue reading ‘Ask Me is linked with Any Body and relates with Walther Von Stolzing’
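To make the checklist idea concrete, here is a minimal sketch of such a check. It is an illustration only and not how HIquality recognizes fake names: the names `FAKE_NAME_CHECKLIST`, `normalize` and `is_listed_fake_name` are hypothetical, and the entries are simply the examples listed above plus ‘Test Test’. Names are compared after lowercasing and dropping everything that is not a letter, so that ‘In Cognito’, ‘in-cognito’ and ‘INCOGNITO’ all hit the same entry.

```python
def normalize(name: str) -> str:
    """Lowercase a name and keep only its letters."""
    return "".join(ch for ch in name.casefold() if ch.isalpha())


# Hypothetical checklist, seeded with the examples from this post.
# It has to be maintained as users invent new fake names.
FAKE_NAME_CHECKLIST = {
    normalize(name)
    for name in [
        "Test Test", "Alpha Beta", "Any Body", "Ask Me", "Best Friend",
        "Blue Sky", "Cool Dude", "Dress Code", "El Comandante",
        "Guess Who", "In Cognito",
    ]
}


def is_listed_fake_name(full_name: str) -> bool:
    """Return True if the name matches an entry on the checklist."""
    return normalize(full_name) in FAKE_NAME_CHECKLIST


if __name__ == "__main__":
    for candidate in ["Ask Me", "in-cognito", "Walther von Stolzing"]:
        print(candidate, is_listed_fake_name(candidate))
```

A plain lookup like this only catches what is already on the list, which is exactly why maintaining it matters.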

200 years of family names

Today is a memorable day for data quality in the Netherlands. Exactly two hundred years ago, on August 18, 1811, the French emperor (and occupier) Napoleon Bonaparte issued the decree that all citizens of the northern provinces of the Netherlands were to choose a surname. This name was very useful in the municipal registers of the Dutch inhabitants: how else could the French army know which lad to draw for military service, or which peasant to pursue for taxes? Continue reading ‘200 years of family names’