We make ‘null’ mistakes

Wherever software is created, mistakes are made. Software providers often presume their products are bug-free, but software of that kind doesn’t exist. Our departments work hard to prevent bugs, yet new ones can still be introduced in our HIquality Life Cycle, even in the oldest modules that have been in use for over 25 years.

HIquality bug cycle

Usually our customers are satisfied with our product suite. At customer support, though, I never hear about the successful implementations. I got to know our software through the problems that occur, and in almost 15 years of acceptance testing and customer support, I’ve seen all kinds of bugs pass by.
Software crashes and never-ending loops are nasty. Worse are the bugs that are barely visible at first but keep growing over time.
Recently we caught such a bug in our longest-standing product, HIquality Identify.

High precision matching – apples, oranges or fruit salad?

In his excellent post “New matching engines go beyond apples and oranges”, Winfried van Holland states that traditional matching engines are based on atomic string comparison functions, such as match-codes, phonetic comparison, Levenshtein string distance and n-gram comparisons. He further argues that the drawback of these functions is that it’s not always clear which function to use for which purpose, and that these low-level DQ functions cannot distinguish between apples and oranges: you end up comparing family names with street names.
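
To make the apples-and-oranges point concrete, here is a minimal sketch of two such atomic functions, Levenshtein distance and a character bigram comparison. It is an illustration only and not how any particular matching engine is implemented; the function names, the choice of the Dice coefficient for the bigram score, and the sample values are all assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def bigrams(text: str) -> set[str]:
    """Set of lower-cased character bigrams in text."""
    text = text.lower()
    return {text[i:i + 2] for i in range(len(text) - 1)}


def bigram_similarity(a: str, b: str) -> float:
    """Dice coefficient over bigram sets, from 0.0 (disjoint) to 1.0 (identical)."""
    grams_a, grams_b = bigrams(a), bigrams(b)
    if not grams_a and not grams_b:
        return 1.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


# Hypothetical values: a family name and a street name.
family_name = "Baker"
street_name = "Baker Street"

# Both functions return a score without knowing what kind of field they
# were given, so nothing stops a family name being compared with a street name.
print(levenshtein(family_name, street_name))
print(bigram_similarity(family_name, street_name))
```

Neither function takes any information about the meaning of its inputs, so deciding which fields may sensibly be compared has to happen somewhere else.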

Good point! In essence, this is the basis of the discussion on the matching approach within customer data management. Intelligent automated matching of records distributed over various heterogeneous data sources is an essential prerequisite for correct and adequate customer data integration, and there are many opinions on how to achieve it.

In theories on data matching, two methods generally prevail where customer data management is concerned: deterministic and probabilistic matching.