Boundless Search

We live with restrictions every day.

  • A rafter blocks my cellar stairs, so I always bend when I enter.
  • At the end of the street a barking dog runs to the fence whenever I pass, so I always cross the street just before I reach the dog.

We learn to live with restrictions and they become habits. After a while you just stop realizing why you do things the way you do. The rafter has been removed, and the dog has died. Then why do I still bend on the cellar stairs, and why do I still cross the street before I reach the end? Recently I was confronted with similarly obsolete restrictions at Human Inference customer support.

A rafter is visible and a barking dog can be heard, so it doesn’t take long before my habits adapt to the new situation. It is different, however, for technical restrictions that disappear without you ever finding out. When I build descriptions for more than a million source records in a SQL Server database, I automatically switch from the free SQL Server Express edition to an Enterprise edition. A customer decided to build descriptions for 7 million source records in a SQL Server Express edition. I was rather surprised that the build succeeded in the end. It turned out that in SQL Server Express 2008 R2 the maximum database size is 10 GB, compared to 4 GB in SSE 2005. I had been crossing the street for years to avoid a dead dog.

Searching with HIquality Identify

HIquality Identify is used for search, deduplication and data matching. To return search results quickly, subsets are used to preselect records for evaluation. Before the actual search, each subset is checked against the maximum number of evaluations set in the configuration. When a subset exceeds this maximum, it is skipped. When all subsets are skipped, a message is returned indicating that not enough search data was entered. Searching for Müller with house number 9 in the whole of Germany, for example, will preselect too many candidates and is, even when results are found, not very useful. Besides the number of candidates per subset, the total number of evaluations across all subsets is checked against the maximum. This maximum may not exceed the magical limit of 32768.
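The checks described above can be sketched roughly as follows. This is only an illustration, not HIquality Identify's actual code or API: the names `plan_search`, `SearchPlan` and `HARD_LIMIT` are hypothetical, subsets are represented simply by their candidate counts, and because the text does not say what happens when the total exceeds the maximum, the sketch only reports that condition.

```python
from dataclasses import dataclass, field

# Ceiling on the configurable maximum, as stated in the text (2**15).
HARD_LIMIT = 32768


@dataclass
class SearchPlan:
    """Hypothetical result of the pre-search checks."""
    selected: dict = field(default_factory=dict)  # subsets that would be evaluated
    skipped: list = field(default_factory=list)   # subsets exceeding the maximum
    total_evaluations: int = 0
    within_total_limit: bool = True
    message: str = ""


def plan_search(subset_sizes, max_evaluations):
    """Decide which subsets to evaluate, given candidate counts per subset."""
    if not 0 < max_evaluations <= HARD_LIMIT:
        raise ValueError(f"max_evaluations must be between 1 and {HARD_LIMIT}")

    plan = SearchPlan()
    for name, size in subset_sizes.items():
        if size > max_evaluations:
            plan.skipped.append(name)      # too many candidates: skip this subset
        else:
            plan.selected[name] = size

    # Second check: the total over all remaining subsets against the same maximum.
    # Assumption: the sketch only flags an overrun, it does not act on it.
    plan.total_evaluations = sum(plan.selected.values())
    plan.within_total_limit = plan.total_evaluations <= max_evaluations

    if not plan.selected:
        plan.message = "Not enough search data was entered."
    return plan
```

With made-up candidate counts, a query such as the Müller example would have every subset over the maximum, so `plan_search({"name": 50000, "name_housenumber": 40000}, 32768)` returns a plan with nothing selected and the "not enough search data" message.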


We make ‘null’ mistakes

Wherever software is created, mistakes are made. Software providers often presume their products are bug-free, but software of that kind doesn’t exist. Our departments work hard to prevent bugs, yet in our HIquality Life Cycle new bugs can still be introduced, even in the oldest modules, which have been in use for over 25 years.

HIquality bug cycle

Usually our customers are satisfied with our product suite. At customer support I never receive information about the successful implementations. I got to know our software through the problems that occur, and in almost 15 years of acceptance testing and customer support, I’ve seen all kinds of bugs pass by.
Software crashes and never-ending loops are nasty. Worse are the bugs that are barely visible at first but keep growing over time.
Recently we caught such a bug in our longest-standing product, HIquality Identify.