200 years of family names

Today is a memorable day for data quality in the Netherlands. Exactly two hundred years ago, on August 18, 1811, the French emperor (and occupier) Napoleon Bonaparte decreed that all citizens of the northern provinces of the Netherlands were to choose a surname. These surnames were very useful in the municipal registers of the Dutch inhabitants: how else could the French army know which lad to draft for military service, or which peasant to pursue for taxes?

We have 180 million names! Which one is right?

The internet is an ocean of rich content, but unfortunately, as in the real world, it's heavily polluted.

As a company that has been in business for 25 years, Human Inference absolutely sees the benefits of the internet. For our reasoning processes, which are based on natural language processing, we gather content and classify it by type, such as given names, family names, prefixes, suffixes, etc. (See also my blog post on the comparison of apples and oranges ….)
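
To make the idea of classifying content by type a little more concrete, here is a minimal sketch in Python. It is only an illustration, not a description of how Human Inference does it: the lookup tables, the function names and the fallback rule (anything unknown is treated as a family name) are all assumptions made up for this example.

```python
# Hypothetical, hand-made lookup tables for illustration only.
# A real knowledge base would be far larger and language-specific.
PREFIXES = {"van", "de", "der", "den", "ten", "ter"}
SUFFIXES = {"jr", "sr", "ii", "iii"}
GIVEN_NAMES = {"jan", "piet", "anna", "maria"}


def classify_token(token: str) -> str:
    """Return the assumed type of a single name token."""
    key = token.lower().rstrip(".")  # normalise case and a trailing dot
    if key in PREFIXES:
        return "prefix"
    if key in SUFFIXES:
        return "suffix"
    if key in GIVEN_NAMES:
        return "given_name"
    # Assumption: anything not found in the tables is a family name.
    return "family_name"


def classify_name(full_name: str) -> list[tuple[str, str]]:
    """Split a full name on whitespace and classify each token."""
    return [(token, classify_token(token)) for token in full_name.split()]


if __name__ == "__main__":
    for token, kind in classify_name("Jan van der Berg Jr."):
        print(f"{token}: {kind}")
```

Even in this toy version, the quality of the result depends entirely on the quality of the tables behind it.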

In the past this was done manually, for example by going through telephone books or researching census lists by hand. But those were the 'pioneer years'. What we see now is an enormous amount of content that can be gathered on the internet. It's quite easy to find an internet page with 180 million records of person names. Great, so knowledge gathering is passé now?