People find many ways to be unique, including in their choice of names and how they are written.  Common names may be written in any number of ways (Zachery, Zaccari, Zakarey and so on) and in any number of forms (Za’Korey, zaKori). This variation, and the value that the customer attaches to it, reinforces the importance of getting it right first time when collecting information about a person’s name.

I was reminded recently that this rule applies also to company names when reviewing a directory containing Croatian companies.  The directory showed a great variation in words that at first glance would seem ideal candidates for correction and standardization. For example, many company names contained strings like these:

Commerc, Comerce, Comerc, Kommerce, Kommerc, Komerce, Komerc

There are many other examples which had me scratching my head: Compani, Konsulting, Konzalting, Konsalting and so on.

Why the variance in spelling?  Are these companies with the English word commerce in their names where that word has been typed as heard by call centre workers with a limited knowledge of English? Are they typos of a valid Croatian word? Are they accurate representations of a valid Croatian word as rendered in different dialects? Is it a mixture of all these factors?

I am assured that commerce does not have a similar Croatian equivalent. The best translation of commerce would be trgovina or obrt. Typos and mis-rendering aside, it would appear that in a significant number of cases these strings as written are actually part of an accurate company name – they are attempts to anglicise and internationalise the company name, either mis-spelling the English word or rendering it to sound like ‘commerce’ in the local Croatian dialect.

If one needed to reactively cleanse company data that has not been collected correctly, one could choose to translate these words when found, or to standardise them to ‘commerce’, but either process would reduce the accuracy of the data, because, wrong or not, these words are included as parts of company names. They show the individuality of the company, and attempting to ‘correct’ them is as bad as correcting a personal name like SuZann because it is ‘written wrongly’.

There are parts of company names that can be processed, most often the legal form of the company – PLC, Ltd, Inc. or, to continue the Croatian theme, d.o.o., but the rest of the company string needs to be left alone.
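
As a rough illustration of that split, here is a minimal Python sketch that standardises only a trailing legal form and returns the rest of the name exactly as it was collected. The lookup table `LEGAL_FORMS` and the function `split_legal_form` are hypothetical names invented for this example, and the handful of legal forms listed is an assumption, nowhere near a complete list for any country.

```python
import re
from typing import Optional, Tuple

# Hypothetical, deliberately tiny lookup: normalised variant -> standard form.
# A real list would be far longer and country-specific.
LEGAL_FORMS = {
    "plc": "PLC",
    "ltd": "Ltd",
    "limited": "Ltd",
    "inc": "Inc.",
    "doo": "d.o.o.",
}


def split_legal_form(company: str) -> Tuple[str, Optional[str]]:
    """Return (name as collected, standardised legal form or None)."""
    text = company.strip()
    # Treat the last token (after whitespace or a comma) as a candidate legal form.
    match = re.match(r"^(.*?)[\s,]+(\S+)$", text)
    if not match:
        return text, None
    name, candidate = match.groups()
    # Compare without dots and case, so "D.O.O." and "doo" look the same.
    key = candidate.replace(".", "").lower()
    form = LEGAL_FORMS.get(key)
    if form is None:
        # Not a known legal form: leave the whole string untouched.
        return text, None
    # Only the legal form is standardised; the name keeps its original spelling.
    return name, form


print(split_legal_form("Adria Komerc D.O.O."))
print(split_legal_form("Zagreb Konzalting"))
```

The spelling of Komerc or Konzalting is never touched; only the suffix is mapped to a standard form, and anything the table does not recognise is passed through unchanged. A variant written with internal spaces, such as "d. o. o.", would not be recognised by this simple last-token approach.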

Like personal names, company names are carefully chosen to show individuality, to set that company apart from its competitors. The names vary greatly in form and spelling. Unless one has an intimate knowledge of a company, viewing a name after data collection will not show where any errors in it exist, and no post-processing will allow those errors to be corrected – in fact, post-processing often reduces the accuracy of company names.  As with personal names, the rule for company names is to collect them correctly – right first time.



