An increasing number of companies have to deal with data from the world’s fastest-emerging economy: China. And the big question here is of course: how can we compare these “strange” Chinese characters with our own writing system?
The grammar and character set of our Western alphabetic languages (such as English, French, Dutch or German) differ tremendously from those of Mandarin Chinese, which is the language spoken by most people in the People’s Republic of China and by many Chinese speakers abroad. Mandarin is a tonal language with an ideographic character set. Most characters have a semantic and a phonetic component. The pitch with which a syllable is pronounced determines its meaning.
Complicated? Definitely. But what about the other way around? Have you ever thought about the difficulties the Chinese have to face when trying to convert their language into meaningful English?
This phenomenon is sometimes hilariously illustrated by the many public signs in China that are meant to inform foreign visitors or to help them find their way around.
This is truly a delightful side effect of internationalization.
The German sinologist Oliver Lutz Radtke christened these linguistic attempts “Chinglish” and collected many examples, which can be found virtually everywhere: on hotel room doors, on road signs along the highways, on shampoo bottles and on T-shirts. A small anthology:
- A warning sign for a steep slope: “Please, watch your slip”
- To avoid all misunderstandings, on the inside of a taxi door: “Don’t forget to carry your thing”
- A sign above a store entrance, to let our imagination run free: “Welcome to presence”
Although this is all very funny, from a data quality point of view it leaves a thing or two to consider. For example: how fault-tolerant should we be towards typos when Chinese customer data is entered into a database? And how do typos in an ideographic writing system affect searching, matching, enriching and correcting customer data?
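To make the matching question a little more concrete, here is a minimal sketch in Python. It is an illustration only, not a description of how any particular data quality product works. It assumes a plain Levenshtein edit distance, normalized by the length of the longer string, as the similarity measure; the function names `levenshtein` and `similarity` and the sample names are made up for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical score: 1.0 means identical, 0.0 means nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# One wrong letter in a seven-letter Western surname.
print(f"{similarity('Johnson', 'Johnsen'):.2f}")  # 0.86

# One wrong character in a two-character Chinese name. Both spellings
# are romanized as "Wang Wei", but the second character differs.
print(f"{similarity('王伟', '王玮'):.2f}")  # 0.50
```

Under these assumptions a single slip costs the Western name about 14% of its score, while the same single slip halves the score of the Chinese name, because each character carries far more of the name. A fixed similarity threshold tuned on alphabetic data would therefore behave very differently on ideographic data.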
There is still a lot of work to be done in international data quality. For more information, check out the Human Inference website on HIquality Name Worldwide.