Looking at the data can give you deeper insight into your data processes: not only how they work in theory, but how they work in practice too. How? By focusing on the exceptions!
For example, do you think the field ‘social security number’ is reliable because it is mandatory and uses a modulo 11 check digit? Maybe this field is filled with a fake value like ‘111222333’, or with the social security number of the agent who entered the data. A frequency check on the values in this field will reveal the values that occur suspiciously more often than others.
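A frequency check like this takes only a few lines of code. The sketch below is a minimal illustration, not a prescribed method: the function name `frequent_values`, the `min_count` threshold and the sample numbers are all made up for the example, and in practice you would read the values from your own table.

```python
from collections import Counter

def frequent_values(values, min_count=2):
    """Return (value, count) pairs that occur at least min_count times,
    most frequent first. Empty and whitespace-only values are ignored."""
    counts = Counter(v.strip() for v in values if v and v.strip())
    return [(value, n) for value, n in counts.most_common() if n >= min_count]

# Hypothetical sample: a social security number should normally be unique
# per customer, so any value that repeats deserves a closer look.
ssns = ["111222333", "123456782", "111222333",
        "987654321", "111222333", "123456782"]

print(frequent_values(ssns))
# → [('111222333', 3), ('123456782', 2)]
```

The values at the top of the list are the ones to investigate: a fake number that passes the check digit, or an agent's own number, will usually stand out there.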
Why do people enter fake values like that? One reason may be that staff at the front desk want to help their customer, or sell something, and the customer doesn’t know their social security number at that moment.
The same applies to all other fields. Address fields are mandatory, but the customer doesn’t want to give their address? Fill in the address of the office. Customer deceased? Put the text ‘(deceased)’ after their surname. Ex-directory phone number? Use the email field for entering the text ‘ex-directory’.
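Once you know which workarounds to expect, you can look for them directly. The following is only a rough sketch under assumptions: the field names (`surname`, `email`, `address`), the function `find_workarounds` and the office address are hypothetical, and the rules cover just the three examples mentioned above.

```python
def find_workarounds(record, office_address):
    """Return a list of findings for one customer record (a dict).
    Field names are assumptions made for this example."""
    findings = []

    # Status information stored inside the surname
    surname = (record.get("surname") or "").lower()
    if "(deceased)" in surname:
        findings.append("surname contains '(deceased)'")

    # Free text stored in the email field (crude test: no '@' present)
    email = (record.get("email") or "").strip()
    if email and "@" not in email:
        findings.append("email field holds free text: " + repr(email))

    # Office address used in place of the customer's address
    address = (record.get("address") or "").strip().lower()
    if address and address == office_address.strip().lower():
        findings.append("address equals the office address")

    return findings

# Hypothetical record that shows all three workarounds
record = {
    "surname": "Jansen (deceased)",
    "email": "ex-directory",
    "address": "1 Main Street",
}
for finding in find_workarounds(record, office_address="1 Main Street"):
    print(finding)
```

Each finding is a starting point for a conversation with the people who enter the data, rather than proof of an error.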
Get to know your processes by analyzing your data!