Friday, August 3, 2012

Glossary versus Vocabulary versus Concept System

Just a brief note on something that has been bothering me for a while.

In reading Prof. Campbell Harvey's Hypertextual Finance Glossary, I noticed that it contains the ISO 4217 Currency Codes; for example, there is an entry for "USD" with the definition "The ISO 4217 currency code for the USA Dollar". However, the ISO 4217 Currency Codes are sprinkled through the glossary, because all terms are arranged in alphabetical order (strictly, lexicographical order, since numbers and symbols have to be taken into account).

This means that there is no single list of ISO 4217 Currency Codes in the Glossary. Extracting them would require searching for a piece of text, such as "4217", which is hopefully in every definition.
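
To make the point concrete, here is a minimal sketch of that kind of extraction. It assumes the glossary has already been loaded as a simple mapping of term to definition. The `glossary` entries and the `find_terms` helper are hypothetical, invented for illustration; only the "USD" definition is taken from the glossary as quoted above.

```python
# Hypothetical glossary fragment: term -> definition.
# Apart from "USD", the entries are illustrative and not quoted from any real glossary.
glossary = {
    "Bond": "A debt instrument issued for a period of more than one year.",
    "EUR": "The ISO 4217 currency code for the Euro.",
    "Equity": "Ownership interest in a firm.",
    "USD": "The ISO 4217 currency code for the USA Dollar.",
}


def find_terms(glossary, needle):
    """Return, in sorted order, the terms whose definition contains `needle`."""
    return sorted(
        term for term, definition in glossary.items() if needle in definition
    )


currency_codes = find_terms(glossary, "4217")
print(currency_codes)  # ['EUR', 'USD']
```

The search only works if every relevant definition happens to contain the same piece of text, and if the searcher already knows which text to look for.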

Figure 1: Concept System of Customer Type

Extraction also requires the searcher to know in advance that there is such a concept system as Currency Code. But is this always possible? Consider what is shown in Figure 1. Here there are three concepts: Customer Type, Individual Customer, and Corporate Customer. Suppose these are put into a business glossary containing a few hundred entries, and that a reader comes across "Customer Type". How will the reader be aware that there are two other very closely related concepts, Individual Customer and Corporate Customer? Or if someone finds "Corporate Customer", how will they be aware of Customer Type and Individual Customer?

This shows that a large business glossary merges all concept systems and loses the relationships between concepts. It might be argued that I can put information into definitions to mitigate this. For instance, I could define Individual Customer as "a Customer Type where the customer is an individual person". However, it would be very unwise to list out all the Customer Types within every entry of the three types. This is not defining a concept, but repeatedly describing the concept system.

Still, I do wonder whether the understanding that definitions are to be included in a glossary affects the way they are written, so that there are attempts to document the concept system and the relations within it. This risks creating confusion.
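
As a rough sketch of what a glossary loses, the three concepts of Figure 1 can be held in a structure that records the relationships explicitly, rather than inside the wording of each definition. The `broader` mapping and the `related_concepts` helper are hypothetical names chosen for this illustration, and the single "broader concept" link is an assumption about how Figure 1 would be modelled.

```python
# Hypothetical concept system for Figure 1.
# Each concept points to its broader concept (its genus); None marks the top.
broader = {
    "Customer Type": None,
    "Individual Customer": "Customer Type",
    "Corporate Customer": "Customer Type",
}


def related_concepts(concept):
    """Return the broader concept, siblings, and narrower concepts of `concept`."""
    parent = broader[concept]
    # Siblings share the same broader concept; the top concept has none.
    siblings = sorted(
        c for c, p in broader.items()
        if parent is not None and p == parent and c != concept
    )
    # Narrower concepts are those that name `concept` as their broader concept.
    narrower = sorted(c for c, p in broader.items() if p == concept)
    return {"broader": parent, "siblings": siblings, "narrower": narrower}


print(related_concepts("Corporate Customer"))
# {'broader': 'Customer Type', 'siblings': ['Individual Customer'], 'narrower': []}
print(related_concepts("Customer Type"))
# {'broader': None, 'siblings': [], 'narrower': ['Corporate Customer', 'Individual Customer']}
```

A reader who lands on any one of the three concepts can then be shown the other two, without the definitions themselves having to describe the whole concept system.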

A better approach might be to have self-contained vocabularies. So we could form a mini-vocabulary for the three concepts shown in Figure 1. At least they will all be together. But it would seem logical to start off with the concept system itself, or at least the highest genus, Customer Type. However, "Corporate Customer" will sort lexicographically ahead of that. So this is another problem.
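
The ordering problem is easy to see in a small sketch. The list below holds the three concepts from Figure 1. The "genus first" ordering is only one possible workaround, assumed here for illustration, and the variable names are hypothetical.

```python
# The three concepts of the mini-vocabulary from Figure 1.
terms = ["Customer Type", "Individual Customer", "Corporate Customer"]

# Plain lexicographical sorting puts "Corporate Customer" ahead of the genus,
# because "Co" sorts before "Cu".
alphabetical = sorted(terms)
print(alphabetical)
# ['Corporate Customer', 'Customer Type', 'Individual Customer']

# One possible workaround: list the highest genus first, then sort the rest.
genus = "Customer Type"
genus_first = [genus] + sorted(t for t in terms if t != genus)
print(genus_first)
# ['Customer Type', 'Corporate Customer', 'Individual Customer']
```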

Producing a diagram and explanation of the concept system, in which all the concepts are defined and their relationships explained (is a relationship itself a concept, I wonder?), is probably the best way. So this leads us to conceptual models.

In reality, a glossary, a vocabulary, and a concept system are three views of the same semantic space, and we probably need all three. However, we have to recognize the limitations and advantages of each.