Monday, June 11, 2012

Dangers of Automated Hyperlinking in Definitions

In my previous post I noted the example of the definition of “Mortgage-Backed Securities” in Prof. Campbell Harvey’s Hypertextual Finance Glossary.  The definition is:

Securities backed by a pool of mortgage loans  [http://www.duke.edu/~charvey/Classes/wpg/glossary.htm]


The term “pool” is hyperlinked in this definition, and the definition of “pool” is:

In capital budgeting, the concept that investment projects are financed out of a pool of bonds, preferred stock, and common stock, and a weighted-average cost of capital must be used to calculate investment returns. In insurance, a group of insurers who share premiums and losses in order to spread risk. In investments, the combination of funds for the benefit of a common project, or a group of investors who use their combined influence to manipulate prices.

This definition of “pool” does not fit the use of the term “pool” as it appears in the definition of “Mortgage-Backed Securities”.  I would argue that “pool” in this context should be preliminarily defined as:

a set of mortgages with common characteristics that act as collateral for debt instruments 

This is quite different from Prof. Harvey’s definition.  How did such a difference arise?

I wonder if what we are looking at here is the automated hyperlinking of terms in definitions.  I do not know that this is happening in Prof. Harvey’s Glossary, but I have seen it as a feature of other semantic tools.  It is very convenient to type in a definition and have all the terms within it hyperlinked if they occur elsewhere in the vocabulary that is being constructed.  If this is not done, the analyst has to go and create the hyperlinks manually – a process that is very time-consuming.
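To make the idea concrete, here is a minimal sketch of what such automated hyperlinking might look like. It is not taken from Prof. Harvey’s Glossary or from any particular tool; the function name `auto_link` and the `[[...]]` link markers are my own assumptions for illustration.

```python
import re

def auto_link(definition, terms):
    """Mark every vocabulary term found in a definition as a link.

    Hypothetical sketch: links are shown as [[term]] rather than real
    hyperlinks, and matching is by spelling only.
    """
    if not terms:
        return definition
    # Try longer terms first so multi-word terms win over their substrings.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: "[[" + m.group(1) + "]]", definition)

print(auto_link("Securities backed by a pool of mortgage loans", ["pool"]))
# Securities backed by a [[pool]] of mortgage loans
```

Note that the match is purely on the spelling of the term. Nothing in this sketch asks whether the linked definition of “pool” is the sense intended in the sentence, which is exactly the risk.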

But while automated hyperlinking is a great feature, it needs to be controlled.  I think the best way to do this is to have a cross-reference report that provides a side-by-side comparison of a definition with the definitions of the terms used in it.  We can imagine the definition of a term on the left-hand side of a page, with the hyperlinked terms highlighted.  On the right-hand side of the page we could have the definitions of all the terms highlighted on the left.  The analyst can then check whether the way each term is used in the definition on the left is consistent with the definition of that term on the right.
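A report of this kind could be produced along the following lines. This is only a sketch under my own assumptions: the glossary is held as a simple dictionary, the function names are hypothetical, and the entry for “pool” is just an excerpt of the definition quoted above.

```python
import re
import textwrap

def used_terms(definition, vocabulary, exclude=None):
    """Return the vocabulary terms that occur in a definition, in order of appearance."""
    found = []
    for term in vocabulary:
        if term == exclude:
            continue
        match = re.search(r"\b" + re.escape(term) + r"\b", definition, re.IGNORECASE)
        if match:
            found.append((match.start(), term))
    return [term for _, term in sorted(found)]

def cross_reference_report(term, vocabulary, width=38):
    """Lay out a definition (left) beside the definitions of the terms it uses (right)."""
    left = textwrap.wrap(f"{term}: {vocabulary[term]}", width)
    right = []
    for used in used_terms(vocabulary[term], vocabulary, exclude=term):
        right.extend(textwrap.wrap(f"{used}: {vocabulary[used]}", width))
        right.append("")  # blank line between entries on the right
    rows = max(len(left), len(right))
    left += [""] * (rows - len(left))
    right += [""] * (rows - len(right))
    return "\n".join(f"{l:<{width}} | {r}".rstrip() for l, r in zip(left, right))

glossary = {
    "Mortgage-Backed Securities": "Securities backed by a pool of mortgage loans",
    # Excerpt only; the full glossary entry has further senses.
    "pool": "In insurance, a group of insurers who share premiums and losses in order to spread risk.",
}
print(cross_reference_report("Mortgage-Backed Securities", glossary))
```

The report does not decide anything by itself. It simply puts the two definitions next to each other so that the analyst can see at a glance that the insurance sense of “pool” does not fit a pool of mortgage loans.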

Simply relying on automated hyperlinking without this kind of check would seem to be inviting trouble.

A further check would be to ensure that the terms used in a definition do not link to definitions that are outside the vocabulary under consideration.  For instance, here is a definition of “pool” from http://oxforddictionaries.com/definition/pool--2?region=us

a supply of vehicles or goods available for use when needed

This is definitely not the concept identified by “pool” in Prof. Harvey’s Glossary.  However, a broad semantics repository may well hold a wide range of vocabularies, and a term in a definition might then get hyperlinked to a definition in an entirely different vocabulary, one that signifies a concept quite alien to the vocabulary containing the original definition.
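One way to guard against this is to resolve links only within the vocabulary that holds the original definition, and to flag any same-named terms found elsewhere. The sketch below assumes a repository keyed by vocabulary name; the names `finance` and `general` and the function `resolve_link` are hypothetical, and the finance entry is an excerpt of the definition quoted earlier.

```python
def resolve_link(term, home_vocabulary, repository):
    """Resolve a term inside its home vocabulary only.

    Returns (definition, foreign), where definition is None if the home
    vocabulary has no entry, and foreign lists the other vocabularies
    that define a term with the same spelling.
    """
    key = term.lower()
    definition = repository.get(home_vocabulary, {}).get(key)
    foreign = sorted(
        name
        for name, entries in repository.items()
        if name != home_vocabulary and key in entries
    )
    return definition, foreign

repository = {
    "finance": {
        "pool": "In insurance, a group of insurers who share premiums and losses in order to spread risk.",
    },
    "general": {
        "pool": "a supply of vehicles or goods available for use when needed",
    },
}

definition, foreign = resolve_link("pool", "finance", repository)
print(definition)
print("Also defined in:", foreign)
```

Listing the other vocabularies rather than silently ignoring them gives the analyst a chance to confirm that the same spelling really does stand for different concepts.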
