The Norm Program

The lexical program Norm generates normalized strings for the terms included in the SPECIALIST Lexicon. The normalization process involves stripping possessives, removing stop words such as "Not Otherwise Specified" (NOS), lower-casing each word, replacing punctuation with spaces, reducing each word to its uninflected form, breaking the string into its constituent words, and sorting the words in alphabetical order.

Below is an example of the normalization process for the term Hodgkin's diseases, NOS.

Hodgkin's diseases, NOS

Remove genitive

Hodgkin diseases, NOS

Remove stop words

Hodgkin diseases,

Lowercase

hodgkin diseases,

Strip punctuation

hodgkin diseases

Uninflect

hodgkin disease

Sort words

disease hodgkin
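
The steps above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the actual Norm implementation: the stop-word set, the uninflection table, and all function names are hypothetical stand-ins. The real program presumably draws on SPECIALIST Lexicon data for uninflection and uses its own stop-word list.

```python
import re
import string

# Hypothetical stop-word list (assumption); the real Norm list is larger.
STOP_WORDS = {"nos"}

# Hypothetical uninflection table (assumption); a stand-in for lexicon lookup.
BASE_FORMS = {"diseases": "disease"}


def remove_genitive(text):
    # "Hodgkin's" -> "Hodgkin"; plural possessive "nurses'" -> "nurses".
    text = re.sub(r"'s\b", "", text)
    return re.sub(r"s'(?!\w)", "s", text)


def remove_stop_words(text):
    # Compare each token without surrounding punctuation, case-insensitively.
    kept = [tok for tok in text.split()
            if tok.strip(string.punctuation).lower() not in STOP_WORDS]
    return " ".join(kept)


def strip_punctuation(text):
    # Replace punctuation with spaces, then collapse repeated whitespace.
    return " ".join(re.sub(r"[^\w\s]", " ", text).split())


def uninflect(text):
    # Map each word to its base form when the toy table knows it.
    return " ".join(BASE_FORMS.get(word, word) for word in text.split())


def sort_words(text):
    return " ".join(sorted(text.split()))


def norm(term):
    # Apply the steps in the same order as the worked example.
    text = remove_genitive(term)
    text = remove_stop_words(text)
    text = text.lower()
    text = strip_punctuation(text)
    text = uninflect(text)
    return sort_words(text)


print(norm("Hodgkin's diseases, NOS"))  # disease hodgkin
```

Keeping each step as a separate function makes it easy to print the intermediate strings and compare them with the example.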

The Norm program is used in systems to:

  • Find similar terms
  • Map terms to UMLS concepts
  • Find lexical variants for a term
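
One way such systems can use normalized strings is to index terms under their normalized form, so that terms differing only in case, punctuation, or word order share a key. The sketch below is an assumption about how this might be done, not a description of any particular system. `toy_norm` is a deliberately simplified, hypothetical normalizer (lowercase, strip punctuation, sort words) standing in for the real Norm program, and the term list is made up for illustration.

```python
import re
from collections import defaultdict


def toy_norm(term):
    # Simplified stand-in for Norm (assumption): lowercase, replace
    # punctuation with spaces, then sort the words alphabetically.
    words = re.sub(r"[^\w\s]", " ", term.lower()).split()
    return " ".join(sorted(words))


def build_index(terms):
    # Map each normalized string to the original terms that produce it.
    index = defaultdict(list)
    for term in terms:
        index[toy_norm(term)].append(term)
    return index


def find_similar(term, index):
    # Terms are treated as similar when their normalized strings match.
    return index.get(toy_norm(term), [])


# Illustrative terms only (not taken from the Lexicon).
terms = ["Hodgkin disease", "Disease, Hodgkin", "Lung cancer"]
index = build_index(terms)

print(find_similar("HODGKIN DISEASE", index))
# ['Hodgkin disease', 'Disease, Hodgkin']
```

The same index could map a normalized string to concept identifiers instead of raw terms, which is one plausible way to support mapping terms to UMLS concepts.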

