Unified Medical Language System® (UMLS®)

The Word Index Generator - Wordind

Wordind creates word indexes by breaking a string into a unique list of lowercased "words." Wordind defines a word as a sequence of one or more alphanumeric characters.

For example, the phrase "Increased heart rate in an overweight forty-year-old male" would generate:

  • increased
  • heart
  • rate
  • overweight
  • forty
  • year
  • old
  • male

Wordind reads from standard input and writes to standard output, one line per word. The UMLS uses this tool to produce the word index for the Metathesaurus (the file named MRXW.RRF).
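
The word definition above can be approximated in a few lines of Python. This is only a minimal sketch of the stated rule, not the Wordind implementation: the names word_index and index_lines are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and words are returned in first-seen order, which the description does not specify. The sketch applies the stated definition literally, so for the example phrase it would also emit "in" and "an"; any further filtering or input-format handling that the real tool performs is not modeled here.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
WORD_RE = re.compile(r"[A-Za-z0-9]+")


def word_index(text):
    """Return the unique lowercased words in text, in first-seen order."""
    seen = {}  # dicts preserve insertion order, so this deduplicates in order
    for match in WORD_RE.finditer(text):
        seen.setdefault(match.group().lower(), None)
    return list(seen)


def index_lines(lines):
    """Yield the index words for each input line, one word at a time."""
    for line in lines:
        yield from word_index(line)
```

To mimic the filter behavior described above, one could loop over index_lines(sys.stdin) and print each word on its own line.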

