# If we want to map abbreviations of full-English words (e.g. map “Jan” to “January”), how can we identify abbreviations which map to multiple words?

## Short Version:

### How can we construct such a trie which maps abbreviations of names-of-the-month to full-month (we map the abbreviation "mar" to "march")?

• The set of all abbreviations is formed by:
• keeping the first letter of the month name. (all abbreviations of "`january`" begin with "`j`")
• deleting 1 or more characters ("`jan`" deletes "`uary`" from "`january`")

### The Looonnnnggg Version:

How can we construct such a trie?
What algorithm will build the appropriate trie from the container of verbose strings.

Consider the English names for months of the year:

• January
• February
• March
• April
• [… truncated …]
• October
• November
• December

We find it useful say that the English names for months of the year are "verbose" strings.
For any "verbose" string $$v$$, and any string $$a$$ we say that $$a$$ is an "abbreviation" of $$v$$ if and only if all of the following conditions are met:

• $$a$$ non-empty. $$|a| \geq 1$$
• $$a$$ can formed by deleting 1 or more characters from "verbose" string $$v$$
• $$a(1) = v(1)$$. Assume that string indexing begins at $$1$$, and not $$0$$.

For example, "`jan`" is an abbreviation of "`january`."

Suppose you want to write an algorithm which:

• accepts a list of verbose strings as inputs.
• the algorithm outputs a "trie" data-structure (information retrieval tree) $$T$$ such that:
• The trie $$T$$ accepts any ASCII string as input.
• An output (leaf node) of the trie should be set of strings $$S$$ such that:
• every string in $$S$$ is a verbose string
• the string fed as input into trie $$T$$ is an abbreviation of every verbose string in container $$S$$

Some examples of input to the trie and output of the trie are shown below:

• Example 1

• Input: "`Ma`"
• Output: $$\{$$"`March`", "`May`"$$\}$$
• Example 2

• Input: "`Mar`"
• Output: $$\{$$"`March`"$$\}$$
• Example 3

• Input: "`Decuary`"
• Output: $$\{$$"` `"$$\}$$ ….. the empty-set

The output from the trie should be one of:

• the empty set
• a set of one item
• a set of two or more items

For months of the year, we might write javascript so that an end-user can type in any half-way reasonable date-format, instead getting an error message when they put slashes instead of dashes, etc….

If you do not like the months of the year application, a different use-case would be to write write your own Linux Shell (similar to BASH). Maybe any half-way reasonable abbreviation of "`make directory`" will map to "`mkdir`" In that case, we could have many-to-one mapping from high-level shell-commands to low-level Linux commands.

### The question is:

How can we construct such a trie?
What algorithm will build the appropriate trie from the container of verbose strings.

Also, can we avoid brute-force generating a list of all aberrations before-hand? The set of all strings form-able by deleting 1 or more characters from the verbose strings is quite large. We would like to avoid combinatorial explosion, if we can.

The programming language (Java, python, C+ + ) does not matter for answering this question.