Twice

Created: 2014-05-19 06:56
Updated: 2014-05-19 07:03
c++

README.md

Word Jumble

Problem Specification

The program should accept a string as input, and then return a list of words that can be created using the submitted letters

Usage

  • Type: make
  • Execute: ./JumbleSolver
  • The program takes one command line argument which is the jumble of words. Otherwise it asks for a word jumble.

Technical spec

There's a couple of ways to approach this problem. One important question is will this program be used infrequently or will it be called repeatedly, for example as part of some web service that's solving online word jumbles. I chose to work under the condition that the function is going to be called a lot so I wrote the jumbleSolver class with the ability to build a hashmap of strings -> a list of strings.

Every word in the dictionary is sorted and that sorted version is used to index the hashmap. We then insert the original word into the list. This removes the need to check every k-permutation, instead I just check every n choose k combination as k goes from 1 to n.

At a certain limit it takes less time to search through the entire dictionary then it does to generate every n choose k combination as k goes from 1 to n. If we let the dictionary have size m and the average length of a word be x and our input jumble be of size n then we need to solve for what value of n causes:

                                    m*lg(x) + 2^n -1 > m*(x*lg(x)+x)

If we assume the average length of a word is 4 and our dictionary size is 86000 then we find n to be about 18. When our word jumble has more than 18 letters in it it's faster to just search the dictionary as is then generate all the combinations.

If the function wasn't going to be called often then it doesn't pay to do much preprocessing. I could either visit each k-permutation and perform a binary search on the dictionary file to see if that k-permutation is a valid word. Or I could do some preprocessing of the dictionary by building a trie and then instead of searching every k-permutation only search permutations that have valid prefixes in them. I chose not to do this simply because implementing a trie and debugging it has, in my experience, taken more than 30 minutes to do. And making it a generic robust trie takes even longer.

Cookies help us deliver our services. By using our services, you agree to our use of cookies Learn more