Scramble Middle Letters

Pages: 12
@Ganado
That’s so coool!! Hmm. I wonder if I could repurpose that to make it into a working dictionary...?
@Ganado
I wasn’t sure posting the scrambled text would be very readable, not least of which because many users of this site read English as a second (or third) language.

Very nice.

@gongong
Er, it is a list of words in a text file. He simply tries every possible permutation of the letters and prints out every matching word. Download the words file and run the program. It’ll take a second or two to work. (~.5 to 1 sec to load the words file, ~.5 to 1 second to process every permutation.)

If you really want to speed things up for the general case you’ll want to do two things: sort the letters of the word and hash them as a general lookup key. C++ makes this easy enough using an unordered mapping, and it improves your lookup speed tremendously, with only a very small increase in the time it takes to load the dictionary.

One thing that words.txt does that always catches people unaware is CaSe ProBLemS. You’ve got to take the time tolower() everything, alas.

Here’s a variation that can process the output of my previous program:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
/*
  cl /EHsc /Ox /std:c++17 /DPROFILE unscramble.cpp
  g++      -O3 -std=c++17 -DPROFILE unscramble.cpp -o unscramble
  clang++  -O3 -std=c++17 -DPROFILE unscramble.cpp -o unscramble
*/

#include <algorithm>
#include <cctype>
#include <ciso646>
#include <ctime>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <unordered_map>

//-------------------------------------------------------------------------------------------------
#ifdef PROFILE
  #define CLOCK( v ) auto v = std::clock();
  #define COUT_ELAPSED( message, u, v ) std::cout << "(" << message << ((double)(v - u) / CLOCKS_PER_SEC) << " seconds)\n";
#else
  #define CLOCK( n )
  #define COUT_ELAPSED( message, u, v )
#endif

//-------------------------------------------------------------------------------------------------
#ifdef _WIN32
  #include <io.h>
  #ifndef isatty
  #define isatty _isatty
  #endif
#else
  #include <unistd.h>
#endif

//-------------------------------------------------------------------------------------------------
std::string& sort( std::string& s )
{
  std::sort( s.begin(), s.end() );
  return s;
}

std::string& tolower( std::string& s )
{
  for (auto& c : s) c = std::tolower( c );
  return s;
}

//-------------------------------------------------------------------------------------------------
struct Words: public std::unordered_multimap <std::string, std::string>
{
  Words()
  {
    std::ifstream f( "words.txt" );
    std::string word;
    while (f >> word)
    {
      std::string dorw = tolower( word );
      emplace( sort( dorw ), word );
    }
  }

  std::vector <std::string> lookup( std::string& s )
  {
    std::vector <std::string> matches;
    auto [first, last] = equal_range( sort( tolower( s ) ) );
    while (first != last)
      matches.emplace_back( first++ -> second );
    return matches;
  }
};

//-------------------------------------------------------------------------------------------------
template <typename Iterator>
std::string join( Iterator first, Iterator last, const std::string& sep )
{
  if (first == last) return "";
  std::string s = *first++;
  while (first != last) s.append( sep ).append( *first++ );
  return s;
}

template <typename Container>
std::string join( const Container& strings, const std::string& sep )
{
  return join( strings.begin(), strings.end(), sep );
}

//-------------------------------------------------------------------------------------------------
std::string read_file( std::istream& ins )
{
  std::string s;
  getline( ins, s, '\0' );
  return s;
}

//-------------------------------------------------------------------------------------------------
int main( int argc, char** argv )
{
  CLOCK( t0 )

  Words words;

  CLOCK( t1 )
  COUT_ELAPSED( "Loaded words in ", t0, t1 )

  // Our input may be some combination of command-line arguments and standard input
  // We only ask for standard input if it is a redirected file or there are no args
  auto s = join( argv + 1, argv + argc, " " );
  if (!isatty( 0 ) or s.empty())
    s.append( " " ).append( read_file( std::cin ) );

  // We are only interested in alphabetic input, and all input must be minuscule
  // (We'll ignore numbers later)
  for (auto& c : s)
  {
    if (std::isalnum( c )) c = std::tolower( c );
    else                   c = ' ';
  }

  // For each word in the input, look up all the unscramble possibilities
  std::size_t nwords = 0;
  std::istringstream ss{ s };
  while (ss >> s)
  {
    if (isdigit( s[0] ))
      std::cout << s << "\n";
    else
    {
      nwords += 1;
      std::cout << join( words.lookup( s ), " " ) << "\n";
    }
  }

  CLOCK( t2 )
  COUT_ELAPSED( nwords << " words unscrambled in ", t1, t2 )
  COUT_ELAPSED( "Total time is ",                   t0, t2 )
}

The speedup for a single lookup is pretty stark, even with all the additional string transforms performed by the new code:

C:\Users\Michael\Programming\cpp\cpp.com\unscramble> unscramble.ganado trosensapd
(Loaded words in 0.618 seconds)
transposed
(1 words unscrambled in 0.527 seconds)
(Total time is 1.147 seconds)

C:\Users\Michael\Programming\cpp\cpp.com\unscramble> unscramble.ganado trosensapd
(Loaded words in 0.639 seconds)
transposed
(1 words unscrambled in 0.519 seconds)
(Total time is 1.162 seconds)

C:\Users\Michael\Programming\cpp\cpp.com\unscramble> unscramble.ganado trosensapd
(Loaded words in 0.61 seconds)
transposed
(1 words unscrambled in 0.524 seconds)
(Total time is 1.137 seconds)

Average: 1.149 seconds

C:\Users\Michael\Programming\cpp\cpp.com\unscramble> unscramble.duthomhas trosensapd
(Loaded words in 0.712 seconds)
transposed
(1 words unscrambled in 0.004 seconds)
(Total time is 0.716 seconds)

C:\Users\Michael\Programming\cpp\cpp.com\unscramble> unscramble.duthomhas trosensapd
(Loaded words in 0.693 seconds)
transposed
(1 words unscrambled in 0.005 seconds)
(Total time is 0.698 seconds)

C:\Users\Michael\Programming\cpp\cpp.com\unscramble> unscramble.duthomhas trosensapd
(Loaded words in 0.709 seconds)
transposed
(1 words unscrambled in 0.003 seconds)
(Total time is 0.712 seconds)

Average: 0.707 seconds

Enjoy!
Topic archived. No new replies allowed.
Pages: 12