Ruby Q. wrote:
Write a Ruby crossword solver. Randomly fill a crossword template with words
from a dictionary. Use any dictionary you want (/usr/share/dict/words, text of
pickaxe, etc.).
This solution requires Gecode/R ( http://gecoder.rubyforge.org/ ).
== Overview
The problem is modelled as a matrix of cells where each cell can be
assigned a number representing the letters a…z and #. The letters are
converted to integers in order to use integer variables to represent
them. Similarly each word is converted to a base 26 number with the
converted letters as digits. Hence allowing the words to be represented
as integers as well.
The constraint that sequences of cells longer than one letter must form
words is then added by constraining that the linear combination of the
involved cells forming the corresponding base 26 number, with each
variable as one digit, must be equal to one of the words converted to a
number.
== Weaknesses
It has two major problems
- The words converted to numbers have to fit in the range of an integer
variable, which only allows words of up to 6 characters in my case. This
can probably be helped by using multiple numbers to represent each word.
- There’s no randomness in the exploration. Gecode itself does support
custom branching (which would allow such randomness), but Gecode/R does
not.
In some cases the exploration will take a long time. I’m unsure whether
it’s because of flaws in the model or because of the problem’s
difficulty.
== Code
require ‘enumerator’
require ‘rubygems’
require ‘gecoder’
The base we use when converting words to and from numbers.
BASE = (‘a’…‘z’).to_a.size
The offset of characters compared to digits in word-numbers.
OFFSET = ‘a’[0]
The range of integers that we allow converted words to be in. We are
only using the unsigned half, we could use both halves, but it would
complicate
things without giving a larger allowed word length.
ALLOWED_INT_RANGE = 0…Gecode::Raw::Limits::Int::INT_MAX
The maximum length of a word allowed.
MAX_WORD_LENGTH = (Math.log(ALLOWED_INT_RANGE.last) /
Math.log(BASE)).floor
Describes an immutable dictionary which represents all contained words
as numbers of base BASE where each digit is the corresponding letter
itself converted to a number of base BASE.
class Dictionary
Creates a dictionary from the contents of the specified dictionary
file which is assumed to contain one word per line and be sorted.
def initialize(dictionary_location)
@word_arrays = []
File.open(dictionary_location) do |dict|
previous_word = nil
dict.each_line do |line|
word = line.chomp.downcase
# Only allow words that only contain the characters a-z and are
# short enough.
next if previous_word == word or word.size > MAX_WORD_LENGTH or
word =~ /[^a-z]/
(@word_arrays[word.length] ||= []) << self.class.s_to_i(word)
previous_word = word
end
end
end
Gets an enumeration containing all numbers representing word of the
specified length.
def words_of_size(n)
@word_arrays[n] || []
end
Converts a string to a number of base BASE (inverse of #i_to_s ).
def self.s_to_i(string)
string.downcase.unpack(‘C*’).map{ |x| x - OFFSET }.to_number(BASE)
end
Converts a number of base BASE back to the corresponding string
(inverse of #s_to_i ).
def self.i_to_s(int)
res = []
loop do
digit = int % BASE
res << digit
int /= BASE
break if int.zero?
end
res.reverse.map{ |x| x + OFFSET }.pack(‘C*’)
end
end
class Array
Computes a number of the specified base using the array’s elements
as digits.
def to_number(base = 10)
inject{ |result, variable| variable + result * base }
end
end
Models the solution to a partially completed crossword.
class Crossword < Gecode::Model
The template should take the format described in RubyQuiz #132 . The
words used are selected from the specified dictionary.
def initialize(template, dictionary)
@dictionary = dictionary
# Break down the template and create a corresponding square matrix.
# We let each square be represented by integer variable with domain
# -1...BASE where -1 signifies # and the rest signify letters.
squares = template.split(/\n\s*\n/).map!{ |line| line.split(' ') }
@letters = int_var_matrix(squares.size, squares.first.size,
-1...BASE)
# Do an initial pass, filling in the prefilled squares.
squares.each_with_index do |row, i|
row.each_with_index do |letter, j|
unless letter == '_'
# Prefilled letter.
@letters[i,j].must == self.class.s_to_i(letter)
end
end
end
# Add the constraint that sequences longer than one letter must form
# words. @words will accumelate all word variables created.
@words = []
# Left to right pass.
left_to_right_pass(squares, @letters)
# Top to bottom pass.
left_to_right_pass(squares.transpose, @letters.transpose)
branch_on wrap_enum(@words), :variable => :largest_degree,
:value => :min
end
Displays the solved crossword in the same format as shown in the
quiz examples.
def to_s
output = []
@letters.values.each_slice(@letters.column_size) do |row|
output << row.map{ |x| self.class.i_to_s(x) }.join(’ ‘)
end
output.join(“\n\n”).upcase.gsub(’#', ’ ')
end
private
Parses the template from left to right, line for line, constraining
sequences of two or more subsequent squares to form a word in the
dictionary.
def left_to_right_pass(template, variables)
template.each_with_index do |row, i|
letters = []
row.each_with_index do |letter, j|
if letter == ‘#’
must_form_word(letters) if letters.size > 1
letters = []
else
letters << variables[i,j]
end
end
must_form_word(letters) if letters.size > 1
end
end
Converts a word from integer form to string form, including the #.
def self.i_to_s(int)
if int == -1
return ‘#’
else
Dictionary.i_to_s(int)
end
end
Converts a word from string form to integer form, including the #.
def self.s_to_i(string)
if string == ‘#’
return -1
else
Dictionary.s_to_i(string)
end
end
Constrains the specified variables to form a word contained in the
dictionary.
def must_form_word(letter_vars)
raise ‘The word is too long.’ if letter_vars.size > MAX_WORD_LENGTH
# Create a variable for the word with the dictionary’s words as
# domain and add the constraint.
word = int_var @dictionary.words_of_size(letter_vars.size)
letter_vars.to_number(BASE).must == word
@words << word
end
end
puts ‘Reading the dictionary…’
dictionary = Dictionary.new(ARGV.shift || ‘/usr/share/dict/words’)
puts ‘Please enter the template (end with ^D)’
template = ‘’
loop do
line = $stdin.gets
break if line.nil?
template << line
end
puts ‘Building the model…’
model = Crossword.new(template, dictionary)
puts ‘Searching for a solution…’
puts((model.solve! || ‘Failed’).to_s)
END