Hi,
I’m using ferret (via acts_as_ferret) in a somewhat unorthodox manner
and am having a strange wildcard problem. Before anyone wonders why
we’re doing things this way, the answer is basically that it lets us
precompute what would be expensive database queries and store the
results in a simple way (ferret index) prior to pushing the static data
to our production server.
For the sake of simplicity, say I’ve got two models, both of which are
indexed on a similar (but separate) non-model field. However, one of
those two models does not return the proper number of results for a
wildcard search:
First of all, there’s a non-indexed model called ProductTuple that has
a supplier_id, a product_category_id, and a product_material_id, plus
some other id fields that aren’t really important here. Thus, a
ProductTuple has foreign key relationships to Suppliers,
ProductCategories, and ProductMaterials, but for ferret purposes just
think of those foreign keys as what they are: ids (i.e. integers).
The first model, Supplier, is ferret-indexed on several fields, such as
the supplier name and supplier country, as well as the
‘ferret_product_tuples’ non-model field. ferret_product_tuples simply
takes all the product tuples for a supplier and concatenates their
product_category_id, product_material_id, etc. with delimiters.
So, for a product tuple with product_category_id 82, product_material_id
88, and undefined product_technique_id, the resulting part of the
ferret_product_tuple string would look like x00082_00088_00000x (where
we use 00000 to indicate null). The xs are used as anchors, essentially,
as a given supplier’s ferret_product_tuple string might look like
‘x00082_00088_00000x x00000_00081_00013x’.
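
To make the encoding concrete, here is a minimal sketch of how such a string could be built. The helper names (encode_id, tuple_token, ferret_product_tuple) are hypothetical and only illustrate the format described above: five-digit zero-padded ids, 00000 for null, underscores between ids, and x anchors around each tuple.

```ruby
# Hypothetical helpers; only the output format is taken from the description above.

# Zero-pad an id to five digits, using 00000 for a missing (nil) id.
def encode_id(id)
  format("%05d", id || 0)
end

# Build one anchored token from category, material and technique ids.
def tuple_token(category_id, material_id, technique_id)
  ids = [category_id, material_id, technique_id]
  "x" + ids.map { |id| encode_id(id) }.join("_") + "x"
end

# Concatenate the tokens for all of a record's product tuples.
def ferret_product_tuple(tuples)
  tuples.map { |t| tuple_token(*t) }.join(" ")
end

puts ferret_product_tuple([[82, 88, nil], [nil, 81, 13]])
```

Each token is 19 characters long, with 17 characters between the two x anchors.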
Now, the ferret query that gets constructed when we do the relevant
queries simply looks like:
‘ferret_product_tuple:x00082_???_???x’ and this would, in the above
instance, match that supplier.
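
For checking pattern widths against the stored tokens, here is a small plain-Ruby stand-in for single-token wildcard matching. It is not Ferret's API and it ignores analyzers entirely. It assumes that ? matches exactly one character and * matches any run of characters, as in Lucene-style wildcard syntax; under that assumption a pattern needs one ? per character it is meant to cover.

```ruby
# Plain-Ruby stand-in for wildcard matching against whitespace-separated tokens.
# Assumption: '?' matches exactly one character, '*' matches any run of characters.
def wildcard_to_regexp(pattern)
  body = pattern.chars.map do |ch|
    case ch
    when "?" then "."
    when "*" then ".*"
    else Regexp.escape(ch)
    end
  end.join
  # Anchor the pattern so it has to cover the whole token.
  Regexp.new("\\A#{body}\\z")
end

# True if any token in the field value matches the wildcard pattern.
def matches_any_token?(pattern, field_value)
  regexp = wildcard_to_regexp(pattern)
  field_value.split(" ").any? { |token| regexp.match?(token) }
end

field = "x00082_00088_00000x x00000_00081_00013x"

puts matches_any_token?("x00082_?????_?????x", field)
puts matches_any_token?("x00082_???_???x", field)
puts matches_any_token?("x?????_00081_?????x", field)
```

With these assumptions, the full-width patterns (five ? per id) match a token in the example string, while the three-character form does not, so it is worth confirming that the queries actually sent to ferret use the full width.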
Everything I’ve described works perfectly, EXCEPT…
We also index product_categories on this same string. So product
category #82 would have a bunch of ferret_product_tuple strings that
start out x00082 and have various things in the other positions. Here’s
what’s strange… a product_category query for
‘ferret_product_tuple:x?????????x’ should return ALL product
categories, right? Yet it only returns six. A product category query for
‘ferret_product_tuple:x???00081???x’ should return all the product
categories that share product_tuples with product_material #81, but in
fact returns only a small number of categories. Yet making the wildcard
match MORE restrictive by substituting
‘ferret_product_tuple:x00082_00081_???x’ into that query yields
product_category #82, which is erroneously not included in the six results
for ‘ferret_product_tuple:x???00081???x’.
So, have I stumbled upon a bug in the wildcard handling? My initial
thought was that the different analyzer I was using for the
product_category index was the culprit, but I changed that analyzer out
to no effect, so I’ve ruled that out.
Any ideas? Thanks!