On 02/02/2012 08:21 PM, Dmitry N. wrote:
Ryan D. wrote in post #1043813:
IMHO re.match is just as useless as Regexp.new, Array.new, and Hash.new
(assuming no args/blocks passed). They’re throwbacks to java devs and
serve no purpose but to make things more verbose. In this specific case,
there are tangible reasons to use =~ over #match.
The reason I tried to use Regexp.new is because I figured it would
pre-compile the regexp - the way “qr/ test /” in Perl would do, so that
it doesn’t have to re-compile it on every iteration.
Everything in Ruby is an object, even regexps, so you can save your
regexp to a variable or a constant to avoid a recompile. In addition,
the // expression is pretty much just syntactic sugar for
Regexp.new(“some string”) or Regexp.new(/some regexp/), so you can
forgoe that noise. The sugar is probably faster too since it should
avoid Ruby method calls, unlike Regexp.new, not that it should be an
issue in this example.
To see if this helps at all, try changing the code to the following:
s = “This is a test string”
re = / test /
for a in 0…1E7
s =~ re
end
Try a similar change to the other looping variations that have been
discussed and see if and how much they may improve. For me I didn’t
really see any difference between using re as above or using the simple
regexp directly; however, the code was almost an order of magnitude
slower when I replaced the comparison as follows:
s =~ / test#{} /
It seems that Ruby is smart enough to see that the simple regexp will
never need to be re-evaluated. The regexp used above must force that
optimization off because #{} while constantly evaluated to the empty
string is technically dynamic, thus the regexp needs to be re-evaluated
in every iteration of the loop.
If you really need performance in the end, however, you might want to
consider coding your critical code paths in something like C and then
calling those from Ruby as a direct extension or using something like
ffi to call into a DLL containing the logic. Your overall code base may
be a little messy, but sometimes the speed you need requires such a
trade-off. Hopefully, you can keep the mess limited to only a small set
of your overall application logic. Of course, the same holds true for
Perl in this regard.
-Jeremy