Thanks Robert, Corey, Philip, Stephano, et al., for all of the great suggestions. However, they all seem to ignore the conditional nature of ‘y’ as a vowel. I would like the regex to treat ‘y’ as a vowel when there is no other vowel either before or after it. The string I used initially was drawn from the following Perl code that accomplishes this (found at http://www.perlmonks.org/?node_id=592867):
my @vowels = ( /[aeiou]|(?<![aeiou])y(?![aeiou])/gi );
The “(?<!..)” is a “zero-width negative-look-behind assertion”.
The “(?!..)” is a “zero-width negative-look-ahead assertion”.
Together, they match the condition of treating “y as a vowel only if there is no other vowel before or after it.”
This was my attempt at converting the Perl fragment to Ruby syntax:
scan(/[aeiou]|(?![aeiou])y(?![aeiou])/i)
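A side note on the attempt above: “(?![aeiou])” placed in front of the ‘y’ is still a look-ahead, so it only checks that the next character (the ‘y’ itself) is not a vowel, which is always true. The sketch below compares the two patterns. It assumes Ruby 1.9 or later, where look-behind is built in, and the constant names are only illustrative.

```ruby
# Pattern from the attempted conversion: the first group is a look-ahead,
# so it never inspects the character before the 'y'.
ATTEMPT = /[aeiou]|(?![aeiou])y(?![aeiou])/i

# Pattern equivalent to the Perl original, with a real look-behind.
INTENDED = /[aeiou]|(?<![aeiou])y(?![aeiou])/i

word = "boy"
attempt_result  = word.scan(ATTEMPT)   # 'y' is wrongly kept as a vowel
intended_result = word.scan(INTENDED)  # 'y' follows 'o', so it is skipped

puts "attempt:  #{attempt_result.inspect}"
puts "intended: #{intended_result.inspect}"
```

With “boy”, the attempted pattern counts the ‘y’ as a vowel even though it follows an ‘o’, while the look-behind version does not.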
I have since discovered that Ruby 1.8 lacks regex look-behind assertions, which explains why the code was failing. As a result, I have fallen back to the following, which currently ignores the ‘y’:
class String
  def vowels
    scan(/[aeiou]/i)
  end
  def consonants
    scan(/[^aeiou]/i)
  end
end
Any ideas how to modify this to include the conditional treatment of “y as a vowel only if there is no other vowel before or after it”? (I.e., is there a way to simulate the Perl “zero-width negative-look-behind” and “zero-width negative-look-ahead” assertions for ‘y’ in Ruby 1.8?)
Interesting. I guess you could post-process a bit on the two sets that you get back. A regular expression that can handle the y would be good, I guess.
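The post-processing idea could look roughly like the sketch below, which avoids look-behind altogether by walking the string and checking each ‘y’ against its neighbours. The names VOWELS, plain_vowel? and split_letters are made up for illustration; it should behave the same on Ruby 1.8 and later, although I have only reasoned it through for ASCII input.

```ruby
VOWELS = 'aeiou'

# Hypothetical helper: true when index i is inside the string and holds
# one of a, e, i, o, u (case-insensitive).
def plain_vowel?(str, i)
  i >= 0 && i < str.length && VOWELS.include?(str[i, 1].downcase)
end

# Returns [vowels, consonants]. A 'y' counts as a vowel only when neither
# neighbour is a plain vowel; non-letters are ignored.
def split_letters(str)
  vowels = []
  consonants = []
  str.length.times do |i|
    ch = str[i, 1]
    next unless ch =~ /[a-z]/i
    if VOWELS.include?(ch.downcase)
      vowels << ch
    elsif ch.downcase == 'y'
      if plain_vowel?(str, i - 1) || plain_vowel?(str, i + 1)
        consonants << ch
      else
        vowels << ch
      end
    else
      consonants << ch
    end
  end
  [vowels, consonants]
end

v, c = split_letters("yellow")
puts "vowels: #{v.inspect}, consonants: #{c.inspect}"
```

The bounds check in plain_vowel? matters: without it, a ‘y’ at the very start or end of the string would be compared against an empty or wrapped-around neighbour.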
they all seem to ignore the conditional nature of ‘y’
as a vowel.
What about diphthongs? Technically a diphthong is one vowel
made up of two letters. The rules vary by language; Spanish
even has triphthongs, e.g. in Raoul. A diphthong/triphthong
occurs wherever a succession of vowel symbols doesn’t contain
a syllable break.
Robert
Oniguruma is the default regex library for Ruby 1.9. It includes look-behind assertions (woohoo!) … and … it turns out that there is a gem available, so you don’t have to upgrade to Ruby 1.9 or monkey around with creating a custom Ruby 1.8.x build.
The gem relies upon a C library that has to be downloaded and installed first. After installing the library, installing the oniguruma gem is a breeze using:
sudo gem install -r oniguruma
However, my progress has come to a screeching halt as I am now
receiving the following error:
** Starting Mongrel listening at 0.0.0.0:3000
** Starting Rails with development environment…
Exiting
/usr/local/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `gem_original_require': ./lib/string_extensions.rb:4: undefined (?...) sequence: /[aeiou]|(?<![aeiou])y(?![aeiou])/ (SyntaxError)
./lib/string_extensions.rb:8: undefined (?...) sequence: /![aeiou]|(?<=[aeiou])y(?=[aeiou])/
from /usr/local/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `require'
It seems to be complaining about the look-behind and look-ahead assertions in the following code fragment (which oniguruma is supposed to support):
class String
  def vowels
    scan(/[aeiou]|(?<![aeiou])y(?![aeiou])/i)
  end
  def consonants
    scan(/![aeiou]|(?<=[aeiou])y(?=[aeiou])/i)
  end
end
According to the Oniguruma reference (oniguruma/doc/RE.txt), the look-behind and look-ahead syntax that I am using appears to be correct (see section 7, Extended groups).
This suggests either:
A. Ruby may be using the default regexp library instead of the
oniguruma regexp library,
B. The oniguruma regexp library is not accessible via the ‘scan’
method, or
C. Something else entirely
… hmmm …
Any suggestions?
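One way to narrow down hypothesis A is to ask the built-in regexp engine directly whether it accepts a look-behind. The sketch below (the method name lookbehind_supported? is made up for illustration) builds the pattern from a string at run time, so an unsupported construct raises a rescuable RegexpError instead of aborting the whole file with a SyntaxError the way a regexp literal does.

```ruby
# Hypothetical probe: returns true when the built-in Regexp engine
# accepts a negative look-behind, false when it rejects the pattern.
def lookbehind_supported?
  Regexp.new('(?<![aeiou])y')
  true
rescue RegexpError
  false
end

puts "Ruby #{RUBY_VERSION}: built-in look-behind supported? #{lookbehind_supported?}"
```

If this prints false, plain /.../ literals and String#scan are presumably still going through the built-in engine, whatever gems happen to be installed.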
Thanks for all the help, everyone. The problem was solved with help from pullmonkey on Rails Forum! Here is the solution:
Objective:
Extract vowels and consonants from a string
Handle the conditional treatment of ‘y’ as a vowel under the
following circumstances:
y is a vowel if it is surrounded by consonants
y is a consonant if it is adjacent to a vowel
Here is the code that works:
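# Assumes the oniguruma gem (and the C library it depends on) is installed.
require 'rubygems'
require 'oniguruma'

# Vowels: a, e, i, o, u, plus any 'y' with no vowel directly before or after it.
def vowels(name_str)
  reg = Oniguruma::ORegexp.new('[aeiou]|(?<![aeiou])y(?![aeiou])')
  reg.match_all(name_str).to_s.scan(/./)
end

# Consonants: the unconditional consonants, plus any 'y' adjacent to a vowel.
# The pattern must stay on one line; a line break inside the string would
# become part of the regexp.
def consonants(name_str)
  reg = Oniguruma::ORegexp.new('[bcdfghjklmnpqrstvwxz]|(?<=[aeiou])y|y(?=[aeiou])')
  reg.match_all(name_str).to_s.scan(/./)
end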
(Note, the .scan(/./) can be eliminated to return an array)
The major problem was getting the code to accurately treat “y” as a consonant. The key to solving this problem was to:
define unconditional consonants explicitly (i.e., [bcdfghjklmnpqrstvwxz]), not as [^aeiou], which automatically includes “y” and thus overrides any conditional treatment of “y” that follows
define the conditional “y” assertions independently, i.e., “|(?<=[aeiou])y|y(?=[aeiou])”, not “|(?<=[aeiou])y(?=[aeiou])”, which only matches a “y” that is preceded AND followed by a vowel, rather than preceded OR followed by a vowel
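For anyone who wants to see the two points in action, here is a small sketch using the built-in Regexp on Ruby 1.9 or later (so no gem is needed). The constant names and sample words are my own, not from the solution above.

```ruby
# 'y' preceded AND followed by a vowel (too strict for consonant detection).
AND_Y = /(?<=[aeiou])y(?=[aeiou])/

# 'y' preceded OR followed by a vowel (two independent alternatives).
OR_Y = /(?<=[aeiou])y|y(?=[aeiou])/

# Negated class: silently includes every 'y'.
NEGATED = /[^aeiou]/

# Explicit consonant class plus the conditional 'y' alternatives.
EXPLICIT = /[bcdfghjklmnpqrstvwxz]|(?<=[aeiou])y|y(?=[aeiou])/

words = "mayor yes boy"
and_count = words.scan(AND_Y).length  # only the 'y' in "mayor"
or_count  = words.scan(OR_Y).length   # the 'y' in all three words

negated_myth  = "myth".scan(NEGATED)   # 'y' slips in as a consonant
explicit_myth = "myth".scan(EXPLICIT)  # 'y' is left for the vowel pattern

puts "AND matches: #{and_count}, OR matches: #{or_count}"
puts "negated: #{negated_myth.inspect}, explicit: #{explicit_myth.inspect}"
```

The “myth” case shows why the negated class is a problem: once [^aeiou] has matched the ‘y’, the conditional alternatives after it never get a say.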
HTH.