I’m wondering if there is a method for the String class that splits a
string on some characters and keeps the split characters in the elements
of the resulting array?
The split method returns an array in this example:
p "This is a sentence. This is a sentence! This is a sentence?".strip.split(/\.|\?|!/)
["This is a sentence", " This is a sentence", " This is a sentence"]
The three sentences in the above string have very different meanings,
but lose those meanings without the punctuation, so I'd like to keep
the punctuation. I'd like a method that keeps the split characters and
returns this array:
["This is a sentence.", " This is a sentence!", " This is a sentence?"]
Does such a method exist? If not, would it be possible to modify the
split method to produce that result?
If you put the pattern in a group, the matched delimiters will be included
in the array, but not quite in the way you described:
irb(main):001:0> p "This is a sentence. This is a sentence! This is a sentence?".strip.split(/(\.|\?|!)/)
["This is a sentence", ".", " This is a sentence", "!", " This is a sentence", "?"]
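The separate delimiter elements can be glued back onto the pieces that precede them. A minimal sketch, assuming Ruby 1.8.7 or later (on 1.8.6 I believe each_slice needs require 'enumerator' and a slightly different call); the variable names are only for illustration:

```ruby
text = "This is a sentence. This is a sentence! This is a sentence?"

# The capture group makes split return the delimiters as separate elements.
parts = text.strip.split(/(\.|\?|!)/)

# Pair each sentence with the delimiter that follows it and join the pair.
sentences = parts.each_slice(2).map { |pair| pair.join }

p sentences
# => ["This is a sentence.", " This is a sentence!", " This is a sentence?"]
```

If the string does not end in punctuation, the last slice has a single element and is kept unchanged.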
I’m running Ruby 1.8.6 on Windows.
Hm, you could do it with look-behind on 1.9. On 1.8 you only have
look-ahead, which gives you this:
We can turn look-ahead into look-behind, though it's not pretty:
$ ruby -ve 'p "This is a sentence. This is a sentence! This is a sentence?".reverse.split(/(?=(?:\A|\s+)[.!?])/).map { |s| s.reverse }.reverse'
ruby 1.8.6 (2007-09-24 patchlevel 111) [i686-darwin9.1.0]
["This is a sentence. ", "This is a sentence! ", "This is a sentence?"]
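For comparison, here is a sketch of two shorter routes to the array the original question asked for. The first is the look-behind split mentioned for 1.9, so it assumes Ruby 1.9 or later. The second uses String#scan, which was not suggested in the thread and is offered only as an alternative that should also work on 1.8. The variable names are made up for the example.

```ruby
text = "This is a sentence. This is a sentence! This is a sentence?"

# Ruby 1.9+: split at the zero-width position just after ., ! or ?
with_lookbehind = text.split(/(?<=[.!?])/)

# 1.8-compatible: collect each run of non-punctuation plus its terminator.
with_scan = text.scan(/[^.!?]+[.!?]/)

p with_lookbehind
p with_scan
# both => ["This is a sentence.", " This is a sentence!", " This is a sentence?"]
```

One difference to keep in mind: scan silently drops any trailing text that has no terminating punctuation, while the look-behind split keeps it as a final element.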
This post is old, but I landed on it from Google while searching for
something else, and I think I have a better solution for this particular
problem than the suggestions posted above, so I hope you guys don't mind
my resurrecting the thread.
My solution is this:
puts "This is a sentence. This is a sentence! This is a sentence?".strip.split(/\b(\.|\?|!)\b/)
If you try out the code above, it will return:
This is a sentence. This is a sentence! This is a sentence?
=> nil
Try that in irb!
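It is worth inspecting that result with p rather than puts, because puts prints every array element on its own line and so hides the array structure. The sketch below assumes the pattern was meant to be /\b(\.|\?|!)\b/ with the punctuation escaped. A \b only matches between a word character and a non-word character, and each punctuation mark here is followed by a space or the end of the string, so the trailing \b never matches and the string comes back unsplit.

```ruby
text = "This is a sentence. This is a sentence! This is a sentence?"

result = text.strip.split(/\b(\.|\?|!)\b/)

# p shows the array itself; puts would print one element per line.
p result.size
# => 1
p result
# => ["This is a sentence. This is a sentence! This is a sentence?"]
```

The pattern would only split where punctuation sits directly between two word characters, as in "3.14".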