Question about split method

Hello,

I’m wondering if there is a method for the String class that splits a
string on some characters and keeps the split characters in the elements
of the resulting array?

The split method returns an array in this example:

p "This is a sentence. This is a sentence! This is a sentence?".strip.split(/\.|\?|!/)

["This is a sentence", " This is a sentence", " This is a sentence"]

The three sentences in the above string have very different meanings,
but lose those meanings without the punctuation, so I’d like to keep
the punctuation. I’d like a method that keeps the split characters and
returns this array:

["This is a sentence.", " This is a sentence!", " This is a sentence?"]

Does such a method exist? If not, would it be possible to modify the
split method to produce that result?

I’m running Ruby 1.8.6 on Windows.

Thanks for your help.

Does such a method exist? If not, would it be possible to modify the split method to produce that result?

If you put the pattern in a group, it will be included in the array –
but not quite in the way you described:

irb(main):001:0> p "This is a sentence. This is a sentence! This is a sentence?".strip.split(/(\.|\?|!)/)
["This is a sentence", ".", " This is a sentence", "!", " This is a sentence", "?"]
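
To get from this result to the array in the original question, one option is to glue each captured delimiter back onto the piece before it. A minimal sketch, assuming Ruby 1.8.7 or later (where each_slice without a block returns an enumerator); the method name split_keeping_punctuation is made up for illustration:

```ruby
# Hypothetical helper, not part of String: split on sentence punctuation,
# then rejoin each [text, delimiter] pair produced by the capturing group.
def split_keeping_punctuation(text)
  pieces = text.strip.split(/([.?!])/)   # delimiters land at the odd indexes
  pieces.each_slice(2).map { |pair| pair.join }
end

sentences = split_keeping_punctuation("This is a sentence. This is a sentence! This is a sentence?")
p sentences
# prints ["This is a sentence.", " This is a sentence!", " This is a sentence?"]
```

Text after the last delimiter has no partner, so it comes back as a final element on its own.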

Regards,
Thomas.

2008/2/25, Glenn:

The three sentences in the above string have very different meanings, but lose those meanings without the punctuation, so I’d like to keep the punctuation. I’d like a method that keeps the split characters and returns this array:

["This is a sentence.", " This is a sentence!", " This is a sentence?"]

Does such a method exist? If not, would it be possible to modify the split method to produce that result?

I’m running Ruby 1.8.6 on Windows.

Hm, you could do it with lookbehind on 1.9. On 1.8 you only have
lookahead, which gives you this:

irb(main):002:0> "a. b.".split /(?=\.\s+)/
=> ["a", ". b."]

Not quite what you wanted. :-)
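
For comparison, the lookbehind version might look roughly like this. It is only a sketch and assumes Ruby 1.9 or later, since the 1.8 regex engine does not support lookbehind:

```ruby
# Assumes Ruby 1.9 or later; the 1.8 regex engine has no lookbehind.
text = "This is a sentence. This is a sentence! This is a sentence?"

# Split at the empty position right after ".", "?" or "!".
# Nothing is consumed, so the punctuation stays with its sentence.
kept = text.split(/(?<=[.?!])/)
p kept
# prints ["This is a sentence.", " This is a sentence!", " This is a sentence?"]
```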

But here’s an alternative approach which works with 1.8:

irb(main):005:0> "a. b. c! d? e.".scan /.*?[.!?]\s*/
=> ["a. ", "b. ", "c! ", "d? ", "e."]
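
One thing to watch with a scan pattern that requires closing punctuation is that trailing text without any is silently dropped. Here is a sketch of a variant that keeps such text and trims the surrounding whitespace; the pattern and the name scan_sentences are assumptions for illustration only:

```ruby
# Hypothetical helper: collect runs of non-punctuation followed by any
# closing punctuation, then trim the whitespace around each piece.
def scan_sentences(text)
  text.scan(/[^.!?]+[.!?]*/).map { |s| s.strip }.reject { |s| s.empty? }
end

p scan_sentences("a. b. c! d? e.")   # prints ["a.", "b.", "c!", "d?", "e."]
p scan_sentences("a. b")             # prints ["a.", "b"]
```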

Kind regards

robert

How about splitting across the possible
punctuation? E.g.:

"This is a sentence. This is a sentence? This is a sentence!".strip.split(/(\.|\?|!)/)



On Feb 25, 2008, at 6:43 AM, Robert K. wrote:

irb(main):002:0> "a. b.".split /(?=\.\s+)/
=> ["a", ". b."]

Not quite what you wanted. :-)

We can turn look-ahead into look-behind, though it’s not pretty:

$ ruby -ve 'p "This is a sentence. This is a sentence! This is a sentence?".reverse.split(/(?=(?:\A|\s+)[.!?])/).map { |s| s.reverse }.reverse'
ruby 1.8.6 (2007-09-24 patchlevel 111) [i686-darwin9.1.0]
["This is a sentence. ", "This is a sentence! ", "This is a sentence?"]
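
Broken into steps, the one-liner reads roughly as follows. This is just the same expression with the intermediate values named; the variable names are made up:

```ruby
text = "This is a sentence. This is a sentence! This is a sentence?"

reversed = text.reverse
# "?ecnetnes a si sihT !ecnetnes a si sihT .ecnetnes a si sihT"

# In the reversed string the punctuation comes first, so a look-ahead
# can split just before the whitespace that precedes it.
parts = reversed.split(/(?=(?:\A|\s+)[.!?])/)

# Undo the reversal: first each piece, then the order of the pieces.
sentences = parts.map { |s| s.reverse }.reverse
p sentences
# prints ["This is a sentence. ", "This is a sentence! ", "This is a sentence?"]
```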

James Edward G. II

This post is old, but I landed on it from Google while searching for
something else, and I think I have a better solution for this particular
problem than the suggestions posted above, so I hope you guys don’t mind
me resurrecting the thread.

My solution is this:

puts "This is a sentence. This is a sentence! This is a sentence?".strip.split(/\b(\.|\?|!)\b/)

If you try out the code above, it will return:
This is a sentence. This is a sentence! This is a sentence?
=> nil

:-) Try that in irb!
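
When trying this, it may help to inspect the result with p instead of puts. puts prints array elements one per line, so an array holding a single unsplit string looks the same as the string itself. Here is a small sketch of that check. As far as I can tell, it shows that the \b pattern never matches in this sentence, because each punctuation mark is followed by a space or the end of the string rather than a word character:

```ruby
text = "This is a sentence. This is a sentence! This is a sentence?"

pieces = text.strip.split(/\b(\.|\?|!)\b/)
p pieces.length   # prints 1: the string came back unsplit

# The pattern only matches punctuation with word characters on both sides.
p "a.b".split(/\b(\.|\?|!)\b/)   # prints ["a", ".", "b"]
```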