Regular expressions in Ruby on Rails


I hope that it is OK to post this here, please say if not…

I’m pretty new to Ruby and regular expressions. I would like to use a
regular expression to match only numbers contained within square
brackets. e.g. a sentence may be…

[1] to [2], corresponding to [3] to [4] output using IRP 829 Version
2XX, [5] to [6] output using IRP 452 Version 3XX

The numbers are references to specifications in another table. At the
moment, I use /\[\d*\]/. This extracts the numbers including brackets,
but now can I just extract the numbers by altering the expression? The
full Ruby code I use for this is as follows, if this is any use. Probably
there is also a better way than splitting on a space…

Any help would be great and really appreciated.

Many thanks


<%if (opt=="display")%>
<%cycles = 1
n = 0
@text_line = ""
for part in @translation.text_line.split(' ')
if ((/\[\d*\]/).match(part))
if (@fields[n])
@[email protected]_unit(@fields[n].unit_id).name
@[email protected]_prefix(@fields[n].prefix_id).name

        if (@translation.text_line.split(' ').size==cycles)
            @text_line = "#{@text_line}"+"#{@fields[n].result}"+" 

@text_line = "#{@text_line}"+"#{@fields[n].result}"+" "+"#{@prefix_new}"+"#{@unit}"+" "
n += 1
if (@translation.text_line.split(' ').size==cycles)
@text_line = @text_line+"#{part}"
@text_line = @text_line+"#{part}"+" "
cycles += 1

On Tue, Oct 24, 2006 at 08:23:31AM +0200, Darren E. wrote:

2XX, [5] to [6] output using IRP 452 Version 3XX

The numbers are references to specifications in another table. At the
moment, I use /\[\d*\]/. This extracts the numbers including brackets,
but now can I just extract the numbers by altering the expression? The
full ruby code I use for this is as follows if this is any use, probably
also there is a better way than splitting on a space…

Any help would be great and really appreciated.

use () around the part of the match you want to extract, i.e.

if part =~ /\[(\d*)\]/
  puts $1 # $2 would mean the second group of (), and so on
end
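
As a small illustration of the capture-group idea, the sketch below pulls the digits out of one token. The sample value of `part` is made up, and `\d+` is used instead of `\d*` on the assumption that empty brackets should not count as a match.

```ruby
# Sample token of the kind produced by splitting the sentence on spaces
# (an assumed value, chosen to include surrounding punctuation).
part = "([1]),"

# \[ and \] match literal square brackets; (\d+) captures the digits between them.
if part =~ /\[(\d+)\]/
  number = $1            # first capture group, returned as a String
end

# The same match via MatchData, which avoids relying on the global $1.
m = /\[(\d+)\]/.match(part)
number_from_match = m && m[1]

puts number              # prints 1
puts number_from_match   # prints 1
```

Because the pattern is not anchored, anything before or after the brackets in the token is simply ignored.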


btw, you really should get that code out of your view, and put it
into a helper or model…


Jens Krämer


Thank you very much. I am not that good at sticking to the rules of Ruby
on Rails and have probably suffered as a result. I have moved this to a

The regular expression works well for long series as I require. However,
it does not work for ([1]), for example. I would like to say "ignore
everything before and after the square brackets".

Does anyone have any ideas how to do this? I have tried a lot of
symbols, but often I am not allowed to put these at the start.

Many thanks


Darren E. wrote:

The regular expression works well for long series as I require. However,
it does not work for ([1]), for example. I would like to say "ignore
everything before and after the square brackets".

Um … yes it does?

irb(main):001:0> part = '([1])'
=> "([1])"
irb(main):002:0> part
=> "([1])"
irb(main):003:0> if part =~ /\[(\d*)\]/
irb(main):004:1> puts $1
irb(main):005:1> end
1
=> nil

Maybe you should post your actual model code so we can try to see why it
isn’t working for you …


Then the problem as I see it is with the split, but I don’t know how to
overcome this. Because of the way the split is handled

[1] - ([2]) corresponding to [3] to [4] output using IQS 452 Version 2XX
[5] to [6] output using IQS 452 Version 3XX

(stored in the database) becomes

[1] - [2] corresponding to [3] to [4] output using IQS 452 Version 2XX
[5] to [6] output using IQS 452 Version 3XX

Due to inadequacies in my model.

The second line is a text line displayed (so the user can see what the
final output will be with specifications and units included), but of
course the way I do it the brackets around the [2] have disappeared

What I want to do is to take the top line and just replace the [value]
part but leave the brackets (could be anything else too, a comma, full
stop etc).

The important view part is as follows:

<%unless @translation.text_line.nil?%>
 <%@[email protected]_text_line(@translation, 

@specification, @fields)%>

<%if (@translation.level==1)%><%end%>
<%if (@translation.level==2)%><%end%>
<%if (@translation.level==3)%><%end%>
<%if (@translation.level==4)%>
  • <%end%>
    <%if (@translation.level==5)%>&nbsp - &nbsp<%end%>
    <%if (@translation.level==6)%>
  • <%end%>
    <%if (@translation.level==7)%><%if
    <%[email protected]%>
    <%if ((@translation.level==3) || (@translation.level==4) ||

    Probably, only the part @text_line is important, the rest is format
    stuff. The model is as follows:

    def display_text_line(translation, specification, fields)
    cycles = 1
    n = 0
    text_line = ""
    for part in translation.text_line.split(' ')
    if /\[(\d*)\]/.match(part)
    if (fields[n])

            if (translation.text_line.split(' ').size==cycles)
                text_line = "#{text_line}"+"#{fields[n].result}"+" 

    text_line = "#{text_line}"+"#{fields[n].result}"+" "+"#{@prefix_new}"+"#{@unit}"+" "
    n += 1
    if (translation.text_line.split(' ').size==cycles)
    text_line = text_line+"#{part}"
    text_line = text_line+"#{part}"+" "
    cycles += 1
    return text_line

    I hope that someone can help with this, I have tried a lot, but have
    failed to resolve this.
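
    One possible way to replace only the [value] part and keep whatever surrounds it is to skip the split and substitute in place with String#gsub. The sketch below is only that: `Field`, `fill_placeholders` and the sample values are hypothetical stand-ins for the real model, and it assumes the fields are consumed in the order the placeholders appear, as the counter n does in the method above.

```ruby
# Hypothetical stand-in for the rows in fields; only the attributes
# needed for the illustration are modelled here.
Field = Struct.new(:result, :prefix, :unit)

def fill_placeholders(text_line, fields)
  n = -1
  # gsub replaces only the "[digits]" part, so surrounding parentheses,
  # commas or full stops are left exactly where they were.
  text_line.gsub(/\[\d+\]/) do |placeholder|
    n += 1
    field = fields[n]
    # Leave the placeholder unchanged if there is no field for it.
    field ? "#{field.result} #{field.prefix}#{field.unit}" : placeholder
  end
end

fields = [Field.new("10", "m", "V"), Field.new("20", "k", "Hz")]
puts fill_placeholders("[1] - ([2]) corresponding to [3]", fields)
# prints: 10 mV - (20 kHz) corresponding to [3]
```

    Since nothing is split and rejoined, there is no need to track the last word to avoid a trailing space.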



    I have got to the stage where if I take the string (example):


    I want to extract everything after the somenumber]

    To do this I have at the moment:


    This gives:


    Now all I want to do is tell the regular expression to ignore the “]”,
    then I’m there. Does anyone know how to “ignore” a “]”?

    Many thanks for your help.


    A regular expression like /^(.*)\[(\d*)\](.*)$/ will let you capture all
    three parts of the string you appear to be interested in. However, are
    you sure you will never have two sets of numbers within square brackets
    (e.g. “[3]”) that are not separated by a space (e.g.
    “abcd[1]efgh[2]ijkl”)? Because a standard regular expression match will
    only match once in a string.

    An example:


    #!/usr/bin/env ruby -w

    regexp = /^(.*)\[(\d*)\](.*)$/

    part = 'sadfas112f([1]),fds121a'
    match = regexp.match(part)
    puts "part = '#{part}'"
    puts "match[1] = #{match[1]}"
    puts "match[2] = #{match[2]}"
    puts "match[3] = #{match[3]}"

    part = 'sadfas112f([1]),fd[2]s121a'
    match = regexp.match(part)
    puts "part = '#{part}'"
    puts "match[1] = #{match[1]}"
    puts "match[2] = #{match[2]}"
    puts "match[3] = #{match[3]}"


    part = 'sadfas112f([1]),fds121a'
    match[1] = sadfas112f(
    match[2] = 1
    match[3] = ),fds121a

    part = 'sadfas112f([1]),fd[2]s121a'
    match[1] = sadfas112f([1]),fd
    match[2] = 2
    match[3] = s121a

    If you’re sure this won’t be an issue for you, then this technique
    should work. If not, you probably want to look at gsub or scan, as
    discussed in this thread: Ruby Regexp vs Perl and C# - Ruby - Ruby-Forum
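
    For the case where several bracketed numbers sit in one string, a rough sketch of the scan and gsub alternatives might look like this. The sample string is the one from the question above; the angle-bracket replacement is purely illustrative.

```ruby
text = "abcd[1]efgh[2]ijkl"

# scan returns every match; with one capture group each match is a
# one-element array, so flatten gives a plain list of digit strings.
numbers = text.scan(/\[(\d+)\]/).flatten
puts numbers.inspect   # prints ["1", "2"]

# A single unanchored match, by contrast, stops at the first hit.
first = text[/\[(\d+)\]/, 1]
puts first             # prints 1

# gsub with a block visits every match and can rewrite each one.
marked = text.gsub(/\[(\d+)\]/) { "<#{$1}>" }
puts marked            # prints abcd<1>efgh<2>ijkl
```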

    However, I have to say that the code for your model method looks very
    complicated, and I honestly can’t figure out what it is supposed to do.
    Can you describe what it does? I (or someone else here) may be able
    to rework it into something simpler and more readable.

    Dear Chris

    Thank you very much for this advice. I have something that works now how
    I want it to (using your regular expression), even if it’s not the
    prettiest code in the world.

    I may come back later with a description of what I have done if that is
    OK with you? There must be an easier way to do this (but it is fairly
    complex anyway).

    It’s my first effort with Ruby on Rails (with very limited support),
    though I do have a reasonable amount of experience with PHP. My code
    starts to look like PHP too often at the moment; I know that and will work
    on it.

    Thanks again, it’s really appreciated.


