Regexp Help

Hi,

I am looking for a way to do the following:

I have some html in strings, mostly links (i.e. ). I need to delete these lines. I’ve been
trying to do this, but it’s not working out:

question.gsub! ("<a href="/([.]+)/" >$/i", "")
or also: question.gsub! ("<a href="/([a-zA-z0-9]+)/" >$/i", "")

(and various permutations of these two).

How would one go about this?

Thanks,
Jillian

q = 'Hello world.'

result = q.gsub(/<a.*?>/, "")
puts result

--output:--
Hello world.

I would write it like this:

result = q.gsub(/<[^>]*>/, "")

Thanks
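
To make the difference between the two suggested patterns concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration, since the markup in the original example did not survive; the third pattern is a variation of my own, not something proposed in the thread.

```ruby
# Hypothetical sample input (assumption: not taken from the original post).
question = 'Please read <a href="/faq">the FAQ</a> and <b>reply</b>.'

# Non-greedy pattern from the first reply: removes only tags that start
# with "<a", so the closing </a> and unrelated tags are left in place.
anchors_removed = question.gsub(/<a.*?>/, "")

# Negated character class from the second reply: removes every tag.
all_tags_removed = question.gsub(/<[^>]*>/, "")

# Variation (assumption): remove opening and closing anchor tags only.
links_removed = question.gsub(/<\/?a\b[^>]*>/i, "")

puts anchors_removed
puts all_tags_removed
puts links_removed
```

Which one fits depends on whether only the links or all markup should go. For anything beyond simple, well-formed snippets, a real HTML parser is usually the safer choice.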

On Tue, Jul 28, 2009 at 10:20 AM, Kai König wrote:

Why not =>

doc.xpath('//a')[0].attributes["href"].to_s
=> "http://mysite.com"

Actually, doc.content would do what the OP requested: it returns the document's text with the tags removed.

Ray

Why not =>

irb(main):001:0> require 'nokogiri'
=> true
irb(main):002:0> doc = Nokogiri::HTML(<<-eohtml)
irb(main):003:1"
irb(main):004:1"
irb(main):005:1" Bla
irb(main):006:1"
irb(main):007:1"
irb(main):008:1" eohtml

doc.xpath('//a')[0].attributes["href"].to_s
=> "http://mysite.com"

http://tenderlovemaking.com/2008/10/30/nokogiri-is-released/
Cheers