I am using the following code:
require 'net/http'

# read the page data
http = Net::HTTP.new('kvcrpf.org', 80)
resp, page = http.get('/achievements.htm', nil)

# BEGIN processing HTML
def parse_html(data, tag)
  return data.scan(%r{<#{tag}\s*.*?>(.*?)</#{tag}>}im).flatten
end

output = []
table_data = parse_html(page, "table")
table_data.each do |table|
  out_row = []
  row_data = parse_html(table, "tr")
  row_data.each do |row|
    cell_data = parse_html(row, "td")
    cell_data.each do |cell|
      cell.gsub!(%r{<.*?>}, "")
    end
    out_row << cell_data
  end
  output << out_row
end
# END processing HTML

# examine the result
def parse_nested_array(array, tab = 0)
  n = 0
  array.each do |item|
    if(item.size > 0)
      puts "#{"\t" * tab}[#{n}] {"
      if(item.class == Array)
        parse_nested_array(item, tab+1)
      else
        puts "#{"\t" * (tab+1)}#{item}"
      end
      puts "#{"\t" * tab}}"
    end
    n += 1
  end
end

parse_nested_array(output[2][4])
It displays the output like this:
[0] {
2004
}
[1] {
65
}
[2] {
58
}
[3] {
89.23
}
I want to store the values 2004, 65, 58, and 89.23 in variables like this:
aa = 2004
ab = 65
ac = 58
ad = 89.23
Please help me do this.
Thanks in advance
Vikash
Vikash Kumar wrote:
[snip]
[2] {
ab = 65
ac = 58
ad = 89.23
Do you need them to be local variables, or can they be instance
variables, or symbols in a hash?
Do you want them to be sequentially labelled, like you have them here
(e.g. ‘aa’ … ‘zz’ ) or are those just example names? Would you instead
want to supply an array of strings that are the names for each element?
Does all the HTML processing have anything to do with this, or is it
just that you have a nested array that you want to ‘splat’ to variables?
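To make the hash option from these questions concrete, here is a minimal sketch. The `values` array is a stand-in for `output[2][4]` using the cell strings from the sample output, and the labels are generated with `String#succ` starting at "aa"; both choices are assumptions made for illustration, not something the original poster confirmed.

```ruby
# Stand-in for output[2][4]; the scraped cells are strings.
values = ["2004", "65", "58", "89.23"]

# Generate one sequential label per value: "aa", "ab", "ac", "ad".
names = []
label = "aa"
values.size.times do
  names << label
  label = label.succ
end

# Pair each label (as a symbol) with the value at the same position.
by_name = {}
names.zip(values) { |name, value| by_name[name.to_sym] = value }

p by_name
puts by_name[:ab]
```

Supplying your own array of names instead of generating them only changes how `names` is built; the `zip` step stays the same.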
Gavin K. wrote:
[snip]
I want to just store these values in local variables, any help will be
appreciated.
Thanks
Vikash
Vikash Kumar wrote:
[snip]
Any local variables will do, e.g. a = 65, l = 58, m = 89.23
Vikash Kumar wrote:
I want to just store these values in local variables, any help will be
appreciated.
I would like to try to help, but I think I need more information.
Please answer the rest of my questions.
For what it’s worth, dynamically creating local variables (with
script-derived names) is slightly harder than dynamically creating
instance variables or hash entries. The wording of your last answer
makes me wonder if you thought that asking for local variables was the
easiest way; would your solution be just as good if it used instance
variables or hash entries instead of local variables?
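As a rough illustration of the instance-variable route mentioned here, the sketch below sets one instance variable per name with `instance_variable_set`. The `Row` class and the names `aa` to `ad` are hypothetical and exist only for this example; the values are the cell strings from the sample output.

```ruby
# Hypothetical holder class, used only to demonstrate the technique.
class Row
  def initialize(names, values)
    # Create @aa, @ab, ... from the script-derived names.
    names.zip(values) do |name, value|
      instance_variable_set("@#{name}", value)
    end
  end
end

row = Row.new(%w[aa ab ac ad], ["2004", "65", "58", "89.23"])

puts row.instance_variable_get("@ab")
p row.instance_variables
```

The names are only known at run time here, which is why the values are read back with `instance_variable_get` rather than through a plain identifier.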
Gavin K. wrote:
[snip]
Storing those values in local variables will be fine; any local
variables will do, e.g. a = 65, l = 58, m = 89.23
Vikash Kumar wrote:
[…]
puts "#{"\t" * tab}}"
2004
I want to store the value 2004, 65, 58, 89.23 in variables like this:
aa = 2004
ab = 65
ac = 58
ad = 89.23
aa, ab, ac, ad = output[2][4]
Good luck.
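The one-line answer above relies on Ruby's parallel assignment, which spreads an array across several local variables. A small sketch, assuming `row` stands in for `output[2][4]` and that the cells are strings (they come out of `String#scan` in the original code); the `_n` names are made up for this example:

```ruby
# Stand-in for output[2][4].
row = ["2004", "65", "58", "89.23"]

# Parallel assignment: one local variable per array element.
aa, ab, ac, ad = row

# The cells are strings, so convert them if numbers are needed.
aa_n, ab_n, ac_n = [aa, ab, ac].map { |s| Integer(s) }
ad_n = Float(ad)

puts aa
puts aa_n + ab_n
puts ad_n
```

If the array has fewer elements than there are variables, the extra variables are set to `nil`.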
def parse_html(data, tag)
  return data.scan(%r{<#{tag}\s*.*?>(.*?)</#{tag}>}im).flatten
end
Have you looked at hpricot? (Google it.) With it, your code’ll look
something like:
def parse_html(el, tag)
  el.search("//#{tag}")
end
Get to know Array#map. Your code’ll look something like:
output = parse_html(page, 'table').map do |table|
  parse_html(table, 'tr').map do |row|
    parse_html(row, 'td').map do |cell|
      cell.inner_html.gsub(%r{<.*?>}, "")
    end
  end
end
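To see the nested `map` structure without installing hpricot, here is a sketch that reuses the regex-based `parse_html` from the original post on a small inline HTML string. The sample table is made up for the example, and the cells are plain strings here, so `inner_html` is not needed.

```ruby
# Regex-based helper from the original post (not the hpricot version).
def parse_html(data, tag)
  data.scan(%r{<#{tag}\s*.*?>(.*?)</#{tag}>}im).flatten
end

# Made-up sample page with one table of two rows.
page = <<HTML
<table>
  <tr><td><b>2004</b></td><td>65</td></tr>
  <tr><td>2005</td><td>70</td></tr>
</table>
HTML

# Each map returns a new array, so no accumulator variables are needed.
output = parse_html(page, "table").map do |table|
  parse_html(table, "tr").map do |row|
    parse_html(row, "td").map do |cell|
      cell.gsub(%r{<.*?>}, "") # strip any markup left inside the cell
    end
  end
end

p output
```

Compared with the `each` and `<<` version, the result of each block becomes the element of the new array, so `output` is built in one expression.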
puts "#{"\t" * tab}}"
end
n += 1
end
end
- Array#each_with_index will keep track of n for you.
- Array#inspect or Kernel#p will print the array in a readable format.
Otherwise, I think we need more information.
Devin
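The two tips above can be sketched as follows. `print_nested` is a hypothetical rewrite of `parse_nested_array` that lets `each_with_index` supply the counter, and the sample `row` reuses the values from the original output.

```ruby
# Same output format as parse_nested_array, with each_with_index
# supplying the position instead of a hand-maintained counter.
def print_nested(array, tab = 0)
  array.each_with_index do |item, n|
    next if item.empty? # skip empty strings and empty arrays

    indent = "\t" * tab
    puts "#{indent}[#{n}] {"
    if item.is_a?(Array)
      print_nested(item, tab + 1)
    else
      puts "#{indent}\t#{item}"
    end
    puts "#{indent}}"
  end
end

row = ["2004", "65", "58", "89.23"]
print_nested(row)

# For a quick look at the structure, Kernel#p is usually enough.
p row
```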
I am new to hpricot, but I tried this:
require 'watir'
require 'hpricot'
require 'open-uri'
page = Hpricot(open("http://kvcrpf.org/achievements.htm"))
def parse_html(e1, tag)
  e1.search("//#{tag}}")
end
output = parse_html(page, 'table').map do |table|
  parse_html(table, 'tr').map do |row|
    parse_html(row, 'td').map do |cell|
      cell.inner_html.gsub(%r{<.*?>}, "")
    end
  end
end
parse_nested_array(output[2][4])
By running this code, I get the following error:
c:/ruby/lib/ruby/gems/1.8/gems/hpricot-0.4-mswin32/lib/hpricot/traverse.rb:110:in `search': undefined method `[]=' for #<MatchData:0x2e25624> (NoMethodError)
        from test1.rb:30:in `parse_html'
        from test1.rb:33
Please help me solve this. Also, what is the right approach for
parsing data from a page that is already open, i.e. is there something like page =
Hpricot(attach("http://kvcrpf.org/achievements.htm"))?
Thanks for your help
Vikash
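One thing worth checking in the snippet above: the selector string passed to `search` ends in two closing braces, so the interpolated text carries a stray `}`. Whether that is what triggers the NoMethodError inside hpricot is an assumption, not something confirmed in this thread; the sketch below only shows what the two strings evaluate to, and the variable names are made up for the example.

```ruby
tag = "table"

# As posted: the second brace is a literal character in the string.
as_posted = "//#{tag}}"

# As in the earlier suggestion: a single closing brace ends the interpolation.
as_suggested = "//#{tag}"

puts as_posted
puts as_suggested
```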