Comparing/search struct array

Hi,
I have drafted this tiny script, which works almost(!!) fine except that
every item shows 5x more than it should.

I have one file with the report, where there is no customer name but
customer number is there. I have another file with both names and
numbers. So I put both file into array and try to find the relevant
fields.

if File.exists?(“report.output”)
puts “We can work…”
else
puts “Copy the 'pascpgen.output’to the same folder, please”
exit
end

define Class

class Results <
Struct.new(:date, :custo, :invoice, :descr, :type, :money, :ref)
def print_csv_record
date.length==0 ? printf(",") : printf("%s “, date)
custo.length==0 ? printf(”,") : printf("%s “, custo)
invoice.length==0 ? printf(”,") : printf("%s “, invoice)
descr.length==0 ? printf(”,") : printf("%s “, descr)
type.length==0 ? printf(”,") : printf("%s “, type)
money.length==0 ? printf(”,") : printf("%s “, money)
ref.length==0 ? printf(”,") : printf("%s “, ref)
printf(”\n")
end
end

class Query <
Struct.new(:cuname, :custo)
def print_csv_record2
cuname.length==0 ? printf(",") : printf("%s", cuname)
# custo.length==0 ? printf(",") : printf(""%s",", custo)
# printf("\n")
end
end

create array

arr = Array.new
arr2 = Array.new

f = File.open(“report.output”, ‘r’).each {|line|
if line =~ /MST Revenue/

   # read every line into string
   str = line.chomp!
   str.length

   date = str[20, 5]
   custo = str[28,6]
   invoice = str[48,8]
   descr = str[59,28]
   type = str[112,4]
   money = str[123,11]
   ref = str [36,7]

 r = Results.new

 r.date = date
 r.custo = custo
 r.invoice = invoice
 r.descr = descr
 r.type = type
 r.money = money
 r.ref = ref
 arr.push(r)
 end
}

f2 = File.open(“customers.txt”, ‘r’).each {|line|

read every line into string

str = line.chomp!
if str != nil
  str.length

   cuname = str[3, 21]
   custo = str[27,6]

 q = Query.new

 q.cuname = cuname
 q.custo = custo
 arr2.push(q)

end
}

arr.each { |p| p[:custo]
arr2.each { |q| q[:custo]
if p[:custo] == q[:custo]

  q.print_csv_record2
  p.print_csv_record

end
}
}

This script finds the matching fields but for some reason it writes 5x
everything. I believe the “arr.each …” part is wrong. Any idea? Thanks
a lot! Szabolcs

2010/11/8 Szabolcs Tóth [email protected]

Hi,
I have drafted this tiny script, which works almost(!!) fine except that
every item shows 5x more than it should.

Hi. I’m not really sure from looking at it. If you can give me an
example of
the data you are reading so I could follow its execution, that would
make it
a lot easier.

if File.exists?(“report.output”)
puts “We can work…”
else
puts “Copy the 'pascpgen.output’to the same folder, please”
exit
end

Here, I would say “exit 1”, when you just call exit, it exits with a
status
code of 0, which is considered to mean that the application executed
correctly. However, we are exiting because there is an error, so you
shouldn’t have an exit status of 0.

 ref.length==0 ? printf(",") : printf("%s ", ref)
 printf("\n")

end
end

Four things on this one.

  1. I don’t think the object should be responsible for printing itself. I
    think it would be better that it returns the csv_record as a String, and
    lets whoever is using it decide how they want to deal with it (whatever
    code
    uses it decides to print)
  2. Structs have #each defined on them, which just yields all the
    members.
    This means that rather than manually checking each one for its length,
    you
    could just use the Enumerable methods (in this case, probably map). So I
    would probably rewrite it

def csv_record
map { |value| if value.empty? then ‘,’ else "#{value} " end }.join
end

(note that if you prefer to use C style string patterns, there is also a
sprintf function)

And I might change the name of the method to be “to_csv”

  1. The stdlib has a CSV library in it, you might check that out rather
    than
    building your own. (RDoc Documentation)

  2. I would name the class singular, “Result” rather than “Results”
    because
    each line in your file contains a Result instance, and the collection of
    lines I look at as being the results. That is undoubtedly the influence
    of
    Rails, but a few simple conventions like this can make it much easier to
    follow and communicate about the code.

create array

arr = Array.new
arr2 = Array.new

Here, I would name them based on what they contain, rather than what
type
they are. So I would probably rename them to “results” and “queries”,
which
most Rubyists would immediately identify as a collection of Result and
Query
instances. Then, at the bottom, when iterating over them, instead of
naming
the elements p and q, I would name them “result” and “query” to indicate
that this is the current instance of the Result or Query class.

f = File.open(“report.output”, ‘r’).each {|line|

  type = str[112,4]
r.money = money
r.ref = ref
arr.push(r)
end

}

3 things on this one

  1. You use File.open, then iterate on the result there. Unfortunately,
    that
    file object you are iterating over, nothing ever happens with it, so
    your
    file doesn’t get closed. Rather than calling #each on the returned
    object I
    would receive the object in a block, then call each on it in there.

Tricky to find this one in the docs, its actually under IO rather than
File
class IO - RDoc Documentation “If the optional code
block
is given, it will be passed io as an argument, and the
IOhttp://ruby-doc.org/core/classes/IO.htmlobject will automatically
be closed when the block terminates.” So you can
let File close the file for you (this is considered a best practice,
since
it does certain things to ensure this will happen, even if an error is
raised)

File.open ‘report.output’ do |file|
file.each do |line|

end
end

  1. I’m sure plenty of people will disagree with me here, but rather than
    putting the entire contents nested under an if statement, I strongly
    prefer
    to have the if statement just jump to the next iteration. Then
    everything is
    less indented, which for me, makes it much easier to read and
    understand.

So rather than
if line =~ /MST Revenue/

end

I would do
next unless line =~ /MST Revenue/

  1. The constructor for a struct takes the members you created it with.
    So
    you don’t have to individually assign each member.

r = Results.new
r.date = date
r.custo = custo
r.invoice = invoice
r.descr = descr
r.type = type
r.money = money
r.ref = ref

could be rewritten as

r = Results.new date , custo , invoice , descr , type , money , ref

Generally, people use do … end for multiline blocks, and reserver {
… }
for single-line blocks. Not a hard rule, but is easier for most people
:slight_smile:

Here, I realize that if you are coming from a C background, where there
is
only the top level namespace, then you might be naming these methods
differently to avoid a method name conflict. In Ruby, you can namespace
methods by whatever contains them. In this case, the Result and Query
classes. That means that you can define “print_csv_record” in both
classes,
and when you call that method on the given object, it will know about
its
method and not the other’s method, so it will be correct.

This script finds the matching fields but for some reason it writes 5x
everything. I believe the “arr.each …” part is wrong. Any idea? Thanks
a lot! Szabolcs

I don’t see anything conspicuous. Are you sure it isn’t in your data?
(ie 5
months of reports)

On Mon, Nov 8, 2010 at 6:14 PM, Josh C. [email protected] wrote:

let File close the file for you (this is considered a best practice, since
it does certain things to ensure this will happen, even if an error is
raised)

File.open ‘report.output’ do |file|
file.each do |line|

end
end

If you are not doing anything else with the file apart from reading
the lines you could also do:

File.foreach(“report.output”) do |line|

end

Jesus.

Thanks for the comments. I am working on your suggestions.

I figured out that there were repeating lines in the customers.txt that
was the reason why some of the lines were duplicated in the result as
well. So, just modified the customers.txt file read like this:

lastline =""
f2 = File.open(“customers.txt”, ‘r’).each {|line|

read every line into string

str = line.chomp!
if str != nil
str.length

  cuname = str[3, 21]
  custo = str[27,6]
    if lastline != cuname
        q = Query.new

        q.cuname = cuname
        q.custo = custo
        arr2.push(q)
    end
      lastline =cuname
  end
}

Now, it is fine. But, more and more I am working on this code, my
feeling is that I started in a very wrong direction and made my life and
the code too complicated.

Thank you again!

2010/11/8 Jess Gabriel y Galn [email protected]

is given, it will be passed io as an argument, and the
end

Thanks, I was looking for that, but couldn’t remember what it was called
I
found each_line in the docs, and thought that’s what I had remembered.