Nokogiri XML Parser

Hi,
I am trying to build a hash table from the attached XML file. I am
trying to do this with “nokogiri” but I am open to any other suggestions.

Basically the XML structure is a tree structure of the Design Hierarchy.

When the TYPE is “HIER” it has a “componentInstances” section, and so on.

I cannot predict how many levels deep it will go, but I would like to build
a hash and go from there.

Thanks

If you insist on having a hash returned I’d recommend taking a look at
the Ox gem. It will do that.

That said, for big or complex XML files, I’d use Nokogiri. Instead of
having to navigate through a hash, which is harder to do, Nokogiri lets
you use CSS and XPath selectors to pick out just the parts you want.

You don’t say what tags you want to extract, but it’s usually an easy
task.

I am trying to parse the tags and build a record or hash-like structure.
For example, I want to capture the spirit:componentInstances tag. If it is
empty, I want to know what hierarchy it belongs to. I am not
quite sure how to get this.

Thanks

The problem with XPath (or maybe I am just not sure how to use it) is that if I say

doc.xpath("tlv componentInstances")

it produces the list of all attributes inside the componentInstances
without giving me the hierarchy.

Here is what I have in my XML.

Each cellname with a type “HIER” will have a componentInstances tag with
additional attributes.

A1
HIER…

A2
HIER

A3
MODEL…

B2
MODEL

I want to be able to capture the componentInstances with respect to the
parent cell.

For example, for A2 (cellname) A1 is the parent, and for A3 (cellname) A2 is
the parent.

For B2 the parent is A1.

Please feel free to suggest a different XML structure if the parsers
can handle it better.

Thanks

On Fri, Jun 14, 2013 at 4:58 PM, Devender P. wrote:

I am trying to parse the tags and build a record or hash-like structure.
For example, I want to capture the spirit:componentInstances tag. If it is
empty, I want to know what hierarchy it belongs to. I am not
quite sure how to get this.

Well, you can get that from the DOM - probably even with a single XPath
expression. But you still do not disclose what information you want
eventually in the Hash and according to what rules. And still: what was
your question exactly? Do you want someone else to code that for you?

Cheers

robert

It would be great if you provided a sample of the output you are looking for,
which in turn would help us understand what you want.

On Thu, Jun 13, 2013 at 8:11 PM, Devender P. wrote:

Hi,
I am trying to build a hash table from the attached XML file. I am
trying to do this with “nokogiri” but I am open to any other suggestions.

Nokogiri is quite capable so this is a good choice.

Basically the XML structure is a tree structure of the Design Hierarchy.

When the TYPE is “HIER” it has a “componentInstances” section, and so on.

I cannot predict how many levels deep it will go but would like to build
a hash and go from there.

And your question is?

Cheers

robert

PS: You didn’t even mention according to what rules your Hash should be
built, whether you want a structure of nested Hashes etc.

On Fri, Jun 14, 2013 at 6:46 PM, Devender P. wrote:

  <TYPE>MODEL

This is not valid XML. Where are the closing tags?

I want to be able to capture the componentInstances with respect to the
parent cell.

You’ll likely need a two-step approach: search for all the elements you
are interested in and, for each one of them, search for the closest
ancestor element.

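require 'nokogiri'

# Return the text of the <n> child of a <c> node, or nil if it has none.
def c_name(c_node)
  t = c_node.at_xpath('n/text()') and t.to_s
end

# Sample document: nested <c> elements, each named by an <n> child
# (the root element name is arbitrary).
dom = Nokogiri.XML(<<XML_DOC)
<root>
  <c>
    <n>foo</n>
    <c>
      <n>bar</n>
      <c>
        <n>bat</n>
      </c>
    </c>
  </c>
</root>
XML_DOC

# Visit every <c> and report which <c> encloses it, if any.
dom.xpath('//c').each do |c_node|
  parent = c_node.at_xpath 'ancestor::c'

  if parent
    printf "Node: %s parent %s\n", c_name(c_node), c_name(parent)
  else
    printf "Node: %s no parent\n", c_name(c_node)
  end
end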

Please feel free to suggest a different XML structure if the parsers
can handle it better.

I would start with valid XML. You should model the XML primarily
according to the requirements of the model that you need to create, and
not necessarily according to what will have the easiest XPath for retrieval.

Cheers

robert

Thanks Robert. This is what I wanted to do. Is there a book or link
that has all these neat little tricks that I can read and play with?

Thanks again.

On Mon, Jun 17, 2013 at 8:35 PM, Devender P. wrote:

Thanks Robert. This is what I wanted to do. Is there a book or link
that has all these neat little tricks that I can read and play with?

You can read the specs at W3C, there are plenty of XPath tutorials out
there and there is a book by O’Reilly which I haven’t read.

Cheers

robert

On Sun, Jun 16, 2013 at 4:35 PM, Robert K. wrote:

dom.xpath('//c').each do |c_node|
parent = c_node.at_xpath 'ancestor::c'

Correction

parent = c_node.at_xpath('ancestor::c[1]')

Cheers

robert