2009-03-03 19:31:03 by abowman in Ravings of a Lunatic
Here’s a question posted to a mailing list that I follow:
I just want to go through the xml and return the names of all the elements as well as the names of their attributes. It seems that most of the time, people know the name of the tag they’re looking for, and they want to get the inner html or the value of one of its attributes.
Is there a method in Hpricot or REXML that enables you to just get a list of all the tag names and attributes?
Here’s my solution:
#!/usr/local/bin/ruby
require 'rubygems'
require 'hpricot'
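# Parse the XML file into an Hpricot document.
doc = Hpricot(open("x.xml"))
elements = Array.new
attributes = Array.new
# "//*" walks every node in the document; keep only real elements.
doc.search("//*").each do |e|
  if e.class == Hpricot::Elem
    elements << e.name
    # a[0] is the attribute name; the value is not needed here.
    e.attributes.each do |a|
      attributes << a[0]
    end
  end
end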
puts "Elements:"
puts elements.join("\n")
puts
puts "Attributes:"
puts attributes.join("\n")
It's ugly, but it gives you one array of element names and one of attribute names, duplicates included. You could call uniq and sort on the arrays if you wanted a stripped-down list.
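Since the question also mentioned REXML, here is a rough sketch of the same idea using it instead of Hpricot, with the de-duplicating and sorting folded in. The helper name tag_and_attribute_names and the sample XML string are made up for illustration; this is one way to do it, not the only one.

```ruby
require 'rexml/document'

# Hypothetical helper (not part of REXML): returns two arrays,
# the unique sorted element names and the unique sorted attribute names.
def tag_and_attribute_names(xml)
  doc = REXML::Document.new(xml)
  elements = []
  attributes = []
  # The XPath "//*" matches every element in the document, root included.
  REXML::XPath.each(doc, "//*") do |e|
    elements << e.name
    # each_attribute yields REXML::Attribute objects; keep only the names.
    e.attributes.each_attribute { |a| attributes << a.name }
  end
  [elements.uniq.sort, attributes.uniq.sort]
end

# Made-up sample document, just to have something to run against.
sample = '<library><book id="1" lang="en"><title>Ruby</title></book><book id="2"/></library>'
names, attrs = tag_and_attribute_names(sample)

puts "Elements:"
puts names.join("\n")
puts
puts "Attributes:"
puts attrs.join("\n")
```

To run it against a file rather than a string, passing File.read("x.xml") to the helper should work the same way.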