Path: blob/master/lib/rex/parser/graphml.rb
# -*- coding: binary -*-

module Rex
  module Parser
    #
    # A partial implementation of the GraphML specification for loading structured data from an XML file. Notable
    # missing components include GraphML parse meta-data (XML attributes with the "parse" prefix), hyperedges and ports.
    # See: http://graphml.graphdrawing.org/
    #
    module GraphML
      #
      # Load the contents of a GraphML file by parsing it with Nokogiri and returning
      # the top-level GraphML structure.
      #
      # @param file_path [String] The file path to load the data from.
      # @return [Rex::Parser::GraphML::Element::GraphML]
      def self.from_file(file_path)
        parser = Nokogiri::XML::SAX::Parser.new(Document.new)
        parser.parse(File.read(file_path, mode: 'rb'))
        parser.document.graphml
      end

      #
      # Convert a GraphML value string into a Ruby value depending on the specified type. Values of int and long will be
      # converted to Ruby integer, while float and double values will be converted to floats. For booleans, values that are
      # either blank or "false" (case-insensitive) will evaluate to Ruby's false, while everything else will be true.
      #
      # @param attr_type [Symbol] The type of the attribute, one of either boolean, int, long, float, double or string.
      # @param value [String] The value to convert into a native Ruby data type.
      def self.convert_attribute(attr_type, value)
        case attr_type
        when :boolean
          value.strip!
          if value.blank?
            value = false
          else
            value = value.downcase != 'false'
          end
        when :int, :long
          value = Integer(value)
        when :float, :double
          value = Float(value)
        when :string # rubocop:disable Lint/EmptyWhen
          # string values are returned unchanged
        else
          raise ArgumentError, 'Unsupported attribute type: ' + attr_type.to_s
        end

        value
      end

      #
      # Define a GraphML attribute including its name, data type, default value and where it can be applied.
      #
      class MetaAttribute
        # @param id [String] The attribute's document identifier.
        # @param name [String] The attribute's name as used by applications.
        # @param type [Symbol] The data type of the attribute, one of either boolean, int, long, float, double or string.
        # @param domain [Symbol] What elements this attribute is valid for, one of either edge, node, graph or all.
        # @param default An optional default value for this attribute.
        def initialize(id, name, type, domain: :all, default: nil)
          @id = id
          @name = name
          @type = type
          @domain = domain
          @default = default
        end

        #
        # Create a new instance from a Key element.
        #
        # @param key [Rex::Parser::GraphML::Element::Key] The key to create a new instance from.
        def self.from_key(key)
          new(key.id, key.attr_name, key.attr_type, domain: key.domain, default: key.default&.value)
        end

        #
        # Convert a value to the type specified by this attribute.
        #
        # @param value The value to convert.
        def convert(value)
          GraphML.convert_attribute(@type, value)
        end

        #
        # Whether or not the attribute is valid for the specified element.
        #
        # @param element [Rex::Parser::GraphML::AttributeContainer] The element to check.
        def valid_for?(element)
          @domain == :all || @domain == element.class::ELEMENT_NAME.to_sym
        end

        # @!attribute id
        #   @return [String] The attribute's document identifier.
        attr_reader :id
        # @!attribute name
        #   @return [String] The attribute's name as used by applications.
        attr_reader :name
        # @!attribute type
        #   @return [Symbol] The data type of the attribute.
        attr_reader :type
        # @!attribute domain
        #   @return [Symbol] What elements this attribute is valid for.
        attr_reader :domain
        # @!attribute default
        #   @return An optional default value for this attribute.
        attr_reader :default
      end

      #
      # A base class for GraphML elements that are capable of storing attributes.
      #
      class AttributeContainer
        def initialize
          @attributes = {}
        end

        # @!attribute attributes
        #   @return [Hash] The defined attributes for the element.
        attr_reader :attributes
      end

      #
      # A module for organizing GraphML elements that define the data structure. Each provides a from_xml_attributes
      # function to create an instance from a hash of XML attributes.
      #
      module Element
        #
        # A data element defines the value of an attribute for the parent XML node.
        # See: http://graphml.graphdrawing.org/specification/xsd.html#element-data
        #
        class Data
          ELEMENT_NAME = 'data'.freeze
          # @param key [String] The identifier of the attribute that this object contains a value for.
          def initialize(key)
            @key = key
            @value = nil
          end

          def self.from_xml_attributes(xml_attrs)
            key = xml_attrs['key']
            raise Error::InvalidAttributeError.new('data', 'key') if key.nil?

            new(key)
          end

          # @!attribute key
          #   @return [String] The identifier of the attribute that this object contains a value for.
          attr_reader :key
          # @!attribute value
          #   @return The value of the attribute.
          attr_reader :value
        end

        #
        # A default element defines the optional default value of an attribute. If no default is specified, per the GraphML
        # specification, the attribute is undefined.
        # See: http://graphml.graphdrawing.org/specification/xsd.html#element-default
        #
        class Default
          ELEMENT_NAME = 'default'.freeze
          # @param value The default attribute value.
          def initialize(value: nil)
            @value = value
          end

          def self.from_xml_attributes(_xml_attrs)
            new # no attributes for this element
          end

          # @!attribute value
          #   @return The default attribute value.
          attr_reader :value
        end

        #
        # An edge element defines a connection between two nodes. Connections are optionally directional.
        # See: http://graphml.graphdrawing.org/specification/xsd.html#element-edge
        #
        class Edge < AttributeContainer
          ELEMENT_NAME = 'edge'.freeze
          # @param source [String] The id of the node that this edge originated from.
          # @param target [String] The id of the node that this edge is destined for.
          # @param directed [Boolean] Whether or not this edge only connects in one direction.
          # @param id [String] The optional, unique identifier of this edge.
          def initialize(source, target, directed, id: nil)
            @source = source
            @target = target
            @directed = directed
            @id = id
            super()
          end

          def self.from_xml_attributes(xml_attrs, edgedefault)
            source = xml_attrs['source']
            raise Error::InvalidAttributeError.new('edge', 'source') if source.nil?

            target = xml_attrs['target']
            raise Error::InvalidAttributeError.new('edge', 'target') if target.nil?

            directed = xml_attrs['directed']
            if directed.nil?
              directed = edgedefault == :directed
            elsif %w[true false].include? directed
              directed = directed == 'true'
            else
              raise Error::InvalidAttributeError.new('edge', 'directed', details: 'must be either true or false when specified', missing: false)
            end

            new(source, target, directed, id: xml_attrs['id'])
          end

          # @!attribute source
          #   @return [String] The id of the node that this edge originated from.
          attr_reader :source
          # @!attribute target
          #   @return [String] The id of the node that this edge is destined for.
          attr_reader :target
          # @!attribute directed
          #   @return [Boolean] Whether or not this edge only connects in one direction.
          attr_reader :directed
          # @!attribute id
          #   @return [String] The optional, unique identifier of this edge.
          attr_reader :id
        end

        #
        # A graph element defines a collection of nodes and edges.
        # See: http://graphml.graphdrawing.org/specification/xsd.html#element-graph
        #
        class Graph < AttributeContainer
          ELEMENT_NAME = 'graph'.freeze
          # @param edgedefault [Symbol] Whether edges within this graph are directional by default, either :directed or :undirected.
          # @param id [String] The optional, unique identifier of this graph.
          def initialize(edgedefault, id: nil)
            @edgedefault = edgedefault
            @id = id

            @nodes = {}
            @edges = []
            super()
          end

          def self.from_xml_attributes(xml_attrs)
            edgedefault = xml_attrs['edgedefault']
            unless %w[directed undirected].include? edgedefault
              # see: http://graphml.graphdrawing.org/primer/graphml-primer.html section 2.3.1
              raise Error::InvalidAttributeError.new('graph', 'edgedefault', missing: edgedefault.nil?)
            end

            edgedefault = edgedefault.to_sym

            new(edgedefault, id: xml_attrs['id'])
          end

          # @!attribute edgedefault
          #   @return [Symbol] Whether edges within this graph are directional by default, either :directed or :undirected.
          attr_reader :edgedefault
          # @!attribute id
          #   @return [String] The optional, unique identifier of this graph.
          attr_reader :id
          # @!attribute edges
          #   @return [Array] An array of edge elements within this graph.
          attr_reader :edges
          # @!attribute nodes
          #   @return [Hash] A hash of node elements, keyed by their string identifier.
          attr_reader :nodes
        end

        #
        # A graphml element is the root of a GraphML document.
        # See: http://graphml.graphdrawing.org/specification/xsd.html#element-graphml
        #
        class GraphML
          ELEMENT_NAME = 'graphml'.freeze
          def initialize
            @nodes = {}
            @edges = []
            @graphs = []
          end

          # @!attribute nodes
          #   @return [Hash] A hash of all node elements within this GraphML document, keyed by their string identifier.
          attr_reader :nodes
          # @!attribute edges
          #   @return [Array] An array of all edge elements within this GraphML document.
          attr_reader :edges
          # @!attribute graphs
          #   @return [Array] An array of all graph elements within this GraphML document.
          attr_reader :graphs
        end

        #
        # A key element defines the attributes that may be present in a document.
        # See: http://graphml.graphdrawing.org/specification/xsd.html#element-key
        #
        class Key
          ELEMENT_NAME = 'key'.freeze
          # @param id [String] The document identifier of the attribute described by this element.
          # @param name [String] The name (as used by applications) of the attribute described by this element.
          # @param type [Symbol] The data type of the attribute described by this element, one of either boolean, int, long, float, double or string.
          # @param domain [Symbol] What elements the attribute described by this element is valid for, one of either edge, node, graph or all.
          def initialize(id, name, type, domain)
            @id = id
            @attr_name = name
            @attr_type = type
            @domain = domain # using 'for' would cause an awkward keyword conflict
            @default = nil
          end

          def self.from_xml_attributes(xml_attrs)
            id = xml_attrs['id']
            raise Error::InvalidAttributeError.new('key', 'id') if id.nil?

            name = xml_attrs['attr.name']
            raise Error::InvalidAttributeError.new('key', 'attr.name') if name.nil?

            type = xml_attrs['attr.type']
            unless %w[boolean int long float double string].include? type
              raise Error::InvalidAttributeError.new('key', 'attr.type', details: 'must be boolean int long float double or string', missing: type.nil?)
            end

            type = type.to_sym

            domain = xml_attrs['for']
            unless %w[graph node edge all].include? domain
              raise Error::InvalidAttributeError.new('key', 'for', details: 'must be graph node edge or all', missing: domain.nil?)
            end

            domain = domain.to_sym

            new(id, name, type, domain)
          end

          # @param default [Default] The default element parsed for this key.
          def default=(default)
            # The parser passes in a Default element, so convert its string value rather than the element itself. The
            # module is fully qualified because a bare GraphML would resolve to Element::GraphML from within this scope.
            @default = Default.new(value: Rex::Parser::GraphML.convert_attribute(@attr_type, default.value))
          end

          # @!attribute id
          #   @return [String] The document identifier of the attribute described by this element.
          attr_reader :id
          # @!attribute attr_name
          #   @return [String] The name (as used by applications) of the attribute described by this element.
          attr_reader :attr_name
          # @!attribute attr_type
          #   @return [Symbol] The data type of the attribute described by this element.
          attr_reader :attr_type
          # @!attribute domain
          #   @return [Symbol] What elements the attribute described by this element is valid for.
          attr_reader :domain
          # @!attribute default
          #   @return [Default, nil] The default element, holding the converted default value of the attribute described by this element.
          attr_reader :default
        end

        #
        # A node element defines an object within the graph that can have zero or more edges connecting it to other nodes. A
        # node element may contain a graph element.
        #
        class Node < AttributeContainer
          ELEMENT_NAME = 'node'.freeze
          # @param id [String] The unique identifier for this node element.
          def initialize(id)
            @id = id
            @edges = []
            @subgraph = nil
            super()
          end

          def self.from_xml_attributes(xml_attrs)
            id = xml_attrs['id']
            raise Error::InvalidAttributeError.new('node', 'id') if id.nil?

            new(id)
          end

          # @return [Array] An array of all edges for which this node is the target.
          def source_edges
            # edges connected to this node
            @edges.select { |edge| edge.target == @id || !edge.directed }
          end

          # @return [Array] An array of all edges for which this node is the source.
          def target_edges
            # edges connecting this to other nodes
            @edges.select { |edge| edge.source == @id || !edge.directed }
          end

          # @!attribute id
          #   @return [String] The unique identifier for this node.
          attr_reader :id
          # @!attribute edges
          #   @return [Array] An array of all edges for which this node is either the source or the target.
          attr_reader :edges
          # @!attribute subgraph
          #   @return [Graph,nil] A subgraph contained within this node.
          attr_accessor :subgraph
        end
      end

      #
      # A module collecting the errors raised by this parser.
      #
      module Error
        #
        # The base error class for errors raised by this parser.
        #
        class GraphMLError < StandardError
        end

        #
        # An error describing an issue that occurred while parsing the data structure.
        #
        class ParserError < GraphMLError
        end

        #
        # An error describing an XML attribute that is invalid either because the value is missing or otherwise invalid.
        #
        class InvalidAttributeError < ParserError
          def initialize(element, attribute, details: nil, missing: true)
            @element = element
            @attribute = attribute
            # whether or not the attribute is invalid because it is absent
            @missing = missing

            message = "Element '#{element}' contains an invalid attribute: '#{attribute}'"
            message << " (#{details})" unless details.nil?

            super(message)
          end
        end
      end

      #
      # The top-level document parser.
      #
      class Document < Nokogiri::XML::SAX::Document
        def initialize
          @stack = []
          @nodes = {}
          @meta_attributes = {}
          @graphml = nil
          super
        end

        def start_element(name, attrs = [])
          attrs = attrs.to_h

          case name
          when 'data'
            raise Error::ParserError, 'The \'data\' element must be a direct child of an attribute container' unless @stack[-1].is_a? AttributeContainer

            element = Element::Data.from_xml_attributes(attrs)

          when 'default'
            raise Error::ParserError, 'The \'default\' element must be a direct child of a \'key\' element' unless @stack[-1].is_a? Element::Key

            element = Element::Default.from_xml_attributes(attrs)

          when 'edge'
            raise Error::ParserError, 'The \'edge\' element must be a direct child of a \'graph\' element' unless @stack[-1].is_a? Element::Graph

            element = Element::Edge.from_xml_attributes(attrs, @stack[-1].edgedefault)
            @graphml.edges << element

          when 'graph'
            element = Element::Graph.from_xml_attributes(attrs)
            @stack[-1].subgraph = element if @stack[-1].is_a? Element::Node
            @graphml.graphs << element

          when 'graphml'
            element = Element::GraphML.new
            raise Error::ParserError, 'The \'graphml\' element must be a top-level element' unless @stack.empty?

            @graphml = element

          when 'key'
            raise Error::ParserError, 'The \'key\' element must be a direct child of a \'graphml\' element' unless @stack[-1].is_a? Element::GraphML

            element = Element::Key.from_xml_attributes(attrs)
            raise Error::InvalidAttributeError.new('key', 'id', details: 'duplicate key id') if @meta_attributes.key? element.id
            if @meta_attributes.values.any? { |attr| attr.name == element.attr_name }
              raise Error::InvalidAttributeError.new('key', 'attr.name', details: 'duplicate key attr.name')
            end

          when 'node'
            raise Error::ParserError, 'The \'node\' element must be a direct child of a \'graph\' element' unless @stack[-1].is_a? Element::Graph

            element = Element::Node.from_xml_attributes(attrs)
            raise Error::InvalidAttributeError.new('node', 'id', details: 'duplicate node id') if @nodes.key? element.id

            @nodes[element.id] = element
            @graphml.nodes[element.id] = element

          else
            raise Error::ParserError, 'Unknown element: ' + name

          end

          @stack.push element
        end

        def characters(string)
          element = @stack[-1]
          case element
          when Element::Data
            parent = @stack[-2]
            meta_attribute = @meta_attributes[element.key]
            unless meta_attribute.valid_for? parent
              raise Error::ParserError, "The #{meta_attribute.name} attribute is invalid for #{parent.class::ELEMENT_NAME} elements"
            end

            if meta_attribute.type == :string && !parent.attributes[meta_attribute.name].nil?
              # this may be run multiple times if there is an XML escape sequence in the string to concat the parts together
              parent.attributes[meta_attribute.name] << meta_attribute.convert(string)
            else
              parent.attributes[meta_attribute.name] = meta_attribute.convert(string)
            end

          when Element::Default
            @stack[-1] = Element::Default.new(value: string)

          end
        end

        def end_element(name)
          element = @stack.pop

          populate_element_default_attributes(element) if element.is_a? AttributeContainer

          case name
          when 'default'
            key = @stack[-1]
            key.default = element

          when 'edge'
            graph = @stack[-1]
            graph.edges << element

          when 'graph'
            element.edges.each do |edge|
              source_node = element.nodes[edge.source]
              raise Error::InvalidAttributeError.new('edge', 'source', details: "undefined source: '#{edge.source}'", missing: false) if source_node.nil?

              target_node = element.nodes[edge.target]
              raise Error::InvalidAttributeError.new('edge', 'target', details: "undefined target: '#{edge.target}'", missing: false) if target_node.nil?

              source_node.edges << edge
              target_node.edges << edge
            end

          when 'key'
            meta_attribute = MetaAttribute.from_key(element)
            @meta_attributes[meta_attribute.id] = meta_attribute

          when 'node'
            graph = @stack[-1]
            graph.nodes[element.id] = element

          end
        end

        # @!attribute graphml
        #   @return [Rex::Parser::GraphML::Element::GraphML] The root of the parsed document.
        attr_reader :graphml

        private

        def populate_element_default_attributes(element)
          @meta_attributes.values.each do |meta_attribute|
            next unless meta_attribute.valid_for? element
            next if element.attributes.key? meta_attribute.name
            next if meta_attribute.default.nil?

            element.attributes[meta_attribute.name] = meta_attribute.default
          end
        end
      end
    end
  end
end
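
The type conversion rules documented for `convert_attribute` can be tried in isolation with a small stand-alone sketch. The helper below is a hypothetical illustration, not part of Rex: the name `convert_graphml_value` is an assumption, and it uses `String#strip` and `String#empty?` from the standard library in place of `blank?`, which the original relies on from ActiveSupport. Unlike the original, it does not modify the string passed in.

```ruby
# Hypothetical stand-alone helper mirroring the documented conversion rules.
# It is not part of Rex::Parser::GraphML and needs only the Ruby standard library.
def convert_graphml_value(attr_type, value)
  case attr_type
  when :boolean
    stripped = value.strip
    # Blank or "false" (any case) is false; every other string is true.
    !(stripped.empty? || stripped.casecmp?('false'))
  when :int, :long
    Integer(value)
  when :float, :double
    Float(value)
  when :string
    value
  else
    raise ArgumentError, "Unsupported attribute type: #{attr_type}"
  end
end

p convert_graphml_value(:boolean, ' False ') # => false
p convert_graphml_value(:boolean, 'yes')     # => true
p convert_graphml_value(:long, '42')         # => 42
p convert_graphml_value(:double, '1.5')      # => 1.5
```

Because `Integer()` and `Float()` are strict, a malformed number such as `'4x'` raises `ArgumentError` instead of being silently truncated, which is likely why the parser uses them rather than `to_i` and `to_f`.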