GraphViz2::Marpa::Parser - A Perl parser for Graphviz dot files. Input comes from GraphViz2::Marpa::Lexer.
perl scripts/lex.pl -h
perl scripts/parse.pl -h
perl scripts/g2m.pl -h

perl scripts/lex.pl -input_file x.gv -lexed_file x.lex

x.gv is a Graphviz dot file. x.lex will be a CSV file of lexed tokens.

perl scripts/parse.pl -lexed_file x.lex -parsed_file x.parse

x.parse will be a CSV file of parsed tokens.

perl scripts/parse.pl -lexed_file x.lex -parsed_file x.parse -output_file x.rend

x.rend will be a Graphviz dot file.

perl scripts/g2m.pl -input_file x.gv -lexed_file x.lex -parsed_file x.parse -output_file x.rend
GraphViz2::Marpa::Parser provides a Marpa::XS-based parser for http://www.graphviz.org/ dot files.
The input is expected to come from GraphViz2::Marpa::Lexer, either via RAM or via a CSV file.
Demo lexer/parser output: http://savage.net.au/Perl-modules/html/graphviz2.marpa/index.html.
State Transition Table: http://savage.net.au/Perl-modules/html/graphviz2.marpa/default.stt.html.
Command line options and object attributes: http://savage.net.au/Perl-modules/html/graphviz2.marpa/code.attributes.html.
My article on this set of modules: http://www.perl.com/pub/2012/10/an-overview-of-lexing-and-parsing.html.
The Marpa grammar as an image: http://savage.net.au/Ron/html/graphviz2.marpa/Marpa.Grammar.svg. This image was created with Graphviz via GraphViz2.
Install GraphViz2::Marpa as you would for any Perl module:
Run:
cpanm GraphViz2::Marpa
or run:
sudo cpan GraphViz2::Marpa
or unpack the distro, and then either:
perl Build.PL
./Build
./Build test
sudo ./Build install
or:
perl Makefile.PL
make (or dmake or nmake)
make test
make install
new() is called as my($parser) = GraphViz2::Marpa::Parser -> new(k1 => v1, k2 => v2, ...).
It returns a new object of type GraphViz2::Marpa::Parser.
Key-value pairs accepted in the parameter list (see corresponding methods for details [e.g. maxlevel()]):
lexed_file
Specify the name of a CSV file of lexed tokens to read. This file can be output from the lexer.
Default: ''.
The default means the file is not read.
The value supplied by the 'tokens' option takes precedence over the 'lexed_file' option.
See the distro for data/*.lex.
logger
Specify a logger compatible with Log::Handler, for the parser and renderer to use.
Default: A logger of type Log::Handler which writes to the screen.
To disable logging, just set 'logger' to the empty string (not undef).
maxlevel
This option affects Log::Handler.
You can get more output by calling new(maxlevel => 'info') and even more with new(maxlevel => 'debug').
See the Log::Handler::Levels docs.
Default: 'notice'.
minlevel
Default: 'error'.
No lower levels are used.
output_file
Specify the name of a file to be passed to the renderer.
The default means the renderer is not called.
parsed_file
Specify the name of a CSV file of parsed tokens to write. This file can be input to the default renderer.
The default means the file is not written.
renderer
Specify a renderer for the parser to use.
Default: An object of type GraphViz2::Marpa::Renderer::GraphViz2.
report_forest
Log the forest of paths recognised by the parser.
Default: 0.
report_items
Log the items recognised by the parser.
tokens
Specify an arrayref of tokens output by the lexer.
Returns an object of type Tree, where the root element is not used, but the children of this root are each the first node in a path. Here, path means each separately specified path in the input file.
Consider part of data/55.gv:
A -> B
...
B -> C [color = orange penwidth = 5]
...
C -> D [arrowtail = obox arrowhead = crow dir = both minlen = 2]
D -> E [arrowtail = odot arrowhead = dot dir = both minlen = 2 penwidth = 5]
Even though Graphviz will link A -> B -> C -> D when drawing the image, edges() returns 4 separate paths. If you call new() as new(report_forest => 1) on data/55.gv, the output will include:
Edges:
root. Edge attrs: {}
|---A. Edge attrs: {color => "purple"}
|   |---B. Edge attrs: {}
|---B. Edge attrs: {color => "orange", penwidth => "5"}
|   |---C. Edge attrs: {}
|---C. Edge attrs: {arrowhead => "crow", arrowtail => "obox", color => "purple", dir => "both", minlen => "2"}
|   |---D. Edge attrs: {}
|---D. Edge attrs: {arrowhead => "dot", arrowtail => "odot", color => "purple", dir => "both", minlen => "2", penwidth => "5"}
|   |---E. Edge attrs: {}
...
This says:
If the last path was:
D -> E -> F [arrowtail = odot arrowhead = dot dir = both minlen = 2 penwidth = 5]
Then the output would be:
|---D. Edge attrs: {arrowhead => "dot", arrowtail => "odot", color => "purple", dir => "both", minlen => "2", penwidth => "5"}
|   |---E. Edge attrs: {}
|   |---F. Edge attrs: {}
This structure is used by "find_clusters()" in GraphViz2::Marpa::PathUtils.
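As a rough illustration of the forest layout reported above, the following sketch imitates it with plain Perl hashrefs. It does not use the Tree object which edges() returns; make_node(), attrs2string() and forest2lines() are hypothetical helpers invented for this example, and only the first 2 paths from data/55.gv are reproduced.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one node of the forest: a name, a hashref of
# edge attributes, and a list of child nodes.
sub make_node
{
    my($name, $attrs, @children) = @_;

    return {name => $name, attrs => $attrs || {}, children => [@children]};
}

# Render a hashref of attributes with sorted keys, as in the report.
sub attrs2string
{
    my($attrs) = @_;

    return '{' . join(', ', map{sprintf('%s => "%s"', $_, $$attrs{$_})} sort keys %$attrs) . '}';
}

# Walk the forest depth-first, returning one line of text per node.
sub forest2lines
{
    my($node, $depth) = @_;
    $depth            ||= 0;
    my($prefix)       = $depth == 0 ? '' : ('|   ' x ($depth - 1) ) . '|---';
    my(@lines)        = ($prefix . $$node{name} . '. Edge attrs: ' . attrs2string($$node{attrs}) );

    push @lines, forest2lines($_, $depth + 1) for @{$$node{children} };

    return @lines;
}

# The unused root, with each child being the first node in a separate path.
my($forest) = make_node('root', {},
    make_node('A', {color => 'purple'}, make_node('B') ),
    make_node('B', {color => 'orange', penwidth => 5}, make_node('C') ),
);

my(@lines) = forest2lines($forest);

print map{"$_\n"} @lines;
```

Note that the edge attributes sit on the first node of each path, and that B appears twice: once as the tail of A's path and once as the head of its own.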
Warning: The forest of paths is faulty for graphs such as:
digraph graph_47 { big -> { small smaller smallest } }
The result will be:
Edges:
root. Edge attrs: {}
|---big. Edge attrs: {}
See also "nodes()", "style()" and "type()".
Returns nothing.
Outputs the CSV file of parsed items, if new() was called as new(parsed_file => $string).
Returns the Marpa::R2::Recognizer object.
Called by "run()".
Returns a string representation of the hashref.
Returns the next value of the internal item counter.
Returns an arrayref of parsed tokens. Each element of this arrayref is a hashref. See "How is the parsed graph stored in RAM?" for details.
These parsed tokens do not bear a one-to-one relationship to the lexed tokens returned by the lexer's items() method (see "items()" in GraphViz2::Marpa::Lexer). However, they are (necessarily) very similar.
If you provide an output file by using the 'parsed_file' option to "new()", or the "parsed_file()" method, the file will have 2 columns, type and value.
E.g.: If the arrayref looks like:
...
{count => 10, name => '', type => 'start_attribute', value => '['},
{count => 11, name => '', type => 'attribute_id'   , value => 'color'},
{count => 13, name => '', type => 'attribute_value', value => 'red'},
{count => 14, name => '', type => 'end_attribute'  , value => ']'},
...
then the output file will look like:
"type","value"
...
start_attribute , "["
attribute_id    , "color"
attribute_value , "red"
end_attribute   , "]"
...
Usage:
my($parser) = GraphViz2::Marpa::Parser -> new(...);

# $parser -> items actually returns an object of type Set::Array.

if ($parser -> run == 0)
{
	my(@items) = @{$parser -> items};
}
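To make the 2-column file layout described above concrete, here is a minimal sketch which turns such an arrayref of items into lines of text. It is not the module's own writer: items2csv() is a hypothetical helper, the column width of 16 is chosen only to match the sample, and doubling embedded quotes is an assumption borrowed from common CSV practice.

```perl
use strict;
use warnings;

# Illustrative items, shaped like the arrayref shown above.
my(@items) =
(
    {count => 10, name => '', type => 'start_attribute', value => '['},
    {count => 11, name => '', type => 'attribute_id',    value => 'color'},
    {count => 13, name => '', type => 'attribute_value', value => 'red'},
    {count => 14, name => '', type => 'end_attribute',   value => ']'},
);

# Hypothetical helper: a header line, then one line per item holding the
# type (padded) and the quoted value. The count and name keys are ignored.
sub items2csv
{
    my($items) = @_;
    my(@csv)   = ('"type","value"');

    for my $item (@$items)
    {
        (my $value = $$item{value}) =~ s/"/""/g; # Assumption: double embedded quotes.

        push @csv, sprintf('%-16s, "%s"', $$item{type}, $value);
    }

    return \@csv;
}

print map{"$_\n"} @{items2csv(\@items)};
```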
Here, the [] indicate an optional parameter.
Get or set the name of the CSV file of lexed tokens to read. This file can be output by the lexer.
'lexed_file' is a parameter to "new()". See "Constructor and Initialization" for details.
Logs the given string $s at the given log level $level.
For levels, see Log::Handler::Levels.
Get or set the logger object.
To disable logging, just set 'logger' to the empty string, in the call to "new()".
This logger is passed to the default renderer.
'logger' is a parameter to "new()". See "Constructor and Initialization" for details.
Get or set the value used by the logger object.
This option is only used if GraphViz2::Marpa::Lexer or GraphViz2::Marpa::Parser use or create an object of type Log::Handler. See Log::Handler::Levels.
'maxlevel' is a parameter to "new()". See "Constructor and Initialization" for details.
'minlevel' is a parameter to "new()". See "Constructor and Initialization" for details.
Returns an object of type GraphViz2::Marpa::Parser.
See "Constructor and Initialization" for details on the parameters accepted by "new()".
Adds a new item to the internal list of parsed items.
At the end of the run, call "items()" to retrieve this list.
Returns a hashref of all nodes, keyed by node name, with the value of each entry being a hashref of node-specific data. The keys to this hashref are:
These attributes include those specified at the class level, with (from data/55.gv):
node [shape = house]
And those specified for nodes with explicitly defined attributes:
A [color = blue]
Be warned, however: Graphviz does not apply class-level attributes to nodes with explicitly declared attributes. It applies them only to nodes defined with no attributes, or declared implicitly by appearing in the declaration of an edge:
C
...
H -> I
See 'fixed' just below.
The graph of data/55.gv, then, is expected to have just these 3 nodes in the shape of houses.
So, if you call new() as new(report_forest => 1) on data/55.gv, the output will include:
Nodes:
A. Attr: {}
B. Attr: {fillcolor => "goldenrod", shape => "square", style => "filled"}
C. Attr: {shape => "house"}
D. Attr: {fillcolor => "turquoise4", shape => "circle", style => "filled"}
E. Attr: {fillcolor => "turquoise4", shape => "circle", style => "filled"}
F. Attr: {fillcolor => "yellow", shape => "hexagon", style => "filled"}
G. Attr: {fillcolor => "darkorchid", shape => "pentagon", style => "filled"}
H. Attr: {fillcolor => "lightblue", fontsize => "20", shape => "house", style => "filled"}
I. Attr: {fillcolor => "lightblue", fontsize => "20", shape => "house", style => "filled"}
J. Attr: {fillcolor => "magenta", fontsize => "26", shape => "square", style => "filled"}
K. Attr: {fillcolor => "magenta", fontsize => "26", shape => "triangle", style => "filled"}
This is a Boolean which records whether or not Graphviz will apply class-level attributes to nodes.
See also "edges()", "style()" and "type()".
Get or set the name of the file to be passed to the renderer.
'output_file' is a parameter to "new()". See "Constructor and Initialization" for details.
Get or set the name of the file of parsed tokens for the parser to write. This file can be input to the renderer.
'parsed_file' is a parameter to "new()". See "Constructor and Initialization" for details.
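If you want to inspect a file of parsed tokens without going through the renderer, a sketch along these lines may help. It assumes the 2-column layout documented under "items()" (a "type","value" header, then one type and one quoted value per line); csv2items() is a hypothetical helper, not part of this module, and the counts it assigns are simply line positions rather than the parser's own counts.

```perl
use strict;
use warnings;

# Hypothetical reader: turn lines of a parsed file into an arrayref of
# hashrefs with the same 4 keys as the parser's items.
sub csv2items
{
    my(@lines) = @_;
    my(@items);

    for my $line (@lines)
    {
        next if $line =~ /^"type"\s*,\s*"value"\s*$/; # Skip the header.

        # Expect a bare type, a comma, and a double-quoted value.
        my($type, $value) = $line =~ /^\s*(\w+)\s*,\s*"(.*)"\s*$/ or next;

        $value =~ s/""/"/g; # Assumption: embedded quotes were doubled.

        push @items, {count => scalar(@items) + 1, name => '', type => $type, value => $value};
    }

    return \@items;
}

my($items) = csv2items
(
    '"type","value"',
    'start_attribute , "["',
    'attribute_id    , "color"',
    'attribute_value , "red"',
    'end_attribute   , "]"',
);

print "$$_{count}: $$_{type} => $$_{value}\n" for @$items;
```

For real files, a dedicated CSV module would be more robust than this regular expression.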
Reserved.
See also "edges()", "nodes()", "style()" and "type()".
Calls "tree2string([$edges])" for $self -> edges.
Called by "run()" at the end of the run, if new() was called as new(report_forest => 1).
Logs all details stored in the getters "edges()", "nodes()", "style()" and "type()".
Get or set the renderer object.
This renderer renders the tokens output by the parser.
'renderer' is a parameter to "new()". See "Constructor and Initialization" for details.
Logs the list of parsed items if new() was called as new(report_items => 1).
The [] indicate an optional parameter.
Get or set the value which determines whether or not to log the forest of paths recognised by the parser.
'report_forest' is a parameter to "new()". See "Constructor and Initialization" for details.
Get or set the value which determines whether or not to log the items recognised by the parser.
'report_items' is a parameter to "new()". See "Constructor and Initialization" for details.
Returns 0 for success and 1 for failure.
This is the only method the caller needs to call. All parameters are supplied to "new()" (or other methods).
At the end of the run, you can call any or all of these:
"edges()", "items()", "nodes()", "style()" and "type()".
If you called new() without setting any report options, you could also call:
"print_structure()" and "report()".
Returns a hashref of attributes used to style the rendered graph:
Style: {label => "Complex Syntax Test", rankdir => "TB"}
See also "edges()", "nodes()" and "type()".
Get or set the arrayref of lexed tokens to process.
'tokens' is a parameter to "new()". See "Constructor and Initialization" for details.
If $edges is not supplied, it defaults to $self -> edges.
Returns an arrayref which can be printed with:
print map{"$_\n"} @{$self -> tree2string};
Calls "tree2string([$options], [$some_tree])" in Tree::DAG_Node.
Only override this in a sub-class if you wish to log the forest in a different format.
Returns a hashref of attributes describing what type of graph it is.
Type: {digraph => "1", graph_id => "graph_55", strict => "1"}
This hashref always has the same 3 keys.
See also "edges()", "nodes()" and "style()".
Get or set the utils object.
Default: An object of type GraphViz2::Marpa::Utils.
Yes. Consider these 3 situations and their corresponding lexed or parsed output:
digraph , "yes"
graph_id , "g"
start_scope , "1"

start_subgraph , "1"
graph_id , "s"
start_scope , "2"

start_scope , "2"
No. In particular, subgraph info is still missing.
Traps for young players:
See data/38.* for good examples.
Install Marpa::PP manually. It is not mentioned in Build.PL or Makefile.PL.
Patch GraphViz2::Marpa::Parser (line 15) from Marpa::XS to Marpa::PP.
Then, run the tests which ship with this module. I've tried this, and the tests all worked. You don't need to install the code to test it. Just use:
shell> cd GraphViz2-Marpa-1.00/
shell> prove -Ilib -v t
In "Scripts" in GraphViz2::Marpa.
Items are stored in an arrayref. This arrayref is available via the "items()" method.
These items have the same format as the arrayref of items returned by the items() method in GraphViz2::Marpa::Lexer, and the same as in GraphViz2::Marpa::Lexer::DFA.
However, the precise values in the 'type' field of the following hashref vary between the lexer and the parser.
Each element in the array is a hashref:
{
	count => $integer, # 1 .. N.
	name  => '',       # Unused.
	type  => $string,  # The type of the token.
	value => $value,   # The value from the input stream.
}
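As an example of consuming this structure, the sketch below walks a list of such hashrefs and gathers attribute names and values into a single hash. The type names are the ones shown in the "items()" example; collect_attributes() is a hypothetical helper, and merging every attribute list into one hash is a simplification made for brevity.

```perl
use strict;
use warnings;

# Illustrative items in the 4-key format described above.
my(@items) =
(
    {count => 10, name => '', type => 'start_attribute', value => '['},
    {count => 11, name => '', type => 'attribute_id',    value => 'color'},
    {count => 13, name => '', type => 'attribute_value', value => 'red'},
    {count => 14, name => '', type => 'end_attribute',   value => ']'},
);

# Hypothetical helper: pair each attribute_id with the attribute_value
# which follows it, but only between start_attribute and end_attribute.
sub collect_attributes
{
    my($items) = @_;
    my(%attribute);
    my($inside, $id) = (0, undef);

    for my $item (@$items)
    {
        my($type, $value) = @$item{qw/type value/};

        if    ($type eq 'start_attribute') { $inside = 1 }
        elsif ($type eq 'end_attribute')   { $inside = 0 }
        elsif ($inside && $type eq 'attribute_id') { $id = $value }
        elsif ($inside && $type eq 'attribute_value' && defined $id)
        {
            $attribute{$id} = $value;
            $id             = undef;
        }
    }

    return \%attribute;
}

my($attribute) = collect_attributes(\@items);

print "$_ => $$attribute{$_}\n" for sort keys %$attribute;
```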
$type => $value pairs used by the parser are listed here in alphabetical order by $type:
This represents 3 special tokens where the author of the dot file used one or more of the 3 words edge, graph, or node, to specify attributes which apply to all such cases. So:
node [shape = Msquare]
means all nodes after this point in the input stream default to the Msquare shape. Of course this can be overridden by another such line, or by any specific node having a shape as part of its list of attributes.
See data/51.* for sample code.
This separates nodes from ports and ports from compass points.
'yes' => digraph and 'no' => graph.
$id is either '->' for a digraph or '--' for a graph.
This indicates the end of a set of attributes.
This indicates the end of a graph or subgraph or any stand-alone {}, and - for subgraphs - precedes the subgraph's 'end_subgraph'.
$brace_count increments by 1 each time 'start_scope' is detected in the input string, and decrements each time a matching 'end_scope' is detected.
This indicates the end of a subgraph, and follows the subgraph's 'end_scope'.
$subgraph_count increments by 1 each time 'start_subgraph' is detected in the input string, and decrements each time a matching 'end_subgraph' is detected.
This indicates both the graph's $id and each subgraph's $id.
For graphs and subgraphs, the $id may be '' (the empty string).
This indicates the start of a set of attributes.
This indicates the start of the graph, a subgraph, or any stand-alone {}.
This indicates the start of a subgraph, and precedes the subgraph's 'graph_id'.
$subgraph_count increments by 1 each time 'start_subgraph' is detected in the input string, and decrements each time a matching 'end_subgraph' is detected.
'yes' => strict and 'no' => not strict.
Consult data/*.lex and the corresponding data/*.parse for many examples.
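The brace and subgraph counters described above can be replayed over a list of items to check that every scope is closed. The sketch below assumes that the brace counter moves on 'start_scope' and 'end_scope' pairs, and that 'end_scope' comes before 'end_subgraph'; check_scopes() is a hypothetical helper, and the counts and the values given to the end tokens are made up for a graph such as digraph g { subgraph s { } }.

```perl
use strict;
use warnings;

# Illustrative items. Only the type key matters to check_scopes().
my(@items) =
(
    {count => 1, name => '', type => 'digraph',        value => 'yes'},
    {count => 2, name => '', type => 'graph_id',       value => 'g'},
    {count => 3, name => '', type => 'start_scope',    value => 1},
    {count => 4, name => '', type => 'start_subgraph', value => 1},
    {count => 5, name => '', type => 'graph_id',       value => 's'},
    {count => 6, name => '', type => 'start_scope',    value => 2},
    {count => 7, name => '', type => 'end_scope',      value => 2},
    {count => 8, name => '', type => 'end_subgraph',   value => 1},
    {count => 9, name => '', type => 'end_scope',      value => 1},
);

# Hypothetical checker: replay both counters and record the deepest nesting.
sub check_scopes
{
    my($items)     = @_;
    my(%count)     = (scope => 0, subgraph => 0);
    my($max_depth) = 0;

    for my $item (@$items)
    {
        # Split e.g. 'start_scope' into an action and a counter name.
        my($action, $what) = $$item{type} =~ /^(start|end)_(scope|subgraph)$/ or next;

        $count{$what} += $action eq 'start' ? 1 : -1;

        die "Unmatched end_$what at item $$item{count}\n" if $count{$what} < 0;

        $max_depth = $count{scope} if $count{scope} > $max_depth;
    }

    return
    {
        balanced  => ($count{scope} == 0 && $count{subgraph} == 0) ? 1 : 0,
        max_depth => $max_depth,
    };
}

my($result) = check_scopes(\@items);

print "Balanced: $$result{balanced}. Max depth: $$result{max_depth}\n";
```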
Comments are not expected in the input stream.
See http://savage.net.au/Perl-modules/html/graphviz2.marpa/Lexing.and.Parsing.with.Marpa.html.
Correct. My policy is that stand-alone modules should use a light-weight object manager (my choice is Hash::FieldHash), whereas apps can - and probably should - use Moose.
The file CHANGES was converted into Changelog.ini by Module::Metadata::Changes.
Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.
Email the author, or log a bug on RT:
https://rt.cpan.org/Public/Dist/Display.html?Name=GraphViz2::Marpa.
GraphViz2::Marpa was written by Ron Savage <ron@savage.net.au> in 2012.
Home page: http://savage.net.au/index.html.
Australian copyright (c) 2012, Ron Savage.
All Programs of mine are 'OSI Certified Open Source Software'; you can redistribute them and/or modify them under the terms of The Artistic License, a copy of which is available at: http://www.opensource.org/licenses/index.html