Ron Savage > GraphViz2-Marpa-PathUtils-1.04 > GraphViz2::Marpa::PathUtils

Download:
GraphViz2-Marpa-PathUtils-1.04.tgz

Dependencies

Annotate this POD

View/Report Bugs
Module Version: 1.04   Source  

NAME ^

GraphViz2::Marpa::PathUtils - Provide various analyses of Graphviz dot files

SYNOPSIS ^

All scripts and input and output files listed here are shipped with the distro.

Finding clusters

Either pass parameters in to new():

        GraphViz2::Marpa::PathUtils -> new
        (
            input_file      => 'data/jointed.edges.gv',
            report_clusters => 1,
            tree_dot_file   => 'data/clusters.gv',
            tree_image_file => 'html/clusters.svg',
        );

Or call methods to set parameters;

        my($parser) = GraphViz2::Marpa::PathUtils -> new;

        $parser -> input_file('data/jointed.edges.gv');
        $parser -> report_clusters(1);
        $parser -> tree_dot_file('data/fixed.length.paths.gv');
        $parser -> tree_image_file('html/fixed.length.paths.sgv');

And then:

        $parser -> find_clusters;

Command line usage:

        shell> perl scripts/find.clusters.pl -h

Or see scripts/find.clusters.sh, which hard-codes my test data values.

Finding fixed length paths

Either pass parameters in to new():

        GraphViz2::Marpa::PathUtils -> new
        (
            allow_cycles    => 1,
            input_file      => 'data/90.KW91.gv',
            path_length     => 4,
            report_paths    => 1,
            start_node      => 'Act_1',
            tree_dot_file   => 'data/fixed.length.paths.gv',
            tree_image_file => 'html/fixed.length.paths.svg',
        ) -> find_fixed_length_paths;

Or call methods to set parameters;

        my($parser) = GraphViz2::Marpa::PathUtils -> new;

        $parser -> allow_cycles(1);
        $parser -> input_file('data/90.KW91.gv');
        $parser -> path_length(4);
        $parser -> report_paths(1);
        $parser -> start_node('Act_1');
        $parser -> tree_dot_file('data/fixed.length.paths.gv');
        $parser -> tree_image_file('html/fixed.length.paths.sgv');

And then:

        $parser -> find_fixed_length_paths;

Command line usage:

        shell> perl scripts/find.fixed.length.paths.pl -h

Or see scripts/fixed.length.paths.sh, which hard-codes my test data values.

DESCRIPTION ^

GraphViz2::Marpa::PathUtils parses Graphviz dot files and processes the output in various ways.

This class is a descendent of GraphViz2::Marpa, and hence inherits all its keys to new(), and all its methods.

Currently, the only feature available is to find all paths of a given length starting from a given node.

Sample output: http://savage.net.au/Perl-modules/html/graphviz2.pathutils/index.html.

Command line options and object attributes: http://savage.net.au/Perl-modules/html/graphviz2.pathutils/code.attributes.html.

Scripts shipped with this distro ^

All scripts are in the scripts/ directory. This means they do not get installed along with the package.

Data files are in data/, while html and svg files are in html/.

o code.attributes2html.pl

Generate both data/code.attributes.csv and data/code.attributes.html.

o copy.config.pl

During installation, this copies config/.htgraphviz2.marpa.pathutils.conf to a dir as discussed under "The Configuration File".

o find.clusters.pl

This runs the "find_clusters()" method in GraphViz2::Marpa::PathUtils.

o find.clusters.sh

This runs find.clusters.pl with hard-coded parameters, and is what I use for testing new code.

o find.fixed.length.paths.pl

This runs the "find_fixed_length_paths()" method in GraphViz2::Marpa::PathUtils.

Try shell> perl find.fixed.length.paths.pl -h

o find.fixed.length.paths.sh

This runs find.fixed.length.paths.pl with hard-coded parameters, and is what I use for testing new code.

o generate.demo.pl

This uses the Text::Xslate template file htdocs/assets/templates/graphviz2/marpa/pathutils/pathutils.tx to generate html/index.html.

o generate.demo.sh

Runs generate.demo.pl and code.attributes2html.pl.

Then copies html/*.html and html/*.svg to my web server's dir, $DR/Perl-modules/html/graphviz2.pathutils/.

o pod2html.sh

Converts all *.pm files to *.html, and copies them in my web server's dir structure.

o test.set.tiny.pl

Check that Set::Tiny's is_subset() and is_proper_subset() behave as expected.

See also t/test.all.t.

Distributions ^

This module is available as a Unix-style distro (*.tgz).

See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing distros.

Installation ^

The Module Itself

Install GraphViz2::Marpa::PathUtils as you would for any Perl module:

Run:

        cpanm GraphViz2::Marpa::PathUtils

or run:

        sudo cpan GraphViz2::Marpa::PathUtils

or unpack the distro, and then either:

        perl Build.PL
        ./Build
        ./Build test
        sudo ./Build install

or:

        perl Makefile.PL
        make (or dmake or nmake)
        make test
        make install

The Configuration File

All that remains is to tell GraphViz2::Marpa::PathUtils your values for some options.

For that, see config/.htgraphviz2.marpa.pathutils.conf.

If you are using Build.PL, running Build (without parameters) will run scripts/copy.config.pl, as explained next.

If you are using Makefile.PL, running make (without parameters) will also run scripts/copy.config.pl.

Either way, before editing the config file, ensure you run scripts/copy.config.pl. It will copy the config file using File::HomeDir, to a directory where the run-time code in GraphViz2::Marpa::PathUtils will look for it.

        shell>cd GraphViz2-Marpa-PathUtils-1.00
        shell>perl scripts/copy.config.pl

Under Debian, this directory will be $HOME/.perl/GraphViz2-Marpa-PathUtils/. When you run copy.config.pl, it will report where it has copied the config file to.

Check the docs for File::HomeDir to see what your operating system returns for a call to my_dist_config().

The point of this is that after the module is installed, the config file will be easily accessible and editable without needing permission to write to the directory structure in which modules are stored.

That's why File::HomeDir and Path::Class are pre-requisites for this module.

Although this is a good mechanism for modules which ship with their own config files, be advised that some CPAN tester machines run tests as users who don't have home directories, resulting in test failures.

Constructor and Initialization ^

Calling new()

new() is called as my($obj) = GraphViz2::Marpa::PathUtils -> new(k1 => v1, k2 => v2, ...).

It returns a new object of type GraphViz2::Marpa::PathUtils.

This class is a descendent of GraphViz2::Marpa, and hence inherits all its keys to new(), and all its methods.

Specifically, see "Constructor and Initialization" in GraphViz2::Marpa for more options to new(), including maxlevel.

Further, these key-value pairs are accepted in the parameter list (see corresponding methods for details [e.g. "path_length($integer)"]):

o allow_cycles => $integer

Specify whether or not cycles are allowed in the paths found.

Values for $integer:

o 0 - Do not allow any cycles

This is the default.

o 1 - Allow any node to be included once or twice.

Default: 0.

This option is only used when calling "find_fixed_length_paths()".

o driver => thePathToDot

Specify the OS's path to the dot program, to override the default.

Default: Use which('dot'), via the module File::Which, to find the dot executable.

o format => $aDOTOutputImageFormat

Specify the image type to pass to dot, as the value of dot's -T option.

Default: 'svg'.

o path_length => $integer

Specify the length of all paths to be included in the output.

Here, length means the number of edges between nodes.

Default: 0.

This parameter is mandatory, and must be > 0.

This option is only used when calling "find_fixed_length_paths()".

o report_clusters => $Boolean

Specify whether or not to print a report of the clusters found.

Default: 0 (do not print).

This option is only used when calling "find_clusters()".

o report_paths => $Boolean

Specify whether or not to print a report of the paths found.

Default: 0 (do not print).

This option is only used when calling "find_fixed_length_paths()".

o start_node => $theNameOfANode

Specify the name of the node where all paths must start from.

Default: ''.

This parameter is mandatory.

The name can be the empty string, but must not be undef.

This option is only used when calling "find_fixed_length_paths()".

o tree_dot_file => aDOTInputFileName

Specify the name of a file to write which will contain the DOT description of the image of all solutions.

Default: ''.

This file is not written if the value is ''.

o tree_image_file => aDOTOutputFileName

Specify the name of a file to write which will contain the output of running dot.

The value of the format option determines what sort of image is created.

Default: ''.

This file is not written if the value is ''.

Methods ^

This class is a descendent of GraphViz2::Marpa, and hence inherits all its methods.

Further, these methods are implemented.

allow_cycles([$integer])

Here the [] indicate an optional parameter.

Get or set the value determining whether or not cycles are allowed in the paths found.

'allow_cycles' is a parameter to "new()". See "Constructor and Initialization" for details.

cluster_edge_set()

Returns an arrayref of clusters, where each element is an arrayref.

Within the inner arrayrefs, each element is a 2-element arrayref. If the 2nd element is defined, the 2 elements are the ends of an edge. If the 2nd element is not defined, the 1st element is the name of a node which is the only node in the set.

See the source code for "output_cluster_image()" for sample code.

cluster_set()

Returns an arrayref of clusters, where each element is an object of type Set::Tiny. The members of each set are the stringified names of the members of the clusters.

See the source code of "report_cluster_members()" for sample usage.

dot_input()

Returns the string which will be input to the dot program.

dot_output()

Returns the string which has been output by the dot program.

driver([$pathToDot])

Here the [] indicate an optional parameter.

Get or set the OS's path to the dot program.

find_clusters()

This is one of the methods which does all the work, and hence must be called. The other is "find_fixed_length_paths()".

See the "Synopsis" and scripts/find.clusters.pl.

Returns 0 for success and 1 for failure.

find_fixed_length_paths()

This is one of the methods which does all the work, and hence must be called. The other is "find_clusters()".

See the "Synopsis" and scripts/find.fixed.length.paths.pl.

Returns 0 for success and 1 for failure.

fixed_path_set()

Returns the arrayref of paths found. Each element is 1 path, and paths are stored as an arrayref of objects of type Tree.

See the source code of sub "report_fixed_length_paths()" for sample usage.

format([$string])

Here the [] indicate an optional parameter.

Get or set the type of image to be output when running dot.

'format' is a parameter to "new()". See "Constructor and Initialization" for details.

new()

See "Constructor and Initialization" for details on the parameters accepted by "new()".

output_cluster_image()

This writes the clusters found, as a DOT output file, as long as new(tree_image_file => $name) was specified.

output_fixed_length_image($title)

This writes the paths found, as a DOT output file, as long as new(tree_image_file => $name) was specified, or if tree_image_file($name) was called before "find_fixed_length_paths()" was called.

output_dot_file($title)

This writes the DOT file generated, as long as new(tree_dot_file => $name) was specified.

path_length([$integer])

Here the [] indicate an optional parameter.

Get or set the length of the paths to be searched for.

'path_length' is a parameter to "new()". See "Constructor and Initialization" for details.

report_cluster_members()

This prints the clusters found, if new() was called as new(report_clusters => 1).

report_clusters([$Boolean])

Here the [] indicate an optional parameter.

Get or set the option which determines whether or not the clusters found are printed.

'report_clusters' is a parameter to "new()". See "Constructor and Initialization" for details.

report_fixed_length_paths()

This prints the fixed length paths found, if new() was called as new(report_paths => 1).

report_paths([$Boolean])

Here the [] indicate an optional parameter.

Get or set the option which determines whether or not the fixed length paths found are printed.

'report_paths' is a parameter to "new()". See "Constructor and Initialization" for details.

start_node([$string])

Here the [] indicate an optional parameter.

Get or set the name of the node from where all paths must start.

'start_node' is a parameter to "new()". See "Constructor and Initialization" for details.

tree_dot_file([$name])

Here the [] indicate an optional parameter.

Get or set the name of the dot input file to write.

'tree_dot_file' is a parameter to "new()". See "Constructor and Initialization" for details.

tree_image_file([$name])

Here the [] indicate an optional parameter.

Get or set the name of the dot output file to write.

The type of image comes from the format parameter to new().

'tree_image_file' is a parameter to "new()". See "Constructor and Initialization" for details.

FAQ ^

I used a node label of "\N" and now your module doesn't work!

The most likely explanation is that you're calling find_fixed_path_lengths() and you've specified all nodes, or at least some, to have a label like "\N".

This escape sequence triggers special processing in AT&T's Graphviz to generate labels for nodes, overriding the code I use to re-name nodes in the output of find_fixed_path_lengths().

See _prepare_fixed_length_output() for the gory details.

The purpose of my re-numbering code is to allow a node to appear in the output multiple times but to stop Graphviz automatically regarding all such references to be the same node. Giving a node different names (which are un-seen) but the same label (which is seen) makes Graphviz think they are really different nodes.

The 3 samples in part 2 of the demo page should make this issue clear.

What is the homepage of Marpa?

http://jeffreykegler.github.com/Marpa-web-site/.

See also Jeffrey's the annotated blog.

How are clusters named?

The names of the nodes in each cluster are sorted, and the first is arbitrarily chosen as the name of the cluster.

Sometimes the cluster has the wrong shape for a node

Correct. The code does not handle a file such as:

        digraph X
        {
            node [shape = Mdiamond]
            node_1
            node [shape = Msquare]
            node_2
        }

The output uses the last node shape found by the parser, Msquare, for all nodes which don't have a shape specified explicitly.

See data/03.clusters.in.gv for an instance. See also "TODO".

In clusters, edge attributes in the input file are ignored

Correct. The code does not implement the complexity required to handle this yet. See also "TODO".

How are cycles in fixed path length analysis handled?

This is controlled by the allow_cycles option to new(), or the corresponding method "allow_cycles($integer)".

The code keeps track of the number of times each node is entered. If new(allow_cycles => 0) was called, nodes are only considered if they are entered once. If new(allow_cycles => 1) was called, nodes are also considered if they are entered a second time.

Sample code: Using the input file data/90.KW91.lex (see scripts/fixed.length.paths.sh) we can specify various combinations of parameters like this:

        allow_cycles  path_length  start node  solutions
        0             3            Act_1       9
        1             3            Act_1       22

        0             4            Act_1       12
        1             4            Act_1       53

Are all (fixed length) paths found unique?

Yes, as long as they are unique in the input. Something like this produces 8 identical solutions (starting from A, of path length 3) because each node B, C, D, can be entered in 2 different ways, and 2**3 = 8.

        digraph G
        {
            A -> B -> C -> D;
            A -> B -> C -> D;
        }

See data/01.non.unique.gv and html/01.non.unique.svg.

The number of options is confusing!

Agreed. Remember that this code calls GraphViz2::Marpa's run() method, which expects a large number of options because it calls both the lexer and the parser.

Isn't your code at risk from the 'combinatorial explosion' problem?

Yes. The code does limit the number of possibilies as quickly as possible, but of course there will always be graphs which can't be processed by this module.

Such graphs are deemed to be pathological.

Why do I get error messages like the following?

        Error: <stdin>:1: syntax error near line 1
        context: digraph >>>  Graph <<<  {

Graphviz reserves some words as keywords, meaning they can't be used as an ID, e.g. for the name of the graph.

The keywords are: digraph, edge, graph, node, strict and subgraph. Compass points are not keywords.

See keywords in the discussion of the syntax of DOT for details.

So, don't do this:

        strict graph graph{...}
        strict graph Graph{...}
        strict graph strict{...}
        etc...

Likewise for non-strict graphs, and digraphs. You can however add double-quotes around such reserved words:

        strict graph "graph"{...}

Even better, use a more meaningful name for your graph...

How does the code handle ports attached to nodes?

So far, the code has not been tested on graphs which use ports.

This module uses Hash::FieldHash, which has an XS component!

Correct. My policy is that stand-alone modules should use a light-weight object manager (my choice is Hash::FieldHash), whereas apps can - and probably should - use Moose.

Reference ^

Combinatorial Algorithms for Computers and Calculators, A Nijenhuis and H Wilf, p 240.

This books very clearly explains the backtracking parser I used to process the combinations of nodes found at each point along each path. Source code in the book is in Fortran.

The book is now downloadable as a PDF from http://www.math.upenn.edu/~wilf/website/CombAlgDownld.html.

TODO ^

o High priority - Handle ports
o Low priority - Perhaps implement logic to find paths which end on a given node

Version Numbers ^

Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.

Machine-Readable Change Log ^

The file CHANGES was converted into Changelog.ini by Module::Metadata::Changes.

Support ^

Email the author, or log a bug on RT:

https://rt.cpan.org/Public/Dist/Display.html?Name=GraphViz2::Marpa::PathUtils.

Author ^

GraphViz2::Marpa::PathUtils was written by Ron Savage <ron@savage.net.au> in 2012.

Home page: http://savage.net.au/index.html.

Copyright ^

Australian copyright (c) 2012, Ron Savage.

        All Programs of mine are 'OSI Certified Open Source Software';
        you can redistribute them and/or modify them under the terms of
        The Artistic License, a copy of which is available at:
        http://www.opensource.org/licenses/index.html
syntax highlighting: