search.cpan.org is shutting down
Jonathan Yu > Algorithm-Diff-Any-1.001 > Algorithm::Diff::Any

Algorithm-Diff-Any-1.001.tar.gz

Dependencies

Annotate this POD

# CPAN RT

 Open 0
View/Report Bugs
Module Version: 1.001

# NAME

Algorithm::Diff::Any - Perl module to find differences between files

# VERSION

Version 1.001 (\$Id: Any.pm 10595 2009-12-23 00:29:52Z FREQUENCY@cpan.org \$)

# DESCRIPTION

This is a simple module to select the best available implementation of the standard `diff` algorithm, which works by effectively trying to solve the Longest Common Subsequence (LCS) problem. This algorithm is described in: A Fast Algorithm for Computing Longest Common Subsequences, CACM, vol.20, no.5, pp.350-353, May 1977.

However, it is algorithmically rather complicated to solve the LCS problem; for arbitrary sequences, it is an NP-hard problem. Simply comparing two strings together of lengths n and m is O(n x m). Consequently, this means the algorithm necessarily has some tight loops, which, for a dynamic language like Perl, can be slow.

In order to speed up processing, a fast (C/XS-based) implementation of the algorithm's core loop was implemented. It can confer a noticable performance advantage (benchmarks show a 54x speedup for the `compact_diff` routine).

# SYNOPSIS

```  use Algorithm::Diff::Any;

my \$diff = Algorithm::Diff::Any->new(\@seq1, \@seq2);```

For complete usage details, see the Object-Oriented interface description for the Algorithm::Diff module.

# PURPOSE

The intent of this module is to provide single simple interface to the two (presumably) compatible implementations of this module, namely, Algorithm::Diff and Algorithm::Diff::XS.

If, for some reason, you need to determine what version of the module is actually being included by `Algorithm::Diff::Any`, then:

`  print 'Backend type: ', \$Algorithm::Diff::Any::DRIVER, "\n";`

In order to force use of one or the other, simply load the appropriate module:

```  use Algorithm::Diff::XS;
my \$diff = Algorithm::Diff::XS->new();
# or
use Algorithm::Diff;
my \$diff = Algorithm::Diff->new();```

# COMPATIBILITY

This module was tested under Perl 5.10.1, using Debian Linux. However, because it's Pure Perl and doesn't do anything too obscure, it should be compatible with any version of Perl that supports its prerequisite modules.

# EXPORTABLE FUNCTIONS

The following functions are available for import into your namespace:

• prepare
• LCS
• LCSidx
• LCS_length
• diff
• sdiff
• compact_diff
• traverse_sequences
• traverse_balanced

For full documentation, see the relevant functional descriptions in the Pure Perl implementation, Algorithm::Diff.

# METHODS

## new

`  Algorithm::Diff::Any->new( \@seq1, \@seq2, \%opts );`

Creates a `Algorithm::Diff::Any` object, based upon either the optimized C/XS version of the algorithm, Algorithm::Diff::XS, or falls back to the Pure Perl implementation, Algorithm::Diff.

Example code:

```  my \$diff = Algorithm::Diff::Any->new( \@seq1, \@seq2 );
# or with options
my \$diff = Algorithm::Diff::Any->new( \@seq1, \@seq2, \%opts );```

This method will return an appropriate Algorithm::Diff::Any object or throw an exception on error.

## Next

`  \$diff->Next( \$count )`

See Algorithm::Diff for method documentation.

## Prev

`  \$diff->Prev( \$count )`

See Algorithm::Diff for method documentation.

## Reset

`  \$diff->Reset( \$pos )`

See Algorithm::Diff for method documentation.

## Copy

`  \$diff->Copy( \$pos, \$newBase )`

See Algorithm::Diff for method documentation.

## Base

`  \$diff->Base( \$newBase )`

See Algorithm::Diff for method documentation.

## Diff

`  \$diff->Diff( )`

See Algorithm::Diff for method documentation.

## Same

See Algorithm::Diff for method documentation.

Code example:

`  \$diff->Same( )`

## Items

`  \$diff->Items( \$seqNum )`

See Algorithm::Diff for method documentation.

## Range

`  \$diff->Range( \$seqNum, \$base )`

See Algorithm::Diff for method documentation.

## Min

`  \$diff->Min( \$seqNum, \$base )`

See Algorithm::Diff for method documentation.

## Max

`  \$diff->Max( \$seqNum, \$base )`

See Algorithm::Diff for method documentation.

## Get

`  \$diff->Get( @names )`

See Algorithm::Diff for method documentation.

# AUTHOR

Jonathan Yu <jawnsy@cpan.org>

# ACKNOWLEDGEMENTS

• Many thanks go to the primary authors and maintainers of the Pure Perl implementation of this algorithm, notably:
• Mark-Jason Dominus <mjd-perl-diff@plover.com>
• Tye McQueen <tyemq@cpan.org>
• Thanks to Audrey Tang <cpan@audreyt.org>, author of Algorithm::Diff::XS, for recognizing the value of Joe Schaefer's <apreq-dev@httpd.apache.org> work on Algorithm::LCS
• Neither the Pure Perl nor C/XS-based implementations of this module would have been possible without the work of James W. Hunt (Stanford University) and Thomas G. Szymanski (Princeton University), authors of the often-cited paper for computing longest common subsequences.

In their abstract, they claim that a running time of O(n log n) can be expected, with a worst-case time of O(n^2 log n) for two subsequences of length n.

# SUPPORT

You can find documentation for this module with the perldoc command.

`    perldoc Algorithm::Diff::Any`

You can also look for information at:

# REPOSITORY

You can access the most recent development version of this module at:

http://svn.ali.as/cpan/trunk/Algorithm-Diff-Any

If you are a CPAN developer and would like to make modifications to the code base, please contact Adam Kennedy <adamk@cpan.org>, the repository administrator. I only ask that you contact me first to discuss the changes you wish to make to the distribution.

# FEEDBACK

Please send relevant comments, rotten tomatoes and suggestions directly to the maintainer noted above.

If you have a bug report or feature request, please file them on the CPAN Request Tracker at http://rt.cpan.org. If you are able to submit your bug report in the form of failing unit tests, you are strongly encouraged to do so.

Algorithm::Diff, the classic reference implementation for finding the differences between two chunks of text in Perl. It is based on the algorithm described in A Fast Algorithm for Computing Longest Common Subsequences, CACM, vol.20, no.5, pp.350-353, May 1977.

Algorithm::Diff::XS, the C/XS optimized version of Algorithm::Diff, which will be used automatically if available.

# CAVEATS

## KNOWN BUGS

There are no known bugs as of this release.

## LIMITATIONS

• It is not currently known whether Algorithm::Diff (Pure Perl version) and Algorithm::Diff::XS (C/XS implementation) produce the same output. The algorithms may not be equivalent (source code-wise) so they may produce different output under some as-yet-undiscovered conditions.
• Any potential performance gains will be limited by those features implemented by Algorithm::Diff::XS. As of time of writing, this is limited to the `cdiff` subroutine.

# QUALITY ASSURANCE METRICS

## TEST COVERAGE

```  -------------------------- ------ ------ ------ ------ ------ ------
File                        stmt   bran   cond   sub    pod   total
-------------------------- ------ ------ ------ ------ ------ ------
lib/Algorithm/Diff/Any.pm  100.0  100.0  100.0  100.0  100.0  100.0
Total                      100.0  100.0  100.0  100.0  100.0  100.0```