The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
#!/usr/local/bin/perl -w

=head1 NAME

mat2harbo.pl - Convert matrix in Senseclusters sparse format to Harwell-Boeing (HB)  
format and set input parameters (lap2) for input to SVDPACKC. 

=head1 SYNOPSIS

 mat2harbo.pl [OPTIONS] MATRIX

The file input is a SenseClusters sparse matrix

 cat input

Output => 

 5 4 12
 1 1.5 3 2.5 4 1.0
 2 2.5 3 2.5
 1 1.5 3 2.5 4 1.0
 2 2.5 3 2.5
 2 2.5 3 2.5

Convert that to Harwell-Boeing form.

 mat2harbo.pl input --title "matrix format convestion" --id "sample" --numform 10f8.4

Output => 

 matrix format convestion                                                sample
 #
 rra                        5             4            12             0
           (10i8)          (10i8)            (10f8.4)            (10f8.4)
        1       3       6      11      13
        1       3       2       4       5       1       2       3       4       5
        1       3
   1.5000  1.5000  2.5000  2.5000  2.5000  2.5000  2.5000  2.5000  2.5000  2.5000
   1.0000  1.0000

The Harwell Boeing format stores data in 80 columns. The numform 10f8.4 says that 
there should be 10 numbers per line, each with 8 numeric values, where 4 digits are to 
the right of the decimal point. 

See L<http://math.nist.gov/MatrixMarket/formats.html#hb> for a detailed explanation of 
Harwell Boeing format. 

Type C<mat2harbo.pl --help> for a quick summary of options

=head1 DESCRIPTION

Converts a sparse matrix in SenseClusters format to Harwell-Boeing (HB) 
sparse format, which is the format required by SVDPACKC. This program also creates 
(optionally) the lap2 file which provides parameter settings for SVDPACKC.

=head1 INPUT

=head2 Required Arguments:

=head3 MATRIX

A sparse MATRIX in SenseClusters' format that is to be converted into 
Harwell Boeing format.

First line should show exactly 3 numbers separated by blanks as :

 #nrows #ncols #nnz

where 

 #nrows = Number of rows 
 #ncols = Number of columns 
 #nnz = Total number of non-zero values

in the MATRIX.

Each line thereafter should show a row of the MATRIX in sparse format.
A sparse row should be a space separated list of pairs of numbers 
where the first number shows the column index of a non-zero value 
and second number is the non-zero value itself that appears at that column 
index.

Column index counting starts from 1.

Sample MATRIX examples =>

=over 4

=item 1

 5 5 15
 2 9 4 9
 1 6 2 5 3 7 4 8 5 6
 1 4 2 5
 1 7 2 6 3 7
 1 9 2 8 3 9

Shows a 5 x 5 integer matrix containing total 15 non-zero elements.
Each ith line after the first line shows the non-zero elements in the ith
row. e.g. 2nd line (1st row) has 2 non-zero values (both 9) at column indices 
2 and 4. 6th line (5th row) has 3 non-zero values; 9 at index 1, 8 at index 2
and 9 at index 3.

=item 2 

 7 8 34
 1 0.160 2 -0.059 3 1.864 5 0.724 6 -0.472 7 -0.467
 2 -0.209 4 1.487 5 6.728 7 -3.085 8 1.396
 1 14.594 3 -2.858 4 -0.618 6 16.510 8 -2.314
 3 -0.384 5 -1.189 7 -0.155 8 0.006
 1 -0.128 3 0.020 4 -0.125 8 0.039
 2 0.062 3 0.058 4 0.016 5 0.057 7 0.407 8 0.015
 4 0.033 6 1.377 7 0.074 8 0.994

Shows a 7 x 8 real matrix =>

   7 8
   0.160 -0.059  1.864  0.000  0.724 -0.472 -0.467  0.000
   0.000 -0.209  0.000  1.487  6.728  0.000 -3.085  1.396
  14.594  0.000 -2.858 -0.618  0.000 16.510  0.000 -2.314
   0.000  0.000 -0.384  0.000 -1.189  0.000 -0.155  0.006
  -0.128  0.000  0.020 -0.125  0.000  0.000  0.000  0.039
   0.000  0.062  0.058  0.016  0.057  0.000  0.407  0.015
   0.000  0.000  0.000  0.033  0.000  1.377  0.074  0.994

=back

=head2 Optional Arguments:

=head4 --title TITLE

Allows user to specify the Title of the MATRIX which is displayed at Line1
(1-72) of the output HB matrix.
If --title is not specified, mat2harbo uses the MATRIX file name as 
the default title. 

=head4 --id ID

Programs processing the HB formatted matrix can identify the matrix by the ID 
specified at Line1 (73-80). Default ID is "harbomat". This identifier is limited to 8 
characters. 

=head4 --cpform CPFORM

Specifies the Column Pointer Format. The column pointer should have the format
of type MiN which indicates that each line in Block1 contains M integer
pointers each occupying N character spaces. Default format is 10i8. 

Note: M x N must be 80. 

=head4 --rpform RPFORM

Specifies the Row Pointer Format for row pointers in Block2. This has same 
MiN type of format as --cpform.

=head4 --numform NUMFORM

Specifies the Numeric Format to represent the matrix values in Block3.

mat2harbo allows 2 numeric formats :

=over
=item 1. MiN type, which has same interpretation as --cpform and --rpform.

=item 2. MfD.F - which means that there are total M real numbers on each line of block3, each occupying total D digit/character space, of which last F digits show fractional portion. 
=back

Thus, Matrix values could be Integer or Real, selected by specifying a 
particular format. 

Default NUMFORM is (5f16.10) which uses 16 digits for each MATRIX value of which
last 10 digits stand for the fractional part and each line contains 5 such 
real numbers.

=head3 Parameter Setting Options :

The options listed in this section create the parameter file (lap2) for las2.c
automatically.

=head4 --param 

Creates the parameter file lap2 that can be directly used while running las2.

=head4 --k K

Sets the value of maxprs option in LAP2 file to K i.e. 
requests K singular triplets from las2. Value of K should not exceed the 
number of columns of MATRIX.
Default K = 300

=head4 --rf RF

Reduces the dimensions of the column space of the MATRIX by scaling factor RF
i.e. if the MATRIX has N columns, maxprs is set to N/RF where RF > = 1

In other words, N/RF singular triplets are requested from las2.
Default RF = 10 that reduces the column space to 10% or preserves 10% of the 
original dimensions.

If both --k and --rf are specified, maxprs = min(K,N/RF)
Thus, default maxprs = min(300,N/10)

=head4 --iter I

Specifies the number of iterations for las2.
I, if specified, should not exceed the number of columns in the MATRIX
and I should be at least as high as maxprs.
Default I = min((3 * maxprs),#cols) where maxprs = min(K,N/RF).

=head3 Help on setting parameters in file las2.h

The header file las2.h in SVDPACKC specifies values of various constants
for las2. This section provides some guidelines on setting these constants
for using SenseClusters. Please note that the version of SVDPACKC found in /External 
has been modified with the settings as described below.

=over

=item * NMAX

Specifies the maximum possible number of columns in the matrix given to las2.
las2.h initially has a value of NMAX = 3000, which allows a maximum of 
3000 columns. However, we have found this default is too small for many 
of our experiments, so we recommend setting NMAX much higher. We  
routinely use a value of 30,000, and will assume that the user has 
reset NMAX in las2.h to this value in the rest of this discussion.

In general, this value should be higher than NCOLS shown by the 3rd column
on the 3rd line in the output of mat2harbo.pl.

=item * NZMAX

Specifies the maximum possible number of non-zero values in the matrix.
Initially the settings in las2.h have NZMAX = 100000. However, 
again we have found this to be too small. If the user sets NMAX to 30,000, 
and if we assume a 30,000 x 30,000 matrix is approximately 1%   dense,  
NZMAX could be set to 9,000,000 (30,000 x 30,000 / 100). This is the value 
we routinely use, and we will assume that the user has reset NZMAX to this 
value in the rest of this discussion. 

The user can check the exact NZMAX for their matrix on line 3 column 4 of  
the output matrix displayed by mat2harbo.pl and then set NZMAX to  
something higher than that.

=item * LMTNW

This specifies the maximum total memory to be allocated by las2. The  
initial setting of LMTNW in las2.h is 600000, however, we find that this 
is often too small. In general, the size of LMTNW is determined by the  
values you set NMAX and NZMAX to. LMTNW should be at least as large as :

 LMTNW = (6*NMAX + 4*NMAX + 1 + NZMAX*NZMAX) 

mat2harbo.p assumes that NMAX has been reset to 30,000 and that NZMAX is  
set to 9,000,000.  Thus, 

 LMTNW = ((6 * 30,000) + (4 * 30,000) + 1 + (30,000 * 30,000))

This leads to the new value for LMTNW of 900,300,001, which is equivalent 
to a maximum working memory size of 1 GB. We have found this size to be  
more than adquate to do SVD on a 25,000 x 25,000 matrix.

math2arbo.pl will show an advisory message indicating the minimum size 
that LMNTW should be set for, and will issue a warning message if the  
actual size needed for the user matrix exceeds 900,300,001 (approx 1 GB).

Memory is dynamically allocated by las2 depending upon the size of the input
matrix, irrespective of the value of LMTNW. In short, LMTNW specifies the
upper limit on memory consumption and the actual consumption depends on the
size of the matrix. Hence, LMTNW doesn't specify the total memory that las2
will *always* consume rather its an upper limit that could be consumed if
necessary.

=back

In case if las2 fails due to insufficient values of these parameters 
as indicated by the las2.h file, an error message will be shown in output 
file lao2 suggesting that the matrix is too large or something ... User is
adviced to check 3rd line of the matrix in Harwell-Boeing format (as produced 
by this program) that is given to las2. Check if NCOLS shown at column 3 of 
line 3 in the HB matrix exceeds NMAX. If so, increase NMAX to something higher
than NCOLS. If not, check if NNZ shown by column 4 on line 3 of the HB matrix 
exceeds NZMAX in las2.h, if so, increase NZMAX. If not, increase the LMTNW
to something higher than (6*NMAX + 4*NMAX + 1 + NMAX*NMAX), or simply
increase it without too much computations until las2 succeeds :-)

The other problem that a user might notice is that sometimes las2 runs 
for a very long time like more than few days. In such case, user is advised
to restart las2 by reducing the values of parameters 'maxprs' and 'iter' in 
parameter file lap2. Specifically, the 2nd parameter in lap2 is iter and 
the 3rd one is maxprs. Remember that, iter has to be >= maxprs.

=head3 Other Options :

=head4 --help

Displays this message.

=head4 --version

Displays the version information.

=cut

###############################################################################

#                               THE CODE STARTS HERE

#$0 contains the program name along with
#the complete path. Extract just the program
#name and use in error messages
$0=~s/.*\/(.+)/$1/;

###############################################################################

#                           ================================
#                            COMMAND LINE OPTIONS AND USAGE
#                           ================================

# command line options
use Getopt::Long;
use POSIX 'ceil';

GetOptions ("help","version","title=s","id=s","cpform=s","rpform=s","numform=s","param","k=i","iter=i","rf=i");
# show help option
if(defined $opt_help)
{
        $opt_help=1;
        &showhelp();
        exit;
}

# show version information
if(defined $opt_version)
{
        $opt_version=1;
        &showversion();
        exit;
}

# show minimal usage message if no arguments
if($#ARGV<0)
{
        &showminimal();
        exit;
}

#############################################################################

#                       ================================
#                          INITIALIZATION AND INPUT
#                       ================================

if(!defined $ARGV[0])
{
        print STDERR "ERROR($0):
        Please specify the Matrix file ...\n";
        exit;
}
#accept the input file name
$infile=$ARGV[0];
if(!-e $infile)
{
        print STDERR "ERROR($0):
        Matrix file <$infile> doesn't exist...\n";
        exit;
}

open(IN,$infile) || die "Error($0):
        Error(code=$!) in opening Matrix file <$infile>.\n";

# ---------------
# Reading infile
# ---------------

$line=1;

# line1 in Matrix file should either show the
# <keyfile> tag or #rows #cols #nnz
$line1=<IN>;
if($line1=~/keyfile/)
{
        $line1=<IN>;
        $line++;
}

if($line1=~/^\s*(\d+)\s+(\d+)\s+(\d+)\s*$/)
{
        $rows=$1;
        $cols=$2;
        $nnz1=$3;
}
else
{
        print STDERR "ERROR($0):
        Line $line in Matrix file <$infile> should show #rows #cols #nnz.\n";
        exit;
}

# default parameters
if(defined $opt_param)
{
        if(!defined $opt_k)
        {
                $opt_k=300;
        }
        if(!defined $opt_rf)
        {
                $opt_rf=10;
        }
        elsif($opt_rf<1)
        {
                print STDERR "ERROR($0):
        --rf = $opt_rf should be greater than 1.\n";
                exit;
        }
	
	my $tmp_ratio = ceil($cols/$opt_rf);
        $maxprs=$opt_k > $tmp_ratio ? $tmp_ratio : $opt_k;

	if(!defined $opt_iter || $opt_iter < $maxprs || $opt_iter > $cols)
        {
                $opt_iter=(3*$maxprs) > $cols ? $cols : (3*$maxprs);
        }
        $iter=$opt_iter;
}

# output format options
if(defined $opt_cpform)
{
        if($opt_cpform =~ /^(\d+)i(\d+)$/)
	{
		if(($1 * $2) !=80)
		{
			print STDERR "ERROR($0):
	(#pointers/line ($1)) * (#digits/pointer ($2)) != 80 
	in --cpform = $opt_cpform.\n";
			exit;
		}
	}
	else
        {
                print STDERR "ERROR($0):
        Invalid column pointer format --cpform=$opt_cpform.\n";
                exit;
        }
}
if(defined $opt_rpform)
{
        if($opt_rpform =~ /^(\d+)i(\d+)$/)
	{
		if($1 * $2 !=80)
		{
			print STDERR "ERROR($0):
	(#pointers/line ($1)) * (#digits/pointer ($2)) !=80 
	in --rpform = $opt_rpform.\n";
			exit;
		}
	}
	else
        {
                print STDERR "ERROR($0):
        Invalid row pointer format --rpform=$opt_rpform.\n";
                exit;
        }
}
if(defined $opt_numform)
{
	if($opt_numform =~ /^(\d+)i(\d+)$/)
	{
		if($1 * $2 != 80)
		{
			print STDERR "ERROR($0):
	(#integers/line ($1)) * (#digits/integer ($2)) != 80 
	in --numform = $opt_numform.\n";
			exit;
		}
	}
	elsif($opt_numform =~ /^(\d+)f(\d+)\.\d+$/)
	{
		if($1 * $2 != 80)
		{
			print STDERR "ERROR($0):
	(#reals/line ($1)) * (#digits/real ($2)) != 80
	in --numform = $opt_numform.\n";
			exit;
		}
	}
	else
        {
                print STDERR "ERROR($0):
        Invalid number format --numform=$opt_numform.\n";
                exit;
        }
}

# reading the sparse matrix
# in row order
$row=0;
while(<IN>)
{
	$line++;
	$row++;
	chomp;
	s/^\s*//;
	s/\s*$//;
	if(/^\s*$/)
	{
		next;
	}
	@pairs=split(/\s+/);
	for($i=0; $i<$#pairs; $i=$i+2)
	{
		$index=$pairs[$i];
		if($index > $cols)
		{
			print STDERR "ERROR($0):
	Index <$index> at line <$line> in Matrix file <$infile> exceeds the 
	total number of columns <$cols> specified on line 1.\n";
			exit;
		}
		$value=$pairs[$i+1];
		if($value==0)
		{
			print STDERR "ERROR($0):
	Given Matrix <$infile> is not sparse. Caught 0 value at line $line.\n";
			exit;
		}
		# storing in column-wise order
		$sparse_matrix{$index}{$row}=$value;
		$nnz++;
	}
}

close IN;
if($row != $rows)
{
	print STDERR "ERROR($0):
	1st line in Matrix file <$infile> shows #vectors = #rows = $rows 
	while the actual #vectors found in this file = $row.\n";
	exit;
}

if($nnz != $nnz1)
{
	print STDERR "ERROR($0):
	1st line in Matrix file <$infile> shows #nnz = $nnz1 while the 
	actual #nnz found in the file = $nnz.\n";
	exit;
}

foreach $col (1..$cols)
{
	if(!defined $sparse_matrix{$col})
	{
		print STDERR "ERROR($0):
	Null column at $col.\n";
		exit;
	}
}

##############################################################################

#			==========================
#			      PROGRAM SECTION
#			==========================

=head1 Harwell Boeing Format

=head2 Header Section

=over 

=item * Line1 (Title[72], Id[8])

=item * Line2 Skipped (as SVDPack ignores this line)

=item * Line3 (Type[3], 11x, Nrows[14], Ncols[14], NNZ[14], Nrhs[14])

where Type[3] is a 3 Character Field in which

=over 

=item 1. char[1] = 

r for Real matrix, 

c for Complex matrix, 

p for Pattern matrix

=item 2. char[2] =

u for Unsymmetric matrix, 

h for Hermitian matrix (Aij=Aji* where Aji* is a complex conjugate of Aij), 

z for Skew Symmetrix matrix

r for Rectangular matrix

=item 3. char[3]=

a for Assembled matrix

f for Unassembled Finite Elements

=back

Nrows = Number of Rows

Ncols = Number of Columns

NNZ = Number of Non-Zero Elements

Nrhs = Number of Right-Hand Sides (not used in SenseClusters)

=item * Line4 

(Pointer_Format[16], Row_index_Format[16], Numeric_Value_Format[20],
RHS_Format[20])

Pointers and Row Indices could have MiN type of format which specifies
that there are M intergers on each line and each represented with N digits. 
(M x N must be = 80 as this format only supports column width of maximum 80
characters)

Numeric Values can have either MiN format with same interpretation of 
M and N as above or MfD.F format which specifies that there are M real 
numbers on each line, each occupying total D digit space of each last F digits 
show the fractional part. 

Note: D is that total space used to represent a number that includes the 
decimal point and +/- sign if any. 

=back 

The above 4 Lines form the Header of the HB sparse matrix. 

=head2 Data Section 

This section contains 3 blocks which contain the non-zero values in the matrix
along with their row and column index information. 

 *************************************************************************
             NON-ZERO ENTRIES ARE STORED IN COLUMN ORDER !!!
 *************************************************************************

We consider data section of 3 blocks:

=head3 BLOCK1 POINTERS

The first block is an array whose entries show the indices (in block3) of the
leading non-zero value of every column.

e.g. If a given matrix is 

 4 6
 2 3 0 0 0 1
 0 2 0 1 2 0
 0 0 2 4 1 0 
 1 1 0 0 5 0

Then the first block will contain the pointers 

[1 3 6 7 9 12 13]

This shows that 

The first column begins at the 1st non-zero entry (2)
The second column begins at the 3rd non-zero entry (3) [in COLUMN ORDER] 
The third column begins at the 6th non-zero entry (2)
The forth column begins at the 7th non-zero entry (1)
and so on ... 

 *************************************************************************
	NULL columns (having no non-zero elements) are not allowed. 
 *************************************************************************

Note: The column pointers start at 1.

The last entry in @pointers contains an extra pointer pointing to one location 
after the last entry. So the last index in @pointers will always be #nnz + 1 
(where #nnz = total no. of non-zero entries) 

=head3 BLOCK2 ROW_INDICES

This block stores the row indices of the non-zero matrix entries in column 
order. 

For above matrix, this block will look like 

[1 4 1 2 4 3 2 3 2 3 4 1]

which shows that 

The 1st non-zero entry is at 1st row.
The 2nd non-zero entry is at 4th row.
The 3rd non-zero entry is at 1st row
and so on ....

Note: Row indices start from 1.

=head3 BLOCK3 VALUES

This block contains the actual non-zero values from the matrix in 
column order.

Thus, the block3 for the above shown matrix will look like 

[2 1 3 2 1 2 1 4 2 1 5 1]


General Observations:

=over 4

=item 1. The length(block2)=length(block3) and each is equal to the number 
of non-zero entries in the matrix. 

=item 2. The length(block1) = #columns of matrix + 1 as each column will have
an entry in block1 that shows the position of the leading non-zero element 
in it and there are no NULL columns allowed.

+1 because there is an extra pointer pointing to the location after the last 
non-zero entry. 

=item 3. The column pointers in block1 are also the pointers to block3 entries
where the leading(first) non-zero entry of each column is located.

=back

=head2 Sample Output 

matrix.dat                                                 harbomat

#

 rra                  4             6            11             0

    (10i8)          (10i8)            (8f10.3)            (8f10.3)

       1       3       5       6       7      10      12
       2       3       2       3       4       1       1       2       
       3       3       4
     1.000     1.000     2.000     4.000     1.000     2.000     1.000     
     2.000     2.000     3.000     1.000

Shows the HB format for a 4 x 6 matrix :

 4 6
 0 0 0 2 1 0
 1 2 0 0 2 0
 1 4 0 0 2 3
 0 0 1 0 0 1

=cut

############################################################################

#			=============================
#			     CONVERSION SECTION
#			=============================

# creating temporary files to store sparse data blocks
$blk1="blk1". time(). ".mat2harbo";
$blk2="blk2". time(). ".mat2harbo";
$blk3="blk3". time(). ".mat2harbo";
# opening temp files in w
open(BLK1,">$blk1") || die "Error($0):
        Error(code=$!) in opening <$blk1> file.\n";
open(BLK2,">$blk2") || die "Error($0):
        Error(code=$!) in opening <$blk2> file.\n";
open(BLK3,">$blk3") || die "Error($0):
        Error(code=$!) in opening <$blk3> file.\n";

# reading the MATRIX in column order
# and making entries for each nnz in block files
$nnz=0;

@column_indices=keys %sparse_matrix;
@sorted_columns=sort {$a <=> $b} @column_indices;
foreach $col (@sorted_columns)
{
	@row_indices=keys %{$sparse_matrix{$col}};
	@sorted_rows=sort {$a <=> $b} @row_indices;
	$newcol=1;
	foreach $row (@sorted_rows)
	{
		$nnz++;
		if($newcol)
		{
			print BLK1 "$nnz\n";
			$newcol=0;
		}
		print BLK2 "$row\n";
		print BLK3 $sparse_matrix{$col}{$row} . "\n";
	}
}

print BLK1 ($nnz+1);

##############################################################################

#                       =========================
#			    PRINTING SECTION 
#			=========================

# printing Line1 (Title[72], Id[8])

# if the title is specified by the user
if(defined $opt_title)
{
	print pack("A72",$opt_title);
}
# use matrix file name as the default title
else
{
	print pack("A72",$infile);
}

# if the Id(also called as the KEY by SVDPack) 
# is specified by the user 
if(defined $opt_id)
{
	$id=$opt_id;
}
# use default id = "harbomat"
else
{
	$id="harbomat";
}
print pack("A8",$id);

# done with Line1
print "\n";

# Skipping Line2 by putting # as this line is 
# ignored by SVDPack. But should have at least 
# single character

print "#\n";

# printing Line3 (Type[3], 11x, Nrows[14], Ncols[14], NNZ[14], Nrhs[14])

# setting type characters 
$c1="r"; # we deal with Real matrices only

$c2="r"; # always assume rectangular 

$c3="a"; # always assembled matrices
$type=$c1.$c2.$c3;

printf("%3s%11s%14s%14s%14s%14s",$type," ",$rows,$cols,$nnz,0);
print "\n";

# done with Line3

# printing formats

# if column pointer format is specified
if(defined $opt_cpform)
{
	$ptrform=$opt_cpform;
	# check : valid format
	if($ptrform =~ /^(\d+)i(\d+)$/)
	{
		$ptrs_online=$1;
		$digits_per_ptr=$2;
		$ptr_string="%".$digits_per_ptr."d";

		$upper_cpform="";
                while(length($upper_cpform)<($digits_per_ptr-1))
                {
                        $upper_cpform.="9";
                }
	}
	else
	{
		print STDERR "ERROR($0):
	Invalid column pointer format $ptrform.\n";
		exit;
	}
	$ptrform="(".$opt_cpform.")";
}
# use default (10i8)
else
{
	$ptrform="(10i8)";
	$ptr_string="%8d";
	$ptrs_online=10;
	$upper_cpform="9999999";
}

# if row pointer format is specified
if(defined $opt_rpform)
{
        $rowform=$opt_rpform;
        # check : valid format
        if($rowform =~ /^(\d+)i(\d+)$/)
        {
                $rowinds_online=$1;
                $digits_per_rowind=$2;
                $row_string="%".$digits_per_rowind."d";

		$upper_rpform="";
                while(length($upper_rpform)<($digits_per_rowind-1))
                {
                        $upper_rpform.="9";
                }
        }
        else
        {
                print STDERR "ERROR($0):
        Invalid row pointer format $rowform.\n";
                exit;
        }
	$rowform="(".$opt_rpform.")";
}
# use default (10i8)
else
{
        $rowform="(10i8)";
	$row_string="%8d";
	$rowinds_online=10;
	$upper_rpform="9999999";
}

# if numeric value format is specified
if(defined $opt_numform)
{
        $numform=$opt_numform;
	# check : valid format
	# integer ?
        if($numform =~ /^(\d+)i(\d+)$/)
        {
                $nums_online=$1;
                $digits_per_num=$2;
                $num_string="%".$digits_per_num."d";

		$lower_numform="-";
                while(length($lower_numform)<($digits_per_num-1))
                {
                        $lower_numform.="9";
                }
                if($lower_numform eq "-")
                {
                        $lower_numform="0";
                }
                $upper_numform="";
                while(length($upper_numform)<($digits_per_num-1))
                {
                        $upper_numform.="9";
                }
        }
	# real ?
	elsif($numform=~/^(\d+)f(\d+)\.(\d+)$/)
	{
		$nums_online=$1;
                $digits_per_num=$2;
		$fract=$3;
                $num_string="%".$digits_per_num. "." .$fract."f";

		$lower_numform="-";
                while(length($lower_numform)<($digits_per_num-$fract-2))
                {
                        $lower_numform.="9";
                }
                $lower_numform.=".";
                while(length($lower_numform)<($digits_per_num-1))
                {
                        $lower_numform.="9";
                }

                $upper_numform="";
                while(length($upper_numform)<($digits_per_num-$fract-2))
                {
                        $upper_numform.="9";
                }
                $upper_numform.=".";
                while(length($upper_numform)<($digits_per_num-1))
                {
                        $upper_numform.="9";
                }
	}
	# invalid
        else
        {
                print STDERR "ERROR($0):
        Invalid number format $numform.\n";
                exit;
        }
	$numform="(".$opt_numform.")";
}
# use default (5f16.10)
else
{
        $numform="(5f16.10)";
	$num_string="%16.10f";
	$nums_online=5;

	$lower_numform="-999.9999999999";
        $upper_numform="9999.9999999999";
}

# we don't use rhs

# printing formats
printf("%16s%16s%20s%20s",$ptrform,$rowform,$numform,$numform);

# done with Line 4
print "\n";

# done with Header section !

# now data blocks ... 

# print from temp files

# block1 : column pointers

open(BLK1,$blk1) || die "Error($0):
        Error(code=$!) in opening <$blk1> file.\n";

# each line of BLK1 contains a single column pointer
$ptr=0;
while(<BLK1>)
{
	$ptr++;
	$value=$_;

	# check if valid pointer index
	if($value !~ /^\d+$/)
	{
		print STDERR "ERROR($0):
	Wrong pointer value<$value> at line $ptr in file<$blk1>.\n";
		exit;
	}
	if($ptr!=1 && ($ptr-1) % $ptrs_online == 0)
        {
                print "\n";
        }
	$formatted_cp=sprintf($ptr_string,$value);

	if($formatted_cp>$upper_cpform)
	{
		print STDERR "ERROR($0):
        Floating point overflow.
        Column pointer <$formatted_cp> can't be represented with format $ptr_string.\n";
                exit 1;
	}
	print "$formatted_cp";
}

print "\n";

# block2 : row indices 

open(BLK2,$blk2) || die "Error($0):
        Error(code=$!) in opening <$blk2> file.\n";

# each row of BLK2 contains a single row pointer 

$ptr=0;
while(<BLK2>)
{
	$ptr++;
	$value=$_;

	# check if valid pointer index 
	if($value !~ /^\d+$/)
        {
               print STDERR "ERROR($0):
        Wrong row pointer value <$value> at line $ptr in file<$blk2>.\n";
               exit;
        }

	if($ptr!=1 && ($ptr-1) % $rowinds_online == 0)
        {
                print "\n";
        }
	$formatted_rp=sprintf($row_string,$value);
	if($formatted_rp>$upper_rpform)
        {
                print STDERR "ERROR($0):
        Floating point overflow.
        Row pointer <$formatted_rp> can't be represented with format $row_string.\n";
                exit 1;
        }
	print "$formatted_rp";
}

print "\n";

# block3 : non-zero matrix values

open(BLK3,$blk3) || die "Error($0):
        Error(code=$!) in opening <$blk3> file.\n";

# BLK3 stores a single non-zero value per line

$ptr=0;
while(<BLK3>)
{
	$ptr++;
	s/^\s+//;
	s/\s+$//;
	$value=$_;

	# check if valid number
	if($value !~ /^-?\d+\.?\d*$/)
        {
                print STDERR "ERROR($0):
        Wrong non-zero value<$value> at line $ptr in file<$blk3>.\n";
                exit;
        }

	if($ptr!=1 && ($ptr-1) % $nums_online == 0)
        {
                print "\n";
        }
	$formatted_value=sprintf($num_string,$value);

	if($formatted_value<$lower_numform)
        {
                print STDERR "ERROR($0):
        Floating point underflow.
        Value <$formatted_value> can't be represented with format $num_string.\n";
                exit 1;
        }
	if($formatted_value>$upper_numform)
	{
                print STDERR "ERROR($0):
        Floating point overflow.
        Value <$formatted_value> can't be represented with format $num_string.\n";
                exit 1;
	}
	print "$formatted_value";
}

print "\n";

# printing complete !

unlink "$blk1";
unlink "$blk2";
unlink "$blk3";

##############################################################################

#			==============================
#			 Creating Parameter file lap2
#			==============================

# Now, create the paramters

if(defined $opt_param)
{
	if(-e "lap2")
	{
        	print STDERR "Warning($0):
        Parameter file <lap2> already exists, overwrite (y/n)? ";
        	$ans=<STDIN>;
	        if($ans !~ /(y|Y)(es)?/)
        	{
                	undef $opt_param;
        	}
	}

	if(defined $opt_param)	{
		open(PAR,">lap2") || die "Error($0):
				          Error(code=$!) in opening parameter file <lap2>.\n";	

# the need for these warnings is somewhat less now, given that
# we are providing a las2.h file set with fairly large values

#		# the minimum required amt of memory, based on the data we have...
#		$lmtnw_minimum = (6*$cols)+(4*$iter)+1+($iter**2);

#		print STDERR "NOTE($0):
#    		The size of your default work area (LMTNW) in las2.h should be 
#     		greater than $lmtnw_minimum (cols=$cols, iter=$iter).\n";
		
		# the actual value in las2.h for LMTNW is:
		### 900,300,001 

##		if ($lmtnw_minimum >= 900,300,001) { 
##			print STDERR "Warning($0):
##	        	The value of LMTNW in las2.h/SVDPACKC is too small! 
##	        	Please change LMTNW to a value greater than $lmtnw_minimum.\n"; 
##	    	}

			# left and right threshold for e-values
			$endl="-1.0e-30";
			$endr="1.0e-30";
			# allowed tolerance
			$kappa="1.0e-6";
			print PAR "'$id' $iter $maxprs $endl $endr TRUE $kappa 0\n";
	    }
}

##############################################################################

#                      ==========================
#                          SUBROUTINE SECTION
#                      ==========================

#-----------------------------------------------------------------------------
#show minimal usage message
sub showminimal()
{
        print "Usage: mat2harbo.pl [OPTIONS] MATRIX";
        print "\nTYPE mat2harbo.pl --help for help\n";
}

#-----------------------------------------------------------------------------
#show help
sub showhelp()
{
	print "Usage:  mat2harbo.pl [OPTIONS] MATRIX

Converts a given matrix in SenseClusters' sparse format to Harwell-Boeing 
sparse matrix format.

MATRIX
	Matrix in SenseClusters' sparse matrix format that is to be converted
	into Harwell-Boeing sparse matrix format.

OPTIONS:

--title TITLE
	Specify the TITLE to be used for the MATRIX at Line1 of Harwell-Boeing 
	format. Default TITLE uses the MATRIX file name.

--id ID
	Specify the ID to be used to identify the MATRIX at Line1 of Harwell-
	Boeing format. Default ID is 'harbomat'.

--cpform CPFORM
	Specify the Column Pointer FORMat. Default is 10i8. 

--rpform RPFORM
	Specify the Row Pointer FORMat. Default is 10i8.

--numform NUMFORM
	Specify the NUMber FORMat to represent the non-zero MATRIX values.
	Default is 5f16.10.

Parameter Options :

These options help to automatically create the parameter file lap2 used by
the las2.c program in SVDPack.

--param 
	Creates the parameter file lap2 that can be directly used while 
	running las2.

--k K 
	Sets the value of maxprs parameter in lap2 file to K. 
	Default K=300

--rf RF
	Reduces the column space of a given MATRIX by scaling factor RF 
	such that maxprs = #columns(MATRIX) / RF , where RF >=1
	Default RF = 10

	Both --k and --rf allow user to control the maxprs value in lap2 file 
	which specifies the number of singular triplets to be returned by las2.

	If both --k and --rf are specified, maxprs = min(K,#columns(MATRIX)/RF)

--iter ITER
	Specifies number of iterations (or lanmax) in LAP2.
	Default ITER = min((3 * maxprs), #columns(MATRIX))

	ITER should be < = #columns(MATRIX)
	and
	ITER should be > = (maxprs = min(K,#columns(MATRIX)/RF))

--help
        Displays this message.

--version
        Displays the version information.

Type 'perldoc mat2harbo.pl' to view detailed documentation of mat2harbo.\n";

}

#------------------------------------------------------------------------------
#version information
sub showversion()
{
	print '$Id: mat2harbo.pl,v 1.35 2008/03/31 21:31:13 tpederse Exp $';
##      print "mat2harbo.pl      -       Version 0.4";
	print "\nConvert a matrix in SenseClusters sparse format to Harwell-Boeing sparse format\n";
##        print "\nCopyright (c) 2002-2006, Amruta Purandare & Ted Pedersen\n";
##      print "Date of Last Update:     11/06/2004\n";
}

#############################################################################

=head1 AUTHORS

Amruta Purandare, University of Pittsburgh

Ted Pedersen, University of Minnesota, Duluth
tpederse at d.umn.edu

=head1 COPYRIGHT

Copyright (c) 2003-2008, Amruta Purandare and Ted Pedersen

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with
this program; if not, write to

 The Free Software Foundation, Inc.,
 59 Temple Place - Suite 330,
 Boston, MA  02111-1307, USA.

=cut