The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Algorithm::LibLinear::DataSet

SYNOPSIS

  use Algorithm::LibLinear::DataSet;
  
  my $data_set = Algorithm::LibLinear::DataSet->new(data_set => [
    +{ feature => +{ 1 => 0.708333, 2 => 1, 3 => 1, ... }, label => 1, },
    +{ feature => +{ 1 => 0.583333, 2 => -1, 3 => 0.333333, ... }, label => -1, },
    +{ feature => +{ 1 => 0.166667, 2 => 1, 3 => -0.333333, ... }, label => 1, },
    ...
  ]);
  my $data_set = Algorithm::LibLinear::DataSet->load(fh => \*DATA);
  my $data_set = Algorithm::LibLinear::DataSet->load(filename => 'liblinear_file');
  my $data_set = Algorithm::LibLinear::DataSet->load(string => "+1 1:0.70833 ...");
  
  say $data_set->size;
  say $data_set->as_string;  # '+1 1:0.70833 2:1 3:1 ...'
  
  __DATA__
  +1 1:0.708333 2:1 3:1 4:-0.320755 5:-0.105023 6:-1 7:1 8:-0.419847 9:-1 10:-0.225806 12:1 13:-1 
  -1 1:0.583333 2:-1 3:0.333333 4:-0.603774 5:1 6:-1 7:1 8:0.358779 9:-1 10:-0.483871 12:-1 13:1 
  +1 1:0.166667 2:1 3:-0.333333 4:-0.433962 5:-0.383562 6:-1 7:-1 8:0.0687023 9:-1 10:-0.903226 11:-1 12:-1 13:1 
  ...

DESCRIPTION

This class represents set of labeled feature vectors used for learning.

METHODS

new(data_set => \@data_set)

Constructor.

data_set is an ArrayRef of HashRef that has 2 keys: feature and label. The value of feature is a HashRef which represents a (sparse) feature vector. Its key is an index and corresponding value is a real number. The indices must be >= 1. The value of label is an integer that is class label the feature belonging.

load(fh => \*FH | filename => $path | string => $string)

Class method. Loads data set from LIBSVM/LIBLINEAR format file.

as_string

Dumps the data set as a LIBSVM/LIBLINEAR format data.

size

The number of data.