
Net::Amazon::HadoopEC2::Cluster - Representation of Hadoop-EC2 cluster

my $hadoop = Net::Amazon::HadoopEC2->new(
{
aws_account_id => 'my account',
aws_access_key_id => 'my key',
aws_secret_access_key => 'my secret',
}
);
my $cluster = $hadoop->launch_cluster(
{
naem => 'hadoop-ec2-cluster',
image_id => 'ami-b0fe1ad9' # hadoop-ec2 official image
slaves => 2,
key_name => 'gsg-keypair',
key_file => "$ENV{HOME}/.ssh/id_rsa-gsg-keypair',
}
);
$cluster->push_file(
{
files => ['map.pl', 'reduce.pl'],
destination => '/root/',
}
);
my $option = join(' ', qw(
-mapper map.pl
-reducer reduce.pl
-file map.pl
-file reduce.pl
)
);
my $result = $cluster->execute(
{
command => "$hadoop jar $streaming $option",
}
);

A class Representing Hadoop-EC2 cluster

Constructor. Normally Net::Amazon::HadoopEC2 calls this so you won't need to think about this.
Launches hadoop-ec2 cluster. Returns Net::Amazon::HadoopEC2::Cluster instance itself when succeeded.
The image id (ami) of the cluster.
The key name to use when launching cluster. the default is 'gsg-keypair'.
Location of the private key file associated with key_name.
The number of slaves. The default is 2.
Finds hadoop-ec2 cluster. Returns Net::Hadoop::EC2::Cluster instance itself if found.
Launches hadoop-ec2 slave instance for this cluster. Returns Net::Hadoop::EC2::Cluster instance itself if succeeded. Arguments are:
The number of slaves to launch. default is 1.
Terminates all EC2 instances of this cluster. Returns Net::Amazon::EC2::TerminateInstancesResponse instance.
Terminates hadoop-ec2 slave instances of this cluster. Returns Net::Amazon::EC2::TerminateInstancesResponse instance. Arguments are:
The number of slave instances to terminate. the default is the number of exisiting instances.
Runs command on the master instance via ssh. Returns Net::Amazon::HadoopEC2::SSH::Response instance. This method is implemented in Net::Amazon::HadoopEC2::SSH and it's only wrapper of Net::SSH::Perl. Arguments are:
The command line to pass.
String to pass to STDIN of the command.
Pushes local files to hadoop-ec2 master instance via ssh. This method is also implemented in Net::Amazon::HadoopEC2::SSH. Returns true if succeeded. Arguments are:
files to push. Accepts string or arrayref of strings.
Destination of the files.
Gets files on the hadoop-ec2 master instance. This method is implemented in Net::Amazon::HadoopEC2::SSH. Returns true if succeeded. Arguments are:
files to get. String and arrayref of strings is ok.
local path to place the files.

Name of the cluster.
The key name to use when launching cluster. the default is 'gsg-keypair'.
Boolean whether EC2 api request retry or not.
MAX_MAP_TASKS to pass to the instances when boot.
MAX_REDUCE_TASKS to pass to the instances when boot.
COMPRESS to pass to the instances when boot.
additional user data to pass to the instances when boot.
Net::Amazon::EC2::RunningInstances instance of master instance.
Arrayref of Net::Amazon::EC2::RunningInstances instance of master instance.

Nobuo Danjou nobuo.danjou@gmail.com
