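SimpleR::Reshape: data processing and reshaping
The interface is modeled on R's read.table, write.table, and merge,
as well as the reshape2 package: http://cran.r-project.org/package=reshape2
Data is read row by row from a file or an arrayref, transformed, and written out to a new file or arrayref.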
    my $r = read_table(
        'reshape_src.csv',
        skip_head      => 1,
        conv_sub       => sub { [ "$_[0][0] $_[0][1]", $_[0][2], $_[0][3] ] },
        write_filename => '01.read_table.csv',
        #skip_sub => sub { $_[0][3]<200 },
        #return_arrayref => 1,
        #sep=>',',
        #charset=>'utf8',
    );
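The row-by-row flow of the read_table call can be sketched in plain Perl. This is only an illustration, not the module's implementation: `transform_lines` is a hypothetical helper, the sample lines are made up, and it assumes that a true return from skip_sub drops the row and that skip_sub runs before conv_sub.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SimpleR::Reshape): split each line,
# optionally skip rows, optionally convert rows, and collect the result.
sub transform_lines {
    my ($lines, %opt) = @_;
    my $sep   = defined $opt{sep} ? $opt{sep} : ',';
    my @lines = @$lines;
    shift @lines if $opt{skip_head};    # drop the header line

    my @rows;
    for my $line (@lines) {
        my $row = [ split /\Q$sep\E/, $line, -1 ];
        # Assumption: a true value from skip_sub means "skip this row".
        next if $opt{skip_sub} && $opt{skip_sub}->($row);
        $row = $opt{conv_sub}->($row) if $opt{conv_sub};
        push @rows, $row;
    }
    return \@rows;
}

# Made-up sample data in the shape the read_table example expects.
my @lines = (
    'day,hour,state,cnt',
    '2013-01-01,08,ok,120',
    '2013-01-01,09,ok,340',
);

my $rows = transform_lines(
    \@lines,
    skip_head => 1,
    skip_sub  => sub { $_[0][3] < 200 },
    conv_sub  => sub { [ "$_[0][0] $_[0][1]", $_[0][2], $_[0][3] ] },
);

print join(',', @$_), "\n" for @$rows;
```

Here conv_sub joins the first two columns into one, so each surviving row has three fields.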
Write the given data to a text file:
    my $d = [ [qw/a b 1/], [qw/c d 2/] ];
    write_table(
        $d,
        file   => 'write_table.csv',
        header => [ 'ka', 'kb', 'cnt' ],
        #sep => ',',
        #charset => 'utf8',
    );
Melt data into long format, following R's reshape2 package:
    my $r = melt(
        'reshape_src.csv',
        skip_head => 1,
        names     => [ qw/day hour state cnt rank/ ],
        #skip_sub => sub { $_[0][3]<1000 },
        id            => [ 0, 1, 2 ],
        measure       => [ 3, 4 ],
        melt_filename => '02.melt.csv',
    );
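To make the effect of id and measure concrete, here is a minimal in-memory sketch of a melt step. It is not the module's code: `melt_rows` is a hypothetical helper and the input rows are made-up sample data. It assumes each output row holds the id columns, then the measure column's name, then its value, which matches the `day hour state key value` layout used by the cast example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SimpleR::Reshape): turn one wide row
# into one long row per measure column.
sub melt_rows {
    my ($rows, %opt) = @_;
    my @out;
    for my $row (@$rows) {
        my @id = @{$row}[ @{ $opt{id} } ];    # slice out the id columns
        for my $m (@{ $opt{measure} }) {
            push @out, [ @id, $opt{names}[$m], $row->[$m] ];
        }
    }
    return \@out;
}

# Made-up sample data: day, hour, state, cnt, rank.
my $wide = [
    [ '2013-01-01', '08', 'ok', 120, 3 ],
    [ '2013-01-01', '09', 'ok', 340, 1 ],
];

my $long = melt_rows(
    $wide,
    names   => [qw/day hour state cnt rank/],
    id      => [ 0, 1, 2 ],
    measure => [ 3, 4 ],
);

print join(',', @$_), "\n" for @$long;
```

Two wide rows with two measure columns become four long rows, one per (row, measure) pair.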
Cast (regroup) data back into wide format, following R's reshape2 package:
    my $r = cast(
        '02.melt.csv',
        cast_filename => '03.cast.csv',
        # key takes one of two values: cnt or rank
        names   => [ qw/day hour state key value/ ],
        id      => [ 0, 1, 2 ],
        measure => 3,
        value   => 4,
        stat_sub => sub {
            my ($vlist) = @_;
            my @temp = sort { $b <=> $a } @$vlist;
            return $temp[0];
        },
        result_names => [ qw/day hour state cnt rank/ ],
        #reduce_sub => sub {
        #    my ($r) = @_;
        #    my $s=0 ; $s+= $_ for @$r;
        #    return [ $s ];
        #},
    );
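Notes:
reduce_sub processes values while the data is being read; by default each value is simply pushed onto the value list.
stat_sub runs after all data has been read and computes the final statistic over the value list.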
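The difference between the two hooks can be sketched with a small grouping loop. This is an illustration under stated assumptions, not the module's implementation: `group_values` is a hypothetical helper, the rows are made up, and it assumes reduce_sub receives the value list right after each push and returns the list that replaces it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SimpleR::Reshape): group [key, value]
# rows, applying reduce_sub during the read and stat_sub at the end.
sub group_values {
    my ($rows, %opt) = @_;
    my %values;
    for my $row (@$rows) {
        my ($key, $v) = @$row;
        push @{ $values{$key} }, $v;    # default: just collect the value
        # Assumption: reduce_sub sees the list after each push and
        # returns its replacement.
        $values{$key} = $opt{reduce_sub}->( $values{$key} ) if $opt{reduce_sub};
    }
    my %result;
    for my $key (keys %values) {
        $result{$key} = $opt{stat_sub}
            ? $opt{stat_sub}->( $values{$key} )
            : $values{$key};
    }
    return \%result;
}

# Made-up sample rows.
my $rows = [ [ a => 5 ], [ a => 9 ], [ b => 2 ], [ a => 1 ] ];

# stat_sub only: every value is kept, the maximum is picked at the end.
my $max = group_values(
    $rows,
    stat_sub => sub { my ($vlist) = @_; my @t = sort { $b <=> $a } @$vlist; return $t[0] },
);

# reduce_sub: a running sum, so each list never holds more than one element.
my $sum = group_values(
    $rows,
    reduce_sub => sub { my ($r) = @_; my $s = 0; $s += $_ for @$r; return [$s] },
    stat_sub   => sub { $_[0][0] },
);

printf "max a=%d b=%d\n", $max->{a}, $max->{b};
printf "sum a=%d b=%d\n", $sum->{a}, $sum->{b};
```

A reduce_sub like the running sum keeps memory use flat on large inputs, while a stat_sub that needs the full list (a maximum by sorting, a median) has to wait until reading is complete.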
Merge two data frames, which in Perl are two-level arrays (arrays of arrayrefs):
    my $r = merge(
        [ [qw/a b 1/], [qw/c d 2/] ],
        [ [qw/a b 3/], [qw/c d 4/] ],
        by    => [ 0, 1 ],
        value => [2],
    );
    # $r = [["a", "b", 1, 3], ["c", "d", 2, 4]]
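A key-based merge of this kind can be sketched with a hash keyed on the `by` columns. This is a simplified illustration, not the module's implementation: `merge_rows` is a hypothetical helper, and filling undef for a key that is missing from one table is this sketch's own choice, not documented behavior of merge.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SimpleR::Reshape): merge two
# arrays of arrayrefs on the columns listed in "by".
sub merge_rows {
    my ($x, $y, %opt) = @_;
    my @by     = @{ $opt{by} };
    my @value  = @{ $opt{value} };
    my @tables = ($x, $y);

    my (%seen, @order);
    for my $t (0 .. $#tables) {
        for my $row (@{ $tables[$t] }) {
            my @key = @{$row}[@by];
            my $k   = join "\0", @key;       # composite hash key
            push @order, $k unless $seen{$k};
            $seen{$k} ||= { key => \@key };
            $seen{$k}{vals}[$t] = [ @{$row}[@value] ];
        }
    }

    my @out;
    for my $k (@order) {                     # keep first-seen key order
        my @row = @{ $seen{$k}{key} };
        for my $t (0 .. $#tables) {
            my $vals = $seen{$k}{vals}[$t];
            # Sketch's choice: pad with undef when a table lacks this key.
            push @row, $vals ? @$vals : (undef) x scalar(@value);
        }
        push @out, \@row;
    }
    return \@out;
}

my $r = merge_rows(
    [ [qw/a b 1/], [qw/c d 2/] ],
    [ [qw/a b 3/], [qw/c d 4/] ],
    by    => [ 0, 1 ],
    value => [2],
);

print join(',', @$_), "\n" for @$r;
```

For the two tables above, every key appears in both inputs, so the sketch yields the same rows as the documented result.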
To install SimpleR::Reshape, copy and paste the appropriate command into your terminal.

cpanm

    cpanm SimpleR::Reshape

CPAN shell

    perl -MCPAN -e shell
    install SimpleR::Reshape