NAME

Runops::Optimized design

DESCRIPTION

Runops::Optimized unrolls the optree of a Perl subroutine in execution order, so that the CPU has a better chance of branch prediction and improved cache usage.

It takes a minimal approach: if it sees an op that will have unpredictable results, it simply returns to a variant of the normal perl runloop.
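The idea can be illustrated with a toy model in C. This is only a sketch under stated assumptions: every name here (toy_op, toy_runloop, toy_unroll and so on) is hypothetical, nothing is taken from the module's source, and a flat array of ops stands in for the machine code that sljit would emit. It shows ops being recorded in execution order up to the first op whose successor cannot be known in advance, after which control goes back to an ordinary runloop.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a perl op: a handler that returns the next op to run. */
typedef struct toy_op toy_op;
struct toy_op {
    toy_op *(*ppaddr)(toy_op *self); /* handler, in the spirit of a pp_* function */
    toy_op *next;                    /* static successor in execution order */
    int     predictable;             /* 0 if the handler may continue somewhere else */
    int     value;                   /* payload used by the toy handlers */
};

static int toy_acc; /* the toy interpreter's only state: an accumulator */

/* Predictable op: adds its payload and always continues with ->next. */
static toy_op *toy_pp_add(toy_op *self) {
    assert(self != NULL);
    toy_acc += self->value;
    return self->next;
}

/* Unpredictable op: skips one op when the accumulator is odd. */
static toy_op *toy_pp_skip_if_odd(toy_op *self) {
    assert(self != NULL);
    if ((toy_acc & 1) && self->next != NULL)
        return self->next->next;
    return self->next;
}

/* The classic runloop: one indirect call and one loop branch per op. */
static void toy_runloop(toy_op *op) {
    while (op != NULL)
        op = op->ppaddr(op);
}

#define TOY_MAX_UNROLL 16

typedef struct {
    toy_op *ops[TOY_MAX_UNROLL]; /* ops in execution order */
    size_t  count;
} toy_unrolled;

/* Record ops in execution order, stopping after the first unpredictable one. */
static void toy_unroll(toy_op *start, toy_unrolled *out) {
    out->count = 0;
    for (toy_op *op = start; op != NULL && out->count < TOY_MAX_UNROLL; op = op->next) {
        out->ops[out->count++] = op;
        if (!op->predictable)
            break;
    }
}

/* Run the recorded ops back to back, then hand the rest to the runloop. */
static void toy_run_unrolled(const toy_unrolled *u) {
    toy_op *next = NULL;
    for (size_t i = 0; i < u->count; i++)
        next = u->ops[i]->ppaddr(u->ops[i]);
    toy_runloop(next); /* fall back to the interpreter for whatever follows */
}
```

Both paths must leave the interpreter in the same state; the unrolled path only changes how the predictable prefix is dispatched.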

Eventually some small, hot ops such as pp_nextstate and pp_const may be inlined.

Some people may call this a JIT, but I'm of the opinion that until it has a closer understanding of what the underlying ops are doing, it is just unrolling.

COMPONENTS

sljit

Sljit is used to generate the underlying machine code. It supports the most common CPUs, which means the code isn't tied to a particular machine. It is considerably simpler than LLVM and small enough to be shipped with this module.

Sljit is stackless: it doesn't make use of the normal C-level stack (not in the normal way, anyway). This is what makes it possible to safely return to the interpreter at any point, which in turn makes dealing with edge cases easy.

Inserting code

This is one slightly evil area. Each CV is unrolled the second time it is executed. The reason for waiting until the second time is that unrolling certain setup subroutines would be of limited value.

Whether a CV has already been executed is recorded in the bits known as op_spare, and the result of unrolling is patched straight into op_ppaddr. Obviously this isn't ideal, and eventually this may be stored in a structure separate from the optree (potentially with a lock for threaded support).
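The "count in spare bits, then patch the function pointer" scheme can be sketched in C as follows. This is a toy model, not the module's code: toy_cv, its spare bitfield and the two entry functions are hypothetical names, and the "unrolled" version is just another C function rather than generated machine code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the op that enters a subroutine. */
typedef struct toy_cv toy_cv;
struct toy_cv {
    int (*entry)(toy_cv *self, int arg); /* function pointer that gets patched in place */
    unsigned spare : 3;                  /* a few spare bits, used here as a run counter */
    int slow_calls;                      /* bookkeeping for the example only */
    int unrolled_calls;                  /* bookkeeping for the example only */
};

/* The work the subroutine does, identical on both paths. */
static int toy_body(int arg) { return arg * 2; }

/* Stands in for the generated, unrolled code. */
static int toy_unrolled_entry(toy_cv *self, int arg) {
    assert(self != NULL);
    self->unrolled_calls++;
    return toy_body(arg);
}

/* Initial entry point: run normally once, unroll on the second execution. */
static int toy_counting_entry(toy_cv *self, int arg) {
    assert(self != NULL);
    if (self->spare == 0) {
        self->spare = 1;          /* remember that we have been here once */
        self->slow_calls++;
        return toy_body(arg);
    }
    /* Second execution: patch the pointer so later calls skip this check. */
    self->entry = toy_unrolled_entry;
    return self->entry(self, arg);
}
```

After the patch, callers that go through the entry pointer never see the counting code again, which is also why a shared optree would need a lock around this step in a threaded build.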

ISSUES / TODO

This is really only a proof of concept, so there are many issues.

Test other CPUs

I've only tested this on x86_64 on OS X. This should work on anything sljit supports but needs testing.

Better code for following execution order

The code for following execution order is lame (see comment in unroll.c). It can even get stuck in a loop on some branches.

Unroll flow-control ops

last, next, etc. currently result in a return to the interpreter. These should be supported, but are quite complex. (next should be fairly easy though.)

No-multiplicity support

This only works for a non-multiplicity, non-threaded build of perl. Neither multiplicity nor threads would be impossible to support, but both are more work.

More tests, etc

This has only received limited testing, so it probably misses even important core perl ops.

It is probably worth having author tests, e.g. export PERL5OPT=-mRunops::Optimized and then run some large modules' test suites.

Custom ops

Custom ops and other code that does unexpected things may present issues. Some of this is mitigated by doing the unrolling at run time, so any compile-time modifications to the optree will be picked up.

Inlining hot ops

For more speed it would be interesting to inline small hot ops such as pp_nextstate and pp_const.

Investigate memory/CPU tradeoff

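How much overhead does unrolling everything have for large programs? As one rough data point, loading Moose takes about 0.11 seconds (roughly 14%) longer in wall-clock time with the module enabled: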

  $ PERL5LIB= /usr/bin/time bleadperl -MRunops::Optimized -MMoose -e1 
        0.87 real         0.81 user         0.03 sys
  $ PERL5LIB= /usr/bin/time bleadperl -MMoose -e1                    
        0.76 real         0.72 user         0.02 sys

DEBUGGING

This will break. You'll need to debug it.

First of all, compile with debugging support:

  perl Makefile.PL DEBUG=1

This does two things. It enables an environment variable that prints out the inner workings when it is set:

  export RUNOPS_OPTIMIZED_DEBUG=

Additionally, it generates trap instructions (int3 on x86) that fire when PL_op isn't in the expected place.
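The kind of consistency check described here might look like the following in C. This is a hedged sketch only: toy_check_op, toy_trap and the TOY_DEBUG switch are hypothetical names, and the trap is replaced by a counter so the example can run to completion, whereas a real debug build would emit a breakpoint instruction at that point.

```c
#include <assert.h>
#include <stddef.h>

#ifndef TOY_DEBUG
#define TOY_DEBUG 1 /* assumption: checks are compiled in only for debug builds */
#endif

static int toy_trap_count; /* counts how often the "trap" would have fired */

/* Stand-in for a trap instruction; a debugger would stop here in a real build. */
static void toy_trap(void) { toy_trap_count++; }

/* Verify that the interpreter's current op is the one the unrolled code expects. */
static void toy_check_op(const void *current_op, const void *expected_op) {
#if TOY_DEBUG
    if (current_op != expected_op)
        toy_trap();
#else
    (void)current_op;
    (void)expected_op;
#endif
}
```

Placing such a check before each unrolled step means a mismatch stops execution at the first point where the generated code and the optree disagree, rather than somewhere later.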
