OptreeCheck - check optrees as rendered by B::Concise
OptreeCheck supports 'golden-sample' regression testing of perl's parser, optimizer, bytecode generator, via a single function: checkOptree(%in).
It invokes B::Concise upon the sample code, checks that the rendering 'agrees' with the golden sample, and reports mismatches.
Additionally, the module processes @ARGV (which is typically unused in the Core test harness), and thus provides a means to run the tests in various modes.
    # your test file
    use OptreeCheck;
    plan tests => 1;

    checkOptree (
        name   => 'test-name',    # optional, made from others if not given

        # code-under-test: must provide 1 of them
        code   => sub {my $a},    # coderef, or source (wrapped and evald)
        prog   => 'sort @a',      # run in subprocess, aka -MO=Concise
        bcopts => '-exec',        # $opt or \@opts, passed to BC::compile

        errs   => 'Name "main::a" used only once: possible typo at -e line 1.',
                                  # str, regex, [str+] [regex+],

        # various test options
        # errs => '.*',           # match against any emitted errs, -w warnings
        # skip => 1,              # skips test
        # todo => 'excuse',       # anticipated failures
        # fail => 1               # force fail (by redirecting result)

        # the 'golden-sample's, (must provide both)

        expect => <<'EOT_EOT', expect_nt => <<'EONT_EONT' );  # start HERE-DOCS
    # 1 <;> nextstate(main 45 optree.t:23) v
    # 2 <0> padsv[$a:45,46] M/LVINTRO
    # 3 <1> leavesub[1 ref] K/REFC,1
    EOT_EOT
    # 1 <;> nextstate(main 45 optree.t:23) v
    # 2 <0> padsv[$a:45,46] M/LVINTRO
    # 3 <1> leavesub[1 ref] K/REFC,1
    EONT_EONT

    __END__
Here's a sample failure, as induced by the following command. Note the option=value argument after the test file; more on that later.

    $> PERL_CORE=1 ./perl ext/B/t/optree_check.t testmode=cross
    ...
    ok 19 - canonical example w -basic
    not ok 20 - -exec code: $a=$b+42
    # Failed at test.pl line 249
    #      got '1 <;> nextstate(main 600 optree_check.t:208) v
    # 2 <#> gvsv[*b] s
    # 3 <$> const[IV 42] s
    # 4 <2> add[t3] sK/2
    # 5 <#> gvsv[*a] s
    # 6 <2> sassign sKS/2
    # 7 <1> leavesub[1 ref] K/REFC,1
    # '
    # expected /(?ms-xi:^1 <;> (?:next|db)state(.*?) v
    # 2 <\$> gvsv\(\*b\) s
    # 3 <\$> const\(IV 42\) s
    # 4 <2> add\[t\d+\] sK/2
    # 5 <\$> gvsv\(\*a\) s
    # 6 <2> sassign sKS/2
    # 7 <1> leavesub\[\d+ refs?\] K/REFC,1
    # $)/
    # got:  '2 <#> gvsv[*b] s'
    # want: (?^:2 <\$> gvsv\(\*b\) s)
    # got:  '3 <$> const[IV 42] s'
    # want: (?^:3 <\$> const\(IV 42\) s)
    # got:  '5 <#> gvsv[*a] s'
    # want: (?^:5 <\$> gvsv\(\*a\) s)
    # remainder:
    # 2 <#> gvsv[*b] s
    # 3 <$> const[IV 42] s
    # 5 <#> gvsv[*a] s
    # these lines not matched:
    # 2 <#> gvsv[*b] s
    # 3 <$> const[IV 42] s
    # 5 <#> gvsv[*a] s
Errors are reported in 3 different ways:
The 1st form is directly from test.pl's like() and unlike(). Note that this form is used as input, so you can easily cut-paste results into test-files you are developing. Just make sure you recognize insane results, to avoid canonizing them as golden samples.
The 2nd and 3rd forms show only the unexpected results and opcodes. This is done because it's blindingly tedious to find a single opcode causing the failure. Two forms are given in case one of them is unhelpful.
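The idea behind the 2nd and 3rd forms can be approximated as a line-by-line comparison. The following is a minimal sketch only; unmatched_lines() is a hypothetical helper, not a function of OptreeCheck, and it assumes one pattern per expected line.

```perl
use strict;
use warnings;

# Hypothetical helper (not OptreeCheck's own code): pair each rendered
# line with the pattern expected at that position and collect mismatches.
sub unmatched_lines {
    my ($got, @want) = @_;
    my @lines = split /\n/, $got;
    my @report;
    for my $i (0 .. $#want) {
        my $line = defined $lines[$i] ? $lines[$i] : '';
        my $pat  = $want[$i];
        next if $line =~ /\A$pat\z/;
        push @report, "got:  '$line'", "want: $pat";
    }
    return @report;
}

# A threaded-style rendering checked against non-threaded-style patterns.
my $got = join "\n",
    '1 <;> nextstate(main 600 optree_check.t:208) v',
    '2 <#> gvsv[*b] s',
    '3 <$> const[IV 42] s';

my @want = (
    qr/1 <;> (?:next|db)state\(.*?\) v/,
    qr/2 <\$> gvsv\(\*b\) s/,
    qr/3 <\$> const\(IV 42\) s/,
);

print "$_\n" for unmatched_lines($got, @want);
```

Only the lines that disagree are printed, which is the point of these report forms: the offending opcodes stand out from the rest of the tree.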
checkOptree(%tc) constructs a testcase object from %tc, and then calls methods which eventually call test.pl's like() to produce test results.
getRendering() runs code or prog through B::Concise, and captures its rendering. Errors emitted during rendering are checked against expected errors, and are reported as diagnostics by default, or as failures if the 'report=fail' cmdline-option is given.
prog is run in a sub-shell, with $bcopts passed through. This is the way to run code intended for main. The code arg, in contrast, is always a CODEREF, either because it starts that way as an arg, or because it's wrapped and eval'd as $sub = sub {$code};
mkCheckRex() selects the golden-sample for the threaded-ness of the platform, and produces a regex which matches the expected rendering; the test fails when the actual rendering doesn't match.
The regex includes 'workarounds' which accommodate expected rendering variations. These include:
    string constants          # avoid injection
    line numbers, etc         # args of nextstate()
    hexadecimal-numbers

    pad-slot-assignments      # for 5.8 compat, and testmode=cross
    (map|grep)(start|while)   # for 5.8 compat
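To illustrate how such workarounds can be folded into a regex, here is a minimal sketch. It is not the module's mkCheckRex(); sketch_check_rex() and its placeholder tokens are hypothetical, and it loosens only three things (nextstate arguments, [tN] targets, and reference counts), leaving everything else literal.

```perl
use strict;
use warnings;

# Hypothetical sketch, not the module's mkCheckRex(): turn a golden
# sample into an anchored regex that tolerates a few known variations.
sub sketch_check_rex {
    my ($sample) = @_;

    $sample =~ s/^[ \t]*#[ \t]?//mg;                     # strip leading '# '
    $sample =~ s/(?:next|db)state\(.*?\)/__STATE__/g;    # line numbers etc.
    $sample =~ s/\[t\d+\]/[__TMP__]/g;                   # op targets
    $sample =~ s/\[\d+ refs?\]/[__REFS__]/g;             # reference counts

    my %loose = (
        '__STATE__' => '(?:next|db)state\(.*?\)',
        '__TMP__'   => 't\d+',
        '__REFS__'  => '\d+ refs?',
    );

    my $rex = quotemeta $sample;     # everything else must match literally
    $rex =~ s/(__[A-Z]+__)/$loose{$1}/g;
    return qr/\A$rex\z/;
}

my $rex = sketch_check_rex(<<'EOT');
# 1 <;> nextstate(main 45 optree.t:23) v
# 2 <0> padsv[$a:45,46] M/LVINTRO
# 3 <1> leavesub[1 ref] K/REFC,1
EOT

# Same tree, rendered from a different line of a different file.
my $rendering = <<'EOT';
1 <;> nextstate(main 600 optree_check.t:208) v
2 <0> padsv[$a:45,46] M/LVINTRO
3 <1> leavesub[2 refs] K/REFC,1
EOT

print $rendering =~ $rex ? "match\n" : "mismatch\n";
```

Escaping the whole sample first and then re-inserting the loose parts keeps the match strict by default: a changed opcode flag still causes a mismatch.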
mylike() calls either unlike() or like(), depending on expectations. Mismatch reports are massaged, because the actual difference can easily be lost in the forest of opcodes.
Since the arg is a hash, the api is wide-open, and this really is about what elements must be or are in the hash, and what they do. %tc is passed to newTestCase(), the ctor, which adds in %proto, a global prototype object.
If name property is not provided, it is synthesized from these params: bcopts, note, prog, code. This is more convenient than trying to do it manually.
Either code or prog must be present.
prog => $src provides a snippet of code, which is run in a sub-process, via test.pl:runperl, and through B::Concise like so:
'./perl -w -MO=Concise,$bcopts_massaged -e $src'
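A rough equivalent of that sub-process run can be sketched without test.pl. This is an illustration only: the module itself goes through runperl(), whereas the sketch below uses a list-form pipe open (which on Windows needs a reasonably recent perl) and substitutes the currently running perl for ./perl. The snippet and option values are made up for the example.

```perl
use strict;
use warnings;

# Sketch only: the module itself uses test.pl's runperl().
# $^X (the running perl) stands in for ./perl here.
my @bcopts = ('-exec');
my $src    = 'my $x = 1';

open my $fh, '-|', $^X, '-w', '-MO=Concise,' . join(',', @bcopts), '-e', $src
    or die "cannot run $^X: $!";
my $rendering = do { local $/; <$fh> };   # slurp the child's STDOUT
close $fh;

print $rendering;
```

The optree rendering arrives on the child's STDOUT; the "-e syntax OK" notice goes to STDERR and is not captured here.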
The $code arg is passed to B::Concise::compile(), and run in-process. If $code is a string, it's first wrapped and eval'd into a $coderef. In either case, $coderef is then passed to B::Concise::compile():
    $subref = eval "sub{$code}";
    $render = B::Concise::compile($subref)->();
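A runnable version of the in-process path might look like the sketch below. It relies only on documented B::Concise functions (walk_output, compile, reset_sequence); the variable names and the sample snippet are illustrative.

```perl
use strict;
use warnings;
use B::Concise ();

my $code   = 'my $a';
my $subref = eval "sub { $code }" or die "eval failed: $@";

my $rendering = '';
B::Concise::walk_output(\$rendering);      # capture output in a scalar
my $walker = B::Concise::compile('-exec', $subref);
B::Concise::reset_sequence();              # restart op numbering at 1
$walker->();

print $rendering;
```

Note that compile() returns a walker that prints the rendering rather than returning it, which is why the output handle is redirected to a scalar first.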
expect and expect_nt args are the golden-sample renderings, and are sampled from known-ok threaded and un-threaded bleadperl (5.9.1) builds. They're both required, and the correct one is selected for the platform being tested, and saved into the synthesized property wanted.
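The selection step can be sketched as follows. choose_wanted() is a hypothetical helper, not the module's code; it assumes threaded-ness is read from $Config{usethreads} and that cross mode simply picks the opposite sample.

```perl
use strict;
use warnings;
use Config;

# Hypothetical sketch: pick the golden sample for this build; in cross
# mode deliberately pick the other one.
sub choose_wanted {
    my ($tc, $threaded, $cross) = @_;
    my $use_threaded_sample = $cross ? !$threaded : $threaded;
    return $use_threaded_sample ? $tc->{expect} : $tc->{expect_nt};
}

my %tc = (
    expect    => "threaded sample\n",
    expect_nt => "non-threaded sample\n",
);
my $threaded = $Config{usethreads} ? 1 : 0;

$tc{wanted} = choose_wanted(\%tc, $threaded, 0);
print $tc{wanted};
```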
When getRendering() runs, it passes bcopts into B::Concise::compile(). The bcopts arg can be a single string, or an array of strings.
getRendering() processes the code or prog arg under warnings, and both parsing and optree-traversal errors are collected. These are validated against the one or more errors you specify.
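As an illustration of the collect-then-validate idea, here is a minimal sketch. Both helpers are hypothetical and much simpler than the module's own error checking; they only cover compile-time warnings of a code string, with expectations given as a string, a qr// pattern, or an array of either.

```perl
use strict;
use warnings;

# Hypothetical sketch: collect warnings (and any compile error) raised
# while compiling a snippet under warnings.
sub collect_warnings {
    my ($src) = @_;
    my @got;
    local $SIG{__WARN__} = sub { push @got, @_ };
    my $sub = eval "sub { $src }";
    push @got, $@ if !$sub;
    chomp @got;
    return @got;
}

# Return the emitted messages that no expectation accounts for.
# A plain string must match exactly; a qr// is applied as a pattern.
sub unexpected_errs {
    my ($errs, @got) = @_;
    my @want = ref($errs) eq 'ARRAY' ? @$errs : ($errs);
    return grep {
        my $msg = $_;
        !grep { ref($_) ? $msg =~ $_ : $msg eq $_ } @want;
    } @got;
}

my @got = collect_warnings('my $x; my $x');
print "emitted: $_\n" for @got;
print "unexpected: $_\n"
    for unexpected_errs(qr/masks earlier declaration/, @got);
```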
These properties are set as %tc parameters to change test behavior.
skip: invokes skip('reason'), causing the test to be skipped.
todo: invokes todo('reason').
fail: For code arguments, this option causes getRendering() to redirect the rendering operation to STDERR, which causes the regex match to fail.
noanchors: If set, this relaxes the regex check, which is normally pretty strict. It's used primarily to validate checkOptree via tests in optree_check.
These properties are added into the test object during execution.
wanted: This stores the chosen expect or expect_nt string. The OptreeCheck object may in the future delete the raw strings once wanted is set, thus saving space.
cross: This tag is added if testmode=cross is passed in as an argument. It causes the test harness to purposely use the wrong string.
checkErrs() is a getRendering() helper that verifies the expected errs against those found when rendering the code on the platform. It is run after rendering, and before mkCheckRex().
mkCheckRex() selects the correct golden-sample from the test-case object, and converts it into a Regexp which should match against the original golden-sample (used in selftest, see below), and against the renderings obtained by applying the code on the perl being tested.
The selection is driven by platform mostly, but also by test-mode, which rather complicates the code. This is worsened by the potential need to make platform specific conversions on the reftext.
The regex accommodates these conversions, but is otherwise as strict as possible. For example, it should *not* match when opcode flags change, or when optimizations convert an op to an ex-op.
The selected golden-sample is massaged to eliminate various match irrelevancies. This is done so that the tests don't fail just because you added a line to the top of the test file. (Recall that the renderings contain the program's line numbers). Similar cleanups are done on "strings", hex-constants, etc.
The need to massage is reflected in the 2 golden-sample approach of the test-cases; we want the match to be as rigorous as possible, and that's easier to achieve when matching against 1 input than 2.
Opcode arguments (text within parentheses or brackets) are disregarded for matching purposes. This loses some info in 'add[t5]', but greatly simplifies matching 'nextstate(main 22 (eval 10):1)'. Besides, we are testing for regressions, not for complete accuracy.
The regex is anchored by default, but anchoring can be suppressed with 'noanchors', allowing 1-liner tests to succeed if the opcode is found.
Unusually, this module also processes @ARGV for command-line arguments which set global modes. These 'options' change the way the tests run, essentially reusing the tests for different purposes.
Additionally, there's an experimental control-arg interface (i.e. subject to change) which allows the user to set global modes.
At first, OptreeCheck used one reference-text, but the differences between Threaded and Non-threaded renderings meant that a single reference (sampled from say, threaded) would be tricky and iterative to convert for testing on a non-threaded build. Worse, this conflicts with making tests both strict and precise.
We now use 2 reference texts; the right one is chosen based upon the build's threaded-ness. This has several benefits:
1. native reference data allows closer/easier matching by regex.
2. samples can be eyeballed to grok T-nT differences.
3. data can help to validate mkCheckRex() operation.
4. can develop regexes which accommodate T-nT differences.
5. can test with both native and cross-converted regexes.
Cross-testing (expect_nt on threaded, expect on non-threaded) exposes differences in B::Concise output, so mkCheckRex has code to do some cross-test manipulations. This area needs more work.
One consequence of a single-function API is difficulty controlling test-mode. I've chosen for now to use a package hash, %gOpts, to store test-state. These properties alter checkOptree() function, either short-circuiting to selftest, or running a loop that runs the testcase 2^N times, varying conditions each time. (current N is 2 only).
So Test-mode is controlled with cmdline args, also called options below. Run with 'help' to see the test-state, and how to change it.
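The option=value convention can be sketched like this. The parser below is hypothetical: %gOpts is the name used above, but the option table and the 'diag' default are assumptions made for the example, not the module's actual settings.

```perl
use strict;
use warnings;

# Hypothetical sketch of option=value parsing; the option names and
# defaults below are illustrative, not the module's actual table.
our %gOpts = (testmode => 'native', report => 'diag');

sub parse_opts {
    my @args = @_;
    for my $arg (@args) {
        my ($opt, $val) = split /=/, $arg, 2;
        die "unknown option '$opt'\n" unless exists $gOpts{$opt};
        $gOpts{$opt} = defined $val ? $val : 1;   # bare option acts as a flag
    }
    return \%gOpts;
}

# In a test file this would be parse_opts(@ARGV).
parse_opts('testmode=cross');
print "$_=$gOpts{$_}\n" for sort keys %gOpts;
```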
The selftest argument invokes runSelftest(), which tests each regex against the reference rendering that it was made from. Failure of a regex to match its 'mold' is a strong indicator that mkCheckRex is buggy.
That said, selftest mode currently runs a cross-test too; they're not completely orthogonal yet. See below.
Cross-testing is purposely creating a T-NT mismatch, looking at the fallout, which helps to understand the T-NT differences.
The tweaking appears contrary to the 2-refs philosophy, but the tweaks will be made in conversion-specific code, which (will) handle T->NT and NT->T separately. The tweaking is incomplete.
A reasonable 1st step is to add tags to indicate when TonNT or NTonT is known to fail. This needs an option to force failure, so the test.pl reporting mechanics show results to aid the user.
testmode=native is the normal mode. Valid values are: native, cross, both.
checkOptree() accepts test code, renders its optree using B::Concise, and matches that rendering against a regex built from one of the 2 reference renderings in the %tc data.
The regex is built by mkCheckRex(\%tc), which scrubs %tc data to remove match-irrelevancies, such as (args) and [args]. For example, it strips leading '# ', making it easy to cut-paste new tests into your test-file, run it, and cut-paste actual results into place. You then retest and reedit until all 'errors' are gone. (Now make sure you haven't 'enshrined' a bug.)
name: The test name. May be augmented by a label, which is built from important params, and which helps keep names in sync with what's being tested.
This optree regression testing framework needs tests in order to find bugs. To that end, OptreeCheck has support for developing new tests, according to the following model:
1. Write a set of sample code into a single file, one per paragraph. Add <=for gentest> blocks if you care to, or just look at f_map and f_sort in ext/B/t/ for examples.

2. Run OptreeCheck as a program on the file:

    ./perl -Ilib ext/B/t/OptreeCheck.pm -w ext/B/t/f_map
    ./perl -Ilib ext/B/t/OptreeCheck.pm -w ext/B/t/f_sort

   gentest reads the sample code, runs each to generate a reference rendering, folds this rendering into a checkOptree() statement, and prints it to stdout.

3. Run the output file as above, redirect to files, then rerun on the same build (for a sanity check), and on a thread-opposite build. With an editor in one window, and the command line in the other, it's fairly easy to cut-paste the gots into the expects, easier than running step 2 on both builds and then trying to sdiff them together.
This code is purely for testing core. While checkOptree feels flexible enough to be stable, the whole selftest framework is subject to change w/o notice.