
=head1 F<regen/op_private>

This file contains all the definitions of the meanings of the flags in the
op_private field of an OP.

After editing this file, run C<make regen>. This will generate/update data
in F<opcode.h> and F<lib/B/Op_private.pm>.

C<B::Op_private> holds three global hashes, C<%bits>, C<%defines>,
C<%labels>, which hold roughly the same information as found in this file
(after processing).

F<opcode.h> gains a series of C<OPp*> defines, and a few static data
structures:

C<PL_op_private_valid> defines, per-op, which op_private bits are legally
allowed to be set. This is a good first place to look to see if an op has
any spare private bits.

C<PL_op_private_bitdef_ix>, C<PL_op_private_bitdefs>,
C<PL_op_private_labels>, C<PL_op_private_bitfields>,
C<PL_op_private_valid> contain (in a compact form) the data needed by
Perl_do_op_dump() to dump the op_private field of an op.

This file actually contains perl code which is run by F<regen/>.
The basic idea is that you keep calling addbits() to add definitions of
what a particular bit or range of bits in op_private means for a
particular op. This can be specified either as a 1-bit flag or as a bit
field of one or more bits. Here's a general example:

    addbits('aelem',
            7 => qw(OPpLVAL_INTRO LVINTRO),
            6 => qw(OPpLVAL_DEFER LVDEFER),
       '4..5' =>  {
                       mask_def  => 'OPpDEREF',
                       enum => [ qw(
                                   1   OPpDEREF_AV   DREFAV
                                   2   OPpDEREF_HV   DREFHV
                                   3   OPpDEREF_SV   DREFSV
                               )],
                   },
    );

Here for the op C<aelem>, bits 6 and 7 (bits are numbered 0..7) are
defined as single-bit flags. The first string following the bit number is
the define name that gets emitted in F<opcode.h>, and the second string is
the label, which will be displayed by Concise and Perl_do_op_dump()
(as used by C<perl -Dx>).

If the bit number is actually two numbers connected with '..', then this
defines a bit field, which is 1 or more bits taken to hold a small
unsigned integer. Instead of two string arguments, it just has a single
hash ref argument. A bit field allows you to generate extra defines, such
as a mask, and optionally allows you to define an enumeration, where a
subset of the possible values of the bit field are given their own defines
and labels. The full syntax of this hash is explained further below.

Note that not all bits for a particular op need to be added in a single
addbits() call; they accumulate. In particular, this file is arranged in
two halves; first, generic flags shared by multiple ops are added, then
in the second half, specific per-op flags are added, e.g.

   addbits($_, 7 => qw(OPpLVAL_INTRO LVINTRO)) for qw(pos substr vec  ...);
   ...
   addbits('substr',
               4 => qw(OPpSUBSTR_REPL_FIRST REPL1ST),
               3 => ...
           );

(although the dividing line between these two halves is somewhat
subjective, and is based on whether "OPp" is followed by the op name or
something generic).
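
The accumulating behaviour can be pictured with a toy model. This is only
a minimal sketch and not the real implementation used by the regen
scripts: C<toy_addbits> and C<%toy_bits> are hypothetical names invented
for illustration, only single-bit flags are handled, and treating a
repeated bit as an error is an assumption of the sketch.

```perl
    use strict;
    use warnings;

    # Hypothetical store: op name => { bit number => [ define, label ] }
    my %toy_bits;

    # Simplified stand-in for addbits(): single-bit flags only.
    # Arguments after the op name arrive as (bit, define, label) triples,
    # because "7 => qw(A B)" flattens to the list (7, 'A', 'B').
    sub toy_addbits {
        my ($op, @args) = @_;
        while (@args) {
            my ($bit, $define, $label) = splice @args, 0, 3;
            # assumption of this sketch: redefining a bit is an error
            die "bit $bit already defined for $op\n"
                if exists $toy_bits{$op}{$bit};
            $toy_bits{$op}{$bit} = [ $define, $label ];
        }
    }

    # A generic flag shared by several ops, then an op-specific flag.
    toy_addbits($_, 7 => qw(OPpLVAL_INTRO LVINTRO)) for qw(pos substr vec);
    toy_addbits('substr', 4 => qw(OPpSUBSTR_REPL_FIRST REPL1ST));

    # substr now carries the definitions from both calls.
    print join(',', sort { $a <=> $b } keys %{ $toy_bits{substr} }), "\n";
```

In this sketch the two calls touching C<substr> simply add entries to the
same per-op hash, which is the sense in which definitions accumulate.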

There are some utility functions for generating a list of ops from
F<regen/opcodes> based on various criteria. These are:

    ops_with_check('ck_foo')
    ops_with_flag('X')
    ops_with_arg(N, 'XYZ')

which respectively return a list of op names where:

    field 3 of regen/opcodes specifies 'ck_foo' as the check function;
    field 4 of regen/opcodes has flag or type 'X' set;
    argument field N of regen/opcodes matches 'XYZ';

For example

    addbits($_, 4 => qw(OPpTARGET_MY TARGMY)) for ops_with_flag('T');
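
To make the idea of selecting ops by flag concrete, here is a small
sketch of such a filter. It is not the real helper: C<toy_ops_with_flag>
and C<@toy_opcodes> are hypothetical, the records are illustrative rather
than copied from F<regen/opcodes>, and the tab-separated layout (name,
description, check function, flags, argument spec) is an assumption.

```perl
    use strict;
    use warnings;

    # Illustrative records only; fields are assumed to be name,
    # description, check function, flags and argument spec.
    my @toy_opcodes = (
        "add\taddition (+)\tck_null\tIfsT2\tS S",
        "negate\tnegation (-)\tck_null\tIfst1\tS",
        "const\tconstant item\tck_svconst\ts\$\t",
    );

    # Return the names of all ops whose flags field (field 4, index 3)
    # contains the given character. The match is case-sensitive.
    sub toy_ops_with_flag {
        my ($flag) = @_;
        return map  { $_->[0] }
               grep { index($_->[3], $flag) >= 0 }
               map  { [ split /\t/, $_, -1 ] }
               @toy_opcodes;
    }

    print join(' ', toy_ops_with_flag('T')), "\n";
```

With these sample records only C<add> carries the upper-case C<T>, so a
loop such as the one above would add the flag to that op alone.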

If a label is specified as '-', then the flag or bit field is not
displayed symbolically by Concise/-Dx; instead the bits are treated as
unrecognised and are included in the final residual integer value after
all recognised bits have been processed (this doesn't apply to individual
enum labels).

Here is a full example of a bit field hash:

    '5..6' =>  {
        mask_def      => 'OPpFOO_MASK',
        baseshift_def => 'OPpFOO_SHIFT',
        bitcount_def  => 'OPpFOO_BITS',
        label         => 'FOO',
        enum          => [ qw(
                             1   OPpFOO_A  A
                             2   OPpFOO_B  B
                             3   OPpFOO_C  C
                         )],
    };

The optional C<*_def> keys cause defines to be emitted that specify
useful values based on the bit range (5 to 6 in this case):

    mask_def:      a mask that will extract the bit field
    baseshift_def: how much to shift to make the bit field reach bit 0
    bitcount_def:  how many bits make up the bit field

The example above will generate

    #define OPpFOO_MASK 0x60
    #define OPpFOO_SHIFT   5
    #define OPpFOO_BITS    2
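
The arithmetic behind these three values can be sketched as follows. The
helper name C<toy_field_defs> is hypothetical, and this is only an
illustration of the calculation, not the code that emits the defines.

```perl
    use strict;
    use warnings;

    # Hypothetical helper: derive the three *_def values from 'LO..HI'.
    sub toy_field_defs {
        my ($range) = @_;
        my ($lo, $hi) = $range =~ /^(\d+)\.\.(\d+)$/
            or die "bad bit range '$range'\n";
        my $bitcount = $hi - $lo + 1;                  # width of the field
        my $mask     = ((1 << $bitcount) - 1) << $lo;  # ones over the field
        return (mask => $mask, baseshift => $lo, bitcount => $bitcount);
    }

    my %d = toy_field_defs('5..6');
    printf "mask=0x%02x shift=%d bits=%d\n", @d{qw(mask baseshift bitcount)};
```

For the range C<5..6> this yields a mask of 0x60, a shift of 5 and a bit
count of 2, agreeing with the defines shown above.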

The optional enum list specifies a set of defines and labels for (possibly
a subset of) the possible values of the bit field (which in this example
are 0,1,2,3). If a particular value matches an enum, then it will be
displayed symbolically (e.g. 'C'), otherwise as a small integer. The
defines are suitably shifted. The example above will generate

    #define OPpFOO_A 0x20
    #define OPpFOO_B 0x40
    #define OPpFOO_C 0x60
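
"Suitably shifted" means each enum value is moved up to the low bit of
the field. A minimal sketch of that step, using a hypothetical helper
C<toy_enum_defs> that takes the low bit number and the flat enum list:

```perl
    use strict;
    use warnings;

    # Hypothetical helper: shift each enum value up to the field's low bit.
    # The enum list is a flat sequence of (value, define, label) triples.
    sub toy_enum_defs {
        my ($lo, @enum) = @_;
        my %value_of;
        while (@enum) {
            my ($val, $define, $label) = splice @enum, 0, 3;
            $value_of{$define} = $val << $lo;
        }
        return %value_of;
    }

    my %v = toy_enum_defs(5, qw(
        1   OPpFOO_A  A
        2   OPpFOO_B  B
        3   OPpFOO_C  C
    ));
    printf "#define %s 0x%02x\n", $_, $v{$_} for sort keys %v;
```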

So you can write code like

    if ((o->op_private & OPpFOO_MASK) == OPpFOO_C) ...

The optional 'label' key causes Concise/-Dx output to prefix the value
with C<LABEL=>; so in this case it might display C<FOO=C>.  If the field
value is zero, and if no label is present, and if no enum matches, then
the field isn't displayed.
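
The display rules for a bit field can be summarised in a short sketch.
This is a reading of the rules described above, not the logic of
Perl_do_op_dump() itself; C<toy_show_field> and the layout of its
C<$field> hash are hypothetical.

```perl
    use strict;
    use warnings;

    # Hypothetical renderer for one bit field of an op_private value.
    # $field = { lo => N, bitcount => N, label => STR (optional),
    #            enum => { value => label, ... } (optional) }
    sub toy_show_field {
        my ($private, $field) = @_;
        my $mask = (1 << $field->{bitcount}) - 1;
        my $val  = ($private >> $field->{lo}) & $mask;
        my $sym  = $field->{enum}{$val};

        # zero value, no label, no enum match: nothing to display
        return '' if !$val && !defined $field->{label} && !defined $sym;

        # symbolic name if an enum matches, otherwise the small integer
        my $text = defined $sym ? $sym : $val;
        return defined $field->{label} ? "$field->{label}=$text" : $text;
    }

    my %foo = (lo => 5, bitcount => 2, label => 'FOO',
               enum => { 1 => 'A', 2 => 'B', 3 => 'C' });
    print toy_show_field(0x60, \%foo), "\n";   # field value is 3
```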


=cut

use warnings;
use strict;

# ====================================================================
# Flags where FOO is a generic term (like LVAL), and the flag is
# shared between multiple (possibly unrelated) ops.

    # The lower few bits of op_private often indicate the number of
    # arguments. This is usually set by newUNOP() and newLOGOP() (to 1),
    # by newBINOP() (to 1 or 2), and by ck_fun() (to 1..15).
    # These values are sometimes used at runtime: in particular,
    # the MAXARG macro extracts out the lower 4 bits.
    # Some ops encroach upon these bits; for example, entersub is a unop,
    # but uses bit 0 for something else. Bit 0 is initially set to 1 in
    # newUNOP(), but is later cleared (in ck_rvconst()), when the code
    # notices that this op is an entersub.
    # The important thing below is that any ops which use MAXARG at
    # runtime must have all 4 bits allocated; if bit 3 were used for a new
    # flag say, then things could break.  The information on the other
    # types of op is for completeness (so we can account for every bit
    # used in every op)

    my (%maxarg, %args0, %args1, %args2, %args3, %args4);

    # these are the functions which currently use MAXARG at runtime
    # (i.e. in the pp() functions). Thus they must always have 4 bits
    # allocated
    $maxarg{$_} = 1 for qw(
        binmode bless caller chdir close enterwrite eof exit fileno getc
        getpgrp gmtime index mkdir rand reset setpgrp sleep srand sysopen
        tell umask
    );

    # find which ops use 0,1,2,3 or 4 bits of op_private for arg count info

    $args0{$_} = 1 for qw(entersub); # UNOPs that usurp bit 0

    $args1{$_} = 1 for (
                        qw(reverse), # ck_fun(), but most bits stolen
                        qw(mapstart grepstart), # set in ck_fun, but
                                                # cleared in ck_grep,
                                                # unless there is an error
                        grep !$maxarg{$_} && !$args0{$_},
                            ops_with_flag('1'), # UNOP
                            ops_with_flag('+'), # UNOP_AUX
                            ops_with_flag('%'), # BASEOP/UNOP
                            ops_with_flag('|'), # LOGOP
                            ops_with_flag('-'), # FILESTATOP
                            ops_with_flag('}'), # LOOPEXOP
                            ops_with_flag('.'), # METHOP
                        );

    $args2{$_} = 1 for (
                        grep !$maxarg{$_} && !$args0{$_} && !$args1{$_},
                            ops_with_flag('2'), # BINOP
                            # this is a binop, but special-cased as a
                            # baseop in regen/opcodes
                            'sassign',
                        );

    $args3{$_} = 1 for grep !$maxarg{$_} && !$args0{$_}
                                            && !$args1{$_} && !$args2{$_},
                            # substr starts off with 4 bits set in
                            # ck_fun(), but since it never has more than 7
                            # args, bit 3 is later stolen
                            qw(substr);

    $args4{$_} = 1 for  keys %maxarg,
                        grep !$args0{$_} && !$args1{$_}
                                                && !$args2{$_} && !$args3{$_},
                            # these other ck_*() functions call ck_fun()

    for (sort keys %args1) {
        addbits($_, '0..0' => {
                mask_def  => 'OPpARG1_MASK',
                label     => '-',
            }
        );
    }

    for (sort keys %args2) {
        addbits($_, '0..1' => {
                mask_def  => 'OPpARG2_MASK',
                label     => '-',
            }
        );
    }

    for (sort keys %args3) {
        addbits($_, '0..2' => {
                mask_def  => 'OPpARG3_MASK',
                label     => '-',
            }
        );
    }

    for (sort keys %args4) {
        addbits($_, '0..3' => {
                mask_def  => 'OPpARG4_MASK',
                label     => '-',
            }
        );
    }

# if NATIVE_HINTS is defined, op_private on cops holds the top 8 bits
# of PL_hints, although only bits 6 & 7 are officially used for that
# purpose (the rest ought to be masked off). Bit 5 is set separately

for (qw(nextstate dbstate)) {
    addbits($_,
        5 => qw(OPpHUSH_VMSISH          HUSH),
    );
}

# op is in local context, or pad variable is being introduced, e.g.
#   local $h{foo}
#   my $x

addbits($_, 7 => qw(OPpLVAL_INTRO LVINTRO))
    for qw(gvsv rv2sv rv2hv rv2gv rv2av aelem helem aslice
           hslice delete padsv padav padhv enteriter entersub padrange
           pushmark cond_expr refassign lvref lvrefslice lvavref multideref),
           'list', # this gets set in my_attrs() for some reason
           ;

# in constructs like my $x; ...; $x = $a + $b,
# the sassign is optimised away and OPpTARGET_MY is set on the add op
# Note that OPpTARGET_MY is mainly used at compile-time. At run time,
# the pp function just updates the SV pointed to by op_targ, and doesn't
# care whether that's a PADTMP or a lexical var.

# Some comments about when it's safe to use T/OPpTARGET_MY.
# Safe to set if the ppcode uses:
# but make sure set-magic is invoked separately for SETs(TARG) (or change
# it to SETTARG).
# Unsafe to set if the ppcode uses dTARG or [X]RETPUSH[YES|NO|UNDEF]
# Only the code paths that handle scalar rvalue context matter.  If dTARG
# or RETPUSHNO occurs only in list or lvalue paths, T is safe.
# lt and friends do SETs (including ncmp, but not scmp or i_ncmp)
# Additional mode of failure: the opcode can modify TARG before it "used"
# all the arguments (or may call an external function which does the same).
# If the target coincides with one of the arguments ==> kaboom.
# pp.c	pos substr each not OK (RETPUSHUNDEF)
#	ref not OK (RETPUSHNO)
#	trans not OK (target is used for lhs, not retval)
#	ucfirst etc not OK: TMP arg processed inplace
#	quotemeta not OK (unsafe when TARG == arg)
#	pack - unknown whether it is safe
#	sprintf: is calling do_sprintf(TARG,...) which can act on TARG
#	  before other args are processed.
#	Suspicious wrt "additional mode of failure" (and only it):
#	schop, chop, postinc/dec, bit_and etc, negate, complement.
#	Also suspicious: 4-arg substr, sprintf, uc/lc (POK_only), reverse, pack.
#	substr/vec: doing TAINT_off()???
# pp_hot.c
#	readline - unknown whether it is safe
#	match subst not OK (dTARG)
#	grepwhile not OK (not always setting)
#	join not OK (unsafe when TARG == arg)
#	concat - pp_concat special-cases TARG==arg to avoid
#		"additional mode of failure"
# pp_ctl.c
#	mapwhile flip caller not OK (not always setting)
# pp_sys.c
#	backtick glob warn die not OK (not always setting)
#	warn not OK (RETPUSHYES)
#	open fileno getc sysread syswrite ioctl accept shutdown
#	 ftsize(etc) readlink telldir fork alarm getlogin not OK (RETPUSHUNDEF)
#	umask select not OK (XPUSHs(&PL_sv_undef);)
#	fileno getc sysread syswrite tell not OK (meth("FILENO" "GETC"))
#	sselect shm* sem* msg* syscall - unknown whether they are safe
#	gmtime not OK (list context)
#	Suspicious wrt "additional mode of failure": warn, die, select.

addbits($_, 4 => qw(OPpTARGET_MY TARGMY))
    for ops_with_flag('T'),
    # This flag is also used to indicate matches against implicit $_,
    # where $_ is lexical; e.g. my $_; ....; /foo/
    qw(match subst pushre qr trans transr);

# op_targ carries a refcount
addbits($_, 6 => qw(OPpREFCOUNTED REFC))
    for qw(leave leavesub leavesublv leavewrite leaveeval);

# Do not copy return value
addbits($_, 7 => qw(OPpLVALUE LV)) for qw(leave leaveloop);

# Pattern coming in on the stack
addbits($_, 6 => qw(OPpRUNTIME RTIME))
    for qw(match subst substcont qr pushre);

# autovivify: Want ref to something
for (qw(rv2gv rv2sv padsv aelem helem entersub)) {
    addbits($_, '4..5' => {
                mask_def  => 'OPpDEREF',
                enum => [ qw(
                            1   OPpDEREF_AV   DREFAV
                            2   OPpDEREF_HV   DREFHV
                            3   OPpDEREF_SV   DREFSV
                        )],
                }
    );
}

# Defer creation of array/hash elem
addbits($_, 6 => qw(OPpLVAL_DEFER LVDEFER)) for qw(aelem helem multideref);

addbits($_, 2 => qw(OPpSLICEWARNING SLICEWARN)) # warn about @hash{$scalar}
    for qw(rv2hv rv2av padav padhv hslice aslice);

# XXX Concise seemed to think that OPpOUR_INTRO is used in rv2gv too,
# but I can't see it - DAPM
addbits($_, 6 => qw(OPpOUR_INTRO OURINTR)) # Variable was in an our()
    for qw(gvsv rv2sv rv2av rv2hv enteriter split);

# We might be an lvalue to return
addbits($_, 3 => qw(OPpMAYBE_LVSUB LVSUB))
    for qw(aassign rv2av rv2gv rv2hv padav padhv aelem helem aslice hslice
           av2arylen keys rkeys kvaslice kvhslice substr pos vec multideref);

for (qw(rv2hv padhv)) {
    addbits($_,                           # e.g. %hash in (%hash || $foo) ...
        4 => qw(OPpMAYBE_TRUEBOOL BOOL?), # ... cx not known till run time
        5 => qw(OPpTRUEBOOL       BOOL),  # ... in void cxt
    );
}

addbits($_, 1 => qw(OPpHINT_STRICT_REFS STRICT))
    for qw(rv2sv rv2av rv2hv rv2gv multideref);

# Treat caller(1) as caller(2)
addbits($_, 7 => qw(OPpOFFBYONE  +1)) for qw(caller wantarray runcv);

# label is in UTF8
addbits($_, 7 => qw(OPpPV_IS_UTF8 UTF)) for qw(last redo next goto dump);

# ====================================================================
# Flags where FOO is typically the name of an op, and the flag is used by a
# single op (or maybe by a few closely related ops).

addbits($_, 6 => qw(OPpPAD_STATE STATE))  for qw(padav padhv padsv lvavref
                                                 lvref refassign pushmark);

addbits('aassign', 6 => qw(OPpASSIGN_COMMON COMMON));

addbits('sassign',
    6 => qw(OPpASSIGN_BACKWARDS BKWARD), # Left & right switched
    7 => qw(OPpASSIGN_CV_TO_GV  CV2GV),  # Possible optimisation for constants
);

for (qw(trans transr)) {
    addbits($_,
        0 => qw(OPpTRANS_FROM_UTF   <UTF),
        1 => qw(OPpTRANS_TO_UTF     >UTF),
        2 => qw(OPpTRANS_IDENTICAL  IDENT),   # right side is same as left
        3 => qw(OPpTRANS_SQUASH     SQUASH),
        # 4 is used for OPpTARGET_MY
        6 => qw(OPpTRANS_GROWS      GROWS),
        7 => qw(OPpTRANS_DELETE     DEL),
    );
}

addbits('repeat', 6 => qw(OPpREPEAT_DOLIST DOLIST)); # List replication

# OP_ENTERSUB and OP_RV2CV flags
# Flags are set on entersub and rv2cv in three phases:
#   parser  - the parser passes the flag to the op constructor
#   check   - the check routine called by the op constructor sets the flag
#   context - application of scalar/ref/lvalue context applies the flag
# In the third stage, an entersub op might turn into an rv2cv op (undef &foo,
# \&foo, lock &foo, exists &foo, defined &foo).  The two places where that
# happens (op_lvalue_flags and doref in op.c) need to make sure the flags do
# not conflict, since some flags with different meanings overlap between
# the two ops.  Flags applied in the context phase are only set when there
# is no conversion of op type.
#   bit  entersub flag       phase   rv2cv flag             phase
#   ---  -------------       -----   ----------             -----
#     0  OPpENTERSUB_INARGS  context
#     1  HINT_STRICT_REFS    check   HINT_STRICT_REFS       check
#     3  OPpENTERSUB_AMPER   check   OPpENTERSUB_AMPER      parser
#     4  OPpDEREF_AV         context
#     5  OPpDEREF_HV         context OPpMAY_RETURN_CONSTANT parser/context
#     6  OPpENTERSUB_DB      check   OPpENTERSUB_DB
#     7  OPpLVAL_INTRO       context OPpENTERSUB_NOPAREN    parser


addbits('entersub',
    0      => qw(OPpENTERSUB_INARGS   INARGS), # Lval used as arg to a sub
    1      => qw(OPpHINT_STRICT_REFS  STRICT), # 'use strict' in scope
    2      => qw(OPpENTERSUB_HASTARG  TARG  ), # Called from OP tree
    3      => qw(OPpENTERSUB_AMPER    AMPER),  # Used & form to call
    # 4..5 => OPpDEREF,      already defined above
    6      => qw(OPpENTERSUB_DB       DBG   ), # Debug subroutine
    # 7    => OPpLVAL_INTRO, already defined above
);

# note that some of these flags are just left-over from when an entersub
# is converted into an rv2cv, and could probably be cleared/re-assigned

addbits('rv2cv',
    1 => qw(OPpHINT_STRICT_REFS    STRICT), # 'use strict' in scope
    2 => qw(OPpENTERSUB_HASTARG    TARG  ), # If const sub, return the const
    3 => qw(OPpENTERSUB_AMPER      AMPER ), # Used & form to call

    6 => qw(OPpENTERSUB_DB         DBG   ), # Debug subroutine
    7 => qw(OPpENTERSUB_NOPAREN    NO()  ), # bare sub call (without parens)
);

# foo() called before sub foo was parsed
addbits('gv', 5 => qw(OPpEARLY_CV EARLYCV));

# 1st arg is replacement string
addbits('substr', 4 => qw(OPpSUBSTR_REPL_FIRST REPL1ST));

addbits('padrange',
    # bits 0..6 hold target range
    '0..6' =>  {
            label         => '-',
            mask_def      => 'OPpPADRANGE_COUNTMASK',
            bitcount_def  => 'OPpPADRANGE_COUNTSHIFT',
          }
     # 7    => OPpLVAL_INTRO, already defined above
);

for (qw(aelemfast aelemfast_lex)) {
    addbits($_,
        '0..7' =>  {
                label     => '-',
              }
    );
}

addbits('rv2gv',
    2 => qw(OPpDONT_INIT_GV NOINIT), # Call gv_fetchpv with GV_NOINIT
                            # (Therefore will return whatever is currently in
                            # the symbol table, not guaranteed to be a PVGV)
    6 => qw(OPpALLOW_FAKE   FAKE),   # OK to return fake glob
);

addbits('enteriter',
                    2 => qw(OPpITER_REVERSED REVERSED),# for (reverse ...)
                    3 => qw(OPpITER_DEF      DEF),     # 'for $_' or 'for my $_'
);
addbits('iter',     2 => qw(OPpITER_REVERSED REVERSED));

addbits('const',
    1 => qw(OPpCONST_NOVER        NOVER),   # no 6;
    2 => qw(OPpCONST_SHORTCIRCUIT SHORT),   # e.g. the constant 5 in (5 || foo)
    3 => qw(OPpCONST_STRICT       STRICT),  # bareword subject to strict 'subs'
    4 => qw(OPpCONST_ENTERED      ENTERED), # Has been entered as symbol
    6 => qw(OPpCONST_BARE         BARE),    # Was a bare word (filehandle?)
);

# Range arg potentially a line num.
addbits($_, 6 => qw(OPpFLIP_LINENUM LINENUM)) for qw(flip flop);

# Guessed that pushmark was needed.
addbits('list', 6 => qw(OPpLIST_GUESSED GUESSED));

# Operating on a list of keys
addbits('delete', 6 => qw(OPpSLICE SLICE));
# also 7 => OPpLVAL_INTRO, already defined above

# Checking for &sub, not {} or [].
addbits('exists', 6 => qw(OPpEXISTS_SUB SUB));

addbits('sort',
    0 => qw(OPpSORT_NUMERIC  NUM    ), # Optimized away { $a <=> $b }
    1 => qw(OPpSORT_INTEGER  INT    ), # Ditto while under "use integer"
    2 => qw(OPpSORT_REVERSE  REV    ), # Reversed sort
    3 => qw(OPpSORT_INPLACE  INPLACE), # sort in-place; eg @a = sort @a
    4 => qw(OPpSORT_DESCEND  DESC   ), # Descending sort
    5 => qw(OPpSORT_QSORT    QSORT  ), # Use quicksort (not mergesort)
    6 => qw(OPpSORT_STABLE   STABLE ), # Use a stable algorithm
);

# reverse in-place (@a = reverse @a)
addbits('reverse', 3 => qw(OPpREVERSE_INPLACE  INPLACE));

for (qw(open backtick)) {
    addbits($_,
        4 => qw(OPpOPEN_IN_RAW    INBIN ), # binmode(F,":raw")  on input  fh
        5 => qw(OPpOPEN_IN_CRLF   INCR  ), # binmode(F,":crlf") on input  fh
        6 => qw(OPpOPEN_OUT_RAW   OUTBIN), # binmode(F,":raw")  on output fh
        7 => qw(OPpOPEN_OUT_CRLF  OUTCR ), # binmode(F,":crlf") on output fh
    );
}

# The various OPpFT* filetest ops

# "use filetest 'access'" is in scope:
# this flag is set only on a subset of the FT* ops
addbits($_, 1 => qw(OPpFT_ACCESS FTACCESS)) for ops_with_arg(0, 'F-+');

# all OPpFT* ops except stat and lstat
for (grep { $_ !~ /^l?stat$/ } ops_with_flag('-')) {
    addbits($_,
        2 => qw(OPpFT_STACKED  FTSTACKED ),  # stacked filetest,
                                             #    e.g. "-f" in "-f -x $foo"
        3 => qw(OPpFT_STACKING FTSTACKING),  # stacking filetest.
                                             #    e.g. "-x" in "-f -x $foo"
        4 => qw(OPpFT_AFTER_t  FTAFTERt  ),  # previous op was -t
    );
}

addbits($_, 1 => qw(OPpGREP_LEX GREPLEX)) # iterate over lexical $_
    for qw(mapwhile mapstart grepwhile grepstart);

addbits('entereval',
    1 => qw(OPpEVAL_HAS_HH       HAS_HH ), # Does it have a copy of %^H ?
    2 => qw(OPpEVAL_UNICODE      UNI    ),
    3 => qw(OPpEVAL_BYTES        BYTES  ),
    4 => qw(OPpEVAL_COPHH        COPHH  ), # Construct %^H from COP hints
    5 => qw(OPpEVAL_RE_REPARSING REPARSE), # eval_sv(..., G_RE_REPARSING)
);

# These must not conflict with OPpDONT_INIT_GV or OPpALLOW_FAKE.
# See pp.c:S_rv2gv.
addbits('coreargs',
    0 => qw(OPpCOREARGS_DEREF1    DEREF1), # Arg 1 is a handle constructor
    1 => qw(OPpCOREARGS_DEREF2    DEREF2), # Arg 2 is a handle constructor
   #2 reserved for OPpDONT_INIT_GV in rv2gv
   #4 reserved for OPpALLOW_FAKE   in rv2gv
    6 => qw(OPpCOREARGS_SCALARMOD $MOD  ), # \$ rather than \[$@%*]
    7 => qw(OPpCOREARGS_PUSHMARK  MARK  ), # Call pp_pushmark
);

addbits('split', 7 => qw(OPpSPLIT_IMPLIM IMPLIM)); # implicit limit

addbits($_,
    2 => qw(OPpLVREF_ELEM ELEM   ),
    3 => qw(OPpLVREF_ITER ITER   ),
'4..5'=> {
           mask_def => 'OPpLVREF_TYPE',
           enum     => [ qw(
                             0   OPpLVREF_SV   SV
                             1   OPpLVREF_AV   AV
                             2   OPpLVREF_HV   HV
                             3   OPpLVREF_CV   CV
                         )],
         },
) for 'refassign', 'lvref';

addbits('multideref',
    4 => qw(OPpMULTIDEREF_EXISTS EXISTS), # deref is actually exists
    5 => qw(OPpMULTIDEREF_DELETE DELETE), # deref is actually delete
);


# ex: set ts=8 sts=4 sw=4 et: