perltodo - Perl TO-DO List


This is a list of wishes for Perl. Send updates to If you want to work on any of these projects, be sure to check the perl5-porters archives for past ideas, flames, and propaganda. This will save you time and also prevent you from implementing something that Larry has already vetoed. One set of archives may be found at:

To do during 5.6.x

Support for I/O disciplines

perlio provides this, but the interface could be a lot more straightforward.

Eliminate need for "use utf8";

While the utf8 pragma is autoloaded when necessary, it's still needed for things like Unicode characters in a source file. The UTF8 hint can always be set to true, but it needs to be set to false when is being compiled. (To stop Perl trying to autoload the utf8 pragma...)

Create a char *sv_pvprintify(sv, STRLEN *lenp, UV flags)

For displaying PVs with control characters, embedded nulls, and Unicode. This would be useful for printing warnings, or data and regex dumping, not_a_number(), and so on.

Requirements: should handle both byte and UTF8 strings. isPRINT() characters printed as-is, characters less than 256 as \xHH, Unicode characters as \x{HHH}. Don't assume an ASCII-like encoding either; get somebody on EBCDIC to test the output.

Possible options, controlled by the flags:

- whitespace (other than ' ' of isPRINT()) printed as-is
- use isPRINT_LC() instead of isPRINT()
- print control characters like this: "\cA"
- print control characters like this: "^A"
- non-PRINTables printed as '.' instead of \xHH
- use \OOO instead of \xHH
- use the C/Perl-metacharacters like \n, \t
- have a maximum length for the produced string (read it from *lenp)
- append a "..." to the produced string if the maximum length is exceeded
- really fancy: print unicode characters as \N{...}
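The basic requirements can be prototyped in Perl before anyone writes the C. The sketch below is only an illustration: printify() is a hypothetical name, it ignores all of the optional flags, and it assumes an ASCII platform, so it is exactly the sort of code that would need testing on EBCDIC.

```perl
use strict;
use warnings;

# Hypothetical helper, not a core function: escape a string for display.
# Assumes an ASCII platform; the printable range below is wrong on EBCDIC.
sub printify {
    my ($string) = @_;
    my $out = '';
    for my $char (split //, $string) {
        my $ord = ord $char;
        if ($ord > 255) {
            # Unicode characters as \x{HHH}
            $out .= sprintf '\\x{%X}', $ord;
        }
        elsif ($ord >= 0x20 && $ord <= 0x7E) {
            # printable ASCII as-is
            $out .= $char;
        }
        else {
            # everything else below 256 as \xHH
            $out .= sprintf '\\x%02X', $ord;
        }
    }
    return $out;
}

my $shown = printify("a\0b\x{263A}\n");
print "$shown\n";    # a\x00b\x{263A}\x0A
```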


When the lexer sees, for instance, bytes::length, it should automatically load the bytes pragma.

Make "\u{XXXX}" et al work

Danger, Will Robinson! Discussing the semantics of "\x{F00}", "\xF00" and "\U{F00}" on P5P will lead to a long and boring flamewar.

Overloadable regex assertions

This may or may not be possible with the current regular expression engine. The idea is that, for instance, \b needs to be algorithmically computed if you're dealing with Thai text. Hence, the \b assertion wants to be overloaded by a function.

Unicode collation and normalization

Simon Cozens promises to work on this.


Unicode case mappings

    Case Mappings?

Unicode regular expression character classes

They have some tricks Perl doesn't yet implement like character class subtraction.

use Thread for iThreads

Artur Bergman's iThreads module is a start on this, but needs to be more mature.

make perl_clone optionally clone ops

So that pseudoforking, mod_perl, iThreads and nvi will work properly (but not as efficiently) until the regex engine is fixed to be threadsafe.

Work out exit/die semantics for threads

Typed lexicals for compiler

Compiler workarounds for Win32

AUTOLOADing in the compiler

Fixing comppadlist when compiling

Cleaning up exported namespace

Complete signal handling

Add PERL_ASYNC_CHECK to opcodes which loop; replace sigsetjmp with sigjmp; check wait for signal safety.

Out-of-source builds

This was done for 5.6.0, but needs reworking for 5.7.x

POSIX realtime support

POSIX 1003.1 1996 Edition support--realtime stuff: POSIX semaphores, message queues, shared memory, realtime clocks, timers, signals (the metaconfig units mostly already exist for these)

UNIX98 support

Reader-writer locks, realtime/asynchronous IO

IPv6 Support

There are non-core modules, such as Net::IPv6, but these will need integrating when IPv6 actually starts to really happen. See RFC 2292 and RFC 2553.

Long double conversion

Floating point formatting is still causing some weird test failures.


Locales and Unicode interact with each other in unpleasant ways. One possible solution would be to adopt/support ICU:

Thread-safe regexes

The regular expression engine is currently non-threadsafe.

Arithmetic on non-Arabic numerals

[1234567890] aren't the only numerals any more.

POSIX Unicode character classes

([=a=] for equivalence classes, [.ch.] for collation.) These are dependent on Unicode normalization and collation.

Factoring out common suffixes/prefixes in regexps (trie optimization)

Currently, the user has to optimize foo|far and foo|goo into f(?:oo|ar) and [fg]oo by hand; this could be done automatically.
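To check that the hand-factored forms really are equivalent, one can run both versions over the same input. This is a small sketch with a made-up word list; the anchors are added here only to keep the comparison strict.

```perl
use strict;
use warnings;

# Made-up sample input for the comparison.
my @words = qw(foo far goo fa fo boo food);

# Common prefix: foo|far versus f(?:oo|ar)
my @prefix_plain    = grep { /^(?:foo|far)$/ } @words;
my @prefix_factored = grep { /^f(?:oo|ar)$/ } @words;

# Common suffix: foo|goo versus [fg]oo
my @suffix_plain    = grep { /^(?:foo|goo)$/ } @words;
my @suffix_factored = grep { /^[fg]oo$/ } @words;

print "prefix: @prefix_plain / @prefix_factored\n";
print "suffix: @suffix_plain / @suffix_factored\n";
```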

Security audit shipped utilities

All the code we ship with Perl needs to be sensible about temporary file handling, locking, input validation, and so on.

Custom opcodes

Have a way to introduce user-defined opcodes without the subroutine call overhead of an XSUB; the user should be able to create PP code. Simon Cozens has some ideas on this.

spawnvp() on Win32

Win32 has problems spawning processes, particularly when the arguments to the child process contain spaces, quotes or tab characters.

DLL Versioning

Windows needs a way to know what version of an XS or libperl DLL it's loading.

Introduce @( and @)

$( may return "foo bar baz". Unfortunately, since groups can theoretically have spaces in their names, this could be one, two or three groups.

Floating point handling

NaN and inf support is particularly troublesome. (fp_classify(), fp_class(), fp_class_d(), class(), isinf(), isfinite(), finite(), isnormal(), unordered(), <ieeefp.h>, <fp_class.h> (there are metaconfig units for all these) (I think), fp_setmask(), fp_getmask(), fp_setround(), fp_getround() (no metaconfig units yet for these). Don't forget finitel(), fp_classl(), fp_class_l(), (yes, both do, unfortunately, exist), and unorderedl().)

As of Perl 5.6.1, there is a Perl macro, Perl_isnan().

IV/UV preservation

Nicholas Clark has done a lot of work on this, but work is continuing. +, - and * work, but guards need to be in place for %, /, &, oct, hex and pack.

Replace pod2html with something using Pod::Parser

The CPAN module Malik::Pod::Html may be a more suitable basis for a pod2html converter; the current one duplicates the functionality abstracted in Pod::Parser, which makes updating the POD language difficult.

Automate module testing on CPAN

When a new Perl is being beta tested, porters have to manually grab their favourite CPAN modules and test them - this should be done automatically.

sendmsg and recvmsg

We have all the other BSD socket functions but these. There are metaconfig units for these functions which can be added. To avoid these being new opcodes, a solution similar to the way sockatmark was added would be preferable. (Autoload the IO::whatever module.)

Rewrite perlre documentation

The new-style patterns need full documentation, and the whole document needs to be a lot clearer.

Convert example code to IO::Handle filehandles

Document Win32 choices

Check new modules

Make roffitall find pods and libs itself

Simon Cozens has done some work on this but it needs a rethink.

To do at some point

These are ideas that have been regularly tossed around, and that most people believe should be done, maybe during 5.8.x.

Remove regular expression recursion

Because the regular expression engine is recursive, badly designed expressions can lead to lots of recursion filling up the stack. Ilya claims that it is easy to convert the engine to being iterative, but this has not yet been done. There may be a regular expression engine hit squad meeting at TPC5.

Memory leaks after failed eval

Perl will leak memory if you eval "hlagh hlagh hlagh hlagh". This is partially because it attempts to build up an op tree for that code and doesn't properly free it. The same goes for non-syntactically-correct regular expressions. Hugo looked into this, but decided it needed a mark-and-sweep GC implementation.

Alan notes that: The basic idea was to extend the parser token stack (YYSTYPE) to include a type field so we knew what sort of thing each element of the stack was. The perly.c code would then have to be postprocessed to record the type of each entry on the stack as it was created, and the parser patched so that it could unroll the stack properly on error.

This is possible to do, but would be pretty messy to implement, as it would rely on even more sed hackery in perly.fixer.

pack "(stuff)*"

That is to say, pack "(sI)40" would be the same as pack "sI"x40.
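For reference, the repetition form looks like this in practice. The sketch uses invented sample values and only the "sI" x 40 spelling, since that is the one the item takes as given.

```perl
use strict;
use warnings;

# Forty (short, unsigned int) pairs; the values are invented sample data.
my @values = map { ($_, $_ * 1000) } 1 .. 40;

# Build the template by string repetition, as the item describes.
my $template = "sI" x 40;

my $packed     = pack $template, @values;
my @round_trip = unpack $template, $packed;

printf "%d values packed into %d bytes\n", scalar @round_trip, length $packed;
```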

bitfields in pack

Cross compilation

Make Perl buildable with a cross-compiler. This will play havoc with Configure, which needs to know how the target system will respond to its tests; maybe microperl will be a good starting point here. (Indeed, Bart Schuller reports that he compiled up microperl for the Agenda PDA and it works fine.) A really big spanner in the works is the bootstrapping build process of Perl: if the filesystem the target system sees is not the same as what the build host sees, various input, output, and (Perl) library files need to be copied back and forth.

Perl preprocessor / macros

Source filters help with this, but do not get us all the way. For instance, it should be possible to implement the ?? operator somehow; source filters don't (quite) cut it.

Perl lexer in Perl

Damian Conway is planning to work on this, but it hasn't happened yet.

Using POSIX calls internally

When faced with a BSD vs. SysV-style interface to some library or system function, perl's roots show in that it typically prefers the BSD interface (but falls back to the SysV one). One example is getpgrp(). Other examples include memcpy vs. bcopy. There are others, mostly in pp_sys.c.

Mostly, this item is a suggestion for which way to start a journey into an #ifdef forest. It is not primarily a suggestion to eliminate any of the #ifdef forests.

POSIX calls are perhaps more likely to be portable to unexpected architectures. They are also perhaps more likely to be actively maintained by a current vendor. They are also perhaps more likely to be available in thread-safe versions, if appropriate.

-i rename file when changed

It's only necessary to rename a file when inplace editing when the file has changed. Detecting a change is perhaps the difficult bit.

All ARGV input should act like <>

Support for rerunning debugger

There should be a way of restarting the debugger on demand.

Test Suite for the Debugger

The debugger is a complex piece of software and fixing something here may inadvertently break something else over there. To tame this chaotic behaviour, a test suite is necessary.

my sub foo { }

The basic principle is sound, but there are problems with the semantics of self-referential and mutually referential lexical subs: how to declare the subs?

One-pass global destruction

Sweeping away all the allocated memory in one go is a laudable goal, but it's difficult and in most cases, it's easier to let the memory get freed by exiting.

Rewrite regexp parser

There has been talk recently of rewriting the regular expression parser to produce an optree instead of a chain of opcodes; it's unclear whether or not this would be a win.

Cache recently used regexps

This is to speed up

    for my $re (@regexps) {
        $matched++ if /$re/;
    }

qr// already gives us a way of saving compiled regexps, but it should be done automatically.
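Until that happens automatically, the manual route is to compile each pattern once with qr// and reuse the compiled objects. A minimal sketch, with invented patterns and input lines:

```perl
use strict;
use warnings;

# Invented sample data.
my @patterns = ('^foo', 'bar$', 'b.z');
my @lines    = ('foobar', 'baz', 'quux');

# Compile each pattern once, outside the loop.
my @compiled = map { qr/$_/ } @patterns;

my $matched = 0;
for my $line (@lines) {
    for my $re (@compiled) {
        # A qr// object can be used directly as the pattern.
        $matched++ if $line =~ $re;
    }
}

print "$matched matches\n";    # 3 matches
```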

Re-entrant functions

Add configure probes for _r forms of system calls and fit them to the core. Unfortunately, calling conventions for these functions are not standardised.

Cross-compilation support

Bart Schuller reports that using microperl and a cross-compiler, he got Perl working on the Agenda PDA. However, one cannot build a full Perl because Configure needs to get its results for the target platform, not for the host.

Bit-shifting bitvectors


    vec($v, 1000, 1) = 1;

One should be able to do

    $v <<= 1;

and have the 999'th bit set.

Currently, if you try to shift a bitvector, you shift the NV/UV instead of the bits in the PV. Not very logical.
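In the meantime the effect can be had by going through a bit string. The helper below, shift_bits_down(), is a hypothetical name and only a sketch of one possible workaround; it relies on the "b*" template using the same bit numbering as one-bit vec().

```perl
use strict;
use warnings;

# Hypothetical helper: move every bit $count positions towards bit 0,
# so that bit N of the input becomes bit N - $count of the result.
sub shift_bits_down {
    my ($vector, $count) = @_;
    my $bits = unpack 'b*', $vector;    # '0'/'1' string, bit 0 first
    substr($bits, 0, $count, '');       # drop the lowest $count bits
    return pack 'b*', $bits;
}

my $v = '';
vec($v, 1000, 1) = 1;

my $shifted = shift_bits_down($v, 1);
print "bit 999 is ", vec($shifted, 999, 1), "\n";    # bit 999 is 1
```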

debugger pragma

The debugger is implemented in Perl; turning it into a pragma should be easy, but making it work lexically might be more difficult. Fiddling with $^P would be necessary.

use less pragma

Identify areas where speed/memory tradeoffs can be made and have a hint to switch between them.

switch structures

Although we have in core, Larry points to the dormant nswitch and cswitch ops in pp.c; using these opcodes would be much faster.

Cache eval tree


Shrink opcode tables

Optimize away @_

Look at the "reification" code in av.c

Prototypes versus indirect objects

Currently, indirect object syntax bypasses prototype checks.

Install HTML

HTML versions of the documentation need to be installed by default; a call to installhtml from installperl may be all that's necessary.

Prototype method calls

Return context prototype declarations


Garbage collection

There have been persistent mumblings about putting a mark-and-sweep garbage detector into Perl; Alan Burlison has some ideas about this.

IO tutorial

Mark-Jason Dominus has the beginnings of one of these.

pack/unpack tutorial

Simon Cozens has the beginnings of one of these.

Rewrite perldoc

There are a few suggestions for what to do with perldoc: maybe a full-text search, an index function, locating pages on a particular high-level subject, and so on.

Install .3p manpages

This is a bone of contention; we can create .3p manpages for each built-in function, but should we install them by default? Tcl does this, and it clutters up apropos.

Unicode tutorial

Simon Cozens promises to do this before he gets old.

Update for 1003.1-2

Retargetable installation

Allow @INC to be changed after Perl is built.

POSIX emulation on non-POSIX systems

Make behave as POSIXly as possible everywhere, meaning we have to implement POSIX equivalents for some functions if necessary.

Rename Win32 headers

Finish off lvalue functions

They don't work in the debugger, and they don't work for list or hash slices.

Update sprintf documentation

Hugo van der Sanden plans to look at this.

Use fchown/fchmod internally

This has been done in places, but needs a thorough code review. Also, fchdir is available on some platforms.

Vague ideas

Ideas which have been discussed, and which may or may not happen.

ref() in list context

It's unclear what this should do or how to do it without breaking old code.

Make tr/// return histogram

There is a patch for this, but it may require Unicodification.

Compile to real threaded code

Structured types

Modifiable $1 et al.

    ($x = "elephant") =~ /e(ph)/;
    $1 = "g"; # $x = "elegant"

What happens if there are multiple (nested?) brackets? What if the string changes between the match and the assignment?
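For comparison, the same edit can already be spelled out by hand with the match offset arrays @- and @+ and an lvalue substr. This is a sketch of the manual equivalent, not a proposal for how an assignable $1 should behave.

```perl
use strict;
use warnings;

my $x = "elephant";

if ($x =~ /e(ph)/) {
    # $-[1] and $+[1] are the start and end offsets of the first group.
    substr($x, $-[1], $+[1] - $-[1]) = "g";
}

print "$x\n";    # elegant
```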

Procedural interfaces for IO::*, etc.

Some core modules have been accused of being overly-OO. Adding procedural interfaces could demystify them.

RPC modules

Attach/detach debugger from running program

With gdb, you can attach the debugger to a running program if you pass the process ID. It would be good to do this with the Perl debugger on a running Perl program, although I'm not sure how it would be done.

Alternative RE syntax module

    use Regex::Newbie;
    $re = Regex::Newbie->new


A non-core module that would use "native" GUI to create graphical applications.

foreach(reverse ...)


    foreach (reverse @_) { ... }

puts @_ on the stack, reverses it putting the reversed version on the stack, then iterates forwards. Instead, it could be special-cased to put @_ on the stack then iterate backwards.
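Until such a special case exists, code that cares about the extra copy can walk the indices backwards itself. A minimal sketch with an invented array:

```perl
use strict;
use warnings;

my @items = qw(a b c d);    # invented sample data
my @visited;

# Iterate from the last index down to 0 without building a reversed list.
for (my $i = $#items; $i >= 0; $i--) {
    push @visited, $items[$i];
}

print "@visited\n";    # d c b a
```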

Constant function cache

Approximate regular expression matching

Ongoing

These items always need doing:

Update guts documentation

Simon Cozens tries to do this when possible, and contributions to the perlapi documentation are welcome.

Add more tests

Michael Schwern will donate $500 to Yet Another Society when all core modules have tests.

Update auxiliary tools

The code we ship with Perl should look like good Perl 5.

Recently done things

These are things which have been on the todo lists in previous releases but have recently been completed.

Safe signal handling

A new signal model went into 5.7.1 without much fanfare. Operations and mallocs are no longer interrupted by signals, which are handled between opcodes. This means that PERL_ASYNC_CHECK now actually does something. However, there are still a few things that need to be done.

Tie Modules

Modules which implement arrays in terms of strings, substrings or files can be found on the CPAN.


Time::HiRes has been integrated into the core.

setitimer and getitimer

Adding Time::HiRes got us this too.

Testing __DIE__ hook

Tests have been added.

CPP equivalent in Perl

A C Yardley will probably have done this by the time you can read this. This allows for a generalization of the C constant detection used in building

Explicit switch statements has been integrated into the core to give you all manner of semantics.


This is


Nick Ing-Simmons has made UTF-EBCDIC (UTR13) work with Perl.


UTF Regexes

Although there are probably some small bugs to be rooted out, Jarkko Hietaniemi has made regular expressions polymorphic between bytes and characters.

perlcc to produce executable

perlcc was recently rewritten, and can now produce standalone executables.

END blocks saved in compiled output

Secure temporary file module

Tim Jenness' File::Temp is now in core.

Integrate Time::HiRes

This module is now part of core.

Turn Cwd into XS

Benjamin Sugars has done this.

Mmap for input

Nick Ing-Simmons' perlio supports an mmap IO method.

Byte to/from UTF8 and UTF8 to/from local conversion

Encode provides this.

Add sockatmark support

Added in 5.7.1

Mailing list archives,

Bug tracking

Richard Foley has written the bug tracking system at

Integrate MacPerl

Chris Nandor and Matthias Neeracher have integrated the MacPerl changes into 5.6.0.

Web "nerve center" for Perl is what you're looking for.

Regular expression tutorial

perlretut, provided by Mark Kvale.

Debugging Tutorial

perldebtut, written by Richard Foley.

Integrate new modules

Jarkko has been integrating madly into 5.7.x

Integrate profiler

Devel::DProf is now a core module.

Y2K error detection

There's a configure option to detect unsafe concatenation with "19", and a CPAN module. (D'oh::Year)

Regular expression debugger

While not part of core, Mark-Jason Dominus has written Rx and has also come up with a generalised strategy for regular expression debugging.

POD checker

That's, uh, podchecker

"Dynamic" lexicals

Cache precompiled modules

Deprecated Wishes

These are items which used to be in the todo file, but have been deprecated for some reason.

Loop control on do{}

This would break old code; use do{{ }} instead.
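The doubled braces work because the inner pair is a bare block, and next is allowed to leave a bare block early. A short sketch with an invented counter:

```perl
use strict;
use warnings;

my $i = 0;
my @seen;

do {{
    $i++;
    next if $i % 2;    # leaves the inner block; the while test still runs
    push @seen, $i;
}} while ($i < 6);

print "@seen\n";    # 2 4 6
```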

Lexically scoped typeglobs

Not needed now we have lexical IO handles.

format BOTTOM

report HANDLE

Damian Conway's text formatting modules seem to be the Way To Go.

Generalised want()/caller()

Named prototypes

These both seem to be delayed until Perl 6.

Built-in globbing

The File::Glob module has been used to replace the glob function.

Regression tests for suidperl

suidperl is deprecated in favour of common sense.

Cached hash values

We have shared hash keys, which perform the same job.

Add compression modules

The compression modules are a little heavy; meanwhile, Nick Clark is working on experimental pragmata to do transparent decompression on input.

Reorganise documentation into tutorials/references

Could not get consensus on P5P about this.

Remove distinction between functions and operators

Caution: highly flammable.

Make XS easier to use

Use Inline instead, or SWIG.

Make embedding easier to use

Use Inline::CPR.

man for perl

See the Perl Power Tools.

my $Package::variable

Use our instead.

"or" tests defined, not truth

Suggesting this on P5P will cause a boring and interminable flamewar.

"class"-based lexicals

Use flyweight objects, secure hashes or, dare I say it, pseudo-hashes instead.


ByteLoader covers this.

Lazy evaluation / tail recursion removal

List::Util in core gives some of these; tail recursion removal is done manually, with goto &whoami;. (However, MJD has found that goto &whoami introduces a performance penalty, so maybe there should be a way to do this after all: sub foo {START: ... goto START; is better.)
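The two manual styles mentioned here look roughly like this. Both subs are invented examples that sum the integers from 1 to $n; neither is taken from MJD's measurements.

```perl
use strict;
use warnings;

# Tail call via goto &sub: the current call frame is replaced.
sub sum_with_goto_sub {
    my ($n, $acc) = @_;
    return $acc if $n == 0;
    @_ = ($n - 1, $acc + $n);
    goto &sum_with_goto_sub;
}

# The same loop written with a label and goto LABEL.
sub sum_with_label {
    my ($n, $acc) = @_;
  START:
    return $acc if $n == 0;
    ($n, $acc) = ($n - 1, $acc + $n);
    goto START;
}

print sum_with_goto_sub(1000, 0), "\n";    # 500500
print sum_with_label(1000, 0), "\n";       # 500500
```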

Make "use utf8" the default

There is a patch available for this (search the p5p archives for the Subject "[EXPERIMENTAL PATCH] make unicode (utf8) default"), but it would be unacceptable because of backward compatibility: scripts could not contain any legacy eight-bit data. It would also introduce a measurable slowdown of at least a few percent, since all regular expression operations would be done in full UTF-8.

