The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.
=head1 TITLE

Parrot FAQ - Frequently Asked Questions

=head1 VERSION

=over 4

=item Revision 0.5 - 04 September 2002

=item Revision 0.4 - 26 August 2002

Fixed up the licensing bits

=item Revision 0.3 - 13 March 2002

Translated to POD and added "Why aren't we using external tool or library I<X>?"

=item Revision 0.2 - 03 December 2001

Added the "Parrot and Perl" section and "Why Re-implement Perl".  Incorporated Dan's Q&A items.

=item Revision 0.1 - 03 December 2001

Adopted from Simon Cozens's article, "Parrot: A Cross-Language Virtual Machine Architecture".

=back

=head1 GENERAL QUESTIONS

=head2 What is Parrot?

Parrot is the new interpreter being designed from scratch to support 
the upcoming Perl6 language. It is being designed as a standalone virtual 
machine that can be used to execute bytecode compiled dynamic languages such as 
Perl6, but also Perl5. Ideally, Parrot can be used to support other dynamic, 
bytecode-compiled languages such as Python, Ruby and Tcl.

=head2 Why "Parrot"?

The name "Parrot" relates to Simon Cozens's L<April Fool's Joke|"LINKS"> where Larry Wall and Guido van Rossum announced the merger of 
the Perl and Python languages.

As penance, Simon spent time as Parrot's lead developer, but he's
gotten better.

=head2 Is Parrot the same as Perl 6?

No. Parrot is an implementation that is expected to be used to execute 
Perl6 programs. The Perl6 language definition is currently (December 2001) being 
crafted by Larry Wall. While the true nature of Perl6 is still unknown, it will 
be substantially similar to Perl as we know it today, and will need a runtime 
system. For more information on the nascent Perl6 language definition, check out 
Larry's L<apocalypses|"LINKS">.

=head2 Can I use Parrot today?

Yes.

Parrot is in the early phases of its implementation.  The primary way to use Parrot is to write Parrot assembly code,
described in L<PDD6|pdds/pdd6.pod>.

You can also create dynamic content within Apache using Ask Bjorn Hansen's mod_parrot module.  You are strongly 
advised that mod_parrot is a toy, and should not be used with any production code.

=head2 Why should I program in Parrot Assembly language?

Lots of reasons, actually.  :^)

=over 4

=item *

All the L<cool kids|"LINKS"> are doing it.

=item *

It's a neat hack.

=item *

You get all the pleasure of programming in assembly language without any of the requisite system crashes.

=back

Seriously, though, programming in Parrot assembly language is an interesting challenge.  It's also one of the best 
ways to write test cases for Parrot.

=head2 When can I expect to use Parrot with a I<real> programming language?

It depends on what you mean by I<real>.  :^)

=over 4

=item *

Leon Brocard has released a proof-of-concept L<Java bytecode to Parrot bytecode|"LINKS"> compiler.

=item *

Gregor Purdy is working on a little language called Jako that targets
Parrot bytecode directly.  (Available with the Parrot distribution.)

=item *

Dan Sugalski and Jeff Goff have started work on compiling Scheme down to Parrot bytecode.  (Available with the Parrot
distribution.)

=item *

Clint Pierce wrote an Integer Basic implementation in parrot
assembly, which is shipped with the parrot distribution, as are a few
example programs. (Including Hunt the Wumpus and Eliza)

=item *

There's a Befunge interpreter in the languages directory

=item *

There's an (ahem)BF interpreter in the languages directory. Be aware
that BF is not, strictly speaking, the language's name, merely its initials.

=item *

There is a prototype Perl 6 implementation in the languages directory
as well, though it's only as complete as the Perl 6 spec. (Which, at
this writing, isn't sufficiently complete)

=back

=head2 What language is Parrot written in?

C.

=head2 For the love of God, man, why?!?!?!?

Because it's the best we've got.

=head2 That's sad.

So true.  Regardless, C's available pretty much everywhere.  Perl 5's in C, so we can potentially build any place Perl 5 builds.

=head2 Why not write it in I<insert favorite language here>?

Because of one of:

=over 4

=item *

Not available everywhere.

=item *

Limited talent pool for core programmers.

=item *

Not fast enough.

=back

=head2 Why aren't you using external tool or library I<X>?

The most common issues are:

=over 4

=item *
License compatibility.

Parrot has an odd license -- it currently uses the same license as
Perl 5, which is the disjunction of the GNU GPL and the Artistic
License, which can be written (Artistic|GPL) for short.  Thus,
Parrot's license is compatible with the GNU GPL, which means you
can combine Parrot with GPL'ed code.

Code accepted into the core interpreter must fall under the same terms
as parrot. Library code (for example the ICU library we're using for
Unicode) we link into the interpreter can be covered by other licenses
so long as their terms don't prohibit this. 

=item *
Platform compatibility.

Parrot has to work on most of Perl 5's platforms, as well as a few of its own.  Perl 5 runs on eighty platforms; 
Parrot must run on Unix, Windows, Mac OS (X and Classic), VMS, Crays, Windows CE, and Palm OS, just to name a few.  
Among its processor architectures will be x86, SPARC, Alpha, IA-64, ARM, and 68x00 (Palms and old Macs).  If 
something doesn't work on all of these, we can't use it in Parrot.

=item *
Speed, size, and flexibility.

Not only does Parrot have to run on all those platforms, but it must also run efficiently.  Parrot's core size is 
currently between 250K and 700K, depending on compiler.  That's pushing it on the handheld platforms.  Any library
used by Parrot must be fast enough to have a fairly small performance impact, small enough to have little impact on
core size, and flexible enough to handle the varying demands of Perl, Python, Tcl, Ruby, Scheme, and whatever else 
some clever or twisted hacker throws at Parrot.

=back

These tests are very hard to pass; currently we're expecting we'll probably have to write everything but the Unicode
stuff.

=head2 Why your own virtual machine?  Why not compile to JVM/.NET?

Those VMs are designed for statically typed languages. That's fine,
since Java, C#, and lots of other languages are statically typed. Perl
isn't.  For a variety of reasons, it means that Perl would run more
slowly there than on an interpreter geared towards dynamic languages.

The .NET VM didn't even exist when we started development, or at
least we didn't know about it when we were working on the design. We
do now, though it's still not suitable.

=head2 So you won't run on JVM/.NET?

Sure we will. They're just not our first target. We build our own 
interpreter/VM, then when that's working we start in on the JVM and/or .NET back 
ends.

=head2 What about I<insert other VM here>

While I'm sure that's a perfectly nice, fast VM, it's probably got
the same issues as do the languages in the "Why not something besides
C" question does. I realize that the Scheme-48 interpreter's darned
fast, for example, but we're looking at the same sort of portability
and talent pool problems that we are with, say, Erlang or Haskell as
an implementation language.

=head2 Why is the development list called perl6-internals?

The mailing list precedes the Parrot joke and subsequent unveiling of
the True Grand Project by a number of months. We've just not gotten
around to renaming the mailing list. We will.

=head1 PARROT AND PERL

=head2 Why re-implement Perl?

Good question.

At The Perl Conference 4.0, in the summer of 2000, Larry Wall L<announced|"LINKS"> that it was time to recreate Perl from the ground up. 
This included the Perl language, the implementation of that language, the community of open source developers who 
volunteer to implement and maintain the language, and the larger community of programmers who use Perl.

A variety of reasons were given for embarking on this project: 

=over 4

=item *

Perl5 is a stable, reliable, robust platform for developing software; it's 
not going away for a long time, even after Perl6 is released. (Proof: Perl4 is 
still out there, no matter how much we all want it to go away.)

=item *

We have the ability to translate Perl5 into Perl6 if necessary. This 
preserves backward compatibility with a large body of existing Perl code, 
which is I<very> important.

=item *

The language can stand some revision: formats don't really belong in the 
core language, and typeglobs have outlived their usefulness. By revising the 
language now, we can make Perl better.

=item *

Some warts really should be removed: C<system> should return I<true> instead of 
I<false> on success, and C<localtime> should return the year, not the year - 1900.

=item *

It would be nice to write the Perl to Bytecode compiler in Perl, instead of 
C. That would make it much easier for Perl hackers to hack on Perl. 

=back

=head2 You want to write the Perl compiler in Perl?

Sure.  Why not?  C, Java, Lisp, Scheme, and practically every other language is self-hoisting.  Why not?

=head2 Isn't there a bootstrapping problem?

No, not really.  Don't forget that we can use Perl 5 to run Perl 5 programs, such as a Perl 5 to Parrot compiler.

=head2 How will Parrot handle both Perl 5 and Perl 6?

We don't know yet, since it depends on the Perl 6 language definition.  But we I<could> use the more appropriate of 
two Perl compilers, depending of whether we're compiling Perl 5 or Perl 6.  Larry has mumbled something about a 
C<package> statement declaring that the file is Perl 5, but we're still not quite sure on how that fits in.

=head2 Is this how Parrot will run Python, Ruby, and Tcl code?

Probably.

=head2 Latin and Klingon too?

No, Parrot won't be twisted enough for Damian.  Perhaps when Parrot is ported to a pair of supercool calcium ions, 
though...

=head2 Huh?

You had to L<be there|"LINKS">.

=head1 PARROT IMPLEMENTATION ISSUES

=head2 What's with the whole register thing machine?

Not much, why do you ask?

=head2 Don't you know that stack machines are the way to go in software?

No, in fact, I don't.

=head2 But look at all the successful stack-based VMs!

Like what?  There's just the JVM.

=head2 What about all the others?

B<What> others?  That's it, unless you count Perl, Python, or Ruby.

=head2 Yeah them!

Yeah, right.  You never thought of them as VMs, admit it.  :^)

Seriously, we're already running with a faster opcode dispatch than any of them are, and having registers just 
decreases the amount of stack thrash we get.

=head2 Right, smarty.  Then name a successful register-based VM!

The 68K emulator Apple ships with all its PPC-enabled versions of Mac OS.

=head2 Really?

L<Really.|"LINKS">

=head2 You're not using reference counting. Why not?

Reference counting has three big issues.

=over 4

=item Code complexity

Every single place where an object is referenced, and every single
place where a reference is dropped, I<must> properly alter the
refcount of the objects being manipulated. One mistake and an object
(and everything it references, directly or indirectly) lives forever
or dies prematurely. Since a lot of code references objects, that's a
lot of places to scatter reference counting code. While some of it
can be automated, that's a lot of discipline that has to be
maintained.

It's enough of a problem to track down garbage collection systems as
it is, and when your garbage collection system is scattered across
your entire source base, and possibly across all your extensions,
it's a massive annoyance. More sophisticated garbage collection
systems, on the other hand, involve much less code. It is, granted,
trickier code, but it's a small chunk of code, contained in one
spot. Once you get that one chunk correct, you don't have to bother
with the garbage collector any more.

=item Cost

For reference counting to work right, you need to twiddle reference
counts every time an object is referenced, or unreferenced. This
generally includes even short-lived objects that will exist only
briefly before dying. The cost of a reference counting scheme is
directly linked to the number of times code references, or
unreferences, objects. A tracing system of one sort or another (and
there are many) has an average-case cost that's based on the number
of live objects. 

There are a number of hidden costs in a reference-counting
scheme. Since the code to manipulate the reference counts I<must> be
scattered throughout the interpreter, the interpreter code is less
dense than it would be without reference counts. That means that more
of the processor's cache is dedicated to reference count code, code
that is ultimately just interpreter bookkeeping, and not dedicated to
running your program. The data is also less dense, as there has to be
a reference count embedded in it. Once again, that means more cache
used for each object during normal running, and lower cache density.

A tracing collector, on the other hand, has much denser code, since
all it's doing is running through active objects in a tight loop. If
done right, the entire tracing system will fit nicely in a
processor's L1 cache, which is about as tight as you can get. The
data being accessed is also done in a linear fashion, at least in
part, which lends itself well to processor's prefetch mechanisms where
they exist. The garbage collection data can also be put in a separate
area and designed in a way that's much tighter and more cache-dense.

Having said that, the worst-case performance for a tracing garbage
collecting system is worse than that of a reference counting
system. Luckily the pathological cases are quite rare, and there are
a number of fairly good techniques to deal with those. Refcounting
schemes are also more deterministic than tracing systems, which can
be an advantage in some cases. Making a tracing collector
deterministic can be somewhat expensive.

=item Self-referential structures live forever

Or nearly forever. Since the only time an object is destroyed is when
its refcount drops to zero, data in a self-referential structure will
live on forever. It's possible to detect this and clean it up, of
course... by implementing a full tracing garbage collector. That
means that you have two full garbage collection systems rather than
one, which adds to the code complexity.

=back

=head2 Could we do a partial refcounting scheme?

Well... no. It's all or nothing. If we were going to do a partial
scheme we might as well do a full scheme. (A partial refcounting
scheme is actually more expensive, since partial schemes check to see
whether refcounts need twiddling, and checks are more expensive than
you might think)

=head1 LINKS

April Fool's Joke: http://www.perl.com/pub/a/2001/04/01/parrot.htm

apocalypses: http://www.panix.com/~ziggy/

cool kids: http://use.perl.org/~acme/journal

Java bytecode to Parrot bytecode: http://archive.develooper.com/perl6-internals@perl.org/msg03864.html

http://www.perl.com/pub/a/2000/10/23/soto2000.html

be there: http://www.csse.monash.edu.au/~damian/papers/#Superpositions

Really.: http://developer.apple.com/techpubs/mac/PPCSoftware/PPCSoftware-13.html

=cut