Module Version: 0.03

NAME

Object::PerlDesignPatterns - Perl architecture for structuring and refactoring large programs

SYNOPSIS

  lynx perldesignpatterns.html
  perldoc Object::PerlDesignPatterns

ABSTRACT

Documentation: Ideas for keeping programs fun to hack on even after they grow large. Object, lambda, hybrid structures, Perl specific methods of refactoring, object tricks, anti-patterns, non-structural recurring code patterns.

DESCRIPTION

PerlDesignPatterns is a free book sporting:

Ideas for keeping programs fun to hack on even after they grow large. Object, lambda, hybrid structures, Perl specific methods of refactoring, object tricks, anti-patterns, non-structural recurring code patterns.

Feel free to jump right in and make corrections, suggestions, ask questions, play editor, or just rant. Start in http://www.perldesignpatterns.com/?TinyWiki to learn about the TinyWiki software, make a page for yourself, play with editing that, perhaps make a link from the GuestLog to your page. The markup language is ASCII based - it couldn't be any easier.

This document is a snapshot of the current state of the Wiki, automatically compiled from hundreds of individual sections by a Perl script.

To cause my poor old server to prepare an up-to-the-minute HTML version of this document, go to http://www.perldesignpatterns.com/assemble.cgi?PerlDesignPatterns.

BUGS

My text hasn't been proofread or spellchecked, with few exceptions. My code hasn't been tested by other people, and has only been tested by myself in a few cases.

Since this project is (at least partially) out of my hands, there is no firm point at which it's finished: the scope is indefinite. Because of this, parts of the document will always be in rough shape, contain inconsistencies, and so on.

The PerlDoc version is compiled by podparser.pl, at http://www.perldesignpatterns.com/podparser.pl?self, from hundreds of little text files. These text files use TinyWiki's markup. This simple ASCII format translates well to HTML, but things are still lost in the translation to PerlDoc. Also, the pod2html that comes with Perl doesn't like to create forward links. The HTML translator loses the leading two underscores on meta-identifiers such as underscore-underscore-PACKAGE-underscore-underscore, and the PerlDoc parser probably does too. I cannot find a way to escape ?'s in POD link tags so that pod2html won't mangle them.

AUTHOR

Scott Walters - scott@illogics.org

TITLE

PerlDesignPatterns

PerlDesignPatterns

"PerlDesignPatterns" in PerlDesignPatterns is a free on-line book and forum. For information about this project and links to download the entire book, see "HomePage" in HomePage, or just click http://wiki.slowass.net/?TinyCGI:assemble.cgi?PerlDesignPatterns - Downloading is highly recommended unless you're contributing to the project. Wget users - fetch "TinyWiki" in TinyWiki:download.cgi instead, and see "HomePage" in HomePage for more info. Novices, intermediate programmers: "Object Nuts and Bolts" is for you. Scroll down.

Introduction

Object Adapter Design Patterns

Experts and advanced programmers: start here.

Object State Patterns

Object Creational Patterns

Object Structure Patterns

Object, Lambda Hybrid Patterns

Relational Data Patterns

Non-Object Patterns

Application Features

Web:

General:

Anti-Patterns

Refactoring

Concepts

Object Nuts and Bolts

Object novices: start here.

Appendices

Other Concepts and Blurbs, Or As Of Yet Unclassified

Meta

All content on this server is copyright 2002, 2003 by "ScottWalters" in ScottWalters, unless otherwise noted. Content credited otherwise is copyright its original author and has been generously made available by them under the same terms as the rest of the project, the "GnuFreeDocumentationLicense" in GnuFreeDocumentationLicense. Member of "CategoryBook" in CategoryBook.


$Id: "PerlDesignPatterns" in PerlDesignPatterns,v 1.225 2003/06/21 00:30:04 httpd Exp $


HomePage

Welcome to "TinyWiki" in TinyWiki, the "PerlDesignPatterns" in PerlDesignPatterns repository!

Here, CPAN's Object::PerlDesignPatterns (http://www.cpan.org/modules/by-module/Object/ PerlDesignPatterns) is crafted by you and me. "PerlDesignPatterns" in PerlDesignPatterns is a free online book, forum, and documentation project at http://savannah.nongnu.org/projects/perlpatbook/.

Quick start: Browse http://wiki.slowass.net/?TinyCGI:perldesignpatterns.html or download http://wiki.slowass.net/?TinyCGI:perldesignpatterns.html.gz .

News

Download PerlDesignPatterns

TinyWiki PerlDesignPatterns Development and Forum

Browsing the Wiki confuses mere mortals. Grab http://wiki.slowass.net/?TinyCGI:perldesignpatterns.html instead for casual reading.

Also Also Wik

"We are now the Knights who say ... Wiki wiki wiki wiki, bih-kang, zoop-boing, zowenzum" - I've been dying to say that ;) - Kurt quoting http://wiki.slowass.net/?MontyPython

What in the Heck?

There is no master site map: this site is itself a web. Some recommended starting points are: "PerlDesignPatterns" in PerlDesignPatterns, http://wiki.slowass.net/?SkipTheIntroduction, http://wiki.slowass.net/?CategoryAntiPattern, "CategoryBook" in CategoryBook, http://wiki.slowass.net/?CategoryConcept, http://wiki.slowass.net/?CategoryRefactoring, http://wiki.slowass.net/?CategoryWiki, "PerlPatternsResources" in PerlPatternsResources.

Why are all the words of all of the links run together? Because that's how you make links! Words written this way get turned into links. Linking to an unknown page creates a new page. See "TinyWiki" in TinyWiki for a jumpstart.
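
As a rough illustration of the run-together-words rule, here is a minimal sketch in Perl. It is not TinyWiki's actual parser: the sub name linkify, the $base parameter, the example.invalid address, and the exact pattern for what counts as a link word are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from TinyWiki's source.
# Words built from two or more capitalized runs (e.g. HomePage)
# become HTML links of the form "<base>?WordName".
sub linkify {
    my ($text, $base) = @_;
    $text =~ s{\b((?:[A-Z][a-z0-9]+){2,})\b}{<a href="$base?$1">$1</a>}g;
    return $text;
}

print linkify("See HomePage or the SandBox for more.", "http://example.invalid/"), "\n";
```

Ordinary capitalized words such as "See" are left alone because they contain only one capitalized run. A real Wiki would also have to decide whether the target page exists yet, which this sketch ignores.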

http://wiki.slowass.net/?WardCunningham started this madness with his original http://wiki.slowass.net/?WikiWiki at http://c2.com.

Feel free to edit pages to make corrections, improvements, editorial comments, ask questions, and so on. Someone will see your changes in http://wiki.slowass.net/?TinyCGI:recent.cgi and answer your questions or touch up your work.

Wikis exist to discuss all topics: see http://wiki.slowass.net/?CategoryWiki for a few others. This site is a tool for collaboration on the "PerlDesignPatterns" in PerlDesignPatterns project. Discussion of Wiki technology, Perl, OO programming in general, and language theory is on topic. You're encouraged to make a page named after yourself (for example, "ScottWalters" in ScottWalters is mine) and link to it off the "GuestLog" in GuestLog - your http://wiki.slowass.net/?PersonalPage need not be on topic. Off-topic text not on your http://wiki.slowass.net/?PersonalPage is likely to be moved there or pruned, not because we don't think it's funny, but because focus is important. See http://wiki.slowass.net/?HowToWrite.

  Have Fun!
  - ScottWalters 

$Id: "HomePage" in HomePage,v 1.156 2003/06/21 07:57:24 httpd Exp $

TinyWiki

What?

A http://wiki.slowass.net/?WikiWiki:WikiWiki style user-editable online area: a knock off of http://wiki.slowass.net/?WikiWiki:WardCunningham's http://wiki.slowass.net/?WikiWiki:WikiWikiWeb at http://c2.com/cgi/wiki, written in under a hundred lines of Perl. The code is available: See below.

In a nutshell, click the graphic on the top of the screen to get back to the "HomePage" in HomePage from anywhere. Feel free to edit pages. Play around in the http://wiki.slowass.net/?SandBox if you want to experiment, then make a "GuestLog" in GuestLog entry. To create a new page:

See http://wiki.slowass.net/?WhyWikiWorks and http://wiki.slowass.net/?WikiFun for more information on Wiki and other Wiki codebases, or keep reading for more about "TinyWiki" in TinyWiki. http://wiki.slowass.net/?TinyWikiFour has links to historic versions and versions unburdened by all of my local parser rules.

How?

How did I write a Wiki in under 100 lines? Not exactly on par with http://wiki.slowass.net/?DamianConway, but I wrote compact code, did the http://wiki.slowass.net/?SimplestThingPossible, and most of all, didn't make any arrangements for modularity, resigning myself to refactor constantly. You could say "TinyWiki" in TinyWiki is a study in constant refactoring. http://wiki.slowass.net/?WriteWhatYouMean.

This version saves documents to CVS, but does tolerate not having it.

See Features below to learn what is available in the way of formatting text, then play with editing in http://wiki.slowass.net/?SandBox. http://ipaterson.ca/wiki/wiki.cgi?FormattingInTinyWiki has a very nice quick reference for "TinyWiki" in TinyWiki formatting.

Why?

Why another Wiki? Because the free Wiki clone I had been using was 4,000 lines long, which is about 3,900 too many. It took ages to load. It was tied to the goofy .dbm format, so I couldn't easily write scripts to import/export. I wanted something easy to hack on. See http://wiki.slowass.net/?TinyWikiMotivation.

Who?

"ScottWalters" in ScottWalters. Just another perl hacker. See http://www.slowass.net/phaedrus/ for more.

Where?

Each script is capable of spitting out its own source code. Think of it as human-assisted-propagation. Want to practise software husbandry?

Be advised - in the spirit of tininess, important things are missing. There is currently no HTML filtering, so users could create obnoxious http://wiki.slowass.net/?JavaScript etc. http://wiki.slowass.net/?WikiWiki has different text processing rules - I didn't find http://wiki.slowass.net/?WikiWiki:WikiWiki's intuitive. Sorry. Pages cannot be completely deleted - that would interfere with fetching previous versions, and a philosophy exists that web pages should never just vanish, but should instead be replaced with a page linking to where the content moved.

In the spirit of http://wiki.slowass.net/?DesignPatterns, I firmly hold true the notion that it is more important to be able to hack features on than to have every conceivable feature, simply because every feature isn't conceivable, and attempts to conceive of them litter the code with thousands of attempts, almost all of which miss the mark. To the degree that it's possible, new features are implemented as separate scripts. I want to push the limit of what is possible. With the advent of http://wiki.slowass.net/?ActiveWikiPages, most features are being implemented as code buried in pages. Some auxiliary scripts may be converted to http://wiki.slowass.net/?ActiveWikiPages.

Features:

Todo:

Install Notes

See http://wiki.slowass.net/?TinyWikiInstall for some notes on installing this software.

Thanks To

"TinyWiki" in TinyWiki uses code from http://wiki.slowass.net/?RandalSchwartz in fogindex.cgi, code from http://wiki.slowass.net/?DougMiles and from Moogle Stuffy Software in diff.cgi, with bug fixes and contributions in wiki.cgi from http://wiki.slowass.net/?AlexSchroeder and other people whose names I hope I remember soon... oops.


The little graphic is meant to be tiled and is courtesy of http://wiki.slowass.net/?ForrestCahoon at http://www.abstractfactory.org/. See http://www.abstractfactory.org/forrest/gallery/backgrounds.html for more backgrounds and more about them. Hint: It's a plot of an x, y function.

See Also: http://wiki.slowass.net/?TinyWikiPresentation, http://wiki.slowass.net/?TinyWikiInstall, http://wiki.slowass.net/?TinyWikiBugs, "GuestLog" in GuestLog, http://wiki.slowass.net/?SandBox, http://wiki.slowass.net/?WikiFun, http://wiki.slowass.net/?TinyWikiMotivation, http://wiki.slowass.net/?VisualizationCompilerGraphs

http://wiki.slowass.net/?CategoryWiki

$Id: "TinyWiki" in TinyWiki,v 1.266 2003/06/22 05:58:43 httpd Exp $

SoftwareQualityLevels

Software, like all things, has quality. Which scenarios describe the projects you've worked on? Which of these are familiar? Which have you overcome through experience?

1. Works when no one is watching

When the requirements are completely out of control, many programmers celebrate even having reached this point.

2. Works if you do it just right

Too many applications, most not written in Perl, make it to this point and stop cold. Forget reusable, this isn't even usable.

3. Trying most things once, it doesn't break

You may be tempted to give a software demo in front of a crowded auditorium at this point. Don't.

4. Other people tried it, and it seems to work

Software released to the community often starts at this point. Before this point, there isn't enough benefit for it to be worthwhile for them to fix your bugs.

5. Been in production for a while, and you're running out of bugs to fix

Most Perl programs quickly shoot to level 5, and stop. Level 5 is a good level. Since it's really about the users, not the developers, Perl has traditionally been great for end users.

6. Other programmers are adding to it, so you made the code understandable

Other programs can incorporate this program into theirs, or vice versa, and benefit from your work.

7. A lot of people are working on it, so you made it modular and well laid out logically

Resistant to damage caused by new features, different requirements, and new programmers. In a lot of ways, like a Spider Plant: fractal, prolific, and cute.

8. It has turned into a generic framework for doing things of this kind, and has been separated from early assumptions

Different products, ones that do the same kind of thing only better or differently, can easily be created based on this class.

9. Hordes of the nit-pickiest people on the net have picked every last nit out of it

College classes are dedicated to exploring your code. Aspiring programmers marvel at the sheer beauty of it.

Most programmers are smart and hard working. Things go wrong mysteriously. Changing requirements stress the design of a program. A program at level 5 can quickly turn into a level 2 program, if people start working on it who don't understand the entire design, or the original design doesn't take into account the direction it takes into the future and no one adapts the design. This is the primary reason to shoot for a level 7 program. Having net-god status thrust upon you and having to live up to it, or attempting to attain net-god status is the primary reason to shoot for level 9. Of course, if the program is a few lines long, none of this amounts to a hill of beans.

Software does not wear out in the traditional sense of machinery with moving parts. However, software is constantly being used in ways its authors never expected (often uncovering errors), and end users are constantly demanding extensions to their software. - http://wiki.slowass.net/?FredBrooks, http://wiki.slowass.net/?TheMythicalManMonth

AboutPerl

Because we don't know how programs will reinvent themselves, we don't know how to design an "Interface" [1], what composite types are involved, and what containment and inheritance hierarchies will look like. In the beginning, we seldom know that a program will grow into this at all!

Perl's easy going attitude and powerful features shine here. After a program has devised a solution to a logic problem, and after it has proved its continued usefulness, we have a route for improvement.

AboutObjects

That's about all there is to it. Now you need just to go off and buy a book about object-oriented design methodology, and bang your forehead with it for the next six months or so. - "PerlDoc" in PerlDoc:perlobj

Objects allow arbitrary arrangements of useful logic. This enables software to scale and to exhibit flexibility both within its development cycle and within the life of a single invocation. Implementations of different facilities can be swapped out not only during development, but while the program is running.
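
To make the point about swapping an implementation while the program is running concrete, here is a small sketch. The class names QuietLogger and LoudLogger and the method name record are invented for this example and do not come from the book.

```perl
use strict;
use warnings;

# Two hypothetical classes that answer to the same method name, "record".
package QuietLogger;
sub new    { return bless {}, shift }
sub record { return "" }                                    # discards the message

package LoudLogger;
sub new    { return bless {}, shift }
sub record { my ($self, $msg) = @_; return "LOG: $msg" }    # keeps the message

package main;

# The calling code only relies on $logger being able to "record".
# Which class does the work can change in the middle of a run.
my $logger = QuietLogger->new;
my $before = $logger->record("starting up");

$logger = LoudLogger->new;                                  # swap the implementation
my $after = $logger->record("starting up");

print "before: '$before'\n";
print "after:  '$after'\n";
```

The calling code never names a class after construction, so replacing the object is enough to change the behavior.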

Objects don't help you finish a boring program quicker. They don't help much with diddling with a bit of code until it works. They don't magically make your programs maintainable and extensible.

Many Perl programmers happily blast OO. I believe every idea has its time and place. Clearly, small scripts aren't the place for OO, and before the code is even working isn't the time. Knowing when and how to use OO means knowing how to benefit from it without it getting in the way.

Conventional wisdom says that you can't graft objects onto an existing design. Perhaps you're already a Perl fan because it lets you break rules. This is one that needs breaking. In Perl, you can indeed bless [2] an existing data structure into object-hood.
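
A minimal sketch of blessing an existing data structure follows. The Customer package, its full_name method, and the sample record are hypothetical and only stand in for whatever structure an older program already has.

```perl
use strict;
use warnings;

package Customer;

# A method written after the fact for a hash that already existed.
sub full_name {
    my $self = shift;
    return "$self->{first} $self->{last}";
}

package main;

# An ordinary hash reference, as older procedural code might have built it.
my $record = { first => "Ada", last => "Lovelace" };

# bless marks the existing structure as belonging to the Customer class.
# The data is neither copied nor changed, so old code that reads
# $record->{first} directly keeps working.
bless $record, 'Customer';

print $record->full_name, "\n";   # method-call syntax now works
print $record->{first}, "\n";     # plain hash access still works
```

Because the hash keeps its contents, procedural code and method calls can coexist while the rest of the program is converted a piece at a time.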

Graphical User Interfaces [3] proved the value of Object Oriented programming: see http://wiki.slowass.net/?PerlPaint. Everything that gets drawn on your screen shares a few similarities: it has an appearance that only it knows how to draw. It can sit inside of another object, just as menus can be in title bars and buttons can be in windows. It can send messages when activated to other objects which control the behavior of the application. Versions of components customized for appearance or behavior could easily be created, extending existing code. Taking advantage of these similarities allowed graphical elements to be mixed and matched, and allowed the application to treat similar elements in the same way. It also allowed complex structures to be arranged at run time and continuously revised as the user moved things around on the screen and opened windows. The possibilities are built in rather than the limits. The gospel spilled out. Large applications and operating systems adopted the tenets. Web programming adopted it after a rash of horrible overgrown "scripts" mostly written by Perl programmers.

Software Engineering has traditionally meant applying the right algorithm for the job. Most University educations focus on understanding algorithms. This is important, and http://wiki.slowass.net/?AdvancedAlgorithmsInPerl, O'Reilly Press, is good reading on the topic. Attention to the overall structure of the program, how the algorithms fit together, and building software with (at least the appearance of) a grand design is the trendy new wave.

With this in mind, let's think of Objects as tools, just like any other Perl shortcut or magic. Remember - There is More Than One Way To Do It.


AboutPatterns

http://wiki.slowass.net/?ObjectOriented programming books tell you what an http://wiki.slowass.net/?ObjectOriented program looks like, and all of the benefits of writing code in this style. Too often, they don't tell you how to arrive at this ideal. The result has been large amounts of code that use OO features, but miss the boat on benefitting from them. Since we're using them strictly for fun and profit, we're going to concentrate on the exact utility of each idea, and when it is useful to apply it.

"DesignPatterns" are parables of good software design. Good parables have a cast of evil creatures, good creatures, and a moral. The follies of evil become evident. Lessons are learned. Sometimes, the evil creatures aren't killed, but change their ways. http://wiki.slowass.net/?DesignPatterns represent an ideal, explain the ideal, and give a path, all in neat little case studies. Think of it as your software bible.

"DesignPatternsElementsOfReusableSoftwareComponents" brought http://wiki.slowass.net/?DesignPatterns to computer science. When it did, it talked about OO constructs exclusively. Since Perl programs combine many other ideas, we're going to extend the concept. Objects are data attached to code; "LambdaClosures" are code attached to data. "Exporting" lets one module alter the world of another. Usually this means adding keywords, but there are few limits. Perl's introspective, http://wiki.slowass.net/?DynamicLanguage capabilities open up a new area of investigation. Perl is multi-paradigmatic, and we should be too.
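
One way to see the contrast between objects and closures is to write the same counter both ways. This is only a sketch: the Counter class, its next_value method, and the make_counter sub are names made up for the illustration.

```perl
use strict;
use warnings;

# Object style: the data lives in a hash, and a package supplies the
# code that works on it. Counter is a hypothetical class.
package Counter;
sub new        { my ($class, $start) = @_; return bless { n => $start }, $class }
sub next_value { my $self = shift; return $self->{n}++ }

package main;

# Closure style: an anonymous sub carries its private variable $n
# with it wherever the code reference goes.
sub make_counter {
    my $n = shift;
    return sub { return $n++ };
}

my $obj     = Counter->new(5);
my $closure = make_counter(5);

print $obj->next_value, " ", $obj->next_value, "\n";   # prints "5 6"
print $closure->(), " ", $closure->(), "\n";           # prints "5 6"
```

Both hold state between calls. The object exposes its data to anything that holds the reference, while the closure's $n can be reached only through the code.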

XXX I apologize for the length of this letter, for I lacked the time to make it shorter. -- Blaise Pascal

Are Design Patterns worth it? Programmers freshly exposed to Design Patterns start building Winchester mansions [4]. The creations themselves could likewise be said to be garish curiosities, Victorian in their own right. The same disease has been noted in programmers first exposed to the http://wiki.slowass.net/?LambdaProgramming style of Scheme and programmers first exposed to http://wiki.slowass.net/?ObjectOriented programming. Creatively applying http://wiki.slowass.net/?ObjectOrientation to a problem quickly degenerates into creatively making everything an object. Soon every variable, operator, condition, state, state transition, record, and connection is an object. Don't laugh. I've read serious texts that have turned state transitions into objects [5]. There is a difference between building an abstraction and abstract building. I'd have to answer the question "no" for most programmers: http://wiki.slowass.net/?DesignPatterns aren't worth it. On the other hand, most programmers don't program Perl. Perl programmers already have well-ventilated feet.

To me, reading http://wiki.slowass.net/?ObjectOriented code is often like reading Atari BASIC (or any other non-procedural BASIC, for that matter). Finding out where values come from is a riddle. Names of objects and constructor prototypes give hints about how things are arranged, which let you wager a guess about where a value probably should come from, which is sometimes where it does come from. The code is a web, and values tend to travel pretty darn far across it. On the other hand, in BASIC, important constants are near the top. I think BASIC wins this one. BASIC programs were proud of their constants: the fact that they were made into variables instead of repeated hardcodes, and placed at the top of the file, let them proudly display them as the easy to change options they are. In OO programming style, something is either adjusted with a GUI preferences screen, or it's a shameful bit of post-war relic.

The bad news is that in order to be cleansed of all sin in this nihilist religion, you need an infinite number of config screens to keep up with the growing number of options of the growing program: there is no upper bound and no end to this race [6]. At some point, things break down, and some foundation must be hardcoded, somewhere. The gentle art of bootstrapping, non GUI editable config files, and compile-time preferences have an enduring place in software. Likewise, the brakes need to be put on OO. Perl programs haven't reached this level of garishness yet. Perl is a humble language, as http://wiki.slowass.net/?LarryWall says, so with some ties to our roots, perspective, and frequent trips to the confessional, it may never become garish. Let's hope.

Systems of Object relationships should never create more complexity than they clear up. This is an important and powerful motive to stop OO-ifying a program at a certain point. OO-ifying a program should make the program shorter, more readable, easier to prototype, cleaner, more robust - everything that OO zealots promised. If it doesn't, it isn't a fault of OO or the OO zealots, it's your fault - you've gone too far.

An important tip of the hat to http://wiki.slowass.net/?MarkJasonDominus goes here. His "'Design Patterns' Aren't" lightning talk voiced some latent objections I couldn't quite formulate. http://wiki.slowass.net/?ChristopherAlexander, author of http://wiki.slowass.net/?PatternLanguage, conceived of design patterns for architecture. His book doesn't tell you what to build, nor how to build it. To quote M. J. Dominus, "The problem Alexander is trying to solve is: How can you distribute responsibility for design through all levels of a large hierarchy, while still maintaining consistency and harmony of overall design?" Convention and communication are key, especially since convention in Perl is purely voluntary. Alexander's book is concerned with the level immediately above and immediately below yours, in addition to what you're doing. To think of space being distributed not only according to boundaries but according to delegation and impact is novel. When designing a city, planning for neighborhoods, public transportation, and intertwined natural areas are smaller scale architectural elements. When designing a school, park, or housing community, they are larger, encompassing architectural elements. Designing a nice whatever is important, but fitting it into the surrounding picture, at the same time thinking of the people who will pick up your work where you leave it, is paramount. This cuts deeper to the heart of encapsulation and delegation than any single programming technique.

Architecture is often seen as a luxury or a frill, or the indulgent pursuit of lily-gilding compulsives who have no concern for the bottom line. -- Pattern Language of Program Design IV

Architects know how to design skylights, and they delegate the actual construction of architectural objects to qualified builders. The primary job of the architect is a creative one: designing something functional that uses standard elements to create custom solutions for unanticipated specifications. This is remarkably similar to the plight of the programmer, barring one difference: programmers have to do the construction themselves. Being bogged down in this labor-intensive discipline can suck time away from contemplating the bigger picture. The mention of a skylight makes an architect giddy as he visualizes the light playing across the open spaces. The mention of a skylight makes a builder sigh as he ponders reinforcing the roof, hanging drywall in the roof, and more trim work. Not only can being bogged down in this level of detail keep programmers from appreciating architectural elements of software, it can keep them from learning about them at all. If that isn't enough, only recently was any effort made to catalog them. To top it all off, clients don't ask for architecturally sound software: they ask for huge amounts of square footage decorated with endless amounts of cheap facade. Design is cast away as an inconvenient nuisance that limits how much software can be churned out how fast. Architects are judged by the quality of their work. Programmers are judged by the quantity of their work.

Architects design stable structures, but they also creatively ply their craft to devise ways to make their customers value the structure more. The structures that pass the test of time are not only the most solid ones, but the most innovative, imaginative, inspirational, and useful.

That being said, it's important to decide what to build, and how to build it, on your own. It is also important to know what is available to build, and the techniques available to do so. Being the designer-constructor, you have to pay your own price for your design errors.

Programmers are, in their hearts, architects, and the first thing they want to do when they get to a site is to bulldoze the place flat and build something grand. - http://wiki.slowass.net/?JoelOnSoftware

AboutFlack

Eventually you wind up with libraries that are more trouble to reuse than rewrite from scratch - http://wiki.slowass.net/?ObjectOrientedDesignHeuristics.

OO isn't real, in the sense that it's an idea. There are seldom litmus tests for presence of ideas. It isn't a feature of a language that makes your program better. Instead, it is a collection of ideas, and facilities in the language, to apply these ideas. I won't ever discuss whether or not a language is an http://wiki.slowass.net/?ObjectOriented language. Early C++ compilers compiled C++ down to C and fed it to a C compiler. This doesn't make C++ any less OO. In fact, no matter what the language and its basic premises, they all run on the same computers and compile down to the same languages that computer processors can understand.

As with anything that is built up too much, results fall short of expectations. While many people are avid believers in OO, others are quick to point out cases where it does more harm than good. Before we do anything else, let's look at exactly what OO is, and what it isn't. A good, hard, honest evaluation will set reasonable expectations. Reasonable expectations will keep everyone happy.

Making the program do its own checking frees you from much of the debugging work. See http://wiki.slowass.net/?TestUnit, "TypeSafety" in TypeSafety, "DesignContract" in DesignContract.

It needn't be. Perl is an idiomatic language and shouldn't change to suit OO's style. See http://wiki.slowass.net/?IdiomaticProgramming.

PlanningIsNpComplete

We begin with the part of the language which defines a town or a community. These patterns can never be "designed" or "built" in one fell swoop - but patient piecemeal growth, designed in such a way that every individual act is always helping to create or generate these larger global patterns, will, slowly and surely, over the years, make a community that has these global patterns in it. - A http://wiki.slowass.net/?PatternLanguage

NP Complete problems take, as far as anyone knows, an amount of time that is exponential relative to the amount of input to complete. Calling something "NP Complete" describes it as a problem not worth trying to solve, or only worth trying to solve in a very approximate fashion. See http://wiki.slowass.net/?MasteringAlgorithmsWithPerl.

Contrived interfaces result from arrogantly believing that every aspect of the design of the program can be anticipated. This is akin to playing out a game of chess without touching a piece. All of the decision making in the world doesn't do a bit of good if it doesn't take reality into account, and reality requires constant probing to understand.

Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work. -- Unknown XXX

OO has been marketed as making planning easy. Planning without feedback is easy but useless. Planning with hypothetical feedback is both difficult and useless. I propose that planning to make design changes is far more important than any other planning you will do. Knowing when and how to restructure code applies equally to procedural and OO code. OO discipline only helps make the process easier.

No feedback means no quality in what you do. A project without a prototype is like a candle without a wick. - http://wiki.slowass.net/?PeterMerel

No feedback means no opportunity for improvement. Old timers blame the disappearance of punch cards for the deterioration in software quality [7]. Using punch cards forces you to stop and think things through. Interactive programming lets you guess your way through, often never really understanding the situation. A language that makes you be explicit about your intentions in great detail is a throwback to punch cards, in a way. Guessing has its place in sounding out theories (and passing exams). Having a compiler that can give you critical feedback may be a good trade off. Not having a product means no feedback - no feedback from the compiler, or from sounding out the design.

The only constant is change.

An assault on large problems employs a succession of programs, most of which spring into existence en route. These programs are rife with issues that appear to be particular to the problem at hand. -- Alan J. Perlis, Foreword, http://wiki.slowass.net/?StructureAndInterpretationOfComputerPrograms.

When asked what the most important tools of an architect are, http://wiki.slowass.net/?FrankLloydWright replied, The eraser in the drafting room and the sledgehammer at the construction site.

Good design comes from bad design eventually, if you learn from your mistakes. This may be the only software engineering manual that doesn't beg and plead with you to "do it right the first time". You have to pick your battles though: for any program, some problems are design flaws, some are design trade-offs. How do you "fix" a trade-off?

See Also: http://wiki.slowass.net/?AccidentalHero, http://wiki.slowass.net/?UseDiagrams, http://wiki.slowass.net/?DeComposition, http://wiki.slowass.net/?TopDownDesign, http://wiki.slowass.net/?BottomUpDesign, http://wiki.slowass.net/?DesignDocuments, http://wiki.slowass.net/?FlowCharts, http://wiki.slowass.net/?DesignPatterns, "AboutTheAuthor" in AboutTheAuthor, "AboutObjects" in AboutObjects

<!--

  <|dave|> you're right. OO can be useful. but the thing is, it gets forced down our throats
  > feel free to edit any page. there is a little "edit" link in the bottom left corner.
  <|dave|> everyone tries to bend a project to make it OO
  <|dave|> when some of them just arent suited to OO
  <uri> |dave|: i didn't force it down my throat. i designed with it and not against it. my project needed polymorphism and OO perl does that.
  <|dave|> few projects suit OO
  <uri> |dave|: wrong. some do. 
  <uri> many do.
  <|dave|> < 50%
  <uri> but many projects are poorly architected in any paradigm
  > |dave| - there was an idea that modeling the project using OO/usecase etc would make the program scalable. that never happened. that failed
    horribly. not only cna't you plan for something that complex, but throwing objets in the mix doesn't help at all.
  <uri> architecture is key. that is shitty all over
  > objects are much better used to clean up existing code incrementally than try to avoid the np-complete problem of predicting the future
  <uri> scrottie: same there. architecture is key. always will be.
  <uri> architecture is not OO. it is making a coherent whole out of parts
  > learning as you go is key. constant injetions of architecture rather than a poorly planned attempt up front is key.
  <uri> no one does architecture at all.
  <|dave|> when i look at OO code, i just cant stand it. so much overhead.... people work in micoarchitecture making so many tiny little
    improvements to squeeze as much performance as possible out of a computer, then people f*ck it away using OO
  <uri> |dave|: you haven't seen good OO code then.
  > |dave| - trying to delegate everything "just in case" makes a special kind of speghetti code. you can't figure out the flow of the program, how
    things are constructed, where values come from - because it is so indirect
  <uri> rare but good OO code makes sense. 
  > good OO code has abstraction removed as often as it has it added, and people hiding behind OO to make their code good aren't willing to do that
  <|dave|> what do you mean scrottie
  <|dave|> about "just in case" i mean
  > if a constructor gets called, and you have to dig through 30 different constructor calls, methods, delegations, etc to figure out where the hell
    the values came from, something is wrong
  > oh
  > people add a lot of delegation and abstraction "just in case it might be useful" to keeping things modular
  > which, as it turns out, is a complete rat race. it is impossible to add enough abstraction up front to have the exact right abstraction you end
    up needing
  <|dave|> yeah
  > then, they refuse to remove any of it
  > leaving a tangled web as bad as any speghetti code
  > spaghetti? i can't spell
  <|dave|> right on
  > PerlDesignPatterns tries to expose aspiring OO programmers to as many *other* ideas as possible
  <|dave|> one reason im anti-OO is that someone has to be

-->

InnerClasses

Synopsis: Related packages can be created where they are defined.

When: Adding another Interface to an object, passing out callbacks, creating helper objects. Moving inheritance, or interfaces, out of your object but not far from it.

  package WebsafeColors;

  sub new { ... };

  sub getIterator {
    my $parentThis = shift;
    return eval {
      package WebsafeColors::Iterator;
      # this mini sub-package only knows how to iterate over our data structure
      # caution: named subs bind $parentThis only once, so every iterator
      # sees the first parent (perl warns: Variable "..." will not stay shared)
      our @ISA = ('Iterator');
      sub new {
        my $type = shift;
        my $this = { currentIndex=>0 };
        bless $this, $type;
      }
      sub hasNext {
        my $this = shift;
        return @{$parentThis->{'colors'}} > $this->{'currentIndex'};
      }
      sub getNext {
        my $this = shift;
        die unless $this->hasNext();
        return $parentThis->{'colors'}->[$this->{'currentIndex'}++];
      }
      __PACKAGE__;
    }->new();
  }

WebsafeColors::Iterator implements all of the functions required to be an instance of Iterator. If something takes an argument, and insists it implement Iterator, it will accept the result of calling getIterator() on a http://wiki.slowass.net/?WebsafeColors object. However, http://wiki.slowass.net/?WebsafeColors itself does not implement these methods, or inherit the base abstract class for Iterators. The package that does is contained entirely inside the getIterator() method of http://wiki.slowass.net/?WebsafeColors. This technique lets you localize the impact of having to provide an interface, and keep code related to supporting that interface together and away from the rest of the code. This supports the basic idea of putting code where it belongs.

When we return a WebsafeColors::Iterator object, that object uses a variable defined lexically inside http://wiki.slowass.net/?WebsafeColors. Since $parentThis is defined lexically (contained inside the block, in this case, the method), the inner package holds a reference to it. If it changes, we see the changes. If the parent is destroyed before the WebsafeColors::Iterator object we return is, this variable will live on until all references are destroyed. This way, we can share data efficiently with our parent. In some situations, it may be better to copy the data before giving it to the inner class, or to use Immutable Objects, explained in Chapter XXX.
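One caveat with sharing a lexical this way: Perl compiles named subs once, so hasNext() and getNext() stay bound to the $parentThis of the first getIterator() call. A minimal sketch of one alternative follows, which hands the parent to the inner class explicitly instead of relying on the closure. The constructor and the 'colors' field layout are assumptions, since the original elides new(), and the Iterator base class is left out to keep the sketch self-contained.

```perl
use strict;
use warnings;

package WebsafeColors;

# Hypothetical constructor: the original elides new(); a 'colors' array is assumed.
sub new {
  my $type = shift;
  return bless { colors => [@_] }, $type;
}

sub getIterator {
  my $parentThis = shift;
  # Pass the parent along so each iterator refers to the object that made it.
  return WebsafeColors::Iterator->new($parentThis);
}

package WebsafeColors::Iterator;

sub new {
  my ($type, $parent) = @_;
  return bless { parent => $parent, currentIndex => 0 }, $type;
}

sub hasNext {
  my $this = shift;
  return @{ $this->{parent}{colors} } > $this->{currentIndex};
}

sub getNext {
  my $this = shift;
  die "no more colors" unless $this->hasNext();
  return $this->{parent}{colors}[ $this->{currentIndex}++ ];
}

package main;

my $warm = WebsafeColors->new('#FF0000', '#FF9900');
my $cool = WebsafeColors->new('#0000FF');
my ($warm_it, $cool_it) = ($warm->getIterator(), $cool->getIterator());

my @seen;
push @seen, $warm_it->getNext() while $warm_it->hasNext();
```

The iterator still holds a reference to its parent, so the parent's data lives on for as long as any iterator does, just as with the lexical version.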

Our Perl implementation could cause problems if two threads contend for the same data structure, even by way of different objects. Thus, if used in a threading environment, the http://wiki.slowass.net/?WebsafeColors and all of its returned inner classes would need to synchronize on the same object for access to the array of colors. Failure to do so would lead to iterators that miss colors, end prematurely, or overrun the array.

"BiDirectionalRelationshipToUnidirectional" in BiDirectionalRelationshipToUnidirectional talks about how "InnerClasses" in InnerClasses may be employed to cleanly build structures of mutually referring objects.

"AdapterPattern" in AdapterPattern is similar to "InnerClasses" in InnerClasses, but the adapter has no access to lexical data, and sits in a separate file. Adapters can be (and usually are) added after the fact, and have the advantage of not requiring tampering with a class to implement. "CurryingConcept" in CurryingConcept talks about creating method-level wrappers to serve as adapters.

An "IteratorInterface" in IteratorInterface is a good use of "InnerClasses" in InnerClasses. Interfaces clutter up a namespace with lots of methods designed to present the data and logic in an object in various ways. The "IteratorInterface" in IteratorInterface encapsulates the requirements, keeping things as neat as possible.

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryNovice, http://wiki.slowass.net/?CategoryIntermediate

See Also

AggregatePattern

Members of a common subclass are each known to have certain methods - that is, they all implement a given interface. These methods return information about the state of that particular object, or make changes to its state. It does happen that an application is concerned with an aggregation, or an amalgamation, of data from several objects of the same type. This leads to code being repeated around the program:

  my $subtotal;
  foreach my $item (@cart) {
    $subtotal += $item->query_price();
  }

  my $weight;
  foreach my $item (@cart) {
    $weight += $item->query_weight();
  }

  # and so on

Representing individual objects, when the application is concerned about the general state of several objects, is an http://wiki.slowass.net/?ImpedenceMismatch. This is a common mismatch: programmers feel obligated to model the world in minute detail then are pressed with the problem of giving it all a high level interface. "LayeringPattern" in LayeringPattern tells us to employ increasing levels of abstraction.

Create an object as a wrapper, using the same API as the objects being aggregated. Speak of objects in terms of the required interface - see "AbstractClass" in AbstractClass. This means using a common type as an entry, while allowing the container to hold others that subclass it or implement it as an interface. Define its accessors to return aggregate information on the objects it contains.

  package Cart::Basket;

  use base 'Cart::Item';

  sub add_item {
    my $self = shift;
    # create the list on first use so the pushed item is actually kept
    my $contents = $self->{contents} ||= [];
    my $item = shift; $item->isa('Cart::Item') or die;
    push @$contents, $item;
    return 1;
  }

  # query_ routines:

  sub query_price {
    my $self = shift;
    my $contents = $self->{contents};
    my $subtotal = 0;
    foreach my $item (@$contents) {
      $subtotal += $item->query_price();
    }
    return $subtotal;
  }

  sub query_weight {
    my $self = shift;
    my $contents = $self->{contents};
    my $weight = 0;
    foreach my $item (@$contents) {
      $weight += $item->query_weight();
    }
    return $weight;
  }

The aggregation logic, in this case, totalling, need only exist in this container, rather than being strewn around the entire program. Less code, less http://wiki.slowass.net/?CodeMomentum, fewer dependencies, more flexibility.

We have an object of base type Cart::Item that itself holds other Cart::Item objects. That makes us recursive and nestable - one basket could hold several items along with another basket, into which other items and baskets could be placed. You may or may not want to do this intentionally, but someone casually calling ->query_price() on your Cart::Basket object won't have to concern himself with this - things will just work.
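To make the nesting concrete, here is a small self-contained sketch of a basket inside a basket. The Cart::Item leaf class, its constructor arguments, and the basket constructor are assumptions made for illustration; the text above does not define them.

```perl
use strict;
use warnings;

package Cart::Item;

# Hypothetical leaf item with a fixed price and weight.
sub new {
  my ($type, %args) = @_;
  return bless { price => $args{price} || 0, weight => $args{weight} || 0 }, $type;
}
sub query_price  { $_[0]->{price} }
sub query_weight { $_[0]->{weight} }

package Cart::Basket;
our @ISA = ('Cart::Item');

sub new { return bless { contents => [] }, $_[0] }

sub add_item {
  my ($self, $item) = @_;
  $item->isa('Cart::Item') or die "add_item requires a Cart::Item";
  push @{ $self->{contents} }, $item;
  return 1;
}

# Aggregate accessors: same names as the items', summed over the contents.
sub query_price {
  my $self = shift;
  my $subtotal = 0;
  $subtotal += $_->query_price() for @{ $self->{contents} };
  return $subtotal;
}

sub query_weight {
  my $self = shift;
  my $weight = 0;
  $weight += $_->query_weight() for @{ $self->{contents} };
  return $weight;
}

package main;

my $inner = Cart::Basket->new();
$inner->add_item(Cart::Item->new(price => 2, weight => 1));
$inner->add_item(Cart::Item->new(price => 3, weight => 4));

my $outer = Cart::Basket->new();
$outer->add_item(Cart::Item->new(price => 10, weight => 2));
$outer->add_item($inner);    # a basket is itself a Cart::Item

my $total        = $outer->query_price();
my $total_weight = $outer->query_weight();
```

The caller of query_price() on the outer basket never learns that one of its "items" is another basket.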

This will break. Unless the advice of "AbstractRootClasses" in AbstractRootClasses is followed and different implementations of the same thing share the same interface, the basket can't confidently aggregate things. Unless the advice of "StateVsClass" in StateVsClass is heeded, "AbstractRootClasses" in AbstractRootClasses will never be achieved: the temptation to draw distinctions between classes that lack certain functions will be too strong. These distinctions run counter to "AbstractRootClasses" in AbstractRootClasses, causing segmentation and proliferation of interfaces for no good reason. This proliferation of types prevents aggregation in baskets and containers. Avoid this vicious cycle. Parrots that don't squawk are still parrots.

"IteratorInterface" in IteratorInterface blurb - aggregation is kind of like iteration in that they both present information gleaned from a number of objects through a tidy interface in one object. While "IteratorInterface" in IteratorInterface deals with each contained or known object in turn, "AggregatePattern" in AggregatePattern summarizes them in one fell swoop.

"ContainerPattern" in ContainerPattern continues (duplicates) this, with more depth, more gotchas, and more references.

Categories

See Also

ContainerPattern

Problem: The goals of "TypeSafety" in TypeSafety and reusable code clash when attempting to reuse containers of other objects.

Solution: Rethink interfaces.

Containers are objects created to hold other objects. Queues (FIFOs), stacks, buffers, shopping carts, and caches all fit this description.

http://wiki.slowass.net/?BreadthFirstRecursion has an example of recursing through a network of objects to find them all, where a queue is used to hold unexplored paths.

"IteratorInterface" in IteratorInterface is an important part of all objects that act as containers in one way or another. It provides a consistent way to loop through that container's contents: any container should be functionally interchangeable with any other for the purposes of inspecting their contents. This employs the ideas of "AbstractRootClasses" in AbstractRootClasses and "AbstractClass" in AbstractClass.

http://wiki.slowass.net/?TemplateClass talks about generators for containers. "TypeSafety" in TypeSafety breaks down when presented with generic, reusable containers that can hold any type of data. If a container only holds one specific type of data, we know any items retrieved from it are of the correct type, and no type errors can occur, but then we can't reuse that container. http://wiki.slowass.net/?TemplateClass follows C++'s ideas of templates, and provides a generic implementation that can create instances tailored to specific data types to enforce safety. http://wiki.slowass.net/?ObjectOriented purists will find this of interest.

http://wiki.slowass.net/?AggregationPattern and "StateVsClass" in StateVsClass talk about other, more present, type issues that crop up when creating containers full of subclasses of a certain type. What if one subclass doesn't do something the superclass does? Model it as state. Null-methods are okay. Don't fork the inheritance to remove a feature. Similar to "IntroduceNullObject" in IntroduceNullObject, but for methods. Hmm. http://wiki.slowass.net/?IntroduceNullMethod?

http://wiki.slowass.net/?ObjectOrientedDesignHeuristics, section 5.19, has an example of a basket that cores fruit. How could this possibly be made general? Anything other than a fruit would need a ->core() method that does nothing, requiring a base class implementing a stub core() to be inherited by all.
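The stub-method idea can be sketched roughly as follows. All of the class names here (Basket::Item, Apple, Bread, Basket) and the core_all() method are hypothetical, chosen only to illustrate a null method in a shared base class; they are not taken from the book referenced above.

```perl
use strict;
use warnings;

package Basket::Item;

sub new      { return bless { cored => 0 }, $_[0] }
sub core     { return 0 }             # null method: most items have nothing to core
sub is_cored { return $_[0]->{cored} }

package Apple;
our @ISA = ('Basket::Item');

# Fruit overrides the stub and records the change as state.
sub core { $_[0]->{cored} = 1; return 1 }

package Bread;
our @ISA = ('Basket::Item');          # inherits the do-nothing core()

package Basket;

sub new { return bless { contents => [] }, $_[0] }
sub add { push @{ $_[0]->{contents} }, $_[1]; return 1 }

# The container treats every item alike; no type checks are needed.
sub core_all {
  my $self = shift;
  my $cored = 0;
  $cored += $_->core() for @{ $self->{contents} };
  return $cored;
}

package main;

my $basket = Basket->new();
my @apples = (Apple->new(), Apple->new());
my $bread  = Bread->new();
$basket->add($_) for @apples, $bread;

my $cored = $basket->core_all();
```

The cost is exactly the one the paragraph above names: every item must inherit from a base class that knows about coring, whether or not coring means anything to it.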

Extract a generic interface:

Containers should maintain relationships between objects they contain when the relationships are too numerous or abstract. An object that is part of a series might have links to the next and previous objects in that sequence:

  package LinkedList::Link;

  sub new { bless { prev => undef, next => undef }, $_[0]; }

  sub next { $_[0]->{next} }
  sub set_next { $_[0]->{next} = $_[1] }

  sub prev { $_[0]->{prev} }
  sub set_prev { $_[0]->{prev} = $_[1] }

See http://wiki.slowass.net/?AccessorsPattern for an explanation of this style of code, if you must. It makes sense for the object's place in the sequence to be part of the object. Each object can point you at the next one, following the "LawOfDemeter" in LawOfDemeter. Should the object be part of two linked lists, or three linked lists, or an arbitrary number of linked lists, no fixed method can be called to determine the "next" object in the sequence, because no assumption can be made about which sequence you're talking about. An accessor would have to exist for previous and next for each sequence the object is part of. It makes more sense to separate the linking from the object. Rather than adding the code to do whatever to LinkedList::Link, LinkedList::Link should delegate to it: see http://wiki.slowass.net/?DelegationConcept. The object would be bare of any linked list logic, though several LinkedList::Link objects may hold a reference to it, and it might be part of an arbitrary number of linked lists, or other data structures. See "ObjectsAndRelationalDatabaseSystems" in ObjectsAndRelationalDatabaseSystems for more on the problems of complex inter-object relationships.
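A rough sketch of that separation: each link wraps a payload, and the payload carries no list logic, so one payload can sit in several lists at once. The item() accessor and the chain() and walk() helpers are assumptions added for illustration, and the list is singly linked to keep the sketch short.

```perl
use strict;
use warnings;

package LinkedList::Link;

# A link holds a reference to its payload; the payload knows nothing of lists.
sub new {
  my ($type, $item) = @_;
  return bless { item => $item, next => undef }, $type;
}
sub item     { $_[0]->{item} }
sub next     { $_[0]->{next} }
sub set_next { $_[0]->{next} = $_[1] }

package main;

# Hypothetical helper: wrap each payload in a link and return the head link.
sub chain {
  my @links = map { LinkedList::Link->new($_) } @_;
  $links[$_]->set_next($links[$_ + 1]) for 0 .. $#links - 1;
  return $links[0];
}

# Hypothetical helper: follow the links and collect the payloads in order.
sub walk {
  my $link = shift;
  my @items;
  for (; $link; $link = $link->next()) { push @items, $link->item() }
  return @items;
}

my ($write, $test, $ship) = map { +{ name => $_ } } qw(write test ship);

# The same three payloads take part in two independent sequences.
my $by_priority = chain($ship, $write, $test);
my $by_schedule = chain($write, $test, $ship);

my @priority = map { $_->{name} } walk($by_priority);
my @schedule = map { $_->{name} } walk($by_schedule);
```

Asking a payload for "the next one" is no longer meaningful, which is the point: the question only makes sense when asked of a link in a particular list.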

http://wiki.slowass.net/?CategoryRefactoring

See Also

DecoratorPattern

Synopsis: Attach additional logic to an existing object.

When: Something about an object needs to change. Objects can have attributes that change something about them.

Decorators provide a flexible alternative to subclassing for extending functionality.

http://wiki.slowass.net/?TheJoyOfPatterns used stacking burger toppings as an example. It's a good example. Let's use taco toppings instead, so we aren't copying them too blatantly. Let's imagine that there is a taco concession in a mall. We won't call it a Mexican restaurant. That would be a stretch. Most of their tacos sit under a heat lamp, pre-made, waiting for someone to order the standard taco. A rash of bowel disrupting bacteria outbreaks brought suspicion on the heat lamps, so people began ordering tacos with and without all kinds of weird toppings in an attempt to foil the pre-making efforts and get a fresh taco. The concession stand management found that the cashiers were making a lot of errors adding up the costs of the toppings, so they complained to the corporate office. The corporate office searched the web for "a programmer that doesn't interview like they are reading from a script and who doesn't design patterns using taco toppings like the last guy", and hired the first person that came up: a Perl programmer! [8].

This programmer could write something like:

  # in a file named Taco.pm:
  
  package Taco;
  use ImplicitThis; ImplicitThis::imply();
  
  sub new { bless { price=>5.95}, $_[0]; }
  sub query_price { return $price; }
  
  # in a file named TacoWithLettuce.pm:
  
  package TacoWithLettuce;
  use ImplicitThis; ImplicitThis::imply();
  @ISA = qw(Taco);
  sub query_price { return $this->Taco::query_price() + 0.05; }
  
  # in a file named TacoWithTomato.pm:
  
  package TacoWithTomato;
  use ImplicitThis; ImplicitThis::imply();
  @ISA = qw(Taco);
  sub query_price { return $this->Taco::query_price() + 0.10; }
  
  # in a file named TacoWithTomatoAndLettuce.pm:
  
  package TacoWithTomatoAndLettuce;
  use ImplicitThis; ImplicitThis::imply();
  @ISA = qw(Taco);
  sub query_price { return $this->Taco::query_price() + 0.15; }

To do it this way, they would have to create a class for each and every topping, as well as each and every combination of toppings! With two toppings this isn't out of hand. With 8 toppings, you've got 256 possible combinations. With 12 toppings, you've got 4096 combinations. Creating a permanent inheritance is the root of the problem here. If we could do something similar, but on the fly, we wouldn't need to write out all of the possible combinations in advance. We could also make the inheritance chain deeper and deeper as we needed to.

  # in a file named Taco.pm:
  
  package Taco;
  use ImplicitThis; ImplicitThis::imply();
  
  sub new { 
    bless { price=>5.95, first_topping=>new Topping::BaseTaco }, $_[0]; 
  }
  sub query_price { return $first_topping->query_price(); }
  sub add_topping {
    my $topping = shift; $topping->isa('Topping') or die "add_topping requires a Topping";
    $topping->inherit($first_topping);
    $first_topping = $topping;
  }
  
  # in a file named Topping.pm:
  
  package Topping;
  # this is just a marker class
  
  # in a file named Topping/BaseTaco.pm:
  
  package Topping::BaseTaco;
  @ISA = qw(Topping);
  
  sub query_price { return 5.95; }
  
  # in a file named Topping/Lettuce.pm:
  
  package Topping::Lettuce;
  @ISA = qw(Topping);
  use ImplicitThis; ImplicitThis::imply();
  sub query_price { return 0.05 + $this->SUPER::query_price(); }
  sub inherit { my $parent = shift; unshift @ISA, $parent; return 1; }
  
  # and so on for each topping...

The astute reader will notice that this isn't much more than a linked list. Since inheritance is now dynamic, we've gotten rid of needing to explicitly create each combination of toppings. We use inheritance and a recursive query_price() method that calls its parent's version of the method. When we add a topping, we tell it to inherit from the last topping (possibly the base taco). When someone calls query_price() on the taco, we pass off the request to our first topping. That topping passes it on down the line, adding them up as it goes.

There are two gotchas here, though. What if we want a taco with extra, extra tomatoes? Topping::Tomato would be told to inherit itself. This would create an endless loop! All tomatoes would have tomatoes as their parent, not just the last one added. Base taco would be forgotten about. The real problem here is that we're modifying the whole class - not just the particular instance of the tomato we added last. This would keep us from using a multithreaded cash register shared by two people, and it would keep us from having two taco orders on the same tab, each with different toppings. Dynamic inheritance is a cool trick, but you must remember that its effects are global. Reserve it for creating objects of a new, unique name, of user specification, and perhaps a few similar applications. See http://wiki.slowass.net/?BeanPattern and "AbstractFactory" in AbstractFactory for more on custom-crafted objects. For some reason, this mess reminds me of "SelfJoiningData" in SelfJoiningData.

For our purposes, though, this won't fly. The linked list approach is the right approach. We need to instantiate individual toppings as objects, so that they each have private data. In this private data, we need to store the relationship: what the topping is topping is an attribute of each topping. See "InstanceVariables" in InstanceVariables for more on keeping data private to an instance of an object.

  # in a file named Taco.pm:
  
  package Taco;
  use ImplicitThis; ImplicitThis::imply();
  
  sub new { bless { price=>5.95, top_topping=>new Topping::BaseTaco }, $_[0]; }
  sub query_price { return $top_topping->query_price(); }
  sub add_topping {
    my $new_topping = shift;
    # put the new topping on top of existing toppings. this new topping is now our top topping.
    $new_topping->top($top_topping);
    $top_topping = $new_topping;
    return 1;
  }
  
  # in a file named Topping.pm:
  
  package Topping;
  use ImplicitThis; ImplicitThis::imply();
  
  sub new {
    my $type = shift;
    bless { we_top=>undef }, $type;
  }
  
  sub top { 
    my $new_topping = shift; $new_topping->isa('Topping') or die "top must be passed a Topping";
    $we_top = $new_topping; 
    return 1; 
  }
  
  # in a file named Topping/BaseTaco.pm:
  
  package Topping::BaseTaco;
  @ISA = qw(Topping);
  sub query_price { return 5.95; }
  
  # in a file named Topping/Lettuce.pm:
  
  package Topping::Lettuce;
  use ImplicitThis; ImplicitThis::imply();
  @ISA = qw(Topping);
  sub query_price { return 0.05 + ($we_top ? $we_top->query_price() : 0); }

There! We finally have something that passes as workable! This solution is good for something where we want to change arbitrary features of the object without the containing object (in this case, taco) knowing beforehand. We don't make use of this strength in this example. The query_price() method of the taco object just passes the request right along, so any math we want can be done. A two-for-taco-toppings-Tuesday, where all toppings were half price on Tuesdays, would show off the strengths of the "DecoratorPattern" in DecoratorPattern. With a press of a button, a new object could be pushed onto the front of the list that defined a query_price() method that just returns half of whatever the query_price() in the next object returns. The important thing to note is that we can stack logic by inserting one object in front of another when using "has-a" relationships.
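The Tuesday special can be sketched along these lines. This version is written in plain Perl with an explicit $self rather than ImplicitThis, keeps prices in cents so the arithmetic stays exact, and invents the Topping::HalfPriceTuesday class and the price_below() helper for illustration.

```perl
use strict;
use warnings;

package Topping;

sub new { my $type = shift; return bless { we_top => undef }, $type }

sub top {
  my ($self, $under) = @_;
  $under->isa('Topping') or die "top must be passed a Topping";
  $self->{we_top} = $under;
  return 1;
}

# Hypothetical helper: price of everything beneath this topping (0 if nothing).
sub price_below {
  my $self = shift;
  return $self->{we_top} ? $self->{we_top}->query_price() : 0;
}

package Topping::BaseTaco;
our @ISA = ('Topping');
sub query_price { return 595 }                          # cents

package Topping::Lettuce;
our @ISA = ('Topping');
sub query_price { return 5 + $_[0]->price_below() }

package Topping::Tomato;
our @ISA = ('Topping');
sub query_price { return 10 + $_[0]->price_below() }

# Hypothetical decorator: returns half of whatever the next object returns.
package Topping::HalfPriceTuesday;
our @ISA = ('Topping');
sub query_price { return $_[0]->price_below() / 2 }

package main;

# Stack each layer on top of the previous one; the newest layer is the entry point.
my $order = Topping::BaseTaco->new();
for my $class (qw(Topping::Lettuce Topping::Tomato Topping::HalfPriceTuesday)) {
  my $layer = $class->new();
  $layer->top($order);
  $order = $layer;
}

my $cents = $order->query_price();
```

Taken literally, "half of whatever the next object returns" halves the base taco as well; a decorator that discounts only the toppings would need to know where the toppings end.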

For yet another approach, see the "AggregatePattern" in AggregatePattern.

For the sake of simplicity and clarity, each of these approaches has a different API. There is no reason they couldn't have been done consistently.

See Also

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryNovice, http://wiki.slowass.net/?CategoryIntermediate

ProxyPattern

Problem: Objects talk to each other using an interface that has been overburdened with the needs of security, access coherence, or historic versions of the interface.

Solution: Move access-centric features of the interface into a Proxy object. Put it in charge of security, or implement the translation from the historic interface there, or use it to enforce access coherency.

The Proxy Object is the granddaddy of all encapsulation patterns due to its sheer lack of scope. Any other delegation pattern is just a special case of this general case. The Problem/Solution lines list some possible uses, but they could just as well be phrased "one object demands too much of another - have the second handle some of the work and delegate the rest".

A Proxy inherits the same base class or interface as the object it contains. It can be a generic proxy, that wraps arbitrary objects, or it could be custom crafted to stand in for a certain class.

  package GenericProxy;

  our $AUTOLOAD;

  sub new {
    my $type = shift;
    my $this = { };
    my $obj = shift; ref $obj or die;
    $this->{'obj'} = $obj;
    $type .= '::' . ref $obj;
    # copy inheritance info, keeping GenericProxy first so AUTOLOAD can be found.
    @{$type.'::ISA'} = (__PACKAGE__, @{ref($obj).'::ISA'});
    bless $this, $type;
  }

  # bug XXX - autoload is only used after @ISA is searched!

  sub AUTOLOAD {
    my $this = shift;
    (my $methodName) = $AUTOLOAD =~ m/.*::(\w+)$/;
    return if $methodName eq 'DESTROY';
    $this->{'obj'}->$methodName(@_);
  }

This simple idea has many uses:

Other ideas, such as the "FacadePattern" in FacadePattern, are based on this. This Pattern supports the idea of encapsulation.
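As one illustration of putting a proxy in charge of security, here is a minimal sketch of a proxy that forwards only whitelisted methods. The Account class and the AccessProxy name are hypothetical, and unlike the generic proxy above this one does not copy any inheritance information.

```perl
use strict;
use warnings;

package Account;                 # hypothetical class to be protected

sub new      { return bless { balance => 100 }, $_[0] }
sub balance  { return $_[0]->{balance} }
sub withdraw { my ($self, $n) = @_; $self->{balance} -= $n; return $self->{balance} }

package AccessProxy;             # hypothetical proxy: forwards permitted methods only

our $AUTOLOAD;

sub new {
  my ($type, $obj, @allowed) = @_;
  ref $obj or die "AccessProxy requires an object";
  return bless { obj => $obj, allowed => { map { $_ => 1 } @allowed } }, $type;
}

sub AUTOLOAD {
  my $this = shift;
  my ($methodName) = $AUTOLOAD =~ m/.*::(\w+)$/;
  return if $methodName eq 'DESTROY';
  $this->{allowed}{$methodName} or die "$methodName is not permitted";
  return $this->{obj}->$methodName(@_);
}

package main;

my $account = Account->new();
my $guarded = AccessProxy->new($account, 'balance');

my $balance = $guarded->balance();                  # forwarded to the account
my $denied  = !eval { $guarded->withdraw(50); 1 };  # refused by the proxy
```

Code holding only the proxy can read the balance but has no way to change it, and the Account class itself carries no access logic.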

http://wiki.slowass.net/?AccessCoherency requirements touch on "AccumulateAndFire" in AccumulateAndFire.

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryNovice

See Also

AdapterPattern

Problem: Code will work with one kind of object, but there is another kind of object that should be able to be used in its place, that should work, but doesn't. Two Interfaces are incompatible implementations of the same idea. Using vendor products interchangeably. Or, an object that requires one kind of object, when it should accept several different kinds.

Solution: Translate one interface to the other using a dedicated Adapter object.

The Adapter is a case of the "ProxyPattern" in ProxyPattern. It isn't even a special case. You could call it an example of a Proxy. Or vice versa.

One object requires a certain type of object. You have another object that provides an interface. You want to use them together. You could subclass one of the objects, but you'd lose polymorphism, unless all subclasses and compatible objects were subclassed individually as well - which reeks of the http://wiki.slowass.net/?BridgePattern or parallel inheritance hierarchies. Yuck. Instead, make a generic container that is accepted by any of the first class, and contains anything derived from the second class, which translates between the two disparate interfaces.
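A minimal sketch of such a translating container, borrowing the query_price() interface used elsewhere in this document. Vendor::Product, its cost_in_cents() method, Cart::VendorAdapter, and the subtotal() client are all hypothetical names made up for the example.

```perl
use strict;
use warnings;

package Vendor::Product;         # hypothetical third-party class with its own interface

sub new           { my ($type, $cents) = @_; return bless { cents => $cents }, $type }
sub cost_in_cents { return $_[0]->{cents} }

package Cart::VendorAdapter;     # hypothetical adapter exposing query_price()

sub new {
  my ($type, $product) = @_;
  # Accept anything that speaks the vendor interface, subclasses included.
  $product->can('cost_in_cents') or die "cannot adapt this object";
  return bless { product => $product }, $type;
}

# Translate the call and the units: cents in, dollars out.
sub query_price { return $_[0]->{product}->cost_in_cents() / 100 }

package main;

# Client code written against query_price() only.
sub subtotal {
  my $sum = 0;
  $sum += $_->query_price() for @_;
  return $sum;
}

my @products = (Vendor::Product->new(250), Vendor::Product->new(150));
my $total    = subtotal(map { Cart::VendorAdapter->new($_) } @products);
```

Neither the vendor class nor the client code had to change, and any subclass of the vendor class can be wrapped by the same adapter.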

XXX desperately needs a diagram here

I can't think of an example that doesn't insult the intelligence. I'll have to look for one in the wild.

XXX Discussion

XXX Code

http://www.pobox.com/~schwern/talks/Design_Patterns/full_slides/slide017.html - http://wiki.slowass.net/?MichaelSchwern's version from his Design Patterns talk

"InnerClasses" in InnerClasses are often used as Adapters. In Java, there is no way to pass a closure, a subroutine pointer, or any other first class object other than an actual object. Java 1.0 required you to create a class for each and every http://wiki.slowass.net/?CallBack you needed. [9] This was clearly unworkable. Java 1.1 eased the matter by allowing these objects to be defined with a short hand syntax, and allowed the definition to be placed in your code right where they are passed. See "InnerClasses" in InnerClasses for more information.

Categories

See Also

FacadePattern

Problem: A class is unwieldy to use. You don't want to be tied to that interface or implementation. Your code is becoming closely tied to a class that you don't like, or you spend a lot of time dealing with a difficult interface, or several programmers on your team have to learn a complex subject to accomplish a few simple tasks.

Solution: Write a new interface to it that translates between your simple requests and perhaps automates tedious things you do frequently.

Normally, you write for the interface of the class that you're using today, and if you have to use a different class tomorrow, you write a Proxy. With a poor or overly complex interface, you may wind up writing for a complex interface, then writing a Proxy to translate that back to a simple interface. A Facade is a neutral ground. It allows you to shed all of the related undesired complexity should you switch classes. You can replace it with a new Facade that translates the simple interface of the first facade to the simple interface of the replacement class.

A "DecoratorPattern" in DecoratorPattern adds complexity to the class it stands in for; a "FacadePattern" in FacadePattern mitigates complexity. Both are cases of the "ProxyPattern" in ProxyPattern.

Conceivably, you could replace one package with a horrible interface with another package with a horrible interface. In this case, you would need to stick in an equally complex Facade, but the code using the interface could remain blissfully ignorant of the whole ordeal.
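A small sketch of the idea, assuming a hypothetical Verbose::Mailer class whose methods must be called in a fixed order, and a hypothetical Simple::Mail facade that reduces the common case to a single call. The mailer only records what it was asked to do, so the sketch runs without a network.

```perl
use strict;
use warnings;

package Verbose::Mailer;         # hypothetical unwieldy class: many steps, fixed order

sub new           { return bless { history => [] }, $_[0] }
sub _note         { push @{ $_[0]->{history} }, $_[1]; return 1 }
sub open_session  { return $_[0]->_note("open $_[1]") }
sub set_sender    { return $_[0]->_note("from $_[1]") }
sub add_recipient { return $_[0]->_note("to $_[1]") }
sub set_body      { return $_[0]->_note("body $_[1]") }
sub transmit      { return $_[0]->_note("transmit") }
sub close_session { return $_[0]->_note("close") }
sub history       { return @{ $_[0]->{history} } }

package Simple::Mail;            # hypothetical facade: one call for the common case

sub new {
  my ($type, $host) = @_;
  return bless { host => $host, mailer => Verbose::Mailer->new() }, $type;
}

sub deliver {
  my ($self, $from, $to, $body) = @_;
  my $mailer = $self->{mailer};
  # The tedious sequence lives here and nowhere else.
  $mailer->open_session($self->{host});
  $mailer->set_sender($from);
  $mailer->add_recipient($to);
  $mailer->set_body($body);
  $mailer->transmit();
  $mailer->close_session();
  return 1;
}

sub steps_taken { return $_[0]->{mailer}->history() }

package main;

my $mail = Simple::Mail->new('mail.example.com');
$mail->deliver('me@example.com', 'you@example.com', 'lunch?');
my @steps = $mail->steps_taken();
```

If Verbose::Mailer were swapped for another package with a different but equally awkward interface, only deliver() would need rewriting.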

[10]

http://www.theperlreview.com/Articles/v0i4/facade.pdf - "The Facade Design Pattern" by brian d foy, The Perl Review, v0 i4, http://www.theperlreview.com

Credits: http://wiki.slowass.net/?GangOfFour

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryNovice

See Also

ResultObject

Problem: Polymorphic objects (interchangeable objects) pass sets of information to each other and return it back to each other. When passed in array form, it is difficult to add or remove arguments, and optional arguments require unsightly placeholders.

Solution: Rather than maintain the method calls and returns in all of the calling and callee objects, you put the results in a new, intermediate object type.

When you rename, insert, or delete a passed or returned parameter, you have to change dozens of objects.

Using an intermediate object to hold the results lets you add fields without breaking code anywhere. Deleting or changing a member of the result only affects places actually using that property, and opens the possibility of backwards compatibility catering to accesses to the old field. Contrast this to the horror of positional arguments in a method call:

  $foo->do($arg, $str, $bleah, $blurgh);

Should the arguments do() accepts be changed, every place it is called would need to be changed as well to be consistent. Failure to do so results in no warning and erratic bugs. "TypeSafety" in TypeSafety helps, but this is still no compile time check - missing a call can lead to a program-killing bug.
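By way of contrast, here is a minimal sketch of a result object. The Shipping::Quote class, its fields, and the quote_shipping() function with its made-up pricing rule are all hypothetical; the point is only that fields can be added, and an old name kept alive, without touching callers.

```perl
use strict;
use warnings;

package Shipping::Quote;         # hypothetical result object

sub new {
  my ($type, %fields) = @_;
  # Defaults first, so adding a field later does not break older producers.
  return bless { carrier => undef, cost => 0, days => undef, %fields }, $type;
}

sub carrier { return $_[0]->{carrier} }
sub cost    { return $_[0]->{cost} }
sub days    { return $_[0]->{days} }

# Backwards compatibility: an older accessor name answered by the current field.
sub price   { return $_[0]->cost() }

package main;

# Named arguments in, result object out; no positions for callers to get wrong.
sub quote_shipping {
  my %args = @_;
  my $cost = 5 + 2 * $args{weight};      # made-up pricing rule
  return Shipping::Quote->new(carrier => 'ground', cost => $cost, days => 3);
}

my $quote = quote_shipping(weight => 4);
my $cost  = $quote->cost();
```

A caller that only ever asks for cost() is unaffected when a new field such as days is introduced or when another field is renamed.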

Credits: http://wiki.slowass.net/?GangOfFour

See Also

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryNovice

VisitorPattern

Problem: The operations that can be performed on a type of object are poorly defined, and always changing. Objects contain large numbers of unrelated methods that perform some sort of logic.

Solution: Instead of continuously revising the objects themselves, put the logic into interchangeable (polymorphic) Visitor objects. Use a fixed interface between the objects containing data and the objects defining the behavior.

Data is contained in objects of a certain class or subclass. Many operations can be performed on objects of this class.

The actual operation to be done becomes pluggable. This fits with putting code where it belongs. Infocom, famous for its text adventure games with extremely intelligent natural language parsers, used a permutation of this idea. Any action you wished to perform was stated as a sentence. The parser picked out the verb, direct object, and indirect object. In three rounds, the verb was invoked, then the direct object, then the indirect object. In the first round, each object was given a chance to veto the action: perhaps the verb object checked to see if the environment was tagged as being underwater, or the direct object knew for a fact that the material it is made of is non-flammable, or the indirect object was a torch that vetoed the action because it knew it wasn't lit. If not vetoed, this is repeated for a round where changes actually take effect and objects update their state, then for a final round where each object reports on the consequences of the action. In this case, a container object holding information about the sentence is acted upon by three pluggable objects: the verb's Visitor, the direct object's Visitor, and the indirect object's Visitor. Another example would be a porridge container acted upon by three different bear Visitor objects.

[11]

Use different objects' logic to work on our data. As Perl gives us dynamic inheritance, adding and removing classes from our @ISA array could have the same effect. We simply inherit from the class that accesses our data the way we want, when we want. When methods defined in the Visitor object are called, they are presented with all of our data, saving the bother of querying each item individually. This still requires a clean, well defined interface: which methods need to be defined, and how the data is represented. As a disadvantage, this approach rules out changing how we store the data while maintaining compatibility through the interface.

The Visitor name emphasises that the objects implementing behavior and the object containing data have no real relationship with each other: neither holds on to a reference to the other. They are merely interchangeable parts, to be here today and gone tomorrow.

Borrowing from the http://patternsinperl.com/designpatterns/visitor/ example by http://wiki.slowass.net/?NigelWetters, data items are coerced into a common superclass. This isn't clean object design. It is always better to fix problems at the source rather than lurk in wait wielding band-aids [12]. The example does serve to illustrate that data items should be of a common base type to be accepted by a Visitor.

  foreach my $class ( qw(NAME SYNOPSIS CODE) ) {
          no strict 'refs';
          push @{ "POD::${class}::ISA" }, "POD::POD";
  }

Not having to use a different method call in each behavior object is key. That would prevent us from using them interchangeably. It would introduce a need for hardcoded dependencies. We would no longer be able to easily add new behavior objects. Assuming that each behavior object has exactly one method, each method should have the same name. Something generic like ->go() is okay, I suppose. Naming it after the data type it operates on makes more sense, though. If there is a common theme to the behavior objects, abstract it out into the name. ->top_taco() is a fine name.

  package Taco::Topper;

  sub top_taco {
    my $self = shift;
    die "we're an abstract class, moron. use one of our subclasses" if ref $self eq __PACKAGE__;
    die "method strangely not implemented in subclass";
  }

  sub new {
    my $class = shift;
    bless [], $class;
  }

  package Taco::Topper::Beef;

  our @ISA = ('Taco::Topper');  # inherit new() from the abstract class

  sub top_taco {
    my $self = shift;
    my $taco = shift;
    if($taco->query_flags()) {
       die "idiot! the beef goes on first! this taco is ruined!";
    }
    $taco->set_flags(0xdeadbeef);
    $taco->set_cost($taco->query_cost() + 0.95);
  }

  package Taco::Topper::Cheese;

  our @ISA = ('Taco::Topper');

  sub top_taco {
    my $self = shift;
    my $taco = shift;
    if(! $taco->query_flag(0xdeadbeef) and ! $taco->query_flag(0xdeadb14d)) {
      # user is a vegetarian. give them a sympathy discount because we feel
      # bad for them for some strange reason, even though they'll outlive us by 10 years
      $taco->set_cost($taco->query_cost() - 1.70);
    }
    $taco->set_flags(0xc43323);
    $taco->set_cost($taco->query_cost() + 0.95);
  }

  package Taco::Topper::Gravy;

  # and so on...

Gravy? On a taco? Yuck! In real life, places in the mall that serve "tacos" also tend to serve fries, burgers, hotdogs, and other dubiously non-quasi-Mexican food. It doesn't make sense to have one vat of cheese for the nachos, another for tacos, and yet another for cheesy-gravy-fries. The topper should be able to apply cheese to any of them. Keep in mind that these behavior classes work on a general class of objects, not merely one object. A burger could be a subclass of a taco. See "StateVsClass" in StateVsClass for some thoughts on what makes a good subclass.

The taco object could then do something vaguely along the lines of...

  $topping_counter->get_cheese_gun()->top_taco($self);

... where $topping_counter holds our different topping guns, and get_cheese_gun() returns a cached instance of Taco::Topper::Cheese. This creates a sort of a cow-milking-itself problem. The taco shouldn't be cheesing itself, some other third party should make the connection. Assuming that the topping counter has been robotized and humans enslaved by the taco craving robots, perhaps the topping counter could cheese the taco. [13].

Taco::Topper's strange die() calls give a prime example of run time interface checking versus compile time interface checking. Perl does this at run time, Java at compile time. Since the Java compiler would catch either of those errors, no run time checks are needed - those die() calls could go away. Also, the program wouldn't need to be thoroughly tested to find out if those die() calls ever happen - once again, it would be caught at compile time.

The "VisitorPattern" in VisitorPattern is a special case of "FeatureEnvy" in FeatureEnvy: we're more concerned about another object's data than our own. This flies in the face of the first rule of http://wiki.slowass.net/?ObjectOriented programming: data and related code should be packaged together. "FeatureEnvy" in FeatureEnvy suggests that perhaps the code should just be moved into the object being tweaked. In this case, we've been there, didn't like it, and moved it, but abstracted it behind an interface. The alternatives would have been http://wiki.slowass.net/?MixIns or something far worse. The first rule of http://wiki.slowass.net/?ObjectOriented programming is that anything is okay if it's hidden behind an interface.

The important thing to remember is that we can cheese things as long as they provide an interface that allows cheesing. In this example, that interface is query_flags(), query_flag(), set_flags(), query_cost(), and set_cost().
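To make the interface point concrete, here is a small runnable sketch. The Food, Nachos, and Topper::Cheese classes are hypothetical stand-ins invented for illustration, and flags are kept in a hash rather than as the bit masks used above. The topper only relies on the flag and cost methods, so it can cheese anything that provides them.

```perl
use strict;
use warnings;

# Hypothetical data object exposing only the interface the topper relies on.
package Food;

sub new        { my $class = shift; return bless { flags => {}, cost => 0 }, $class }
sub query_flag { my ($self, $flag) = @_; return exists $self->{flags}{$flag} }
sub set_flags  { my ($self, $flag) = @_; $self->{flags}{$flag} = 1; return $self }
sub query_cost { my $self = shift; return $self->{cost} }
sub set_cost   { my ($self, $cost) = @_; $self->{cost} = $cost; return $self }

package Nachos;
our @ISA = ('Food');

# Visitor: the same cheese logic works on any object with the Food interface.
package Topper::Cheese;

sub new { my $class = shift; return bless [], $class }

sub top {
  my ($self, $food) = @_;
  $food->set_flags('cheese');
  $food->set_cost($food->query_cost() + 0.95);
  return $food;
}

package main;

my $cheese_gun = Topper::Cheese->new();
my $nachos     = Nachos->new();
$cheese_gun->top($nachos);
printf "%.2f\n", $nachos->query_cost();
```

Neither object keeps a reference to the other after top() returns, which matches the "here today, gone tomorrow" relationship described earlier.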

Credits: http://wiki.slowass.net/?GangOfFour

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryIntermediate

See Also

ClassAsTypeCode

Problem: Values from a definitive list of permissible values are needed. In Perl, hashes of possible valid values are commonly used, and enums are used in C. These permissible values must be packaged with their behavior [14], or we're trying to apply this idea in an http://wiki.slowass.net/?ObjectOriented way. Or, each object is a special http://wiki.slowass.net/?MagicCookie: unique, impossible to recreate without being given it, and therefore later usable as proof of having been given the cookie.

Solution: Centralize creation, containment, and distribution of the objects.

The container of the objects also plays the roles of both the creator and distributor. The creator aspect makes one of each when it itself is created, like the "SingletonPattern" in SingletonPattern applied to multiple objects. The distributor aspect decides to whom and on what basis the objects are distributed.

The idea of "TypeSafety" in TypeSafety allows us to validate that these objects probably came from our pool without having to have an explicit list of all of the members of the pool:

  # using TypeSafety:

  sub set_day {
    die unless $_[0]->isa('Day');
    $day = shift;
    return 1;
  }

  # using a plain old hash:

  sub set_day {
    die unless exists $daysref->{$_[0]};
    $day = shift;
    return 1;
  }

Everything from this set passes the "isa" test, so we can use "TypeSafety" in TypeSafety to check our arguments. In many other languages, it would be impossible to add to the set after it was created this way, but in Perl we could revisit the package (see "RevisitingNamespaces" in RevisitingNamespaces) or redefine the constructor, so this shouldn't be considered secure.

  package Day;

  # get_id() below assumes ImplicitThis makes the id field available as $id
  use ImplicitThis; ImplicitThis::imply();

  my @days;

  sub new {
    die unless caller() eq __PACKAGE__;   # only this package may create Days
    my $me = { id=>$_[1] };
    bless $me, $_[0];
    push @days, $me;
    return $me;
  }

  sub get_id { return $id };

  sub get_days { return @days; }

  $mon = Day->new('mon');
  $tues = Day->new('tues');

  # in Appointment.pm:

  package Appointment;

  my $day;

  sub set_day {
    die unless $_[0]->isa('Day');
    $day = shift;
    return 1;
  }

XXX examples of use, what you can and cannot do, etc.

Java's API, AWT especially, has numerous examples of this. java.awt.Color contains Color.RED, Color.BLUE, and so forth. This provides a symbolic name for objects, where each object is unique. There will never be two different BLUE objects floating around. This allows us to compare them for equality using their pointers:

  $mon eq $mon;        # true
  $mon eq $tues;       # false

This behavior, too, is shared with the "SingletonPattern" in SingletonPattern. The same effect could be achieved using "OverloadOperators" in OverloadOperators. This approach is simpler and clearer.

If we give someone java.awt.Color.BLUE, and then they later give it back to us, we can use the eq test to decide with certainty whether or not we gave them BLUE as there is no other way they could possibly obtain it [15].
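The following is a hedged, self-contained sketch of the pool idea without ImplicitThis. The %pool hash, the _new() constructor, and the get() distributor are names chosen for illustration only. It shows the three things the text relies on: creation is restricted to the package, every member passes the isa test, and identity can be tested by comparing references.

```perl
use strict;
use warnings;

package Day;

my %pool;   # the only Day objects that will ever exist, keyed by name

# Private constructor: refuses callers from outside this package.
sub _new {
  my ($class, $id) = @_;
  die "Day objects cannot be created directly\n" unless caller() eq __PACKAGE__;
  return bless { id => $id }, $class;
}

# Creator aspect: build the whole set once, at load time.
$pool{$_} = Day->_new($_) for qw(mon tues wed thurs fri sat sun);

# Distributor aspect: hand out the pooled object for a name.
sub get    { my ($class, $id) = @_; return $pool{$id} }
sub get_id { my $self = shift; return $self->{id} }

package main;

my $mon  = Day->get('mon');
my $tues = Day->get('tues');

print "same object\n"      if $mon == Day->get('mon');   # reference equality
print "different object\n" if $mon != $tues;
print "is a Day\n"         if $mon->isa('Day');
```

As the text warns, this is a convention rather than a security boundary: anyone can still reopen the Day package.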

Credits: Unknown! Dates back a long time, though... XXX

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryIntermediate

See Also

StatePattern

Problem: Checks litter the code. Nearly every method checks one specific instance variable to decide how to behave. The possible values of this variable are finite in number and well understood: on and off, or north, south, west, east, for example.

Solution: Make each possible state of the object into a subclass. Leave the general case and the general logic in the parent object. Consider the state variable to be a constant in each subclass and optimize it away in your code.

What happens when a light switch is thrown depends on its current state: on or off. Its new state is the opposite. A light switch has to be capable of dealing with all of the complexities of being either on or off, which isn't a lot of complexity, really. However, some machines have dozens or hundreds of states. This one machine has to know how to be in each state. In reality, few machines serve a large number of purposes. Attempts have been made to combine cell phones and PDAs, cell phones and MP3 players, PDAs and MP3 players, MP3 players and portable storage devices, PDAs and portable storage devices, audio recorders and MP3 players, audio recorders and PDAs, audio recorders and cell phones... in thousands of combinations... but there is not currently an example of all of those things in one device. It is complex to have a pocket full of devices, but it is also complex to design (or to license all of the patents needed to implement) a device that serves every purpose. Design simplicity wins, for now. Likewise, when implementing a complex virtual object, sometimes it is best to represent it as a collection of simple objects, each of which knows exactly what its purpose is and cares nothing for the purposes of the other objects, not even able to agree on a common flash media format. When you wish to switch from one mode of the object to another, you simply replace it with the other object. No complex internal state change occurs, just one broad overall state change. States are each clearly defined and separate.

  package Pocket::Computer;

  sub record_audio {
    # implemented in some subclasses but not others
  }

  sub take_a_memo {
    # that we can do
  }

  sub make_a_call {
    die "don't know how, and the FCC would have a cow";
  }

  package Pocket::Phone;

  sub record_audio {
    # some do, some don't. most don't.
  }

  sub take_a_memo {
    die "i'm not a PDA";
  }

  sub make_a_call {
    # this we can do
  }

Some devices can do some things, others can do other things. Each device does not have to check to see if it is the kind of device that can - it just knows, because that's what it is, and identity is a large part of http://wiki.slowass.net/?ObjectOrientation.

At a certain level of complexity the concept of a http://wiki.slowass.net/?StateChange is introduced. Cars suffer from this complexity. You may go from parked to idling, or you may go from idling to accelerating, but not from parked to accelerating. Going from accelerating to parked is also known as an insurance claim. Each state knows the states that are directly, immediately attainable. http://wiki.slowass.net/?BreadthFirstRecurssion or http://wiki.slowass.net/?DepthFirstRecurssion is needed to plan out anything more complex.
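The car example might be sketched as follows. The Car::State base class and its next_ok() and go() methods are hypothetical names, not taken from the text. Each state class lists the states directly reachable from it, and a transition replaces the current state object with a new one.

```perl
use strict;
use warnings;

# Hypothetical base class: each subclass lists the states reachable from it.
package Car::State;

sub new     { my $class = shift; return bless {}, $class }
sub name    { my $self = shift; return (split /::/, ref $self)[-1] }
sub next_ok { return () }    # overridden in each state

sub go {
  my ($self, $target) = @_;
  die $self->name() . " cannot go directly to $target\n"
    unless grep { $_ eq $target } $self->next_ok();
  return "Car::$target"->new();   # swap in the object for the new state
}

package Car::Parked;       our @ISA = ('Car::State'); sub next_ok { return qw(Idling) }
package Car::Idling;       our @ISA = ('Car::State'); sub next_ok { return qw(Parked Accelerating) }
package Car::Accelerating; our @ISA = ('Car::State'); sub next_ok { return qw(Idling) }

package main;

my $car = Car::Parked->new();
$car = $car->go('Idling');
$car = $car->go('Accelerating');
print $car->name(), "\n";

# Parked to Accelerating is not a direct transition, so it is refused.
my $ok = eval { Car::Parked->new()->go('Accelerating'); 1 };
print "refused: $@" unless $ok;
```

Planning a route through several states (parked to accelerating via idling) would sit on top of this, searching the next_ok() lists.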

XXX - "TinyWiki" in TinyWiki parser as an example

http://wiki.slowass.net/?ConstructorPattern and "ImmutableObject" in ImmutableObject coupled with "AbstractFactory" in AbstractFactory describe an alternative arrangement: when a state change is needed, the existing object is passed as an argument to the factory along with any information needed to decide what the next object will be. The "AbstractFactory" in AbstractFactory returns an "ImmutableObject" in ImmutableObject, initialized with the existing object's data, to replace the existing object. One object is swapped for another not through delegation and a facade, but through an "AbstractFactory" in AbstractFactory that spits out instances of "ImmutableObject" in ImmutableObject.

http://wiki.slowass.net/?WritingPerlModulesForCPAN, page 258, has a very good example of creating a simple web BBS using CGI::Application (http://www.cpan.org/modules/by-module/CGI/). CGI::Application models a user's web experience as a http://wiki.slowass.net/?StateMachine. Each screen is a state that takes you to other states. The state transitions are buttons and so forth on the screens.

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryIntermediate

See Also

MomentoPattern

Problem: Objects are left in an inconsistent state in a failure scenario.

Solution: Checkpoint the object and restore it in the event of failure.

Synopsis: You need an "Undo" behavior. Delegate an object to be the keeper of another.

When: You are starting something you may not be able to finish. An operation might abort, leaving data in an inconsistent state.

Symptoms: Querying values from an object and conditionally restoring them.

XXX Generic example with a deep-copy algorithm.

Easily implemented by wrapping one object inside of another and using Clone.

  package Memento;

  our $AUTOLOAD;   # set by Perl to the name of the missing method

  sub new {
    my $type = shift;
    my %opts = @_;
    die __PACKAGE__ . " requires an object passed on its constructor: new Memento object=>\$obj"
      unless $opts{'object'};
    my $this = { object=>$opts{'object'}, checkPoint=>undef };
    bless $this, $type;
  }

  sub mementoCheckPoint {
    my $this = shift;
    $this->{'checkPoint'} = $this->deepCopy($this->{'object'});
  }

  sub mementoRestore {
    my $this = shift;
    # restore from a copy so the checkpoint itself stays untouched
    $this->{'object'} = $this->deepCopy($this->{'checkPoint'});
  }

  sub AUTOLOAD {
    my $this = shift;
    (my $method) = $AUTOLOAD =~ m/.*::(\w+)$/;
    return if $method eq 'DESTROY';
    return $this->{'object'}->$method(@_);
  }

  sub deepCopy {
    my $this = shift;
    my $ob = shift;
    die unless caller() eq __PACKAGE__; # private
    return $ob if(!ref $ob);
    if(ref $ob eq 'SCALAR') {
      my $value = $$ob; return \$value;
    }
    if(ref $ob eq 'HASH') {
      my %value = map { ( $_, $this->deepCopy($ob->{$_}) ) } keys %$ob; return \%value;
    }
    if(ref $ob eq 'ARRAY') {
      my @value = map { $this->deepCopy($_) } @$ob; return \@value;
    }
    # FILEHANDLE, GLOB, other cases omitted
    # assume it's an object based on a hash
    # XXX man perlfunc say that $ob->isa('HASH') works...?
    my $type = ref $ob;
    my $newself = { };
    foreach my $i (keys %$ob) {
      $newself->{$i} = $this->deepCopy($ob->{$i});
    }
    return bless $newself, $type;   # keep the copy in the original's class
  }

While this is a generic Memento package, it cannot possibly know how to correctly deal with objects contained inside the object given it. A version of this (possibly subclassed) tailored to a specific package would handle this situation correctly. Here, we replicate objects mercilessly. This code also violates the encapsulation rules of OO. Use it as a starting point for something that doesn't.
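For comparison, here is a short sketch of checkpoint and restore that swaps the hand-rolled copier for dclone() from the core Storable module. This is a substitution for illustration, not the approach used above, and the Account object is hypothetical. It has the same encapsulation caveats, since it copies the object's innards from outside.

```perl
use strict;
use warnings;
use Storable qw(dclone);   # core module; dclone() returns a deep copy

# Hypothetical object whose update can fail half way through.
my $account = bless { balance => 100, history => ['opened'] }, 'Account';

my $checkpoint = dclone($account);   # checkpoint before the risky work

my $ok = eval {
  $account->{balance} -= 150;
  push @{ $account->{history} }, 'withdrew 150';
  die "overdrawn\n" if $account->{balance} < 0;
  1;
};

# Restore on failure; copy again so the checkpoint stays reusable.
$account = dclone($checkpoint) unless $ok;

print "$account->{balance}\n";
```

The withdrawal fails after both fields were modified, and the restore brings back the balance and the history together.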

Credits: http://wiki.slowass.net/?GangOfFour

See Also

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryIntermediate

SingletonPattern

Problem: You're using a constructor to create an object, but the design considers it an error to create more than one instance of that class. Or, you have a single instance of an object now, but this is an implementation detail, subject to change. "PassingState" in PassingState says to create the resources as early as needed and pass it to constructors, but you would be passing it almost everywhere.

Solution: Have your constructor, new(), return the same single object every time it is called. Allow objects to call the constructor directly. The Singleton will create the single instance of itself, and will be the repository for that single instance.

Synopsis: You've found a very good reason to have exactly one of a certain class. You rig the constructor to return the single existing instance instead of making a new one.

When: http://c2.com/cgi/wiki?SingletonPattern lists example valid uses as logging, network interaction, and database connections.

Symptoms: Resource objects are created when the program starts and passed to the constructor of each object initially spawned. Each of those objects in turn pass this resource object to each of their children.

Given a http://wiki.slowass.net/?MountRushmore object, you want to be sure that it's the true, one and only, http://wiki.slowass.net/?MountRushmore, and not someone's cheap knock-off.

  package MountRushmore;

  my $oneTrueSelf;

  sub new {
    if($oneTrueSelf) {
      return $oneTrueSelf;
    } else {
      my $type = shift;
      my $this = {presidents =>
        ['George Washington', 'Thomas Jefferson', 'Theodore Roosevelt', 'Abraham Lincoln']
      };
      $oneTrueSelf = bless $this, $type;
      return $oneTrueSelf;
    }
  }

  sub someMethod { ... }

Singletons are a special case of http://wiki.slowass.net/?StaticObjects.
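A brief usage sketch, using a hypothetical App::Log class made up for this example: callers never pass the logger around, they just call new() wherever they need it and always receive the same instance.

```perl
use strict;
use warnings;

# Hypothetical logger: every caller of new() shares the one instance.
package App::Log;

my $instance;

sub new {
  my $class = shift;
  $instance ||= bless { lines => [] }, $class;
  return $instance;
}

sub record { my $self = shift; push @{ $self->{lines} }, @_; return $self }
sub lines  { my $self = shift; return @{ $self->{lines} } }

package main;

App::Log->new()->record('starting up');
App::Log->new()->record('same object, second message');

my @lines = App::Log->new()->lines();
print scalar(@lines), " lines logged\n";
```

Both messages land in the same object even though new() was called three times.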

Don't Use Singletons When...

This pattern is overused. Don't make too many assumptions about whether two of something could ever be handy. For example, the X Window System assumed early on that more than one display could be attached to a system. This pattern should be used to distribute globally available resources. It should not be used to contain context or state information - this would make it impossible to create distinct instances of objects which use the singleton. Singletons should not be http://wiki.slowass.net/?ValueObjects.

Since many programs have a proliferation of Singletons, it may be handy to place all of them in a global Static Object, which itself is a Singleton.

Singletons managing a set of 1 or more objects for which there is contention or sharing is a http://wiki.slowass.net/?ResourcePool.

When a http://wiki.slowass.net/?ValueObject is wanted to hold configuration information, instead use "PassingPattern" in PassingPattern: this allows different instances of objects to be given different runtime parameters. Failure to do so would violate the identity requirement of http://wiki.slowass.net/?ObjectOriented programming, and we wouldn't want that, would we?

http://www.theperlreview.com/Issues/The_Perl_Review_0_1.pdf - brian d foy's article on Singleton in The Perl Review

http://www.perlmonks.com/index.pl?node_id=234123 - a very good description of the dilemma.

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryNovice

Credits: http://wiki.slowass.net/?GangOfFour

See Also

CurryingConcept

[16]

This is based on "TypeSafety" in TypeSafety, which is itself based on http://wiki.slowass.net/?AbstractClasses, or the concept of types it puts forward, rather. We confound the subject with "AnonymousSubroutineObjects" in AnonymousSubroutineObjects. We use "TypeSafety" in TypeSafety, "ClassAsTypeCode" in ClassAsTypeCode and "NewObjectFromExisting" in NewObjectFromExisting. "RunAndReturnSuccessor" in RunAndReturnSuccessor is a fundamental idea to the http://wiki.slowass.net/?LambdaClosure idea of currying, and we demonstrate it in the second example.

Currying is a universe of single argument functions. This sounds absurd and useless, and would be except for the tenets of http://wiki.slowass.net/?LambdaClosures. This pattern develops when state is accumulated incrementally: see "AccumulateAndFire" in AccumulateAndFire. "AccumulateAndFire" in AccumulateAndFire comes about when there are "TooManyArguments" in TooManyArguments to pass all at once. Attempting to pass them all at once loses us the flexibility of being able to set things up, run, change a few things, run, and so on.

For example, let's say we're playing roulette. We can pick a color and perhaps a few numbers.

  package Roulette::Table;

  sub new {

    my $class = shift;
    my $this;

    # if new() is called on an existing object, we're providing additional
    # constructor arguments, not creating a new object

    if(ref $class) {
      $this = $class;
    } else {
      $this = { };
      bless $this, $class;
    }

    # read any number of arguments, of any supported type

    foreach my $arg (@_) {
      if($arg->isa('Roulette::Color')) {
        $this->{'color'} = $arg;
      } elsif($arg->isa('Roulette::Number')) {
        push @{$this->{numbers}}, $arg;
      } elsif($arg->isa('Money')) {
        if($this->{money}) {
          $this->{money}->combine($arg);
        } else {
          $this->{money} = $arg;
        }
      }
    }

    return $this;

  }

  sub set_color { new(@_); }
  sub add_number { new(@_); }
  sub add_wager { new(@_); }

The constructor, new(), accepts any number or sort of object of the kinds that it knows about, and scuttles them off to the correct slot in the object. Our set routines are merely aliases for new(). new() may be called multiple times, directly or indirectly, to spread our wager over more numbers, change which color we're betting on, or plunk down more cash. I don't play roulette - I've probably butchered the example. Feel free to correct it. Use the little edit link. People won't be doing everything for you your entire life, at least I hope.

We still have the problem of having an object exist in an indeterminate state. If we apply "AnonymousSubroutineObjects" in AnonymousSubroutineObjects, we get something much closer to the original idea of currying. Rather than storing state in an object as it is built up, store it in a http://wiki.slowass.net/?LambdaClosure that is object aware:

  package Roulette::Table;

  use MessageMethod;

  sub new {
    my $class = shift;
    my $this = { };
    my $curry;

    bless $this, $class;

    $curry = MessageMethod sub {

        my $msg = shift;
        my $arg = shift;

        if($msg eq 'spin_wheel') {
          die "Inconsistent state: not all arguments have been specified";
        }

        if($msg eq 'set_color') {
          $this->{'color'} = $arg;
        }

        if($msg eq 'add_number') {
          $this->{'numbers'} ||= [];
          my $numbers = $this->{'numbers'};
          push @$numbers, $arg;
        }

        if($msg eq 'add_money') {
          if($this->{'money'}) {
            $this->{'money'}->combine($arg);
          } else {
            $this->{'money'} = $arg;
          }
        }

        if($msg eq 'is_ready') {
          return 0;
        }

        # once everything is specified, hand over the real object
        if($this->{'money'} and $this->{'color'} and $this->{'numbers'}) {
          return $this;
        } else {
          return $curry;
        }

    };

    return $curry;

  }

  sub spin_wheel {
    # logic here...
  }

  sub is_ready {
    return 1;
  }

This second example doesn't support repeated invocations of new() to further define an unfinished object. It could, but it would detract from the example. Add it for backwards compatibility, if for no other reason. More radically, we don't accept any constructor arguments. We return an entirely new object that has the sole purpose of accepting data before letting us at the actual object.

Representing two different states of an object with two different objects is the subject of an ongoing debate as well as "StateVsClass" in StateVsClass.

Rather than using "TypeSafety" in TypeSafety to check the class membership of objects passed in, we could just as easily accept "NamedArguments" in NamedArguments. The choice is a matter of what feels right, and what is adequate without being overkill.

In brief, returning a custom object, partially configured by some argument, ready to either do work or accept more configuration, is the act of currying. More correctly, constructing a function to accept single arguments and return another function, or converting an existing function to such, is currying.

  sub create_roulette_table {
    my $color;
    my $money;
    my $numbers;
    return sub {
      $color = shift;
      return sub {
        push @$numbers, shift;
        return sub {
          $money = shift;   # money comes last, so a table can be kept around without it
          return sub {
            # play logic here
          };
        };
      };
    };
  }

  # to use, we might do something like:

  my $table = create_roulette_table()->('red')->(8)->(500);
  $table->(); # play
  $table->(); # play again

  # or we might do something like:

  my $table_no_money = create_roulette_table()->('red')->(8);
  my $table;
  $table = $table_no_money->(100);
  $table->(); # play
  $table->(); # play again -- oops, lost everything
  $table = $table_no_money->(50);
  $table->(); # play some more

This is stereotypical of currying as you'd see it in a language like Lisp. The arguments are essentially untyped, so we take them one at a time, in a specific order. Also like Lisp, the code quickly migrates across the screen then ends abruptly with a large number of block closes (the curly brace in Perl, parenthesis in Lisp). The Lisp version makes heavy use of "RunAndReturnSuccessor" in RunAndReturnSuccessor. If we wanted to adapt this logic to spew out http://wiki.slowass.net/?GeneratedMethods, where each method generated wasn't tied to other generated methods, we would need to explicitly copy the accumulated lexical variables rather than simply binding to them. For example, my $color = $color; my $money = shift; would prevent each anonymous routine returned from sharing the same $color variable, although without further logic, they would all have the same value. This amounts to the distinction between instance and class data.
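The difference between binding to a lexical and copying it can be seen in a small sketch. The make_shared() and make_independent() functions are hypothetical names used only here. In the first, both generated routines close over the same $money; in the second, each copies the accumulated value into its own lexical first.

```perl
use strict;
use warnings;

# Both closures bind to the same lexical, so they share its state.
sub make_shared {
  my $money = shift;
  return (sub { $money -= shift; return $money },
          sub { $money -= shift; return $money });
}

# Each closure first copies the accumulated value into its own lexical.
sub make_independent {
  my $money = shift;
  return map { my $own = $money; sub { $own -= shift; return $own } } 1 .. 2;
}

my ($shared_a, $shared_b) = make_shared(100);
$shared_a->(30);
print $shared_b->(30), "\n";   # sees the money already spent through $shared_a

my ($own_a, $own_b) = make_independent(100);
$own_a->(30);
print $own_b->(30), "\n";      # unaffected by $own_a
```

The shared pair behaves like class data and prints 40, while the independent pair behaves like instance data and prints 70.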

Understanding the Lisp-ish example isn't critical to using this idea. It merely serves to give us some context to the idea, and a counter-example to the http://wiki.slowass.net/?ObjectOriented approach. It also clearly demonstrates the advantages of having partially constructed objects lying around: we don't need to construct a whole new table just to put some more money down, but we have the power of creating objects to represent state at the same time.

"PerlMonks" in PerlMonks:62737 - taking reference to methods (closure) - closely related to "CurryingConcept" in CurryingConcept - http://wiki.slowass.net/?CategoryToDo - import this

http://wiki.slowass.net/?CategoryConcept, http://wiki.slowass.net/?CategoryIntermediate, http://wiki.slowass.net/?CategoryExpert, http://wiki.slowass.net/?CategoryRefactoring

See Also

CloningPattern

Problem: A copy of an object is needed so it can be diddled while preserving the original, or an existing object should serve as a template for a new object.

Solution: Instead of probing into its innards from outside, implement it, or re-implement it, to have a clone() method. clone() makes an exact duplicate of it from the inside.

When: You want to keep an unmodified copy of an object around, or you want to play with a copy of an object without hurting the original.

Symptoms: You're querying all of the fields out of one object, and passing them to the accessor methods of another object of the same type. Or, you access the underlying data structure directly, looping over the fields in one object, assigning the values to another. You spend a lot of effort to set up objects which are similar to each other.

Cloning usually must be designed into an object, or added in a subclass. Subclasses of a class with a clone() interface that add features to the class need to override the ancestor's clone() method and augment it to handle the new features. Since only the designer of a class will know for sure how to correctly clone it, it must be implemented with each package that features it.

Cloning lets you distribute or play with copies of objects. It also lets you more easily make a series of similar objects, using one object as a template for others.

For objects based on hashes, an extremely simple implementation of this might look like:

  package Mumble;

  sub new { ... }; # standard constructor

  sub clone {
    my $self = shift;
    my $copy = { %$self };
    bless $copy, ref $self;
  };

Note that this is a http://wiki.slowass.net/?ShallowCopy, not a http://wiki.slowass.net/?DeepCopy: clone() will return an object that holds additional references to things that the object being copied holds onto. If it were a http://wiki.slowass.net/?DeepCopy, the new copy would have its own private copies of things. This is only an issue when the object being copied refers to other objects, perhaps delegating to them. A http://wiki.slowass.net/?DeepCopy is a recursive copy. It requires that each and every object in this network implement ->clone(), though we could always fall back on reference sharing and fake it.

    my $copy = { %$self };

%$self expands the hash reference, $self, into a hash. This is done in list context, so all of the key-value pairs are returned - by value, creating a new list. This happens inside the { } construct, which creates a new anonymous hash, and a reference to that hash is assigned to $copy. $copy then refers to a copy of all of the same data as $self. The end result is a duplicate of everything inside $self. This is the same thing as:

  sub clone {
    my $self = shift;
    my $copy;
    foreach my $key (keys %$self) {
      $copy->{$key} = $self->{$key};
    }
    bless $copy, ref $self;
  }

If we wanted to do a http://wiki.slowass.net/?DeepCopy, we could modify this slightly:

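  sub clone {
    my $self = shift;
    my $copy = {};   # start with a real hashref so bless works even if $self is empty
    foreach my $key (keys %$self) {
      if(ref $self->{$key}) {
        # anything that is a reference is assumed to be an object with clone()
        $copy->{$key} = $self->{$key}->clone();
      } else {
        $copy->{$key} = $self->{$key};
      }
    }
    bless $copy, ref $self;
  }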

This assumes that $self contains no hashrefs, arrayrefs, and so on - only scalar values and other objects. This is hardly a reasonable assumption, but this example illustrates the need for and implementation of recursion when cloning nested object structures.
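To get past that assumption, the recursion has to look at what kind of reference it is holding. The following is only a sketch: deep_copy() is a hypothetical helper, not part of any module mentioned here. It assumes that nested objects either provide their own clone() or are content to be shared, and it makes no attempt to handle circular references.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# deep_copy() is a hypothetical helper written for this example
sub deep_copy {
  my $thing = shift;
  return $thing unless ref $thing;        # plain scalar: copied by value
  if (blessed $thing) {
    # objects decide for themselves how they are cloned;
    # objects without clone() are shared by reference
    return $thing->can('clone') ? $thing->clone() : $thing;
  }
  my $type = reftype $thing;
  if ($type eq 'ARRAY') {
    my @new = map { deep_copy($_) } @$thing;
    return \@new;
  }
  if ($type eq 'HASH') {
    my %new;
    $new{$_} = deep_copy($thing->{$_}) for keys %$thing;
    return \%new;
  }
  return $thing;                          # code refs and the like: share the reference
}

package Mumble;

sub new {
  my $type = shift;
  my $self = { @_ };
  return bless $self, $type;
}

sub clone {
  my $self = shift;
  my $copy = {};
  $copy->{$_} = main::deep_copy($self->{$_}) for keys %$self;
  return bless $copy, ref $self;
}

package main;

my $orig = Mumble->new(name => 'a', tags => [ 'x', 'y' ], peer => Mumble->new(name => 'b'));
my $copy = $orig->clone();
push @{ $copy->{tags} }, 'z';   # changes the copy's list only
```

After the push, $orig->{tags} still holds two entries, and $copy->{peer} is a separate Mumble object rather than a second reference to the original's peer.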

"MomentoPattern" in MomentoPattern has an example of copying an objects data against its permission - something that shouldn't be made a habit.

Clone Factories keep a pool of archetypical objects, and return slightly modified copies on request. XXX - example.
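Until a proper example lands here, this is a minimal sketch of the idea. Monster and Monster::CloneFactory are made-up names, and the factory reaches into the copy's hash to apply overrides, which a real class would more likely do through setter methods.

```perl
use strict;
use warnings;

package Monster;                 # hypothetical class with a shallow clone()

sub new {
  my $type = shift;
  my $self = { @_ };
  return bless $self, $type;
}

sub clone {
  my $self = shift;
  my $copy = { %$self };
  return bless $copy, ref $self;
}

package Monster::CloneFactory;   # hypothetical name

sub new {
  my $type = shift;
  my $self = { archetypes => {} };
  return bless $self, $type;
}

# register a fully configured object under a name
sub add_archetype {
  my ($self, $name, $object) = @_;
  $self->{archetypes}{$name} = $object;
  return $self;
}

# hand out a copy of the named archetype, with any overrides applied
sub create {
  my ($self, $name, %overrides) = @_;
  my $archetype = $self->{archetypes}{$name} or die "no archetype named '$name'";
  my $copy = $archetype->clone();
  $copy->{$_} = $overrides{$_} for keys %overrides;
  return $copy;
}

package main;

my $factory = Monster::CloneFactory->new();
$factory->add_archetype(orc => Monster->new(name => 'orc', hp => 30, weapon => 'club'));

my $grunt = $factory->create('orc');
my $chief = $factory->create('orc', hp => 80, name => 'orc chief');
```

The archetype itself is never handed out, so nobody can modify the template by accident.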

Permutations exist where other objects serve as general purpose object cloners or copiers. Due to Perl's introspective nature, a great deal of detail can be replicated. However, this will not always be safe, as some packages have special arrangements with their contents, some objects cannot handle multiple references existing to them, and so forth. This violates the encapsulation principle.

Class::Classless (http://www.cpan.org/modules/by-module/Class/ Classless) is an interesting twist on the idea of using one class as a template - not only is object instance data replicated, but objects themselves are configured to have the logic and methods you want, and then are cloned for their behavior. http://wiki.slowass.net/?JavaScript works this way. Objects could be looked at as buckets of data and methods, where either type of thing can be thrown into the bucket. Copying (by reference) the methods from one object into a fresh one is the work of a constructor, and is how new objects of that "class" are made. Copying the methods and the data would be a clone, according to our description of object cloning. XXX more on Class::Classless (http://www.cpan.org/modules/by-module/Class/ Classless).

See also Clone on CPAN

http://wiki.slowass.net/?CategoryIntermediate, http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryNovice

FlyweightPattern

Problem: A class of very lightweight objects is being used in large numbers. Reusing objects by sharing references would save a lot of memory.

Solution: Instead of creating thousands of identical copies of objects, keep a cache, and hand out references to existing copies.

When: You're passing a lot of simple objects around. You're using objects as a sort of enumeration. You've just gone OO overboard and made everything an object.

Symptoms: Object Oriented programming is at odds with memory usage.

A Flyweight is a permutation of an "AbstractFactory" in AbstractFactory. A million tiny objects can weigh a ton. By keeping only one copy of each, memory usage can be dramatically reduced.

  package FooFlyweight;

  my $objectCache;

  sub new {
    my $type = shift;
    my $value = shift;  # just a scalar
    if(exists $objectCache->{$type}->{$value}) {
      return $objectCache->{$type}->{$value};
    } else {
      my $this = { value => $value, moonPhase=>'full' };
      bless $this, $type;
      $objectCache->{$type}->{$value} = $this;
      return $this;
    }
  }

This example returns an object if we have one for that type and value. If not, it creates one, and caches it. An observant reader will note that if we cache an object, give it to two people, and one person changes it, the other will be affected. There are two solutions: pass out read-only objects, or preferably, use http://wiki.slowass.net/?ImmutableObjects.
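One way to pass out read-only objects, sketched here with lock_hash() from the core Hash::Util module. This is an illustration rather than the only approach: it only protects the top-level hash, and the getValue() accessor is an addition for the sake of the example.

```perl
use strict;
use warnings;
use Hash::Util ();

package FooFlyweight;

my %objectCache;

sub new {
  my $type  = shift;
  my $value = shift;   # just a scalar
  return $objectCache{$type}{$value} ||= do {
    my $this = { value => $value, moonPhase => 'full' };
    bless $this, $type;
    Hash::Util::lock_hash(%$this);   # from here on, writes to the hash die
    $this;
  };
}

sub getValue {
  my $this = shift;
  return $this->{value};
}

package main;

my $first  = FooFlyweight->new(42);
my $second = FooFlyweight->new(42);          # same object, straight from the cache
my $ok = eval { $first->{value} = 7; 1 };    # the assignment dies, so $ok stays undefined
```

Both variables hold a reference to the one cached object, and neither holder can change it out from under the other.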

As an alternative, Perl lets you bless scalars, which weigh about the same as an object reference. Blessed scalars aren't subject to the requirement that they be shared copies. Blessing a scalar into a package gives you an OO interface to a single value. If needed, you can later upgrade the implementation to a full blown hash, and keep the same interface.

  package TinyNumberOb;

  sub new {
    my $type = shift;
    my $value = shift; # scalar value
    my $this = \$value; # scalar reference
    bless $this, $type;
  }

  sub getValue {
    my $self = shift;
    return $$self;
  }

  sub setValue {
    my $self = shift;
    $$self = shift;
    return 1;
  }

This is kind of like Perl's autovivification of variables and hash and array entries: things spring into existence at the moment a user asks for them.

See Also: "ImmutableObject" in ImmutableObject, "AbstractFactory" in AbstractFactory, http://wiki.slowass.net/?CopyOnWrite

See Also: http://hatena.dyndns.org/~jkondo/DesignPattern/FlyWeight/

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryIntermediate

ImmutableObject

Synopsis: Small objects that can or should be shared, but change state.

When: You have a lot of little objects that sometimes keep one value, but sometimes change value. When someone changes the value of one, you don't want that change to show up in all of the other objects that have a pointer to that object, but you don't want to have to make a clone of that object for each object that has it, either.

Symptoms: Frequently copying objects and passing them out.

Lots and lots of tiny objects can eat up memory. If you've gone so far as to represent even little things as objects, you may find that your memory isn't going as far as it used to when everything was just a scalar. You would pass out the same object to everyone, but you really want everyone to have a private copy of it.

With a small change in how your module is used, you can declare that a given instance of it never changes values. If your object computes a new value, it returns a new instance of itself with that new value.

Instead of writing:

  $number->add(10);

You'll write instead:

  $number = $number->add(10);

Other modules using the old $number can continue doing so in confidence, while every time you change yours, you get a brand new one all your own. If your class is a blessed scalar, your add() method might look like:

  sub add {
    my $me = shift;
    my $newval = $$me + shift;
    return bless \$newval, ref $me;
  }

Returning new objects rather than changing ones that someone else might have a reference to avoids the problems of "ActionAtADistance" in ActionAtADistance with pointers - so long as you're using variables with the correct scope to store the pointers. [17]

Returning new objects containing the new state is strictly required for overloading Perl operators. Java's String class (different from http://wiki.slowass.net/?StringBuffer) is an example of this: you can never make changes to a String, but you can ask an existing String to compute a new String for you.

"StatePattern" in StatePattern talks about a mechanism for implementing state that consists of one "ImmutableObject" in ImmutableObject taking another in its constructor, and digesting it to initialize itself. Coupled with an "AbstractFactory" in AbstractFactory to arbitrate which subtype will be used for the next object, this is a powerful construct.

Used as the output of a Flyweight from "FlyweightPattern" in FlyweightPattern.

Important concept to "OverloadOperators" in OverloadOperators.
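To see why this matters for overloading, here is a small sketch that hooks an add() like the one above up to the + operator. The class name is borrowed from the blessed scalar example in FlyweightPattern; the overload table and the string conversion are assumptions made for this illustration.

```perl
use strict;
use warnings;

package TinyNumberOb;

use overload
  '+'  => \&add,
  '""' => sub { ${ $_[0] } };

sub new {
  my $type  = shift;
  my $value = shift;
  return bless \$value, $type;
}

sub getValue {
  my $me = shift;
  return $$me;
}

# never modifies $me; returns a brand new object instead
sub add {
  my ($me, $other) = @_;
  my $newval = $$me + (ref($other) ? $$other : $other);
  return bless \$newval, ref $me;
}

package main;

my $number = TinyNumberOb->new(5);
my $shared = $number;       # someone else holds on to the same object
$number = $number + 10;     # calls add(), which hands back a new object
```

$shared still reports 5 through getValue(), while $number now refers to a different object holding 15. Had add() changed the object in place, an innocent looking expression would have altered a value that other code was relying on.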

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryIntermediate

AbstractFactory

Problem: Code that decides which of several subclasses to instantiate is being cut and pasted around the program.

Solution: Centralize that logic in an object. Return a subtype of some abstract type.

When: Any time polymorphism is needed: the option of subclassing should be kept open. See "AbstractRootClasses" in AbstractRootClasses. Based on circumstance, an object may be created from one of a number of subclasses. The decision of which type of object to create doesn't seem to belong where the object is created, but rather somewhere neutral.

Symptoms: Split a class into two, or introduce a new or different implementation of a class under a different name. Suddenly you find yourself going through all of the code looking for references to the old package. You know that if you make a similar change in the future, you'll have to go through all of the code again.

An Abstract Factory makes the decision of which class or subclass to create when. This decision making logic is tucked away in one place, rather than being spread around - think of it as http://wiki.slowass.net/?CrossSectionalRefactoring. Centralizing the logic gives us:

The return value for the method is the base class type or abstract class type (essentially the same in Perl).

Example

  package Car::Factory;

  sub create_car {
    my $self = shift;
    my $passengers = shift;
    my $topspeed = shift;

    return Car::Ford->new if $topspeed < 100 and $passengers >= 4;
    return Car::Honda->new if $topspeed < 120 and $passengers <= 2;
    return Car::Porsche->new if $topspeed > 160 and $passengers <= 2;
    # etc
  }

  # in main.pl:

  package main;

  use Car::Factory;

  my $car = Car::Factory->create_car(2, 175); $car->isa('Car') or die;

To be http://wiki.slowass.net/?ObjectOriented "pure", each kind of car should do push @ISA, 'Car', so that they pass the $ob->isa('Car') test required by "TypeSafety" in TypeSafety. This lets programs know that it is a car (regardless of kind) and can thus be used interchangeably. See http://wiki.slowass.net/?ObjectOriented, http://wiki.slowass.net/?PolymorphismConcept, "TypeSafety" in TypeSafety.
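Put together in one file, the arrangement might look like the sketch below. The honk() method and the fallback to Car::Ford are inventions for the example, not part of the factory above.

```perl
use strict;
use warnings;

package Car;                 # abstract type: only the interface lives here

sub new {
  my $type = shift;
  my $me = {};
  return bless $me, $type;
}

sub honk { die "subclass must implement honk()" }

package Car::Ford;
our @ISA = ('Car');
sub honk { return "beep" }

package Car::Porsche;
our @ISA = ('Car');
sub honk { return "meep meep" }

package Car::Factory;

sub create_car {
  my $self = shift;
  my ($passengers, $topspeed) = @_;
  return Car::Porsche->new if $topspeed > 160 and $passengers <= 2;
  return Car::Ford->new;     # fallback, an assumption of this sketch
}

package main;

my $car = Car::Factory->create_car(2, 175);
$car->isa('Car') or die "not a Car";
my $noise = $car->honk();
```

The caller never names Car::Porsche. It only asks for a Car and checks that it got one.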

Refactoring

"RefactoringPattern" in RefactoringPattern may lead you to turn an object from a regular object into an "AbstractFactory" in AbstractFactory. Break code down into subclasses of ourself, and create those objects. This page is now http://wiki.slowass.net/?CategoryRefactoring. Before breaking up the code, create the subclasses.

Class AutoVivification

Before creating the subclasses, play with letting Perl do it for you. "ClassAsTypeCode" in ClassAsTypeCode says that a class's primary type can be used to distinguish it as a special case of a generic type, even if no implementation changes. This will give us a chance to prototype working with subclasses and make sure we aren't falling prey to "EmptySubclassFailure" in EmptySubclassFailure.

  package Car::Factory;

  sub create_car {
    # this way we can do Car::Factory->create_car(...) or $carfactoryref->create_car(...)
    # see NewObjectFromExisting
    my $package = shift; $package = ref $package if ref $package;

    my $car = Car::GenericAmericanCar->new;

    my $kind = 'Car::' . ucfirst(shift());

    # create the package on the fly by giving it an inheritance;
    # reaching @ISA by name needs symbolic references
    {
      no strict 'refs';
      push @{$kind.'::ISA'}, 'Car', 'Car::GenericAmericanCar'
        unless @{$kind.'::ISA'};
    }

    return bless $car, $kind;
  }

There! No matter what kind of car the user requests, we create it - even if it didn't exist before we created it. We set the @ISA array to inherit from Car and Car::GenericAmericanCar. Even if the package was completely empty, it now contains the minimal amount of definition to make it useful: an inheritance. You probably don't want to do exactly this, unless you really want the same product rebadged with a bizarre variety of different names.

$kind could be computed rather than taken verbatim from input. In most cases, you will want to compute it, as in our first example. Once computed, the package can be set up automatically.

And

Resist temptation to rebless or convert things except into subclasses: see "NoSexUntilMarriage" in NoSexUntilMarriage.

"StatePattern" in StatePattern is similar: different objects field requests. The state object, like the "AbstractFactory" in AbstractFactory, has the criteria built in to decide which object to use. Rather than returning the selected object like the "AbstractFactory" in AbstractFactory, it merely delegates requests to that object, holding onto references to a single instance of each type of object.

Comments

A good example of an abstract factory would be building a system that worked with mod_perl 1 and mod_perl 2. Eventually, I'll get round to giving you an example of this - http://wiki.slowass.net/?NigelWetters

Yes, a useful example would be nice indeed! - "ScottWalters" in ScottWalters

This doesn't make it clear where the Car::Ford (etc) modules should get loaded, though. Wouldn't it be better to say:

  if ($topspeed < 100 and $passengers >= 4) {
    require Car::Ford;
    return new Car::Ford;
  }

- http://wiki.slowass.net/?WilCooley

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryIntermediate

A real example? Code from real examples is far too long to keep readers' attention. I'll describe a real application, and if you want the code, you can email me. A client has a cart. Items in the cart are represented as objects. Initially, everything was an 'Item'. Donations were introduced - the client is a not-for-profit corporation. Tax and shipping are computed differently on donations added to the cart. Re-using the cart for wholesale orders is on the horizon. Once again, tax and shipping are computed differently: no tax, and shipping is actual-cost. Rather than burden 'Item' with the special logic of examining its part number and deciding which of three personalities each method should have, one object is given the duty of creating an object of the right type from a selection of three. Each of the three different subclasses of 'Item' implements the relevant methods completely differently, while inheriting some common implementation. Still want to see the code? - "ScottWalters" in ScottWalters
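In the meantime, here is a toy version of that arrangement. It is not the client's code: the part number prefixes, the tax rate, and the flat shipping charge are all invented, and prices are kept in cents to avoid floating point surprises.

```perl
use strict;
use warnings;

package Item;

sub new {
  my $type = shift;
  my $me = { @_ };
  return bless $me, $type;
}

sub price    { return $_[0]->{price} }                   # in cents
sub tax      { return int($_[0]->price() * 8 / 100) }    # made-up 8% rate
sub shipping { return 500 }                              # made-up flat rate

package Item::Donation;
use parent -norequire, 'Item';

sub tax      { return 0 }
sub shipping { return 0 }

package Item::Wholesale;
use parent -norequire, 'Item';

sub tax      { return 0 }
sub shipping { return $_[0]->{actual_shipping} }         # actual cost

package Item::Factory;

# the part number prefixes are invented for this sketch
sub create_item {
  my $self = shift;
  my %args = @_;
  return Item::Donation->new(%args)  if $args{part} =~ m/^DON-/;
  return Item::Wholesale->new(%args) if $args{part} =~ m/^WHS-/;
  return Item->new(%args);
}

package main;

my @cart = (
  Item::Factory->create_item(part => 'TS-100',  price => 2000),
  Item::Factory->create_item(part => 'DON-001', price => 5000),
  Item::Factory->create_item(part => 'WHS-100', price => 90000, actual_shipping => 4200),
);

my $tax = 0;
$tax += $_->tax() for @cart;
```

The loop that totals the tax never asks what kind of item it is looking at. The part number is examined once, in the factory, and nowhere else.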

FactoryObject

Problem: The exact implementation of an object varies.

Solution: Create a factory that centralizes the decision making logic surrounding which implementation to use. Channel all requests for objects for that role through the factory.

A "FactoryObject" in FactoryObject has a http://wiki.slowass.net/?FactoryMethod.

The basic factory always creates objects of the same concrete type. Factories, as objects, are pluggable: Which factory is used, and therefore which concrete type is created by it, can be changed.

  my $factory = FordFactory->new;

  my $wifes_car = $factory->create_car();
  $wifes_car->isa('Car') or die;

  # later:

  $factory = ChevyFactory->new;

  my $husbands_car = $factory->create_car();
  $husbands_car->isa('Car') or die;

Code need not be concerned with where the cars come from, only that a Car materialize upon demand. Having a second source available for things is important. If there were only one auto manufacturer, a lot fewer people would be happy with their ride. Ralph Nader never would have won a lawsuit against them. The same goes for programs. Hacking up an entire program to change which implementation you use is undesirable. Sometimes you have an implementation you really want to get rid of.

Usually the decision is made at some point in configuration which factory is to be used, though it may be used to implement the "StatePattern" in StatePattern.
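A sketch of making that decision at configuration time follows. The %config hash stands in for whatever configuration mechanism a program really uses, and the Car classes are bare stubs invented for the example.

```perl
use strict;
use warnings;

package Car;

sub new {
  my $type = shift;
  my $me = {};
  return bless $me, $type;
}

package Car::Ford;
use parent -norequire, 'Car';

package Car::Chevy;
use parent -norequire, 'Car';

package FordFactory;

sub new {
  my $type = shift;
  my $me = {};
  return bless $me, $type;
}

sub create_car { return Car::Ford->new; }

package ChevyFactory;

sub new {
  my $type = shift;
  my $me = {};
  return bless $me, $type;
}

sub create_car { return Car::Chevy->new; }

package main;

# hypothetical configuration, perhaps read from a file at startup
my %config = (car_factory => 'ChevyFactory');

# only accept factory names we know about
my %known_factories = map { $_ => 1 } qw(FordFactory ChevyFactory);
my $factory_class = $config{car_factory};
$known_factories{$factory_class} or die "unknown factory '$factory_class'";

my $factory = $factory_class->new;

# the rest of the program only ever sees $factory and Car
my $car = $factory->create_car();
$car->isa('Car') or die;
```

Switching the whole program from one make to the other means changing one configuration value.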

A Factory will always create objects of the same concrete type. Contrast this with the "AbstractFactory" in AbstractFactory:

Per "AbstractRootClasses" in AbstractRootClasses, all objects of a new type should be an http://wiki.slowass.net/?AbstractType and a concrete implementation of it. This lets you talk about objects in terms of type where "TypeSafety" in TypeSafety is concerned and not have to change those type declarations when a new implementation is introduced.

An "AbstractFactory" in AbstractFactory will create objects of a fixed http://wiki.slowass.net/?AbstractType and a concrete type of its choosing.

A plain old factory is useful when we're able to determine at some point what type all future manufactured objects should have for a concrete type. An "AbstractFactory" in AbstractFactory is suitable when this decision can never be finalized: the current state of the running program always sways the decision.

Supports Polymorphism and "LooseCoupling" in LooseCoupling.

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryIntermediate

RunAndReturnSuccessor

Synopsis: Flow control is spread all over the place. Understanding and modifying flow requires knowledge of many modules, which is error prone. Instead, you centralize transitions in flow, and represent state transitions as objects. Each state object knows how to create an object representing any state immediately accessible from itself.

When: Applications, or modules that perform many functions at different times.

Symptoms: Programs that people are scared of editing for fear of inserting terminal bugs. Programs that stop unexpectedly.

"The Halting Problem" is a subject of much research. No technique exists for predicting when an arbitrary program will suddenly stop running and bail out. Programmers of critical systems are deeply concerned with whether or not their programs contain unexpected conditions that would cause sudden, catastrophic termination. Modeling the program flow isn't a complete answer, but it addresses two important problems:

Each state would have a method that, given user input or the result of a computation, would return another state object, to be executed. Queues and Stacks can extend the possibilities: the basic idea is only to model the transitions.

  # Non ObjectOriented:

  my $parser = do {
    my $html;             # HTML to parse
    my $tag;              # name of the current HTML tag
    my $name;             # name of current name=value pair we're working on
    my $namevalues;       # hashref of name-value pairs inside of the current tag
    my ($starttag, $middletag, $middlevalue);  # declared first so the closures can see each other
    $starttag = sub {
      if($html =~ m{\G<!--.*?-->}sgc) {        # skip over a comment
        return $starttag;
      }
      if($html =~ m{\G</[a-z0-9]+\s*>}isgc) {  # skip over a closing tag
        return $starttag;
      }
      if($html =~ m{\G<([a-z0-9]+)}isgc) {
        $tag = $1;
        $namevalues = {};
        return $middletag;
      }
      if($html =~ m{\G[^<]+}sgc) {
        return $starttag;
      }
      return undef;
    };
    $middletag = sub {
      if($html =~ m{\G\s+}sgc) {
        return $middletag;
      }
      if($html =~ m{\G([a-z0-9_-]+)}isgc) {    # an attribute name
        $name = $1;
        return $middlevalue;
      }
      if($html =~ m{\G/?>}sgc) {
        return $starttag;
      }
      return undef;
    };
    $middlevalue = sub {
      if($html =~ m{\G\s*=\s*(['"])(.*?)\1}sgc) {
        $namevalues->{$name} = $2;             # $1 is the quote, $2 the value
        return $middletag;
      }
      $namevalues->{$name} = 1;                # attribute with no value
      return $middletag;
    };
    # the value of the do block: a sub that accepts the HTML and returns the first state
    sub {
      $html = shift;
      return $starttag;
    };
  };

  open(my $f, '<', 'page.html') or die $!; read $f, my $page, -s $f; close $f;

  $parser = $parser->($page);
  $parser = $parser->() while($parser);

Of course, rather than iterating through $parser and using it as a generator, we could blow the stack and make it do the recursive calls itself. In general, return $foo; would be replaced with return $foo->();.

XXX I wonder if parser could do $_[0] = next object so that merely saying $parser->(foo) would work in place of $parser = $parser->(foo).. that would be nifty!

The observant reader will notice that each anonymous subroutine we define represents a state in our grammar. At any given moment, there are only a few things which are valid, so there is no point in looking for everything. Doing so would lead to confusion and bugs. We could rewrite this to be cleaner and use fewer variables, but I choose this presentation because of its extremely regular structure.

XXX http://wiki.slowass.net/?ObjectOriented example.
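As a stand-in until the real example is written, here is a deliberately tiny object version: a coin operated turnstile. The Turnstile classes are invented for the illustration. Each state object has a run() method that handles one input and returns the state object that should handle the next one.

```perl
use strict;
use warnings;

package Turnstile::Locked;       # hypothetical state classes

sub new {
  my $type = shift;
  my $me = {};
  return bless $me, $type;
}

sub name { return 'locked' }

# handle one input, return the successor state
sub run {
  my ($self, $input) = @_;
  return Turnstile::Unlocked->new if $input eq 'coin';
  return $self;                  # pushing a locked turnstile changes nothing
}

package Turnstile::Unlocked;

sub new {
  my $type = shift;
  my $me = {};
  return bless $me, $type;
}

sub name { return 'unlocked' }

sub run {
  my ($self, $input) = @_;
  return Turnstile::Locked->new if $input eq 'push';
  return $self;                  # extra coins are happily swallowed
}

package main;

my $state = Turnstile::Locked->new;
my @trace;
for my $input (qw(push coin coin push)) {
  $state = $state->run($input);
  push @trace, $state->name();
}
```

The driving loop has the same shape as the closure version: run the current state, keep whatever it returns. Each class only knows about the states reachable from itself.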

See Also: "StatePattern" in StatePattern, "ImmutableObject" in ImmutableObject, http://wiki.slowass.net/?StrategyObject, "MomentoPattern" in MomentoPattern, http://wiki.slowass.net/?TransactionObject

See Also: http://c2.com/cgi/wiki?RunAndReturnSuccessor - implement the state transitions in your program as objects

Related concepts: http://wiki.slowass.net/?LazyEvaluation, "IteratorInterface" in IteratorInterface, http://wiki.slowass.net/?LexicalsMakeSense, http://wiki.slowass.net/?LambdaClosures

Credits: http://wiki.slowass.net/?DesignPatternsElementsOfReusableSoftware / http://wiki.slowass.net/?GangOfFour

http://wiki.slowass.net/?CategoryIntermediate, http://wiki.slowass.net/?CategoryExpert, http://wiki.slowass.net/?CategoryPattern

InlineObjects

new() might be thought to be the creator of objects, but we know bless() is how objects are really made. Object creation is really little more than:

  my $ob = bless { color => 'yellow', size => 'large' }, 'GetAndSet';

Of course, we need to back this up with some implementation:

  package GetAndSet;

  our $AUTOLOAD;   # set by Perl to the full name of the method that was called

  sub AUTOLOAD {
      my $this = shift;
      (my $method) = $AUTOLOAD =~ m/.*::(.*)$/;
      return if $method eq 'DESTROY';
      (my $request, my $attribute) = $method =~ m/^([a-z]+)_(.*)/;
      if($request eq 'set') {
        $this->{$attribute} = shift;
        return 1;
      }
      if($request eq 'get') {
        return $this->{$attribute};
      }
      die "unknown operation '$method'";
  }

Of course, this is considered http://wiki.slowass.net/?BadStyle. You should always use http://wiki.slowass.net/?ConstructorPattern. Okay, usually.

AboutInheritance

  /*
   * If you are going to copy this file, in the purpose of changing
   * it a little to your own need, beware:
   *
   * First try one of the following:
   *
   * 1. Do clone_object(), and then configure it. This object is specially
   *    prepared for configuration.
   *
   * 2. If you still is not pleased with that, create a new empty
   *    object, and make an inheritance of this objet on the first line.
   *    This will automatically copy all variables and functions from the
   *    original object. Then, add the functions you want to change. The
   *    original function can still be accessed with '::' prepended on the name.
   *
   * The maintainer of this LPmud might become sad with you if you fail
   * to do any of the above. Ask other wizards if you are doubtful.
   *
   * The reason of this, is that the above saves a lot of memory.
   */

- Comment as seen on core library objects in LPMud 2.4.5

Mirroring Real-Life

If you're thinking of using inheritance - @ISA in Perl - then you should be reading http://wiki.slowass.net/?AbstractClasses - there is a correct way to do it, then there is what everyone else does.

If you aren't thinking of using inheritance, then I wonder why you're reading this, and you probably are too.

LPMud is a dynamic adventure system. Players play while wizards code. New puzzles spring into being from within the game, while it's running. The game is, of course, http://wiki.slowass.net/?ObjectOriented, in the name of mirroring real-life object relationships. LPMud comes from the days when 24 megs was a lot of memory on a Unix server. [18]

Given this object oriented system and these 24 megs of RAM, wizards cleverly started copying core library objects around - the player object, the monster object, the weapon object - and making changes to them for their own use - a clear case of "CutAndPasteProgramming" in CutAndPasteProgramming. As you can see, that didn't go over very well.

Modern motivations against "CutAndPasteProgramming" in CutAndPasteProgramming are different than lack of RAM. See "CutAndPasteProgramming" in CutAndPasteProgramming, and then when you're sold, "AbstractClass" in AbstractClass.

Perl's equivalent to LPMud's clone_object(ob) is ob->new(), though a http://wiki.slowass.net/?FactoryMethod may return a cloned or configured object. See "CloningPattern" in CloningPattern. Creating object structures by holding onto references to other objects that you created with new() is known as delegation. See http://wiki.slowass.net/?DelegationConcept. This is the basis of most object patterns in this book.

Perl's equivalent to inherit is use base. See http://wiki.slowass.net/?UseBase. Creating object structures using inheritance is called http://wiki.slowass.net/?MixIns. http://wiki.slowass.net/?MixIns are best avoided. Inheritance should be used to build specialized versions of generic objects - not to generalize further, and not to combine general objects to make something.
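The second suggestion in the LPMud comment translates to Perl roughly as follows. Monster and Monster::Orc are names made up for this sketch. Both packages live in one file here, and use base copes with that as long as the base package is defined first.

```perl
use strict;
use warnings;

package Monster;             # hypothetical generic object

sub new {
  my $type = shift;
  my $self = { hp => 10, @_ };
  return bless $self, $type;
}

sub describe {
  my $self = shift;
  return "a monster with $self->{hp} hit points";
}

package Monster::Orc;        # specialized version: inherit, then override
use base 'Monster';

sub describe {
  my $self = shift;
  # SUPER:: reaches the original method, much like '::' does in LPC
  return "an orc: " . $self->SUPER::describe();
}

package main;

my $orc  = Monster::Orc->new(hp => 30);
my $text = $orc->describe();
```

Nothing was copied out of Monster. The orc gets new() for free and only spells out the one method that differs.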

Inheritance shouldn't be confused with exporting. Exporting adds features to a package, very much like http://wiki.slowass.net/?MixIns, but these features are used by that object only. Exporting isn't used by sane people to build new types of objects. If you want Carp, for instance, you'll use Carp yourself, and not attempt to call croak in another object that happens to use Carp. See "ExportingPattern" in ExportingPattern.

Category

http://wiki.slowass.net/?CategoryNovice, http://wiki.slowass.net/?ConceptsCrossReference

StateVsClass

Problem: The base class is simple and subclasses are frequently implementing the same features on top of the base class, but in different combinations.

Solution: Allow objects to handle methods differently depending on their state, rather than demand that every possible behavior be exhibited by a separate object. Move shared behavior upwards even if not every subclass ultimately uses it. Make the base class the general case, and allow subclasses to remove features - permanently or conditionally - to create special purpose versions.

Given a special case of something that isn't really one at all, refactor. Gimpy versions of objects are still merely versions of those objects. Lack of feature doesn't automatically make something a candidate for superclasshood. In general, there is no harm adding functionality to the base class: this is often the cleanest solution, and the quickest way to make it available to all of the subclasses. "DecoratorPattern" in DecoratorPattern talks about a degenerate situation where http://wiki.slowass.net/?MixIns attempt to create endless combinations of features and ultimately fail.

Simple Rules:

A parrot that is as dead as a door nail is still just a special case of parrots, and parrots in general have facilities to perch(), squak(), eat() and bite(). Whether or not these facilities are working, and what their exact behavior is, can be left to the subclass. Perhaps the parrot is pining for the fjords and doesn't feel like squak()ing. Perhaps it's deceased, but a parrot nonetheless.

Inheritance is "specialized case of", not "made out of". A bird is not a specialized case of a beak and legs. For composing something out of mix and match parts, use composition: see "CompositePattern" in CompositePattern.

  package Parrot;

  sub new {
    my $type = shift;
    my $me = { @_ };
    bless $me, $type;
  }

  sub perch {
    my $this = shift;
    $this->{perch} = shift;
    $this->{perch}->add_weight(38);
    return 1;
  }

  sub squak {
    print "Eeeeeeeeeeek!\n";
  }

  package Parrot::African;
  use base 'Parrot';

  sub squak {
    print "EEEEEEEEEEEEEEEEEEEEEEEEK!\n";
  }

  package Parrot::Pining;
  use base 'Parrot';

  sub perch {
    my $this = shift;
    # SUPER:: only works as a method call, so go through $this
    return $this->SUPER::perch(@_) if $this->{at_fjords};
    return undef;
  }

  sub squak {
    my $this = shift;
    return $this->SUPER::squak(@_) if $this->{at_fjords};
    return undef;
  }

A call to squak() in a parrot is a notification that it should squak, or a request that it squak, never a guarantee that a squak will be emitted.

"AbstractClass" in AbstractClass and "FunctionalityIsToBeShared" in FunctionalityIsToBeShared [19] tell us to move functionality as high up the inheritance chain as is useful.

"StatePattern" in StatePattern suggests delegating requests to a different object depending upon state, where each object you delegate to represents a state. This satisfies our requirement that objects not be swapped out at runtime, and that polymorphism should be maintained [20], even when the bird goes into a "dead" state. We still maintain the same presentation - unlike "RunAndReturnSuccessor" in RunAndReturnSuccessor, a completely different object isn't swapped in in our place. Only behind the scenes, through a cleverly placed layer of delegating, is statehood implemented in terms of objects. This satisfies the "LawOfDemeter" in LawOfDemeter.
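A rough sketch of that layer of delegation is below. The Parrot::State classes and the expire() method are assumptions made for the illustration, and squak() returns its noise here instead of printing it so that the result is easy to inspect.

```perl
use strict;
use warnings;

package Parrot::State::Alive;    # hypothetical state classes

sub new {
  my $type = shift;
  my $me = {};
  return bless $me, $type;
}

sub squak { return "Eeeeeeeeeeek!" }

package Parrot::State::Dead;

sub new {
  my $type = shift;
  my $me = {};
  return bless $me, $type;
}

sub squak { return undef }       # an ex-parrot makes no sound

package Parrot;

sub new {
  my $type = shift;
  my $me = {
    states => {
      alive => Parrot::State::Alive->new,
      dead  => Parrot::State::Dead->new,
    },
    current => 'alive',
  };
  return bless $me, $type;
}

sub expire {
  my $this = shift;
  $this->{current} = 'dead';
  return 1;
}

# callers always talk to the same Parrot; it forwards to whichever state is current
sub squak {
  my $this = shift;
  return $this->{states}{ $this->{current} }->squak(@_);
}

package main;

my $polly  = Parrot->new;
my $before = $polly->squak();
$polly->expire();
my $after  = $polly->squak();
```

Anyone holding a reference to $polly still holds a Parrot after it expires. Only the object it quietly delegates to has changed.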

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryConcept, http://wiki.slowass.net/?CategoryIntermediate, http://wiki.slowass.net/?CategoryExpert

CommandObject

Synopsis: Use a Value Object to communicate the details of the action that is desired.

When: There is a proliferation of similar methods, and the interface to implement that kind of object is becoming unwieldy.

Symptoms: Too many public methods for other objects to call. An interface that is unworkable and always changing. You feel that a method name must include prose describing the exact action, and this is preventing layering your code.

A "CommandObject" in CommandObject is a case of using a http://wiki.slowass.net/?ValueObject to communicate which action is to be performed, along with any argument data. This is sent to a single method in the class that handles commands of the given type. That object is free to implement command processing with a switch, a variable method dispatch, or a call to a variable subclass. This lets you make changes to which commands are defined only in the definition of the command object itself and the classes that actually use that command, rather than every class that wants to implement the command processing interface. It also frees up the class implementing the command processing interface to use any number of ideas for dispatching the command, once it has it:

  # example of a switch style arrangement:

  sub doCommand {
    my $me = shift;
    my $cmd = shift; $cmd->isa('BleahCommand') or die;
    my $instr = $cmd->getInstructionCode();
    if($instr eq 'PUT') {
      # PUT logic here
    } elsif($instr eq 'GET') {
      # GET logic here
    }
    # etc
  }

  # example of a variable method call arrangement:

  sub doCommand {
    my $me = shift;
    my $cmd = shift; $cmd->isa('BleahCommand') or die;
    my $instr = $cmd->getInstructionCode();
    # look up process_PUT, process_GET and so on by name
    my $func = $me->can("process_" . $instr);
    return undef unless $func;
    return $func->($cmd, @_);
  }

  # example of a variable subclass arrangement.
  # this assumes that %commandHandlers is set up with a list of object references.

  sub doCommand {
    my $me = shift;
    my $cmd = shift; $cmd->isa('BleahCommand') or die;
    my $instr = $cmd->getInstructionCode();
    my $objectRef = $commandHandlers{$instr};
    return $objectRef ? $objectRef->handleCommand($cmd, @_) : undef;
  }

Since Perl offers AUTOLOAD, this idea could be emulated. If a package wanted to process an arbitrary and growing collection of commands to the best of its ability, it could catch all undefined method calls using AUTOLOAD, and then attempt to dispatch them (this assumes %commandHandlers is set up with a list of object references keyed by method name):

  our $AUTOLOAD;   # set by Perl to the full name of the method that was called

  sub AUTOLOAD {
    my $me = shift;
    (my $methodName) = $AUTOLOAD =~ m/.*::(\w+)$/;
    return if $methodName eq 'DESTROY';
    my $objectRef = $commandHandlers{$methodName};
    return $objectRef ? $objectRef->handleCommand($methodName, @_) : undef;
  }

This converts calls to different methods in the current object to calls to a handleCommand() method in different objects. This is an example of using Perl to shoehorn a Command Object pattern onto a non Command Object interface.
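For something that can be run from start to finish, here is a small sketch of a command object and one processor for it. BleahCommand's constructor arguments, the getArguments() method, and the BleahStore class are all assumptions made for the example. Dispatch is done by looking up a process_ method by name.

```perl
use strict;
use warnings;

package BleahCommand;        # the value object: an instruction plus its arguments

sub new {
  my $type = shift;
  my %opts = @_;
  my $me = { instruction => $opts{instruction}, arguments => $opts{arguments} || [] };
  return bless $me, $type;
}

sub getInstructionCode { return $_[0]->{instruction} }
sub getArguments       { return @{ $_[0]->{arguments} } }

package BleahStore;          # hypothetical command processor

sub new {
  my $type = shift;
  my $me = { data => {} };
  return bless $me, $type;
}

# the single public entry point
sub doCommand {
  my $me  = shift;
  my $cmd = shift; $cmd->isa('BleahCommand') or die;
  my $handler = $me->can('process_' . $cmd->getInstructionCode());
  return undef unless $handler;
  return $me->$handler($cmd);
}

sub process_PUT {
  my ($me, $cmd) = @_;
  my ($key, $value) = $cmd->getArguments();
  $me->{data}{$key} = $value;
  return 1;
}

sub process_GET {
  my ($me, $cmd) = @_;
  my ($key) = $cmd->getArguments();
  return $me->{data}{$key};
}

package main;

my $store = BleahStore->new;
$store->doCommand(BleahCommand->new(instruction => 'PUT', arguments => [ color => 'yellow' ]));
my $color   = $store->doCommand(BleahCommand->new(instruction => 'GET', arguments => [ 'color' ]));
my $unknown = $store->doCommand(BleahCommand->new(instruction => 'FROB'));
```

Adding a new command means adding one process_ method. The public interface, doCommand(), never changes.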

XXX virtual machine as an interpreter operating on a series of command objects

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryIntermediate

IteratorInterface

Synopsis: Create a unified interface for iterating through data items.

When: You have objects that contain sets of things, or you have objects that are arranged into structures.

Symptoms: Each package has a slightly different way to look through data items it contains.

This is a specific example of a general idea: if there is a kind of thing that needs done, create an abstract class (a package that has only empty methods) that outlines a general interface for doing it. In this case, we're concerned about looping through a collection of values:

  package Iterator;




  sub hasNext { die; }
  sub getNext { die; }

Other packages can come along and add Iterator to their @ISA list. They will need to redefine these methods. Now we have a uniform way of doing something. If a method in an object is expecting an Iterator as its argument, it has a way of checking to see if its argument really is an Iterator. The argument can be an Iterator and anything else, too. This supports Type Safety.
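As a rough sketch of the idea, here is one concrete iterator and a consumer that checks its argument with isa(). ArrayIterator and total() are names made up for this example; the Iterator package is repeated so the snippet runs on its own.

```perl
use strict;
use warnings;

package Iterator;
sub hasNext { die "hasNext() not implemented"; }
sub getNext { die "getNext() not implemented"; }

# Hypothetical concrete iterator over a plain array reference.
package ArrayIterator;
our @ISA = ('Iterator');

sub new {
  my ($type, $items) = @_;
  return bless { items => $items, pos => 0 }, $type;
}
sub hasNext { my $this = shift; return $this->{pos} < @{ $this->{items} }; }
sub getNext { my $this = shift; return $this->{items}[ $this->{pos}++ ]; }

package main;

# A consumer that insists on the Iterator interface before using it.
sub total {
  my $it = shift;
  $it->isa('Iterator') or die "expected an Iterator";
  my $sum = 0;
  $sum += $it->getNext() while $it->hasNext();
  return $sum;
}

print total(ArrayIterator->new([1, 2, 3])), "\n";
```

total() never learns how the values are stored; any other package that inherits from Iterator and fills in the two methods could be passed in instead.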

This is a simple case. If an object doesn't directly contain the values, but instead references a network of items, we can recurse over them. This can be wrapped in an Iterator interface.

  package SampleTree;




  sub new {
    my $type = shift;
    my $this = { @_ };




    bless $this, $type;
  }




  sub getIterator {
    my $this = shift;
    return SampleTree::Iterator->new(node => $this);
  }




  sub get_left {
    my $this = shift;
    return $this->{'leftNode'};
  }




  sub get_right {
    my $this = shift;
    return $this->{'rightNode'};
  }




  package SampleTree::Iterator;




  sub new {
    my $type = shift;
    my %opts = @_;
    my $this = {state=>0, iterator=>undef, node=>$opts{'node'}};
    bless $this, $type;
  }




  sub getNext {
    my $this = shift;
    my $result;
    if($this->{'iterator'}) {
      $result = $this->{'iterator'}->getNext();
    }
    if($result) { 
     return $result;
    } elsif($this->{'state'} == 0) {
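      # try the left node; ask it for its own iterator, if there is one
      my $left = $this->{'node'}->get_left();
      $this->{'iterator'} = $left ? $left->getIterator() : undef;
      $this->{'state'} = 1; 
      return $this->getNext();
    } elsif($this->{'state'} == 1) {
      # try the right node
      $this->{'state'} = 2;
      my $right = $this->{'node'}->get_right();
      $this->{'iterator'} = $right ? $right->getIterator() : undef;
      return $this->getNext();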
    } else {
       # state == 2
       return undef;
    }
  }

This [21] code allows a network of objects having the getIterator method to cooperatively and transparently work together. Each object in the network may have a different way of iterating. This example represents a tree data structure. The tree may contain other tree nodes, array objects, queues, and so forth. As long as the network consists of objects with a getIterator() method that returns an object that implements the Iterator interface, we can crawl through the whole thing. That's composition you can take to the bank and smoke!

Iterating through data sets which your object contains or which other objects contain is all fine and dandy, but this same interface gives us everything we need to iterate over data sets that don't exist at all, except perhaps in our imagination. The things we iterate over could be things that we know to exist from theory, like prime numbers. Computing things from a large set as they are needed, rather than beforehand, is called http://wiki.slowass.net/?LazyEvaluation. http://wiki.slowass.net/?LazyEvaluation lets you set up pipelines where different parts of the program do operations on data as it is generated or read. Contrast this with the typical Perl approach of slurping everything into memory, then working on it:

  # slurp everything into memory, then work on it:




  open my $file, 'dataset.cvs' or die $!;
  read $file, my $data, -s $file or die $!;
  close $file;




  foreach my $i (split /\n/, $data) {
    # process
  }




  # process as we read:




  my $process = sub {
    # process
  };




  open my $file, 'dataset.cvs' or die $!;
  while(my $record = <$file>) {
    $process->($record);
  }
  close $file;

Returning all of the data from a get_ method fosters slurping everything into memory. This fosters programmers who are limited by memory in how large a dataset they can work on. You can chuckle and say that virtual memory will take up the slack, but I can tell you that there are a heck of a lot of multi-terabyte data warehouses kicking around the world. Dealing with data in place, where your storage is essentially at capacity at all times, or having multiple clients process a very large dataset in parallel, demands efficiency. There are still a few applications that need good programmers to write them.

The second example above, rewritten as a provider:

  package RecordReader;




  use ImplicitThis;
  @ISA = qw(Interface);




  sub new {
    my $type = shift;
    my $file = shift;
    open my $filehandle, $file or die $!;
    my $me = { handle => $filehandle, next => undef };
    bless $me, $type;
  }




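  sub getNext {
    # hand back the record that hasNext() read ahead, if there is one
    if(defined $next) {
      my $record = $next;
      $next = undef;
      return $record;
    }
    return scalar <$handle>;
  }

  sub hasNext {
    return 1 if defined $next;
    $next = <$handle>;
    if(defined $next) {
      return 1; 
    } else {
      close $handle;
      return 0;
    }
  }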

Compare this to Java's IO Filters, which superimpose reading of data structures, international characters, and so forth on top of IO streams: you'll find it to be remarkably similar. It lets users mix and match IO processing facilities.

Iterating and Overloading

Perl overloads the "++" operator to iterate strings through a useful realm of values:

  $a = "aaa"; $a++; print $a, "\n"; # prints "aab"

See OverloadOperators for how to create constructs like this yourself in Perl according to this formula:

XXX - an example of exactly this would be really nice
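One possible shape for such an example, using the core overload pragma. The EvenCounter class is hypothetical and stands in for any object that should step through its own "realm of values" when incremented.

```perl
use strict;
use warnings;

# Hypothetical class: a value that steps through the even numbers.
package EvenCounter;
use overload
  '++' => \&increment,      # must modify the object in place
  '""' => \&as_string,      # how the object appears when printed
  '='  => \&clone;          # copy constructor, used when Perl must keep the old value

sub new { my ($type, $start) = @_; return bless { n => $start }, $type; }
sub increment { my $this = shift; $this->{n} += 2; return $this; }
sub as_string { my $this = shift; return $this->{n}; }
sub clone { my $this = shift; return bless { n => $this->{n} }, ref $this; }

package main;

my $e = EvenCounter->new(10);
++$e;
$e++;
print "$e\n";
```

The same skeleton could wrap an Iterator, with "++" advancing to the next item and stringification reporting the current one.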

Sieve of Eratosthenes

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/117119 - implemented in Python. A Perl version would be a nice example for Perl iterators.
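A Perl version might look roughly like the following. This is an independent sketch of an incremental sieve behind the hasNext()/getNext() interface, not a translation of that recipe; PrimeIterator is a name chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical iterator class: an incremental Sieve of Eratosthenes.
package PrimeIterator;

sub new {
  my $type = shift;
  # composites maps an upcoming composite number to the primes that divide it
  return bless { candidate => 2, composites => {} }, $type;
}

sub hasNext { return 1; }   # there is always another prime

sub getNext {
  my $this = shift;
  my $composites = $this->{composites};
  while (1) {
    my $n = $this->{candidate}++;
    my $factors = delete $composites->{$n};
    if (!$factors) {
      # $n was never marked, so it is prime; mark its square
      $composites->{$n * $n} = [$n];
      return $n;
    }
    # $n is composite; move each of its primes on to that prime's next multiple
    push @{ $composites->{$n + $_} }, $_ for @$factors;
  }
}

package main;

my $primes = PrimeIterator->new();
print join(' ', map { $primes->getNext() } 1 .. 10), "\n";
```

Nothing is computed until getNext() is called, so the "collection" of primes exists only as far as the caller has walked it.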

See Also

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryConcept, http://wiki.slowass.net/?CategoryIntermediate

PassingPattern

Generally, create things where it makes sense to, and pass them down through constructors, leaving contained objects to do with them what they wish, rather than making assumptions about their structure. Given an A that creates a B, and B needs a C that only A can create, create C, pass it to A's constructor, and let A pass it to B itself, rather than trying to take charge and set up all of the relationships yourself. This is an extension of the idea of encapsulation.
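One reading of this, sketched with invented classes: Server plays A, Connection plays B and Logger plays C. The caller creates the Logger and hands it only to Server, which decides how to wire it into the Connection it builds.

```perl
use strict;
use warnings;

package Logger;           # plays the role of C
sub new { my $type = shift; return bless { lines => [] }, $type; }
sub record { my $this = shift; push @{ $this->{lines} }, @_; return; }

package Connection;       # plays the role of B
sub new { my ($type, %opts) = @_; return bless { logger => $opts{logger} }, $type; }
sub deliver { my ($this, $msg) = @_; $this->{logger}->record("sent: $msg"); return; }

package Server;           # plays the role of A
sub new {
  my ($type, %opts) = @_;
  # A hands C down to the B it creates; the caller never touches B
  my $conn = Connection->new(logger => $opts{logger});
  return bless { conn => $conn }, $type;
}
sub greet { my $this = shift; $this->{conn}->deliver('hello'); return; }

package main;

my $logger = Logger->new();
my $server = Server->new(logger => $logger);
$server->greet();
print "$_\n" for @{ $logger->{lines} };
```

If Server later stops using a Connection at all, the calling code does not change, because it never assumed anything about Server's insides.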

[22]

[23]

[24]

http://wiki.slowass.net/?CategoryToDo, http://wiki.slowass.net/?CategoryConcept, http://wiki.slowass.net/?CategoryPattern

See Also

WrapperModule

Synopsis: Want to use several modules across a collection of scripts, but don't want dozens of "use" lines at the top of each.

There is incentive not to split up bloated modules due to the need to go through and edit all of the scripts to use each new module-spawn. This also has all of the markings of a problem that resurfaces: should you refactor again, you'll be changing all of your modules. Leaving everything in one module is tempting.

In days of lore, Perl programmers would require a single config.pl that set up variables and required other modules for them. use doesn't automatically preclude this - merely leave off the package statement, and you'll continue operating in the namespace of the program that used your module.

For example, in config.pm:

  # note: no package statement




  use DBI;
  use CGI;
  use Mail::Sendmail;

Back in the main program:

  use config;




  my $userid = CGI::param('userid');
  # etc...

my variables are file-scoped when declared outside of any code blocks, which means that we can't easily declare lexical variables in config.pm and have them show up in the main program. We can co-opt the import() method of config.pm to create package variables in the main program, though:

  # back in config.pm:




  my %config = (
    maxusers => 100,
    retriespersecond => 2,
    loglevel => 5
  );




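  # defined as config::import so that "use config" can find it,
  # even though this file has no package statement
  sub config::import {
    my $caller = caller;
    no strict 'refs';   # the glob names are computed on purpose
    foreach my $i (keys %config) {
      # alias $maxusers and friends in the caller's package to our values
      *{$caller.'::'.$i} = \$config{$i};
    }
  }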

This will at least squelch any warnings Perl would otherwise emit and let us return to importing configuration dependent values from a configuration file.

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryIntermediate

See Also


AnonymousSubroutineObjects

Problem

Perl's http://wiki.slowass.net/?ObjectOriented programming interface sucks. InstanceVariables are slow to access, and require a special syntax that is unsightly and prevents easily converting procedural code to OO code. Subclass data can clobber superclass instance data unless manually prefixed with the class name. Or, you just want to integrate the http://wiki.slowass.net/?LambdaProgramming style with the http://wiki.slowass.net/?ObjectOriented programming style to harness their respective strengths.

Solution

Mix http://wiki.slowass.net/?ObjectOriented and http://wiki.slowass.net/?LambdaProgramming styles to deal with the ugliness of Perl's InstanceVariables syntax, write more concise programs, and use scopes for implicit data flow rather than manually passing to and reading from constructors.

http://wiki.slowass.net/?LambdaProgramming's concept of automatically binding code to a particular variable created at a particular time is the perfect replacement for using hashes or arrays to contain instance data. Instead, routines magically hang on to the normal scalars, hashes, arrays, and so forth that were defined with my when the object was created. All that is needed is a block to set up the lexical context (define the my variables in) and a little glue.

Blessed Coderef

One of the strengths of the http://wiki.slowass.net/?LambdaProgramming style is ease of doing things like InnerClasses. Logic and data can be bundled without having to type out the name of each variable to pass it to a constructor, and without having to read it and assign it names in the constructor. Instead, the new object is automagically coupled to the object that it was created in, and variables that are in scope when the new object was created remain in scope and available for use in future calls.

You should be familiar with this by now:

  package Preferences;




  sub new {
    my $class = shift;
    my %args = @_;
    bless {color=>$args{'color'}, petname=>$args{'petname'}, street=>$args{'street'} }, $class;
  }




  sub query_color { return $_[0]->{'color'}; }
  sub set_color { return $_[0]->{'color'} = $_[1]; }
  # other accessors here




  1;




  package main;




  $| = 1;




  print "Whats your favorite color? "; my $color = <STDIN>;
  print "Whats your pets name? "; my $petname = <STDIN>;
  print "What street did you grow up on? "; my $street = <STDIN>;




  my $foo = new Preferences (color=>$color, petname=>$petname, street=>$street);

The string "color" appears ten times. Ten! In Perl, no less. If I wrote out the accessors for the other arguments, this would be repeated for each variable. Shame. If we trust the user to pass in the right things to the constructor, we can get rid of two. Still, even typing each thing eight times is begging for a typo to come rain on your parade.

If you're a LISP or Scheme programmer, you wouldn't even consider writing an atrocity like this. You'd probably write something like:

  package main;




  $| = 1;




  sub get_preferences {
    print "Whats your favorite color? "; my $color = <STDIN>;
    print "Whats your pets name? "; my $petname = <STDIN>;
    print "What street did you grow up on? "; my $street = <STDIN>;
    return MessageMethod->new(sub {
      my $arg = shift;
      ({
        query_color => sub { return $color; },
        set_color => sub { $color = shift; return 1; },
        get_street => sub { return $street; },
        # etc
      }->{$arg} || sub { die "Unknown request: $arg" })->(@_);
    });
  }




  my $ob = get_preferences();
  print $ob->get_street(), "\n";

First, the { query_name => sub { } }->{$arg}->(@_) construct is a sort of switch/case statement. It creates an anonymous hash of names to functions, then looks up one of the functions by name, using the first argument passed in. Once we have that code reference, we execute it and pass it our unused arguments. Then we've added a default case to it, so we don't try to execute undef as code. This could have been coded using if/elsif/else just as easily.

Don't confuse this case idiom, {name=>sub{}}->{$arg}->(@_), with =8-()<>, the rubber chicken idiom.

The get_preferences() routine sets some variables, then returns a code reference. my variables get created when they're declared, and they don't get destroyed until no one can see them any more. Since the code reference we're returning (the sub { } handed to MessageMethod) can see these variables, and we can see this code reference, Perl doesn't get rid of them. They continue to live on, and keep their same values, as if the subroutine they were created in had never returned. What this means to us is that we don't have to copy the value from one variable into a hash when we create an object! This saves us having to type the variable name as we pass it and specify what the variable should be named in the hash that gets passed, and then saves us from having to do the same steps in reverse once the object gets the hash passed to it. With the same security, we've cut the use of the word "color" in half, down to 5 uses.

If you think of Perl's sub { } feature as preserving the exact state of "my" variables in a routine, you'll think of countless applications for returning anonymous subroutines. Object Oriented object creation is much more explicit, so you may find yourself getting lost in code like this. If you figure out where an anonymous subroutine was defined, then start reading the code leading up to it, you'll find where the variables are declared, and where their values are set. The cost of the reduced typing is reduced redundancy, which can make the program both harder and easier to read at the same time.
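The point about sub { } preserving the state of my variables can be seen in a very small sketch. make_counter() is a made-up helper; each call creates a fresh $count that only the returned subroutine can reach.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure over its own private $count.
sub make_counter {
  my $count = shift;
  return sub { return $count++; };
}

my $first  = make_counter(10);
my $second = make_counter(100);

# Each closure keeps advancing its own $count, long after
# make_counter() has returned.
print $first->(), ' ', $first->(), ' ', $second->(), ' ', $first->(), "\n";
```

To find out what $count means inside the anonymous subroutine, read upward from where the sub { } was written: the declaration is a few lines above it, in make_counter().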

Normal Objects:

Lexically Defined Object:

There is one little mystery left, though. Code references are dereferenced using the $ref->(@args) syntax. The $ref->function(@args) syntax is reserved for objects. We shouldn't be able to call $ob->get_street() in our example on a code reference -- unless that code reference has been blessed into a package. It just so happens that that is exactly what MessageMethod does.

  package MessageMethod;




  sub new {
    my $type = shift;
    return $type->('new', @_) if ref $type eq __PACKAGE__;  # called on an instance: pass the request to the closure
    my $ref = shift; ref $ref eq 'CODE' or die;
    bless $ref, $type;
  }




  sub AUTOLOAD {
    my $me = shift;
    (my $method) = $AUTOLOAD =~ m/::(.*)$/;
    return undef if $method eq 'DESTROY';
    return wantarray ? ($me->($method, @_)) : scalar $me->($method, @_);
  }




  1;

Given a code reference, MessageMethod blesses it into its own package. There are no methods aside from new() and AUTOLOAD(). AUTOLOAD() handles undefined methods for Perl, and since there are no methods, it handles all of them. (There is an exception to that, where new() has to pass off requests.) AUTOLOAD() merely takes the name of the function it is standing in for and sends that as the first argument to a call to the code reference, along with the rest of the arguments. We're translating $ob->foo('bar') into $ob->('foo', 'bar'). This does nothing but let us decorate our code reference with a nice OO style syntax.

This is similar to Python's method lookup logic XXX, in that it returns the method as an object.

Blessed Hash full of Coderefs

The previous example was simplicity itself. This one is usefulness itself. Doing if and elsif in a chain to inspect an argument to see which clause to run to simulate methods is the http://wiki.slowass.net/?LambdaProgramming paradigm, but http://wiki.slowass.net/?ObjectOriented's concept of automatically dispatching to methods is superior. Obviously, a single code reference isn't enough to let OO do its dispatch magic. We need something larger - something like a hashref that contains a bunch of coderefs, one coderef per method. The normal thing to do in Perl is to put all of the code directly in the package, using the symbol table (or stash, or namespace, or what have you) to hold all of the code references, and define the code references using a simple named sub statement. This doesn't allow each instance of the object to have different code references lexically bound to different InstanceVariables. We need private storage for the code references and the anonymous version of the sub statement. We need hashclosure.pm.

  # place this code in hashclosure.pm




  # tell Perl how to find methods in this object - run the lambda closures the object contains




  sub AUTOLOAD {
      (my $method) = $AUTOLOAD =~ m/::(.*)$/;
      return if $method eq 'DESTROY';
      our $this = shift;
      if(! exists $this->{$method}) {
        my $super = "SUPER::$method";
        return $this->$super(@_);
      }
      $this->{$method}->(@_);
  } 
  
  1;

This code translates method calls into invocations of anonymous subroutines by the same name inside of a blessed hash: when a method is called, we look for a hash element of that name, and if we find it, we execute it as a code reference.

The flow of control goes something like:

/\/\/\/\

graph: {

  title: "Dispatch Order"
  color: lightcyan
  manhattan_edges: yes
  edge.color: lilac
  scale: 90




  node: { title:"A" label: "$foo = new Foo(); \n$foo->bar();" }
  node: { title:"A1" label: "Foo::new()" }
  node: { title:"B" label: "Foo::AUTOLOAD()" }
  node: { title:"C" label: "$foo->{'bar'}->() runs" }
  edge: { sourcename:"A" targetname:"A1" anchor: 1}
  edge: { sourcename:"A" targetname:"B" anchor: 2}
  edge: { sourcename:"B" targetname:"C" }

}

/\/\/\/\

Dropping the above code verbatim into a .pm file doesn't change the package (there is no package statement), so it defines an AUTOLOAD() method for the current package. This is a WrapperModule of sorts. http://wiki.slowass.net/?LambdaClosures and our AUTOLOAD() method work together to provide http://wiki.slowass.net/?ImplicitThis-like easy access to $this and InstanceVariables. We can use object instance specific field variables directly without having to dereference a hash.

  package Foo;




  sub new {




    my $class = shift;  
    my %args = @_;
    our $this;




    my $foo;
    my $bar;




    bless {




      get_foo => sub { return $foo },
      set_foo => sub { $foo = shift },
      get_bar => sub { return $bar },
      set_bar => sub { $bar = shift },




      get_foo_bar_qux => sub {
        return $this->get_foo(), $this->get_bar(), get_qux();
      },




      dump_args => sub {
        foreach my $i (keys %args) {
          print $i, '=', $args{$i}, "\n";
        }
      },




    }, $class;




  }




  sub get_qux { return 300; }

This blesses an anonymous hash reference into our package, Foo. This hash reference contains method names as keys and anonymous subroutines as values. AUTOLOAD() knows how to look into our hash and find methods by name, and run them, rather than looking for methods in the normal place.

our is a strange beast. It gives us a my style lexical alias to a local style variable. We could use a local variable here, but our has a nicer syntax, and it keeps us in the lexical mode of thought.

$foo, $bar, $this, $class and %args are all lexical variables, and the subroutines we create with sub { } are http://wiki.slowass.net/?LambdaClosures because they reference these variables. By referencing them, they bind to the one specific copy that was created when new() is entered. That means that each object has its own private $foo, for instance, and can access it directly. get_qux() is defined as a normal method in the preceding example. In any OO Perl code, failing to do something like $this->method() to call other functions in your code prevents inheritance from overriding those methods. Using this syntax keeps open the possibility of creating a TemplateMethod. Where we explicitly don't want subclass redefinitions of methods to be used, we can use the $this->Foo::method() syntax, where Foo is the name of the class to search for method() in, usually our own package or our direct parent.

Methods may also be defined normally and placed next to new(). This is useful for utility methods, or what C++ or Java would call static methods. Methods must be defined this way to be called without using the $this->method() syntax. $this->method() is required to get the AUTOLOAD() logic to kick in, as otherwise Perl has no knowledge of how to locate the code responsible for handling your method.

This is my own personal favorite idiom for creating objects in Perl: it requires the least code to achieve, and the least work on my part, and the least chance of error.

CPAN class Package

In other news, PerlMonks:116725 defines a class package usable as such:

    my $class = new class sub{
        my $field = shift;
        $this->field = $field;
        $this->arrayref = [1,2,3];
        $this->hashref = {a => b, c => d};
        $this->method = sub{ return $this->field };
    };

...allowing the anonymous, inline construction of classes.

Abigail's Inside-Out Objects

Quoting Abigail:

  package BaseballPlayer::Pitcher; {
      use vars '@ISA';
      @ISA = 'BaseballPlayer';
 
      my (%ERA, %Strikeouts);
   
      sub ERA        : lvalue {$ERA        {+shift}}
      sub Strikeouts : lvalue {$Strikeouts {+shift}}
      sub DESTROY {
          my $self = shift;
          delete $ERA {$self}, $Strikeouts {$self}
      }
  }

Taking this apart, lexical data is used instead of nametable variables, which doesn't seem to make any difference. Rather than indexing the blessed reference by a constant field name to come up with a per-object, per-field storage slot, one of these lexicals is indexed by the stringified object reference.
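The indexing trick can be shown in isolation with a small sketch. The Pitcher class here is hypothetical, and it keys the field hash with refaddr() from the core Scalar::Util module rather than the stringified reference used in the quoted code; that substitution is this example's assumption, not Abigail's.

```perl
use strict;
use warnings;

package Pitcher;   # hypothetical class
use Scalar::Util 'refaddr';

my %era;           # one lexical hash per field, keyed by object address

sub new {
  my ($type, $value) = @_;
  my $this = bless \(my $anon), $type;   # the object itself holds no data
  $era{ refaddr($this) } = $value;
  return $this;
}

sub era { my $this = shift; return $era{ refaddr($this) }; }

# Without this, entries for dead objects would pile up in %era.
sub DESTROY { my $this = shift; delete $era{ refaddr($this) }; return; }

package main;

my $p = Pitcher->new(2.95);
my $q = Pitcher->new(4.10);
print $p->era(), ' ', $q->era(), "\n";
```

Because %era is a lexical in the class's file, code outside the class has no hash to poke at; it has to go through era().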

See PerlMonks:178518 for more.

Categories

See Also

ConstraintSystem

Problem: Difficult to solve problem. All of the related logic is huge and no control structure or organisation seems to be adequate.

Solution: Model the problem using connectors and logic items. Let scenarios play themselves out recursively across the network.

This rather large example was adapted from code in http://wiki.slowass.net/?StructureAndInterpretationOfComputerPrograms, an excellent book. The program was originally written in Scheme, the language featured in Structure and Interpretation. Even if you write nothing but Perl, C or Java all of your life, I highly recommend this book. Decomposing problems into functions is the first cautious step in learning to program; decomposing programs into objects could be seen as a second and factoring out the recursive nature of complex problems a third. Complexity is the program killer, and its management is paramount in scaling programs as well as solving problems.

In addition to adapting the example to Perl, I've adapted it to use objects rather than lambda closures. This made the code longer and less elegant, but verbose, boorish implementation is considered a virtue in this day and age.

Constrain::new() is a wee little factory that spits out subtypes on demand. We're not actually using this right now in our code because by the time we got to the bottom of the file we forgot that we had done that. Using a factory as such is a good policy: it adds a layer of abstraction in the creation of objects, and each layer of abstraction is insurance against change, giving us a single place where we can translate the old interface to whatever is new.

Constrain::Adder is our first and only serious logic component. It should be refactored into a base class with a http://wiki.slowass.net/?TemplateFunction and a sample implementation. Perhaps I'll get around to this later XXX, as it would make this code more directly useful to random purposes. When told what its value should be, it lashes back, sending a message out on one of its connectors informing the objects on that connector what value they must have to satisfy the condition. The Adder object does whatever it must to satisfy the constraint. The three inputs are identical in that they are all connections that may be connected to any other logic devices. They differ in that the last will be the sum of the first two. If any single input's value is unspecified, a value will be sent out on that connector. If all values are specified after a new value comes in, the last output is the one we force to fit the constraint. Should it not wish to do so, it may in turn push out a new value by calling setvalue() on the connector. Eventually, a solution that all nodes are happy with will be arrived at, or else every possibility will be exhausted. XXX, return failure should we be unable to arrive at a solution. This component has exactly three connections.

Constrain::Probe describes an object that merely repeats to the screen any value it is told to have. This component has exactly one connection.

Constrain::Constant asserts a value on the wire and refuses to accept any other value. Should it be told to be another value, it fights back, pushing its own value back out again. This component has exactly one connection.

Finally, Constrain::Connector isn't a logical component at all - just a wire or messenger between them. It has no behavior of its own other than to relay messages from one connection out on the other connections. The above components each have a fixed number of inputs - not so with a connector. A connector may be connected to any number of components.

  package Constrain;
  
  # component - anonymous functions that exert force on each other.
  #             these are generated by various functions, much as an
  #             object in OO Perl would be created.
  
  sub new {
  
    my $type = shift;
    my $subtype = shift;




    return new Constrain::Adder(@_)     if $subtype eq 'adder';
    return new Constrain::Constant(@_)  if $subtype eq 'constant';
    return new Constrain::Probe(@_)     if $subtype eq 'prober';
    return new Constrain::Connector(@_) if $subtype eq 'connector';




    warn "Unknown Constrain subtype: $subtype";




  }
  
  package Constrain::Adder;
  
  sub new {
    my $type = shift;
  
    my $a1 = shift;         # the name of our first connector
    my $a2 = shift;         # the name of 2nd connector we are tied to
    my $sum = shift;        # the name of 3rd connector we are tied to
  
    my $obj = { a1=>$a1, a2=>$a2, sum=>$sum };
    bless $obj, $type;
  
    $a1->xconnect($obj);
    $a2->xconnect($obj);
    $sum->xconnect($obj);
  
    return $obj;
  
  }
  
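  sub forgetvalue {
     my $this = shift;
  
     # retract whatever values we pushed onto our connectors...
     $this->{a1}->forgetvalue($this);
     $this->{a2}->forgetvalue($this); 
     $this->{sum}->forgetvalue($this); 
     # ...then recompute from whatever values remain
     $this->setvalue();
  }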
  
  sub setvalue {
    my $this = shift;
    local *a1 = \$this->{a1};
    local *a2 = \$this->{a2};
    local *sum = \$this->{sum};
  
    if($a1->hasvalue() and $a2->hasvalue()) {
      $sum->setvalue($a1->getvalue() + $a2->getvalue(), $this);
  
    } elsif($a1->hasvalue() and $sum->hasvalue()) {
      $a2->setvalue($sum->getvalue($sum) - $a1->getvalue($a1), $this);
  
    } elsif($a2->hasvalue() and $sum->hasvalue()) {
      $a1->setvalue($sum->getvalue() - $a2->getvalue(), $this);
    }
  }
  
  sub dump {
     my $this = shift;
     local *a1 = \$this->{a1};
     local *a2 = \$this->{a2};
     local *sum = \$this->{sum};
  
     print("a1 has a value: ", $a1->getvalue(), "\n") if $a1->hasvalue();
     print("a2 has a value: ", $a2->getvalue(), "\n") if $a2->hasvalue();
     print("sum has a value: ", $sum->getvalue(), "\n") if $sum->hasvalue();
  }
  
  package Constrain::Constant;
  
  sub new {
  
    my $type = shift;
  
    my $value = shift;     # our value. we feed this to anyone who asks.
    my $connector = shift; # who we connect to.
  
    my $obj = { value => $value, connector => $connector };
  
    bless $obj, $type;
  
    $connector->xconnect($obj);
    $connector->setvalue($value, $obj);
  
    return $obj;
  
  }
  
  sub setvalue {
    my $this = shift;
    # ignore the value we were told to have and reassert our own
    $this->{connector}->setvalue($this->{value}, $this);
  }
  
  sub getvalue {
    my $this = shift;
    return $this->{value};
  }
  
  package Constrain::Probe;
  
  sub new {
  
    my $type = shift;
    my $connector = shift;
    my $name = shift;
  
    my $obj = { connector => $connector, name => $name };
    bless $obj, $type;
  
    $connector->xconnect($obj);
  
    return $obj;
  
  }
  
  sub setvalue {
    my $this = shift;
    my $name = $this->{name};
    print "Probe $name: new value: ", $this->{connector}->getvalue(), "\n";
  }
  
  sub forgetvalue {
    my $this = shift;
    my $name = $this->{name};
    print "Probe $name: forgot value\n";
  }
  
  package Constrain::Connector;
  
  sub new {
  
    my $type = shift;
    my $obj = { informant=>undef, value=>undef, dontreenter=>0, constraints=>[] };
    bless $obj, $type;
  
  }
  
  sub hasvalue {
    my $this = shift;
    return $this->{informant}; 
  }
  
  sub getvalue {
    my $this = shift;
    return $this->{value};
  }
  
  sub setvalue {
    my $this = shift;
    local *constraints = \$this->{constraints};
    my $newval = shift;
    my $setter = shift or die;
  
    return if $this->{dontreenter}; $this->{dontreenter} = 1;
  
    $this->{informant} = $setter;
    $this->{value} = $newval;
  
    foreach my $i (@$constraints) {
      $i->setvalue($newval, $this) unless $i eq $setter;
    } 
  
    $this->{dontreenter} = 0; 
  }
  
  sub forgetvalue {
    my $this = shift;
    local *constraints = \$this->{constraints};
    my $retractor = shift;
  
    if($this->{informant} eq $retractor) {
      $this->{informant} = undef;
      foreach my $i (@$constraints) {
        $i->forgetvalue($this) unless $i eq $retractor;
      }
    }
  }
  
  sub xconnect {
    my $this = shift;
    local *constraints = \$this->{constraints};
    local *value = \$this->{value};
    my $newconstraint = shift or die;
  
    push @$constraints, $newconstraint;
    $newconstraint->setvalue($value, $this) if defined $value;
  
  }
  
  package main;
  
  my $a =         Constrain::Connector->new();
  my $a_probe =   Constrain::Probe->new($a, 'a_probe');
  
  my $b =         Constrain::Connector->new();
  my $b_probe =   Constrain::Probe->new($b, 'b_probe');
  
  my $c =         Constrain::Connector->new();
  my $c_probe =   Constrain::Probe->new($c, 'c_probe');
  
  my $a_b_adder = Constrain::Adder->new($a, $b, $c);
  
  my $a_const =   Constrain::Constant->new(128, $a);
  
  my $b_const =   Constrain::Constant->new(256, $b);

XXX - constraint system example - IK system using X11::Protocol (http://www.cpan.org/modules/by-module/X11/ Protocol)?

XXX- constraint system example - traffic lights

XXX- constraint system with tied variables... $tempcelcius = 100; print $tempfarenheight;

http://wiki.slowass.net/?CategoryExport, http://wiki.slowass.net/?CategoryPattern

See Also

RevisitingNamespaces

Problem: Functionality doesn't exist in a class, but should. Subclassing to add the functionality isn't appropriate - the features are needed in all existing objects retroactively.

Solution: Change over to its package temporarily to define some new methods in the existing package.

Scenarios

Examples

  *{'ExistingPackage::new_function'} = sub {
     # new accessor
  };




  sub ExistingPackage::new_function {
    # new accessor
  };

Any object created from ExistingPackage will instantly have a method, new_function(), after this code is run. Both examples do essentially the same thing. The first is uglier, but allows closures to be taken. Perl still considers the new function to be in the package it was defined in [25]. This means that we can't use lexical data that was in scope when ExistingPackage was originally created, nor can we use UseVars and OurVariables that exist in ExistingPackage.
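A runnable sketch of both forms, with a stand-in ExistingPackage and two invented methods, get_name() and describe(). Note that the object is created before either method exists.

```perl
use strict;
use warnings;

# Stand-in for a class we did not write and do not want to subclass.
package ExistingPackage;
sub new { my ($type, $name) = @_; return bless { name => $name }, $type; }

package main;

my $old = ExistingPackage->new('created earlier');

# Named form: installed at compile time, cannot capture run-time lexicals.
sub ExistingPackage::get_name { my $this = shift; return $this->{name}; }

# Glob form: assigned at run time, so the closure can capture $prefix.
my $prefix = 'name is';
{
  no strict 'refs';
  *{'ExistingPackage::describe'} = sub {
    my $this = shift;
    return "$prefix " . $this->get_name();
  };
}

print $old->describe(), "\n";   # the object predates the new methods
```

Method lookup happens at call time through the package, which is why $old picks up methods added after it was blessed.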

Examples exist of using lexically scoped my variables for the purpose of keeping people away from your data. While not completely foolproof, it does make it inconvenient. UseVars and OurVariables are easier.

  sub ExistingPackage::new_function {
    my $self = shift;
    local *existing_var = \${ref($self) . '::existing_var'};
    # code here that uses $existing_var freely, as if it were in
    # our package scope.
    $existing_var++;
  }

The local *glob = selfref idiom is, well, ugly. We compute the name of the variable: we find the package that $self was blessed into, concatenate "::existing_var" onto it, and use the result as a soft reference. A reference is then taken to that soft reference using the backslash operator. See http://wiki.slowass.net/?ComputedReferences.
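
A runnable sketch of the same idiom under strict. The package body and the bump() method are invented for the example; the "our" declaration and "no strict 'refs'" are additions needed only because strict is switched on.

```perl
use strict;
use warnings;

{
  package ExistingPackage;
  our $existing_var = 10;              # the package variable we want to reach
  sub new { return bless {}, shift }
}

sub ExistingPackage::bump {
  my $self = shift;
  our $existing_var;                   # lets strict accept the short name
  no strict 'refs';                    # the computed name is a symbolic reference
  # Alias our own *existing_var to the scalar living in the package
  # that $self was blessed into, for the duration of this call only.
  local *existing_var = \${ ref($self) . '::existing_var' };
  return ++$existing_var;
}

my $obj = ExistingPackage->new();
print $obj->bump(), "\n";                     # 11
print $ExistingPackage::existing_var, "\n";   # 11
```

The increment lands in ExistingPackage's variable, not in a copy, because the glob assignment aliases rather than copies.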

  local $ExistingPackage::new_variable;

$new_variable will be static - individual objects won't have their own copy. See "StaticVariables" in StaticVariables. This is usually not the desired result.

To add, and initialize, a new variable to each instance of objects from this package, redefine the constructor, new(), before any objects are made from it:

  do {
    my $oldnew = \&ExistingPackage::new;
    *ExistingPackage::new = sub {
      my $self = $oldnew->(@_);
      $self->{new_variable} = compute_value();
      $self;
    };
  };

This defines a new() routine in ExistingPackage that invokes the old new() routine using the reference we saved in $oldnew. This reference is passed all of the arguments given to the replacement new() routine. This assumes that the data structure underlying objects defined by ExistingPackage is a hash reference: $self->{new_variable} would need to be changed to something similar to $self->[num] if it were an array. compute_value() is a placeholder for whatever logic you really want to do. We insert this value forcefully, disregarding http://wiki.slowass.net/?AccessorsPattern. Finally, we return the modified $self. The return operator breaks the tying on perl 5.6.1 and perhaps later, so we just let the last value of the block fall through.

Use the x = sub { } version of sub: it waits until run time to return the code reference, allowing a closure to be taken. See http://wiki.slowass.net/?LambdaClosure. We're taking a closure on $oldnew, in this example: we have to wait to bind to this variable until the specific instance of that variable we want has been created. This is being done inside of a do { } block so as not to pollute our lexical context with variables that don't need to be in our scope. See http://wiki.slowass.net/?LexicalsMakeSense.
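
Put together with a toy class, the constructor wrapping looks roughly like this. The original new() and the compute_value() body are assumptions made up for the sketch; only the do { } block follows the text above.

```perl
use strict;
use warnings;

{
  package ExistingPackage;
  sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
  }
}

# Hypothetical stand-in for whatever logic computes the new field.
sub compute_value { return 42 }

do {
  my $oldnew = \&ExistingPackage::new;
  no warnings 'redefine';            # we are replacing new() on purpose
  *ExistingPackage::new = sub {
    my $self = $oldnew->(@_);        # class name and arguments pass straight through
    $self->{new_variable} = compute_value();
    $self;
  };
};

my $obj = ExistingPackage->new(color => 'red');
print "$obj->{color} $obj->{new_variable}\n";   # red 42
```

Outside the do { } block, $oldnew is no longer reachable by name; only the replacement new() still holds it.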

Example: B::Generate - [26]

See Also

ObjectsAndRelationalDatabaseSystems

It's useful to borrow the idea of relationships from Relational Database Management Systems (relational databases). In fact, many large enterprise applications are actually collections of specialized applications all built around one large data warehouse. Records in the database are represented in software by objects. These objects can be queried for things they are related to: other objects representing records.

In the same way, object oriented systems that aren't backed by a database still have one to one, one to many, and many to many relationships between the objects. It can be a useful exercise to sit down with pencil and paper and map out which kinds of relationships which kinds of objects are going to have. This often exposes design limits in a system, where things can happen in reality that the software isn't prepared for.

[27]

  package DBI::Record;
  
  my %foreign_keys;
  
  sub import {
    my $class = shift;  # "use" passes the package name first
    # read foreign key information
    # translates a foreign key column name to its table
    # $foreign_keys{'FooID'} = 'Foo';
    while(my $i = shift) {
      $foreign_keys{$i} = shift;
    }
  }
  
  sub new {
    my $type = shift; $type = ref $type if ref $type;
    my $me = { };
    my $usage = 'usage: new DBI::Record $dbh, $sql | ($sth, $sth->fetchrow_hashref())';
    my $dbh = shift; ref $dbh or die $usage;
    my $rs; my $sth; my $sql;
    die $usage unless @_;
    if(ref $_[0]) { 
      $sth = shift;
      $rs = shift or $rs = $sth->fetchrow_hashref();
    } else {
      $sql = shift;
      $sth = $dbh->prepare($sql); $sth->execute(); $rs = $sth->fetchrow_hashref();
    }
    $me->{'database_handle'} = $dbh;
    $me->{'record_set'} = $rs;
    $me->{'statement_handle'} = $sth;
    # generate accessors
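    foreach my $i (keys %$rs) {
      next unless exists $foreign_keys{$i};  # only foreign key columns get accessors
      *{$i} = sub {
        my $me = shift;
        # use this object's own handle and record, not whichever ones were
        # in scope when the accessor was last (re)defined
        my $dbh = $me->{'database_handle'};
        my $sth = $dbh->prepare("select * from $foreign_keys{$i} where $i = ?");
        $sth->execute($me->{'record_set'}->{$i});
        my $newrs = $sth->fetchrow_hashref;
        return $me->new($dbh, $sth, $newrs); 
      };
    }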
    bless $me, $type;
  }
  
  sub next {
    my $me = shift;
    my $sth = $me->{'statement_handle'} or return undef;
    my $newrs = $sth->fetchrow_hashref() or return undef;
    return $me->new($me->{'database_handle'}, $sth, $newrs);
  }
  
  package main;
  
  use DBI::Record
    CustomerID => 'Customers',
    BillID => 'Bills';
  
  use DBI;
  
  my $dbh = DBI->connect("DBI:Pg:dbname=geekpac", 'ingres', '') or die $DBI::errstr;
  
  my $customer = new DBI::Record $dbh, "select * from Users limit 1";
  
  my $bill = $customer->BillID();
  while($bill) {
    print $bill->{'BillID'}, " ", $bill->{'Amount'}, "\n";
    $bill = $bill->next();
  }

This makes it easy to navigate relationships in a relational database system, but doesn't do a lot for us in the way of reporting.

[28]

http://www.objectarchitects.de/ObjectArchitects/orpatterns/index.htm - design patterns of objects and relational database systems

http://www.martinfowler.com/eaaCatalog/singleTableInheritance.html is actually somewhat interesting, and begins to touch on the idea of data cubes - flattening and restoring hyperdimensional data structures into two dimensions.

http://www.martinfowler.com/eaaCatalog/concreteTableInheritance.html is prevalent, though not insightful, and should be illustrated here in depth. It ties in with http://wiki.slowass.net/?BeanPattern, too. If a relational database and an object system each match up part to part - table for class - the object system will work through normal delegation and composition. The database will also "just work", though newbies will need to learn how to write large-ish queries that do lots of outer joins. Detecting NULL for key fields replaces ->can(), or is used when constructing queries that build systems of objects, and ->can()/->isa() information is needed. This gets into datacube stuff, too.

http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryIntermediate, http://wiki.slowass.net/?CategoryExpert

See Also

SelfJoiningData

Problem: Reporting on a database with only one table, or a self-referential data structure.

Solution: Use relational database capabilities to join anyway, joining the data to itself, write queries to normalize the data [29], or refactor the database. For datastructures, use loops, temporary hashes, and http://wiki.slowass.net/?BreadthFirstRecurssion to make sense of the data [30].

"SelfJoiningData" in SelfJoiningData refers to data not spread across objects or tables or different types. Instead, everything is of the same type or in the same table, and this data forms a web of internal references. Sometimes powerful, usually applied incorrectly. In the database world, referred to as "non-normalized" or "flat".

When relational data isn't normalized, you get something like:

  select self1 as foo, self2 as bar
  from   self as self1,
         self as self2
  where  self1.name = self2.param

Note how the table self is being joined against the table self. This is where the name comes from.

Or something like:

  foreach my $i (keys %hash) {
    if(exists $hash{$i} and exists $hash{$hash{$i}}) {
      push @results, [$i, $hash{$i}, $hash{$hash{$i}}];
    }
  }

Ugly, slow, crude, effective. People have been known to write code generators and SQL generators when faced with degenerate cases like these that automate ugliness production. I guess you could categorize this as an http://wiki.slowass.net/?AntiPattern in the form of a http://wiki.slowass.net/?CodeSmell.
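
To see the hash version do its thing, here it is on a tiny made-up data set (the names and the "reports to" meaning are assumptions for the sake of the example):

```perl
use strict;
use warnings;

# Hypothetical flat data: each person maps to who they report to.
my %hash = (
  alice => 'bob',
  bob   => 'carol',
  carol => 'dave',
);

my @results;
foreach my $i (sort keys %hash) {
  # "Join" the hash to itself: feed the value back in as a key.
  if (exists $hash{ $hash{$i} }) {
    push @results, [ $i, $hash{$i}, $hash{ $hash{$i} } ];
  }
}

print join(' -> ', @$_), "\n" for @results;
# alice -> bob -> carol
# bob -> carol -> dave
```

carol produces no row because dave never appears as a key, which is the in-memory equivalent of an inner join dropping unmatched records.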

The more fields you want back from the database, the more times you have to self-join the data.

Pretend you have a database that stores form submissions. forms has one record per post, but since an HTML form has any number of name-value pairs, several entries in parameters reference the entry in forms for any given post. Given a formid, we want to extract a few named parameters: "email", "name", and "gender":

  select  p1.value as email,
          p2.value as name,
          p3.value as gender
  from    forms,
          parameters as p1,
          parameters as p2,
          parameters as p3
  where   forms.formid = ?
  and     p1.formid = forms.formid
  and     p1.name = 'email'
  and     p2.formid = forms.formid
  and     p2.name = 'name'
  and     p3.formid = forms.formid
  and     p3.name = 'gender'

Each additional field requires 4 additional lines in our query. If we were joining the additional tables in, it would take 2:

  select emails.email   as email, 
         names.name     as name, 
         genders.gender as gender
  from   forms, emails, names, genders
  where  forms.formid  = ?
  and    forms.nameid   = names.nameid
  and    forms.emailid  = emails.emailid
  and    forms.genderid = genders.genderid

Obviously, lumping everything in one table would simplify further in this case, and in this case would be perfectly acceptable. When not all of the columns describe the primary key and only the primary key, the database design degenerates. "SelfJoiningData" in SelfJoiningData usually comes about as a means to cope with trying to report on such degenerate databases.

Simply, different kinds of things should be placed in different tables. The structure (which table references which IDs in which other tables) shows the relationship between the different kinds of things. One to many relationships require two tables; many to many relationships require three.

Datastructures

Solutions:

XXX this place is a placeholder. You can fix it up yourself, or you can wait for me to do it. If you are here expecting a finished version of something, stick to http://wiki.slowass.net/?TinyCGI:assemble.cgiPerlDesignPatterns and don't wander off the path.

  <TakeFive> Juerd: I think I'm going to go with multiple tables after all.
             It will save me headaches in the future.
             And I can pull them (assuming the 'header' record is in %h) with:
             "select value from $h{datatype} where id = $h{id} order by sequence"




  <Juerd> subqueries, subqueries, subqueries, joins.
          subqueries, subqueries, subqueries, joins.




  <TakeFive> :)




  <Juerd> Ideally, you don't use a query just to have enough information to do the next




  <scrottie> Except for meta applications like database admins, usually you 
             don't want a variable table name.
 
  <Juerd> That is correct.
          Same goes for column names.




  <Juerd> TakeFive: Think symbolic references




  <scrottie> If all of your things are of essentially of the same type, put atleast
             the parts of them that describe the primary key in one table. You
             can always OutterJoin a lot of other tables, so you get kind of an 
             ObjectOriented like thing going on - everything "is a" foo, but 
             you have some MixIns going on as well.
             Not that MixIns are encouraged in OO, but it is kind of the same idea.




  <TakeFive> scrottie: the problem is (going back to the oceanographic implementation)
             right now, with the dataset I have, all the actual data is floating 
             point numbers --




  <scrottie> i've always said we should dump our problems in the ocean =)




  <TakeFive> salinities, depths, current speeds and such.
             but now i've been told i need to support character fields, 
             latitude/longitude pairs, and timestamps, and ultimately, I'll
             need to be able to generate pictures of buoys as they float, or 
             purely text output.
             If I use a single table, I'll always have to check what kind of data 
             I have.




  <scrottie> n:1 relationships break out into another table,
             so if you have a bunch of buoys for one given primary record (what 
             is the primary record anyway?), then throw them all in another
             table.
             If you have an arbitrary number of other things of types that you can't 
             anticipate, you could promote everything to the same object type
             and allow recurisve references between objects ;)
             Perlers tend to write databases like that... like perlmonks's 
             codebase... but it is best not to talk of such things




  <Juerd> It's called Everything




  <scrottie> If you have a lot of different things, you can set up an attribute-value 
             pairs table. Think of HTML forms. Someone posts a form. That
             gets a record in a Posts table, lets say. it has a bunch of name value 
             pairs. Each of those gets a record in the Attribute table,
             where each record references the Posts table entry.




  <TakeFive> scrottie: ah, add a column like: "datatype" for each record.




  <scrottie> Yeah. You lose the ability to cleanly join at that point - 
             everything is nested subqueries with another self-join for each 
             (lag) record you want from Attributes. ugly. So, that way is 
             sometimes - seldom but sometimes - better. 




  <Juerd> subqueries, subqueries, subqueries, joins.




  <scrottie> Well, the value in the attribute-value pair will always be the 
             largest thing - if you're holding binary data, it will be a blob. 
             Few databases index blobs.




  <scrottie> You probably don't want SelfJoiningData, and you don't want to 
             promote all records to the same type. That leaves creating a lot
             of tables, one for each type of thing, and doing a lot of joins
             and OutterJoins. It kind of sucks, but it is powerful, and a lot
             less ugly in the end than any alternative. 
             The relation between tables is based purely on references
             between fields. Never list table names in a database as a means
             of creating references.

Juerd is right. Use lots of joins and subqueries to pull your data together from multiple different tables. As Juerd says, ideally, you should get your result in one query. Use only IDs in auxiliary tables or http://wiki.slowass.net/?HingeTables. You can easily create more auxiliary tables and reference the primary table from them. Only queries that want this information will know about and know to ask for it to be joined in. [31]

See Also

http://wiki.slowass.net/?CategoryToDo, http://wiki.slowass.net/?CategoryIntermediate, http://wiki.slowass.net/?CategoryPattern

ManyToManyRelationship

Individual people each having a distinct set of traits can be expressed cleanly with three tables. Any fewer would lead to "SelfJoiningData" in SelfJoiningData and an ever increasing number of columns holding primary, secondary, tertiary, and so on indefinitely, positional attributes, which can only be used in queries with great pain and modifications each time a new slot is added. Any more tables leads to the same problem, but with constant introduction of tables rather than attributes.

The People table exists exactly as expected: a list of people, with columns for things that each and every person has.

All attribute tables need to be generalized into one. Any further attribute specific data may http://wiki.slowass.net/?OutterJoin to the Attribute table but should not be included in the Attribute table itself: it holds only the columns which describe each and every attribute and nothing else.

Normalize the People table so that AttributeIDs don't exist in it. The rules of normalization state that any time we're attempting to hold an array of data in one record, we really want a 3rd independent table. This is exactly what we need to do. People contains PeopleID. Attribute contains AttributeID. PeopleToAttribute contains one PeopleID and one AttributeID per record. Each PeopleID may occur any number of times, and each AttributeID may occur any number of times, and these may occur in any combination. PeopleToAttribute is a hinge table. Hinge tables can and should contain data specific to the combination of the two IDs.
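
The same three-table shape can be mocked up in memory to get a feel for it. Everything here is hypothetical: the sample rows, the Year column on the hinge, and the attributes_of() and people_with() helpers merely stand in for the joins a real database would do.

```perl
use strict;
use warnings;

# In-memory stand-ins for the People and Attribute tables, keyed by ID.
my %People    = ( 1 => 'Ann', 2 => 'Raj' );
my %Attribute = ( 10 => 'tall', 11 => 'left-handed', 12 => 'vegetarian' );

# The hinge table: one PeopleID and one AttributeID per record, plus
# data specific to that combination (here, the year it was recorded).
my @PeopleToAttribute = (
  { PeopleID => 1, AttributeID => 10, Year => 2001 },
  { PeopleID => 1, AttributeID => 12, Year => 2003 },
  { PeopleID => 2, AttributeID => 12, Year => 2002 },
);

# All attributes of one person: the equivalent of a three table join.
sub attributes_of {
  my $people_id = shift;
  return map  { $Attribute{ $_->{AttributeID} } }
         grep { $_->{PeopleID} == $people_id }
         @PeopleToAttribute;
}

# All people sharing one attribute: the same hinge, read the other way.
sub people_with {
  my $attribute_id = shift;
  return map  { $People{ $_->{PeopleID} } }
         grep { $_->{AttributeID} == $attribute_id }
         @PeopleToAttribute;
}

print join(', ', attributes_of(1)), "\n";   # tall, vegetarian
print join(', ', people_with(12)), "\n";    # Ann, Raj
```

Adding a new attribute for a person is one more hinge record; neither People nor Attribute grows a column.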

Badly designed databases often require repeated application of this concept. A database that lists wholesale and retail prices, primary product category, and secondary category should be turned into a table listing Products, one listing Category, one listing PriceOption, and two hinge tables. ProductToCategory shows the membership of each product in each category by virtue of having a record making the connection:

  select count(*) as isDongle
  from   Product, Category, ProductToCategory
  where  Product.ProductID = ProductToCategory.ProductID
  and    ProductToCategory.CategoryID = Category.CategoryID
  and    Category.Name = 'Dongle'

This query returns the number of dongles in the database. Replacing count(*) with a specific field list would return details of each dongle.

PriceOption contains records "Wholesale" and "Retail", but cannot contain the actual prices. Attempting to do so would be no better than putting the prices directly into Product. ProductToPriceOption not only connects Product to the pricing options available for it as listed in PriceOption, but for each pricing option, contains the actual price. Normalization dictates that each and every column in a table depend on (that is, be specific to) the key, the entire key, and nothing but the key. The price depends on more than ProductID in Product because it also depends on PriceOptionID in PriceOption. Likewise, it does not depend on just PriceOptionID, but also ProductID. ProductToPriceOption is keyed by both PriceOptionID and ProductID, so each record it contains is specific to both values. 3.95 may be the Retail price for "The Moon is a Harsh Mistress".

Understanding object relationships is impossible without understanding the rules of data normalization. Failing to do so will result in obnoxiously complex object structures with no apparent solution for making sense of them. It is critical to deciding when to create objects, and where to place data in them.

See Also: http://wiki.slowass.net/?PracticalSQLHandbook, http://wiki.slowass.net/?OutterJoin, "SelfJoiningData" in SelfJoiningData

http://wiki.slowass.net/?CategoryConcept, http://wiki.slowass.net/?CategoryPattern, http://wiki.slowass.net/?CategoryIntermediate

OneToOneRelationshipsTurnIntoOneToManyRelationships

One to many relationships become many to many as original design assumptions are relaxed. This lets us model more complex situations. Objects that contained one instance of a kind of another object may find themselves holding an array. Methods that operated on this object explicitly now need to be told which one to operate on. Defining an iterator and moving the interface to the iterator lets us keep our concept of one and only object, but adds the concept of moving to the next object in the list. Places where the single object was implicitly manipulated need only be wrapped in a loop.
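
A small sketch of moving the interface onto an iterator. The AddressIterator class, its street() method, and the sample addresses are all hypothetical; the point is only that the old single-object call survives unchanged and one new method, next(), is added.

```perl
use strict;
use warnings;

# A holder that used to wrap exactly one address now wraps a list.
package AddressIterator;
sub new {
  my ($class, @addresses) = @_;
  return bless { items => [@addresses], pos => 0 }, $class;
}
# The old single-object interface, now answered by the current item.
sub street {
  my $self = shift;
  return $self->{items}[ $self->{pos} ];
}
# The one new concept: advance to the next item; false when exhausted.
sub next {
  my $self = shift;
  return ++$self->{pos} < @{ $self->{items} };
}

package main;

my $address = AddressIterator->new('1 Main St', '22 Oak Ave');

# Code that implicitly handled "the" address is simply wrapped in a loop.
do {
  print $address->street(), "\n";
} while ($address->next());
```

This sketch assumes the list is never empty, which matches the original "there is always exactly one" assumption being relaxed to "one or more".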

[32]

See also: "IteratorInterface" in IteratorInterface, "CompositePattern" in CompositePattern, "ObjectsAndRelationalDatabaseSystems" in ObjectsAndRelationalDatabaseSystems, "BiDirectionalRelationshipToUnidirectional" in BiDirectionalRelationshipToUnidirectional

See also: UML, SQL, http://wiki.slowass.net/?PracticalSQLBook

http://wiki.slowass.net/?CategoryRefactoring

BiDirectionalRelationshipToUnidirectional

Problem: Relationship between objects is confusing. Responsibility is ambiguous, or calls bounce back and forth.

Solution: Apply http://wiki.slowass.net/?WholeObject, "InnerClasses" in InnerClasses, http://wiki.slowass.net/?ModelViewController, or a "ConstraintSystem" in ConstraintSystem as necessary.

In its most basic form:

  my $output = new Output;
  my $backend = new Backend($output);
  $output->set_backend($backend);

Or:

  my $output = new Output($this);

Refactor as a:

Should $output know about $backend or $this? Does it make sense for $backend to place a call into $output that requires a call back into $backend?

WholeObject

Contention for data of exactly this sort is a strong hint at a http://wiki.slowass.net/?ValueObject refactoring: move the data that is of common interest into a http://wiki.slowass.net/?ValueObject and pass it whole, negating the need for a callback. See http://wiki.slowass.net/?WholeObject, "PassingState" in PassingState, http://wiki.slowass.net/?ValueObject

InnerClasses

Using "InnerClasses" in InnerClasses, the parent class need not have its API burdened with the special needs and interfaces of its child, and the scope of the circular reference can be greatly reduced. An object created inside of the parent object, attached to its lexical data, can be sent off in place of $this.

  my $output = new Output;
  my $backend = new Backend($output->get_backend_adapter());
  $output->set_backend($backend->get_output_adapter());

Or...

  my $output = new Output($this->get_output_adapter());
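
One way get_backend_adapter() might be built is with a closure wrapped in a tiny class. This is a guess at the shape, not the InnerClasses implementation itself: Output::BackendAdapter, emit(), lines(), and Backend's run() are all invented for the sketch.

```perl
use strict;
use warnings;

package Output;
sub new { my $class = shift; return bless { lines => [] }, $class }

# Public API: unchanged, and knows nothing about Backend.
sub lines { my $self = shift; return @{ $self->{lines} } }

# Hand out a small object exposing only what a backend needs. The
# closure is attached to $self's data, so the backend never sees
# the whole Output object.
sub get_backend_adapter {
  my $self = shift;
  return Output::BackendAdapter->new(sub { push @{ $self->{lines} }, @_ });
}

package Output::BackendAdapter;
sub new  { my ($class, $writer) = @_; return bless { writer => $writer }, $class }
sub emit { my $self = shift; $self->{writer}->(@_); return }

package Backend;
sub new { my ($class, $out) = @_; return bless { out => $out }, $class }
sub run { my $self = shift; $self->{out}->emit('result: 42'); return }

package main;

my $output  = Output->new();
my $backend = Backend->new($output->get_backend_adapter());
$backend->run();
print join("\n", $output->lines()), "\n";   # result: 42
```

Backend can call emit() and nothing else, so Output is free to change the rest of its interface without Backend noticing.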

ModelViewController

As http://wiki.slowass.net/?ObjectOrientedDesignHeuristics says, books don't shelve themselves, nor do shelves put the books on them, but there must exist a librarian. Consider mapping the problem as a http://wiki.slowass.net/?ModelViewController. This is of more interest when dealing with too much complexity in the logic than with too much complexity in the code.

ConstraintSystem

Sometimes an odd web of objects that participate in group-think is unavoidable, or even desirable. Bite the bullet and do it right. See "ConstraintSystem" in ConstraintSystem.

Resources

See Also

http://wiki.slowass.net/?CategoryRefactoring, http://wiki.slowass.net/?CategoryIntermediate

NamedArguments

When adding and removing arguments, it can be difficult to remember the order you wanted them in. Using a hash, you can do away with arbitrary ordering.

  sub foo {
    my %args = @_;
    my $color = $args{color};
    my $number = $args{number};
    # ...
  }
  
  foo(color=>'red', number=>13);
  
The || operator lets you easily provide defaults and error checking:
  
  sub foo {
    my %args = @_;
    my $color = $args{color} || 'red';
    my $number = $args{number} || die 'number=> parameter required';
    # ...
  }
  
Or, you may explicitly list the argument names and defaults, providing a self-documenting framework:
  
  sub foo {
    my %args = (
      Arg1        => 'DefaultValue',
      Blah        => 42,
      Bleh        => 60*60*24,
      Hostname    => undef,
      @_
    );
    # Handle error-checking here
    defined $args{Hostname} or die 'Required parameter "Hostname" not defined';
  }
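
One thing to watch with ||: it tests truth, not presence, so a legitimate 0 or empty string is treated as missing. A hedged variation on foo() that tests with exists and defined instead (the return value is made up so there is something to print):

```perl
use strict;
use warnings;

# Variation on foo(): 'number' may legitimately be 0, so test for
# presence rather than truth.
sub foo {
  my %args = @_;
  my $color = defined $args{color} ? $args{color} : 'red';
  exists $args{number} or die "number=> parameter required\n";
  my $number = $args{number};
  return "$number $color";
}

print foo(number => 0), "\n";                     # 0 red
print foo(color => 'blue', number => 13), "\n";   # 13 blue

eval { foo(color => 'blue') };
print "error: $@" if $@;                          # error: number=> parameter required
```

With the || version, foo(number => 0) would die even though a number was supplied.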

See Also

PassingState

Synopsis: The arguments to the first function are augmented and passed on to the next function, possibly recursively.

When: Context is built up during evaluation, and this context ultimately turns into the result. Recursive code that deals with a variable set of variables. In place of code that uses $$var to directly access the symbol table.

  my $context = { 
        increment    => sub { my $context = shift; $context->{sum}++; return ''; },
        currentvalue => sub { my $context = shift; return $context->{sum}; }
  };




  sub expand_macros {
    my $context = shift;
    my $text = shift;
    my $macro = qr{([A-Z][A-Z0-9]{2,})};
    $text =~ s/$macro/$context->{lc($1)}->($context)/ge;
    return $text;
  }




  expand_macros($context, "INCREMENT INCREMENT The current value is: CURRENTVALUE");

This is fairly straightforward: We can pass $context and some text containing the macros "INCREMENT" and "CURRENTVALUE" to expand_macros(), and the macros will increment the current value of $context->{sum} and return the value. This is a simple template parser that finds escapes in text and replaces them with the result of a piece of code passed in through a hash. However, since we're maintaining our context in a hash reference, we can do this recursively:

  $context->{doubleincrement} = sub { 
    my $context = shift; 
    expand_macros($context, "INCREMENT INCREMENT CURRENTVALUE"); 
  };




  expand_macros($context, "The current value is: DOUBLEINCREMENT");

Maintaining state in a hashref rather than the symbol table only requires us to be vigilant in passing the hash ref around. We have access to the updated state in the hashref after evaluation has finished. We can take this same context and pass it off again. In our example, we could template something else, reusing our same state and collection of macro definitions.
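
Assembled into one runnable piece, the state can be watched as it carries over between calls. The only additions are an explicit sum => 0 and the $first and $second variables that hold the results:

```perl
use strict;
use warnings;

my $context = {
  sum          => 0,   # explicit starting value (an addition for clarity)
  increment    => sub { my $context = shift; $context->{sum}++; return ''; },
  currentvalue => sub { my $context = shift; return $context->{sum}; },
};

sub expand_macros {
  my $context = shift;
  my $text = shift;
  my $macro = qr{([A-Z][A-Z0-9]{2,})};
  $text =~ s/$macro/$context->{lc($1)}->($context)/ge;
  return $text;
}

# A macro that itself expands macros, using the same context.
$context->{doubleincrement} = sub {
  my $context = shift;
  return expand_macros($context, "INCREMENT INCREMENT CURRENTVALUE");
};

my $first = expand_macros($context, "INCREMENT INCREMENT The current value is: CURRENTVALUE");
print "$first\n";
print "sum is now $context->{sum}\n";   # sum is now 2

my $second = expand_macros($context, "The current value is: DOUBLEINCREMENT");
print "$second\n";
print "sum is now $context->{sum}\n";   # sum is now 4
```

The second call starts from where the first left off because both were handed the same hash reference; nothing was stashed in a global.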

See Also

FunctionTemplating

Synopsis: Creating a custom complement of code cleans crufty object access syntax.

When: Your code is bloating due to another cumbersome interface.

http://wiki.slowass.net/?ForthLanguage teaches that no language will be well suited to every problem, so the best language is one that is well suited to creating languages for expressing solutions in general. Huh? Instead of attacking the problem head on, step back and formulate a plan involving intermediate steps to the goal. We've designed data structures using objects. We've engineered programs using objects as building blocks. When's the last time we designed a language to solve a problem? Any language lets you create functions, but Forth lets you create functions that create other functions (Forth calls functions "words"). We don't need to cook up a VM and a syntax to do this, though we could. Perl's VM and syntax will work.

This is kind of like an Abstract Factory. Objects certainly give you a way to generalize a solution, but they don't give you a mechanism to express a solution. If the solution involves making lots of method calls, the algorithm can get swamped down in OO syntax to the point where it is hidden. Removing the excess syntax is one way of refactoring code. Everyone benefits from the clarity, especially when you're trying to formulate a language as an intermediate step to solving a tough problem.

Let's take template processing as an example. Let's say you've got various sorts of templates: templates representing HTML fragments, templates representing email templates, database queries, and so on. You could create objects to represent each type of thing, and give each a stringify() method that requires a hash argument of values to template in. You would then write a huge amount of code, mostly method calls, loops, and string concatenations.

Or... XXX untested code:

  # defining our mini language:




  # format of our macro escapes. returns the name of the macro.
  $macro = qr{([A-Z][A-Z0-9]{2,})};
  sub fetchvalue {
    my $symbol = lc(shift());
    my $ob = shift;
    return $ob->{$symbol} if defined $ob->{$symbol};
    return $symbol->($ob) if defined &{$symbol};  # if it's available as a function, recurse into it
    return $$symbol;                              # assume it's a scalar
  }




  sub createtemplate {
    my $name = shift; 
    my $text = shift;
    *{$name} = sub {
      my $ob = shift;
      my $text = $text; # private copy, so we don't ruin the original
      $text =~ s{$macro}{ fetchvalue($1, $ob); }oges;
      return $text;
    };
  }




  sub createquery {
    my $name = shift;    # name of function to create 
    my $sql = shift;        # query this function will execute
    my $inner = shift;     # name of function to call with each result, optional
    my @queryargs; $sql =~ s{('?)$macro\1}{push @queryargs, lc($2);'?'}oges;
    my $sth = $dbh->prepare($sql);
    *{$name} = sub {
      my $ob = shift;
      my $row;
      my $ret;
      $sth->execute(map { fetchvalue($_, $ob); } @queryargs);
      my @names = @{$sth->{'NAME'}};
      while($row = $sth->fetchrow_arrayref()) {
        # store each item by its column name
        for(my $i=0;$i < @names; $i++) {
          $ob->{$names[$i]} = $row->[$i];
        }
        # if we're supposed to send each record to someone, do so.
        $ret .= $inner->($ob) if($inner);
      }
      $sth->finish();
      return $ret;
    };
  }




  # writing code in our mini language:




  createquery('readnames', qq{
    select Name as name from Users where Name is not null
  });




  createquery('readnumberbyageinstate', qq{
      select count(*) as number, Age as agegroup
      from Users where State = STATE
      group by Age 
  }, 'drawbargraph');




  createtemplate('drawbargraph', qq{
    <div align="left"><img src="reddot.png" height="20" width="NUMBER"></div>
  });




  print readnames();
  print readnumberbyageinstate({state=>'MD'});

Let's take a look at what we've factored out in this example:

createtemplate() is a simple example. createquery() is more advanced. A simple example appeared in Chapter XXX 3 where we created accessors for ourself.

For any task that is suited to our mini language, we've completely factored out several tedious syntactical things. We're now free to work in a very concise, expressive, short-hand language. Yet, we still have all of the power of Perl available - we haven't given up anything.

The key elements are:

The returned code reference is lexically bound to the data you passed to it. The data passed to it could be any datatype, including objects, scalars, and most importantly, code references. Logic is factored out of the main program into the inner part of the "create" routine, inside of the anonymous subroutine block.
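
Stripped of the database and templating details, those key elements fit in a few lines. createlabeler() and everything passed to it are hypothetical names for this sketch:

```perl
use strict;
use warnings;

# A generator: returns a sub lexically bound to $prefix and $formatter.
sub createlabeler {
  my ($prefix, $formatter) = @_;
  return sub {
    my $value = shift;
    # The logic factored out of the main program lives in here.
    return $prefix . $formatter->($value);
  };
}

# The data passed in can be scalars and, most importantly, code references.
my $shout   = createlabeler('LOUD: ',  sub { uc shift });
my $bracket = createlabeler('quiet: ', sub { '[' . shift() . ']' });

print $shout->('hello'), "\n";     # LOUD: HELLO
print $bracket->('hello'), "\n";   # quiet: [hello]

# Giving the generated function a name in the symbol table is optional:
{
  no strict 'refs';
  *{'main::shout'} = $shout;
}
print shout('again'), "\n";        # LOUD: AGAIN
```

Each generated sub keeps its own $prefix and $formatter, so $shout and $bracket never interfere with one another.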

Creating a symbol table entry (assigning the anonymous subroutine to a glob of the given name) is optional. This step can be skipped and done manually if you find yourself mostly creating functions to pass to other functions:

  print createquery($readnumberbystatesql, {drawpiechart => createpiechart() }, 'drawpiechart');

It is traditional in languages like Lisp and Scheme to skip naming things unless actually necessary.

Next time you're getting bogged down in syntax, ask yourself if a function generator could be written that would take care of the tedious busy work.

See Also

AssertPattern

Die early, die often, get closer to the root of the problem. Don't let an error in one part of the program trigger problems much later in a distant, unrelated part of the program. Check argument types, provide accessors to enforce policies and handle state changes in objects there so that they are responsible for keeping themselves consistent.

Have or die $!; appear after each statement whose failure you aren't prepared to handle otherwise. These clauses should absolutely litter your program.

Should something fail unexpectedly, execution will stop at the exact point of failure, and the diagnostic will be fresh and useful.

Things that a program assumes it can count on are called invariants. These are the basic assumptions that the program was written under. or die documents these in your code for all to see.
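
As a sketch of invariants spelled out with or die, here is an accessor that polices its own arguments. The Thermostat class, its 5..35 range, and the messages are all invented for illustration:

```perl
use strict;
use warnings;

package Thermostat;
sub new { return bless { target => 20 }, shift }

# Each "or die" line documents one assumption the rest of the code relies on.
sub set_target {
  my ($self, $degrees) = @_;
  defined $degrees                 or die "set_target: degrees required\n";
  $degrees =~ /^-?\d+(?:\.\d+)?$/  or die "set_target: '$degrees' is not a number\n";
  $degrees >= 5 && $degrees <= 35  or die "set_target: $degrees is out of range 5..35\n";
  $self->{target} = $degrees;
  return $self;
}

package main;

my $t = Thermostat->new();
$t->set_target(22);

eval { $t->set_target('warm') };
print "caught: $@" if $@;   # caught: set_target: 'warm' is not a number
```

The object can never hold a nonsense target, so code further away never has to wonder why the heating is set to 'warm'.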

People resort to the printed documentation only when they can't figure out the interface for themselves. This applies equally to video games, digital time pieces, and software APIs. Diagnostics are more helpful than the manual, in this sense.

This is part of "encapsulation" or "data hiding". Making part of your interface public is committing to support that design indefinitely. Don't do this lightly.

Sometimes you want your application to die with a useful diagnostic should an invariant be deviated from, for example when you're first installing and configuring an application, or when you're debugging it. Other times, you want it to do its absolute best to keep on trucking, for instance when that program is running as a mission critical service. Making no attempt to trap errors achieves the first case. For the second case, wrap eval {} around calls. Where you apply this technique depends on how much recovery logic you're willing to write. The more recovery logic, the closer the protective eval {} can be to the possible failure points. Less recovery logic means that fewer eval {} statements are needed.

  eval {
    run_query();
  };
  if($@) {
     $dbh = DBI->connect("DBI:Pg:dbname=blog;host=localhost;port=5432", 'scott', 'foo');
     run_query();
  }
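
The same shape can be wrapped up for reuse. This is a minimal sketch, assuming a hypothetical helper named with_recovery() that is not part of DBI or of this text; the $connected flag merely stands in for a database handle.

```perl
use strict;
use warnings;

# hypothetical helper: run $work; if it dies, run $recover once and retry
sub with_recovery {
    my ($work, $recover) = @_;
    my $result;
    return $result if eval { $result = $work->(); 1 };
    warn "first attempt failed: $@";
    $recover->();
    return $work->();      # a second failure propagates to the caller
}

my $connected = 0;         # stands in for a database handle
my $answer = with_recovery(
    sub { die "not connected\n" unless $connected; return 42 },
    sub { $connected = 1 },
);
```

Only one retry is attempted, so a failure that recovery can't fix still dies with a fresh diagnostic.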

See Also

CodeAsData

Code and data, time and space, lo...

What follows is a rant on the nature of programs. While not suitable for consumption in any format, it is a thought I need to develop further, as it affects every explanation in this text.

Some declarations run as they are encountered; some affect future behavior. Run time program modification - self modifying code - is an example of affecting future behavior; so are lambda closures and object instantiations. Some languages are purely sequential: C. Some are purely declarative: Ocaml.

For many people, datastructures are seen as influencing future behavior only, likewise code is always seen as executing immediately.

Tied data, for example in our http://wiki.slowass.net/?AccessorsPattern, and using object accessors to fetch and stow data give datastructures the property of executing immediately. Changing the implementation on the fly by using the polymorphic nature of objects makes linear comprehension of code impossible. So do lambda closures.
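
To make "datastructures that execute immediately" concrete, here is a small sketch using tie. The Counted class is invented for this illustration; it makes an ordinary looking scalar run code on every read and write.

```perl
package Counted;      # hypothetical tie class, for illustration only
use strict;
use warnings;

sub TIESCALAR { my ($class, $value) = @_; return bless { value => $value, reads => 0 }, $class }
sub FETCH     { my $self = shift; $self->{reads}++; return $self->{value} }
sub STORE     { my ($self, $value) = @_; $self->{value} = $value; return $value }

package main;

tie my $price, 'Counted', 10;
my $first  = $price;             # runs FETCH
my $second = $price;             # runs FETCH again
$price = 12;                     # runs STORE
my $reads = tied($price)->{reads};
```

Nothing at the point of use says that reading $price runs a subroutine, which is exactly why linear reading of the code stops being reliable.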

See Also: http://wiki.slowass.net/?AccessorsPattern, "FunctionalProgramming" in FunctionalProgramming, "AbstractFactory" in AbstractFactory, http://wiki.slowass.net/?AbstractObject, http://wiki.slowass.net/?LexicalsMakeSense, http://wiki.slowass.net/?LambdaClosure, http://wiki.slowass.net/?LexicallyScopedVariables

NonReenterable

Problem: Work is handled through recursion or delegation. Sometimes it is delegated back, or recursion never terminates due to a problem out of our control.

Solution: Use a re-entrance lock to detect and gracefully handle the situation. Set the lock on entrance and clear it on exit.

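  my $lock;

  sub notify_all {
    if($lock) {
      # we were called again before the first call finished
      warn "Don't respond to an event with an event!";
      return;
    }
    $lock = 1;     # set the lock on entrance
    foreach my $listener (@listeners) {
      $listener->send_event(@_);
    }
    $lock = 0;     # clear it on exit
  }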

In most cases, being called back by the object that you just called is an error worth warning about. Sometimes re-entry isn't an error at all, and you can silently refuse it. "ConstraintSystem" in ConstraintSystem uses this idea to propagate values across a network where some nodes are willing to budge and others aren't. Usually this manifests as a list of notification recipients that receive a notification, and one needs to send yet another notice to all of them except the sender of the original message, but doesn't happen to know which one originated it. This situation crops up with the Gnutella protocol, where nodes relay messages to every peer except the originating one, but the mesh of connections can cause the message to be accidentally routed to the originator anyway. Simply tracking which messages you originated and ignoring requests to forward them again prevents a condition where a host transmits the same message out onto the net over and over.
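
A sketch of the "silently refuse" variety, assuming each message carries some unique id. The forward() routine, the message ids and the peer names are all hypothetical, and this is not how any real Gnutella client is written.

```perl
use strict;
use warnings;

my %seen;    # message id => count, for every message already handled

# returns the peers a message should be forwarded to, or nothing for a repeat
sub forward {
    my ($msg_id, $from, @peers) = @_;
    return if $seen{$msg_id}++;          # silently refuse re-entry
    return grep { $_ ne $from } @peers;  # everyone except the sender
}

my @first  = forward('m1', 'alice', qw(alice bob carol));
my @second = forward('m1', 'carol', qw(alice bob carol));
```

A long running node would also have to expire old entries from %seen, which this sketch ignores.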

In yet another case, the one illustrated above, we're flatly denying recursion. If one node responds to events of type "A" with events of type "B", and another node responds to events of type "B" with events of type "A", and we did no reentry checking, Perl would explode. It would use up all of the memory the OS would allow it, grind away for a while, blow up like a big grinding balloon, and just pop. Nobody wants that. Putting rules in place for which events may be replied to with another event will prevent this situation as well. If you do opt for policy, you may elect to put some limits in place for testing purposes. These kinds of arbitrary limits can never be set correctly: what you consider an impossibly large value becomes unworkably small in a few years. For debugging, detecting what looks like a runaway condition can be a life saver:

  sub notify_all {

    if($testing) {
      # never do this in production code!
      my $calldepth = 0;
      $calldepth++ while(caller($calldepth));
      die "arbitrary limit exceeded: stack depth too deep, possible runaway recursion detected"
        if $calldepth > 100;
    }

    foreach my $listener (@listeners) {
      $listener->send_event(@_);
    }

  }

Recursion and Locking on User Data

Recursing through user data. Sends chills up your spine, doesn't it? User data is notorious for being conniving, manipulative, and just plain old abusive, contrived rubbish. Why do users write HTML files that include a second HTML file that includes the first? To piss you off, that's why.

  # expand includes in HTML templates
  # eg, 

  my $numfound;

  FOUNDSOME: 

  $numfound = 0;

  $tmpl =~ s{}{
    die "invalid include path: '$2'" if $2 =~ m{(?:^|/)\.\.(?:/|$)};
    open my $f, "$inputdir/$2" or die "include not found: $inputdir/$2 $!";
    read $f, my $repl, -s $f;
    close $f;
    $numfound++;
    $repl;    # the value of the last statement becomes the replacement text
  }gie;

  goto FOUNDSOME if($numfound);

This would run indefinitely (if permitted by the universe) if a user tried the A includes B, B includes A attack. Preventing reentry into some method wouldn't work. If we created a method, we would need to be able to reenter it to include more than one file deep. Of course, we could make it non-recursive, but it wouldn't do http://wiki.slowass.net/?TheRightThing. Things that seem like they should work, don't.

Limiting the stack depth is another option, but it runs into the "BusySpin" in BusySpin antipattern: no value can be chosen that is large enough for extreme but valid cases, yet small enough to shut out denial of service attacks. Someone fetching a malicious construct over and over will easily take the server out.

Refusing to include the same page twice would also fail to do http://wiki.slowass.net/?TheRightThing, and would throw cold water on most template arrangements that sites actually, really use. It is, however, simple to implement. Limiting the number of times that a single file may be included helps put the brakes on things, but runs afoul of the above warnings concerning "BusySpin" in BusySpin as well:

  my $numfound;
  my %done;  # added this

  FOUNDSOME: 

  $numfound = 0;

  $tmpl =~ s{}{
    die "invalid include path: '$2'" if $2 =~ m{(?:^|/)\.\.(?:/|$)};
    die "file '$2' included entirely too many times" if $done{$2}++ > 30;   # added this
    open my $f, "$inputdir/$2" or die "include not found: $inputdir/$2 $!";
    read $f, my $repl, -s $f;
    close $f;
    $numfound++;
    $repl;    # the value of the last statement becomes the replacement text
  }gie;

  goto FOUNDSOME if($numfound);

Another solution is to maintain a stack, perhaps a http://wiki.slowass.net/?SimpleStack, and continuously examine it for repeated sequences. Such attempts are prone to occurrences of a "RaceCondition" in RaceCondition, and there is usually an upper limit on how large a stack segment it will compare to the rest of the stack. For example, if the code only checks for repeated patterns of two through 300 stack frame entries, someone need only create a circular inclusion attack that spans 301 pages. D'oh!

Correctly solving this problem could be done by computing routes between pages and making a map of which pages include which others. See http://wiki.slowass.net/?DepthFirstRecursion. This is far too complex for most people to stomach. If you happen to write a solution to this, please, by all means, post it here. It should be a straightforward adaptation of http://wiki.slowass.net/?DepthFirstRecursion to use the example above. Yes, I'm just too lazy to do it myself right now.
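
One possible shape for this, offered as a hedged sketch and not as the adaptation asked for above: expand includes depth first and remember which pages are on the current path, so a page that tries to include one of its own ancestors is caught no matter how long the loop is. The %page hash stands in for files on disk, and the include syntax shown is an assumption made for this example only.

```perl
use strict;
use warnings;

# stand-in for files on disk: name => contents (include syntax is assumed)
my %page = (
    'a.html' => 'A[<!--#include file="b.html" -->]',
    'b.html' => 'B[<!--#include file="c.html" -->]',
    'c.html' => 'C',
    'x.html' => 'X[<!--#include file="y.html" -->]',
    'y.html' => 'Y[<!--#include file="x.html" -->]',
);

# expand includes depth first; %$active holds the pages on the current path
sub expand {
    my ($name, $active) = @_;
    $active ||= {};
    die "include not found: $name\n" unless exists $page{$name};
    die "circular include: $name\n"  if $active->{$name};
    $active->{$name} = 1;
    (my $text = $page{$name}) =~
        s{<!--#include\s+file="([^"]+)"\s*-->}{ expand($1, $active) }ge;
    delete $active->{$name};             # leaving this branch of the tree
    return $text;
}

my $ok    = expand('a.html');
my $cycle = eval { expand('x.html'); 1 } ? '' : $@;
```

Because a page is only "active" while it is being expanded, the same file may still be included many times from different places; only a genuine loop is refused, and no arbitrary limit is involved.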

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/118845 is an example of recursion in perl that checks for recursive traps.

See Also

SelectPollPattern

Stand in for threads. Much more efficient in Unix. Named for the use of select().

A single inner loop waits for either a timeout, signal, or a filehandle to become available for read or write. Coordination of reading and writing and responding to other events is handled in a single, centralized, and often massive central loop. Contrast with threads where each thread has its own loop and blocks waiting for exactly one thing at any given time: an object lock, input, another thread to wake it, and so on.

Many http://wiki.slowass.net/?ObjectOriented systems are built on top of select() and its style. AWT, the Java http://wiki.slowass.net/?AbstractWindowsToolkit, builds an http://wiki.slowass.net/?ObjectOriented facade on top of the event oriented X11 platform on Unix-like hosts.

The "SelectPollPattern" in SelectPollPattern is counter-intuitive for most people to use. It requires manual management of the CPU, and each task has to completely return to the inner loop and then be called fresh. [33]

    # select() wants bit vectors built with vec(), one bit per fileno()
    my $inbitmask = '';
    vec($inbitmask, fileno($sh), 1) = 1;
    vec($inbitmask, fileno($si), 1) = 1;

    # select(readtest, writetest, exceptiontest, max wait)
    # select() overwrites its arguments, so hand it a copy of the mask
    select(my $ready = $inbitmask, undef, undef, 0);

    if(vec($ready, fileno($sh), 1)) {
      # $sh is ready for read
    }

    if(vec($ready, fileno($si), 1)) {
      # $si is ready for read
    }

Done in a loop, several sources of input - perhaps the network, a GUI interface, pipes connected to other processes - could all be managed. The last argument to select() is typically 0 or undef, though it is sometimes other numbers. If it is undef, select() will wait indefinitely for input. If it is 0, select() will return immediately, input ready or not. Any other number is a number of seconds to wait, floating point numbers accepted. As soon as any monitored input or output handle becomes ready, select() will return. select() reports its results by modifying the bit masks it is given, turning off any bits that correspond to fileno() bit positions that aren't ready, which is why the example hands it a copy. Each bit that we set must be tested to see if it is still on. If it is, that filehandle is ready for read or write. Filehandles that we want to monitor for read are passed as a bitmask in the first argument position of select(). The second argument of select() is the filehandles to monitor for write, and the third, for exceptions.

    if(vec($ready, fileno($si), 1)) {
      $si->process_input();
    }
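
For something that can be run as is, here is a small self-contained sketch that polls a pipe with the four argument select(). The pipe is only a stand-in for a socket or terminal, and the variable names are this example's own.

```perl
use strict;
use warnings;

pipe(my $reader, my $writer) or die "pipe: $!";

# build the read mask: one bit per file descriptor
my $rin = '';
vec($rin, fileno($reader), 1) = 1;

# nothing written yet: a zero timeout returns at once with no bits set
my $before = select(my $rout = $rin, undef, undef, 0);

syswrite($writer, "hello\n") or die "syswrite: $!";

# now the reader is ready; wait up to one second for it
my $after = select($rout = $rin, undef, undef, 1);
my $line = '';
if (vec($rout, fileno($reader), 1)) {
    sysread($reader, $line, 1024);
}
```

sysread() and syswrite() are used here because mixing buffered reads such as <$fh> with select() can leave data sitting unseen in Perl's own buffer.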

Filehandles may be blessed into classes [34], and then methods called to handle the event where input becomes available for read. This is easy to implement, simple, and sane - to implement. Using it is another story.

  package IO::Network::GnutellaConnection;

  use base 'IO::Handle';

  sub process_input {
    my $self = shift;
    $self->read(...);
  }

Each access must promptly return for other handles to be served. This is a big requirement. Unheeded, a user interface could repeatedly cause network traffic to time out, or one unresponsive process reading on a pipe could lock up the process writing on the pipe - see http://wiki.slowass.net/?PerlGotchas for more. These cases are more numerous and insidious than thread CPU starvation issues.

To effectively cope with not having a return stack of its own, each line of processing associated with an IO handle must take pains to keep track of where it was in its code, what it is doing, and what it expects to do next. See "StatePattern" in StatePattern for an implementation of this and more discussion.
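
A very small sketch of that bookkeeping, assuming a line oriented protocol: each connection gets a record holding whatever partial input has arrived so far. The feed() routine and the connection id are hypothetical names for this example.

```perl
use strict;
use warnings;

# one record of state per connection, keyed by a hypothetical connection id
my %state;

# called from the main loop with whatever bytes just arrived; returns the
# complete lines now available and remembers any partial line for next time
sub feed {
    my ($id, $chunk) = @_;
    my $conn = $state{$id} ||= { buffer => '' };
    $conn->{buffer} .= $chunk;
    my @lines;
    push @lines, $1 while $conn->{buffer} =~ s/^(.*?)\r?\n//;
    return @lines;
}

my @a = feed('peer1', "HELLO WOR");
my @b = feed('peer1', "LD\r\nNEXT");
```

The routine returns right away whether or not a full line has arrived, which is what the central loop needs from every handler.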

Non-Blocking I/O

Sometimes select() will tell you that an I/O channel is ready to read from, but attempting to read still blocks. Non-blocking I/O can be used as a safety net.
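
A minimal sketch of that safety net on a Unix-like system, using a pipe in place of a socket: with the handle switched to non-blocking mode, a read that has nothing to return fails at once with EAGAIN instead of hanging the whole loop.

```perl
use strict;
use warnings;
use IO::Handle;
use Errno qw(EAGAIN EWOULDBLOCK);

pipe(my $reader, my $writer) or die "pipe: $!";
$reader->blocking(0);                  # reads now return instead of waiting

my $buf = '';
my $got = sysread($reader, $buf, 1024);
my $would_block = !defined($got) && ($! == EAGAIN || $! == EWOULDBLOCK);

syswrite($writer, "data") or die "syswrite: $!";
my $count = sysread($reader, $buf, 1024);
```

An undefined return with $! set to EAGAIN means "try again later", not "end of file"; end of file is a return value of 0.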

When accepting connections on a TCP/IP socket, non-blocking I/O is a must:

  use Socket;
  use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);
  use POSIX qw(:errno_h :fcntl_h); 

  my $proto = getprotobyname('tcp');
  socket($server, PF_INET, SOCK_STREAM, $proto)                 or die "socket: $!";
  setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))   or die "setsockopt: $!";
  bind($server, sockaddr_in($port, INADDR_ANY))                 or die "bind: $!";
  listen($server, SOMAXCONN)                                    or die "listen: $!";

  # non blocking listens: set O_NONBLOCK on the listening socket

  my $flags = fcntl($server, F_GETFL, 0)                        or die "fcntl: $!";
  fcntl($server, F_SETFL, $flags | O_NONBLOCK)                  or die "fcntl: $!";

  while(1) {

    # accept() returns false right away when no connection is waiting
    my $paddr = accept($client, $server) or next;
    my($remoteport, $iaddr) = sockaddr_in($paddr);
    my $remotehostname = gethostbyaddr($iaddr, AF_INET);

  }

XXX - very dubious, could be written cleaner, probably doesn't work.

accept() will try to accept a new connection, but it won't wait to do so. It returns immediately, and when $server is marked ready for read according to select(), then we know a new connection has actually arrived. This integrates listening for new connections into the select-poll service loop.

This code is based on code in perldoc perlipc.

See Also

JournalingPattern

Problem: Slow updates and corrupt files.

Solution: Don't change when you can append updated information, and never leave data in an indeterminate state.

  package Xfor;

  sub new {

    my $pack = shift;

    my $filecache;       # holds all of the name->value pairs for each item in each file
    my $buffered;        # same format: data to write to file yet
    my $self;            # lets the closures below call one another

    $self = bless {

      # open a flatfile database. create it if needed.

      open => sub {
        my $fn = $_[0];
        unless(-f $fn) {
          open my $f, '>>', $fn or return 0;
          close $f;
        }
        $self->{openorfail}->($fn);
      },

      # open a flatfile database. fail if we are unable to open an existing file.

      openorfail => sub {
        my $file = shift;       # which file the data is in
        open my $f, '<', $file or die $!;
        $filecache->{$file} ||= {};
        my $k; my $v;
        while(<$f>) {
            chomp;
            my %thingy = split /\||\=/, 'key='.$_;
            while(($k, $v) = each %thingy) {
              $filecache->{$file}->{$thingy{'key'}}->{$k} = $v;
            }
        }
        close $f;
        return 1;
      },

      # fetch a value for a given key

      get => sub {
        my $file = shift;   #    which file the data is in
        my $thingy = shift; #    which record in the file - row's primary key
        my $xyzzy = shift;  #    which column in that record
        $self->{open}->($file) unless(exists $filecache->{$file});
        return $filecache->{$file}->{$thingy}->{$xyzzy};
      },

      keys => sub {
        my $rec = $filecache;
        while(@_) {
          $rec = $rec->{$_[0]}; shift;
        }
        if(wantarray) {
          keys %{$rec};
        } else {
          $rec;
        }
      },

      set => sub {
        my $file = shift;   #    which file the data is in
        my $thingy = shift; #    which record in the file - row's primary key
        my $x = shift;      #    which column in that record
        my $val = shift;    #    new value to store there
        $filecache->{$file}->{$thingy}->{$x} = $val;
        $buffered->{$file}->{$thingy}->{$x} = $val;
        1;
      },

      # append buffered changes to the end of the file

      close => sub {
        my $file = shift;       # which file the data is in
        my $thingy;             # which record in the file - row's primary key
        my $x;                  # which column in that record
        my $line;               # one line of output to the file
        open my $f, '>>', $file or die "$! file: $file";
        foreach $thingy (keys %{$buffered->{$file}}) {
          $line = $thingy;
          foreach $x (keys %{$buffered->{$file}->{$thingy}}) {
            $line .= '|' . $x . '=' . $buffered->{$file}->{$thingy}->{$x};
          }
          print $f $line, "\n";
        }
        delete $buffered->{$file};
        close $f;
      },

      # rewrite the file with only the latest data, then move it into place

      recreate => sub {
        my $file = shift;       # which file the data is in
        my $thingy;             # which record in the file - row's primary key
        my $x;                  # which column in that record
        my $line;               # one line of output to the file
        open my $f, '>', "$file.$$" or die "$! file: $file.$$";
        foreach $thingy (keys %{$filecache->{$file}}) {
          $line = $thingy;
          foreach $x (keys %{$filecache->{$file}->{$thingy}}) {
            $line .= '|' . $x . '=' . $filecache->{$file}->{$thingy}->{$x};
          }
          print $f $line, "\n";
        }
        close $f;
        rename "$file.$$", $file or die "$! on rename $file.$$ to $file";
      },

    } , $pack;

    return $self;

  }

  # route $obj->name(args) to the closure stored under that name, so that
  # ordinary method call syntax works on this hash of subroutines

  our $AUTOLOAD;

  sub AUTOLOAD {
    my $self = shift;
    (my $name = $AUTOLOAD) =~ s/.*:://;
    return if $name eq 'DESTROY';
    my $code = $self->{$name} or die "no such method: $name";
    $code->(@_);
  }

  1;

To use, do something like:

  use Xfor;
  my $hash = new Xfor;
  $hash->open('carparts.nvp');

  # read:

  $hash->get('carparts.nvp', 'xj-11', 'muffler');   # which muffler does the xj-11 use?

  # write:

  $hash->set('carparts.nvp', 'xj-11', 'muffler', 'c3p0');

  # then later:

  $hash->close('carparts.nvp');

  # or... 

  $hash->recreate('carparts.nvp');

Xfor.pm reads files from beginning to end, and goes with the last value discovered. This lets us write by kind-of journaling: we can just tack updated information on to the end. We can also regenerate the file with only the latest data, upon request. Since we read in all data, we're none too speedy. Reading is as slow as Storable or the like, but writing is much faster.

Data is written to the end of the file when the close() method is called. There are no fixed record lengths. We never go into the middle of a file and try to insert data. We don't move and regenerate the file unless explicitly asked to, and we only do that to keep it from getting too large.

A tied-hash interface could be provided for persistent journaled storage without the clumsy method interface. If a single value is needed, the entire file need not be read into memory - this case could be optimized. We use the vertical bar as a field delimiter - this is bound to cause problems unless we escape them in strings, in which case the escape character must also be escaped when it occurs normally. Taking a http://wiki.slowass.net/?BinaryClean approach is usually better than trying to escape things: include an explicit length and then use read() to read exactly that much data.
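
A sketch of the explicit-length idea using pack and unpack: each field is stored as a four byte length followed by exactly that many bytes, so vertical bars, equals signs and newlines in the data need no escaping. The encode_record() and decode_record() names are made up for this example and are not part of Xfor.

```perl
use strict;
use warnings;

# each field is written as a 4-byte big-endian length followed by its bytes
sub encode_record { return pack '(N/a*)*', @_ }
sub decode_record { return unpack '(N/a*)*', $_[0] }

my @fields  = ('xj-11', 'muffler', "c3p0|with=odd\nchars");
my $packed  = encode_record(@fields);
my @decoded = decode_record($packed);
```

Records encoded this way can't be read back a line at a time; the reader has to read() the length first and then that many bytes, as described above.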

"ExportingPattern" in ExportingPattern talks about creating a single default instance that can be used without explicitly naming an object, only using the correct methods.

This example should also take a few arguments to the constructor and pass them to each method so that a default file or default file and default record can be specified. It isn't useful as a module as it stands, but illustrates the trade off between read time and write time that simple journaling approaches offer.

[35]

See Also: perldoc perltie

See Also: http://wiki.slowass.net/?FakingAccessorsUsingTiedData, "AnonymousSubroutineObjects" in AnonymousSubroutineObjects, http://wiki.slowass.net/?StorablePattern

ApplicationGenerator

I'd rather write programs that write programs than write programs - http://wiki.slowass.net/?WikiWiki:BradAppleton

http://wiki.slowass.net/?OnceAndOnlyOnce says we should do something in only one place, in accordance with keeping our secret bits hidden away and not duplicated all over the place like a scandal on the tabloid front page. A Perl module would be expected to export these values and routines when used. A Java package would be expected to give an object reference that can be prodded with method calls and examination of package global and object instance fields. A Bourne shell script would spit out another shell script which it would then execute. That's what I want to talk about.

A lot of language rhetoric tells us not to worry about the size of our applications. We should load all of the modules we need rather than ever thinking of copying and modifying a program or module. Like anything that says "always" or "never", this is dangerous. Rather than writing a clean implementation that loads a bazillion modules, we'll sheepishly dumb down the specs, and set our expectations of the application low. We'll hardcode things for fear of creating modules. In short, we'll deal with the module explosion problem by developing a neurotic aversion to creating more of the bastards. Each new candidate for modulehood seems to pale in comparison to the last 3 or 4 hundred modules that were endorsed.

The heart of the problem is with diverging applications, the kind we make when we copy one program to reuse it for another client, for instance. Conventional wisdom says to copy the entire thing and add on to it, in essence, without removing anything. There are two cross sections we're trying to cut the same application into at once: the cross section of the functionality we need for this client, and the cross section of how functionality is organized within the application.

Organizing the entire application by which logic may or may not be needed for future projects assumes knowledge we don't have, and it completely neglects organizing the objects by their relationship with each other - our primary reason for using objects.

Organizing the entire application by its functional structure includes an undue amount of dependence between building blocks in an environment where the very purpose of the application can go two or more ways at once, as different clients have an application customized. In fact, it is very rare that anyone even attempts to have diverging versions track each other. http://wiki.slowass.net/?NetBSD, http://wiki.slowass.net/?OpenBSD and http://wiki.slowass.net/?FreeBSD selectively adapting code from each other's projects is a good example. Even so, each BSD has gone in a different direction, introducing a very real element of manual labor in adapting code, despite the historic common origins.

Another example is GNU autoconf. If you've ever installed Unix software from source code (Perl, perhaps?), you've probably downloaded a tarball, typed tar -xzvf foo.tar.gz, chdired into the directory, typed ./configure, then make. configure is a shell script, generated by GNU autoconf. Every time anyone needs to test for a new feature which may or may not be present in POSIX like operating environments as part of the build process the test for that feature is added to GNU autoconf. Okay, not every time, but this is the secret to GNU autoconf's success. Every application running the same configure script, which tested for everything, would be unworkable. It would take hours to run on a fast computer, and do all sorts of work not needed. Configuring certain tests off would help, but you're still forcing poor bash* to read a several thousand line (several hundred thousand line long?) script when only a few may be needed. With open source software, programmers may be tolerant of lots of unrelated code or hooks kicking around. In the real world, clients don't want to know about each other. They're just happier that way*.

Writing an application to write applications lets you put everything where it belongs, score high on http://wiki.slowass.net/?OnceAndOnlyOnce tests, structure your code according to natural, logical criteria, and not bog clients down with a beast that is the sum of the size of all of its copies. This idea is nothing new. The concept of returning code is a cornerstone of the Lambda programming style, and is also known as http://wiki.slowass.net/?FirstClassData. We could, and otherwise should, use http://wiki.slowass.net/?LambdaClosures, but we're trying to exclude the code from ever going out the door. The idea is similar to generating a string of code and using eval on it, but once again, we're trying to keep the code from ever disgracing their hard drive.
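
A toy sketch of the idea, with everything in it invented for illustration: the features live in one place as source text, and generate() emits a program holding only the ones a given client is supposed to get. A real generator would more likely pull the pieces out of module files than out of a hash.

```perl
use strict;
use warnings;

# library of features, kept in one place as source text
my %feature = (
    greet   => 'sub greet   { return "hello, $_[0]" }',
    shout   => 'sub shout   { return uc $_[0] }',
    invoice => 'sub invoice { return "invoice for $_[0]" }',
);

# emit a program containing only the features this client is getting
sub generate {
    my ($client, @wanted) = @_;
    my @missing = grep { !exists $feature{$_} } @wanted;
    die "unknown feature(s): @missing\n" if @missing;
    return join "\n",
        "# generated for $client - do not edit",
        'use strict; use warnings;',
        (map { $feature{$_} } @wanted),
        '1;', '';
}

my $source = generate('acme', qw(greet shout));
```

The string in $source would be written to a file and shipped; the invoice code never leaves the building.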

Using http://wiki.slowass.net/?AutoSplit and http://wiki.slowass.net/?AutoLoader breaks your module up into individual functions (methods) that are loaded on demand from strings stored past the __END__ of your program.

XXX quick hack to look at an http://wiki.slowass.net/?AutoSplit module and spit out select sections either as http://wiki.slowass.net/?AutoLoader ready or as regular Perl code.

http://wiki.slowass.net/?CategoryToDo - round this out, proof it, give it a few examples.

See Also

WebAuthentication

Do you want to send them an email with a generated password in it to validate their email address?

Suited for a small number of users, each of which has the same permissions. User creation and maintenance involves modifying the file directly.

There are lots of formulas, but the winning one is: issue a cookie with an authorization token. Store the token in the database along with an expiration time separate from the cookie. The token should be randomly generated and completely separate from the password, but handed out when the password is validated. This is the best case; if your porn addicted friend comes over and uses your computer, and steals your cookies.txt file when you aren't looking, cookies generated this way can't be used to discover the username or password used. The password change form could be used as a loophole though: if the token is still valid and the password change script doesn't explicitly double check the old password before letting you change it, a new password could be put in place for the account without your friend knowing the old one. It is best to always check that the user knows the old password before allowing them to change it to avoid this problem.
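
A minimal sketch of the token scheme, with a plain hash standing in for the database table. The issue_token() and user_for_token() names are hypothetical, and the way the token is derived here is an assumption for illustration only: rand() is not a cryptographically strong source, so a real site should draw its randomness from something better.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

my %session;   # token => { user, expires }; stand-in for a database table

# hand out a token once the password has been checked elsewhere
sub issue_token {
    my ($user, $lifetime) = @_;
    # assumption: rand() is NOT strong enough for production use
    my $token = sha256_hex(join '|', time, $$, rand, rand);
    $session{$token} = { user => $user, expires => time + $lifetime };
    return $token;
}

# returns the user name for a live token, or nothing at all
sub user_for_token {
    my ($token) = @_;
    my $s = $session{ defined $token ? $token : '' } or return;
    if ($s->{expires} <= time) {
        delete $session{$token};       # expired on the server side
        return;
    }
    return $s->{user};
}

my $token = issue_token('scott', 18 * 60 * 60);
```

The cookie carries only $token. Neither the user name nor the password can be recovered from it, and the server decides when it stops being valid.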

Our example here doesn't do any of this. It merely hands out cookies that contain the literal username and password. Our passwords aren't stored encrypted in the database. See http://wiki.slowass.net/?OneWayHash for an example of that. The examples are at the bottom of this section.

Sometimes users don't have cookies turned on. In this case, you've got two options: tracking them by IP, or including the session ID in all forms and links. Tracking users by IP is error prone, since an entire company's traffic is often filtered through a firewall that uses network address translation to present all of the internal computers' traffic as coming from one IP address. Inexpensive home "modem sharing" devices do exactly the same thing. Munging links requires that the session ID be constantly passed back to the scripts at every link or form:

  # go out of our way to include sid=$sid:

  print qq{<a href="otherprog.cgi?foo=bar&color=red&sid=$sid">Go To Otherprog</a>};

  print qq{
    <form action="anotherprog.cgi" method="post"> 
      <input type="hidden" name="sid" value="$sid">
      Enter answer: <input type="text" name="answer"><br>
      <input type="submit">
    </form>
  };

Forgetting to do this in even one link or form causes the site to forget any and all information about a user as soon as they click it. Additionally, since the sessionid is part of the HTML, it lives on in the browser cache. For this reason, session id tokens should be expired after a period of time by the server. This means having the server simply record the date that it issued a session id number and refusing to honor it after a period of time has elapsed, forcing the user to re-login.

One dirty little trick that a programmer friend of mine (okay, it was me) used once (okay, several times) on mod_perl sites was having the handler parse .html files with embedded perl, and munge all of the links - from both the .html and the perl output:

  $oOo =~ s/<(a|frame)([^>]*) (src|href)=(['"]?)(?!javascript)([^'"]+\.htm)(l)?(\?)?([^'">]*)?\4(?=\s|>)/<$1$2 $3="$5$6\?$8\&sid=$sid"/ig;
    
  # $1: 'a' or 'frame'
  # $2: any random name=value pairs (exa 'name="mainFrame"')
  # $3: 'src' or 'href'
  # $4: any begin quoting character, be it ' or "
  # $5: whatever.htm
  # $6: optional 'l'
  # $7: optional '?' (discarded)
  # $8: optional cgi get string
  # $9: 0-width lookahead assertion: > or space isn't matched but is looked for

You probably want to plan from the beginning to have a bunch of small .cgi scripts instead of one huge monolithic one... so you'll want to make a sort of "validateuser.pm" file and "use validateuser;" at the top of each .cgi.

  # Sample validateuser.pm:
  
  use CGI;
  use CGI::Carp qw/fatalsToBrowser/;
  use DBI;
  
  use lib "/home/scott/cgi-bin/DBD";
  
  BEGIN {
  
    $dbh = DBI->connect("DBI:Pg:dbname=sexcantwait;host=localhost;port=5432", 'scott', 'pass')
      or die $DBI::errstr;
  }
  
  use TransientBaby::Postgres;
  use TransientBaby;
  
  createquery('validateuser', qq{
    select   UserID as userid
    from     Users
    where    Name = [:username:]
    and      Pass = [:userpass:]
  });
  
  sub validated {
    $userid = -1;
    my $sid=CGI::cookie(-name=>"sid");
    return 0 unless $sid;
    ($username, $userpass) = split /,/, $sid;
    validateuser();
    return $userid == -1 ? 0 : 1;
  }
  
  sub is_guest {
    return $username =~ /^guest/;
  }

  sub offer_login {
    print qq{
      Sorry, you aren't logged in. Please enter your name and password:<br><br>
      <form action="login.cgi" method="post">
        <input type="hidden" name="action" value="login">
        User name: <input type="text" name="username"><br>
        Password: <input type="password" name="userpass"><br>
        Are you a new user? <input type="checkbox" name="newuser"><br>
        <input type="submit" value="Log in"><br>
      </form>
    };
    exit 0;
  }
  
  1;

Instead of declaring a package and using Exporter, we're merely continuing to operate in the namespace of the module that invoked us. The methods we define - validated(), validateuser(), offer_login() and is_guest() show up in their package, ready for use. As a side effect, we're using CGI.pm and DBI.pm on behalf of our caller, letting us list all of the modules we want in only one place, rather than in every .cgi script. This module could be used with:

  print qq{Content-type: text/html\n\n}; 
  use validateuser;
  validated() or offer_login();

  # rest of the script here, for users only

offer_login() never returns once we call it. It handles exiting the script for us.

  #!/usr/bin/perl

  # example login/create user script that uses validateuser.pm.
  # this should be named login.cgi to match the form in validateuser.pm, unless of course
  # that form's action is changed.
  
  use validateuser;
  
  createquery('userexists', qq{
    select count(*) as num
    from   Users
    where  Users.Name = [:name:]
  });
  
  createquery('createuser', qq{
    insert into Users
    (Name, Pass, CreationIP)
    values
    ([:name:], [:pass:], [:creationip:])
  });
  
  my $action = CGI::param('action');
  my $newuser = CGI::param('newuser');
  
  if(!$action) {

    offer_login();

  } elsif($action eq 'login' and !$newuser) {
  
    $username = CGI::param("username");
    $userpass = CGI::param("userpass");
  
    validateuser();
  
    if($userid != -1) {
  
      my $cookie=CGI::cookie(
        -name=>'sid', -expires=>'+18h', -value=>qq{$username,$userpass},
        -path=>'/', -domain=>'.sexcantwait.com'
      );
      print CGI::header(-type=>'text/html', -cookie=>$cookie);
  
      print qq{Login successful.\n};

    } else {
      
      sleep 1; # frustrate brute-force password guessing attacks
      
      print qq{Content-type: text/html\n\n};
      
      print qq{Login failed! Please try again.<br>\n};

      offer_login();
      
    } 
    
  } elsif($newuser and $action eq 'login') {
    
    local $name = CGI::param("username");
    local $pass = CGI::param("userpass");
    
    userexists(); if($num) {
      print qq{User already exists. Please try again.<br>\n};
      offer_login();
    }
  
    local $creationip = $ENV{REMOTE_ADDR};
  
    createuser();
    validateuser(); # sets $userid
  
    print qq{Creation successful! Click on "account" above to review your account.<br>\n};
  
  }

These examples make heavy use of my http://wiki.slowass.net/?TransientBaby.pm module. That module creates recursive routines that communicate using global variables - ick. I need to change that, and then this example. Then I'll put that code up. XXX.

Back to "PerlDesignPatterns" in PerlDesignPatterns.

$Id: WebAuthentication,v 1.9 2003/02/23 19:07:42 httpd Exp $

FileUpload

Common application feature, for CGI applications.

Users select files, using a form element in their web browser, and when they submit, that file is uploaded to the server with the rest of the form data.

  <gogamoga> well, i`ll ask: how do i fetch attached file from the query?
  <scrottie> ask to ask?
  <Perl-fu> Don't ask to ask. Don't ask if anybody can help you with x. 
            Just ask!  Omit any irrelevant details. If nobody answers then we 
            don't know or are busy for a few minutes. Wait and don't bug us. 
            If you must ask again wait until new people have joined the channel.
  <scrottie> my $fh = CGI::upload($fn); my $buffer; while (read($fh,$buffer,length($buffer)) { }; 
  <scrottie> where $fn is the name of the CGI param. make sure the from has the right enctype. 
  <scrottie> i don't remember the enctype, but "perldoc CGI" will tell you
  <scrottie> unless the form uses that special enctype, file uploads won't be uploaded, rather mysteriously
  <gogamoga> THANK YOU SOOOOOOOOO MUCH
  <gogamoga> i got lost in cgi.pm reference :(
  <scrottie> heh, you're welcome. let me know if you get stuck.
  <scrottie> yeah, someone really needs to slim that down.
  <gogamoga> i use only jpg enctype so i wont even check it
  <gogamoga> just fetch the file and save it
  <scrottie> you don't understand.
  <scrottie> hang on. let me find it.
  <gogamoga> ok
  <scrottie> if your form doesn't say 
             <form method="post" enctype="multipart/form-data">, then 
             <input type="file"> tags wont work. they won't upload the file.
  <scrottie> reguardless of the type of the file, the file won't be uploaded.
  <scrottie> Netscape 2 introduced the ability to upload files, and in order to 
             support this feature, they had to introduce a
             new format for sending data to the server - the old 
             application/x-www-form-urlencoded one couldn't handle large
             blocks of arbitrary data
  <gogamoga> ah
  <gogamoga> damn, it wont upload it but it still takes ages as it uploads it :)
  <gogamoga> ah, sorry i am dumb
  <scrottie> no, we all have to work through the standard mistakes ;)
  <gogamoga> dreamweaver adds multipart/form-data by default
  <gogamoga> :)
  <scrottie> good. no one uses Netscape 1 anymore ;)

http://wiki.slowass.net/?NetPBM has an example of serving binary objects as images from a CGI script. This can easily be coupled with database BLOBs to store images in the database, and serve them as normal images from a CGI script.
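
The read loop quoted in the log above is abbreviated: as typed it asks read() for length($buffer) bytes, which is zero while the buffer is still empty. A minimal sketch of a chunked copy follows. save_upload() and the 8192 byte chunk size are made up for this illustration and are not part of CGI.pm; the handle is assumed to be whatever CGI::upload() returned.

```perl
use strict;
use warnings;

# Hypothetical helper: copy an upload filehandle to a file on disk in
# fixed-size chunks. Returns the number of bytes written.
sub save_upload {
    my ($fh, $dest) = @_;

    open my $out, '>', $dest or die "can't write $dest: $!";
    binmode $out;    # uploads are arbitrary binary data
    binmode $fh;

    my $total = 0;
    while (1) {
        my $got = read $fh, my $buffer, 8192;
        die "read failed: $!" unless defined $got;
        last if $got == 0;                       # end of the upload
        print {$out} $buffer or die "write failed: $!";
        $total += $got;
    }

    close $out or die "can't close $dest: $!";
    return $total;
}
```

In a CGI script the handle would come from something like CGI::upload('photo'), where 'photo' stands for the name of the file input on a form that declares enctype="multipart/form-data".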


WebScraping

"WebScraping" is extracting information from the Web. Picking out information from web pages and using it in an application is said to be scraping the data. The term usually refers to harvesting live data feeds or manipulating specific applications via the Web. Also known as http://wiki.slowass.net/?WebMining or http://wiki.slowass.net/?WebHarvesting, especially when one type of information is sought across the entire web.

Use LWP to fetch web pages using URLs. See example HTML parser in "RunAndReturnSuccessor" in RunAndReturnSuccessor.

  use TransientBaby::Forms;
  use TransientBaby;
  
  my $accessor;
  my %opts;
  
  my @table;
  my $tablerow = -1;   # so that the first <tr> creates row 0
  my $tablecol = -1;
  
  parse_html($document, sub {
  
    $accessor = shift;
    %opts = @_;
  
    if($opts{tag} eq 'tr') {
  
      # create a new, blank array entry on the end of @table
      $tablerow++; $table[$tablerow] = [];
      $tablecol = 0;
  
    } elsif($opts{tag} eq 'td') {
  
      # store the text following the <td> tag in $table[][]
      $table[$tablerow][$tablecol] = $accessor->('trailing');
      $tablecol++;
  
    }
  
  });

I've gone out of my way to avoid the nasty push @{$table[-1]} construct as I don't feel like looking at it right now. $tablerow and $tablecol could be avoided otherwise. This code watches for HTML table tags and uses those to build a two-dimensional array.

Data taken from a database and presented in HTML tables was normalized in the database, but is denormalized for display. When it is denormalized, data from several relational tables is presented as one table. In this case, there may be different views of the data, each driven by a different query or different query parameters. See "ObjectsAndRelationalDatabaseSystems" in ObjectsAndRelationalDatabaseSystems for more on normalization.

If we're putting the harvested data back into a database to report on, it suits us to reconstruct some structure to it.

  select table1.a, table2.b, table3.c
  from table1, table2, table3
  where table1.id = table2.id
  and   table2.param = table3.id
  order by table1.a, table2.b, table3.c

We can't recover the id or param fields from the output of this query, but we can generate our own.

Joining between three tables flattens the extracted data down to one. This sort of joining has a tell-tale pattern in its output, in that the columns appear to count. The first n columns come from the first table, the next so many from the second, and so on. With one column per table, called tablea, tableb and tablec from here on, the output looks like this:

  aaa
  aab
  aac
  aad
  aba
  aca
  ada
  baa
  bab
  (And so on...)

Add this clause to the if statement in the sub passed to parse_html() above, remembering to declare the introduced variables in the correct scope:

    } elsif($opts{tag} eq '/tr') {
  
      # the first row always inserts. after that, a change in one column
      # also starts new records for every column to its right.
      my $changed = !$tablerow;
  
      $changed ||= $table[$tablerow][0] ne $table[$tablerow-1][0];
      if($changed) {
        $dbh->execute("insert into tablea (a) values (?)", $table[$tablerow][0]);
        $table_a_id = $dbh->insert_id();
        # else $table_a_id will retain its value from the last pass
      }
  
      $changed ||= $table[$tablerow][1] ne $table[$tablerow-1][1];
      if($changed) {
        $dbh->execute("insert into tableb (b, id) values (?, ?)", $table[$tablerow][1], $table_a_id);
        $table_b_id = $dbh->insert_id();
        # else $table_b_id will retain its value from the last pass
      }
  
      $changed ||= $table[$tablerow][2] ne $table[$tablerow-1][2];
      if($changed) {
        $dbh->execute("insert into tablec (c, id) values (?, ?)", $table[$tablerow][2], $table_b_id);
        $table_c_id = $dbh->insert_id();
        # else $table_c_id will retain its value from the last pass
      }
  
    }

This code depends on $dbh being a properly initialized database connection. I'm using ->insert_id(), a http://wiki.slowass.net/?MySQL extension, for clarity. Unlike the previous code, this code is data-source specific. Only a human looking at the data can determine how best to break the single table up into normalized, relational tables. We're assuming three tables, each having one column aside from the id field. Assuming this counting pattern, we insert records into tablec most often, linking them to the most recently inserted tableb record. tableb is inserted into less frequently, and when it is, the record refers to the most recently inserted record in tablea. When a record is inserted into tablea, it isn't linked to any other records.
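
To try the counting idea without a database, here is a minimal sketch that splits flattened rows into three linked lists of records in memory. normalize_rows() and the a_id and b_id field names are assumptions made for this illustration. It also assumes that a change in one column starts new records for every column to its right.

```perl
use strict;
use warnings;

# Split flattened three-column rows back into three linked record lists.
# Ids are generated here, since the originals can't be recovered.
sub normalize_rows {
    my @rows = @_;
    my (@table_a, @table_b, @table_c);
    my $prev;

    for my $row (@rows) {
        # the first row always inserts into every table
        my $changed = !defined $prev;

        $changed ||= $row->[0] ne $prev->[0];
        push @table_a, { id => scalar(@table_a) + 1, a => $row->[0] }
            if $changed;

        $changed ||= $row->[1] ne $prev->[1];
        push @table_b, { id => scalar(@table_b) + 1, b => $row->[1],
                         a_id => $table_a[-1]{id} }
            if $changed;

        $changed ||= $row->[2] ne $prev->[2];
        push @table_c, { id => scalar(@table_c) + 1, c => $row->[2],
                         b_id => $table_b[-1]{id} }
            if $changed;

        $prev = $row;
    }

    return (\@table_a, \@table_b, \@table_c);
}

my ($ta, $tb, $tc) = normalize_rows(
    [qw(a a a)], [qw(a a b)], [qw(a b a)], [qw(b a a)],
);
```

Each record in the second and third lists carries the id of the most recently created parent record, which is the same linkage the insert statements above build.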

XXX Todo:


ReadingAFile

Problem: Perl gives so many ways to read a file, so many of them bad.

Solution: Know the bad ones.

An Old Idiom in Poor Style

  {
    local $/ = undef;
    open FH, "<$file";
    $data = <FH>;
    close FH;
  }

Pros: Everyone seems to know this one. Reads in entire file in one gulp without an array intermediary. Cons: $data cannot be declared with my in the statement that reads the file, because the block we create to localize the record separator would end its scope. Ugly.
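
One way around the scoping complaint, offered as a sketch rather than as part of the original idiom, is to make the block a do { } expression so its value can be assigned straight to a lexical. slurp() is a hypothetical name.

```perl
use strict;
use warnings;

# Read a whole file into a lexical scalar in one gulp.
sub slurp {
    my ($file) = @_;
    open my $fh, '<', $file or die "can't read $file: $!";
    my $data = do { local $/; <$fh> };   # $/ is undef only inside the do block
    close $fh;
    return $data;
}
```

The record separator is restored as soon as the do block ends, so code that runs afterwards still reads line by line.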

A Short and Sweet Idiom

  @ARGV = ($file);
  my $data = join '', <>;

Pros: Short. Sweet. Cons: Clobbers @ARGV, poor error handling, inefficient for large files.

Shell-Holdout Idiom

  my $data = `cat $file`;

Pros: Very short. Makes sense to sh programmers. Cons: Security problem - shell commands may be buried in filenames. Creates an additional process - poor performance for files small and large. No error handling. Is not portable.

Read/Sysread Idiom

  open my $fh, '<', $file or die $!;
  read $fh, my $data, -s $fh or die $!;
  close $fh;

Pros: Good error handling. Reasonably short. Efficient. Doesn't misuse Perl-isms to save space. Uses lexical scoping for everything. Cons: None.

Mmap Idiom

  use Sys::Mmap;
  new Mmap my $data, -s $file, $file or die $!;

Pros: Very fast random access for large files as sectors of the file aren't read into memory until actually referenced. Changes to the variable propagate back to the file making read/write, well, cool. Cons: Requires use of an external module such as Sys::Mmap (http://www.cpan.org/modules/by-module/Sys/ Mmap), file cannot easily be grown. Difficult for non-Unix-C programmers to understand.

http://wiki.slowass.net/?CategoryNovice

ConfigFile

Problem: Reading configuration data from a file that users can edit and have written back to disc. Using require to read config files is handy, but many people feel they've outgrown using it, so they write elaborate modules to handle configuration.

Solution: Hot-rod require with advanced features to the degree it makes sense before resorting to complex or do-it-yourself replacements.

  require 'config.pl';

We've all seen it a million times. It's as old as Perl itself. You make a little Perl program that does nothing but assign values to variables. Users can "easily" edit the file to change the values without having to wade through your source code. It is extremely easy to read configuration files of this format, and you can store complex datastructures in there, along with comments.

Configuration is one of those sore spots whose limits are continuously pushed by users. Most Perl programmers give up their old config.pl when requirements specify a spiffy Web or Tk interface for users to change settings. No more!

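  # config.pl:
  
  $config = {
    widgets=>'max',
    gronkulator=>'on',
    magic=>'more'
  };
  
  # configTest.pl:
  
  use Data::Dumper;
  require './config.pl';   # "./" because newer perls leave "." out of @INC
  
  $config->{gronkulator} = 'no, thanks';
  
  open my $conf, '>', 'config.pl' or die $!;
  # Dump() wants array references: the values, then the names to give them.
  # this writes "$config = { ... };" back out.
  print $conf Data::Dumper->Dump([$config], ['config']);
  close $conf;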

Data::Dumper (http://www.cpan.org/modules/by-module/Data/ Dumper) comes with Perl, and can even store entire objects. In fact, it can store networks of objects.
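
The round trip can be wrapped up in two small routines. This is a sketch under assumptions: load_config() and save_config() are hypothetical names, and the config file is taken to assign to the global $main::config as in the example above. do FILE is used in place of require so the file can be re-read after it changes.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempdir);

# Run the config file and hand back whatever it stored in $main::config.
sub load_config {
    my ($path) = @_;
    # do FILE re-reads the file on every call; require would load it only once
    defined(do $path) or die "can't load $path: " . ($@ || $!);
    return $main::config;
}

# Write the structure back out as Perl source: "$config = { ... };"
sub save_config {
    my ($path, $data) = @_;
    open my $fh, '>', $path or die "can't write $path: $!";
    print {$fh} Data::Dumper->Dump([$data], ['config']);
    close $fh or die "can't close $path: $!";
}

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/config.pl";

save_config($path, { widgets => 'max', gronkulator => 'on' });
my $settings = load_config($path);
$settings->{gronkulator} = 'no, thanks';
save_config($path, $settings);
```

A Web or Tk front end would call load_config(), let the user change values, and call save_config(); the file stays editable by hand in between.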

Security may be a concern. If you don't want Perl in configuration files to gain the privileges of your program, use the Safe module or http://wiki.slowass.net/?UseOps. If the program is running as a http://wiki.slowass.net/?DaemonProcess as the superuser, use http://wiki.slowass.net/?UseOps or the Safe module. If the program is setuid and the people running it don't have access to edit it, use the Safe module or http://wiki.slowass.net/?UseOps.

Finding the Config or Data Directory

Something that is reasonably portable between Unix and Win is to look for an environment variable telling you where to go for the data. msconfig.exe lets you set startup environment variables and a lot of unix programs (cvs, postgres, etc) use environment variables to find their data if it isn't passed on the command line. Polluting the environment in Unix is considered bad form by many, and dropping something in /etc isn't portable, so go fish.

Active Config

Closures are useful for doing config options that have behavior:

  $dumping = "xterm -display $display";

You could (if you wanted) make that a closure. That would let you use the multiple arg version of system(), which is good security practice, and the closure would bind to my variables, so if the config changes at run time, they change there too.

  $dumping = sub { system 'xterm', '-display', $display; };
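
The binding can be seen without launching anything. In this sketch the closure returns the argument list instead of running it; $display and the xterm arguments are illustrative values only.

```perl
use strict;
use warnings;

my $display = ':0';

# the closure captures the variable $display itself, not its current value
my $dumping = sub { return ('xterm', '-display', $display) };

$display = 'otherhost:0';        # the config changes at run time...
my @command = $dumping->();      # ...and the closure sees the new value

# system @command;   # list form of system(): no shell gets to see the arguments
```

Because the arguments travel as a list, nothing in $display is ever interpreted by a shell.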

XXX http://wiki.slowass.net/?CategoryToDo - dumping active config using B::Deparse

http://wiki.slowass.net/?CategoryNovice


ErrorHandling

Die Early, Die Often

Catch errors before you get far away, or unrelated code will appear to malfunction, as a horrible form of "ActionAtADistance" in ActionAtADistance. In the process of debugging, you're going to need to insert lots of tests anyway, so why not do it neatly from the beginning and integrate it into your program? When the program is in production is when error reporting is most needed, if users or logs are going to communicate the nature of the problem to you to be fixed. See "RunAndReturnSuccessor" in RunAndReturnSuccessor and http://wiki.slowass.net/?MementoPattern for a description of checkpointing an application to recover from otherwise fatal errors. eval { } is used for trapping errors - see "AssertPattern" in AssertPattern.

  open my $f, 'file.txt' or die $!;

or die should literally dot your code. That's how you communicate to Perl and your readership that it is not okay for the statement to silently fail. Most languages make such error generation the default; in Perl, you must request it. This is no excuse for allowing all errors to sneak by silently.

Should you not have the constitution to speckle your code with or die clauses, or you're a minimalist, striving for elegance, there is a solution:

  # from the Fatal.pm perldoc:
  
  use Fatal qw(open close);
  
  sub juggle { . . . }
  import Fatal 'juggle';

Fatal.pm will place wrappers around your functions or Perl built-in functions, changing their default behavior to throw an error. A module which does heavy file IO on a group of files need not check the return value of each and every open(), read(), write(), and close(). Only at key points - on method entry, entry into worker functions, etc - do you need to handle error conditions. This is a more reasonable request, one easily achieved. Should an error occur and be caught, the text of the error message will be in $@.

  use Fatal qw(open close flock seek);
  use Fcntl qw(:flock);
  
  sub update_data_file {
    my $this = shift;
    my $data = shift;
  
    eval {
  
      open my $f, '+<', $this->{filename};   # read/write without truncating
      flock $f, LOCK_EX;
      seek $f, $this->{record}, 0; 
      # print can't be wrapped by Fatal, so it gets its own check
      print $f $data or die "print failed: $!";
      close $f;
  
    };
  
    return 0 if $@;   # update failed
    return 1;         # success
  
  }

Alternatively, rather than using eval { } ourselves, following "AssertPattern" in AssertPattern, we could trust that someone at some point installed a __DIE__ handler. The most recently installed local handler gets to try to detangle the web.

  sub generate_report {
    my $this = shift;
    local $SIG{__DIE__} = sub { 
      print "Whoops, report generation failed. Tell your boss it was my fault. Reason: ", @_;
    };
    foreach my $i ($this->get_all_data()) {
      $data->update_data_file($i);
    }
  }
  
  sub checkpoint_app {
    my $this = shift;
    local $SIG{__DIE__} = sub { 
      print "Whoops, checkpoint failed. Correct problem and save your data. Reason: ", @_;
    };
    $data->update_data_file($this->get_data());
  }

Using locally scoped handlers this way allows you to provide context-sensitive recovery, or at least diagnostics, when errors are thrown. This is easy to do and all that is required to take full advantage of Fatal.pm.
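
The arrangement described in this section, Fatal wrappers doing the checking and a single eval at a key point doing the catching, can be tried in isolation. first_line() and the file name are hypothetical; the sketch assumes the Fatal.pm that ships with Perl.

```perl
use strict;
use warnings;
use Fatal qw(open close);

# Worker function: no "or die" anywhere, the wrappers throw for us.
sub first_line {
    my ($path) = @_;
    open my $fh, '<', $path;
    my $line = <$fh>;
    close $fh;
    return $line;
}

# Key point: one eval covers every wrapped call made beneath it.
my $line  = eval { first_line('/no/such/file.txt') };
my $error = $@;

print "could not read the file: $error" unless defined $line;
```

Copying $@ into a lexical right after the eval keeps the message safe from any later eval that would overwrite it.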

Fatal.pm was written by Lionel.Cons@cern.ch with prototype updates by Ilya Zakharevich ilya@math.ohio-state.edu.

Time-Outs

Use alarm() with eval():

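  RETRY:
  
  eval {
    # without a handler that dies, the alarm kills the program outright
    # and eval never gets to see it
    local $SIG{ALRM} = sub { die "timed out\n" };
    alarm 30; # send a $SIG{ALRM} after 30 seconds
    # do something that might time-out
    alarm 0;  # disable alarm
  };
  
  if($@) {
    # there was an error - error text is in $@ - do what you will - perhaps retry:
    goto RETRY;
  }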

select() provides an alternative for timeouts on I/O, and is especially safe when coupled with non-blocking I/O. See "SelectPollPattern" in SelectPollPattern.
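
The alarm() and eval() combination packages up neatly as a routine that takes the code to run. A minimal sketch follows; with_timeout() is a hypothetical name, and it assumes a platform where alarm() is implemented.

```perl
use strict;
use warnings;

# Run $code, giving up after $seconds.
# Returns 1 on completion and 0 on time-out; other errors are re-thrown.
sub with_timeout {
    my ($seconds, $code) = @_;

    # turn the signal into an exception that eval can catch
    local $SIG{ALRM} = sub { die "timeout\n" };

    my $ok = eval {
        alarm $seconds;
        $code->();
        alarm 0;
        1;
    };
    my $err = $@;
    alarm 0;    # make sure no alarm is left pending if $code died

    return 1 if $ok;
    return 0 if $err eq "timeout\n";
    die $err;
}

# a stand-in for slow work: sleep for five seconds using select()
my $finished = with_timeout(1, sub { select(undef, undef, undef, 5) });
```

The trailing newline in "timeout\n" stops die from appending file and line information, which keeps the comparison against the message exact.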

Throwing Objects

Here's another problem: Exceptions and die are the same thing in Perl, which sometimes surprises people. Someone wrote into perl5-porters recently about a library function that was going to run a subprocess. The fork() succeeded but the exec() failed, so the child process called die. That was usually the right thing to do. In this case, however, the library function had been called inside of an eval block, which trapped the child's die. The original process was still waiting for the child to complete, but the child was going ahead, thinking it was the parent!

Groundwork for rationalization has been laid here; recent versions of Perl let you throw any sort of object with die, not just a string. Using these objects you could propagate complex kinds of exceptions in your programs. But as far as I know these features are little-used. There are several modules that provide try-catch-cleanup syntax, but as far as I know they're also little-used. And there are no widely accepted guidelines for the behavior of modules.

- http://wiki.slowass.net/?MarkJasonDominus at http://www.perl.com/pub/a/1999/11/sins.html
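
For reference, here is a minimal sketch of throwing an object with die and sorting errors out by class after the eval. My::Error::NotFound, load() and the record name are invented for this illustration and don't refer to any published module.

```perl
use strict;
use warnings;

package My::Error::NotFound;    # hypothetical exception class

sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub throw   { my $class = shift; die $class->new(@_) }
sub message { $_[0]{message} }
sub path    { $_[0]{path} }

package main;

use Scalar::Util qw(blessed);

# hypothetical worker that fails by throwing an object
sub load {
    my ($path) = @_;
    My::Error::NotFound->throw(message => 'no such record', path => $path);
}

my $report;
my $ok  = eval { load('users/42'); 1 };
my $err = $@;

if (!$ok) {
    if (blessed($err) && $err->isa('My::Error::NotFound')) {
        # the object carries structured data, not just text
        $report = $err->message . ': ' . $err->path;
    } else {
        die $err;    # not ours - pass it along
    }
}
```

Checking blessed() first matters because $@ may still hold a plain string thrown by other code, and calling isa() on a string that isn't a class name would not tell us anything useful.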

[36]

http://wiki.slowass.net/?CategoryNovice, http://wiki.slowass.net/?CategoryIntermediate


ErrorReporting

Use die()

Avoid temptation to write a new death-handler and call it by name in place of die():

  # don't do this
  
  sub barf {
    print "something went wrong!\n", @_;
    exit 1;
  }
  
  # ...
  
  barf("number too large") if($number > $too_large);

die() has a useful default behavior that depends on no external modules, but can easily be overridden with a handler to do more complex cleanup, reporting, and so on. If you don't use die(), you can't easily localize which handler is used in a given scope.

Every Error, Great And Small

warn() provides a reasonable default for reporting potential errors. Programs run at the command line get warn() messages sent to stderr. CGI programs get warn() messages sent to the error log, under Apache and thttpd [37]. Using CGI::Carp (http://www.cpan.org/modules/by-module/CGI/ Carp), warnings are queued up for display in the event of a die(), thus making important debugging information available.

Even reasonable defaults aren't always what you want. Without changing your code [38], the behavior of warn() and die() can be changed:

  # send diagnostic output to the end of a log
  
  open my $debug, '>>', 'bouncemail.debug' or die $!;
  $SIG{__WARN__} = sub { print $debug join(" - ", @_); };
  $SIG{__DIE__} = sub { print $debug join(" - ", @_); exit 0; };

Some logic will want to handle its own errors - sometimes a fatal condition in one part of code doesn't really matter a hill of beans on the grand scale of the application. A command line print utility may want to die if the printer is off line [39] - a word processor probably does not want to exit with unsaved changes merely because the document couldn't be printed. So, do this:

  local $SIG{__DIE__} = sub {
    # yeah, whatever
  };
  
  # or...
  
  local $SIG{__DIE__} = 'IGNORE';

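...or, do the error processing of your choice. Perhaps set a lexically bound variable flag - see http://wiki.slowass.net/?LexicalsMakeSense. Keep in mind that a __DIE__ handler only gets to look at the error on its way past: when the handler returns, die() carries on unwinding, so the call that might fail still needs an eval { } around it if the program is to keep running.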
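
A sketch of the word processor case, with a local handler for reporting and an eval doing the actual rescue. print_document(), try_print() and the @log array are hypothetical stand-ins.

```perl
use strict;
use warnings;

my @log;    # stand-in for a real log file

# hypothetical operation that fails
sub print_document { die "printer is off line\n" }

sub try_print {
    # the handler observes the error...
    local $SIG{__DIE__} = sub { push @log, "print failed: $_[0]" };
    # ...but it is the eval that keeps the program alive
    return eval { print_document(); 1 } ? 1 : 0;
}

my $printed = try_print();
# the program is still running here, unsaved changes intact
```

Once try_print() returns, the handler that was in effect before the call is back in place, so the special treatment stays limited to printing.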

Report Everything

In the event of a fatal error, display as much information as possible about the current execution context.

  # intercept death long enough to scream bloody murder
  
  $version = '$Id: ErrorReporting,v 1.20 2003/05/15 09:58:41 phaedrus Exp $'; # CVS will populate this if you use CVS
  
  $SIG{__DIE__} = sub {
  
    local $SIG{__DIE__}; # next die() will be fatal
  
    my $err = '';
  
    $err .= "$0 version $version\n\n";
  
    # stack backtrace
  
    $err .= join "\n", @_, join '', map { 
      (caller($_))[0] ? sprintf("%s at line %d\n", (caller($_))[1,2]) : ''; 
    } (1..30);
  
    $err.="\n";
  
    # report on the state of global variables. this includes 'local' variables 
    # and 'our' variables in scope. see PadWalker for an example of inspecting
    # lexical 'my' variables as well.
  
    foreach my $name (sort keys %{__PACKAGE__.'::'}) {
      my $value =  ${__PACKAGE__.'::'.$name};
      if($value and $name ne 'pass' and $name =~ m/^[a-z][a-z0-9_]+$/) {
        $err .= $name . ' ' . $value . "\n" 
      }
    }
  
    $err .= "\n";
  
    foreach my $name (sort keys %ENV) {
      $err .= $name . ' ' . $ENV{$name} . "\n";
    }
  
    $err .= "\n";
  
    # open the module/program that triggered the error, find the line
    # that caused the error, and report that.
  
    if(open my $f, (caller(1))[1]) {
  
      my $deathlinenum = (caller(1))[2];
      my $deathline;
  
      # read forward until $. reaches the line that died
      while(defined($deathline = <$f>)) {
        last if $. == $deathlinenum;
      }
  
      $err .= "line $deathlinenum reads: $deathline\n";
  
      close $f;
  
    }
  
    # send an email off explaining the problem
    # in text mode, errors go to the screen rather than by email
  
    require Mail::Sendmail;   # require doesn't import, so call sendmail() by its full name
    Mail::Sendmail::sendmail(To=>$errorsto, From=>$mailfrom, Subject=>"error", Message=>$err) unless($test);
  
    print "<pre>\n", CGI::escapeHTML($err), "</pre>\n" if($test);
  
    # regardless, give the user a way out. in this case, we display what was in their
    # shopping cart and give them a manual order form that just sends an email, and we
    # call them back for payment info.
  
    $|=1;
    # print "Those responsible for sacking the people that have just been sacked, have just been sacked.<br><br>\n";
    print "A software error has occurred. Your order cannot be processed automatically. ";
    print "At the time of the error, your cart contained:<br><br>\n";
      
    open my $cart, $cartdir.$sid;
    print "$_<br>\n" while(<$cart>);
    print qq{
      <script language="javascript">
        window.open("$errororderpage");
      </script>
    };
    close $cart;
  
    # finally, give up 
  
    exit 0;
  
  };

A software error has occurred. Give the user an out. I wish I could remember what book this was from - the St. Thomas University library in St. Paul, Minnesota had it, but the author quoted a conversation that went something like...

I noticed a contingency in the code, so I went to the client and asked him how I should handle it. He said, "Oh, that won't happen, it doesn't matter". Dumbfounded, knowing full well that it might happen, I said, "Oh, so if the program reaches this point, it is okay to drop the database, delete all of the data, lock up, and stop responding without printing any diagnostic message?" The client reeled back aghast and exclaimed, "No! You can't do that!". I said, "Unless we put some code in here to handle this situation, that's exactly what might happen! Now, when the code reaches this point, how should we handle it?"

Popping up a form that asks for contact information rather than a credit card, and transmits it insecurely along with the contents of the cart, is our http://wiki.slowass.net/?FailOver solution. Perhaps their order wasn't complete - that's okay. At least the system knew it failed and did something reasonable.

See Also

http://wiki.slowass.net/?CategoryNovice, http://wiki.slowass.net/?CategoryIntermediate

ExtensibilityPattern

Problem: Supporting features, such as protocols, that don't yet exist. Solving general problems without concern for the specifics of details.

Solution:

Synopsis: Provide a framework for a certain kind of task.

Frameworks

A "framework" uses other modules. Normal modules have a fixed set of dependencies and are only extended through subclassing, as per "AboutInheritance" in AboutInheritance. A framework may consist of several parts that must be inherited to be used, much like several cases of "AbstractClass" in AbstractClass. It may also be passed references to other objects, as would a class that sets up a http://wiki.slowass.net/?ModelViewController. It may read names of classes from a "ConfigFile" in ConfigFile or from the user, as in http://wiki.slowass.net/?BeanPattern. Instead of code being used by other code, it will use other code on the fly. It is on top of the food chain instead of the bottom.

XXX examples of these cases as "extensibility".

Configuration Files as Extensions

A "ConfigFile" in ConfigFile may be enough to customize the module for reasonable needs. It may also specify modules by name to be created and employed in a framework.

  # the config.pl file defines @listeners to contain a list of class names
  # that should receive notices from an EventListener broadcaster, 
  # referenced by $broadcaster. 
  
  require './config.pl';
  
  foreach my $listener (@listeners) {
    # given a string, require wants a file name, so turn Some::Class into Some/Class.pm
    (my $file = "$listener.pm") =~ s{::}{/}g;
    require $file;
    my $list_inst = $listener->new();
    $broadcaster->add_listener($list_inst);
  }

See http://wiki.slowass.net/?EventListener for the broadcaster/listener idiom. This avoids building the names of listener modules into the application. An independent author could write a plug-in to this application: she would need only have the user modify config.pl to include mention of the plug-in. Of course, modification of config.pl could be automated. The install program for the plug-in would need to ask the user where the config.pl is, and use the "ConfigFile" in ConfigFile idiom to update it.
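
A self-contained sketch of the same loop follows. Broadcaster and Listener::Counter are stand-ins defined inline so the example runs on its own; their method names are assumptions, and a real application would get @listeners from config.pl and load each class from its own file.

```perl
use strict;
use warnings;

package Broadcaster;            # hypothetical stand-in for the real broadcaster

sub new          { return bless { listeners => [] }, shift }
sub add_listener { my ($this, $l) = @_; push @{ $this->{listeners} }, $l }
sub broadcast    { my ($this, $event) = @_; $_->notify($event) for @{ $this->{listeners} } }

package Listener::Counter;      # hypothetical plug-in: counts the events it sees

sub new    { return bless { seen => 0 }, shift }
sub notify { $_[0]{seen}++ }

package main;

our @listeners = ('Listener::Counter');   # normally set by config.pl

my $broadcaster = Broadcaster->new();

foreach my $listener (@listeners) {
    # load the class from disk unless it is already defined
    unless ($listener->can('new')) {
        (my $file = "$listener.pm") =~ s{::}{/}g;
        require $file;
    }
    $broadcaster->add_listener($listener->new());
}

$broadcaster->broadcast('startup');
```

Nothing in the loop names a plug-in class; adding one is a matter of adding its name to the list.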

Extending Through Scripting

A major complaint against GUIs is that they make it difficult to script repetitive tasks. Command line interfaces are difficult for most humans to work with. Neither gives rich access to the API of a program. A well designed program is a few lines of Perl in the main program that use a number of modules - see http://wiki.slowass.net/?CreatingCPANModules. This makes it easier to reuse the program logic in other programs. Complex programs that build upon existing parts benefit from this, without question. How about the other case - a small script meant to automate some task? This requires that the script have knowledge about the structure of the application - it must know how to assemble the modules, initialize them, and so on. It is forced to work with aspects of the API that it almost certainly isn't concerned with. It must itself be the framework. This is a kind of "AbstractionInversion" in AbstractionInversion - where something abstract is grafted onto something concrete, or something simple is grafted onto the top of something complex. It would make more sense in this case for the application to implement a sort of "VisitorPattern" in VisitorPattern, and allow itself to be passed whole, already assembled, to another spat of code that knows how to perform specific operations on it. This lends itself to the sequential nature of the script: the user defined extension could be a series of simple calls:

  package UserExtention1;
  
  # we are expected to have a "run_macro" method
  
  sub run_macro {
    my $this = shift;
    my $app = shift;
  
    $app->place_cursor(0, 0);
    $app->set_color('white');
    $app->draw_circle(radius=>1);
    $app->set_color('red');
    $app->draw_circle(radius=>2);
    # and so on... make a little bull's eye
  
    return 1;
  }

The main application could prompt the user for a module to load, or load all of the modules in a plug-ins directory, then make them available as menu items in an "extensions" menu. When one of the extensions is selected from the menu, a reference to the application - or a "FacadePattern" in FacadePattern providing an interface to it - is passed to the run_macro() method of an instance of that package.

Many applications will have users that want to do simple automation without being bothered to learn even a little Perl (horrible but true!). Some applications (like Mathematica, for instance) will provide functionality that doesn't cleanly map to Perl. In this case, you'd want to be able to parse expressions and manipulate them. In these cases, a http://wiki.slowass.net/?LittleLanguage may be just the thing.

XXX - move this to http://wiki.slowass.net/?LittleLanguage.

A http://wiki.slowass.net/?LittleLanguage is a small programming language created specifically for the task at hand. It can be similar to other languages. Having something clean and simple specifically targeted at the problem can be a better solution than throwing an overpowered language at it. Just by neglecting unneeded features, user confusion is reduced.

    place_cursor(0, 0)
    set_color(white)
    draw_circle(radius=1)
    set_color(red)
    draw_circle(radius=2)

A few options exist: we can compile this directly to Perl bytecode using B::Generate (suitable for integrating legacy languages without performance loss), or we can munge this into Perl and eval it. Let's turn it into Perl.

  # read in the user's program
  my $input = join '', <STDIN>;
  
  # 0 if we're expecting a function name, 1 if we're expecting an argument,
  # 2 if we're expecting a comma to separate arguments
  my $state = 0;
  
  # perl code we're creating
  my $perl = '
    package UserExtention1;
  
    sub run_macro {
      my $this = shift;
      my $app = shift; 
  ';
  
  while(1) {
    # function call name
    if($state == 0 && $input =~ m{\G\s*(\w+)\s*\(}cgs) {
      $perl .= '  $app->' . $1 . '(';
      $state = 1;
   
    # a=b style parameter
    } elsif($state == 1 && $input =~ m{\G\s*(\w+)\s*=\s*([\w0-9]+)}cgs) {
      $perl .= qq{$1=>'$2'};
      $state = 2;
  
    # simple parameter
    } elsif($state == 1 && $input =~ m{\G\s*([\w0-9]+)}cgs) {
      $perl .= qq{'$1'};
      $state = 2;
  
    # comma to separate parameters
    } elsif($state == 2 && $input =~ m{\G\s*,}cgs) {
      $perl .= ', ';
      $state = 1;
  
    # end of parameter list
    } elsif(($state == 1 || $state == 2) && $input =~ m{\G\s*\)}cgs) {
      $perl .= ");\n";
      $state = 0;
  
    # end of input or syntax error. we aren't inside a subroutine here,
    # so leave the loop with last and report errors with die.
    } else {
      last if $state == 0 && $input =~ m{\G\s*\z}cgs;
      die "operation name expected\n" if $state == 0;
      die "parameter expected\n" if $state == 1;
      die "comma or end of parameter list expected\n";
    }
  
  }
  
  $perl .= qq<
      return 1;
    }
  >;
  
  eval $perl; if($@) {
     # display diagnostic information to user
  }

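We're using the \G regex metacharacter that matches where the last global regex on that string left off. That lets us take off several small bites from the string rather than having to do it all in one big bite. The flags on the end of the regex are: g, which makes the match global so that Perl remembers where it left off; c, which keeps that position from being reset when a match fails, so that the next alternative can try from the same spot; and s, which lets . match newlines so the whole input is treated as a single string.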

Out of context, the string "xyzzy" could be either a parameter or the name of a method to call. The solution is simply to keep track of context: that is where $state comes in. Every time we find something, we update $state to indicate what class of thing would be valid if it came next. After we find a function name and an opening parenthesis, either a hash style parameter or a single, lone parameter, or else a close parenthesis would be valid. We aren't even looking for the start of another function [40]. After a parameter, we're looking for either the close parenthesis or another parameter.

Every time we match something, we append a Perl-ized version of exactly the same thing onto $perl. All of this is wrapped in a package and method declaration. Finally, $perl is evaluated. The result of evaluating should be to make this new package available to our code, ready to be called.
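
To watch the whole pipeline run end to end, here is a condensed sketch of the translator as a routine, together with a stand-in application that records the calls made on it. compile_macro() and Recorder are hypothetical names; the grammar is the same one handled above.

```perl
use strict;
use warnings;

# Translate the little language into the source of a Perl package.
sub compile_macro {
    my ($input, $package) = @_;
    my $perl  = "package $package;\nsub run_macro {\n  my (\$this, \$app) = \@_;\n";
    my $state = 0;   # 0: want a function name, 1: want an argument, 2: want "," or ")"

    while (1) {
        if    ($state == 0 && $input =~ m{\G\s*(\w+)\s*\(}cgs)        { $perl .= "  \$app->$1("; $state = 1 }
        elsif ($state == 1 && $input =~ m{\G\s*(\w+)\s*=\s*(\w+)}cgs) { $perl .= "$1=>'$2'";     $state = 2 }
        elsif ($state == 1 && $input =~ m{\G\s*(\w+)}cgs)             { $perl .= "'$1'";         $state = 2 }
        elsif ($state == 2 && $input =~ m{\G\s*,}cgs)                 { $perl .= ', ';           $state = 1 }
        elsif ($state != 0 && $input =~ m{\G\s*\)}cgs)                { $perl .= ");\n";         $state = 0 }
        elsif ($state == 0 && $input =~ m{\G\s*\z}cgs)                { last }
        else  { die "syntax error near offset " . (pos($input) || 0) . "\n" }
    }

    return $perl . "  return 1;\n}\n1;\n";
}

package Recorder;   # stands in for the application: logs every call made on it

our $AUTOLOAD;
sub new { return bless { calls => [] }, shift }
sub AUTOLOAD {
    my $this = shift;
    (my $name = $AUTOLOAD) =~ s/.*:://;
    return if $name eq 'DESTROY';
    push @{ $this->{calls} }, join ' ', $name, @_;
}

package main;

my $source = compile_macro("set_color(white)\ndraw_circle(radius=1)\n", 'UserExtention1');
eval $source or die "generated code failed: $@";

my $app = Recorder->new();
UserExtention1->run_macro($app);
print "$_\n" for @{ $app->{calls} };
```

Only word characters ever reach the generated source, and arguments are wrapped in single quotes, so the user's text cannot smuggle arbitrary Perl into the eval.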

XXX a B::Generate example! ... but in http://wiki.slowass.net/?LittleLanguages

Beans as Extensions

XXX a B::Generate example!

Hacks as Extensions

This is what happens when a base application, or shared code base, is customized in different directions for different clients. It means making heavy use of http://wiki.slowass.net/?TemplateMethods and http://wiki.slowass.net/?AbstractFactories, and localizing client-specific code into a module or tree of modules under a client-specific namespace rather than "where it belongs". See http://wiki.slowass.net/?HacksModule.


CutAndPasteProgramming

When programming, you take a generic algorithm and customize it for a task. Sometimes you have a copy of an implementation of that algorithm lying around that you can copy. OO tells us not to do that. Someone once said, "If you're going to make a mistake, make it in a big way". Keeping one sacred copy that must be correct could certainly accomplish that. However, it makes it possible to fix a problem once instead of having it spread around the code. Having code replicated is a huge commitment. You're banking that nothing is wrong with it, that your program will never change how the data it works on is represented, and that people like looking at endless permutations of a single piece of code.

Object Orientation proposes to eliminate this. The act of separating your program into objects creates a ripe new area for endless duplication: sequences of setting, querying, and passing data. A common situation:

  $money = $player->query_money();
  if($money + $payout <= $player->query_max_money()) {
    $player->set_money($money + $payout);
    $nickels_on_floor = 0;
  } else {
    $nickels_on_floor = $money + $payout - $player->query_max_money();
    $player->set_money($player->query_max_money());
  }

No matter which way we make the set_money() function work, we're doomed. If it enforces a ceiling, then we have to query again to see if the ceiling was enforced. If it doesn't enforce a ceiling, then we have to check each and every time we access the value and enforce it ourselves. The result is one or two of these sequences of logic will get strewn around the program. The problem is that the client needs something slightly more complex than the server is prepared to provide. We could, and perhaps should, make the object we're calling, $player, return an array, including success or failure, how much money actually changed hands, how much more they could carry. This would go with the approach of providing way more information than could ever be needed. This leads to bloated code and logic that we aren't sure whether or not is actually being used, leading to untested code going into production and becoming a time-bomb for the future, should anyone actually start using it. Less dramatically, we could modify the target object to return just one more piece of information when we realize we need it. This leads to a sort of feature envy, where the server is going out of its way to provide things in terms of a certain client's expectations, causing an API that is geared towards a particular client and incomprehensible to all else. Also, there is temptation to write something like:

  package Util;

Beware of Utility, Helper, Misc, etc packages. They collect orphan code. The pressure to move things out of them is very low, as they all seem to fit by virtue of not fitting anywhere else. They grow indefinitely in size because the class of things that don't seem to belong anywhere is very large. The effect snowballs as the growth of other objects is stymied while the "Utility" package booms.

Snafus like this cause the number of accessors to grow to accommodate all of the permutations of accessing the data. You'll often see a set_ function, a query_ or get_ function, and an add_ function, for each value we encapsulate.

  package Casino;
  use ImplicitThis; ImplicitThis::imply();

  sub pay_out {

    # this method would go in $player, since it is mostly concerned with $player's variables,
    # but we don't want to clutter up $player's package, and we don't know if anyone else
    # will ever want to use this code.

    my $player = shift;
    my $payout = shift;

    my $money = $player->query_money();
    if($money + $payout <= $player->query_max_money()) {
      # everything fits under the ceiling
      $player->set_money($money + $payout);
      $nickels_on_floor = 0;
    } else {
      # fill the player up to the ceiling; the rest lands on the floor
      $nickels_on_floor = $money + $payout - $player->query_max_money();
      $player->set_money($player->query_max_money());
    }
  }

Associating methods with our client object that reasonably belong in the server object ($player, in our case), isn't always the worst solution. In fact, if you put them there and leave them until it is clear that they are needed elsewhere, you'll find that either they are globally applicable, they only apply to this client, they apply to a special case of the client, or they apply to a special case of the server.

1. Applies only to this particular client: Leave the server's accessors in the client, in this case, Casino.

2. Nearly every client can benefit from this code: Put the logic in the server, in this case, $player.

3. Applies to a special case of clients: Consider a Facade for $player. Not worth it? Toss up between #1 and #2.

4. Applies to a special case of servers: Subclass $player's package.
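The second case, moving the logic into the server, might look something like this sketch. The Player package, its hash fields, and the add_money() method are invented here for illustration and aren't part of the example above. The point is only that the ceiling is enforced in one place, and the one extra piece of information the client needs (what didn't fit) comes back as the return value.

```perl
use strict;
use warnings;

package Player;

sub new {
    my ($class, %args) = @_;
    my $self = {
        money     => $args{money}     || 0,
        max_money => $args{max_money} || 100,
    };
    return bless $self, $class;
}

sub query_money     { $_[0]{money} }
sub query_max_money { $_[0]{max_money} }

# Accept as much of $amount as fits under the ceiling.
# Returns the part that did not fit (0 if everything was accepted).
sub add_money {
    my ($self, $amount) = @_;
    my $room     = $self->{max_money} - $self->{money};
    my $accepted = $amount < $room ? $amount : $room;
    $self->{money} += $accepted;
    return $amount - $accepted;
}

package main;

my $player = Player->new(money => 90, max_money => 100);

# 10 fit in the player's pockets, the other 15 end up on the floor
my $nickels_on_floor = $player->add_money(25);
```

Casino's pay_out() then shrinks to a single call, and no other client has to repeat the query/compare/set sequence.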

It's okay to do it "wrong". Each new thing that gets built will give you more and more insight into how things really need to be able to work. The important thing is to continue to incorporate these lessons into the code, to keep the code in line with reality, and never be afraid of breaking your code. If you're afraid of breaking your code in the name of making it better, it has you hostage. If you're afraid of breaking it because you think it'll take too long to fix it to work again after the change, it has already grown rigid, and only frequently breaking and fixing it will allow it to regain its flexibility. Take Jackie Chan, for instance. Having broken countless bones, he's only gotten stronger, braver, and apparently, more skilled at walking on a broken leg, more knowledgeable about his limits, and adept at healing. Alternatively, if you're afraid of subtle bugs creeping in undetected, you've got murky depths syndrome. Perhaps a lot of it is poorly understood Lava Flow code, that was laid down once, built on top of, and has become permanent for it. Reading through the dark murky code is a good start. A pass now and then keeps the possibilities and implications fresh in your mind. However, this is time consuming and will ultimately miss implications. There is no substitute for knowledge of the code, and neither is there substitute for testing. There is a special class of code where every bit of logic is exercised on every execution. Mathematical modeling applications that work on large, well understood datasets fit this description. Any subtle bug would give dramatic bias in the output as soon as the buggy program were run. Normal programs don't have this luxury - their bugs lurk for ages, possibly until maintenance headaches dictate that the program be abandoned and rewritten. We can't understand everything in a large program, but we can contrive data sets and test applications that work out every feature of our module.
Writing artificial applications that use our modules, and coming up with bizarrely improbably-in-every-way datasets simulates the "luxury" case where all of the dark murky depths are used every run, in our case, every test run. The only time dark murky depths and Lava Flow code, the code most in need of a refactor, can be attacked is when we have a definitive method for discovering whether or not we've broken it. "Measure twice, cut once". If you're anything like me, you like flying by the seat of your pants. If you go to the movies or watch the tele, you know that every fighter pilot struggles with this issue: do I trust my intuition and wing it, or do I go by the book? Luke Skywalker destroyed the Death Star without that damn useless targeting computer. The architects that built skyscrapers certainly have to think outside of the box, so to speak, to come up with techniques and ideas for building beyond the bounds of what is believed possible, but no one would trust an architect that couldn't back up his gut instincts with cold, hard math. Solution? Code with your heart, but be the first to know when you make the inevitable wrong guess. Write seat of your pants code, but write first class scientific tests.

Cut and paste code is a sign of larger problems.

Categories

See Also

PrematureOptimization

Premature optimization is the root of all evil.

Don't optimize for bugs. Don't optimize for poor implementations of language interpreters. Don't optimize a naive implementation. All of these things will change right out from under you. Code that is dependent on something broken for speed will run slower when the real problem is fixed.

Optimization isn't evil - only premature optimization is. In each of these examples, if the more general case is optimized rather than specific cases, everything is right in the world. People failing to see the bigger picture are the ones causing grief. If you think you see the bigger picture, you almost certainly don't. Like a good security consultant, a good programmer is pessimistic about everything.

I won't bore you with statistics about computers getting faster more quickly than you can change your code. The fact is, there is a niche for squeezing the last drop of performance out of an application. This niche shows people what they could do if only their computer were a little faster, and it drives hardware performance.

The conclusion to draw from this is: If there is a quicker way to do something in Perl that is less readable or requires jumping through hoops, it is a bug in the more readable implementation, and the more readable implementation will soon be fixed. Write good Perl. Let Perl do its thing.

There are some optimizations which are considered good style, and therefore aren't premature:

See Also

BlindFaith

"If I use the Object Oriented features, I must be benefiting from them".

OO is no silver bullet. It isn't a cure-all, and it is entirely possible to do more harm than good with it. It's a good indication that something has gone wrong if it is adding complexity rather than removing it. Remember, keep it simple. When it becomes clear how to refactor your code, do so then, not before.

Can interchangeable objects be used interchangeably? Can one object be replaced with several objects? If not, consider adding a common Interface Type to the like objects, and creating an Abstract Factory to create and return the correct one for any given situation. Think of saving files using one of many filters.

Are objects being used purely to hold values? You've probably got one or two big fat Big Ball Of Mud objects. Start moving logic out into the package that contains the data it most closely identifies with. Insert shims to keep everything running, that delegate the method call to the object, if you must, and experiment with removing the shim over time. If nothing else, this sets a precedent, and new code can be written immediately in the new, correct way, rather than more and more code accumulating using the old, ugly approach.
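As a rough sketch of such a shim: the Game and Inventory packages and the add_item() method below are made-up names, not taken from any code in this text. The logic has moved into the package that owns the data, and the big object keeps a one-line method that forwards old-style calls.

```perl
use strict;
use warnings;

package Inventory;

sub new {
    my ($class) = @_;
    return bless { items => [] }, $class;
}

# The logic now lives next to the data it works on.
sub add_item {
    my ($self, $item) = @_;
    push @{ $self->{items} }, $item;
    return scalar @{ $self->{items} };
}

sub items { @{ $_[0]{items} } }

package Game;

sub new {
    my ($class) = @_;
    return bless { inventory => Inventory->new }, $class;
}

sub inventory { $_[0]{inventory} }

# Shim: existing callers still say $game->add_item(...).
# It only delegates, so it can be deleted once nothing calls it.
sub add_item {
    my $self = shift;
    return $self->{inventory}->add_item(@_);
}

package main;

my $game = Game->new;
$game->add_item('sword');               # old style, goes through the shim
$game->inventory->add_item('shield');   # new style, talks to Inventory directly
```

Putting a warn() in the shim for a while is one cheap way to find out who still depends on it before removing it.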

Can you remove an object from the system without breaking every other object? If not, they are too interdependent, with very few exceptions. Even with the Microsoft Windows operating system, something like "the registry" sounded like a great idea at first. Any program, as well as the operating system, could store configuration and run time data in a central database. In practice, it creates a single point of failure, frequently sustaining damage that causes the entire system to need to be restored from backup or reinstalled. The file grows too large and the operating system fails on any attempt to grow it beyond the limit of the max file size. Windows was designed with the registry being a core, unchanging idea, but in retrospect, it may have been better if it weren't an absolute. Examining your code for objects which absolutely cannot be removed provides great insight into over dependence. If every object is dependent on every other object, object orientation is doing nothing for you. If any object can be removed with minimal damage to the overall structure, you have something healthy and organic.

http://wiki.slowass.net/?CategoryToDo

See Also

BigBallOfMud

Procedural code converted to OO lends itself to one main object with a lot of little objects hung off of it. The interdependency in the code doesn't change, and objects don't become noticeably autonomous. Like things may be grouped together, but for the wrong reasons: historically they have been used in sequence, or they form an implementation and interface wrapped together.

A BIG BALL OF MUD is haphazardly structured, sprawling, sloppy, duct-tape and bailing wire, spaghetti code jungle. We've all seen them. These systems show unmistakable signs of unregulated growth, and repeated, expedient repair. Information is shared promiscuously among distant elements of the system, often to the point where nearly all the important information becomes global or duplicated. The overall structure of the system may never have been well defined. If it was, it may have eroded beyond recognition. Programmers with a shred of architectural sensibility shun these quagmires. Only those who are unconcerned about architecture, and, perhaps, are comfortable with the inertia of the day-to-day chore of patching the holes in these failing dikes, are content to work on such systems. - http://wiki.slowass.net/?PatternLanguageOfProgramDesign 4

Also known as a Stove Pipe System, as apparently stove pipes were prone to corrosion and needed frequent repair with whatever was at hand, creating a discombobulated kludge.

An ill-assorted collection of poorly matching parts, forming a distressing whole. - Jackson Granholme

The problem with retrofits is they are typically hastily done and never improved before the next story is built. They come under an immediate pressing concern that overwhelms any reasonable ability to think of the future. Indeed, the future won't exist at all for the project if the retrofit isn't done. Not even in Las Vegas are floors built so aggressively.

Regardless of whether you're in the Windows camp or the Unix camp, you're using an operating system built for a 16 bit system that has been retrofit, but never actually completely rewritten. Other operating systems have equally interesting stories - http://wiki.slowass.net/?MacOS 1 through 9 were written for a 32 bit native address space, but memory protection was retrofit, while Unix was written for a 16 bit address space and retrofit for 32, but incorporated memory management from the beginning. http://wiki.slowass.net/?AmigaOS was originally Tripos, but was written in BCPL, a language that had one data unit - the machine word - making the conversion to a 32 bit processor and system bizarrely easy, while making mundane programming tasks bizarrely painful. C later grew out of BCPL, where it cleaned up the syntax, introduced subscripts on arrays of different sized units of memory, then later structs, unions, and countless other modern marvels - but all of this is neither here nor there.

Refactor mercilessly - http://wiki.slowass.net/?RefactoringImprovingTheDesignOfExistingCode

Systems can effectively be adapted, and in order to build very far at all, you almost have to adapt an existing system. Adaptation cannot be sustained without time spent making fundamental changes - see http://wiki.slowass.net/?InvariantsArentAlwaysConstants - but fundamentally it is a better investment to maintain and adapt existing systems rather than rewrite them. Most spectacular software industry failures arose from failure to maintain systems followed by an attempt to rewrite them from scratch. Most successes can trace their code back to the 1970s: The SAS system, Unix, DB2, and Signaling System 7, for example. http://wiki.slowass.net/?JoelOnSoftware states that it takes 10 years to write a program. I'd place that as a minimum. Most software starting life in the 1970s is now rock solid, mature, and portable. Most programs that started life in the mid-1980s are still having growing pains, stability problems, and their owners can't bear the expense of porting them.

Perl allows you to quickly create applications. Perl itself could be considered a "BigBallOfMud" in BigBallOfMud, with complexity oozing out every pore. Perl has been around longer than 10 years. A large part of the code reuse of a script is from the interpreter itself. Writing an interpreter is one way of writing an API for code reuse. This gives significant lead time on small scripts, but growing and changing applications hit a ceiling even quicker because of this accelerated start. Perl scripts quickly reach the point where they need to be detangled.

"GodObject" in GodObject has specific steps for migrating code and data out of a monolithic object.

<s>This exhibits "LayeringPattern" in LayeringPattern, Polymorphism, "LooseCoupling" in LooseCoupling, "CommandObject" in CommandObject, Routing, and http://wiki.slowass.net/?EventListener patterns. </s>

http://wiki.slowass.net/?CategoryToDo

See "PerlDesignPatterns" in PerlDesignPatterns for the table of contents

SpaghettiCode

No one understands it, so no one refactors it. Just as it is almost impossible to untangle a plate of spaghetti, code with no visible structure and no logical structure is daunting. Structured Programming contributed to the world the idea that the code should visibly reflect its logical structure: this was the birth of indenting. Previously, a goto would bounce back a few lines in flow, and another one somewhere in the middle would bounce you past the last goto.

  10 let a=a+1
  20 if a > 10 then goto 50
  30 print a:print "\n"
  40 goto 10
  50 stop




  foreach my $a (1..10) {
    print "$a\n";
  }

Despite the systematic banishment of these languages *, people still find ways to write code that has this problem on a large scale:

1. Side effects: Each method called seems to do countless unrelated unexpected things, making the normal flow difficult to understand or discover. When writing new code, it is often impossible to reuse existing code because of the unfathomed grouping of unrelated tasks.

2. Dark heart: The heart of each routine, object, module, etc is buried somewhere deep in its bowels, poorly or not marked, and reached through an obscure path, kind of like an Egyptian pyramid.

3. Ransom transaction: Data is communicated through global variables, or stashing data in some remote object. This is akin to conducting a ransom transaction by demanding that money be thrown into a dumpster in an abandoned industrial complex to be picked up by someone who will presumably flee and kill the kidnapee should either cops be there or money not be. This is an entirely unwholesome way to conduct a transaction.

4. Three cups and nut: Large amounts of unrelated things are grouped together without regard for when, how, by whom, in what order, under what conditions, or why they are used. Since they all look alike and any may be used at any time, getting lost is easy. Which one actually gets used may well be a sleight of hand anyway.

5. Wheel factory: Reinventing the wheel, or stack, or program control constructs, or parameter passing mechanisms, or anything else which should be both standardized throughout a language and completely factored out of the language. This clutters the program with difficult to understand, repeated idiom.

If Spaghetti code is needed, it can take on a life of its own. Most large projects have some legacy code that forms the heart of their project that is no longer represented by a human who wrote it.

See also: "LavaFlow" in LavaFlow, "GodObject" in GodObject, "ObjectOrgy" in ObjectOrgy


That's a thought. Some common goto idioms, documented in the interest of untangling them. The Linux kernel uses a goto-on-failure idiom where error return codes are set just in case, but that is the actual result code only when an error causes a goto to exit the function. Other programs simulate stacks using temporary variables that they stash things in - the http://wiki.slowass.net/?WebWanker sure suffered from that.

LavaFlow

When code just kind of spews forth and becomes permanent, it becomes an architectural feature of the archeological variety. Things are built atop the structure without question and without hope of changing what is beneath them. The existing code is seen as a historical curiosity.

XXX There is a tale of a computer manufacturer, back in the days when each vendor had their own CPU. There was a bug in their new processor, and production schedules didn't give them time to work it out. The department responsible for coding the system software (operating system) for the thing was instructed to work around it. The system software dutifully avoided tickling the bug, and documented the presence of it for anyone writing applications for the machine. Software was ported to the machine, and, unsure what to make of the bug, its authors warned end users that certain features of the applications didn't function correctly on this machine.

BoatAnchor

Hardware or software that serves no useful purpose that is kept around for political reasons. Often, everyone is secretly waiting for it to be used again, so it is no longer a derelict eyesore.

http://wiki.slowass.net/?CategoryToDo

Not sure this antipattern really fits with the motif of this text.

BusySpin

Problem: Using 100% of the CPU to wait for an event.

  while(1) {
    if(@queue) {
      dosomething();
    }
  }

This example applies to threaded code, but non threaded code can fall prey as well:

  while(! -f $file) { }
  # do something with $file here

Both of these examples attempt to use 100% CPU resources. In the best case, you make everything sluggish. Worst case, you never release the CPU so the thing you're waiting for happens. On different systems, the results are different, too. Some threads preempt. Others never take the CPU back until you call yield() or an IO function! Writing code like this is the quickest way to make things work on some systems but not others.

Using sleep() and yield() from the threads package is an improvement. Sometimes polling is unavoidable. When you wrote the code you're waiting on, Thread::Semaphore lets you easily and efficiently communicate readiness between threads. Unix programs have no way of being notified when a file shows up, so polling may be the only solution: just sleep() so others can get work done.
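When polling for a file really is the only option, a polite version might look like this sketch. wait_for_file() and its timeout and interval arguments are names made up for this example. It checks, sleeps, and checks again, and gives up eventually instead of spinning forever.

```perl
use strict;
use warnings;

# Poll for $file, sleeping between checks so other processes get the CPU.
# Returns 1 if the file showed up, 0 if $timeout seconds passed first.
sub wait_for_file {
    my ($file, $timeout, $interval) = @_;
    $interval ||= 1;
    my $deadline = time() + $timeout;

    until (-f $file) {
        return 0 if time() >= $deadline;
        sleep $interval;    # give up the CPU instead of spinning
    }
    return 1;
}
```

A caller would write something like wait_for_file($file, 60) or die "gave up waiting for $file"; before doing something with $file. The interval is a trade between how quickly you notice the file and how often you bother the filesystem.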

Non-Blocking I/O

IO::Socket::INET (http://www.cpan.org/modules/by-module/IO/ Socket::INET) has a ->blocking(0) method to disable blocking. Blocking halts the program until data is available to read. In a program running as a daemon or server - see http://wiki.slowass.net/?DaemonProcess - that needs to service I/O on multiple channels, this is unacceptable - blocking must be disabled.

Code like this will be written:

  # this program attempts to use 100% of CPU time




  use IO::Socket::INET;
  my $sock = IO::Socket::INET->new(PeerAddr => 'www.perl.org',
                                   PeerPort => 'http(80)',
                                   Proto    => 'tcp');
  $sock->blocking(0);




  while(1) {
    read $sock, my $buffer, 8192;
    do_something_with_data($buffer);
  }

The program should sleep, waiting for data to arrive, rather than looping constantly and trying over and over again. See "SelectPollPattern" in SelectPollPattern for a solution using the select() call.
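As a taste of what that looks like, here is a small sketch using IO::Select, the object wrapper around select() that ships with Perl. To keep it self-contained, a pipe stands in for the network socket, and appending to $received stands in for whatever processing the real program does. Both are assumptions of this example, not part of the program above.

```perl
use strict;
use warnings;
use IO::Select;
use IO::Handle;

# A pipe stands in for the socket so this runs without a network.
pipe(my $reader, my $writer) or die "pipe: $!";
$writer->autoflush(1);
print $writer "hello\n";
close $writer;                      # the reader sees EOF after the data

my $select   = IO::Select->new($reader);
my $received = '';

while ($select->count) {
    # Sleep in the kernel (for up to 5 seconds) until something is readable.
    my @ready = $select->can_read(5);
    last unless @ready;             # timed out with nothing to do

    for my $fh (@ready) {
        my $bytes = sysread $fh, my $buffer, 8192;
        if ($bytes) {
            $received .= $buffer;   # real code would process the data here
        } else {
            $select->remove($fh);   # EOF or error: stop watching this handle
            close $fh;
        }
    }
}
```

The loop only wakes up when there is something to read or the timeout expires, so it can watch many handles at once without burning CPU.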

Signals to Wake By

sleep() and I/O operations are aborted by incoming signals, as sent from the shell with the "kill" command or from another process using the kill() function on your PID.

When I/O is aborted, it returns a zero-length string, not undef. Read-loops using while() work correctly:

  while(<$fh>) {
    print;
  }

This may print zero length strings sometimes, but no one will ever know. while(<$fh>) continues looping.

Sometimes you want to sleep for a fixed period, no matter what.

  my $waketime = scalar(time()) + 60*60*8.5; # longer on the weekends
  while(scalar time() < $waketime) {
    sleep $waketime - scalar time(); # sleep the rest of the duration - probably
  }

"DebuggingPattern" in DebuggingPattern has a tiny example of dumping stack when a signal comes in.

When fork()ing, children send CHLD signals to their parent when the child dies. The parent should have a signal handler set up to reap these: see http://wiki.slowass.net/?DaemonProcess.
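A minimal reaper might look like the following sketch. The %exit_status hash is just something made up here to show that the handler ran. The loop in the handler matters because several children can die before the handler gets a chance to run.

```perl
use strict;
use warnings;
use POSIX ':sys_wait_h';

my %exit_status;    # pid => exit code of each child reaped so far

$SIG{CHLD} = sub {
    local ($!, $?);     # don't clobber these for the interrupted code
    # One signal may stand for several dead children, so loop until none are left.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $exit_status{$pid} = $? >> 8;
    }
};

my $child = fork();
die "fork: $!" unless defined $child;

if ($child == 0) {
    exit 3;             # the child does its work and exits
}

# The parent carries on; sleep() returns early when the signal comes in.
sleep 1 until exists $exit_status{$child};
```

Without the handler (or an explicit wait), each dead child hangs around as a zombie until the parent exits.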

Categories

See Also

RaceCondition

Problem: Multitasking operating systems change tasks at unexpected times, such as between two lines of the program, or half way through a statement. This creates subtle bugs that pop up "now and then".

Solution: Use flock() and semaphores to guard access to things accessed by more than one process or thread.

Nature of the Race Condition

  Malak tells you: wee! :-)  ok here is the question.  if i have two
  copies of a script downloading the same set of files (to make it go faster) i
  want to make sure that one script doesnt try to download the same file as the
  other.  right now i'm using a -e check to see if the file exists but im not
  sure if this will ever cause a problem if both scripts happen to hit the same
  file at the same time




  Yes, there is a race condition between the time that you test for the file
  with -e and when you create the file with open(). It could well
  happen that you test to see if the file is there, it isn't, you go to
  open it for write and overwrite another process's file.




    if(! -e $file) {
      open my $f, '>', $file;
      download($f);
    }




  You tell Malak: yes. use sysopen(). open for write with O_CREAT|O_EXCL, so
  it fails if the file already exists. if it returns error status, the race
  condition bit ya, move on to the next file




  Malak tells you: not sure if i can do that. im calling an external
  program to actually do the download...




  You tell Malak: why don't you use threads, then? then you can create
  a hash that is shared between all threads and use it to do locking




    use threads;
    use threads::shared;
    my %locks :shared;




  Malak tells you: i wonder if the race condition matters though...  
  which ever process finishes downloadig last should write the file and replace
  whatever the other file wrote, right?




  You tell Malak: yeah




  Malak tells you: i dont care if that happens, all i care about is
  that no files get corrupted, seemingly downloaded good when they arent




  You tell Malak: actually, on unix, what would happen is the same file
  would be downloaded, twice, at the same time, but only one of the inodes would
  actually exist on the filesystem, so when the other process closes its
  filehandle, the filesystem will deallocate the blocks
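For the record, the sysopen() approach can be sketched like this. claim_file() is a name made up for the example. O_CREAT together with O_EXCL makes "test whether it exists" and "create it" a single step as far as other processes are concerned, which is exactly what the -e test followed by open() can't do.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

# Try to claim $file by creating it. Returns a filehandle if we created it,
# or undef if it already existed (or couldn't be created for another reason).
sub claim_file {
    my ($file) = @_;
    sysopen(my $fh, $file, O_WRONLY | O_CREAT | O_EXCL) or return undef;
    return $fh;
}
```

Each copy of the script would call claim_file() on the next file name and simply move on when it gets undef back. That should still help when an external program does the actual download: the script claims the name first and only then runs the downloader, which overwrites the empty file.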

File-Access Race Conditions

Files require coordinated access when there is any chance that multiple processes could attempt to access the same file at the same time. It could be two instances of the same application running (Mozilla, mutt, gtk-gnutella), or it could be two fork()ed processes of the same application, or threads.

http://www.perldesignpatterns.com/?PerlDesignPatterns displays a web counter that I cooked up as a quick amusement some time ago. It is a 1 bit animated GIF that displays 30 iterations of Conway's Game of Life [41] applied to the current hit number. At the time of this writing, it is at 3866. It uses flock() to guard access to the "counter" file, which contains the current hit number. Initially, I didn't bother, and every 1000 hits or so, it would reset to 0. Ooops. Just as one process had opened the file for write and truncated it, the other process went to read the value, and got zero. The second process would finish after the first, and it would increment zero to get one, and write that back.

All of these dire warnings apply to access to datastructures in memory, such as those using Sys::Mmap (http://www.cpan.org/modules/by-module/Sys/ Mmap), and to .dbm files accessed with dbmopen() or a similar routine. This code depends on the fine http://wiki.slowass.net/?NetPBM package, available from http://www.acme.com .

The lock should be established before reading, in cases where a value is read, modified, then written back - cases including counters like web counters.

  #!/usr/bin/perl

  use Fcntl qw(:flock);

  print "Content-type: image/gif\n";
  print "Pragma: no-cache\n";
  print "\n";

  my $pid = $$; # our pid, not the pid of some shell or something

  umask 000;
  local $ENV{PATH} = '/usr/local/bin';

  # lock before reading, and hold the lock until the new value is written back
  open my $f, '+<', 'counter' or die "can't open counter: $!";
  flock($f, LOCK_EX) or die "can't lock counter: $!";
  my $counter = <$f>;
  $counter++;
  seek $f, 0, 0;
  print $f "$counter\n";
  close $f; # closing the file releases the lock

  # render the number as a bitmap, then run Life on it one generation per frame
  system "pbmtext $counter | pnmcrop 2>/dev/null | pnmenlarge 3 > counter10.$pid.pbm 2>/dev/null ";
  for(my $i=10;$i<30;$i++) {
    my $j = $i + 1;
    system "pbmlife counter$i.$pid.pbm > counter$j.$pid.pbm 2>/dev/null";
  }
  open my $gif, "ppmtogif -delay 40 -loop 100 counter??.$pid.pbm 2>/dev/null|";
  while(read $gif, my $buf, 1024, 0) {
    print $buf;
  }
  close $gif;

  # this isn't working :(

  for(my $i=10;$i<31;$i++) {
    unlink "counter$i.$pid.pbm";
  }

Didn't anyone ever tell you web-page hit counters were useless? They don't count number of hits, they're a waste of time, and they serve only to stroke the writer's vanity. It's better to pick a random number; they're more realistic.

Here's a much better web-page hit counter:

           $hits = int( (time() - 850_000_000) / rand(1_000) );

If the count doesn't impress your friends, then the code might. :-)

- "PerlDoc" in PerlDoc:perlfaq5 by http://wiki.slowass.net/?TomChristiansen and http://wiki.slowass.net/?NathanTorkington

When several processes are reading the current value (as it stands at any given moment), and one process is independently generating and storing new values, file I/O still has a race condition where the file may be null, between the time the file is truncated and the new data written. This also requires locking. http://wiki.slowass.net/?NetPBM has an example of a multi-player Life game, where locking is not needed. Single bits are modified in the Sys::Mmap (http://www.cpan.org/modules/by-module/Sys/ Mmap) 'd image during any hit, and the current board is displayed. Since random memory access is being used rather than file I/O, truncated files aren't a concern. SQL engines want something like this, but the problem is far more complex. They must use generational locks, where each "update" or "insert" represents a generation. Only records marked at or earlier than the current generation at the time a query is started are returned in a query. "update" must add a new record with a newly incremented generation number before removing the old one in order to let currently executing queries run without garbled results. This arrangement lets one "insert"/"update" or other write operation happen at the same time as an arbitrary number of queries. Generational systems like this are also used in garbage collection, to avoid race conditions between the thread that is collecting unreferenced memory and the running program.
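For the plain file case, the one-writer, many-readers arrangement can be sketched as below. store_value() and fetch_value() are names invented for this example. The writer opens the file without truncating it, takes an exclusive lock, and only then empties and rewrites it. Readers take a shared lock, so they never see the file in its empty in-between state.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);

# Writer: open without truncating, lock, and only then truncate and write.
sub store_value {
    my ($file, $value) = @_;
    sysopen(my $fh, $file, O_RDWR | O_CREAT) or die "open $file: $!";
    flock($fh, LOCK_EX)                      or die "flock $file: $!";
    truncate($fh, 0)                         or die "truncate $file: $!";
    print $fh "$value\n";
    close $fh or die "close $file: $!";      # flushes, then releases the lock
}

# Reader: a shared lock lets any number of readers in at once,
# but makes them wait while a writer holds the exclusive lock.
sub fetch_value {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "open $file: $!";
    flock($fh, LOCK_SH)      or die "flock $file: $!";
    my $value = <$fh>;
    close $fh;
    chomp $value if defined $value;
    return $value;
}
```

Note that flock() is advisory: it only protects against other processes that also ask for the lock, so every reader and writer has to play along.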

Thread Datastructure Race Conditions

XXX

http://wiki.slowass.net/?CategoryToDo, http://wiki.slowass.net/?CategoryAntiPattern

See Also

GodObject

Synopsis

Anti-patterns stereotype pathological, degenerate code. The God Object anti-pattern afflicts Perl programs at a shocking rate. It is a holdover from top down design in procedural languages. It's the first trap aspiring Object Oriented programmers fall into, so it's a suitable first Anti-Pattern. I assume that you know the basic syntax for creating objects in Perl. If not, go read Tom Christiansen's tutorial at http://search.cpan.org/author/GSAR/perl-5.6.1/pod/perltoot.pod [42] then come right back - this is the next thing you need to know.

Anti Patterns

[43]

Programming is fun. "Hacking on a program" is an expression of the glee that comes from rapidly adding neat features to a program.

Programmers are optimists. We assume that each feature in the specification for a project can be added in a constant amount of time even as the code grows, and we add each new feature just like we added the last. In other words, we assume that programs are completed in linear time. The last half, recursively, takes twice as long.

Unchecked code growth destroys a program from the inside out. Sure signs of unchecked growth are:

Code degeneration causes lack of programmer interest, which leads to forked Open Source projects, over budget or failed commercial ventures, and most horrifically, loss of interest in working on a program that used to be fun.

Reading difficult to comprehend code is work. If the quality of the code is good, this work is rewarded. If the quality is poor, the reader suffers the code with no joy or benefit. There are volumes full of difficult to understand code that people willingly pore over. Programming Pearls is one such book [44]. The reader's patience in studying the algorithms is rewarded with deep insight. Quality is difficult to put your thumb on and impossible to quantify. Just because code is difficult to read doesn't mean it isn't worth your time. It is our job to make it worth reading, worth keeping, worth reusing, and worth hacking on.

God Object Anti-Pattern

Named for the conspicuous centralization of control. It is a holdover from procedural languages and top-down design. Top-down design states that the way to design a program is to start with a main routine, and repeatedly break it i