
NAME

Text::TypingEffort - Calculate the effort required to type a given text

SYNOPSIS

  use Text::TypingEffort qw/effort/;
  
  my $effort = effort("The quick brown fox jumps over the lazy dog");

$effort will be a hashref something like this

  $effort = {
      characters => 43,     # the number of characters in the text
      presses    => 44,     # key presses needed to type the text
      distance   => 950,    # millimeters the fingers moved while typing
      energy     => 2.2..., # the energy (Joules) used while typing
  };

DESCRIPTION

Text::TypingEffort is used to calculate how much physical effort was required to type a given text. Several metrics of effort are used. These metrics are described in detail in the "METRICS" section.

This module is useful for determining which keyboard layout is more efficient, for making API/language design decisions, or to show your boss how hard you're working.

Function Quick Reference

The following quick reference provides brief information about the arguments that the functions can take. More detailed information is given below.

 # effort() with a single argument
 my $effort = effort(
 
    $text | \$text                        # the text to analyze
 );
 
 # effort() with named arguments
 my $effort = effort(
 
    text     => $text | \$text,           # the text to analyze
    file     => $filename | $filehandle,  # analyze a file
    layout   => 'qwerty'                  # keyboard layout
              | 'dvorak'
              | 'aset'
              | 'xpert'
              | 'colemak',
    unknowns => 0 | 1,                    # tally unknown chars?
    initial  => \%metrics,                # set initial values
    caps     => 0 | 2 | 3 | ...           # Caps Lock technique
 );
 
 # layout()
 my $l = layout;                          # get QWERTY layout
 my $l = layout($layout_name);            # get named layout
 
 # register_layout()
 register_layout($name, \@layout);        # register custom layout

FUNCTIONS

effort [$TEXT | \$TEXT]

The argument should be a scalar or a reference to a scalar which contains the text to be analyzed. If no parameter is provided, $_ is used as the value of $TEXT. Leading whitespace on each line of $TEXT is ignored since a decent text editor handles that for the typist. Only characters found on a standard US-104 keyboard are tallied in the metrics. That means that accented characters, Unicode, etc. are not included. If a character is unrecognized, it may be counted under the 'unknowns' metric (see that documentation).

effort %ARGUMENTS

effort() may also be called with a list of named arguments. This allows more flexibility in how the metrics are calculated. Below is a list of acceptable arguments. In summary, calling effort like this

 effort($text);

is identical to explicitly specifying all the defaults like this

 effort(
    text     => $text,
    layout   => 'qwerty',
    unknowns => 0,
    initial  => {},
    caps     => 4,
 );

text

Specifies the text to be analyzed. The value should be either a scalar or a reference to a scalar which contains the text. If neither this argument nor file is specified, $_ is used as the text to analyze.

file

Specifies a file which contains the text to be analyzed. If the value is a filehandle which is open for reading, the text will be read from that file handle. The filehandle will remain open after effort is finished with it.

If the value is a filename, the file will be opened and the text for analysis read from the file. If neither this argument nor text is specified, $_ is used as the text to analyze.

layout

Default: qwerty

Specifies the keyboard layout to use when calculating metrics. Acceptable, case-insensitive values for layout are: qwerty, dvorak, aset, xpert, colemak. If some other value is provided, the default value of 'qwerty' is used.
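To compare layouts for the same text, something like the following sketch may help. It relies only on effort and the layout names listed above; the %distance hash is just a name chosen for this example.

```perl
use strict;
use warnings;
use Text::TypingEffort qw/effort/;

my $text = "The quick brown fox jumps over the lazy dog";

# Collect the distance metric for each layout of interest.
my %distance;
for my $layout (qw/qwerty dvorak colemak/) {
    my $e = effort( text => $text, layout => $layout );
    $distance{$layout} = $e->{distance};
}

# Print the layouts from least to most finger travel.
for my $layout ( sort { $distance{$a} <=> $distance{$b} } keys %distance ) {
    printf "%-8s %8.0f mm\n", $layout, $distance{$layout};
}
```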

unknowns

Default: 0

Should a histogram of unrecognized characters be returned with the other metrics? A true value indicates yes and a false value no. Tallying this histogram takes a little bit more work in the inner loop and therefore makes processing ever so slightly slower. It can be useful for seeing how much of the text was not counted in the other metrics.

See unknowns in the "METRICS" section for information on how this option affects effort's return value.

initial

Default: {}

Sets the initial values for each of the metrics. This option is the way to have effort accumulate the results of multiple calls. By doing something like

 $effort = effort($text_1);
 $effort = effort(text=>$text_2, initial=>$effort);

you get the same results as if you had done

 $effort = effort($text_1 . $text_2);

except the former scales more gracefully. The value of initial should be a hashref with keys and values similar to the result of a previous call to effort. If the hashref does not contain a key-value pair for a given metric, the initial value of that metric will be its normal default value (generally 0).

If the value of initial is not a hashref, effort proceeds as if the initial argument were not present at all. This behavior may change in the future, so don't rely upon it.
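A minimal sketch of that accumulation pattern over several pieces of text follows. The @chunks array and the $total variable are names invented for the example; only effort and its text and initial arguments come from the documentation above.

```perl
use strict;
use warnings;
use Text::TypingEffort qw/effort/;

my @chunks = ( "The quick brown fox ", "jumps over the lazy dog" );

# Start from an empty hashref, which is the documented default for initial.
my $total = {};
for my $chunk (@chunks) {
    # Feed the running totals back in through the initial argument.
    $total = effort( text => $chunk, initial => $total );
}

printf "%d characters, %d presses\n", @{$total}{qw/characters presses/};
```

The same loop works when the chunks are lines or blocks read from a large file, which is where accumulating scales better than concatenating everything first.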

caps

Default: 4

Determines how strings of consecutive capital letters should be handled. The default value of 4 means that four or more capital letters in a row should be treated as though the user pressed "Caps Lock" at the beginning, then typed the characters and then pressed "Caps Lock" again. This behavior more accurately models what typical users do when typing strings of capital letters. You may change the number of capital letters that must be in a row in order to trigger this behavior by specifying an integer greater than 1 as the value of the caps argument. If you specify the value 1, the value 2 will be used instead.

If the value of caps is 0, capital letters are treated as though the user pressed Shift for each one. If undef is given, the default value of caps is used.

When caps handling is enabled, "capital letter" means any character that can be typed without the Shift key when Caps Lock is on. That includes characters such as '.' and '/' and '-' etc. However, the string of consecutive caps must start and end with a real capital letter. That way, a string such as '-----T-----' won't be calculated using Caps Lock.
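The run-matching rule can be illustrated with a small regular expression. This is only a sketch and not the module's code: caps_lock_runs is a hypothetical helper, the set of unshifted characters it allows inside a run is a guess, and comparing the whole run length against the threshold is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: return the substrings that this sketch would treat
# as typed with Caps Lock, given a threshold like the caps argument.
# A run must start and end with A-Z. In between, capitals plus a few
# unshifted characters ('.', '/', '-', digits) are allowed (assumed set).
sub caps_lock_runs {
    my ( $text, $caps ) = @_;
    my @runs;
    while ( $text =~ m{([A-Z](?:[A-Z0-9./-]*[A-Z])?)}g ) {
        # Assumption: the whole run length is compared to the threshold.
        push @runs, $1 if length($1) >= $caps;
    }
    return @runs;
}

print join( ", ", caps_lock_runs( "See README.TXT or -----T----- and NASA", 4 ) ), "\n";
# prints "README.TXT, NASA"
```

Note that '-----T-----' yields only the one-letter run 'T', which is below the threshold, so it is not counted as a Caps Lock run.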

layout [$NAME]

Returns an arrayref representing the requested layout or undef if the given name is unknown. If no layout name is provided, the QWERTY layout is returned.

See register_layout below or the Text::TypingEffort source code for examples of the contents of the arrayref.

register_layout $NAME, \@LAYOUT

Register a new layout, using the given name. The name is stored without regard to case, so 'NAME' and 'name' are considered the same. The layout itself should be an arrayref containing each key's character and its shifted version. Running the code below displays a pseudo-code snippet showing how the QWERTY keyboard layout is defined. Start in the upper-left corner of a QWERTY keyboard and follow along through the pseudo-code. You should get the idea. You can also find documented examples in the source code.

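 use strict;
 use warnings;
 use Text::TypingEffort qw/layout/;
 # work on a copy so that splice does not empty the returned arrayref
 my @keys = @{ layout() };
 print "register_layout('qwerty', [qw{\n";
 while ( my ($lower, $upper) = splice(@keys, 0, 2) ) {
     print "\t$lower $upper\n";
 }
 print "}]);\n";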

Typically, register_layout is called just prior to effort. For example:

 my @layout = qw{
    ...
 };
 register_layout('my custom layout', \@layout);
 my $e = effort(
    text   => $text,
    layout => 'my custom layout',
 );

METRICS

characters

The number of recognized characters in the text. This is similar in spirit to the Unix command wc -c. Only those characters which are encoded in the internal keyboard layout will be counted. That excludes accented characters, Unicode characters and control characters but includes newlines.

presses

The number of keys pressed when typing the text. The value of this metric is the value of the characters metric plus the number of times the Shift key was pressed.
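As a rough illustration of that relationship, the sketch below counts characters and Shift presses for a US QWERTY keyboard with Caps Lock handling disabled (as with caps => 0). The count_presses helper and its set of shifted characters are assumptions made for this example, not the module's code.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::TypingEffort: one press per
# character plus one Shift press per shifted character.
# Assumes every character in $text exists on a US QWERTY keyboard.
sub count_presses {
    my ($text) = @_;
    my $characters = length $text;
    # tr/// with an empty replacement list only counts matching characters.
    my $shifts = ( $text =~ tr/A-Z~!@#$%^&*()_+{}|:"<>?// );
    return $characters + $shifts;
}

my $presses = count_presses("The quick brown fox jumps over the lazy dog");
print "presses: $presses\n";    # prints "presses: 44"
```

The sentence has 43 characters and a single capital letter, which gives the 44 presses shown in the SYNOPSIS.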

distance

The distance, in millimeters, that the fingers travelled while typing the text. This distance includes movement required for the Shift and Enter keys, but does not include the vertical movement the finger makes as the key descends during a press. Perhaps a better name for this metric would be horizontal_distance, but that's too long ;-)

The model for determining this metric is very simplistic. It assumes that a finger moves from its home position to the destination key and then returns to the home position before moving on to the next key. Of course, this is not how people actually type, but the model should result in an upper bound for the amount of finger movement.
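The out-and-back model can be sketched in a few lines. The key coordinates below are hypothetical (a 19 mm key pitch with no row stagger, covering only some keys of the left index finger) and are not the values the module uses; travel is a helper invented for this example.

```perl
use strict;
use warnings;

# Hypothetical key centers in millimeters (x, y). Illustrative values only.
my %position = (
    f => [ 0,  0 ],
    g => [ 19, 0 ],
    r => [ 0,  19 ],
    t => [ 19, 19 ],
    v => [ 0,  -19 ],
);

# All keys above belong to the left index finger, which rests on "f".
my $home = 'f';

sub travel {
    my ($text) = @_;
    my $total = 0;
    for my $char ( split //, $text ) {
        my $key = $position{$char} or next;    # skip keys not in the table
        my $dx = $key->[0] - $position{$home}[0];
        my $dy = $key->[1] - $position{$home}[1];
        # Out to the key and back to the home position.
        $total += 2 * sqrt( $dx**2 + $dy**2 );
    }
    return $total;
}

printf "%.1f mm\n", travel("frg");    # prints "76.0 mm"
```

Pressing the home key itself costs nothing in this model, while "r" and "g" each cost a 19 mm trip out and a 19 mm trip back.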

energy

The number of Joules of energy required to type the text. This metric is the most inclusive in that it tries to incorporate the values of both the presses and the distance metrics into a single metric. However, this metric is also the least accurate at modeling the real world. The calculations are roughly based upon The Compendium of Physical Activities (or rather hearsay about its contents since I don't have a copy).

The physical characteristics of the keyboard are assumed to be roughly in line with ISO 9241-4:1998, which specifies standards for such things.

unknowns

This metric is only included in the output if the unknowns argument to effort was true.

The value is a histogram of the unrecognized characters encountered during processing. This includes any control characters, accented characters or Unicode characters. Generally, anything other than the letters, numbers and punctuation found on a standard U.S. keyboard will be counted here.

If all characters were recognized, the value will be an empty hashref. If any characters were unknown, the value will be a hashref something like this:

 unknowns => {
    presses => {
        'Å' => 2,
        'Ö' => 3,
    },
    distance => {
        'Å' => 2,
        'Ö' => 3,
    },
 }

The key indicates the metric for which information was missing. The value is a hash indicating the character and the number of times that character occurred. There will be no entries in the hash for the characters or energy metrics as these are incidental to the other two.
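One way to summarize such a histogram is sketched below. The $effort hashref is written out by hand with the values from the example above, so the snippet runs without calling effort; $skipped is a name chosen for this illustration.

```perl
use strict;
use warnings;
use utf8;
binmode STDOUT, ':encoding(UTF-8)';

# A result shaped like the example above (values copied from it).
my $effort = {
    unknowns => {
        presses  => { 'Å' => 2, 'Ö' => 3 },
        distance => { 'Å' => 2, 'Ö' => 3 },
    },
};

# The "|| {}" guards against the empty hashref returned when every
# character was recognized.
my $histogram = $effort->{unknowns}{presses} || {};

# Total number of characters that were left out of the presses metric.
my $skipped = 0;
$skipped += $_ for values %$histogram;

for my $char ( sort keys %$histogram ) {
    printf "%s: %d\n", $char, $histogram->{$char};
}
print "skipped: $skipped\n";    # prints "skipped: 5"
```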

This metric is only added to the result if the unknowns option was specified and true.

SEE ALSO

Tactus Keyboard article on the mechanics and standards of keyboard design - http://www.tactuskeyboard.com/keymech.htm

CONTRIBUTING

The source for Text::TypingEffort is maintained in a Git repository located at git://git.ndrix.com/Text-TypingEffort. To submit patches, you can do something like this:

 $ git clone git://git.ndrix.com/Text-TypingEffort
 $ cd Text-TypingEffort
 # hack, commit, hack, commit
 $ git format-patch -s origin
 $ git send-email --to michael@ndrix.org *.patch

See http://www.kernel.org/pub/software/scm/git/docs/everyday.html

AUTHOR

Michael Hendricks <michael@ndrix.org>

Thanks to Ricardo Signes for a patch for the layout and register_layout subroutines.

BUGS/TODO

Please submit suggestions and report bugs to the CPAN Bug Tracker at http://rt.cpan.org/NoAuth/Bugs.html?Dist=Text-TypingEffort

COPYRIGHT AND LICENSE

Copyright (C) 2005-2009 by Michael Hendricks

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
