NAME

App::JobLog::TimeGrammar - parse natural (English) language time expressions

VERSION

version 1.032

SYNOPSIS

  #!/usr/bin/perl
  
  use Modern::Perl;
  use DateTime;
  use App::JobLog::Time qw(tz);
  use App::JobLog::TimeGrammar qw(parse);
  
  # for demonstration purposes we modify "today"
  $App::JobLog::Time::today =
    DateTime->new( year => 2011, month => 2, day => 17, time_zone => tz );

  for my $phrase ( 'Monday until the end of the week', 'Tuesday at 9:00 p.m.' ) {
      my ( $start, $end, $endpoints ) = parse($phrase);
      say $phrase;
      say "$start - $end; both endpoints specified? "
        . ( $endpoints ? 'yes' : 'no' );
  }

produces

  Monday until the end of the week
  2011-02-14T00:00:00 - 2011-02-20T23:59:59; both endpoints specified? yes
  Tuesday at 9:00 p.m.
  2011-02-08T21:00:00 - 2011-02-15T23:59:59; both endpoints specified? no

DESCRIPTION

App::JobLog::TimeGrammar converts natural language time expressions into pairs of DateTime objects representing intervals. This requires disambiguating terms such as 'yesterday', whose interpretation varies from day to day, and 'Friday', whose interpretation must be fixed by some frame of reference. The heuristic is to look first for a fixed date: either a fully specified date such as 2011/2/17 or one fixed relative to the current moment such as 'now'. If the expression contains such a date, it determines the context for the other date, if there is one. Otherwise the closest appropriate pair of dates immediately before the current moment is assumed to be intended.

Given a pair consisting of a fixed and an ambiguous date, we assume the ambiguous date is the one that is ordered correctly relative to the fixed date and minimizes the interval between them.
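
The ordering-and-minimizing rule can be illustrated with plain day arithmetic. The following is a minimal sketch, not the module's own code: nearest_weekday is a hypothetical helper, it handles only weekday names, and it assumes that a fixed date falling on the requested weekday resolves to that same day.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);
use POSIX qw(strftime);

# Hypothetical helper, not part of App::JobLog: resolve an ambiguous
# weekday against a fixed date. $direction is -1 when the weekday must
# precede the fixed date and +1 when it must follow it.
sub nearest_weekday {
    my ( $fixed_ymd, $weekday, $direction ) = @_;
    my %index = (
        sun => 0, mon => 1, tue => 2, wed => 3,
        thu => 4, fri => 5, sat => 6,
    );
    my $wanted = $index{ lc( substr( $weekday, 0, 3 ) ) };
    die "unknown weekday: $weekday" unless defined $wanted;

    my ( $y, $m, $d ) = split /-/, $fixed_ymd;
    my $epoch = timegm( 0, 0, 12, $d, $m - 1, $y );    # noon UTC
    my $fixed_wday = ( gmtime $epoch )[6];

    # Smallest non-negative number of days in the required direction.
    my $delta =
        $direction < 0
      ? ( $fixed_wday - $wanted ) % 7
      : ( $wanted - $fixed_wday ) % 7;
    return strftime( '%Y-%m-%d',
        gmtime( $epoch + $direction * $delta * 86_400 ) );
}

# 2011-02-17 is a Thursday.
print nearest_weekday( '2011-02-17', 'Monday', -1 ), "\n";    # 2011-02-14
print nearest_weekday( '2011-02-17', 'Monday', +1 ), "\n";    # 2011-02-21
```

With the fixed date as the end of the interval, 'Monday' resolves to the Monday just before it; with the fixed date as the start, it resolves to the Monday just after.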

If the time expression provides no time of day, such as 8:00, it is assumed that the first moment intended is the first second of the first day and the last moment is the last second of the second day. If no second date is provided the endpoint of the interval will be the last moment of the single date specified. If a larger time period such as week, month, or year is specified, e.g., 'last week', the first moment is the first second in the period and the last moment is the last second.
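
The default endpoints described above can be sketched with core modules only. The helpers day_bounds and month_bounds are hypothetical and work in UTC for simplicity; the module itself returns DateTime objects in the configured time zone.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);
use POSIX qw(strftime);

# Hypothetical helpers, not part of App::JobLog. All arithmetic is in UTC
# to keep the sketch independent of the local time zone.
sub iso {
    my ($epoch) = @_;
    return strftime( '%Y-%m-%dT%H:%M:%S', gmtime $epoch );
}

# A bare date covers its first second through its last second.
sub day_bounds {
    my ( $y, $m, $d ) = @_;
    my $start = timegm( 0, 0, 0, $d, $m - 1, $y );
    return ( iso($start), iso( $start + 86_400 - 1 ) );
}

# A month covers the first second of day 1 through the second before the
# next month begins.
sub month_bounds {
    my ( $y, $m ) = @_;
    my $start = timegm( 0, 0, 0, 1, $m - 1, $y );
    my ( $ny, $nm ) = $m == 12 ? ( $y + 1, 1 ) : ( $y, $m + 1 );
    my $next = timegm( 0, 0, 0, 1, $nm - 1, $ny );
    return ( iso($start), iso( $next - 1 ) );
}

print join( ' - ', day_bounds( 2011, 2, 17 ) ), "\n";
# 2011-02-17T00:00:00 - 2011-02-17T23:59:59
print join( ' - ', month_bounds( 2011, 2 ) ), "\n";
# 2011-02-01T00:00:00 - 2011-02-28T23:59:59
```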

If you wish to parse a single date rather than an interval, you can ignore the second date, but you should check the third value returned by parse, which indicates whether both endpoints were specified.

parse will croak if it cannot parse the expression given.

Time Grammar

The following is a semi-formal BNF grammar of time understood by App::JobLog::TimeGrammar. In this formalization s represents whitespace, d represents a digit, and \\n represents a back reference to the nth item in parentheses in the given rule. After the first three rules the rules are alphabetized to facilitate finding them.

              <expression> = s* ( <ever> | <span> ) s*
                    <ever> = "all" | "always" | "ever" | [ [ "the" s ] ( "entire" | "whole" ) s ] "log"
                    <span> = <date> [ <span_divider> <date> ]
 
                      <at> = "at" | "@"
                 <at_time> = [ ( s | s* <at> s* ) <time> ]
              <at_time_on> = [ <at> s ] <time> s "on" s
               <beginning> = "beg" [ "in" [ "ning" ] ]
                    <date> = <numeric> | <verbal>
               <day_first> = d{1,2} s <month>
                 <divider> = "-" | "/" | "."
                 <dm_full> = d{1,2} s <month> [ "," ] s d{4}
                     <dom> = d{1,2}
                    <full> = <at_time_on> <full_no_time> | <full_no_time> <at_time>
              <full_month> = "january" | "february" | "march" | "april" | "may" | "june" | "july" | "august" | "september" | "october" | "november" | "december" 
            <full_no_time> = <dm_full> | <md_full>
            <full_weekday> = "sunday" | "monday" | "tuesday" | "wednesday" | "thursday" | "friday" | "saturday"
                     <iso> = d{4} ( <divider> ) d{1,2} \\1 d{1,2}
                      <md> = d{1,2} <divider> d{1,2}
                 <md_full> = <month> s d{1,2} "," s d{4}
          <modifiable_day> = <at_time_on> <modifiable_day_no_time> | <modifiable_day_no_time> <at_time>
  <modifiable_day_no_time> = [ <modifier> s ] <weekday>
        <modifiable_month> = [ <month_modifier> s ] <month>
       <modifiable_period> = [ <period_modifier> s ] <period>
                <modifier> = "last" | "this" | "next"
                   <month> = <full_month> | <short_month> 
               <month_day> = <at_time_on> <month_day_no_time> | <month_day_no_time> <at_time>
       <month_day_no_time> = <month_first> | <day_first>
             <month_first> = <month> s d{1,2}
          <month_modifier> = <modifier> | <termini> [ s "of" ]
                      <my> = <month> [","] s <year>
            <named_period> = <modifiable_day> | <modifiable_month> | <modifiable_period> 
                     <now> = "now"
                 <numeric> = <year> | <ym> | <at_time_on> <numeric_no_time> | <numeric_no_time> <at_time>
         <numeric_no_time> = <us> | <iso> | <md> | <dom>
                     <pay> = "pay" | "pp" | "pay" s* "period"
                  <period> = "week" | "month" | "year" | <pay>
         <period_modifier> = <modifier> | <termini> [ s "of" [ s "the" ] ] 
         <relative_period> = [ <at> s* ] <time> s <relative_period_no_time> | <relative_period_no_time> <at_time> | <now>
 <relative_period_no_time> = "yesterday" | "today" | "tomorrow"
             <short_month> = "jan" | "feb" | "mar" | "apr" | "may" | "jun" | "jul" | "aug" | "sep" | "oct" | "nov" | "dec"
           <short_weekday> = "sun" | "mon" | "tue" | "wed" | "thu" | "fri" | "sat" 
            <span_divider> = s* ( "-"+ | ( "through" | "thru" | "to" | "til" [ "l" ] | "until" ) ) s*
                 <termini> = [ "the" s ] ( <beginning> | "end" )
                    <time> = d{1,2} [ ":" d{2} [ ":" d{2} ] ] [ s* <time_suffix> ]
             <time_suffix> = ( "a" | "p" ) ( "m" | ".m." )
                      <us> = d{1,2} ( <divider> ) d{1,2} \\1 d{4}
                  <verbal> = <my> | <named_period> | <relative_period> | <month_day> | <full>  
                 <weekday> = <full_weekday> | <short_weekday>
                    <year> = d{4}
                      <ym> = <year> <divider> d{1,2}

In general App::JobLog::TimeGrammar will understand most time expressions you are likely to want to use.
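
To show how the back reference in rules such as <iso> and <us> works, here is a hedged transcription of those two rules into Perl regular expressions. These are not the module's own patterns: the classify function and the capture names are invented for the illustration, and no range checking is done on the month or day.

```perl
use strict;
use warnings;

# Sketch of the <divider>, <iso>, and <us> rules. The named capture "div"
# plays the role of the grammar's \\1: the second divider must be the same
# character as the first.
my $divider = qr{[-/.]};
my $iso     = qr/\A (?<year>\d{4}) (?<div>$divider) (?<month>\d{1,2})
                 \k<div> (?<day>\d{1,2}) \z/x;
my $us      = qr/\A (?<month>\d{1,2}) (?<div>$divider) (?<day>\d{1,2})
                 \k<div> (?<year>\d{4}) \z/x;

# Hypothetical helper: report which rule matched and the normalized date.
sub classify {
    my ($text) = @_;
    if ( $text =~ $iso ) {
        return sprintf 'iso %04d-%02d-%02d', @+{qw(year month day)};
    }
    if ( $text =~ $us ) {
        return sprintf 'us %04d-%02d-%02d', @+{qw(year month day)};
    }
    return 'no match';
}

print classify('2011/2/17'), "\n";    # iso 2011-02-17
print classify('2/17/2011'), "\n";    # us 2011-02-17
print classify('2011/2-17'), "\n";    # no match (mixed dividers)
```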

METHODS

daytime

Parses a time expression such as "11:00" or "8:15:40 pm". Returns a map from hour, minute, second, and suffix to the appropriate value, where 'x' represents an ambiguous suffix.
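
The shape of the returned map can be sketched from the <time> and <time_suffix> grammar rules. The function below is a hypothetical stand-in, not the module's implementation; in particular, defaulting a missing minute or second to 0 and reporting the suffix as 'a' or 'p' are assumptions.

```perl
use strict;
use warnings;

# Hypothetical stand-in for daytime, written from the <time> and
# <time_suffix> grammar rules; the real function may differ in detail.
sub sketch_daytime {
    my ($text) = @_;
    $text =~ m{
        \A \s* (\d{1,2}) (?: : (\d{2}) (?: : (\d{2}) )? )?
        (?: \s* ([ap]) (?: m | \.m\. ) )? \s* \z
    }xi or return;
    return (
        hour   => $1 + 0,
        minute => ( $2 // 0 ) + 0,
        second => ( $3 // 0 ) + 0,
        suffix => defined $4 ? lc($4) : 'x',    # 'x' marks a missing suffix
    );
}

my %time = sketch_daytime('8:15:40 pm');
print join( ', ', map { "$_=$time{$_}" } qw(hour minute second suffix) ), "\n";
# hour=8, minute=15, second=40, suffix=p
```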

parse

This function (it isn't actually a method) is the essential function of this module. It takes a time expression and returns a pair of DateTime objects representing the endpoints of the corresponding interval, followed by a flag indicating whether the expression specified both endpoints.

If you are parsing an expression defining a point rather than an interval, you can safely ignore the second endpoint, but you should check the third return value to make sure the expression didn't provide a second endpoint.

This code croaks when it cannot parse the expression, so exception handling is recommended.
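
One way to handle the exception is to wrap the call in eval. The sketch below is illustrative only: try_parse and the stub parser are hypothetical, and the stub stands in for parse so that the example runs without App::JobLog installed. With the module loaded, \&App::JobLog::TimeGrammar::parse could be passed instead.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical wrapper for any parser with the same contract as parse: it
# returns ( $start, $end, $both_endpoints ) or croaks.
sub try_parse {
    my ( $parser, $phrase ) = @_;
    my @result;
    my $ok = eval { @result = $parser->($phrase); 1 };
    return { ok => 0, error => $@ || 'unknown error' } unless $ok;
    my ( $start, $end, $both ) = @result;
    return {
        ok       => 1,
        start    => $start,
        end      => $end,
        interval => $both ? 1 : 0,    # false: the expression named one date
    };
}

# Stand-in parser (an assumption for this sketch, not the real parse).
my $stub = sub {
    my ($phrase) = @_;
    croak "cannot parse '$phrase'" unless $phrase eq 'today';
    return ( '2011-02-17T00:00:00', '2011-02-17T23:59:59', 0 );
};

for my $phrase ( 'today', 'the day after never' ) {
    my $r = try_parse( $stub, $phrase );
    print $r->{ok}
      ? "$phrase: starts $r->{start}\n"
      : "$phrase: could not be parsed\n";
}
```

Checking the interval flag before using the end value covers the single-date case described above.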

SEE ALSO

App::JobLog::Command::parse

AUTHOR

David F. Houghton <dfhoughton@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2011 by David F. Houghton.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.