WWW::Patent::Page - get patent documents from WWW sources (e.g. complete US applications and grants from the USPTO) and place them into a WWW::Patent::Page::Response object. Not available: JP->Eng translations in HTML from the JPO. ESPACE_EP is not provided because it uses a captcha.
This document describes WWW::Patent::Page version 0.100.0 of February, 2007.
Please see the test suite in t/ for working examples. The following is not guaranteed to be working or up-to-date.
THE ONLY OFFICE CURRENTLY WORKING IS THE USPTO.
    $ perl -I. -MWWW::Patent::Page -e 'print $WWW::Patent::Page::VERSION,"\n"'
    0.02

    $ perl get_patent.pl US6123456 > US6123456.pdf &

    $ perl -wT get_JPO_patent_translation_to_english.pl "JPH09-123456A" > JPH09-123456A.zip &

(See examples/JPH09-123456A.zip for an HTML-formatted, machine-translated Japanese patent document.)

(Command line interfaces are included in examples/)

    http://www.yourdomain.com/www_get_patent_pdf.pl
    http://www.yourdomain.com/www_get_JPO_patent_translation_to_english.pl

(Web fetchers are included in examples/)
Typical usage in perl code:
    use WWW::Patent::Page;

    print $WWW::Patent::Page::VERSION, "\n";

    my $patent_browser = WWW::Patent::Page->new();    # new object

    my $document1 = $patent_browser->get_page('6,123,456');
        # defaults:
        #   country => 'US',
        #   format  => 'pdf',
        #   page    => undef,
        # and usual defaults of LWP::UserAgent (subclassed)

    my $document2 = $patent_browser->get_page(
        'US6123456',
        format => 'pdf',
        page   => 2,    # get only the second page
    );

    my $pages_known = $document2->get_parameter('pages');    # how many total pages known?
Intent: Use public sources to retrieve patent documents such as TIFF images of patent pages, HTML of patents, PDF, etc. The module can be extended to your office of interest by writing new submodules. This is an alpha release by a newcomer, meant to find out whether there is any interest.
See also SYNOPSIS above.

Standard process for building & installing modules:

    perl Build.PL
    ./Build
    ./Build test verbose=1
    ./Build install

or

    perl Makefile.PL
    make
    make test TEST_VERBOSE=1
    make install

or, on ActiveState or otherwise using nmake:

    perl Makefile.PL
    nmake
    nmake test TEST_VERBOSE=1
    nmake install
Examples of use:
    $patent_browser = WWW::Patent::Page->new(
        doc_id => 'US6,654,321',
        format => 'pdf',
        page   => undef,    # returns all pages in one pdf
        agent  => 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4b) Gecko/20030516 Mozilla Firebird/0.6',
    );

    $patent_response = $patent_browser->get_patent('US6,654,321(B2)issued_2_Okada');
Object oriented, and modelled on LWP.
Creates a new instance of the Page class, which subclasses LWP::UserAgent.
login to a server to use its services; obtain a token or session id or the like
country_known maps the known two letter acronyms to patenting entities, usually countries; country_known returns undef if the two letter acronym is not recognized.
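The lookup behaviour described above can be sketched with a plain hash. This is only an illustration: the three-entry table and the name country_known_sketch are assumptions made for the example, not the module's own data or code.

```perl
use strict;
use warnings;

# Hypothetical subset of a code-to-entity table; the module's real table is larger.
my %ENTITY_FOR = (
    US => 'United States of America',
    EP => 'European Patent Office',
    WO => 'World Intellectual Property Organization',
);

# Return the entity name for a two-letter code, or undef if it is not recognized.
sub country_known_sketch {
    my ($code) = @_;
    return undef unless defined $code;
    return $ENTITY_FOR{$code};
}

for my $code (qw(US EP XX)) {
    my $entity = country_known_sketch($code);
    print "$code: ", ( defined $entity ? $entity : '(not recognized)' ), "\n";
}
```

Callers would test the result with defined() rather than for truth, since the documented "not recognized" signal is undef.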
Takes a human readable patent/publication identifier and parses it into country/entity, kind, number, doc_type, ...
    CC[TY]##,###,###(K#)_Comments
    US_6,123,456_A1_-comments

CC : Two letter country/entity code, e.g. US, EP, WO

TY : Type of document; one or two letters only, from the choices below. In the US, Utility is the default type and takes no type code, e.g. US6123456.

    D  : Design, e.g. USD339,456
    PP : Plant, e.g. USPP8,901
    RE : Reissue, e.g. USRE35,312
    T  : Defensive Publication, e.g. UST109,201
    H  : Statutory Invention Registration (SIR), e.g. USH1,523

##,###,### : Document number (e.g. patent number or application number; only digits and optionally separators, no letters)

K# : The kind or version number, e.g. A1, B2, etc.; placed in parentheses; at least one letter and at most one number. Not always used in document fetching.

Comments : Retained but not used; a single string of word characters (\w = A-Za-z0-9_; no spaces, "-", commas, etc.)

Separators (comma, space, dash, underscore) may occur between entries, and at least one MUST occur before a comment (due to the difficulty of parsing the kind code, which might be one letter). Separators (the comma is handy) may occur within the number.
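As a rough illustration of the layout above, the following sketch parses an identifier with a single regular expression. It is not the module's parser: the name parse_doc_id_sketch, the returned hash keys, the decision to accept only commas inside the number, and the use of H as the type code for the USH example are all assumptions made for the example.

```perl
use strict;
use warnings;

my $SEP = qr/[,\s_-]/;    # separators: comma, space, underscore, dash

my $DOC_ID = qr/
    \A
    ([A-Z]{2})                            $SEP*   # CC: two-letter country or entity code
    (PP|RE|D|T|H)?                        $SEP*   # TY: optional type, absent for utility
    (\d+ (?: , \d+ )*)                    $SEP*   # number, commas allowed inside
    (?: \( ([A-Z]\d?) \) | ([A-Z]\d?) )?          # kind, with or without parentheses
    (?: $SEP+ (\w+) )?                            # comment, needs a separator before it
    \z
/x;

# Return a hash reference of parsed fields, or nothing if the identifier does not match.
sub parse_doc_id_sketch {
    my ($id) = @_;
    return unless defined $id && $id =~ $DOC_ID;
    my ( $cc, $type, $number, $kind_paren, $kind_bare, $comment ) = ( $1, $2, $3, $4, $5, $6 );
    ( my $digits = $number ) =~ tr/0-9//cd;    # drop separators from the number
    return {
        country => $cc,
        type    => $type,                      # undef means the default (utility)
        number  => $digits,
        kind    => $kind_paren // $kind_bare,
        comment => $comment,
    };
}

for my $id ( 'US_6,123,456_A1_-comments', 'USD339,456', 'USRE35,312' ) {
    my $parsed = parse_doc_id_sketch($id) or next;
    print join( ' ', map { "$_=" . ( $parsed->{$_} // '' ) } qw(country type number kind comment) ), "\n";
}
```

Because a lone capital letter after the number is tried as a kind code first, a comment is only recognized when a separator precedes it, which mirrors the rule stated above.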
As of version 0.1, the parsed result used at the office of choice is placed in $self->patent->doc_id_standardized
A convenience value of $self->patent->doc_id_commified is provided.
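The exact content of doc_id_commified is not spelled out here. Assuming it means the document number with its digits grouped in threes, the grouping step could look like this sketch (commify_sketch is a name invented for the example):

```perl
use strict;
use warnings;

# Insert a comma between every group of three digits, counting from the right.
sub commify_sketch {
    my ($digits) = @_;
    my $text = reverse $digits;              # work from the least significant digit
    $text =~ s/(\d{3})(?=\d)/$1,/g;          # comma after each full group that has more digits
    return scalar reverse $text;
}

print commify_sketch($_), "\n" for qw(6123456 339456 123);
```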
In recognizing the values such as CC country, the priority is:
    $self->patent->doc_id as supplied;
    if absent: $self->patent->country;
    if absent: $WWW::Patent::Page::default_country
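This fallback order maps naturally onto Perl's defined-or operator. The sketch below assumes that "absent" means undef and uses a local $default_country as a stand-in for $WWW::Patent::Page::default_country; resolve_country_sketch is a hypothetical name.

```perl
use strict;
use warnings;

our $default_country = 'US';    # stand-in for $WWW::Patent::Page::default_country

# Pick the country parsed from the doc_id, else the patent's country, else the default.
sub resolve_country_sketch {
    my ( $from_doc_id, $from_patent ) = @_;
    return $from_doc_id // $from_patent // $default_country;
}

print resolve_country_sketch( 'EP',  'JP' ), "\n";    # doc_id wins
print resolve_country_sketch( undef, 'JP' ), "\n";    # patent country is next
print resolve_country_sketch( undef, undef ), "\n";   # falls back to the default
```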
Method to use the modules specific to offices like the USPTO, with methods for each document/page format, etc., and LWP::UserAgent to grab the appropriate URLs and, if necessary, build the response content or produce error values.
Method to override the LWP::UserAgent::request that gets a URL. This calls LWP::UserAgent::request itself, but around it adds things like a retry (and possibly debugging, like throwing pages to a browser for display).
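The wrap-and-retry idea can be sketched without touching the network. In this illustration Stub::UserAgent is a hypothetical stand-in for LWP::UserAgent, responses are plain hashes rather than HTTP::Response objects, and the limit of three attempts is an arbitrary assumption; only the shape (override request, call the parent inside a loop) is the point.

```perl
use strict;
use warnings;

package Stub::UserAgent;    # hypothetical stand-in for LWP::UserAgent

sub new {
    my ( $class, %args ) = @_;
    return bless { failures_left => $args{failures} // 0, calls => 0 }, $class;
}

# Fail a configured number of times, then succeed.
sub request {
    my ( $self, $url ) = @_;
    $self->{calls}++;
    return { is_success => 0, message => "temporary failure for $url" }
        if $self->{failures_left}-- > 0;
    return { is_success => 1, content => "content of $url" };
}

package Retrying::UserAgent;    # hypothetical subclass adding the retry
our @ISA = ('Stub::UserAgent');

# Call the parent's request up to three times, stopping at the first success.
sub request {
    my ( $self, @args ) = @_;
    my $max_attempts = 3;
    my $response;
    for my $attempt ( 1 .. $max_attempts ) {
        $response = $self->SUPER::request(@args);
        last if $response->{is_success};
    }
    return $response;
}

package main;

my $agent  = Retrying::UserAgent->new( failures => 2 );
my $result = $agent->request('http://example.invalid/patent');
print "success after $agent->{calls} call(s)\n" if $result->{is_success};
```

A real override would likely also pause between attempts and retry only on errors that are plausibly transient.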
method to provide a summary or pointers to the terms and conditions of use of the publicly available databases
internal private method to access helper modules in WWW::Patent::Page
private method to assign default agent
private method to load a big hash and allow it to be folded during code development.
The accepted tactic is to set $self->{'is_success'} or $self->{'patent'}->{'is_success'} to false and add a message to $self->{'message'} or $self->{'patent'}->{'message'}
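A minimal sketch of that convention, using a plain hash reference in place of the real object; mark_failure_sketch is a hypothetical helper, and joining multiple messages with "; " is an assumption about what "add a message" means.

```perl
use strict;
use warnings;

# Flag a failure on the given hash-based object and append an explanatory message.
sub mark_failure_sketch {
    my ( $target, $text ) = @_;
    $target->{is_success} = 0;
    my $existing = $target->{message};
    $target->{message} =
        ( defined $existing && length $existing ) ? "$existing; $text" : $text;
    return $target;
}

my $response = { is_success => 1, message => '' };
mark_failure_sketch( $response, 'document not found' );
mark_failure_sketch( $response, 'gave up after retry' );

print "failed: $response->{message}\n" unless $response->{is_success};
```

Callers then check is_success before using the content and report message when it is false.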
WWW::Patent::Page requires no configuration files or environment variables of its own.
Through LWP, it makes use of LWP environment variables such as HTTP_PROXY.
LWP::UserAgent HTTP::Response
None reported.
Code contributions, suggestions, and critiques are welcome.
Error handling is undeveloped.
By definition, a non-trivial program contains bugs.
For United States patents (US) via the USPTO, the 'kind' is ignored in method provide_doc.
Wanda B. Anon Wanda.B.Anon@gmail.com
Copyright (c) 2008, Wanda B. Anon wanda.b.anon@GMAIL.com . All rights reserved.
This program is free software; you can redistribute it and/or modify it under the Artistic License version 2.0 or above ( http://www.perlfoundation.org/artistic_license_2_0 ) .
Hermann Schier, Lokkju, Andy Lester, the authors of Finance::Quote, Erik Oliver for patentmailer, Howard P. Katseff of AT&T Laboratories for wsp.pl, version 2, a proxy that speaks LWP and understands proxies, and of course Larry and Randal and the gang.
BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.