Simon Cozens > Plucene-SearchEngine-1.1 > Plucene::SearchEngine::Index::URL

NAME

Plucene::SearchEngine::Index::URL - File reader for web URLs

DESCRIPTION

This frontend module takes a URL, downloads its content, extracts its metadata, and passes the file on to a backend. The frontend registers the following Plucene fields:

mimetype

The MIME type of the data.

filename

The basename of the URL's filename.

id

The URL given.

modified

A Plucene date field representing the last-modified date of the file.

language

The ISO language identifier of the content.

encoding

The original character set (before conversion to UTF-8).
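
As a rough illustration of where values like these can come from, the following sketch derives the six fields from a URL and a hash of HTTP response headers. It is not this module's implementation: fields_from_response is a hypothetical helper, the header hash stands in for a real HTTP response, and taking the language from a Content-Language header is an assumption (the module may determine it some other way). The last-modified value is passed through as the raw header string rather than converted to a Plucene date field.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Plucene::SearchEngine::Index::URL.
# $headers is a plain hash of HTTP response headers with lower-cased keys.
sub fields_from_response {
    my ($url, $headers) = @_;

    # Split e.g. "text/html; charset=ISO-8859-1" into type and charset.
    my $content_type = $headers->{'content-type'} // '';
    my ($mimetype) = $content_type =~ m{^\s*([^;\s]+)};
    my ($encoding) = $content_type =~ m{charset\s*=\s*"?([^";\s]+)}i;

    # Drop the query string and fragment, then the scheme and host,
    # and keep whatever follows the last slash as the filename.
    (my $path = $url) =~ s/[?#].*//s;
    $path =~ s{^[a-z][a-z0-9+.-]*://[^/]*}{}i;
    my ($filename) = $path =~ m{([^/]*)\z};

    return {
        id       => $url,
        mimetype => $mimetype,
        filename => $filename,
        modified => $headers->{'last-modified'},      # raw header string
        language => $headers->{'content-language'},   # assumption, see above
        encoding => $encoding,
    };
}

# Example input; the URL and header values are made up for illustration.
my $fields = fields_from_response(
    'http://example.com/docs/index.html?lang=en',
    {
        'content-type'     => 'text/html; charset=ISO-8859-1',
        'last-modified'    => 'Wed, 21 Oct 2015 07:28:00 GMT',
        'content-language' => 'en',
    },
);
print "$_: $fields->{$_}\n" for sort keys %$fields;
```

A URL that ends in a slash yields an empty filename here, and a missing Content-Type header leaves mimetype and encoding undefined; a real frontend would need a fallback for both cases.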

METHODS

    Plucene::SearchEngine::Index::URL->examine($url);

This downloads the content at the given URL and examines it for the above metadata, before handing it to a backend.
