NAME

URI::Collection - Input and output link collections in different formats

SYNOPSIS

  use URI::Collection;

  $c = URI::Collection->new;
  $c = URI::Collection->new(links => $bookmark_file);
  $c = URI::Collection->new(links => $favorite_directory);
  $c = URI::Collection->new(
      links => [ $bookmark_file, $favorite_directory ],
  );

  $links = $c->fetch_items(
      category => $regexp_1,
      title    => $regexp_2,
      url      => $regexp_3,
  );

  use Data::Dumper;
  if ($items = $c->is_item($regexp)) {
      print Dumper($items);
  }

  $bookmarks = $c->as_bookmark_file;
  $c->as_bookmark_file(save_as => $filename);

  $favorites = $c->as_favorite_directory;
  $c->as_favorite_directory(save_as => $directory);

DESCRIPTION

An object of class URI::Collection represents the links and categories in parsed Netscape style bookmark files and Windows "Favorites" directories, with output in either style.

METHODS

new

  $c = URI::Collection->new(links => $bookmark_file);
  $c = URI::Collection->new(links => $favorite_directory);
  $c = URI::Collection->new(
      links => [ $bookmark_file, $favorite_directory ],
  );

Here links may be a Netscape bookmark file name, a Windows favorite directory path, or an arrayref containing any number of either.

Return a new URI::Collection object after processing the specified Netscape bookmark files and specified Windows .url files.

Note about bookmarks on Windows: On Windows OSes, bookmarks are saved as files (one file per bookmark) with extension .url. A .url file is a plain text file with the same structure as Windows .ini files, and may be processed with the Config::IniFiles module.
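To illustrate the .ini-like structure mentioned above, here is a minimal sketch that reads the text of a .url file with core Perl only. The sample text and the parse_url_text() helper are assumptions made for this example; they are not part of URI::Collection, which relies on Config::IniFiles for this job.

```perl
use strict;
use warnings;

# Hypothetical sample: the text of a minimal Windows .url file.
my $url_text = "[InternetShortcut]\nURL=http://www.example.com/\nIconIndex=0\n";

# parse_url_text() is an illustrative helper, not part of URI::Collection.
# It returns a hashref shaped like { section => { key => value } }.
sub parse_url_text {
    my ($text) = @_;
    my (%ini, $section);
    for my $line (split /\r?\n/, $text) {
        next if $line =~ /^\s*(?:[;#].*)?$/;        # skip blank lines and comments
        if ($line =~ /^\s*\[([^\]]+)\]\s*$/) {      # [Section] header
            $section = $1;
        }
        elsif (defined $section && $line =~ /^\s*([^=]+?)\s*=\s*(.*?)\s*$/) {
            $ini{$section}{$1} = $2;                # key=value pair
        }
    }
    return \%ini;
}

my $ini = parse_url_text($url_text);
print $ini->{InternetShortcut}{URL}, "\n";
```

For real files, Config::IniFiles is probably the safer choice, since it handles details that this sketch ignores.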

Note about bookmarks on Netscape: In the Netscape browser (and Mozilla too), bookmarks are saved as links in an HTML page that is maintained by the browser.

If no arguments are passed, this method returns an empty URI::Collection object.

This method merges all of the given link stores into one collection. The internal data structure it builds has the same shape regardless of the kind of argument(s) specified.

as_bookmark_file

  $bookmarks = $c->as_bookmark_file;
  $c->as_bookmark_file(save_as => $filename);

Without an argument, this method returns the current bookmarks as a string containing the contents of a Netscape style bookmark file.

With an argument, save the bookmarks to disk as a Netscape style bookmark file called save_as.

Note that this method lets you convert Windows style .url bookmarks to a Netscape style bookmark file.

as_favorite_directory

  $favorites = $c->as_favorite_directory;
  $c->as_favorite_directory(save_as => $directory);

Without an argument, this method returns a M$ Windows "Favorites" tree as a hash reference, where the keys are the subdirectories (categories) and the values are array references whose elements are strings holding the contents of M$ Windows *.url files.

Note: In Netscape bookmark files you may group related bookmarks inside what is called a 'category'. Netscape categories may be nested inside other categories. With Windows Favorites, categories are groups of related .url files inside directories. For the purposes of this documentation and module, we refer to both collectively as categories.

This tree is one dimensional. That is, nested categories are represented (as hashref keys) by "slash separated paths", regardless of whether they come from a Netscape category or a Windows Favorites subdirectory. An example will elucidate:

  { 'foo'     => [ $link1, $link2 ],
    'foo/bar' => [ $link3, $link4 ],
    'baz'     => [ $link5 ],
    'baz/x/y' => [ $link6, $link7 ] }

Here $link1, $link2, and so on are strings, each holding the content of a Windows .url file, regardless of whether the link came from a Windows .url file or a Netscape bookmark.
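As a rough sketch of how such a flat tree might be consumed, the following walks a hashref in the documented shape and recovers the nesting from the slash separated keys. The sample data and the %summary hash are assumptions made for illustration; short placeholder strings stand in for the full .url file contents.

```perl
use strict;
use warnings;

# Hypothetical data in the documented shape. Each string stands in
# for the complete text of a Windows .url file.
my $favorites = {
    'foo'     => [ 'link1', 'link2' ],
    'foo/bar' => [ 'link3', 'link4' ],
    'baz'     => [ 'link5' ],
    'baz/x/y' => [ 'link6', 'link7' ],
};

# Record the nesting depth, last path component and link count
# for every category key.
my %summary;
for my $category (sort keys %$favorites) {
    my @parts = split m{/}, $category;
    $summary{$category} = {
        depth => scalar @parts,
        leaf  => $parts[-1],
        count => scalar @{ $favorites->{$category} },
    };
    printf "%-8s depth=%d links=%d\n",
        $category, $summary{$category}{depth}, $summary{$category}{count};
}
```

Note that a key such as 'baz/x/y' can appear without 'baz/x' being present, so code that rebuilds a directory tree from these keys should not assume that every intermediate level has its own entry.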

For documentation about the format of a Windows .url file see http://www.cyanwerks.com/file-format-url.html.

With an argument, this method creates a Windows-like directory hierarchy and fills it with Windows style .url bookmark files. The value of save_as is assumed to be the root path of the directory tree to be created.

Note that this function lets you convert a Netscape style bookmark file to a Windows style .url file directory tree.

fetch_items

  $items = $c->fetch_items(
      category => $regexp_1,
      title    => $regexp_2,
      url      => $regexp_3,
  );

Returns links that have titles, urls or categories that match the given regular expressions.

Returned value is a hashref with this format:

  {
    'name/of/category' => { title_of_item => url_of_item }, ...
  }

Note that if a category argument is supplied, only links under matching categories will be found. If no category argument is provided, any link with a matching title or url will be returned.

If no arguments are provided, all links are returned.
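The returned hashref can be flattened into rows for reporting. This is only a sketch over hand-written sample data in the documented shape; the categories, titles and URLs below are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical result in the documented shape:
# { 'name/of/category' => { title_of_item => url_of_item } }
my $items = {
    'perl'      => { 'CPAN'    => 'http://www.cpan.org/' },
    'perl/docs' => { 'perldoc' => 'http://perldoc.perl.org/' },
};

# Flatten into [ category, title, url ] rows, sorted for stable output.
my @rows;
for my $category (sort keys %$items) {
    my $links = $items->{$category};
    for my $title (sort keys %$links) {
        push @rows, [ $category, $title, $links->{$title} ];
    }
}

printf "%s | %s | %s\n", @$_ for @rows;
```

Because titles are hash keys in this format, a category can hold only one URL per distinct title.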

is_item

  $items = $c->is_item($regexp);

Return the items whose titles or urls match the given regular expression.

Note that this method is just fetch_items with no category argument and the same pattern for both title and url.

fetch

  $data_struct = $c->fetch(
      title         => $regexp_1,
      url           => $regexp_2,
      category      => $regexp_3,
      add_date      => $regexp_4,
      last_modified => $regexp_5,
      last_visit    => $regexp_6,
      iconfile      => $regexp_7,
      iconindex     => $regexp_8,
      description   => $regexp_9,
      alias_id      => $regexp_10,
      baseurl       => $regexp_11,
      modified      => $regexp_12,
  );

Allows you to select items that match one or more regular expressions.

Note that if a category argument is supplied, only links under matching categories will be found. If no category argument is provided, any item matching at least one regexp will be returned.

All arguments are optional. If no arguments are specified, all link items are returned (i.e. no selection is done).

The data structure that is returned has this format:

  [
    {
      add_date      => 3,
      alias_id      => 9,
      baseurl       => 10,
      category      => 2,
      description   => 8,
      iconfile      => 6,
      iconindex     => 7,
      last_modified => 4,
      last_visit    => 5,
      modified      => 11,
      title         => 0,
      url           => 1,
    },
    {
      'name/of/category' =>
      [
        [ ... ],   # <-- (A)
        ...
      ],
      ...
    }
  ]

Each A element is an arrayref whose fields are ordered as specified by the first hashref. For example, to get the 'iconfile' of the 1st A element of category 'category_1', you have to write:

  $data_struct->[1]->{'category_1'}->[0]->[ $data_struct->[0]->{iconfile} ]

 XXX Which is some heinous syntax. -gb

Another example ('category_1', 2nd A element, 'iconfile'):

  $data_struct->[1]->{'category_1'}->[1]->[ $data_struct->[0]->{iconfile} ]

Please don't assume that $data_struct->[0]->{iconfile} is 6, because this may change in future releases.

Also, take into account that 'name/of/category' is '.' for root category.
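One way to tame the lookups is a small accessor that resolves field names through the index map, so no field position is ever hard-coded. This is a sketch under stated assumptions: field() is a hypothetical helper that is not provided by URI::Collection, and the sample records are invented to match the documented shape.

```perl
use strict;
use warnings;

# Hypothetical data in the documented shape: an index map followed by
# a hash of categories, each holding a list of record arrayrefs.
my $data_struct = [
    {
        title => 0, url => 1, category => 2, add_date => 3,
        last_modified => 4, last_visit => 5, iconfile => 6,
        iconindex => 7, description => 8, alias_id => 9,
        baseurl => 10, modified => 11,
    },
    {
        'category_1' => [
            [ 'Example', 'http://www.example.com/', 'category_1',
              undef, undef, undef, 'example.ico' ],
            [ 'Perl', 'http://www.perl.org/', 'category_1',
              undef, undef, undef, 'perl.ico' ],
        ],
    },
];

# field() is an illustrative helper, not a URI::Collection method.
# It looks up the field position by name instead of assuming it.
sub field {
    my ($data, $category, $n, $name) = @_;
    my $index = $data->[0]{$name};
    die "Unknown field '$name'\n" unless defined $index;
    return $data->[1]{$category}[$n][$index];
}

print field($data_struct, 'category_1', 0, 'iconfile'), "\n";
print field($data_struct, 'category_1', 1, 'url'), "\n";
```

Since the helper reads positions from $data_struct->[0] on every call, it should keep working if the field order changes in a later release.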

set

  $c->set( $data_struct );

Sets the internal data structure of a URI::Collection object ($c).

DEPENDENCIES

Carp

Cwd

File::Spec

File::Find

File::Path

IO::String

Config::IniFiles

Netscape::Bookmarks::Link

TO DO

Ignore redundant links.

Optionally return the M$ Favorites directory structure (as a variable) instead of writing it to disk.

Allow input/output of file and directory handles.

Allow slicing of the category-links structure.

Allow link munging to happen under a given category or categories only.

Check if links are active.

Update link titles and URLs if changed or moved.

Mirror links?

Handle other bookmark formats (including some type of generic XML), and "raw" (CSV) lists of links, to justify such a generic package name. This includes different platform flavors of every browser.

Move the Favorites input/output functionality to a separate module like URI::Collection::Favorites::IE::Windows and URI::Collection::Favorites::IE::Mac, or some such. Do the same with the above mentioned "platform flavors", such as Opera and Mosaic "Hotlists", and OmniWeb bookmarks, etc.

SEE ALSO

http://www.cyanwerks.com/file-format-url.html

There are an enormous number of web-based bookmark managers out there (see http://useful.webwizards.net/wbbm.htm), which I don't care about at all. What I do care about are multi-format link converters. Here are a few that I found:

Online manager: http://www.linkagogo.com/

CDML Universal Bookmark Manager (for M$ Windows only): http://www.cdml.com/products/ubm.asp

OperaHotlist2HTML: http://nelson.oit.unc.edu/~alanh/operahotlist2html/

bk2site: http://bk2site.sourceforge.net/

Windows favorites convertor: http://www.moodfarm.demon.co.uk/download/fav2html.pl

bookmarker: http://www.renaghan.com/pcr/bookmarker.html

Columbine Bookmark Merge: http://home.earthlink.net/~garycramblitt/

XBEL Bookmarks Editor: http://pyxml.sourceforge.net/topics/xbel/

And here are similar perl modules:

URI::Bookmarks

BabelObjects::Component::Directory::Bookmark

WWW::Bookmark::Crawler

Apache::XBEL

THANK YOU

Thank you to #perl for answering my random questions about this. :)

Thank you to Enrique for working this into something more useful.

AUTHORS

Gene Boggs <gene@cpan.org>

Enrique Castilla <ecastillacontreras@yahoo.es>

COPYRIGHT AND LICENSE

Copyright 2003-2005 by Gene Boggs

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.