NAME

Elastic::Model::Role::Doc - The role applied to your Doc classes

VERSION

version 0.28

SYNOPSIS

Creating a doc

    $doc = $domain->new_doc(
        user => {
            id      => 123,                 # auto-generated if not specified
            email   => 'clint@domain.com',
            name    => 'Clint'
        }
    );

    $doc->save;
    $uid = $doc->uid;

Retrieving a doc

    $doc = $domain->get( user => 123 );
    $doc = $model->get_doc( uid => $uid );

Updating a doc

    $doc->name('John');

    print $doc->has_changed();              # 1
    print $doc->has_changed('name');        # 1
    print $doc->has_changed('email');       # 0
    dump $doc->old_values;                  # { name => 'Clint' }

    $doc->save;
    print $doc->has_changed();              # 0

Deleting a doc

    $doc->delete;
    print $doc->has_been_deleted;           # 1

DESCRIPTION

Elastic::Model::Role::Doc is applied to your "doc" classes (ie those classes that you want to be stored in Elasticsearch), when you include this line:

    use Elastic::Doc;

This document explains the changes that are made to your class by applying the Elastic::Model::Role::Doc role. Also see Elastic::Doc.

ATTRIBUTES

The following attributes are added to your class:

uid

The uid is the unique identifier for your doc in Elasticsearch. It contains an index, a type, an id and possibly a routing. This is what is required to identify your document uniquely in Elasticsearch.

The UID is created when you create your document, eg:

    $doc = $domain->new_doc(
        user    => {
            id      => 123,
            other   => 'foobar'
        }
    );

  • index : initially comes from the $domain->name - this is changed to the actual index name when you save your doc.

  • type : comes from the first parameter passed to new_doc() (user in this case).

  • id : is optional - if you don't provide it, then it will be auto-generated when you save it to Elasticsearch.

Note: the namespace_name/type/ID of a document must be unique. Elasticsearch can enforce uniqueness for a single index, but when your namespace contains multiple indices, it is up to you to ensure uniqueness. Either leave the ID blank, in which case Elasticsearch will generate a unique ID, or ensure that the way you generate IDs will not cause a collision.

type / id

    $type = $doc->type;
    $id   = $doc->id;

type and id are provided as convenience, read-only accessors which call the equivalent accessor on "uid".

You can define your own id() and type() methods, in which case they won't be imported, or you can import them under a different name, eg:

    package MyApp::User;
    use Elastic::Doc;

    with 'Elastic::Model::Role::Doc' => {
        -alias => {
            id   => 'doc_id',
            type => 'doc_type',
        }
    };

timestamp

    $timestamp = $doc->timestamp($timestamp);

This stores the last-modified time (as epoch seconds with milliseconds), which is set automatically when your doc is saved. The timestamp is indexed and can be used in queries.
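
A value in that form is not very readable when debugging. The following is a minimal sketch of turning fractional epoch seconds into an ISO 8601 UTC string. The helper name format_timestamp is hypothetical and is not part of Elastic::Model; the sample value is made up.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format fractional epoch seconds as an ISO 8601 UTC string with milliseconds.
sub format_timestamp {
    my ($ts) = @_;

    # Round to 3 decimals first, then split, to avoid float truncation errors.
    my ( $sec, $ms ) = split /\./, sprintf( '%.3f', $ts );
    return strftime( '%Y-%m-%dT%H:%M:%S', gmtime($sec) ) . ".${ms}Z";
}

print format_timestamp(1400000000.123), "\n";
```

With a real doc you would presumably pass in the value returned by $doc->timestamp.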

Private attributes

These private attributes are also added to your class, and are documented here so that you don't override them without knowing what you are doing:

_can_inflate

A boolean indicating whether the object has had its attribute values inflated already or not.

_source

The raw uninflated source value as loaded from Elasticsearch.

METHODS

save()

    $doc->save( %args );

Saves the $doc to Elasticsearch. If this is a new doc, and a doc with the same type and ID already exists in the same index, then Elasticsearch will throw an exception.

Also see Elastic::Model::Bulk for bulk indexing of multiple docs.

If the doc was previously loaded from Elasticsearch, then that doc will be updated. However, because Elasticsearch uses optimistic locking (ie the doc version number is incremented on every change), it is possible that another process has already updated the $doc while the current process has been working, in which case it will throw a conflict error.

For instance:

    ONE                         TWO
    --------------------------------------------------
                                get doc 1-v1
    get doc 1-v1
                                save doc 1-v2
    save doc 1-v2
     -> # conflict error
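
The sequence above can be reproduced with a small in-memory simulation. This is only a sketch of the version-check idea, not how Elastic::Model or Elasticsearch implement it: %store, get_doc() and save_doc() are hypothetical stand-ins that need no cluster.

```perl
use strict;
use warnings;

# In-memory stand-in for an index: id => { version => N, source => {...} }
my %store = ( 1 => { version => 1, source => { name => 'Clint' } } );

# Fetch a private copy of the doc together with the version it was read at.
sub get_doc {
    my ($id) = @_;
    my $stored = $store{$id} or die "missing doc $id\n";
    return {
        id      => $id,
        version => $stored->{version},
        source  => { %{ $stored->{source} } },
    };
}

# Save succeeds only if nobody has bumped the version since we read it.
sub save_doc {
    my ($doc) = @_;
    my $stored = $store{ $doc->{id} };
    die "conflict: expected v$doc->{version}, found v$stored->{version}\n"
        if $stored->{version} != $doc->{version};
    $stored->{source}  = { %{ $doc->{source} } };
    $stored->{version} = ++$doc->{version};
    return $doc;
}

my $two = get_doc(1);    # process TWO reads v1
my $one = get_doc(1);    # process ONE reads v1

$two->{source}{name} = 'John';
save_doc($two);          # TWO writes v2

$one->{source}{name} = 'Jane';
my $ok = eval { save_doc($one); 1 };    # ONE still holds v1
print $ok ? "saved\n" : "error: $@";
```

Process ONE's save fails because the stored version moved on after it read the doc, which is the situation the conflict error reports.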

on_conflict

If you don't care, and you just want to overwrite what is stored in Elasticsearch with the current values, then use "overwrite()" instead of "save()". If you DO care, then you can handle this situation gracefully, using the on_conflict parameter:

    $doc->save(
        on_conflict => sub {
            my ($original_doc,$new_doc) = @_;
            # resolve conflict

        }
    );

See "has_been_deleted()" for a fuller example of an "on_conflict" callback.

The doc will only be saved if it has changed. If you want to force saving on a doc that hasn't changed, then you can do:

    $doc->touch->save;

on_unique

If you have any unique attributes then you can catch unique-key conflicts with the on_unique handler.

    $doc->save(
        on_unique => sub {
            my ($doc,$conflicts) = @_;
            # do something
        }
    );

The keys of the $conflicts hashref are the names of the unique_keys that have conflicts, and its values are the values of those keys which already exist, and so cannot be overwritten.

See Elastic::Manual::Attributes::Unique for more.
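
As an illustration of the shape of $conflicts, here is a small sketch that turns such a hashref into readable messages. The helper describe_conflicts() is hypothetical, and the sample key and value are made up; in a real handler you would pass in the $conflicts argument that on_unique receives.

```perl
use strict;
use warnings;

# Turn the conflicts hashref into one message per clashing unique key.
sub describe_conflicts {
    my ($conflicts) = @_;
    return map { "$_ '$conflicts->{$_}' is already taken" }
        sort keys %$conflicts;
}

# Sample data in the documented shape: unique key name => existing value.
my %conflicts = ( email => 'clint@domain.com' );
print "$_\n" for describe_conflicts( \%conflicts );
```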

overwrite()

    $doc->overwrite( %args );

"overwrite()" is exactly the same as "save()" except it will overwrite any previous doc, regardless of whether another process has created or updated a doc with the same UID in the meantime.

delete()

    $doc->delete;

This will delete the current doc. If the doc has already been updated to a new version by another process, it will throw a conflict error. You can override this and delete the document anyway with:

    $doc->delete( version => 0 );

The $doc will be reblessed into the Elastic::Model::Deleted class, and any attempt to access its attributes will throw an error.

has_been_deleted()

    $bool = $doc->has_been_deleted();

As a rule, you shouldn't delete docs that are currently in use elsewhere in your application, otherwise you have to wrap all of your code in evals to ensure that you're not accessing a stale doc.

However, if you do need to delete current docs, then "has_been_deleted()" checks if the doc exists in Elasticsearch. For instance, you might have an "on_conflict" handler which looks like this:

    $doc->save(
        on_conflict => sub {
            my ($original, $new) = @_;

            return $original->overwrite
                if $new->has_been_deleted;

            # Re-apply our local changes on top of the latest version
            for my $attr ( keys %{ $original->old_values }) {
                $new->$attr( $original->$attr );
            }

            $new->save;
        }
    );

It is a much better approach to remove docs from the main flow of your application (eg, set a status attribute to "deleted") and then physically delete the docs only after some time has passed.
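
A rough sketch of the selection step in that approach follows, using plain hashrefs in place of real docs. The status and deleted_at fields, the purge_candidates() helper and the grace period are all assumptions made for illustration; none of them are provided by Elastic::Model.

```perl
use strict;
use warnings;

# Select docs that were soft-deleted at least $grace seconds before $now.
sub purge_candidates {
    my ( $docs, $now, $grace ) = @_;
    return grep {
        $_->{status} eq 'deleted' && $_->{deleted_at} <= $now - $grace
    } @$docs;
}

my @docs = (
    { id => 1, status => 'active' },
    { id => 2, status => 'deleted', deleted_at => 1_000 },
    { id => 3, status => 'deleted', deleted_at => 9_000 },
);

# With now = 10_000 and a grace period of 3_600 seconds,
# only docs marked deleted at or before 6_400 qualify.
my @purge = purge_candidates( \@docs, 10_000, 3_600 );
print "purge: ", join( ',', map { $_->{id} } @purge ), "\n";
```

In an application, the docs selected this way would be the ones to pass to "delete()", while everything else keeps treating a "deleted" status as hidden.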

touch()

    $doc = $doc->touch();

Updates the "timestamp" to the current time.

has_changed()

Has the value for any attribute changed?

    $bool = $doc->has_changed;

Has the value of attribute $attr_name changed?

    $bool = $doc->has_changed($attr_name);

Note: If you're going to check more than one attribute, it is better to get all the "old_values()" once and check whether each attribute name exists in the returned hash than to call has_changed() multiple times.

old_values()

    \%old_vals  = $doc->old_values();

Returns a hashref containing the original values of any attributes that have been changed. If an attribute wasn't set originally, but is now, it will be included in the hash with the value undef.
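
Because a newly set attribute is stored with the value undef, a check against this hash should use exists rather than defined. A minimal sketch, with a plain hashref standing in for the result of $doc->old_values (the changed_attrs() helper and the attribute names are hypothetical):

```perl
use strict;
use warnings;

# Return the subset of @attrs that appear as keys in the old-values hashref.
sub changed_attrs {
    my ( $old_values, @attrs ) = @_;
    return grep { exists $old_values->{$_} } @attrs;
}

# Stand-in for $doc->old_values: name was changed, nickname was newly set.
my $old_values = { name => 'Clint', nickname => undef };

my @changed = changed_attrs( $old_values, qw(name email nickname) );
print "changed: @changed\n";
```

This also covers the case of checking several attributes with a single call to old_values().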

terms_indexed_for_field()

    $terms = $doc->terms_indexed_for_field( $fieldname, $size );

This method is useful for debugging queries and analysis - it returns the actual terms (ie after analysis) that have been indexed for field $fieldname in the current doc. $size defaults to 20.

Private methods

These private methods are also added to your class, and are documented here so that you don't override them without knowing what you are doing:

_inflate_doc

Inflates the attribute values from the hashref stored in "_source".

_get_source / _set_source

The raw doc source from Elasticsearch.

AUTHOR

Clinton Gormley <drtech@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2014 by Clinton Gormley.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.