Webservice::InterMine::Cookbook::Recipe1 - The Bare Basics
# Get a list of the organisms in the database use Webservice::InterMine 'www.flymine.org/query'; my $query = Webservice::InterMine->new_query(class => "Organism"); # Print out all the genera and taxons of the organisms in the database $query->select('taxonId', 'genus'); $query->show(); # Also get the species associated with the organism, and order by genus $query->add_view('species'); $query->order_by('genus'); $query->show();
At its heart a query is just a description of what you want to get out of the database - and the first part of that is the view, which specifies the fields for each row of results you get back. In the InterMine webapp the output columns are often referred to as the "view". That terminology is supported here, as well as what may be more familiar SQL-ish syntax.
Every query should have a root class, or if you prefer, a table to select from. It is good practice to declare this root with the optional
class => "Gene" parameters. If these are not provided, the first root class found will be assumed to be the root class. The following are equivalent:
my $query = $service->new_query(class => "Gene"); $query->select("symbol", "primaryIdentifier", "length");
my $query = $service->new_query(); $query->select("Gene.symbol", "Gene.primaryIdentifier", "Gene.length");
Choose whatever seems best to you.
The view is a list of strings which represent 'paths' in Webservice::InterMine terminology.
Organism.name is such a path, which says you want to know the name of each organism. This kind of notation may be familiar to you as object/property notation in many computing languages, or as the table/column notation of SQL. In InterMine, the properties of an object/columns of a table, may themselves be references to other objects/other tables, and so have properties/columns of their own. Joins are implicit in this system, and you can freely add as many further columns as you wish:
Note that here the "chromosomes" property is a reference to not just one object/record, but potentially many chromosomes. If these are selected for output, there will be as many rows in the output as there are chromosomes to display.
This system of objects/tables, properties/columns, and joined references/nested/objects is encapsulated in InterMine terminology which refers to:
=item: Classes (tables)
=item: Fields (columns)
=item: Attributes: (fields of a class which hold data)
=item: References: (fields of a class which refer to objects)
=item: Collections: (fields which may refer to more than one object)
The output columns must eventually end in a data field to display (an attribute).
The view list can be specified in a couple of ways, either as a list of paths, as in:
or as a space or comma delimited string, as in:
or any mixture of the two.
add_view can also be called multiple times, with each call appending the view(s) onto the list.
By default the results returned to you will be sorted by the values in the first column (the first item in the view list), and in an ascending direction. You can override this default by specifying any of the view columns to sort by. You do not have to specify a direction -
asc will be the default if none is provided. The following are all(1) valid ways of defining the sort order, and would all have the same result:
$query->add_sort_order('Organism.taxonId' => 'asc');
$query->add_sort_order( path => 'Organism.taxonId', direction => 'asc', );
In addition there is a more SQL-like
order_by, which is slightly different, in that instead of adding elements to the sort order, it replaces what is currently in there. The following are equivalent:
# Sort first by name (asc), and then by symbol (desc) $query = $service->new_query(class => 'Gene'); $query->add_view('symbol', 'name'); $query->add_sort_order('name') $query->add_sort_order('symbol', 'desc'); # Note that it is possible to chain the SQL-ish method calls $query = $service->new_query(class => 'Gene'); $query->select('symbol', 'name')->order_by(name => 'asc', symbol => 'desc');
order_by supports multiple orders being given at once, but in which case the directions should be specified. It does not however support the names argument syntax.
The two valid directions are 'asc' or 'desc' (case is irrelevant), and you must specify the view before you select the sort order.
Once we have a defined view, we have a valid query, which can be run and will return results. Without constraints these queries will just return every item of the specified type from the database, so these queries will list all organisms that the www.flymine.org database has data on. Obviously it is more useful to be able to specify which items you want, and for this we need constraints, which we deal with in the next recipe.
A query is at heart just a view list with (optionally) some constraints on the items you want back. This recipe demonstrates the mimimum you need to set up a valid query.
1) Perl is famous for its philosophy of TIMTOWTDI (There is more than one way to do it) - and that is in evidence here as well: there are generally serveral ways to call the methods in the Perl API, and none is categorically right or wrong. In the above example with
sort_order, you can see a progression from terser to more declarative styles; it is perhaps good practice to use declarative styles if you want your code to be robust and readable, and thus more maintainable.
Alex Kalderimis <firstname.lastname@example.org>
Copyright 2004-2010 by Webservice::InterMine
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.