NAME
XPC - XML Procedure Call
SYNOPSIS
use XPC;
and then
my $xpc = XPC->new(<<END_XPC);
<?xml version='1.0' encoding='UTF-8'?>
<xpc>
<call procedure='localtime'/>
</xpc>
END_XPC
or
my $xpc = XPC->new();
$xpc->add_call('localtime');
or
my $xpc = XPC->new_call('localtime');
and then later
print XML_FILE $xpc->as_string();
DESCRIPTION
This class represents an XPC request or response. It uses XML::Parser to
parse XML passed to its constructor.
MOTIVATION
A Commentary on the XML-RPC Specification and Definition of XPC Version
0.2
Introduction
The following commentary is based upon the specification from the
UserLand web site. The version referenced for this commentary has a
notation on it that it was "Updated 10/16/99 DW" (see
http://www.xmlrpc.com/spec).
These comments are stylistic in nature, and it is well recognized by the
author that style in program and protocol design is very personal. This
commentary will, however, point out the rationale of the proposed
changes to the specification's design.
Procedure Call Structural Simplifications
The example in the "Request example" section looks like this:
<methodCall>
<methodName>examples.getStateName</methodName>
<params>
<param>
<value><i4>41</i4></value>
</param>
</params>
</methodCall>
We note by looking at the remainder of the specification that there are
only two top-level elements allowed in XML-RPC: "methodCall" and
"methodResponse". Since methods are *the* subject of RPC, and since all
top-level elements in the design are about methods, there is no need to
have the redundant qualifier "method" in the names of these elements.
Thus, the example would be modified to look like this:
<call>
<methodName>examples.getStateName</methodName>
<params>
<param>
<value><i4>41</i4></value>
</param>
</params>
</call>
Now, the content of the "methodName" element is constrained to be very
simple text (from the "Payload format" section, which says "...
identifier characters, upper and lower-case A-Z, the numeric characters,
0-9, underscore, dot, colon and slash"). It is also mandatory. This is
precisely the reason XML includes the ability to add attributes to
elements (it is technically redundant, but very convenient). So, we
really should turn this example into:
<call method='examples.getStateName'>
<params>
<param>
<value><i4>41</i4></value>
</param>
</params>
</call>
Once the "methodName" element has been removed from the design, the
"params" element becomes superfluous, since its only purpose was to
group the parameters and separate them from the method name. Now, the
"call" element *is* the element that groups the parameters, leaving us
with:
<call method='examples.getStateName'>
<param>
<value><i4>41</i4></value>
</param>
</call>
Header Nomenclature
One final comment on terminology: RPC stands for Remote *Procedure*
Call, so we should probably not use the term "method" when we mean
"procedure" or something else. Since the "procedures" can return values,
which corresponds in some languages to the term "function", we have a
rivalry for the term to use. "Procedure" matches the acronym nicely, but
for some folks "Function" would have a better connotation. Fans of
Eiffel might even prefer "Feature", or "Query" for calls returning a
value and "Routine" or "Command" for those not. Given the variety of
possibilities, here we stay with the simple policy of matching the
acronym:
<call procedure='examples.getStateName'>
<param>
<value><i4>41</i4></value>
</param>
</call>
Scalar Values
Typically, an interface definition determines the number, names and
types of parameters to a procedure call. It is incumbent upon the caller
to conform to that specification. Therefore, the declaration for any
procedure to be called as part of an interface *should* indicate the
expected types of the parameters, which means that the caller should not
have to indicate the type of value it is passing (and, the value
*itself* isn't passed in general, but rather a *textual representation*
of the value is passed). XML-RPC should not be blind to typing issues,
but they belong in an interface definition standard (about which more
later) rather than in the calling standard. Removing the type
information from the example results in:
<call procedure='examples.getStateName'>
<param>
<value>41</value>
</param>
</call>
Since the <value> element really now just means "scalar" (see the
specification section "Scalar <value>s"), let's call it that:
<call procedure='examples.getStateName'>
<param>
<scalar>41</scalar>
</param>
</call>
If for some reason not contemplated here type information is necessary
for scalars, then having a simple "type" attribute of the "scalar"
element would suffice, especially since the set of allowable values is
fixed, small, and consists of only short string values ("i4", "int",
"boolean", "string", "double", "dateTime.iso8601", and "base64").
If we only ever expected simple, short scalar values, we could make one
more change, to:
<!-- NOTE: This is NOT a proposed change -->
<call procedure='examples.getStateName'>
<param>
<scalar value='41'/>
</param>
</call>
but it is presumed that a scalar string value could be very long, and
for such values the former representation is better.
Named Parameters
Some procedures may be implemented in a language that makes it very easy
to implement named parameters. Supporting this would be easy:
<call procedure='examples.getStateName'>
<param name='stateNum'>
<scalar>41</scalar>
</param>
</call>
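As a rough illustration of the call structure arrived at so far, the
following sketch builds a "call" element from a procedure name and a
list of parameters. The names build_call and escape_xml are hypothetical
helpers invented for this example; they are not part of the XPC module's
interface, and the convention of passing a named parameter as a
two-element array reference is likewise an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in XML
# text and in single-quoted attribute values.
sub escape_xml {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/'/&apos;/g;
    return $text;
}

# Hypothetical helper: build a <call> element. Each parameter is either
# a plain scalar (positional) or a [ name => value ] pair (named).
sub build_call {
    my ($procedure, @params) = @_;
    my $xml = sprintf("<call procedure='%s'>\n", escape_xml($procedure));
    for my $param (@params) {
        my ($name, $value) =
            ref($param) eq 'ARRAY' ? @$param : (undef, $param);
        my $attr = defined $name
            ? sprintf(" name='%s'", escape_xml($name))
            : '';
        $xml .= sprintf("  <param%s>\n    <scalar>%s</scalar>\n  </param>\n",
                        $attr, escape_xml($value));
    }
    return $xml . "</call>\n";
}

print build_call('examples.getStateName', [ stateNum => 41 ]);
```

Calling build_call('examples.getStateName', 41) instead would produce
the positional form shown earlier, with no "name" attribute.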
Scalar Types
Whether types apply to calls and interfaces or just to interfaces, they
are an important part of the specification.
The specification defines "i4" and "int" to be synonyms for a 'four-byte
signed integer'. Since the value will be represented in the call as
text, this description really isn't an appropriate specification, since
it is written in terms of a binary representation. We suggest here a
single term for this data type, "integer", and that it be defined in
terms of a range of acceptable values: -2,147,483,648 to +2,147,483,647
(just the range of values that can be stored in a two's complement
32-bit binary representation).
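A range-based definition is easy to check against the textual form. The
sketch below is one possible validator; the name is_xpc_integer is
hypothetical, and accepting an optional leading sign with plain decimal
digits only is an assumption about the lexical form, which this document
does not spell out.

```perl
use strict;
use warnings;

# Hypothetical validator: accept only decimal text whose value lies in
# the range -2,147,483,648 .. +2,147,483,647.
sub is_xpc_integer {
    my ($text) = @_;
    # An optional sign followed by 1 to 10 ASCII digits. The length
    # limit keeps the numeric comparison below exact.
    return 0 unless defined $text && $text =~ /^[+-]?[0-9]{1,10}\z/;
    return ($text >= -2_147_483_648 && $text <= 2_147_483_647) ? 1 : 0;
}

for my $text (qw(41 -2147483648 +2147483647 2147483648 4.0 0x10)) {
    printf "%-12s %s\n", $text,
        is_xpc_integer($text) ? 'accepted' : 'rejected';
}
```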
The "boolean" data type is distinct from the "integer" data type, yet
its domain {"0", "1"} is a subset of the "integer" domain instead of the
more consistent {"false", "true"}. If "boolean" is going to be treated
as its own type, it should have its own domain.
The specification defines "double" to be 'double-precision signed
floating point number'. Note that in the 1999-01-21
questions-and-answers section near the end of the document, it is
revealed that the full generality of the data type commonly meant by
such a description is not available. Neither infinities, nor "NaN" (the
Not-a-Number value) are permitted. Not even exponential notation is
allowed. Very simple strings matching the Perl regular expression:
/^([+-])(\d*)(\.)(\d*)$/
are the only ones permitted according to the answer given, although one
suspects that what was meant was something closer to this:
/^([+-])?(\d*)((\.)(\d*))?$/
because the first expression requires the sign to be present, and
permits "+." and "-." as valid strings (although to what values they
would map is a mystery).
Note: The second expression makes the leading sign and trailing decimal
point and digits optional, but still isn't perfect, since it allows the
empty string as a value.
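The difference between the two expressions is easiest to see by running
them against a few strings. The sketch below does so, and adds a third
pattern that requires at least one digit. That third pattern is an
assumption offered for comparison only; it comes neither from the
specification nor from the FAQ answer. All three function names are
hypothetical.

```perl
use strict;
use warnings;

# The expression as given in the FAQ answer.
sub matches_faq     { return $_[0] =~ /^([+-])(\d*)(\.)(\d*)$/ ? 1 : 0 }

# The expression that was presumably intended.
sub matches_relaxed { return $_[0] =~ /^([+-])?(\d*)((\.)(\d*))?$/ ? 1 : 0 }

# An assumed tightening: require at least one digit somewhere, so that
# '', '+', '.' and '-.' are all rejected.
sub matches_tight   { return $_[0] =~ /^[+-]?(?:\d+(?:\.\d*)?|\.\d+)$/ ? 1 : 0 }

for my $text ('+1.5', '1.5', '-.', '', '42', '1e3') {
    printf "%-8s faq=%d relaxed=%d tight=%d\n", "'$text'",
        matches_faq($text), matches_relaxed($text), matches_tight($text);
}
```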
This type should be called "rational" instead of "double" to get away
from the physical description. "decimal" is another potentially
reasonable name for this type.
Also, the FAQ answer says the range of allowable values is
implementation-dependent, but the specification refers to
"double-precision floating-point", which does have an expected set of
behaviors for most people.
The specification mentions "ASCII" in the type definition for string,
but XML permits all of Unicode. Shouldn't one expect to be able to pass
around string values with all the characters thus permitted? Shouldn't
servers and clients be written to handle this broader character set, and
convert as necessary internally? Otherwise, we are taking a big step
back from the promise of XML and the web.
The "dateTime.iso8601" data type name is awkward. They didn't refer to
the IEEE 754 floating point standard in the name of the "double" type
(which would have been "double.ieee754" if they had). Unless the
specification is going to allow multiple "dateTime" variants, the
qualifier is just an annoyance. In addition, most people call this type
"timestamp", even if their computer languages sometimes just call it
"DATE" (as in many SQL implementations). So, here we propose that this
type just be called "timestamp" and that the type description refer to
the ISO 8601 standard.
Finally, the "base64" type (added 1999-01-21) really should be "binary"
with the encoding standard (Base-64) referenced in the type description.
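The renamings proposed in this section can be summarized as a lookup
table. The sketch below is only an illustration; the names %xpc_type_for
and xpc_type are hypothetical and are not part of the XPC module.

```perl
use strict;
use warnings;

# XML-RPC scalar type names mapped to the names proposed in this section.
my %xpc_type_for = (
    'i4'               => 'integer',
    'int'              => 'integer',
    'boolean'          => 'boolean',    # domain becomes {"false", "true"}
    'double'           => 'rational',
    'string'           => 'string',     # full Unicode rather than ASCII
    'dateTime.iso8601' => 'timestamp',
    'base64'           => 'binary',
);

# Hypothetical helper: translate a type name, returning undef if the
# name is not one of the XML-RPC scalar types.
sub xpc_type {
    my ($xmlrpc_type) = @_;
    return $xpc_type_for{$xmlrpc_type};
}

for my $name (sort keys %xpc_type_for) {
    printf "%-18s -> %s\n", $name, xpc_type($name);
}
```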
Structures
Structures continue the same idiom used elsewhere in the specification:
the avoidance of element attributes. Here is the example used in the
specification (modified to accommodate the recommendations already made
here):
<struct>
<member>
<name>lowerBound</name>
<scalar>18</scalar>
</member>
<member>
<name>upperBound</name>
<scalar>139</scalar>
</member>
</struct>
The "name" element here should be converted into an attribute of the
"member" element, leaving:
<struct>
<member name='lowerBound'>
<scalar>18</scalar>
</member>
<member name='upperBound'>
<scalar>139</scalar>
</member>
</struct>
Arrays
The "array" element is defined with a superfluous "data" child element.
This element serves no function, so it should be removed. Here is the
example from the specification (again, modified based on previous
recommendations):
<array>
<data>
<scalar>12</scalar>
<scalar>Egypt</scalar>
<scalar>false</scalar>
<scalar>-31</scalar>
</data>
</array>
Removing the unneeded "data" element leaves us with:
<array>
<scalar>12</scalar>
<scalar>Egypt</scalar>
<scalar>false</scalar>
<scalar>-31</scalar>
</array>
We have recommended getting rid of "value" and using "scalar", but the
specification allows a "value" to contain a scalar value *or* a "struct"
*or* an "array". We can still do without the "value" element, though:
<array>
<scalar>12</scalar>
<array>
<scalar>Egypt</scalar>
<scalar>false</scalar>
<scalar>-31</scalar>
</array>
</array>
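Because "scalar", "array" and "struct" can now nest directly, a
serializer for them is a short recursive routine. The sketch below maps
Perl hash references to "struct", array references to "array", and
anything else to "scalar". The names to_xpc and escape_xml are
hypothetical helpers written for this example, and the two-space
indentation is an arbitrary choice.

```perl
use strict;
use warnings;

# Hypothetical helper: escape characters that are special in XML text
# and in single-quoted attribute values.
sub escape_xml {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/'/&apos;/g;
    return $text;
}

# Hypothetical serializer: hash references become <struct>, array
# references become <array>, and everything else becomes <scalar>.
sub to_xpc {
    my ($value, $indent) = @_;
    $indent = '' unless defined $indent;

    if (ref($value) eq 'HASH') {
        my $xml = "$indent<struct>\n";
        for my $name (sort keys %$value) {    # sorted for repeatable output
            $xml .= "$indent  <member name='" . escape_xml($name) . "'>\n"
                  . to_xpc($value->{$name}, "$indent    ")
                  . "$indent  </member>\n";
        }
        return $xml . "$indent</struct>\n";
    }
    if (ref($value) eq 'ARRAY') {
        my $xml = "$indent<array>\n";
        $xml .= to_xpc($_, "$indent  ") for @$value;
        return $xml . "$indent</array>\n";
    }
    return "$indent<scalar>" . escape_xml($value) . "</scalar>\n";
}

print to_xpc([ 12, [ 'Egypt', 'false', -31 ] ]);
print to_xpc({ lowerBound => 18, upperBound => 139 });
```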
Responses
The fault response example in the specification is:
<?xml version="1.0"?>
<methodResponse>
<fault>
<value>
<struct>
<member>
<name>faultCode</name>
<value><int>4</int></value>
</member>
<member>
<name>faultString</name>
<value><string>Too many parameters.</string></value>
</member>
</struct>
</value>
</fault>
</methodResponse>
This has much unnecessary nesting. It is *much* simpler to store the
fault code as an attribute of the "fault" element and to have the fault
description be the body of the "fault" element:
<?xml version="1.0"?>
<methodResponse>
<fault code='4'>
Too many parameters.
</fault>
</methodResponse>
Adding a Consistent Top-Level Element
It would be nice if one could always be sure that XML data involved in
the XML-RPC protocol had a particular root element.
Another benefit of doing this is that a given request *could* include
multiple calls, which for certain types of interactions could be of
great performance benefit. If you need to make many related calls, the
network latency would be a real drag on performance, but batching up the
calls into one big bundle amortizes the transport time, increasing
performance. A top-level element of "xpc" is used here to stand for
"XML Procedure Call".
<xpc>
<call> ... </call>
<call> ... </call>
<call> ... </call>
</xpc>
As soon as we decide to put multiple calls in a transmission, it raises
the issue of tying responses to calls. We could use order for this, but
we could also provide an attribute to "call" and "response" called "id"
that is optionally provided by the caller, and if present, is copied
into the response element for that call.
HTTP POST REQUEST CONTENT:
<xpc>
<call ... id='1'> ... </call>
<call ... id='foo'> ... </call>
<call ... id='some_guid'> ... </call>
</xpc>
HTTP RESPONSE CONTENT:
<xpc>
<response id='1'> ... </response>
<response id='foo'> ... </response>
<response id='some_guid'> ... </response>
</xpc>
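On the client side, the "id" attribute makes it straightforward to pair
each response with its call. The sketch below assumes the calls and
responses have already been parsed into hashes with an optional "id"
key. The routine name pair_responses is hypothetical, and falling back
to document order for entries without an id is an assumption, not
something this proposal specifies.

```perl
use strict;
use warnings;

# Hypothetical pairing routine. A call that carries an id is matched to
# the response with the same id; calls without an id are matched to the
# id-less responses in document order.
sub pair_responses {
    my ($calls, $responses) = @_;

    my (%response_for, @unlabelled);
    for my $response (@$responses) {
        if (defined $response->{id}) {
            $response_for{ $response->{id} } = $response;
        }
        else {
            push @unlabelled, $response;
        }
    }

    my @pairs;
    for my $call (@$calls) {
        my $response = defined $call->{id}
            ? $response_for{ $call->{id} }
            : shift @unlabelled;
        push @pairs, [ $call, $response ];    # response may be undef
    }
    return @pairs;
}

my @example_pairs = pair_responses(
    [ { id => '1', procedure => 'a' }, { id => 'foo', procedure => 'b' } ],
    [ { id => 'foo', body => 'B' },    { id => '1',   body => 'A' } ],
);
printf "%s -> %s\n", $_->[0]{procedure}, $_->[1]{body} for @example_pairs;
```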
Another benefit of having a consistent top-level element is that we can
use it to specify the protocol version:
<xpc version='0.2'>
<call ...> ... </call>
</xpc>
Finally, using a consistent top-level element permits the response to
contain a copy of the request if desired.
HTTP POST REQUEST CONTENT:
<xpc>
<call ... id='1'> ... </call>
<call ... id='foo'> ... </call>
<call ... id='some_guid'> ... </call>
</xpc>
HTTP RESPONSE CONTENT:
<xpc>
<call ... id='1'> ... </call>
<call ... id='foo'> ... </call>
<call ... id='some_guid'> ... </call>
<response id='1'> ... </response>
<response id='foo'> ... </response>
<response id='some_guid'> ... </response>
</xpc>
Extended Types
Given that XML-RPC is an XML application, it is disconcerting to see its
design be so blind to XML issues such as Unicode values (discussed
above) and tree-structured data. Suppose a procedure was to accept XML
as a parameter or to return XML as its result. How would this be
accomplished with XML-RPC? The answer seems to be "stuff it in a string
scalar". But, to be a proper string, all the markup would have to be
escaped:
<call procedure='foo'>
<param>
<scalar>
&lt;bar&gt;Here's some text in an element.&lt;/bar&gt;
</scalar>
</param>
</call>
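The escaping that the string approach forces on the caller can be
sketched as follows. The function name xml_as_string_scalar is
hypothetical; it escapes only the three characters that must not appear
literally in element content.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap an XML fragment in a <scalar> element as an
# ordinary string, escaping the markup so it is no longer parsed as XML.
sub xml_as_string_scalar {
    my ($fragment) = @_;
    my %entity_for = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;');
    (my $escaped = $fragment) =~ s/([&<>])/$entity_for{$1}/g;
    return "<scalar>$escaped</scalar>";
}

print xml_as_string_scalar("<bar>Here's some text in an element.</bar>"), "\n";
```

The receiver then has to unescape the string and parse it a second time
to recover the tree, which is the overhead a dedicated "xml" type avoids.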
However, if we add to the "scalar", "array" and "struct" types a new
type "xml", then we can do the natural thing:
<call procedure='foo'>
<param>
<xml>
<bar>Here's some text in an element.</bar>
</xml>
</param>
</call>
We could even use XML Namespaces if needed to resolve element name
collisions if they arise (namespaces are commonly used for this reason
in XSLT transforms).
Technically speaking, allowing parameters and results to contain XML
makes the other XML-RPC types redundant, but providing shortcuts for
these common cases does make sense.
Interface Specifications
In order to provide true discoverability, there needs to be a way for a
client to ask the server what operations it supports, and to get back
interface information for the supported procedures.
Sending an empty "query" element should cause the server to return an
array of procedure names:
HTTP POST REQUEST CONTENT:
<xpc>
<query/>
</xpc>
HTTP RESPONSE CONTENT:
<xpc>
<result>
<array>
<scalar>foo</scalar>
<scalar>bar</scalar>
</array>
</result>
</xpc>
Sending a "query" element with a procedure name filled in should return
a response containing a prototype:
HTTP POST REQUEST CONTENT:
<xpc>
<query procedure='foo'/>
</xpc>
HTTP RESPONSE CONTENT:
<xpc>
<prototype procedure='foo'>
<comment>
The 'foo' procedure! Given an integer, returns an array with that
many elements, with each element containing the integer number of
its position within the array.
</comment>
<param-def name='splee' type='scalar' subtype='integer'/>
<result-def type='array'/>
</prototype>
</xpc>
Requesting information on an unknown procedure results in a "fault"
return:
HTTP POST REQUEST CONTENT:
<xpc>
<query procedure='quux'/>
</xpc>
HTTP RESPONSE CONTENT:
<xpc>
<fault code='42'>
Unknown procedure name 'quux'!
</fault>
</xpc>
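On the server side, the three "query" cases above amount to a small
dispatch routine. The sketch below works from a registry hash keyed by
procedure name. The routine name answer_query and the registry layout
(comment, params, result) are assumptions made for this example, and the
fault code 42 is simply the value used in the example above, not a
defined constant. Names and comments are assumed to need no XML
escaping.

```perl
use strict;
use warnings;

# Hypothetical query handler. Returns the element to place inside the
# <xpc> response for a <query/> with an optional procedure name.
sub answer_query {
    my ($registry, $procedure) = @_;

    # An empty query: list the known procedure names.
    if (!defined $procedure) {
        my $xml = "<result>\n  <array>\n";
        $xml .= "    <scalar>$_</scalar>\n" for sort keys %$registry;
        return $xml . "  </array>\n</result>\n";
    }

    # An unknown name: report a fault.
    my $proto = $registry->{$procedure};
    return "<fault code='42'>Unknown procedure name '$procedure'!</fault>\n"
        unless defined $proto;

    # A known name: describe its calling convention.
    my $xml = "<prototype procedure='$procedure'>\n";
    $xml .= "  <comment>$proto->{comment}</comment>\n"
        if defined $proto->{comment};
    for my $param (@{ $proto->{params} || [] }) {
        $xml .= "  <param-def name='$param->{name}' type='$param->{type}'"
              . (defined $param->{subtype} ? " subtype='$param->{subtype}'" : '')
              . "/>\n";
    }
    $xml .= "  <result-def type='$proto->{result}'/>\n"
        if defined $proto->{result};
    return $xml . "</prototype>\n";
}

my %registry = (
    foo => {
        comment => "The 'foo' procedure!",
        params  => [ { name => 'splee', type => 'scalar', subtype => 'integer' } ],
        result  => 'array',
    },
    bar => {},
);
print answer_query(\%registry);
print answer_query(\%registry, 'foo');
print answer_query(\%registry, 'quux');
```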
Conclusion
The "Strategies/Goals" section of the specification lists these items
(paraphrased):
* Leverage the ability of CGI to pass many firewalls to build an RPC
mechanism that can cross many platforms and many network boundaries.
* Cleanliness.
* Extensibility.
* Easy implementation.
The first of these seems to be met without difficulty by leveraging the
HTTP protocol.
Cleanliness is of course a subjective measure, and this document has
pointed out many points on which we think cleanliness can be improved.
The original specification doesn't seem to address extensibility other
than to list it as a goal. This document's addition of the XML type
provides much extensibility.
Ease of implementation should not be radically decreased by the modified
version of XML-RPC proposed here, except in the handling of Unicode
text. This is likely the main reason ASCII was specified in the original
protocol definition.
ADDITIONAL INFORMATION
The following sections provide details behind the proposed XPC.
Document Type Definition for Proposed XPC
This appendix shows the complete simple DTD for XPC. It is no more
complicated than the XML-RPC DTD (see
http://www.ipso-facto.demon.co.uk/xml-rpc-inline.html or
http://www.ontosys.com/xml-rpc/xml-rpc.dtd).
<!-- We are going to use this parameter entity to refer to the value -->
<!-- element types. -->
<!ENTITY % value "(scalar|array|struct|xml)" >
<!ENTITY % request "(query|call)" >
<!ENTITY % response "(prototype|result|fault)" >
<!-- We can have any number of calls and responses inside the top-level -->
<!-- element (but at least one). -->
<!ELEMENT xpc ( %request; | %response; )+ >
<!ATTLIST xpc version CDATA #IMPLIED >
<!-- A query is always empty, and it has an optional procedure attribute. -->
<!-- It can also have an id attribute to distinguish it from other -->
<!-- requests in the same transaction. -->
<!ELEMENT query EMPTY >
<!ATTLIST query procedure CDATA #IMPLIED >
<!ATTLIST query id ID #IMPLIED > <!-- TODO: Can it be ID *and* #IMPLIED? -->
<!-- A call can have zero or more parameters. -->
<!ELEMENT call (param)* >
<!ATTLIST call procedure CDATA #REQUIRED >
<!ATTLIST call id ID #IMPLIED > <!-- TODO: Can it be ID *and* #IMPLIED? -->
<!-- A param *must* have one of the value elements as a child. -->
<!ELEMENT param %value; >
<!ATTLIST param name CDATA #IMPLIED >
<!-- Types for scalars are shown here as optional, but they may not need -->
<!-- to be part of the design. -->
<!ELEMENT scalar (#PCDATA) >
<!ATTLIST scalar type (boolean|integer|rational|string|timestamp|binary)
#IMPLIED >
<!-- An array has any number of elements, each of which is of one of the -->
<!-- value elements. -->
<!ELEMENT array (scalar|array|struct|xml)* >
<!-- A structure has one or more members. -->
<!ELEMENT struct (member+) >
<!-- A member has a name and *must* contain one of the value elements as -->
<!-- a child. -->
<!ELEMENT member %value; >
<!ATTLIST member name CDATA #REQUIRED >
<!-- An xml value element can contain any XML data. -->
<!ELEMENT xml ANY >
<!-- A fault has a code and contains text. -->
<!ELEMENT fault (#PCDATA) >
<!ATTLIST fault code CDATA #REQUIRED >
<!ATTLIST fault id ID #IMPLIED > <!-- TODO: Can it be ID *and* #IMPLIED? -->
<!-- A result is like a param, and *must* have one of the value elements -->
<!-- as a child. -->
<!ELEMENT result %value; >
<!ATTLIST result name CDATA #IMPLIED >
<!ATTLIST result id ID #IMPLIED > <!-- TODO: Can it be ID *and* #IMPLIED? -->
<!-- A prototype gives the calling convention for a procedure. -->
<!ELEMENT prototype (comment?, (param-def|result-def)*) >
<!ATTLIST prototype procedure CDATA #REQUIRED >
<!ATTLIST prototype id ID #IMPLIED > <!-- TODO: Can it be ID *and* #IMPLIED? -->
<!-- A param-def defines an optional name, type and subtype for the -->
<!-- parameter. It may also contain a comment about the parameter. -->
<!ELEMENT param-def (comment?) >
<!ATTLIST param-def name CDATA #IMPLIED >
<!ATTLIST param-def type (scalar|array|struct|xml) #IMPLIED >
<!ATTLIST param-def subtype (boolean|integer|rational|string|timestamp|binary) #IMPLIED >
<!-- A result-def defines an optional name, type and subtype for the -->
<!-- result. It may also contain a comment about the result. -->
<!ELEMENT result-def (comment?) >
<!ATTLIST result-def name CDATA #IMPLIED >
<!ATTLIST result-def type (scalar|array|struct|xml) #IMPLIED >
<!ATTLIST result-def subtype (boolean|integer|rational|string|timestamp|binary) #IMPLIED >
<!ELEMENT comment (#PCDATA) >
XML Schema for Proposed XPC
<!-- TODO -->
An XML-RPC <---> XPC Gateway
The following XSLT transform will convert XML-RPC requests into XPC
requests:
<!-- TODO -->
The following XSLT transform will convert XPC responses into XML-RPC
responses (where it is possible):
<!-- TODO -->
The following XSLT transform will convert XPC requests into XML-RPC
requests (where it is possible):
<!-- TODO -->
The following XSLT transform will convert XML-RPC responses into XPC
responses:
<!-- TODO -->
AUTHOR
Gregor N. Purdy <gregor@focusresearch.com>
COPYRIGHT
Copyright (C) 2001 Gregor N. Purdy. All rights reserved.
This is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.