README.zxid
###########
<<author: Sampo Kellomäki (sampo@iki.fi)>>
<<cvsid: $Id: README.zxid,v 1.38 2006/10/01 19:35:50 sampo Exp $>>
<<class: article!!!ZXID>>

<<abstract:

ZXID is a C library that implements the complete SAML 2.0 stack and
aims to implement all popular identity management protocols, such as
Liberty ID-FF 1.2, WS-Federation, and WS-Trust, as well as ID Web
Services such as Liberty ID-WSF 1.1 and 2.0. It relies on schema-based
code generation, which results in an accurate implementation. SWIG
is used to provide interfaces to scripting languages such as Perl, PHP,
and Python, as well as to Java. It can act as SP, IdP, WSC, and
WSP.

>>

1 Who needs this?
=================

The ZXID project currently (Sept 2006) has four main outputs:

libzxid:: A C library for supporting SAML 2.0, including federated Single Sign-On
zxid:: A C program that implements a SAML Service Provider (SP) as a CGI script
Net::SAML:: A Perl module wrapping libzxid. Also supplied: zxid.pl,
    which implements an SP in a mod_perl environment.
php_zxid:: A PHP extension that wraps libzxid. Also supplied: zxid.php,
    which implements an SP in a mod_php environment.

*You need this if you are*

Web Master:: You want to enable SAML based Single Sign-On (SSO) to your
    web site. In this case you would use the zxid SP CGI script directly,
    only configuring it slightly. Or you can hint to your PHP or perl
    developer that this functionality is available and you want it.

Perl Developer:: You can use the Net::SAML module to integrate SSO
    to your application and web site. Given the direct perl support, this is
    easier than fully understanding the C interface. Both mod_perl
    and perl as CGI are supported.

PHP Developer:: You can use ~dl("php_zxid.so")~ to load the module and
    access the high level functionality, such as SAML 2.0 SSO. We
    support functionality roughly equivalent to perl Net::SAML.
    The PHP module is fully ready to use for SSO, but we expect to
    add a lot more, such as WSC, in future. Both mod_php5 and php as
    CGI are supported. php4 should also work.

Web Developer:: You want to integrate SAML based SSO to your web site
    tool or product so that your customers can enjoy SSO enabled web
    sites. In this case you would study zxid.c for examples and use
    libzxid.a to implement the functionality in your own program.

Identity Management hacker:: You need some building blocks: you
    will study libzxid and add to it, contributing to the project.

The ZXID Project has vastly more ambitious goals. See the ZXID Project chapter
later in this document.

2 Installing
============

If you want to try ZXID out immediately, we recommend compiling the
library and examples and installing one of the examples as a CGI
script in an existing web server. See later chapters for more details.

  tar xvzf zxid-0.7.tgz
  cd zxid-0.7
  # N.B.  There is no configure script. The Makefile works for all
  #       supported platforms as is.
  # N.B2: We distribute some generated files. If they are missing you need
  #       to regenerate them: make cleaner; make dep ENA_GEN=1
  make
  make samlmod           # optional
  make samlmod_install   # optional: install Net::SAML perl module
  make phpzxid           # optional
  make phpzxid_install   # optional: install php_zxid.so PHP extension
  cp zxid <webroot>/
  # configure your web server to recognize zxid as a CGI, e.g.
  mini_httpd -p 8443 -c zxid -S -E zxid.pem

  # Edit your /etc/hosts to contain
  127.0.0.1       localhost sp1.zxidcommon.org sp1.zxidsp.org

  # Point your browser to
  https://sp1.zxidsp.org:8443/zxid?o=E
  https://sp1.zxidsp.org:8443/zxid.pl?o=E      # Perl version

  # Find an IdP to test with and configure it...

2.1 Prerequisites
-----------------

This software depends on the following packages:

1. zlib from zlib.net. Generally whatever comes with your distro is sufficient.
2. openssl-0.9.8c or later. See www.openssl.org. Generally openssl libraries
   distributed with most Linux distros are sufficient.<<footnote: It is
   possible to compile without OpenSSL, e.g. for space constrained embedded
   system, but this has serious security implications.>>
3. libcurl from http://curl.haxx.se/. I used version 7.15.5, but probably
   whatever ships with your distribution is fine. libcurl is needed
   for SOAP bindings and for fetching metadata. It needs to be compiled
   to support HTTPS.<<footnote: Compilation without libcurl is possible
   with some loss of functionality.>>
4. HTTPS capable web server. For the most trivial testing, CGI support is needed. We
   recommend mini_httpd(8) available from
   http://www.acme.com/software/mini_httpd/

The following additional packages are needed by developers who wish
to build from scratch, including the code generation (the standard
distribution includes the output of the code generation, so most
people do not need these).

a. gperf from gnu.org (only for build process when generating code)
b. swig from swig.org (only for build process and only if you
   want scripting interfaces)
c. perl from cpan.org (only for build process and only if you
   want to generate code from .sg)
d. plaindoc from http://mercnet.pt/plaindoc/pd.html (only for
   build process, for code generation from .sg, and for documentation)

Although technically not needed to build zxid, you will need an IdP
to test against. We do not, at this time, supply one so you
will need to find a third party, perhaps a free download of one of the
commercial ones like http://symlabs.com/Products/SFIAM.html.

2.2 Canned Tutorial: Running ZXID as CGI under mini_httpd
---------------------------------------------------------

While zxid will run easily under Apache httpd (see <<link:apache.html:
recipe>>), for the sake of simplicity we first illustrate running it with
mini_httpd(8), a very simple SSL capable web server by Jef Poskanzer.

2.2.1 Getting and installing mini_httpd
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can download the source for mini_httpd from
http://www.acme.com/software/mini_httpd/

You should already have installed OpenSSL, or quite probably OpenSSL
shipped with your distribution. If it is not located at
/usr/local/ssl, then you need to edit the mini_httpd ~Makefile~ to
indicate where it is. At any rate you need to uncomment all lines that
start with SSL_ in the ~Makefile~. Then say

  make

Now copy the mini_httpd binary somewhere in your path.

2.2.2 Running mini_httpd
~~~~~~~~~~~~~~~~~~~~~~~~

After building zxid, in the zxid directory, run

  mini_httpd -p 8443 -c zxid -S -E zxid.pem

where

  -p 8443      specifies the port to listen to
  -c zxid      specifies that URL paths ending in "zxid" are CGI scripts
  -S           specifies that https is to be used
  -E zxid.pem  specifies the SSL certificate to use

See <<link:apache.html: Apache recipe>> for an alternative that
avoids mini_httpd, but is more complicated otherwise.

> N.B. The zxid.pem certificate and private key combo is shipped with zxid
> for demonstration purposes. Obviously everybody who downloads zxid
> has that private key, so there is no real security whatsoever.  For
> production use, you must generate, or acquire, your own private
> key-certificate pair (and keep the private key secret). See the Certificates
> chapter for further info.

2.2.3 Accessing ZXID
~~~~~~~~~~~~~~~~~~~~

Edit your /etc/hosts file so that the definition of localhost also
includes sp1.zxidcommon.org and sp1.zxidsp.org domain names, e.g:

  127.0.0.1       localhost sp1.zxidcommon.org sp1.zxidsp.org
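
Before starting the browser it can be worth confirming that both names really are present. The following is a minimal sketch, not part of the ZXID distribution: the helper name `hosts_has_name` is made up for this example, and it only inspects the hosts file rather than performing a real name lookup.

```shell
# hosts_has_name FILE NAME  (hypothetical helper, not shipped with zxid)
# Succeeds if NAME is listed as a host name on a non-comment line of FILE.
hosts_has_name() {
  awk -v name="$2" '
    /^[ \t]*#/ { next }                      # skip comment lines
    {
      for (i = 2; i <= NF; i++) {            # field 1 is the IP address
        if ($i ~ /^#/) break                 # stop at a trailing comment
        if ($i == name) found = 1
      }
    }
    END { exit(found ? 0 : 1) }
  ' "$1"
}

# Report on the two names used by this tutorial.
for n in sp1.zxidcommon.org sp1.zxidsp.org; do
  if hosts_has_name /etc/hosts "$n"; then
    echo "$n: found in /etc/hosts"
  else
    echo "$n: missing from /etc/hosts"
  fi
done
```

If either name is reported missing, add it to the 127.0.0.1 line as shown above and run the check again.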

Point your browser to

> https://sp1.zxidsp.org:8443/zxid

or if you do not want the common domain cookie check

> https://sp1.zxidsp.org:8443/zxid?o=E

2.2.4 Setting up an IdP
~~~~~~~~~~~~~~~~~~~~~~~

Currently zxid does not ship with an IdP (though the necessary
protocol encoders and decoders are latently available in libzxid,
should anyone wish to make an attempt to hack an IdP together).
For you to test zxid, you will need to acquire an IdP from
somewhere - any vendor whose product is SAML 2.0 certified
will do. One possible source is http://symlabs.com/Products/SFIAM.html
who have a free download.

If you do not want to install an IdP yourself (even for testing), find
someone who already runs one and ask if they would be willing to load
the metadata of your zxid SP. If you do this, you will need to get
externally visible domain names. This canned tutorial uses /etc/hosts
(see previous step) which is only visible on your own machine.

Once you get your IdP up and running, you need to make sure it accepts
the zxid SP in its Circle of Trust (CoT). This is done by placing
the metadata of the SP in the right place in the IdP product configuration.
If your IdP supports automatic CoT management, just turn it on
and chances are you are done.<<footnote: On production IdP you should
understand the trust implications (i.e. no trust) of flipping automatic
CoT management on.>>

If not, you can obtain the zxid SP metadata (which is slightly different
for each install so you can't just copy it from an existing install) from

> https://sp1.zxidsp.org:8443/zxid?o=B

This URL is the +well known location method+ metadata URL. It is also
the SP +Entity ID+ or Provider ID, should the IdP product ask for
this in its configuration. If the IdP product needs you to
supply the metadata manually as an xml file, just point your
web browser to the above URL and save to file.
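
If you prefer the command line to a browser, the same download can be sketched with curl(1). This is only an illustration under stated assumptions: `save_metadata` is a hypothetical helper name, the output file name is arbitrary, and `-k` (skip TLS verification) is only acceptable here because the tutorial server uses the publicly known demo certificate.

```shell
# save_metadata URL FILE  (hypothetical helper, not shipped with zxid)
# Download a metadata document and keep it only if it looks like SAML metadata.
save_metadata() {
  # -k: skip TLS verification (demo certificate only, never in production)
  # -f: fail on HTTP errors; -sS: quiet, but still report errors
  curl -k -f -sS -o "$2.tmp" "$1" || { rm -f "$2.tmp"; return 1; }
  if grep -q 'EntityDescriptor' "$2.tmp"; then
    mv "$2.tmp" "$2"
  else
    echo "save_metadata: $1 does not look like SAML metadata" >&2
    rm -f "$2.tmp"
    return 1
  fi
}

# Example call (needs the tutorial server from the previous steps to be running):
# save_metadata 'https://sp1.zxidsp.org:8443/zxid?o=B' zxid-sp-meta.xml
```

The resulting file is what you would hand to the IdP product if it asks for the metadata as an xml file.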

zxid SP, by default, has automatic fetching of IdP metadata enabled so
there is no manual configuration step needed, provided that the IdP
supports the well known location method. All SAML 2.0 certified IdP
implementations must support it (but you may still need to enable it
in configuration).

However, you will need the Entity ID (Provider ID) of the IdP. This is
the URL that the IdP uses for well known location method of metadata
sharing. You may need to dig through the IdP documentation or GUI for a while
to find it. If you already have the IdP metadata as an xml file, open it
and look for EntityDescriptor/entityID. If you already have the
file, you can also import it manually by running the following command

  ./zxid -import file:///path/to/idp-meta.xml

But the preferred method is still to just let the automatic method
do its job.
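
When you do have the IdP metadata as a file, the Entity ID can be pulled out without opening an editor. The sketch below is a plain text match, not an XML parser: `entity_id_of` is a hypothetical helper name, it assumes the attribute is written with double quotes, and it leaves any XML escapes such as `&amp;` in the value untouched.

```shell
# entity_id_of FILE  (hypothetical helper, not shipped with zxid)
# Print the first entityID attribute value found in a SAML metadata file.
entity_id_of() {
  grep -o 'entityID="[^"]*"' "$1" |          # every entityID="..." occurrence
    head -n 1 |                              # keep only the first one
    sed 's/^entityID="//; s/"$//'            # strip the attribute wrapper
}

# Example call:
# entity_id_of /path/to/idp-meta.xml
```

The printed URL is what you paste into the "ZXID SP SSO" screen described in the next section.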

2.2.5 Your first SSO
~~~~~~~~~~~~~~~~~~~~

1. Start at

   > https://sp1.zxidsp.org:8443/zxid

   or

   > https://sp1.zxidsp.org:8443/zxid?o=E

   If you had a common domain cookie already in place, and you
   are already logged in at the IdP, the SSO may happen
   automatically (go to step 3). The automatic experience
   will be typical when you use SSO regularly for more
   than one web site (i.e. SP).

   However, if you get a screen titled "ZXID SP SSO",
   you need to paste the IdP's Entity ID to the supplied field
   and click "Login". If zxid SP already obtained the metadata for the
   IdP, you may also see a button specific for your IdP (and in this
   case there is no need to know the Entity ID anymore or paste anything). 

2. Next step depends on the IdP product you are using. Usually
   a login screen will appear asking for user name and password.
   Supply these and login. You will need an account at the IdP.

3. For more slick IdPs, that's all you need to do and you will
   land right back at the zxid SP page titled "ZXID SP Management".

   > Congratulations, you have made your first SSO!

   However, some IdPs will pester you with additional questions
   and you will have to jump through their hoops. A typical
   question is whether you want to accept a federation. You do.
   Sometimes the federation question does not appear automatically
   and you need to figure out a way to create a federation
   in their user interface and how to get them to send you
   back to SP. Sometimes the word used is "account linking"
   instead of federation.<<footnote: Vendor products are constantly
   improving in this area. From protocol perspective
   all the additional gyrations are unnecessary. Be sure
   to provide feedback to the vendor so that simpler, easier
   to use, products will emerge in future.>>

3 Configuring and Running
=========================

ZXID ships with working demo configuration so you can run it right
away and once you are familiar with the concepts, you can return
to this chapter.

ZXID uses a configuration file in a hardwired path<<footnote: As of
version 0.2 the configuration file has not been implemented yet. You
configure ZXID at compile time by editing zxidconf.h>>

  /var/zxid/zxid.conf

for figuring out its parameters. If this file is not present, built-in
default configuration is used. The built-in configuration will allow
you to test features of ZXID, but should not be used in production
because it uses default certificates and private keys. Obviously the
demo private key is public knowledge since it is distributed with the
ZXID package, and as such it provides no privacy protection
whatsoever. For production use you MUST generate your own
certificate and private key.

Usually configuration of a system involves the following tasks

1. Configure web server (see your web server documentation)
   a. HTTPS operation and TLS certificate. At a minimum you need
      the main site, but you may want to configure the Common Domain
      Cookie virtual host as well.
   b. Arrange for ZXID to be invoked. This could mean configuring
      zxid.x or zxid.pl to be recognized as a CGI script, or it could
      mean setting up your ~mod_perl~ or ~mod_php~ system to call
      ZXID at the appropriate place.
2. Configure ZXID, including signing certificate and CoT with peer metadata
   a. generate or acquire certificate
   b. Obtain peer metadata (from their well known location) or
      enable +Instant CoT+ feature.
3. Configure CoT peers with your metadata. They can download your
   metadata from your well known location (which is the URL that
   is your entity ID). For this to happen you need to have web
   server and ZXID up and running.

3.1 Configuration Parameters
----------------------------

3.1.1 zxidroot
~~~~~~~~~~~~~~

The root directory of ZXID configuration files and directories. By default this
is /var/zxid and has the following directories and files in it

  /var/zxid/
   |
   +-- zxid.conf  Main configuration file
   +-- pem/       Our certificates
   +-- cot/       Metadata of CoT partners (metadata cache)
   +-- ses/       Sessions
   `-- log/       Log files, pid files, and the like
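
If the tree does not exist yet, it can be created by hand. The sketch below assumes only the directory names shown in the diagram above; the `ZXID_ROOT` shell variable is introduced for this example and defaults to a scratch location so that the commands are safe to try without root privileges.

```shell
# ZXID_ROOT is a variable made up for this sketch. It defaults to a scratch
# directory; set ZXID_ROOT=/var/zxid (typically as root) for the real location.
ZXID_ROOT="${ZXID_ROOT:-$(mktemp -d)/zxid}"

# Create the four subdirectories from the diagram above.
mkdir -p "$ZXID_ROOT/pem" "$ZXID_ROOT/cot" "$ZXID_ROOT/ses" "$ZXID_ROOT/log"

# Show the result.
ls -l "$ZXID_ROOT"
```

Ownership and permissions are deliberately left out, because they depend on which user your web server runs ZXID as.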

3.1.2 pem
~~~~~~~~~

Directory that holds various certificates. The certificates
have hardwired names that are not configurable.

ca.pem:: Certification Authority certificates. These are used for
    validating any certificates received from peers (other sites
    on the CoT). The CA certificates may also be shipped to the
    peers to facilitate them validating our signatures. This is
    especially relevant if the certificate is issued by multilayer CA
    hierarchy where the peer may not have the intermediate CA certificates.
sign-nopw-cert.pem:: The signing certificate AND private key (concatenated
    in one file). The private key MUST NOT be encrypted (there will not
    be any opportunity to supply decryption password).
enc-nopw-cert.pem:: The encryption certificate AND private key (concatenated
    in one file). The private key MUST NOT be encrypted (there will not be
    any opportunity to supply decryption password). The signing certificate
    can be used as the encryption certificate. If encryption certificate
    is not specified it will default to signing certificate.
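
For experimenting, a certificate and unencrypted key in one file can be produced with the openssl(1) command line tool. This is a sketch under assumptions, not the project's prescribed procedure: `make_sign_cert` is a hypothetical helper name, the certificate is self-signed, the 2048 bit RSA key and 365 day lifetime are arbitrary choices, and placing the certificate before the key in the file is an assumption (the text above only says they are concatenated).

```shell
# make_sign_cert DIR CN  (hypothetical helper, not shipped with zxid)
# Create DIR/sign-nopw-cert.pem holding a self-signed certificate followed
# by its unencrypted private key.
make_sign_cert() (
  umask 077                                  # key material: owner-only access
  tmp=$(mktemp -d) || exit 1
  # -nodes leaves the private key unencrypted, as the file name "nopw" implies
  openssl req -x509 -newkey rsa:2048 -nodes \
    -subj "/CN=$2" -days 365 \
    -keyout "$tmp/key.pem" -out "$tmp/cert.pem" || exit 1
  cat "$tmp/cert.pem" "$tmp/key.pem" > "$1/sign-nopw-cert.pem" || exit 1
  rm -rf "$tmp"
)

# Example call:
# make_sign_cert /var/zxid/pem sp1.zxidsp.org
```

The function body runs in a subshell so that the umask change does not leak into the calling shell. Whether your CoT partners accept a self-signed certificate is a policy question for them.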

In addition to the above certificates and private keys, you will need
to configure your web server to use TLS or SSL certificates for the
main site and the Common Domain site. We suggest the following naming

ssl-nopw-cert.pem:: SSL or TLS certificate for main site. In order to
    avoid browser warnings, the CN field of this certificate should match
    the domain name of the site. The SSL certificate can be same as
    signing or encryption certificate.
cdc-nopw-cert.pem:: SSL or TLS certificate for Common Domain Cookie
    introduction site. In order to avoid browser warnings, the CN field
    of this certificate should match the domain name of the site. The SSL
    certificate can be same as signing or encryption certificate.

3.1.3 cot
~~~~~~~~~

Directory that holds metadata of the Circle of Trust (CoT)
partners. If +Instant CoT+ is enabled, this directory needs to be
writable at run time.

4 Compilation for Experts
=========================

  make cleaner
  make dep ENA_GEN=1
  make

4.1 Build Process
-----------------

The build process of ZXID relies heavily on code generation techniques
that are not for the faint of heart. Some of these techniques, like
xsd2sg.pl were innovated for this project, while others like SWIG and
gperf are existing software.  Here and there some additional perl(1)
and sed(1) scripts are run to fix a thing or two.

<<dot: zxid-build: ZXID Build Process

margin=0

sg [label=".sg"];
i  [label=".i"];
phpi  [label="phpzxid.i"];
hc [label=".h and .c"];
pm [label=".pm and glue"];
php [label="php glue"];
gperf [label=".gperf"];
netsaml [label="Net::SAML"];

sg -> hc [label="xsd2sg.pl"];
sg -> gperf [label="xsd2sg.pl"];
gperf -> hc [label="gperf"];
hc -> hc [label="gen-consts-from-gperf-output.pl"];
hc -> libzxid [label="gcc"];
hc -> pm [label="swig"];
i  -> pm [label="swig"];
hc -> php [label="swig"];
phpi -> php [label="swig"];

libzxid -> zxid [label="ld"];

libzxid -> netsaml [label="ld"];
pm -> netsaml [label="perl, gcc, ld"];

libzxid -> php_zxid [label="ld"];
php -> php_zxid [label="gcc, ld"];

>>

Carefully study the Makefile and this should all start to make sense.

4.2 Special or embedded compile (reduced functionality)
-------------------------------------------------------

libzxid contains thousands of functions and any given application is
unlikely to use them all. Thus the easiest, safest, no loss of
functionality, way to reduce the footprint is to simply enable
compiler and linker flags that support dead function
elimination.<<footnote:  Unfortunately the gnu ld does not support dead
function elimination. You should file this as a bug to them. If they
tell you to put every one of the 7000 some functions in a separate .c file,
consider the scalability implications of this. Read the comments in
pulverize.pl for a full scoop and an approach.>>

If you need to squeeze zxid into as minimal space as possible,
some functionality tradeoffs are supported. I stress that you
should only attempt these tradeoffs once you are familiar with
zxid and know what you are doing. The canned install instructions
and tutorial walk throughs stop working if you omit
significant functionality.

4.2.1 Compilation without OpenSSL
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Comment out the -DUSE_OPENSSL flag from CFLAGS in Makefile and
recompile.

This will cripple zxid from security perspective because it
will no longer be able to verify or generate digital signatures.
Unless your environment does not need trust and security,
or you understand thoroughly how to provide trust and security
by other means, it is a very bad idea to compile without OpenSSL.

N.B. Compiling, or not, zxid with OpenSSL does not affect
whether your web server will use SSL or TLS. Unless you know
what you are doing, you should be using SSL at web server
layer. Given that SSL is used at web server layer, the savings
you would gain from compiling zxid without OpenSSL may be
negligible if you use dynamic linking.

4.2.2 Compilation without libcurl
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Comment out the -DUSE_CURL flag from CFLAGS in Makefile and
recompile.

Disabling libcurl does not have adverse security implications: you
only lose some functionality and depending on your situation you may
well be able to live without it.

1. Without libcurl, zxid can not act as a SOAP client. This has a few
   consequences

   a. Artifact profile for SSO is not supported because it needs SOAP
      to resolve the artifact. In most cases a perfectly
      viable alternative is to use POST profile for SSO.

   b. SOAP profiles for Single Logout and NameID management (aka
      defederation) are not supported. You can use the redirect
      profiles and get mostly the same functionality.

2. Automatic CoT metadata fetching using well known
   location method is not supported without libcurl.
   You can fetch the metadata manually, e.g. using web browser,
   and place it in /var/zxid/cot directory.

   If you want to manually control your Circle of Trust
   relationships, you probably want to do this anyway so
   loss of automatic functionality is a nonissue.<<footnote:
   If you compile with libcurl, but still want to disable
   automatic metadata fetching, investigate the ZXID_MD_FETCH
   and related configuration options.>>

3. Web Services Client (WSC) functionality is not supported
   without libcurl. Effectively this is just another case of
   SOAP needed. If you have your own SOAP implementation,
   you may, at lesser automation, achieve much of the
   same functionality by calling the encoder and decoder
   functions manually.

4.2.3 Compiling without zlib (not supported)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

zlib is used mainly in redirect profiles. Since the zlib footprint is
small, we have made no supported provision to compile without it. If
you hack something together, let us know.

4.3 Choosing Which Standards to Compile in (default: all)
---------------------------------------------------------

On space constrained systems you may shed additional weight by only
compiling in the IdM standards you actually use.  Of course, if you do
not use them, dead function elimination should take care of them,
but sometimes you can gain additional savings in space and especially
compile time.

Another reason could be, in the land of the free, if some modules
are covered by a software patent, you may want to compile a binary
without the contested functionality.<<footnote: Please do not
ask me to add additional baggage to avoid patents. Software
patents are a plague and your efforts are best spent in getting
them overturned.>>

You can tweak the flags, shown in the accompanying table, in the Makefile
or by supplying new values on the command line. For example

  make TARGET=sol8 ENA_SAML2=0

would disable SAML 2.0 (and trigger build for Sparc Solaris 8).

<<table: Conditional inclusion of standards
Makefile flag    Standard      Comments
================ ============= ======================================
ENA_SSO=1        All SSO       Must be enabled for any of SSO to work
ENA_SAML2=1      SAML 2.0
ENA_FF12=1       ID-FF 1.2     Requires ENA_SAML11=1
ENA_SAML11=1     SAML 1.1
ENA_WSF=1        All WSF       Must be enabled for any of WSF to work
ENA_WSF2=1       ID-WSF 2.0
ENA_WSF11=1      ID-WSF 1.1
>>

4.4 localconf.mk
----------------

You can use localconf.mk to remember your own make options,
such as TARGET and different ENA flags, without editing
the distributed Makefile.

One useful option to put in localconf.mk is ENA_GEN which
will turn on the dependencies that will trigger generation
of the files in zxid/c directory. For example

  echo 'ENA_GEN=1' >>localconf.mk
  make
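
The same file can hold the standards selection from the table in the previous section. As a sketch, the following appends a reduced configuration that keeps SAML 2.0 SSO and drops ID-FF 1.2, SAML 1.1 and all of WSF; the `ZXID_SRC` shell variable is introduced for this example and is assumed to point at the top of the unpacked zxid source tree.

```shell
# ZXID_SRC is a variable made up for this sketch: the top of the zxid source
# tree, assumed to be the current directory unless set otherwise.
ZXID_SRC="${ZXID_SRC:-.}"

# Append (rather than overwrite) so that existing local settings survive.
cat >> "$ZXID_SRC/localconf.mk" <<'EOF'
# Local build options, kept out of the distributed Makefile
ENA_FF12=0
ENA_SAML11=0
ENA_WSF=0
EOF

# Show what make will now pick up.
cat "$ZXID_SRC/localconf.mk"
```

ENA_FF12 and ENA_SAML11 are switched off together because, according to the table, ID-FF 1.2 requires SAML 1.1. Run make afterwards to rebuild with the new settings.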

5 Net::SAML Perl Module
=======================

* perl CGI example: zxid.pl
* using with mod_perl

After building the main zxid tree, you can

  cd Net
  perl Makefile.PL
  make
  make test      # Tests are extremely sparse at the moment
  make install

This assumes you use the pregenerated Net/SAML_wrap.c and Net/SAML.pm files
that we distribute. If you wish to generate these files from origin,
you need to have SWIG installed and then say in main zxid directory

  make perlmod     # Makes all available perl modules (including heavy low level ones)
  make samlmod     # Only makes Net::SAML (much faster)
  make wsfmod      # Only makes Net::WSF (much faster)

>  WARNING: Low level interface is baroque, and consequently, it
>  takes a lot of disk space, RAM and CPU to build it: 100 MB of disk
>  space and over an hour (on 1GHz CPU) would be no exaggeration. Build
>  time memory consumption of a single cc1 process will be over
>  256 MB of RAM. You have been warned.

5.1 Current major modules are
-----------------------------

* Net::SAML - The high level interfaces for Single Sign-On (SSO)
* Net::SAML::Raw - Low level assertion and protocol manipulation interfaces
* Net::SAML::Metadata - Low level metadata manipulation interfaces

5.2 Planned modules
-------------------

* Net::WSF - The high level interfaces for Web Services Frameworks (WSF)
* Net::WSF::Raw - The low level interfaces for WSF variants
* Net::WSF::WSC - The high level interfaces for Web Services Clients
* Net::WSF::WSC::Raw

5.3 Perl API Adaptations
------------------------

The perl APIs were generated from the C .h files using SWIG. Generally any
C functions and constants that start with zxid_, ZXID_, SAML2_, or SAML_
have that prefix changed to <<tt: Net::SAML::>>. Note, however, that
the zx_ prefix is not stripped.

Since ZXID wants to keep strings in many places in length + data
representation, namely as ~struct zx_str_s~, SWIG typemaps were used
to make this happen automatically. Thus any C function that takes as
an argument <<tt: struct zx_str_s*>> can take a perl string
directly. Similarly any C function that returns such a pointer, will
return a perl string instead. As a final goodie, any C function, such
as

  struct zx_str_s* zx_ref_len_str(struct zx_ctx* c, int len, char* s);

that takes ~length~ and <<tt: s>> as explicit arguments, takes only a single
argument that is a perl string (the one argument automatically satisfies
two C arguments, thanks to a type map). The above could be called like

  $a = Net::SAML::zx_ref_len_str($c, "foo");

First the "foo" satisfies both ~len~ and ~s~, and then the return value
is converted back to perl string.

5.4 Testing Net::SAML and zxid.pl as CGI script
-----------------------------------------------

To test the perl module, you must restart the mini_httpd(8) so
that it recognizes zxid.pl as CGI script:

  mini_httpd -p 8443 -c zxid.pl -S -E zxid.pem

Then start browsing from

  https://sp1.zxidsp.org:8443/zxid.pl

or if you want to avoid the common domain cookie check

  https://sp1.zxidsp.org:8443/zxid.pl?o=E

5.5 Testing Net::SAML and zxid.pl under mod_perl
------------------------------------------------

You can run zxid.pl under mod_perl using the Apache::Registry
module. See <<link:apache.html: Apache recipe>> for how
to compile Apache to support mod_perl. After configuration
it should work the same as the CGI approach.

5.6 Debugging Net::SAML with GDB
--------------------------------

As bizarre as it may sound, it is actually quite feasible to debug
libzxid and the SAML_wrap.c using GDB while in perl. For example

  cd zxid
  gdb /usr/local/bin/perl
  set env QUERY_STRING=o=E
  r ./zxid.pl

If the script crashes inside the C code, GDB will quite
reasonably take control, allowing you to see the stack backtrace (bt)
and examine variables. Of course it helps if openssl and perl
were compiled with debug symbols (libzxid is compiled
with debug symbols by default), but even if they weren't you
can usually at least get some clue.

When preparing a perl module, generally Makefile.PL mechanism causes
the same compilation flags to be used as were used to compile the perl
itself. Generally this is good, but if libzxid was compiled with
different flags, mysterious errors can crop up. For example, I compile
my libzxid against openssl that I have also compiled myself. However, I
once had a bug where the perl had been compiled such that the Linux
distribution's incompatible openssl would be picked by perl compile
flags, resulting in mystery crashes deep inside openssl ASN.1 decoder
routines (c2i_ASN1_INTEGER() while in d2i_X509() to be exact). When I
issued `info files' in GDB I finally realized that I was using the
wrong openssl library.
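
A similar check can be made without starting GDB by asking the dynamic linker which libraries a file would load. The sketch below assumes a Linux style ldd(1) is available (other platforms have different tools), and `show_ssl_libs` is a hypothetical helper name.

```shell
# show_ssl_libs FILE...  (hypothetical helper, not shipped with zxid)
# For each FILE, list the libssl/libcrypto shared objects it would load.
show_ssl_libs() {
  for f in "$@"; do
    echo "== $f"
    # ldd prints one line per shared library dependency; keep the openssl ones
    ldd "$f" 2>/dev/null | grep -E 'lib(ssl|crypto)'
  done
}

# Example call: point it at the perl binary and at the compiled Net::SAML
# shared object (the path below is a placeholder, locate yours with find):
# show_ssl_libs /usr/local/bin/perl /path/to/SAML.so
```

If the paths printed for the Net::SAML shared object differ from the openssl you compiled libzxid against, you have probably found the kind of mismatch described above.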

6 PHP extension php_zxid.so
===========================

The PHP integration is incomplete due to incomplete support in SWIG
for php5. However, enough interface exists to get most high level API
working and thus successfully run an SP.

After building main zxid distribution, say

  make phpzxid

You MUST have php-config(1) in path. If not, try

  make phpzxid PHP_CONFIG=/path/to/php-config

If the extension built successfully, you can use it by copying
it to a suitable place, e.g.

  make phpzxid_install

The install again uses the php-config(1) to figure out where
php(1) can find the module.

Then add in your script

  dl("php_zxid.so");   // Load the module

You may need to tweak the paths, or LD_LIBRARY_PATH, to get this to work.

7 Python Extension
==================

TBD using SWIG

8 Java Native API
=================

TBD using SWIG

9 Integration with Existing Web Sites
=====================================

Single Sign-On is used to protect some useful resources. ZXID does not
have any means of serving these resources, rather a normal web server
or application server should do it. ZXID should just concentrate on
verifying that a user has valid session, and if not, establishing the
session by way of SSO.

9.1 Brief Overview of Control Flow
----------------------------------

The SAML 2.0 specifications mandate a wire protocol, and in order
to speak the wire protocol, the SP application typically
has to follow a certain standard sequence of control flow.

<<dia: sp-flow,,:bg: Typical control flow of ZXID SP>>

First a user<<footnote: The user is often referred to as "Principal" in
more technical jargon. Although the human user and web browser are
distinct entities, we do not stress that separation here. Whatever
user "does" really will, in protocol, appear as web browser sending
requests.>> tries to access a web site that acts in SP role. This
triggers the following sequence of events

1.  User is redirected to URL in a common domain. This is so that
    we can read the Common Domain Cookie that indicates which
    IdP the user uses. Alternatively, if you started at
    https://sp1.zxidsp.org:8443/zxid?o=E, the CDC check is
    by-passed and flow 2b. happens.

2.  After the CDC check, an Authentication Request (AuthnReq) is
    generated. The IdP may have been chosen automatically
    using CDC (2a), or there may have been some user interface
    interaction (not shown in the diagram) to choose the IdP.

3.  User is redirected to the IdP. The redirection carries
    as a query string a compressed and encoded form of
    the SAML 2.0 AuthnReq.

4.  Once the IdP has authenticated the user, or observed
    that there already is a valid IdP session (perhaps from
    a cookie), the IdP redirects the user back to the SP.

    The AuthnResponse may be carried in this redirection
    in a number of alternate ways

    a. The redirect contains a special token called
       +artifact+. The artifact is a reference to the
       AuthnResponse and the SP needs to get the
       actual AuthnResponse by using a SOAP call (the
       4bis step).

    b. The "redirect" is actually an HTML page with
       a form and a little JavaScript that causes the
       form to be automatically posted to the SP.
       The AuthnResponse is carried as a form field.

5.  After verifying that AuthnResponse indicated a
    success, the SP establishes a local session for the
    user (perhaps setting a cookie to indicate this).
    
    Depending on how the SP to web site integration
    is done the user is taken to the web site in
    one of the two ways

    a. Redirect to the content. This time the session
       is there, therefore the flow passes directly
       from check session to the web content.

    b. It is also possible to show the content directly
       without any intervening redirection.

9.2 Redirect Approach to Integration
------------------------------------

9.3 Pass-thru Approach to Integration
-------------------------------------

9.3.1 mod_perl pass-thru
~~~~~~~~~~~~~~~~~~~~~~~~

9.3.2 PHP pass-thru
~~~~~~~~~~~~~~~~~~~

9.3.3 mod_zxid pass-thru
~~~~~~~~~~~~~~~~~~~~~~~~

9.4 Proxy Approach to Integration
---------------------------------

10 Native C API
===============

The generated aspects of the native C API are in c/*-data.h, for example

  c/saml2-data.h

Studying this file is very instructive.

10.1 C Data Structures
----------------------

From .sg a header (NN-data.h) is generated. This header contains structs that
represent the data of the elements. Each element and attribute
generates its own node. Even trivial nodes like strings have to be
kept this way because the nodes form the basis for remembering the ordering
of data. This ordering is needed for exclusive XML canonicalization,
and thus for signature verification.<<footnote: It's unfortunate that
the XML standards do not make this any easier. Without order
maintenance requirement, it would be possible to represent trivial
child elements directly as struct fields. An approach that tried to do
just this is available from CVS tag GEN_LALR (ca. 29.5.2006).>>

Any missing data is represented by a NULL pointer.

Any repeating data is kept as a linked list, in reverse order of being
seen in the data stream.<<footnote: Reverse order is just an
optimization - or an artefact of simply adding latest element to the
head of the list. If this bothers you, it's easy enough to reverse the
list afterwards. Linked list is simple and works well for data whose
order does not matter much (we use separate pointer for remembering
the canonicalization order) and where random access is not needed, or
cardinality is low enough so that simple pointer chasing is efficient
enough.>>
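
The reverse ordering is a natural consequence of head insertion. The
following is a minimal self-contained sketch, not ZXID code: the names
~struct item~, ~push_head()~ and ~reverse_list()~ are hypothetical and
were chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical node type; the real generated structs carry more fields. */
struct item {
  struct item* n;   /* next pointer, analogous to the n (next) field */
  int val;
};

/* Add the latest element to the head: O(1), but yields reverse order. */
static struct item* push_head(struct item* head, struct item* it)
{
  it->n = head;
  return it;
}

/* Reverse the list in place to recover the order in which items were seen. */
static struct item* reverse_list(struct item* head)
{
  struct item* prev = NULL;
  while (head) {
    struct item* next = head->n;
    head->n = prev;
    prev = head;
    head = next;
  }
  return prev;
}
```

Pushing items 1, 2, 3 in that order gives the list 3, 2, 1. A single
call to ~reverse_list()~ restores 1, 2, 3.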

Simple elements and all attributes are represented by a simple string node
(even if they are booleans or integers).

*Example*

Consider the following XML

  <ds:Signature>
     <ds:SignedInfo>
       <ds:CanonicalizationMethod
           Algorithm="http://w3.org/xml-exc-c14n#"/>
       <ds:SignatureMethod
           Algorithm="http://w3.org/xmldsig#rsa-sha1"/>
       <ds:Reference
           URI="#RrcrNwFIw6n">
         <ds:Transforms>
           <ds:Transform
               Algorithm="http://w3.org/xml-exc-c14n#"/>
           <ds:Transform
               Algorithm="http://w3.org/xmldsig#env-sig"/></>
         <ds:DigestMethod
             Algorithm="http://w3.org/xmldsig#sha1"/>
         <ds:DigestValue>lNIzVMrp8CwTE=</></></>
     <ds:SignatureValue>
       GeMp7LS...vnjn8=</></>

Decoding would produce the data structure in Fig-<<see: fig:decode-data>>. You
should also look at c/saml2-data.h to see the structs involved in this
example.

<<dot: decode-data: Typical data structure produced by decode.

// This graph crashes dot 1.12, but works in dot 2.8

size="11.0,6.0"
margin=0
rankdir=LR

{ rank=same; siginfo; sigval; }
{ rank=same; canonmeth; sigmeth; ref; }
//{ rank=same; canonmeth; sigmeth; ref; digmeth; digval; }
//{ rank=same; xforms; xform_env; xform_c14n; }
//{ rank=same; xform_env; xform_c14n; digmeth; digval; }
{ rank=same; xforms; digmeth; digval; }
{ rank=same; xform_c14n; xform_env; }

sig [shape=record,label="zx_ds_Signature_s|{|{<f_kids>gg.kids|<f_siginfo>SignedInfo|<f_sigval>SignatureValue|KeyInfo (0)|Object (0)|Id (0)}}"];
siginfo [shape=record,label="zx_ds_SignedInfo_s|{|{<f_kids>gg.kids|<f_wo>gg.g.wo|<f_canonmeth>CanonicalizationMethod|<f_sigmeth>SignatureMethod|<f_ref>Reference|Id (0)}}"];

canonmeth [shape=record,label="zx_ds_CanonicalizationMethod_s|{|{<f_wo>gg.g.wo|Algorithm\n\"http://w3.org/xml-exc-c14n#\"}}"];

sigmeth [shape=record,label="zx_ds_SignatureMethod_s|{|{<f_wo>gg.g.wo|Algorithm\n\"http://w3.org/xmldsig#rsa-sha1\"}}"];

ref [shape=record,label="zx_ds_Reference_s|{|{<f_kids>gg.kids|gg.g.wo (0)|<f_xforms>Transforms|<f_digmeth>DigestMethod|<f_digval>DigestValue|Id (0)|Type (0)|URI\n\"#RrcrNwFIw6n\"}}"];

xforms [shape=record,label="zx_ds_Transforms_s|{|{<f_kids>gg.kids|<f_wo>gg.g.wo|gg.g.n (0)|<f_xform>Transform}}"];

xform_c14n [shape=record,label="zx_ds_Transform_s|{|{<f_wo>gg.g.wo|gg.g.n (0)|XPath (0)|<f_c14n_algo>Algorithm\n\"http://w3.org/xml-exc-c14n#\"}}"];

xform_env [shape=record,label="zx_ds_Transform_s|{|{gg.g.wo (0)|<f_n>gg.g.n|XPath (0)|Algorithm\n\"http://w3.org/xmldsig#env-sig\"}}"];

xforms:f_xform -> xform_env
xform_env:f_n -> xform_c14n

digmeth [shape=record,label="zx_ds_DigestMethod_s|{|{<f_wo>gg.g.wo|Algorithm\n\"http://w3.org/xmldsig#sha1\"}}"];
digval [shape=record,label="zx_elem_s|{|{gg.g.wo (0)|content\n\"lNIzVMrp8CwTE=\"}}"];

sigval [shape=record,label="zx_ds_SignatureValue_s|{|{gg.g.wo (0)|gg.content\n\"GeMp7LS...vnjn8=\"|Id (0)}}"];

sig:f_siginfo -> siginfo
sig:f_sigval  -> sigval

siginfo:f_canonmeth -> canonmeth
siginfo:f_sigmeth -> sigmeth
siginfo:f_ref -> ref

ref:f_xforms -> xforms
ref:f_digmeth -> digmeth
ref:f_digval -> digval

sig:f_kids ->siginfo [weight=0,style=dashed,color=red]

siginfo:f_wo ->sigval [weight=0,style=dashed,color=red]
siginfo:f_kids -> canonmeth [weight=0,style=dashed,color=red]
canonmeth:f_wo -> sigmeth [weight=0,style=dashed,color=red]
sigmeth:f_wo -> ref [weight=0,style=dashed,color=red]

ref:f_kids -> xforms [weight=0,style=dashed,color=red]
xforms:f_wo -> digmeth [weight=0,style=dashed,color=red]
digmeth:f_wo -> digval [weight=0,style=dashed,color=red]

xforms:f_kids -> xform_c14n [weight=0,style=dashed,color=red]
xform_c14n:f_wo -> xform_env [weight=0,style=dashed,color=red]

>>

There are two pointer systems at play here. The black solid arrows
depict the logical structure of the XML document. For each child
element there is a struct field that simply points to the child. If
there are multiple occurrences of the child, as in
~sig->SignedInfo->Reference->Transforms->Transform~, the children are
kept in a linked list connected by gg.g.n (next) fields.<<footnote:
This linked list may be in inverted order depending on the phase of
the moon and position of the trams in Helsinki. Until implementation
matures, it's better not to depend on the ordering.>>

The +wire order+ structure, depicted by red dashed arrows, is
maintained using gg.kids and gg.g.wo fields. For example
~sig->SignedInfo->Reference->Transforms~ keeps its kids, the
~zx_ds_Transform~ objects, in the original order hanging from the kids
and linked with the wo field. As can be seen, the order kept with wo
fields can be different from the one kept using n (next) fields.
What's more, the kids list can contain dissimilar objects, witness
~sig->SignedInfo->Reference->gg.kids~. The wire order representation
is only captured when decoding the document and is mainly useful for
correctly canonicalizing the document for signature verification. If
you are building a data structure in your own program, you typically
will not set the gg.kids and gg.g.wo fields.

In the diagram, the objects of type ~zx_str_s~ were collapsed to
double quoted strings. Superfluous gg.kids, gg.g.wo, and gg.g.n fields
were omitted: they exist in all structures, but are not shown when
they are ~NULL~. The ~NULL~ is depicted as zero (0).<<footnote: All
this gg.g business is just C's way of referencing the fields of a
common base type of element objects.>>
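
To make the two pointer systems concrete, here is a simplified sketch.
It does not use the generated structs: ~struct node~, ~walk_next()~
and ~walk_wire()~ are hypothetical, and the fields only mimic the roles
of gg.g.n, gg.g.wo and gg.kids.

```c
#include <stddef.h>

/* Hypothetical simplified node; not the layout of the generated structs. */
struct node {
  struct node* n;     /* next sibling of the same type (logical list) */
  struct node* wo;    /* next sibling in wire order */
  struct node* kids;  /* first child in wire order */
  const char* name;
};

/* Collect names by following the n (next) chain starting from first. */
static int walk_next(const struct node* first, const char** out, int max)
{
  int i = 0;
  for (; first && i < max; first = first->n)
    out[i++] = first->name;
  return i;
}

/* Collect the names of the children of parent in wire order. */
static int walk_wire(const struct node* parent, const char** out, int max)
{
  int i = 0;
  const struct node* k;
  for (k = parent->kids; k && i < max; k = k->wo)
    out[i++] = k->name;
  return i;
}
```

With two Transform children linked as in the figure, ~walk_next()~
starting from the logical head yields env-sig followed by c14n, while
~walk_wire()~ on the parent yields c14n followed by env-sig, i.e. the
order in which they appeared in the document.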


<<notacountry: so wo>>

10.1.1 Handling Namespaces
~~~~~~~~~~~~~~~~~~~~~~~~~~

An annoying feature of XML documents is that they have variable
namespace prefixes. The namespace prefix for the unqualified elements
is taken to be the one specified in target() directive of the .sg
input. Name of an element in C code is formed by prefixing the element
by the namespace prefix and an underscore.

Attributes will only have namespace prefix if such was expressly
specified in .sg input.

When decoding, the actual namespace prefixes are recorded. The wire
order encoder knows to use these recorded prefixes so that accurate
canonicalization for XMLDSIG can be produced.

If the message on wire uses wrong namespaces, the wrong ones are
remembered so that canonicalization for signature validation will work
irrespective. The ability to accept wrong namespaces only works as
long as there is no ambiguity as to which tag was meant - there are
some tags that need namespace information to distinguish. If you hit
one of these then either you get lucky and the one that is arbitrarily
picked by the decoder happens to be the correct one, or you are stuck
with no easy way to make it right. Of course the XML document was
wrong to start with so theoretically this is not a concern. Generally
the more schemas that are simultaneously generated to one package, the
greater the risk of collisions between tags.

The schema order encoder always uses the prefixes defined
using target() directives in .sg files. The runtime notion of
namespaces is handled by ~ns_tab~ field of the decoding and encoding
context.  It is initialized to contain all namespaces known by virtue
of .sg declarations.  The runtime assigned prefixes are held in a
linked list hanging from <<tt: n>> (next) field of ~struct
zx_ns_s~. (*** more work needed here)

The code generation creates a file such as c/saml2-ns.c which contains
initialization for the table. The main program should point the ns_tab
field of context as follows:

  main {
    struct zx_ctx* ctx;
    ...
    ctx->ns_tab = zx_ns_tab;   /* Here zx_ is the chosen prefix */
  }

Consider the following evil contortion

  <e:E xmlns:e="uri">
    <h:H xmlns:h="uri"/>
    <b:B xmlns:b="uri">
      <e:C xmlns:e="uri"/>
      <e:D xmlns:e="iru">
        <e:F xmlns:e="uri"/></></></>

Assuming the ~ns_tab~ assigns prefix <<tt: y>> to the namespace
URI, we would have the following data structure as a result of a decode

<<dot: ns-data,,: Decode of XML and resulting namespace structures.
margin=0
//rankdir=LR

{ rank=same; ns_tab; e; h; b; }
{ rank=same; H; B; }
{ rank=same; C; D; }

ns_tab [shape=record,label="{ns_tab|{y|uri|<uri_n>}|{z|iru|<iru_n>}}"]

e [shape=record,label="e|uri|<n>"]
h [shape=record,label="h|uri|<n>"]
b [shape=record,label="b|uri|0"]
i [shape=record,label="e|iru|0"]

ns_tab:uri_n -> e
ns_tab:iru_n -> i
e:n -> h
h:n -> b

E -> H [style=bold]
E -> B [style=bold]
B -> C [style=bold]
B -> D [style=bold]
D -> F [style=bold]

E -> e [color=red]
H -> h [color=red]
B -> b [color=red]
C -> e [color=red]
D -> i [color=red]
F -> e [color=red]
>>

The red thin arrows indicate how the elements reference the
namespaces. Since none of the elements used the prefix originally
specified in the schema grammar target() directive, we ended up
allocating "alias" nodes for the uri. However, since E and C use the
same prefix, they share the alias node. Things get interesting with D:
it redefines the prefix e to mean a different namespace URI, "iru", which
happens to be an alias of prefix z.

Later, when wire order canonical encode is done, the red thin arrows
are chased to determine the namespaces. However, we need to keep a
separate "seen" table to track whether parent has already declared the
prefix and URI. E would declare xmlns:e="uri", but C would not because
it had already been "seen". However, F would have to declare it again
because the xmlns:e="iru" in D masks the declaration. The ~zx_ctx~
structure is used to track the namespaces and "seen" status
throughout decoders and encoders.

<<dot: seen-data,,: Seen data structure (blue dotted and green dashed arrows) in the end of decoding F. S=seen, SN=seen_n.
margin=0
//rankdir=LR

{ rank=same; ns_tab; ee; e; h; b; }
{ rank=same; H; B; }
{ rank=same; C; D; }

ns_tab [shape=record,label="{ns_tab|{P|URI|S|SN|N}|{y|uri|0|0|<uri_n>}|{z|iru|0|0|<iru_n>}}"]

e [shape=record,label="e|uri|0|0|<n>"]
ee [shape=record,label="e|uri|<s>|0|<n>"]
h [shape=record,label="h|uri|0|<sn>|<n>"]
b [shape=record,label="b|uri|0|<sn>|0"]
i [shape=record,label="e|iru|<s>|0|0"]

ctx [shape=record,label="{ctx|{|{<ns>ns_tab|<sn>seen_n}}}"]

ns_tab:uri_n -> ee
ns_tab:iru_n -> i
ee:n -> e
e:n -> h
h:n -> b

E -> H [style=bold]
E -> B [style=bold]
B -> C [style=bold]
B -> D [style=bold]
D -> F [style=bold]

E -> e [color=red]
H -> h [color=red]
B -> b [color=red]
C -> e [color=red]
D -> i [color=red]
F -> ee [color=red]

ns_tab -> ctx:ns [arrowhead=none,arrowtail=normal]
b -> ctx:sn [color=blue,style=dotted,arrowhead=none,arrowtail=normal]
b:sn -> h [color=blue,style=dotted]
h:sn -> ee [color=blue,style=dotted]
ee:s -> i [color=green,style=dashed]
i:s -> e [color=green,style=dashed]
>>

Here we can see how the ~seen_n~ list, represented by the blue dotted
arrows, was built: at the head of the list, ~ctx->seen_n~, is the last
seen prefix, namely b (because, although the meaning of e at F was
different, e as a prefix had already been seen earlier at E), followed
by other prefixes in inverse order of first occurrence.<<footnote: This
is a mere artifact of implementation: it's cheapest to add to the head
of the list. This may change in future.>> The green dashed arrows from
e:uri to e:iru and then on to second e:uri reflect the fact that e:uri
(second) was put to the list first (when we were at E), but later, at
D, a different meaning, iru, was given to prefix e. Finally at F we
give again a different meaning for e, thus pushing to the "seen stack"
another node. Although e at E and at F have the same namespace URI, "uri",
we are not able to use the same node because we need to keep the stack
order. Thus we are forced to allocate two identical nodes.
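
The decision of whether an element has to emit an xmlns declaration
can be sketched as follows. This is a deliberately simplified model
that uses an array as the stack rather than the linked seen and seen_n
pointers of the actual implementation. All names (~struct seen_stack~,
~need_xmlns()~, ~enter_elem()~, ~leave_elem()~) are hypothetical.

```c
#include <string.h>

#define SEEN_MAX 32

/* Hypothetical "seen" stack of namespace declarations currently in scope. */
struct seen_stack {
  const char* prefix[SEEN_MAX];
  const char* uri[SEEN_MAX];
  int depth;
};

/* Return 1 if an xmlns declaration must be emitted for prefix/uri, i.e.
 * the innermost binding of prefix is missing or maps to another URI. */
static int need_xmlns(const struct seen_stack* s,
                      const char* prefix, const char* uri)
{
  int i;
  for (i = s->depth - 1; i >= 0; --i)
    if (!strcmp(s->prefix[i], prefix))
      return strcmp(s->uri[i], uri) != 0;
  return 1;
}

/* Enter an element: push a binding if a declaration is needed. Returns
 * 1 if pushed (declaration emitted), 0 if not, -1 if the stack is full. */
static int enter_elem(struct seen_stack* s,
                      const char* prefix, const char* uri)
{
  if (!need_xmlns(s, prefix, uri))
    return 0;
  if (s->depth >= SEEN_MAX)
    return -1;
  s->prefix[s->depth] = prefix;
  s->uri[s->depth] = uri;
  ++s->depth;
  return 1;
}

/* Leave an element: pop the binding only if enter_elem() pushed one. */
static void leave_elem(struct seen_stack* s, int pushed)
{
  if (pushed > 0)
    --s->depth;
}
```

Walking the example document with these functions, E, H, B, D and F
each emit a declaration, while C does not, because e="uri" is still
the innermost binding of e when C is entered.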

10.1.2 Handling any and anyAttribute
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since our aim is to be lax in what we accept, every element can handle
unexpected additional attributes as well as unexpected elements. Thus
whether the schema specifies any or anyAttribute or not, we handle
everything as if they were there. However, when attributes and
elements are received outside of their expected context, they are
simply treated as strings with string names. This is true even for
those attributes and elements that would be recognizable in their
proper context.

The any extension points, as well as some bookkeeping data
are hidden inside ~ZX_ELEM_EXT~ macro. If you tinker with
this macro, be sure you know what you are doing. If you want
to add your own specific fields to all structs, redefining
~ZX_ELEM_EXT~ may be appropriate, but if you want to add more
fields only to some specific structures, you can define
a macro of form

  TPF_EEE_EXT

and put in it whatever fields you want. These fields will be
initialized to zero when the structure is created, but are not touched
in any other way by the generated code. In particular, if some of your
fields are pointers, it will be your responsibility to free them: the
standard free functions do not know about them. See the data
structure walking functions below for one way to accomplish this.
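
As an illustration of the mechanism, the following sketch shows how a
macro can inject application fields into a struct. It is not taken from
the generated code: ~DEMO_ELEM_EXT~, ~struct demo_elem_s~, ~demo_new()~
and ~demo_free()~ are hypothetical stand-ins.

```c
#include <stdlib.h>

/* Hypothetical per-struct extension macro, in the spirit of TPF_EEE_EXT.
 * It must be defined before the struct definition is compiled. */
#define DEMO_ELEM_EXT  char* app_note; int app_flags;

/* Hypothetical generated struct: the macro expands to extra fields. */
struct demo_elem_s {
  struct demo_elem_s* n;
  DEMO_ELEM_EXT
};

/* Zero-filling allocation: extension fields start out as 0 / NULL. */
static struct demo_elem_s* demo_new(void)
{
  return calloc(1, sizeof(struct demo_elem_s));
}

/* The application frees its own extension pointers, then the node. */
static void demo_free(struct demo_elem_s* x)
{
  if (!x)
    return;
  free(x->app_note);
  free(x);
}
```

The point of the sketch is the division of labour: the allocator only
guarantees zero initialization, and cleanup of pointer fields added by
the application has to be written by the application.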

10.1.3 Root data structure
~~~~~~~~~~~~~~~~~~~~~~~~~~

The root data structure

  struct zx_root_s;

is a special structure that has a field for every top level
recognizable element.

10.1.4 Per element data structures
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

*** TBW

10.1.5 Memory Allocation
~~~~~~~~~~~~~~~~~~~~~~~~

After decoding, all string data points directly into the input buffer,
i.e. strings are NOT copied. Be sure to not free the input buffer
until you are done processing the data structure. If you need to take
a copy of the strings, you will need to walk the data structure as a
post processing step and do your copies. This can be done using

  void TPF_dup_strs_len_NS_EEE(struct zx_dec_ctx* c, struct TPF_NS_EEE_s* x);
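
The consequence of not copying strings can be seen in a small sketch.
The types and functions here (~struct str_ref~, ~str_ref_to()~,
~str_ref_dup()~) are hypothetical and much simpler than the real string
nodes. They only illustrate the pointer-plus-length idea and the
post-processing copy.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string node: a pointer into the input buffer plus a
 * length. The data is not NUL terminated. */
struct str_ref {
  const char* s;
  int len;
};

/* Point at a slice of the input buffer without copying. */
static struct str_ref str_ref_to(const char* buf, int off, int len)
{
  struct str_ref r;
  r.s = buf + off;
  r.len = len;
  return r;
}

/* Take a private copy so that the input buffer can be released.
 * Returns 0 on success, -1 if allocation fails. */
static int str_ref_dup(struct str_ref* r)
{
  char* copy = malloc(r->len + 1);
  if (!copy)
    return -1;
  memcpy(copy, r->s, r->len);
  copy[r->len] = 0;
  r->s = copy;
  return 0;
}
```

Until ~str_ref_dup()~ has been called, any change to or release of the
input buffer is visible through the string node. After the call the
node owns its data and the caller is responsible for freeing it.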

The structures are allocated via ZX_ZALLOC() macro, which
by default calls zx_zalloc() function, which in turn
uses system malloc(3). However, you can redefine the
macro to use whatever other allocation scheme you desire.
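
A sketch of what such a redefinition could look like follows. It uses
a hypothetical macro ~DEMO_ZALLOC~ rather than the real ~ZX_ZALLOC~,
whose exact arguments are not shown here, and a hypothetical bump
allocator ~demo_arena_zalloc()~ as the replacement scheme.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size arena. The union forces suitable alignment. */
static union {
  unsigned char bytes[4096];
  long double align_ld;
  long long align_ll;
  void* align_p;
} demo_arena;
static size_t demo_arena_used = 0;

/* Bump allocator: hands out zero-filled memory and never frees it. */
static void* demo_arena_zalloc(size_t n)
{
  void* p;
  if (n > sizeof(demo_arena.bytes))
    return NULL;
  n = (n + 15u) & ~(size_t)15u;      /* round up to a multiple of 16 */
  if (n > sizeof(demo_arena.bytes) - demo_arena_used)
    return NULL;
  p = demo_arena.bytes + demo_arena_used;
  demo_arena_used += n;
  return memset(p, 0, n);
}

/* Application override: must come before the fallback definition. */
#define DEMO_ZALLOC(n) demo_arena_zalloc(n)

/* What a library header might provide as the fallback. */
#ifndef DEMO_ZALLOC
#define DEMO_ZALLOC(n) calloc(1, (n))
#endif
```

An arena of this kind fits the never-free pattern well: the whole
arena can be discarded or reset in one step once a request has been
processed.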

The generated libraries never free(3) memory. In many programming
patterns, this is actually desirable: for example a CGI program can
count on dying - the process exit(2) will free all the memory.

If you need to free(3) the data structure, you will need to walk it
using

  void TPF_free_len_NS_EEE(struct zx_dec_ctx* c,
                           struct TPF_NS_EEE_s* x,
                           int free_strings);
  void zx_free_any(struct zx_dec_ctx* c,
                   struct zx_note_s* n,
                   int free_strs);

The zx_free_any() works by having a gigantic switch statement that calls
the appropriate specific free function.

You can deep clone the data structure with

  void TPF_deep_clone_NS_EEE(struct zx_dec_ctx* c,
                             struct TPF_NS_EEE_s* x,
                             int dup_strings);
  struct zx_note_s* zx_clone_any(struct zx_dec_ctx* c,
                                 struct zx_note_s* n,
                                 int dup_strs);

The zx_clone_any() works by having a gigantic switch statement that calls
the appropriate specific clone function.

10.2 Decoder as Recursive Descent Parser
----------------------------------------

The entry point to the decoder is

  struct zx_root_s* zx_DEC_root(struct zx_dec_ctx* c,
                                struct zx_ns_s* dummy,
                                int n_decode);

The decoding context holds pointer to the raw data and must be
initialized prior to calling the decoder. The third argument specifies
how many recognized elements are decoded before returning. Usually you
would specify 1 to consume one top level element from the
stream.<<footnote: The second argument, the dummy namespace, is
meaningless for root node, but makes sense for element decoders. For
root you can simply supply 0 (NULL).>>

The returned data structure, ~struct zx_root_s~, contains
one pointer for each type of top level element that can
be recognized. The ~tok~ field of the returned value
identifies the last top level element recognized and can
be used to dispatch to correct request handler:

  zx_prepare_dec_ctx(c, TPF_ns_tab, start_ptr, end_ptr);
  struct TPF_root_s* x = TPF_DEC_root(c, 0, 1);
  switch (x->gg.g.tok) {
  case TPF_NS_EEE_ELEM: return process_EEE_req(x->NN_EEE);
  }

When processing responses, it is generally already known
which type of response you are expecting, so you can simply
check for NULLness of the respective pointer in the returned
data structure.

Internally zx_DEC_root() works much the same way: it scans
a beginning of an element from the stream, looks up the token
number corresponding to the element name, and switches on
that, calling element specific decoder functions (see next
section) to do the detailed processing.
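
The scan, look up and switch pattern can be sketched in a few lines.
This is an illustration only and not the generated decoder: the token
names, ~demo_scan_tok()~ and ~demo_dispatch()~ are hypothetical, and
namespace prefixes are ignored for brevity.

```c
#include <string.h>

/* Hypothetical token numbers for two top level elements. */
enum demo_tok { DEMO_TOK_UNKNOWN = 0, DEMO_TOK_AUTHNREQ, DEMO_TOK_LOGOUTREQ };

/* Scan "<name" at p and return the token for name. A real decoder would
 * also handle namespace prefixes and use a faster lookup. */
static enum demo_tok demo_scan_tok(const char* p, const char* lim)
{
  static const struct { const char* name; enum demo_tok tok; } tab[] = {
    { "AuthnRequest",  DEMO_TOK_AUTHNREQ },
    { "LogoutRequest", DEMO_TOK_LOGOUTREQ }
  };
  const char* name;
  size_t len, i;

  if (p >= lim || *p != '<')
    return DEMO_TOK_UNKNOWN;
  name = ++p;
  while (p < lim && *p != ' ' && *p != '>' && *p != '/')
    ++p;
  len = (size_t)(p - name);
  for (i = 0; i < sizeof(tab)/sizeof(tab[0]); ++i)
    if (strlen(tab[i].name) == len && !memcmp(tab[i].name, name, len))
      return tab[i].tok;
  return DEMO_TOK_UNKNOWN;
}

/* Dispatch on the token, as a request handler might. */
static const char* demo_dispatch(const char* p, const char* lim)
{
  switch (demo_scan_tok(p, lim)) {
  case DEMO_TOK_AUTHNREQ:  return "sso";
  case DEMO_TOK_LOGOUTREQ: return "slo";
  default:                 return "unknown";
  }
}
```

Unknown elements fall through to the default case, which is where a
lax decoder would treat the element as an opaque string instead of
failing.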

In the above code fragment, you should note the call to
zx_prepare_dec_ctx() which initializes the decoder machinery.
It takes +ns_tab+ argument, which specifies which namespaces
will be recognized. This table MUST match the TPF_DEC_root()
function you call (i.e. both must have been generated as
part of the same xsd2sg.pl invocation). The other arguments
are the start of the buffer to decode and pointer one past
the end of the buffer to decode.

10.2.1 Element Decoders
~~~~~~~~~~~~~~~~~~~~~~~

For each recognizable element there is a function of form

  struct TPF_NS_EEE_s* zx_DEC_NS_EEE(struct zx_dec_ctx* c);

where TPF is the prefix, NS is the namespace prefix, and
EEE is the element name. For example:

  struct zx_se_Envelope_s* zx_DEC_se_Envelope(struct zx_ctx* c);

These functions work much the same way as the root decoder. You
should consult dec-templ.c for the skeleton of the decoder. Generally
you should not be calling element specific decoders: they
exist so that zx_DEC_root() can call them. They have somewhat
nonintuitive requirements, for example the opening <, the
namespace prefix, and the element name must have already been
scanned from the input stream by the time you call element
specific decoder.

10.2.2 Decoder Extension Points
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The generated code is instrumented with following macros

ZX_ATTR_DEC_EXT(ss):: Extension point called just after decoding known attribute
ZX_XMLNS_DEC_EXT(ss):: Extension point called just after decoding xmlns attribute
ZX_UNKNOWN_ATTR_DEC_EXT(ss):: Extension point called just after decoding unknown attr
ZX_START_DEC_EXT(x):: Extension point called just after decoding element name
    and allocating struct, but before decoding any of the attributes.
ZX_END_DEC_EXT(x):: Extension point called just after decoding the entire element.
ZX_START_BODY_DEC_EXT(x):: Extension point called just after decoding element tag, including attributes, but before decoding the body of the element.
ZX_PI_DEC_EXT(pi):: Extension point called just after decoding processing instruction
ZX_COMMENT_DEC_EXT(comment):: Extension point called just after decoding comment
ZX_CONTENT_DEC(ss):: Extension point called just after decoding string content
ZX_UNKNOWN_ELEM_DEC_EXT(elem):: Extension point called just after decoding unknown element

Following macros are available to the extension points

TPF:: Type prefix (as specified by  -p during code generation)
EL_NAME:: Namespaceful element name (NS_EEE)
EL_STRUCT:: Name of the struct that describes the element
EL_NS:: Namespace prefix of the element (as seen in input schema)
EL_TAG:: Name of the element without any namespace qualification.

10.3 Exclusive Canonical Encoder
--------------------------------

The encoder receives a C data structure and generates a gigantic
string containing an XML document corresponding to the data structure
and the input schemata. The XML document conforms to the rules of
exclusive XML canonicalization and hence is useful as input to XMLDSIG.

One encoder is generated for each root node specified at the code
generation. Often these encoders share code for interior nodes.

The encoders allow two pass rendering. You can first use the length
computation method to calculate the amount of storage needed and
then call one of the rendering functions to actually render. Or
if you simply have large enough buffer, you can just render directly.

The encoders take as argument next free position in buffer
and return a char pointer one past the last byte used. Thus
you can discover the length after rendering by subtracting the
pointers. This is guaranteed to result in the same length as returned
by the length computation method.<<footnote: This is a useful
sanity check. If the two ever disagree, please report a bug.>>
You can also call the next encoder with the return value
of the previous encoder to render back-to-back elements.
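
The two pass scheme and the pointer chaining can be sketched with a
toy element type. Everything in this example is hypothetical
(~struct demo_el~, ~demo_len()~, ~demo_enc()~, ~demo_render_pair()~).
The generated encoders follow the same calling pattern, but take a
context argument and handle namespaces and attributes as well.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical element that renders as <tag>content</tag>. */
struct demo_el {
  const char* tag;
  const char* content;
};

/* Pass 1: number of bytes the rendering of x needs (no NUL terminator). */
static int demo_len(const struct demo_el* x)
{
  /* "<" tag ">" content "</" tag ">" */
  return (int)(1 + strlen(x->tag) + 1 + strlen(x->content)
               + 2 + strlen(x->tag) + 1);
}

/* Append s at p and return the new write position. */
static char* demo_put(char* p, const char* s)
{
  size_t len = strlen(s);
  memcpy(p, s, len);
  return p + len;
}

/* Pass 2: render x at p, return pointer one past the last byte written. */
static char* demo_enc(const struct demo_el* x, char* p)
{
  p = demo_put(p, "<");
  p = demo_put(p, x->tag);
  p = demo_put(p, ">");
  p = demo_put(p, x->content);
  p = demo_put(p, "</");
  p = demo_put(p, x->tag);
  return demo_put(p, ">");
}

/* Allocate an exactly sized buffer and render two elements back to back.
 * Returns a NUL terminated string the caller must free(), or NULL. */
static char* demo_render_pair(const struct demo_el* a,
                              const struct demo_el* b)
{
  int len = demo_len(a) + demo_len(b);
  char* buf = malloc(len + 1);
  char* p;
  if (!buf)
    return NULL;
  p = demo_enc(a, buf);
  p = demo_enc(b, p);     /* chain: previous return value is next start */
  if (p - buf != len) {   /* sanity check: both passes must agree */
    free(buf);
    return NULL;
  }
  *p = 0;
  return buf;
}
```

The sanity check in ~demo_render_pair()~ mirrors the guarantee
described above: the pointer difference after rendering has to equal
the sum of the precomputed lengths.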

The XML namespace and XML attribute handling of the encoders
is novel in that the specified sort is done already at code
generation time, i.e. the renderers are already in the order
that the sort mandates.

For attributes we know the sort order directly from the schema
because [xml-c14n], sec 2.2, p.7, specifies that they
sort first by namespace URI and then by name, both of which
we know from the schema.

For ~xmlns~ specifications the situation is similarly easy in the
schema order encoder case because we know the namespace prefixes
already at code generation time. However, for the wire order encoder
we actually need a runtime sort because we can not control which
namespace prefixes get used. However, for both cases we can make a
pretty good guess about which namespaces might need to be declared at
any given element: the element's own namespace and namespaces of each
of its attributes. That's all, and it's all known at code generation
time. At runtime we only need to check if the namespace has already
been seen at outer layer.

10.3.1 Length computation
~~~~~~~~~~~~~~~~~~~~~~~~~

Compute length of an element (and its subelements). The XML attributes
and elements are processed in schema order.

  int TPF_LEN_SO_NS_EEE(struct zx_ctx* c,
                        struct TPF_NS_EEE_s* x);

For example:

  int zx_LEN_SO_se_Envelope(struct zx_ctx* c,
                            struct zx_se_Envelope_s* x);

Compute length of an element (and its subelements). The XML namespaces
and elements are processed in wire order.

  int TPF_LEN_WO_NS_EEE(struct zx_ctx* c,
                        struct TPF_NS_EEE_s* x);

For example:

  int zx_LEN_WO_se_Envelope(struct zx_ctx* c,
                            struct zx_se_Envelope_s* x);

10.3.2 Encoding in schema order
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Render an element into string. The XML elements are processed in
schema order. The xmlns declarations and XML attributes are always
sorted per [xml-exc-c14n] rules.<<footnote: The sort is actually done
already at code generation time by xsd2sg.pl.>> This is what you
generally want for rendering new data structure to a string. The wo
pointers are not used.

  char* TPF_ENC_SO_NS_EEE(struct zx_ctx* c,
                          struct TPF_NS_EEE_s* x,
                          char* p);

For example:

  char* zx_ENC_SO_se_Envelope(struct zx_ctx* c,
                              struct zx_se_Envelope_s* x,
                              char* p);

Since it is a very common requirement to allocate correct
sized buffer and then render an element, a helper function
is provided to do this in one step.

  struct zx_str_s* zx_EASY_ENC_SO_se_Envelope(struct zx_ctx* c,
                                    struct zx_se_Envelope_s* x);

The returned string is allocated from allocation arena described
by ~zx_ctx~.

10.3.3 Encoding in wire order
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Render element into string. The XML elements are
processed in wire order by chasing wo pointers. This is what you want
for validating signatures on other people's XML documents. If the wire
representation was schema invalid, e.g. elements were in wrong order,
the wire representation is still respected, except for xmlns
declarations and XML attributes, which are always sorted, per exc-c14n
rules. For each element a function is generated as follows

  char* TPF_ENC_WO_NS_EEE(struct zx_ctx* c,
                          struct TPF_NS_EEE_s* x,
                          char* p);

For example

  char* zx_ENC_WO_se_Envelope(struct zx_ctx* c,
                              struct zx_se_Envelope_s* x,
                              char* p);

A helper function is also available

  struct zx_str_s* zx_EASY_ENC_WO_se_Envelope(struct zx_ctx* c,
                                    struct zx_se_Envelope_s* x);

10.4 Signatures (XMLDSIG)
-------------------------

10.4.1 Signature Generation
~~~~~~~~~~~~~~~~~~~~~~~~~~~

*** TBW

10.4.2 Signature Validation
~~~~~~~~~~~~~~~~~~~~~~~~~~~

For signature validation you need to walk the decoded data structure
to locate the signature as well as the references and pass them to
zxsig_validate(). The validation involves wire order exclusive
canonical encoding of the referenced XML blobs, computation of SHA1 or
MD5 checksums over them, and finally computation of SHA1 checksum
over the <SignedInfo> element and validation of the actual
<SignatureValue> against that. The validation involves public key
decryption using the signer's certificate.

A nasty problem in exclusive canonicalization is that the namespaces
that are needed in the blob may actually appear in the containing XML
structures, thus in order to know the correct meaning of a namespace
prefix, we need to perform the +seen+ computation for all elements
outside and above the blob of interest.<<footnote: This is yet another
indication of how botched the XML namespace concept is. Or this could
have been fixed in the exclusive canonicalization spec by not using
namespace prefixes at all.>>

To verify a signature, you have to do a certain amount of preparatory
work to locate the signature and the data that was signed. Generally what
should be signed will be evident from protocol specifications or from
the security requirements of your application environment. Conversely,
if there is a signature, but it does not reference the appropriate
elements, it's worthless and you might as well reject the document
without even verifying the signature.

*Example*

    struct zxsig_ref refs[1];
    cf = zxid_new_conf("/var/zxid/");
    ent = zxid_get_ent_from_file(cf, "YV7HPtu3bfqW3I4W_DZr-_DKMP4.");
    
    refs[0].ref = r->Envelope->Body->ArtifactResolve
                   ->Signature->SignedInfo->Reference;
    refs[0].blob = (struct zx_elem_s*)r->Envelope->Body->ArtifactResolve;
    res = zxsig_validate(cf->ctx, ent->sign_cert,
                         r->Envelope->Body->ArtifactResolve->Signature,
                         1, refs);
    if (res == ZXSIG_OK) {
      D("sig vfy ok %d", res);
    } else {
      ERR("sig vfy failed due to(%d)", res);
    }

This code illustrates

1. You have to determine who signed and provide the entity
   object that corresponds to the signer. Often you
   would determine the entity from <Issuer> element somewhere
   inside the message.

   The entity is used for retrieving the signing certificate.
   Another alternative is that the signature itself contains
   a <KeyInfo> element and you extract the certificate from
   there. You would still need to have a way to know if you
   trust the certificate.

2. You have to prepare the refs array. It contains pairs of
   <SignedInfo><Reference> specifications combined with the
   actual elements that are signed. Generally the URI
   XML attribute of the <Reference> element points to the
   data that was signed. However, it is application dependent
   what type of ID XML attribute the URI actually references
   or the URI could even reference something outside the
   document. It would be way too unreliable for the
   zxsig_validate() to attempt guessing how to locate the
   signed data: therefore we push the responsibility to
   you. Your code will have to walk the data to locate
   all referenced bits and pieces.

   In the above example, locating the one signed bit was
   very easy: the specification says where it is (and this
   location is fixed so there really is no need to check
   the URI either).

   You pass the length of the refs array and the array
   itself as two last arguments to zxsig_validate().

3. You need to locate the <Signature> element in the document
   and pass it as argument to zxsig_validate(). Usually
   a protocol specification will say where the <Signature>
   element is to be found, so locating it is not difficult.

4. The return value will indicate validation status. ZXSIG_OK,
   which has numerical value of 0, indicates success. Nonzero
   values indicate various kinds of failure.

10.4.3 Certificate Validation and Trust Model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Trust models for TLS and signature validation are separate.

In signature validation the primary trust mechanism is that entity's
metadata specifies the signing certificate and there is no
Certification Authority check at all.<<footnote: If you develop CA
check, please submit patches to ZXID project.>>
This model works well if you control the admission
to your CoT. However, ZXID ships by default with the
automatic CoT feature turned on, thus anyone can get
added to the CoT and therefore signature with any
certificate they declare is "valid". This hardly
is acceptable for anything involving money.

10.5 Data Accessor Functions
----------------------------

Simple read access to data should, in C, be done by
simply referencing the fields of the struct, e.g.

  if (!r->EntitiesDescriptor->EntityDescriptor)
      goto bad_md;

*** TBW

10.6 Memory Allocation and Free
-------------------------------

*** TBW

10.7 Walking the data structure
-------------------------------

*** TBW

10.9 Thread Safety
------------------

All generated libraries are designed to be thread safe, provided
that the underlying libc APIs, such as malloc(3) are thread safe.

11 Creating New Interfaces Using ZXID Methodology
=================================================

The ZXID code generation methodology can be used to create
interfaces to any XML document or protocol that can be
described as a Schema Grammar (which includes any document
that can be expressed as XML Schema - XSD). The general
steps are

1. Convert .xsd file to .sg, or write the .sg directly. For conversion,
   you would typically use a command like

     ~/pd/xsd2sg.pl <foo.xsd >foo.sg

2. Tweak and rationalize the resulting .sg file. In an ideal world
   any construct expressible as .xsd would be nicely representable,
   but in practice some work better than others, so you can create
   a much nicer interface if you invest in some manual tweaking.

   Note that the tweaked .sg still is able to represent the
   same document as the original .xsd described, though
   often the tweaking causes some relaxation.

   Most common tweaks

   a. If the .xsd is written so that the targeted namespace is
      also the default namespace, you should introduce
      a namespace prefix because this is needed during
      code generation to keep different C identifiers
      from clashing with each other. Ideally you
      should coordinate the namespace prefixes globally
      so that even two different projects will not clash.

   b. Where the choice construct is used, indicated
      by the pipe symbol (|) in the .sg file, you
      should refactor it into a sequence of
      zero-or-one occurrence (?) instances of the alternatives
      of the choice. This is needed because for the foreseeable
      future xsd2sg.pl has a limitation in its code generation
      feature. If the choice has maxOccurs="unbounded"
      you should use (*) instead.

   c. xml:lang and other similar attributes may need to
      be rewritten to be just of type %xs:string. This
      works around a bug in xsd2sg.pl.
      
3. "Connect" the schema to the bigger framework. Usually this
   means adding your schema grammar to the ZX_SG variable
   in zxid/Makefile and supplying additional -r flags
   in ZX_ROOT variable. This allows your new schema to
   be visible at top level.

   If your schema is meant to extend leaves or interior nodes of
   the parse tree, such as the SOAP Body, you would edit
   the SOAP schema to accept your
   new protocol elements in the Body. Similarly, you could make the generic SOAP
   header accept your specific header schemata, or make
   the SAML attribute definitions accept your kind of
   attributes - whatever makes sense in your context.

   An alternative to this is to create an entirely new
   monolithic encoder and decoder, i.e. instead of extending
   the existing ZXID project to accommodate your new
   protocol, you just start a new project that uses the
   same methodology. You should see how the SAML protocol
   part is separated from the SAML metadata parsing and
   from the WSF parsing in the existing project.
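
The choice-to-sequence tweak in step 2b can be pictured in C roughly as
below. This is only a sketch under assumptions: the struct and function
names (ex_Alt1, ex_Alt2, ex_Parent, ex_parent_is_strict_choice) are
hypothetical and are not what xsd2sg.pl actually emits. It shows why
the tweak "relaxes" the grammar: once each alternative is an optional
field, nothing in the data structure enforces that exactly one is set,
so the application has to check that itself if it cares.

```c
#include <stddef.h>

/* Hypothetical element types standing in for generated ones. */
struct ex_Alt1 { const char* text; };
struct ex_Alt2 { const char* text; };

/* A choice (Alt1 | Alt2) rewritten as the sequence Alt1? Alt2?:
 * each alternative becomes an optional (nullable) field. */
struct ex_Parent {
  struct ex_Alt1* Alt1;  /* NULL if absent */
  struct ex_Alt2* Alt2;  /* NULL if absent */
};

/* The relaxed grammar accepts zero or both alternatives, so the
 * original "exactly one" rule is checked by hand.
 * Returns 1 if exactly one alternative is present, 0 otherwise. */
int ex_parent_is_strict_choice(const struct ex_Parent* p)
{
  int n = 0;
  if (p->Alt1) ++n;
  if (p->Alt2) ++n;
  return n == 1;
}
```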

12 ZXID Project
===============

Immediate goal: build a SAML 2.0 SP and ID-WSF 1.1 WSC

Goals of ZXID project include

* SOAP 1.1 support (done)
* SAML 2.0 compliance
  - SP role (done)
  - IdP role
* Liberty ID-FF 1.2 support
  - SP
  - IdP
  - SAML 1.1
* Liberty ID-WSF 1.1 support
  - Discovery bootstrap
  - Discovery WSC
  - ID-DAP WSC
  - ID-DAP WSP
* Liberty ID-WSF 2.0 support
  - Discovery bootstrap
  - Discovery WSC
  - ID-DAP WSC
  - ID-DAP WSP

12.1 Project Layout
-------------------

The following directory layout is used by the project. Many of the specified
directories are used by intermediate outputs that are not distributed
in tarball releases, but may or may not be present in CVS checkouts.

  zxid-0.xx
   |
   +-- Net    The Net::SAML perl module
   +-- xsd    XML schema descriptions of protocols (not distributed)
   +-- sg     Schema Grammar (.sg) descriptions of protocols
   +-- c      C code generated from the Schema Grammar descriptions
   +-- tex    Temporary files for document generation using PlainDoc (not distributed)
   +-- html   HTML documentation generated using PlainDoc
   +-- review Publicly released announcements and documents (not distributed)
   +-- t      Test scripts and expected test outputs
   `-- tmp    Temporary files, such as actual test outputs

The Manifest file, that follows, explains each file in more detail.

<<logoutput:
<<Manifest>>
>>

12.2 Protocol Encoders and Decoders
-----------------------------------

The protocol encoders and decoders are generated automatically from
the schema grammar (.sg) descriptions. This ensures accurate protocol
implementation. While the output is strictly schema driven and correct,
the decoders have some provisions to accept deviations from the
strict spec (e.g. out-of-order elements are tolerated). However,
one should note that XMLDSIG does not tolerate much deviation,
so even if the decoder accepts a slightly ill-formed message, it is likely
to fail in signature verification.

There are three outputs from generation

1. Data structures describing the data (xx.h)
2. Encoder that linearizes the data structure to wire protocol (xx-enc.c)
3. Decoder that converts wire protocol byte stream to a data structure (xx-dec.c)
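
To make the three outputs concrete, here is a hand-written toy for a
single made-up element, <ex:Ping ID="..."/>. Everything in it (the
element, the ex_ prefix, the function names) is an assumption for
illustration. The real generated code is driven by the schema grammar
and is far more general, and this sketch does no XML escaping or
namespace handling.

```c
#include <stdio.h>
#include <string.h>

/* 1. Data structure describing the data (the role of xx.h). */
struct ex_Ping { char ID[32]; };

/* 2. Encoder (the role of xx-enc.c): linearize the struct to wire format.
 * Returns the number of bytes written, or -1 if buf is too small. */
int ex_enc_Ping(const struct ex_Ping* x, char* buf, size_t len)
{
  int n = snprintf(buf, len, "<ex:Ping ID=\"%s\"/>", x->ID);
  return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* 3. Decoder (the role of xx-dec.c): parse wire bytes back into the struct.
 * Returns 0 on success, -1 if the input does not look like ex:Ping. */
int ex_dec_Ping(const char* wire, struct ex_Ping* x)
{
  char tail[3] = "";
  /* %31[^"] reads the ID up to the closing quote, bounded by the field size. */
  if (sscanf(wire, "<ex:Ping ID=\"%31[^\"]\"%2s", x->ID, tail) != 2)
    return -1;
  return strcmp(tail, "/>") == 0 ? 0 : -1;
}
```

Encoding a struct and decoding the result should give back the same
ID, which is a handy round-trip check for any encoder and decoder pair.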

12.3 Standards and Namespaces
-----------------------------

ZXID consistently uses the same namespace prefixes throughout the project. The
generated encoders and decoders support the following schemas

<<table: Namespaces
Prefix URI                                         Description
====== =========================================== =================================
sa     urn:oasis:names:tc:SAML:2.0:assertion       SAML 2.0
sp     urn:oasis:names:tc:SAML:2.0:protocol
md     urn:oasis:names:tc:SAML:2.0:metadata
sa11   urn:oasis:names:tc:SAML:1.0:assertion       SAML 1.1
sp11   urn:oasis:names:tc:SAML:1.0:protocol
ff12   urn:liberty:iff:2003-08                     ID-FF 1.2
m20    urn:liberty:metadata:2004-12                v2.0 (almost same as 1.2)
ac     urn:liberty:ac:2004-12                      v2.0 (almost same as 1.2)
b12    urn:liberty:sb:2003-08                      ID-WSF 1.1 SOAP Binding
sec12  urn:liberty:sec:2003-08                     ID-WSF 1.1 Security Mechanisms
di12   urn:liberty:disco:2003-08                   ID-WSF 1.1 Discovery Service
is12   urn:liberty:is:2003-08                      ID-WSF 1.1 Interaction Service
lu     urn:liberty:util:2006-08                    ID-WSF 2.0 Utility Schema
sbf    urn:liberty:sb                              Framework header
b      urn:liberty:sb:2006-08                      ID-WSF 2.0 SOAP Binding
sec    urn:liberty:security:2006-08                ID-WSF 2.0 Security Mechanisms
di     urn:liberty:disco:2006-08                   ID-WSF 2.0 Discovery Service
is     urn:liberty:is:2006-08                      ID-WSF 2.0 Interaction Service
se     http://schemas.xmlsoap.org/soap/envelope/   SOAP 1.1, SSO variant
e      http://schemas.xmlsoap.org/soap/envelope/   SOAP 1.1, WSF variant
dise   http://schemas.xmlsoap.org/soap/envelope/   SOAP 1.1, DS MD update variant
ds     http://www.w3.org/2000/09/xmldsig#          XML Signatures
xenc   http://www.w3.org/2001/04/xmlenc#           XML Encryption
a      http://www.w3.org/2005/08/addressing        WSA 1.0

wsse
http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd
WS Security SecExt 1.0

wsu
http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd
WS Security Utility 1.0
xs     http://www.w3.org/2001/XMLSchema            Namespace only, no code
>>

13 Code Generation Tools
========================

The main workhorse of code generation is xsd2sg.pl, which serves multiple
purposes

1. Build hashes of all declarations in .sg input. Each hash element consists
   of an array of elements and attributes, as well as groups and attribute groups.
   The type of each array element is determined from its prefix, per .sg rules.
2. Expand groups and attribute groups
3. Evaluate each element wrt its type and generate
   a. C data structures
   b. Decoder grammar
   c. Token descriptions for perfect hash and lexical analyzer
   d. Encoder C code

The code to build hashes is interwoven with the code that generates .xsd
from .sg. The rest of the generation happens in a function called
generate().

Typical command line (to generate SAML 2.0 protocol engine)

  ~/plaindoc/xsd2sg.pl -d -gen saml2 -p zx_ \
       -r saml:Assertion -r se:Envelope \
       -S \
       sg/saml-schema-assertion-2.0.sg \
       sg/saml-schema-protocol-2.0.sg \
       sg/xmldsig-core.sg \
       sg/xenc-schema.sg \
       sg/soap11.sg \
       >/dev/null

<<ignore: ~/plaindoc/xsd2sg.pl -d -gen saml2 -p zx_ -r saml:Assertion -r se:Envelope -S sg/saml-schema-assertion-2.0.sg sg/saml-schema-protocol-2.0.sg sg/xmldsig-core.sg sg/xenc-schema.sg sg/soap11.sg >/dev/null >>

To generate SAML 2.0 Metadata engine you would issue

  ~/plaindoc/xsd2sg.pl -d -gen saml2md -p zx_ \
       -r md:EntityDescriptor -r md:EntitiesDescriptor \
       -S \
       sg/saml-schema-assertion-2.0.sg \
       sg/saml-schema-metadata-2.0.sg \
       sg/xmldsig-core.sg \
       sg/xenc-schema.sg \
       >/dev/null

<<ignore: ~/plaindoc/xsd2sg.pl -d -gen saml2md -p zx_ -r md:EntityDescriptor -r md:EntitiesDescriptor -S sg/saml-schema-assertion-2.0.sg sg/saml-schema-metadata-2.0.sg sg/xmldsig-core.sg sg/xenc-schema.sg >/dev/null >>

13.1 Special Support for Specific Programming Languages
-------------------------------------------------------

While C code generation is the main output, and this can always be
converted to other languages using SWIG, sometimes a more natural
language interface can be built by directly generating it.

We plan to enhance the code generation to do something like this. At
least direct hash-of-hashes-of-arrays-of-hashes type datastructure
generation for benefit of some scripting languages is planned.

14 ZXID SP
==========

*** warning: not checked lately, may be wrong!

<<table: ZXID SP URLs
URL          Description
============ =======================================================
/zxid        Same as o=M. Main convenience entry point
/zxid?o=M    SSO with CDC; or management if already logged in
/zxid?o=C    Common Domain Cookie (CDC) reader, usually under common domain host name.
/zxid?o=E    SSO after CDC read; or management if already logged in.
/zxid?o=P    HTTP POST end point. Used for forms and last part of POST profile SSO.
/zxid?o=S    SOAP end point (HTTP POST)
/zxid?o=B    Get SP metadata (or combined SP and IdP metadata if proxying).
>>
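
As a rough illustration of how the o= parameter selects an operation,
the sketch below extracts the operation letter from a query string and
maps it to the descriptions in the table. The function names are
hypothetical and this is not the actual zxid CGI code. Like the table
itself, it may not match the current implementation.

```c
#include <string.h>

/* Extract the operation letter from a CGI query string such as "o=P".
 * Per the table, a missing o parameter means the same as o=M. */
char ex_zxid_op(const char* qs)
{
  const char* p = qs;
  while (p && *p) {
    if (p[0] == 'o' && p[1] == '=' && p[2] && p[2] != '&')
      return p[2];
    p = strchr(p, '&');   /* advance to the next name=value pair */
    if (p) ++p;
  }
  return 'M';
}

/* Map an operation letter to a short description, NULL if unknown. */
const char* ex_zxid_op_desc(char o)
{
  switch (o) {
  case 'M': return "SSO with CDC, or management if already logged in";
  case 'C': return "Common Domain Cookie (CDC) reader";
  case 'E': return "SSO after CDC read, or management if already logged in";
  case 'P': return "HTTP POST end point";
  case 'S': return "SOAP end point";
  case 'B': return "SP metadata";
  default:  return NULL;
  }
}
```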

*** add description of CGI fields

15 Certificates
===============

*** TBD - This chapter should be elaborated to be a certificate tutorial with
following contents:

* Intro to certs and private keys
* Generating self signed cert
* Generating certificate signing request and using it to obtain
  commercially issued cert
* Installing root certs so you can recognize other people's certs
* Client TLS considerations

For the time being, the short answer is that ZXID uses OpenSSL and
PEM format certificates. You can use same techniques as you would use for
Apache / mod_ssl for acquiring certificates.

You should NEVER password protect your private key. There will not
be any opportunity to supply the password. You should instead protect
your private key using Unix filesystem permissions. See OpenSSL.org
or modssl.org FAQs for further information, including how to remove
a password if you accidentally enabled it.
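
One way to check that a private key file is protected by filesystem
permissions is sketched below using POSIX stat(2). The function names
are hypothetical and this check is not part of ZXID. It only tests
that group and others have no access bits set, which is one reasonable
reading of "protect using Unix filesystem permissions".

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if the mode grants no access to group or others
 * (e.g. 0600 or 0400), 0 otherwise (e.g. 0644). */
int ex_mode_is_private(mode_t mode)
{
  return (mode & (S_IRWXG | S_IRWXO)) == 0;
}

/* Returns 1 if the file at path is private, 0 if it is not,
 * and -1 if the file cannot be examined. */
int ex_key_file_is_private(const char* path)
{
  struct stat st;
  if (stat(path, &st) != 0)
    return -1;
  return ex_mode_is_private(st.st_mode);
}
```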

16 License
==========

Copyright (c) 2006 Sampo Kellomäki (sampo@iki.fi), All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

While the source distribution of ZXID does not contain
SSLeay or OpenSSL code, if you use this code you will use OpenSSL
library. Please give Eric Young and OpenSSL team credit (as required by
their licenses).

And remember, you, and nobody else but you, are responsible for
auditing ZXID and OpenSSL library for security problems,
backdoors, and general suitability for your application.

17 FAQ
======

*** real user FAQs are still lacking. Maybe this stuff is perfect?

17.4 Vendor products
--------------------

17.4.1 Symlabs Federated Identity Access Manager (FIAM)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Metadata import to IdP?

What I usually do is

  cd /opt/SYMfiam/3.0.x/conf/symdemo-idpa
  echo 'sp: zxid-sp1$https://sp1.zxidsp.org:8443/zxid?o=B$$' >>cot.ldif

Double check with text editor that the file is sensible.
Note that the single quotes are essential as the dollars
are to be interpreted literally, as separators.

  cd pem
  wget -O zxid-sp1.xml 'https://sp1.zxidsp.org:8443/zxid?o=B'

Here the intent is to fetch the metadata from the SP and
store it in a file whose name (without .xml extension)
matches the first component of the sp: line. The -O option
names the output file, and the quotes keep the shell from
interpreting the question mark. You can also use a browser
to fetch the metadata and simply Save as under the
correct name.

  /opt/SYMfiam/3.0.x/conf/symdemo-idpa/start.sh restart

This should restart the IdP server process and cause a
refresh of the metadata it may have cached. You may
want to

  tail -f /opt/SYMfiam/3.0.x/conf/symdemo-idpa/log/debug.log

to see if it's getting indigestion.

17.5 Known Bugs
---------------

The following are known limitations. We document them here
because we do not plan to fix them in the foreseeable future.

1. Unknown XML attributes are not sorted according to rules
   of exc-c14n. Instead they appear always +after+ known
   XML attributes and in the order they happen to be
   in the linked list.

   *Work around:* Add the attribute to schema (.sg) and
   regenerate and rebuild.
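
For reference, the ordering that canonicalization expects for
attributes is by namespace URI first and local name second, with the
empty namespace URI sorting first. The sketch below shows that ordering
with qsort(3) over an array. The struct and function names are
hypothetical, and ZXID itself keeps unknown attributes in a linked
list, so this is an illustration of the rule, not a patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical attribute record: namespace URI ("" if none),
 * local name, and value. */
struct ex_attr {
  const char* ns_uri;
  const char* local;
  const char* val;
};

/* Namespace URI is the primary key, local name the secondary key.
 * strcmp() on UTF-8 bytes gives the same order as code points, and
 * the empty string naturally sorts before any non-empty URI. */
static int ex_attr_cmp(const void* a, const void* b)
{
  const struct ex_attr* x = a;
  const struct ex_attr* y = b;
  int r = strcmp(x->ns_uri, y->ns_uri);
  return r ? r : strcmp(x->local, y->local);
}

/* Sort n attributes in place into canonical order. */
void ex_sort_attrs(struct ex_attr* attrs, size_t n)
{
  qsort(attrs, n, sizeof(*attrs), ex_attr_cmp);
}
```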

17.6 Mysterious Error Messages
------------------------------

"Random number generator not seeded!!!"

This warning indicates that randomize() was not able to read
/dev/random or /dev/urandom, possibly because your system does not
have them or they are differently named. You can still use SSL, but
the encryption will not be as strong. Investigate setting up
EGD (entropy gathering daemon) or PRNG (Pseudo Random Number
Generator). Both are available on the net.

"msg 123: 1 - error:140770F8:SSL routines:SSL23_GET_SERVER_HELLO:unknown proto"

SSLeay error string. First number (123) is PID, second number (1) indicates
the position of the error message in SSLeay error stack. You often see
a pile of these messages as errors cascade.
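
If you need to pick such lines apart in a log processing tool, a sketch
like the following may help. The function names are hypothetical. The
split of the hexadecimal code into library, function, and reason fields
assumes the legacy packing that the example line below (02001002 shown
as lib(2), func(1), reason(2)) suggests. Newer OpenSSL releases pack
error codes differently, so treat that part as an assumption.

```c
#include <stdio.h>

/* Parse "msg <pid>: <pos> - error:<hexcode>:..." into its parts.
 * Returns 0 on success, -1 if the line does not match. */
int ex_parse_ssleay_msg(const char* line, int* pid, int* pos,
                        unsigned long* code)
{
  return sscanf(line, "msg %d: %d - error:%lx:", pid, pos, code) == 3
    ? 0 : -1;
}

/* Legacy field layout (assumption): 8 bits library,
 * 12 bits function, 12 bits reason. */
unsigned ex_err_lib(unsigned long code)    { return (code >> 24) & 0xffu; }
unsigned ex_err_func(unsigned long code)   { return (code >> 12) & 0xfffu; }
unsigned ex_err_reason(unsigned long code) { return code & 0xfffu; }
```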

"msg 123: 1 - error:02001002::lib(2) :func(1) :reason(2)"

The same as above, but you didn't call load_error_strings() so SSLeay
couldn't verbosely explain the error. You can still find out what it
means with this command:

     /usr/local/ssl/bin/ssleay errstr 02001002

Password is being asked for private key

This is normal behaviour if your private key is encrypted. Either
you have to supply the password or you have to use unencrypted
private key. Scan OpenSSL.org for the FAQ that explains how to
do this.

17.7 Author's Pet Peeves
------------------------

1. What is Schema Grammar (.sg) and why are you using it?
   * Schema Grammar is a compact formal description of XML documents. It is
     mostly bidirectionally convertible to XML Schema (XSD) and captures
     the useful essence of most XML schemas.
   * Schema Grammars are intuitive and compact, often allowing the
     essence to be understood at a glance, with even the most complex cases
     being only about 50% of the volume of the corresponding XSD.
   * We use Schema Grammar descriptions because they are more human readable
     than XSD and still equally amenable to automated code generation.
   * Schema Grammar descriptions are usually converted using xsd2sg.pl, which is
     part of the PlainDoc distribution.
   * See http://mercnet.pt/plaindoc
   * N.B. You do not need xsd2sg.pl or PlainDoc if you just want to compile and use ZXID.

2. What is PlainDoc (.pd)?
   * PlainDoc is a document preparation system that uses intuitive plain text files
     with minimal markup to generate PDF and HTML outputs.
   * We use PlainDoc because it makes it easy to maintain documentation.
   * See http://mercnet.pt/plaindoc
   * N.B. You do not need PlainDoc if you just want to compile and use ZXID.

3. How come zxid is so heavy to compile?
   * SAML 2.0 and related specs have a lot of functionality and detail, even
     if you really only need 1% of it. We do not wish to arbitrate which
     functionality is best or most needed, so we simply provide it all.
   * A lot of the code is generated, thus the input for C compiler is well
     in excess of half a million lines of code (of which only about 6k
     were written by a human).
   * Some of the generated files are gigantic, e.g. Net/SAML/zxid_wrap.c
     is over 380k lines. Compiler has to process all of this as a single
     compilation unit.
   * gcc and gnu ld were, perhaps, not designed to process inputs this large
     efficiently. Often the implementation strategy of keeping
     everything in memory will cause smaller machines to swap.
   * My 1GHz CPU, 256 MB RAM machine definitely swaps and thus
     takes about 45 minutes to compile all this stuff.
   * I recommend at least 1GB RAM and 3GHz CPU for development
     machine. On such machine, you should be able to build in about 10 min.

4. Why do you not use ./configure and GNU autoconf?
   * ~autoconf~ is not for everyone. World does not stop without
     ~autoconf~. Or indeed need ~autoconf~. It is Yet Another Dependency
     I Do Not Need (YADIDNN).
   * I find the GNU ~autoconf~ stuff much more difficult to understand than
     my own ~Makefile~. Why should I debug ~autoconf~ when I could
     spend the time debugging my ~Makefile~ or the actual code?
   * I find resolving problems much easier at source code and ~Makefile~ level
     than trying to debug a million line script generated by some system
     I do not understand (perhaps some hardcore ~autoconf~ advocate could
     try to convince me and educate me, but I doubt it).
   * My policy is to only support systems I have first hand experience with,
     or I have trustworthy friends to rely on. It does not help me
     to have a system that tries to guess +gazillion irrelevant variables+
     to an unpredictable state. It's much easier to stick to standards like
     POSIX and make sure you have predictable results from predictable inputs.
   * If the deterministic and predictable results are wrong, they can
     at least be debugged and fixed with a finite amount of work.
   * Supporting all relevant systems manually is not that much work. The
     inhabitants of the irrelevant systems can support themselves, probably
     learning a great deal on the side.

17.8 What does ZXID aim at - an answer
--------------------------------------

A recent conversation that touched on the aims of ZXID project:

> So just generally, what are your goals for it, are you interested in making
> it work well with what other people are producing (e.g. SAML -> WSF
> cross-over), etc? I'm certainly assuming the answer's yes to that.

I aim at full stack client side implementation. ID-FF, SAML 2.0,
WSF (both versions). The generation technique I use will yield
the encoders and decoders for both WSP and WSC, but the hand written
higher level logic will at first be only written for SP and WSC.

It is an Apache-licensed project, of course, so if someone contributes
the IdP and WSP capabilities, I'll merge them into the distribution.

I am interested to have it working with other people's code at 3 levels:

1. Over-the-wire interoperability
2. I have split the functionality of the SP from the WSC such that
   my SP could probably be used with someone else's WSC and someone
   else's SP would reasonably easily be able to use my WSC.
3. Interfaces to non IdM parts of the complete system, typically
   used to implement the application layer, shall be
   plentiful: C/C++ API, Net::SAML/mod_perl, php - whatever you
   can SWIGify.

One thing I am NOT interested in is a "layered" stack. I strongly
believe it's better that each vertically integrated slice is implemented by
one mind. Thus, except for lowest HTTP, TLS, and TCP/IP layers,
my SP, or WSC, handles the whole depth of the stack - SOAP, signature,
and app interface layers (of course the actual app should be its
own layer and probably user written). That is by design.

I have found in practice that if you attempt a layered stack, you have
impedance mismatches between the modules at different layers because
they were designed and written by different minds. By having vertical
integration I avoid impedance mismatches. This is the reason why
monolithic TCP/IP implementations tend to be better than explicitly
layered, such as the streams approach.

Now, if someone else wanted to take my generated encoders and
decoders and use them as a "layer" in their layered stack, I guess
I would not have any issue. If you do that, please let me know
because I would have to commit to API stability at that layer.
I am willing to do that once there are real projects that depend
on it, but until then I still may redesign those APIs, after
all, I am at revision 0.4 :-)

In the end, it seems that ZXID is actually a somewhat layered approach -
what I mean by "vertical integration" is that all the layers are
designed and controlled by the same mind.

> BTW, I gather that it's SAML 2.0 at the moment, which I can't offer any test
> capability for, but if you get to SAML 1.1, I'm happy to set up some kind of
> IdP test capability for that.

In SSO world SAML 1.1 and ID-FF 1.2 capabilities are definitely on
the road map. In ID-WSF world, I'll probably start with 2.0 DS-WSC
(don't we all) followed by ID-DAP WSC and then tackle 1.1 after that.

17.9 Annoyances and improvement ideas
-------------------------------------

There is a lot of commonality that is not leveraged, especially in the
way service end points are chosen given the metadata. The descriptors
are nearly identical so casting them to one should work.

Many of the SAML2 responses are nearly identical. Rather than
construct them fully formally, we could have just one "SAML any
response" function. Perhaps this could be supported by some schema
grammar level aliasing feature: if an element derives from base type
without adding anything at all of its own, we might as well only
generate code for the base type.

A namespace aliasing scheme would allow us to consider two versions of
schema the same. It seems to be fairly common that the schema
changes are so minor that there is no justification for two
different decoding engines.

98 Support
==========

98.1 Mailing list and forums
----------------------------

Mail the author until we get the list set up. Or volunteer a list :-)

98.2 Bugs
---------

Mail the author until we get bug tracking set up. Or volunteer.

98.3 Developer access
---------------------

We use CVS, but access needs to be manually configured and is not
anonymous. If you contribute significantly, I will bother. Others
can send patches (good way to show you are worthy of CVS access)
to me. I've heard some mixed experiences about open source
sites like sourceforge. If you run such a site and want to
host ZXID Project, please contact me.

If you just always want the latest source: get the tar ball from
the downloads section. Trust me, this is still so much in flux
that only the tar ball snapshots are in any usable state. CVS
access just to get latest source would be pointless.

98.9 Commercial Support
-----------------------

Following companies provide consultancy and support contracts for
ZXID:

* symlabs.com

99 Appendix: Schema Grammars
============================

Large parts of ZXID code are generated from +schema grammars+ which
are a convenient notation for describing XML schemata. This appendix
contains the schema grammars that are currently implemented and
distributed in the ZXID package.

<<tex: \small>>

99.1 SAML 2.0
-------------

99.1.1 saml-schema-assertion-2.0 (sa)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/saml-schema-assertion-2.0.sg>>
>>

99.1.2 saml-schema-protocol-2.0 (sp)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/saml-schema-protocol-2.0.sg>>
>>

99.1.4 saml-schema-metadata-2.0 (md)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/saml-schema-metadata-2.0.sg>>
>>

99.2 SAML 1.1
-------------

99.2.1 oasis-sstc-saml-schema-assertion-1.1 (sa11)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/oasis-sstc-saml-schema-assertion-1.1.sg>>
>>

99.2.2 oasis-sstc-saml-schema-protocol-1.1 (sp11)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/oasis-sstc-saml-schema-protocol-1.1.sg>>
>>

99.3 Liberty ID-FF 1.2
----------------------

99.3.1 liberty-idff-protocols-schema-1.2 (ff12)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-idff-protocols-schema-1.2-errata-v2.0.sg>>
>>

99.3.2 liberty-metadata-v2.0 (m20)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-metadata-v2.0.sg>>
>>

99.3.3 liberty-authentication-context-v2.0 (ac)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-authentication-context-v2.0.sg>>
>>

99.4 Liberty ID-WSF 1.1
-----------------------

99.4.1 liberty-idwsf-soap-binding-v1.2 (b12)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-idwsf-soap-binding-v1.2.sg>>
>>

99.4.2 liberty-idwsf-security-mechanisms-v1.2 (sec12)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-idwsf-security-mechanisms-v1.2.sg>>
>>

99.4.3 liberty-idwsf-disco-svc-v1.2 (di12)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-idwsf-disco-svc-v1.2.sg>>
>>

99.4.5 liberty-idwsf-interaction-svc-v1.1 (is12)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-idwsf-interaction-svc-v1.1.sg>>
>>

99.5 Liberty ID-WSF 2.0
-----------------------

99.5.1 liberty-idwsf-utility-v2.0 (lu)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-idwsf-utility-v2.0.sg>>
>>

99.5.2 liberty-idwsf-soap-binding (no version, sbf)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-idwsf-soap-binding.sg>>
>>

99.5.3 liberty-idwsf-soap-binding-v2.0 (b)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-idwsf-soap-binding-v2.0.sg>>
>>

99.5.4 liberty-idwsf-security-mechanisms-v2.0 (sec)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-idwsf-security-mechanisms-v2.0.sg>>
>>

99.5.5 liberty-idwsf-disco-svc-v2.0 (di)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-idwsf-disco-svc-v2.0.sg>>
>>

99.5.6 liberty-idwsf-interaction-svc-v2.0 (is)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/liberty-idwsf-interaction-svc-v2.0.sg>>
>>

99.6 SOAP 1.1 Processors
------------------------

99.6.1 saml20-soap11 (se)
~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/saml20-soap11.sg>>
>>

99.6.2 wsf-soap11 (e)
~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/wsf-soap11.sg>>
>>

99.6.3 ds-soap11 (dise)
~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/ds-soap11.sg>>
>>

99.7 XML and Web Services Infrastructure
----------------------------------------

99.7.1 xmldsig-core (ds)
~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/xmldsig-core.sg>>
>>

99.7.2 xenc-schema (xenc)
~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/xenc-schema.sg>>
>>

99.7.3 ws-addr-1.0 (a)
~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/ws-addr-1.0.sg>>
>>

99.7.4 wss-secext-1.0 (wsse)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/wss-secext-1.0.sg>>
>>

99.7.5 wss-util-1.0 (wsu)
~~~~~~~~~~~~~~~~~~~~~~~~~

<<schema:
<<sg/wss-util-1.0.sg>>
>>

<<references:

[SAMLCore11] SAML 1.1 Core

[SAMLCore2] SAML 2.0 Core

[xml-c14n] XML Canonicalization (non-exclusive), http://www.w3.org/TR/2001/REC-xml-c14n-20010315

[xml-exc-c14n] Exclusive XML Canonicalization, http://www.w3.org/TR/xml-exc-c14n/

[Disco2] Liberty ID-WSF Discovery service 2.0

[Disco12] Liberty ID-WSF Discovery service 1.1 (liberty-idwsf-disco-svc-v1.2.pdf)

[SecMech2] Liberty ID-WSF 2.0 Security Mechanisms

[SOAPAuthn2] Liberty ID-WSF 2.0 Authentication Service

[SOAPBinding2] Liberty ID-WSF 2.0 framework document that pulls together all aspects

[DST21] Liberty Data Services Template 2.1

[DST20] Liberty DST v2.0

[DST11] Liberty DST v1.1

[IDDAP] Liberty Identity based Directory Access Protocol

[IDPP] Liberty Personal Profile specification.

[Interact11] Liberty ID-WSF Interaction Service protocol 1.1

[FF12] Liberty ID Federation Framework 1.2, Protocols and Schemas

[SUBS2] Liberty Subscriptions and Notifications specification

[Schema1-2] Henry S. Thompson et al. (eds): XML Schema Part 1: Structures, 2nd Ed., W3C Recommendation, 28. Oct. 2004, http://www.w3.org/2002/XMLSchema

[XML] http://www.w3.org/TR/REC-xml

>>

<<htmlpreamble: <title>README ZXID</title><body bgcolor="#330033" text="#ffaaff" link="#ffddff" vlink="#aa44aa" alink="#ffffff"><font face=sans><h1>README ZXID</h1> >>

<<EOF: >>