NAME

IOD - IOD file format specification

VERSION

version 0.9.0

SPECIFICATION VERSION

0.9

STATUS

Specification is still rather in flux. Backwards compatibility is not guaranteed between 0.9.x releases.

ABSTRACT

IOD (short for INI On Drugs) is a configuration file format that is backwards-compatible with the popular INI format. It adds a few extensions to make configuration more powerful but still keeps the configuration parseable by a regular INI parser (although the parse result might differ, sometimes significantly; for simple cases it can be exactly the same). An implementation can turn off some or all of these extensions to make the configuration closer to a regular INI file.

IOD is meant to be a general configuration format for applications.

NOTATION

Data structures are written in pseudo-JSON format: JSON with an ellipsis (...) to denote missing pieces or "and so on", and with comments.

RATIONALE

A general configuration file format needs to be simple for computers as well as users to read and write. Users mostly need to write configuration, while computers need to read but increasingly nowadays also write to configuration (for automation tasks). INI is chosen as the basis format for several reasons: popularity, simplicity, and ease of round trip parsing. These features satisfy the aforementioned requirement.

INI format is popular on Windows as well as Unix. This makes it easy for new users to get started with the configuration.

INI format is simple and very straightforward. It is essentially assignments of parameter names and values, with sections.

Round trip parsing means preserving everything in the file, including comments and formatting (indentation, whitespace). Most serialization formats do not have round trip parsers: after a read and write process, the original formatting and comments are lost. Round trip parsing means that if one loads an INI file, modifies a parameter, and saves it again, everything that is not modified will still be the same (including whitespace and comments). If no parameters are modified, the saved file will be identical to the original.

Round trip parsing is desirable in a configuration because oftentimes valuable information is contained in the formatting/indentation (grouping of parameters) as well as comments (user explaining why she sets a parameter to a certain value, dates, other notes). Software modifying a configuration should not destroy all these.

INI vs ...

Although YAML looks nice and has many features, it lacks round trip parsers and has complex rules that can trip beginners or non-programmers, e.g. significant indentation, significant whitespace after colon in mapping, etc.

Although JSON is popular, it lacks round trip parsers and some important features (e.g. comments). (Note: The documentation for JSON Perl module mentions the phrase "round trip", but it uses the phrase to mean integrity of values, not preserving comments/whitespaces.)

The Apache web server configuration style lacks round trip parsers.

XML lacks round trip parsers and is not convenient for humans to read and write.

Why not plain INI?

First, INI format is ill-defined. There is no single standard, thus various implementations behave differently and there are various variants of the format. This specification intends to describe the IOD format more precisely.

Second, INI lacks some features that I like/need, like: nested section, variable substitution/expression, inclusion of other files, and merging between sections. Most of these features make writing configuration less repetitive.

SPECIFICATION

A configuration is a text file containing a sequence of lines. Default encoding is UTF-8. Each line is either a blank line, a comment line, a directive line, a section line, or a parameter line. Parsing is done line-by-line and in a single pass.
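The single-pass, line-by-line model can be sketched as a small classifier that assigns each line to one of the five kinds defined in the sections that follow. This is only an illustration, not part of the specification: the function name classify_line is hypothetical, and the sketch assumes that a directive may start with either comment character and that any other line containing = is a parameter line.

```python
import re

# An unindented comment whose body starts with optional whitespace and "!".
_DIRECTIVE = re.compile(r"^[;#]\s*!")


def classify_line(line: str) -> str:
    """Return the kind of a single configuration line (illustrative only)."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    if _DIRECTIVE.match(line):
        # Matched against the raw line, so indented lines never qualify.
        return "directive"
    if stripped[0] in ";#":
        return "comment"
    if stripped[0] == "[":
        return "section"
    if "=" in stripped:
        return "parameter"
    raise ValueError(f"unrecognized line: {line!r}")


if __name__ == "__main__":
    for sample in ["", "; note", ";!include a.iod", "[sect]", "a = 1"]:
        print(repr(sample), "->", classify_line(sample))
```

Because the directive test runs before the comment test, a malformed directive is reported as a directive (and can then fail validation) instead of being silently treated as an ordinary comment.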

Argument and quoting

An argument is a sequence of one or more non-whitespace, non-newline, and non-double-quote (") characters. Examples:

 word
 argument2
 yet-another-argument!

To represent a zero-length argument or an argument containing quotes, newlines, or other non-printable characters, a quoting mechanism can be used. Quoting is done with the double-quote (") character. Escaping is done using the backslash (\) character. Known escapes are \' (literal single-quote), \" (literal double-quote), \\ (literal backslash), \r (carriage return), \f (formfeed), \$ (literal dollar), \n (newline), \t (tab), \b (backspace), \a (bell), \0 (null character), octal form (e.g. \0377), hex form (e.g., \xff) and wide-hex form (\x{263a}).
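As an illustration of the escape list above, here is a minimal Python sketch of decoding a quoted argument. The function name unquote is hypothetical. The sketch assumes that the octal form is \0 followed by up to three octal digits (so that \0377 and \xff denote the same character) and that the short hex form takes one or two hex digits; the specification does not spell out either detail. It does not check for stray unescaped quotes inside the argument.

```python
import re

# Single-character escapes listed in the specification.
SIMPLE_ESCAPES = {
    "'": "'", '"': '"', "\\": "\\", "$": "$",
    "r": "\r", "f": "\f", "n": "\n", "t": "\t", "b": "\b", "a": "\a",
}

# Alternatives, in order: \x{HHHH}, \xHH, \0 plus optional octal digits,
# then any other single character (or nothing, for a dangling backslash).
_ESCAPE = re.compile(
    r"\\(?:x\{([0-9A-Fa-f]+)\}|x([0-9A-Fa-f]{1,2})|0([0-7]{0,3})|(.?))",
    re.DOTALL,
)


def unquote(arg: str) -> str:
    """Decode a double-quoted argument into its literal value."""
    if len(arg) < 2 or arg[0] != '"' or arg[-1] != '"':
        raise ValueError("not a quoted argument")

    def decode(match):
        wide, short_hex, octal, char = match.groups()
        if wide is not None:
            return chr(int(wide, 16))
        if short_hex is not None:
            return chr(int(short_hex, 16))
        if octal is not None:
            # A bare \0 is the null character.
            return chr(int(octal, 8)) if octal else "\0"
        if char in SIMPLE_ESCAPES:
            return SIMPLE_ESCAPES[char]
        raise ValueError(f"unknown escape: \\{char}")

    return _ESCAPE.sub(decode, arg[1:-1])
```

For example, unquote applied to the five-character source text "\t" (including the surrounding quotes) yields a single tab character, and the empty quoted argument "" yields a zero-length string.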

Blank line

A blank line is a line containing zero or more whitespaces only. It is ignored.

Comment line

A comment line begins with ; or # as its first nonblank character (note that some INI parsers do not allow indented comments). The use of ; is preferred.

Directive line

A directive line is a special unindented comment line: the comment starts with zero or more whitespaces, followed by an exclamation mark (!), a directive name (a word matching the regular expression /\w+/), and zero or more arguments separated by whitespaces. An invalid directive will cause parsing to fail.

Examples of valid directives:

 ;!include somefile.iod
 ;!include "c:/Configuration Files/somefile.ini"

This directive is invalid because of invalid name:

 ;!what-a-directive!

This directive is invalid because it is unknown:

 ;!foo

This directive is invalid because of unbalanced quotes:

 ;!include "somefile.ini

This directive is invalid because of missing required argument:

 ;!include
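The validity rules illustrated by the examples above (syntax and name, known directive, balanced quotes, required arguments) can be sketched in Python as follows. The function name parse_directive and the ARITY table are hypothetical. Argument splitting is delegated to the standard shlex module, which only approximates IOD quoting (it also accepts single quotes and uses its own escape rules), so this is a rough sketch and not a conforming parser.

```python
import re
import shlex

# (minimum, maximum) argument counts per directive; None means unbounded.
# Derived from the directive list in this specification.
ARITY = {
    "include": (1, 1),
    "defaults": (1, 1),
    "nodefaults": (0, 0),
    "merge": (1, 2),
    "nomerge": (0, 0),
    "sectionpath": (1, None),
    "nosectionpath": (0, 0),
}

_DIRECTIVE = re.compile(r"^[;#]\s*!(\w+)(?:\s+(.*))?$")


def parse_directive(line: str):
    """Return (name, args) for a directive line, or raise ValueError."""
    match = _DIRECTIVE.match(line.rstrip())
    if not match:
        raise ValueError("invalid directive syntax or name")
    name, rest = match.group(1), match.group(2) or ""
    if name not in ARITY:
        raise ValueError(f"unknown directive: {name}")
    # shlex.split raises ValueError on unbalanced quotes.
    args = shlex.split(rest)
    minimum, maximum = ARITY[name]
    if len(args) < minimum or (maximum is not None and len(args) > maximum):
        raise ValueError(f"wrong number of arguments for !{name}")
    return name, args
```

With this sketch, the two valid examples parse into the name include plus one argument each, while the four invalid examples all raise ValueError, each for the reason given above.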

Below is the list of known directives (<foo> signifies required arguments, [foo] signifies optional arguments):

!include <PATH>

Include another file, as if the content of the included file were inserted at that point in the current file. An included file might contain another !include directive. If PATH is not absolute, it is assumed to be relative to the current file (or included file). A circular include will cause the parser to die with an error. Example:

File dir1/a.ini:

 [sectionA/sub1]
 a=1
 ;!include ../dir2/b.ini
 ;!include ../dir2/b3.ini

File dir2/b.ini:

 b=2
 ;!include b2.ini

File dir2/b2.ini:

 c = (2+1)
 ;!include b3.ini

File dir2/b3.ini:

 c=4
 [sectionB]
 c=1

When dir1/a.ini is parsed, the result will be (in JSON):

 {
   "sectionA": {
     "sub1": {
       "a": 1,
       "b": 2,
       "c": [3, 4]
     },
   },
   "sectionB": {
     "c": 1
   },
 }
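The path resolution and circular-include rules can be sketched as a recursive line expander in Python. The function name expand_includes is hypothetical, and the sketch makes several simplifying assumptions: it only recognizes directives starting with ;, it uses shlex for argument quoting as an approximation of IOD quoting, and it treats an !include line without an argument as an ordinary line instead of failing.

```python
import os
import re
import shlex

_INCLUDE = re.compile(r"^;\s*!include\s+(.+)$")


def expand_includes(path, _stack=()):
    """Return the lines of `path` with !include directives expanded in place."""
    real = os.path.realpath(path)
    if real in _stack:
        # The file is already being processed further up the include chain.
        raise RuntimeError(f"circular include: {path}")
    lines = []
    with open(real, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            match = _INCLUDE.match(line)
            if not match:
                lines.append(line)
                continue
            (target,) = shlex.split(match.group(1))  # exactly one argument
            if not os.path.isabs(target):
                # Relative paths are resolved against the including file.
                target = os.path.join(os.path.dirname(real), target)
            lines.extend(expand_includes(target, _stack + (real,)))
    return lines
```

Tracking only the chain of files currently being expanded (and not every file ever seen) means that including the same file twice in sequence, as dir1/a.ini does with b3.ini in the example, is not treated as circular.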

!defaults <SECTION>

Specify that from now on, when encountering a new section, fill it with values from SECTION unless the values are already specified.

Example:

 [sect1]
 a=1
 b=2

 ;!defaults sect1

 [sect2]
 a=10

 [sect3]
 c=1

 [sect4]
 a=0
 b=(nil)

Will result in:

 {
   "sect1": {"a": 1, "b": 2},
   "sect2": {"a": 10, "b": 2},
   "sect3": {"a": 1, "b": 2, "c": 1},
   "sect4": {"a": 0, "b": null}
 }
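The "fill unless already specified" rule in this example amounts to a shallow copy of the defaults section that explicit assignments then override. A minimal Python sketch, with the hypothetical helper name new_section and with sections modeled as plain dictionaries (parsing is out of scope here):

```python
def new_section(explicit, defaults=None):
    """Build a section from its explicit parameters plus optional defaults.

    Values from `defaults` are copied first, then every explicitly
    specified parameter replaces the default of the same name.
    """
    section = dict(defaults or {})  # copy, so the defaults section is untouched
    section.update(explicit)
    return section


if __name__ == "__main__":
    sect1 = {"a": 1, "b": 2}
    print(new_section({"a": 10}, sect1))
    print(new_section({"c": 1}, sect1))
    print(new_section({"a": 0, "b": None}, sect1))
```

Note that an explicit value replaces the default outright; it does not combine with it into an array the way two explicit assignments to the same name would.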

Another, more meaty example:

 [-defaults]
 quota=1000
 ftp=1
 shell=1
 mysql=0

 ;!defaults -defaults

 ; double quota for this user
 [user1]
 quota=2000

 ; disable ftp for this user
 [user2]
 ftp=0

 ; all admin users have unlimited quota
 [-admins]
 quota=-1

 ;!defaults -admins

 [admin1]

 ; this admin cannot use shell
 [admin2]
 shell=0

 ;no more defaults
 ;!nodefaults

Defaults to the same section results in an error.

To end defaults, you can use the !nodefaults directive.

!defaults is actually the same as !merge, except that !merge allows an optional merging mode.

!nodefaults

Remove !defaults effect, if one has been specified previously.

!merge <SECTION> [MODE]

Specify that from now on, section will merge SECTION. An optional MODE is allowed to do mode-merging, as specified by Data::ModeMerge (an implementation which does not support it can disallow mode merging). If there is a !defaults directive in effect, defaults should be applied first. To end merging, you can use the !nomerge directive. Example:

 [defaults]
 d=4

 ;!defaults defaults

 [s1]
 a=1
 b=2
 d=4

 ;!merge s1 +
 [s2]
 a=10
 c=30

 ;!merge s1 .
 [s3]
 a=20
 c=60

 ;!nomerge

 [s4]
 a=20

will result in:

 {
   "s1": {"a": 1, "b": 2, "d": 4},
   "s2": {"a": 11, "b": 20, "c": 30, "d": 8},  //(s1+defaults) [+] (s2+defaults)
   "s3": {"a": 110, "b": 2, "c": 60, "d": 44}, //(s1+defaults) [.] (s3+defaults)
   "s4": {"a": 20, "d": 4} // no more merging
 }

!nomerge

Remove !merge effect, if one has been specified previously.

!sectionpath <PATH ...>

Specify that from now on, prefix the section with PATH. Example:

 [s1]
 a=1

 [s2]
 b=2

 ;!sectionpath s1
 [s2]
 c=3

 ;!sectionpath x y
 ;same thing: !sectionpath x/y
 [z]
 d=4

will result in:

 {
   "s1": {"a":1, "s2": {"c": 3}},
   "s2": {"b":2},
   "x": {"y": {"z": {"d":4}}}
 }
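The prefixing rule can be sketched in Python as list concatenation on section paths. The helper names effective_path and open_section are hypothetical, paths are modeled as lists of already-unquoted names, and the arguments of !sectionpath are assumed to be unquoted so that splitting on / is safe.

```python
def effective_path(prefix_args, section_path):
    """Prepend the !sectionpath prefix to a section's own path.

    `prefix_args` are the directive's arguments; "x y" and "x/y" are
    treated the same, so each argument is split on the path separator.
    """
    prefix = [part for arg in prefix_args for part in arg.split("/")]
    return prefix + list(section_path)


def open_section(root, path):
    """Return the nested dictionary for `path`, creating levels as needed."""
    node = root
    for name in path:
        node = node.setdefault(name, {})
    return node


if __name__ == "__main__":
    root = {}
    open_section(root, effective_path([], ["s1"]))["a"] = 1
    open_section(root, effective_path([], ["s2"]))["b"] = 2
    open_section(root, effective_path(["s1"], ["s2"]))["c"] = 3
    open_section(root, effective_path(["x", "y"], ["z"]))["d"] = 4
    print(root)
```

Running the sketch on the four sections of the example builds the same nested structure as the result shown above.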

!nosectionpath

Remove !sectionpath effect, if one has been specified previously.

Section line

A section line introduces a section:

 "[" <ARGUMENT> ( "/" <ARGUMENT> )* "]"

In a section line, an argument is allowed to contain whitespace because it is enclosed in [].

Examples:

 [Section Name]
 ["quoted [] section name"]
 [""]

Indentation is allowed. Comment is allowed at the end:

 [foo] ; a comment

To write a section name with problematic characters (like "\n", "\0", "]", etc.), use quotes.

To specify a nested section, use the path separator character / between section names. A / inside an unquoted argument is also treated as a path separator. Examples:

 [databases/mysql]
 [databases/sqlite 3]
 ["databases"/"mysql"]
 ["databases" / "sqlite 3"]
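Splitting the text between [ and ] into path components can be sketched in Python with a small tokenizer. The function name split_section_path is hypothetical. The sketch assumes that whitespace around each component is not significant, and it leaves escape sequences inside quoted components undecoded; extracting the bracketed text from the line is not covered.

```python
import re

# One path component: either a double-quoted string (backslash escapes are
# skipped over but not decoded) or a run of characters without "/" or '"',
# followed by a "/" separator or the end of the text.
_COMPONENT = re.compile(r'\s*(?:"((?:[^"\\]|\\.)*)"|([^/"]+?))\s*(/|$)')


def split_section_path(inner: str):
    """Split the text inside [...] into a list of section names."""
    parts = []
    pos = 0
    while True:
        match = _COMPONENT.match(inner, pos)
        if not match:
            raise ValueError(f"invalid section path: {inner!r}")
        quoted, bare, separator = match.groups()
        parts.append(quoted if quoted is not None else bare)
        pos = match.end()
        if separator == "":
            return parts


if __name__ == "__main__":
    for text in ['databases/mysql', 'databases/sqlite 3',
                 '"databases"/"mysql"', '"databases" / "sqlite 3"']:
        print(text, "->", split_section_path(text))
```

Each returned list can then be walked level by level to find or create the nested section.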

Non-contiguous sections are allowed; their parameters are set or added as if the section were written contiguously, e.g.:

 [sect1]
 a=1

 [sect2]
 a=1

 [sect1]
 a=2
 b=3

will result in sect1 containing a as [1, 2] and b as 3. However, note:

 [sect1]
 a=1

 ;!defaults sect1
 [sect2]
 d=4

 [sect1]
 b=2

 [sect3]
 c=3

sect2 will contain {"a":1, "d":4} since at the point of parsing sect2, sect1 only contains {"a":1}. However, sect3 will contain {"a":1, "b":2, "c":3} since at the point of parsing sect3, sect1 has already become {"a":1, "b":2}.

Parameter line

Parameter lines specify name value pairs:

 <NAME-ARGUMENT> "=" <VALUE-ARGUMENT>

Unquoted arguments are allowed to contain whitespace since = is used as the separator. Indentation is allowed, as is whitespace around =. Trailing whitespace should be trimmed, i.e. it is not part of the parameter value. A comment is also allowed at the end. To specify a parameter name or value that contains problematic characters, use quoting.

Examples:

 Parameter name = value
 name = value ; a comment
 "Parameter name"="value" ; same thing
 Parameter2="value ; this is not a comment and part of value"
 "name containing trailing spaces "="value containing trailing spaces  "
 x="\0"

Specifying several parameters with the same name will create an array. Example:

 a=1
 a=2

will result in:

 {"a": [1, 2]}
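One way to implement this rule is to collect every value per name and collapse single-element lists at the end. The Python sketch below uses the hypothetical function name collect_params and takes already-parsed (name, value) pairs as input. Collecting first avoids confusing a parameter assigned twice with a parameter assigned once to an array value.

```python
def collect_params(pairs):
    """Turn (name, value) pairs into a mapping, building arrays for repeats."""
    seen = {}
    for name, value in pairs:
        seen.setdefault(name, []).append(value)
    # A name assigned once keeps its plain value; repeats become an array.
    return {
        name: values[0] if len(values) == 1 else values
        for name, values in seen.items()
    }


if __name__ == "__main__":
    print(collect_params([("a", 1), ("a", 2)]))
    print(collect_params([("a", [1])]))
```

The same bookkeeping covers non-contiguous sections if the pairs of every occurrence of a section are fed into one shared collection.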

To specify expression value, use unquoted argument enclosed with (). Examples:

 a=(1+1) ; -> 2
 a="(1+1)" ; -> a string literal because of quotes, not an expression

See "Expression" for more details on expressions. Using an expression, you can specify an array with zero or one element:

 a = ( [] )
 a = ( [1] )

Normally a parameter line should occur below a section line, so that the parameter belongs to that section. But a parameter line is also allowed before any section line, in which case it will belong to a section called DEFAULT (configurable via the parser's default_section attribute).

Expression

Currently expressions are specified using Text::Xslate::Syntax::Kolon syntax. Parameters in the current section are available as normal variables, for example:

 [section]
 a=1
 b=($a + 1)

Parameters from other sections are available through the $ROOT variable, example:

 [s1]
 a=3

 [s2/s3]
 b=4

 [s4]
 c = ($ROOT["s1"]["a"] + $ROOT["s2"]["s3"]["b"]) ; -> 7

Unsupported features

Some INI implementations support other features, and listed below are those unsupported by IOD, usually because the features are not popular or too incompatible with regular INI syntax:

  • Line continuation for multiline value

     param=line 1 \
     line 2\
     line 3

    Supported by Config::IniFiles. In IOD, use quoting:

     param="line 1 \nline 2\nline 3"

  • Heredoc syntax for array

     param=<<EOT
     value1
     value2
     EOT

    Supported by Config::IniFiles. In IOD, use multiple assignment or expression:

     param=value1
     param=value2

    or:

     param=(["value1", "value2"])

GUIDELINES FOR IMPLEMENTATIONS

An implementation can provide options to turn off some features. In general, to make an IOD configuration file not context-dependent, a turned-off feature should cause parsing to fail, to notify users that the feature is not available. A turned-off feature should not just silently make parsing behave differently. For example, if file inclusion is turned off, this line:

 ;!include somefile.ini

should make parsing fail instead of continuing (without including the file).

An exception is when an implementation provides an explicit option to ignore certain features. For example, an implementation might provide an option to forbid expressions, or to turn off expression parsing and parse expressions as literals.

Below are guidelines on what parsing options an implementation can provide:

  • whether comment character # is allowed

    Some INI parsers (like PHP 5.3 or earlier) only recognize ; as the comment character.

  • whether indented comment line is allowed

    Some INI parsers do not allow this.

  • whether comment at the end of parameter line is allowed

    Some INI parsers parse comments at the end of parameter line, some do not (they assume it is part of parameter value). This option can avoid ambiguities by forbidding such comments.

  • whether comment at the end of section line is allowed

    Some INI parsers parse comments at the end of a section line, some do not. This option can avoid ambiguities by forbidding such comments.

  • whether indented section line is allowed

    Some INI parsers do not allow indented section line.

  • whether indented parameter line is allowed

    Most parsers that I know of allow indented parameter lines, but some might not.

  • whether expression is allowed

    Expression is an IOD extension. If user wants her configuration to be more INI-compatible, this option can be turned off.

  • whether the !include directive is allowed

    File inclusion is an IOD extension.

  • whether the !defaults directive is allowed

    Defaults is an IOD extension. If user wants her configuration to be more INI-compatible, this option can be turned off.

  • whether the !merge directive is allowed

    Merging between sections is an IOD extension.

  • whether mode merging is allowed

    Mode-merging is specified and implemented by Data::ModeMerge, which might not yet be available in other languages.

  • whether discontiguous section is allowed

    Most INI parsers allow this. Example:

     [s1]
     v1=1
    
     [s2]
     x=y
    
     [s1]
     v2=2

FAQ

What are the downsides of IOD format?

  • Currently only has Perl parser (Config::IOD)

    INI parsers exist everywhere though, so some of the time you can fall back to INI. It is also not terribly hard to write implementations in other languages.

  • You need to learn another minilanguage for expressions

AUTHOR

Steven Haryanto <stevenharyanto@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2012 by Steven Haryanto.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.