YAMLish
From Test Anything Protocol
YAMLish is a small subset of YAML that TAP producers may use to embed machine readable information in TAP diagnostics. See TAP diagnostic syntax for information about how YAMLish embeds in TAP.
Objectives
The main objectives for YAMLish are
- small - the Perl reader is around 124 lines, 258 lines for the parser
- portable - it should be reasonably easy to implement YAMLish in any language
- able to encode arbitrary data structures
- verifiable - it should be relatively easy to test that a YAMLish implementation conforms
- JSON compatible - YAMLish should be a super-set of JSON to allow TAP producers to make use of JSON libraries (objective not met)
Syntax
To avoid the burden of distributing a complete YAML parser with a TAP producer or consumer, YAMLish confines itself to a subset of YAML syntax.
These examples demonstrate the supported syntax.
All YAMLish documents must begin with '---' and end with a line containing '...'.
--- Simple scalar
...
Unprintable characters are represented using standard escapes in double quoted strings.
--- "\t\x01\x02\n"
...
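The escaping rule above could be sketched in Python roughly as follows. This is a minimal illustration, not the reference implementation: the helper name `quote_scalar` is hypothetical, and the exact set of named escapes (`\t`, `\n`, `\r`, `\"`, `\\`) is an assumption, since this page only says "standard escapes". Any other control character falls back to the `\xNN` form used in the example.

```python
# Hypothetical helper: render a string as a YAMLish double-quoted scalar.
# The choice of named escapes below is an assumption, not taken from a spec.
_NAMED_ESCAPES = {
    "\\": "\\\\",
    '"': '\\"',
    "\t": "\\t",
    "\n": "\\n",
    "\r": "\\r",
}


def quote_scalar(text: str) -> str:
    """Return text wrapped in double quotes with unprintables escaped."""
    out = []
    for ch in text:
        if ch in _NAMED_ESCAPES:
            out.append(_NAMED_ESCAPES[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            # Remaining control characters use the two-digit hex form.
            out.append("\\x%02x" % ord(ch))
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'


if __name__ == "__main__":
    print("--- " + quote_scalar("\t\x01\x02\n"))
    print("...")
```

Run as a script, this prints the same two-line document as the example above.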
Arrays and hashes are represented thus:
---
- "This"
- "is"
- "an"
- "array"
...
---
This: is
a: hash
...
Hash keys may be double quoted strings and may contain unprintable characters
---
"\t\x00": 'My key is <tab><nul>'
"Now is the time": "t'was ever thus"
...
Structures may nest arbitrarily
---
-
  name: 'Hash one'
  value: 1
-
  name: 'Hash two'
  value: 2
...
Undef is a tilde
--- ~
...
Root Namespace
When used with TAP the root element of an embedded YAMLish diagnostic is a hash containing keys from this set:
- message
- A textual message giving more detail about the failure (or success)
- severity
- The severity of the problem.
- source
- A URI describing the source of the TAP. This can be a file URL. See "file" for a special case.
- datetime
- The time the test was executed, which helps test runners do interesting things such as running tests in most-recently-failed order. ISO 8601 or HTTP date format.
- file
- A filename representing the TAP source, really a special case of "source". Not possible for all TAP sources, but I really don't want everyone to have to use file URIs.
- line
- The line number of the TAP source from which this test was produced. Not possible for all TAP sources.
- name
- Name of this test, if any.
- extensions
- A place to put any non-standard keys without worrying about conflicting with future ones
- actual
- For comparison tests, what you got.
- expected
- For comparison tests, what you expected.
- display
- Suggested text to display representing this failure
- dump
- A hash of variables to be pretty-printed by the harness
- error
- An error or exception object
- backtrace
- A stack backtrace in the case of an error or exception
(please feel free to add to this list - it's provisional at the moment)
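As a rough illustration of how a producer might respect this list, the sketch below keeps the listed keys at the root and moves everything else under "extensions". The names `STANDARD_KEYS` and `normalize` are hypothetical, the key set simply mirrors the provisional list above, and the sketch assumes that an existing "extensions" value is itself a hash.

```python
# Root keys copied from the provisional list on this page.
STANDARD_KEYS = frozenset({
    "message", "severity", "source", "datetime", "file", "line", "name",
    "extensions", "actual", "expected", "display", "dump", "error",
    "backtrace",
})


def normalize(diagnostic):
    """Return a copy with non-standard root keys moved under 'extensions'.

    Assumption: diagnostic is a dict, and any existing 'extensions'
    value is also a dict.
    """
    result = {k: v for k, v in diagnostic.items() if k in STANDARD_KEYS}
    extra = {k: v for k, v in diagnostic.items() if k not in STANDARD_KEYS}
    if extra:
        merged = dict(result.get("extensions") or {})
        merged.update(extra)
        result["extensions"] = merged
    return result


if __name__ == "__main__":
    print(normalize({"message": "values differ", "got": 1}))
```

Here the non-standard "got" key ends up inside "extensions" while "message" stays at the root.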
Implementations
Because YAMLish is a subset of YAML there are already a number of parsers in a number of languages that accept it. It's also quite likely that existing YAML producers can be coerced into producing YAMLish compliant YAML. Please be careful though to ensure that your YAMLish producer does in fact conform to the subset defined here. Just because your YAML happens to work with a particular test harness doesn't mean that it's valid YAMLish.
YAMLish is based on the subset of YAML supported by Adam Kennedy's YAML::Tiny Perl module. YAML::Tiny doesn't support quoted hash keys - which we need so that we can safely round-trip arbitrary data structures - so YAMLish extends Adam's de-facto subset to include these.
If your concern is only to produce well-formed TAP (rather than to parse it) then you should find that it's possible to implement a YAMLish writer in a couple of hundred lines of code.
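To give a feel for the size of such a writer, here is a minimal sketch in Python. It is not one of the implementations listed on this page, and it makes several assumptions that are labelled in the comments: every string (including hash keys) is double quoted, non-string keys are coerced with `str()`, empty collections are written as `[]` and `{}`, and booleans are rejected because this page does not define them. The function names are hypothetical.

```python
# Sketch of a YAMLish writer. Quoting policy and the handling of empty
# collections are assumptions, not rules stated on this page.
_ESCAPES = {"\\": "\\\\", '"': '\\"', "\t": "\\t", "\n": "\\n", "\r": "\\r"}


def _quote(text):
    """Double-quote text, escaping backslash, quote and control characters."""
    out = []
    for ch in text:
        if ch in _ESCAPES:
            out.append(_ESCAPES[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            out.append("\\x%02x" % ord(ch))
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'


def _scalar(value):
    """Render a leaf value (or an empty collection) on one line."""
    if value is None:
        return "~"  # undef is a tilde
    if isinstance(value, bool):
        raise TypeError("booleans are outside the syntax shown on this page")
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return _quote(value)
    if isinstance(value, dict):
        return "{}"  # assumption: only reached for an empty hash
    if isinstance(value, (list, tuple)):
        return "[]"  # assumption: only reached for an empty array
    raise TypeError("unsupported type: %s" % type(value).__name__)


def _emit(value, head, depth, lines):
    """Append lines for value; head is the '---', '-' or 'key:' before it."""
    pad = "  " * depth
    if isinstance(value, dict) and value:
        lines.append(head)
        for key, item in value.items():
            _emit(item, pad + _quote(str(key)) + ":", depth + 1, lines)
    elif isinstance(value, (list, tuple)) and value:
        lines.append(head)
        for item in value:
            _emit(item, pad + "-", depth + 1, lines)
    else:
        lines.append(head + " " + _scalar(value))


def dump(value):
    """Return value as a YAMLish document framed by '---' and '...'."""
    lines = []
    _emit(value, "---", 0, lines)
    lines.append("...")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(dump([{"name": "Hash one", "value": 1},
                {"name": "Hash two", "value": 2}]), end="")
```

Quoting every string costs some readability but avoids having to decide which bare scalars are safe, which keeps the writer short.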
Perl
- TAP::Parser implements YAMLish support. You'll need the version from the Subversion repository though; YAMLish support hasn't yet made it to CPAN.
- Data::YAML is essentially the YAMLish engine from TAP::Parser packaged as a standalone module
PHP
- YAMLishWriter is a simple PHP implementation of a YAMLish encoder
If you have a YAMLish implementation please list it here.
Q&A
Why YAML?
TAP diagnostics require a way to represent data structures in any language in a human and machine readable form. It would be nice if we didn't have to write our own format. YAML, like TAP, is designed to be both human and machine readable as well as language independent. YAML generators and parsers already exist in many languages. YAML has already solved the hard problems facing a data serialization format (like character sets).
Why not JSON?
JSON was considered, and it has some of the characteristics of YAML, but it was ultimately rejected for several reasons.
JSON is, effectively, a subset of YAML. If your producer emits JSON then a YAML parser will read it. The inverse is not true.
JSON is more verbose and less human readable, and it requires more quoting. For example:
# YAML
---
got: this
expected: that
...
# JSON
{
  "got": "this",
  "expected": "that"
}
JSON lacks a WYSIWYG multi-line scalar value format. YAML has several. The | style presents the exact text, newlines and all. The > style "soft wraps" text to prevent long lines from spilling across the screen.
# YAML
---
got: >
  When in the course of human events,
  blah blah blah
expected: >
  When, in the course of human events,
  it becomes necessary for one people to
  dissolve the political bonds which have
  connected them with another...
...
# JSON
{
  "got": "When in the course of human events, blah blah blah",
  "expected": "When, in the course of human events, it becomes necessary for one people to dissolve the political bonds which have connected them with another..."
}
Why the --- and ... markers?
With the diagnostics indented to indicate they're diagnostics, why the --- and ... markers? TAP producers tend to spit a lot of junk to STDOUT, either explicitly as poorly written comments or accidentally because the thing they're testing prints to STDOUT. We don't want just any old indented text to be parsed, so we put the --- and ... markers around it. The --- indicates the start of a block. The ... indicates that it has ended, so the parser does not have to wait for the next test line (which could take a while) to know that no more diagnostics are forthcoming for the previous test.
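The role of the markers can be illustrated with a small Python sketch that pulls YAMLish blocks out of a stream of TAP lines. The function name `yamlish_blocks` is hypothetical, and the rules it applies are assumptions made for illustration: a block opens on any space-indented line beginning with ---, closes on a line with the same indent containing only ..., and an unterminated block is silently dropped. Other indented text is ignored, which is the point made above.

```python
def yamlish_blocks(lines):
    """Yield each YAMLish block as a list of lines with the indent removed.

    Assumptions: blocks are indented with spaces, '---' opens a block and
    '...' at the same indent closes it; anything else is ignored.
    """
    block = None
    indent = ""
    for raw in lines:
        line = raw.rstrip("\n")
        stripped = line.lstrip(" ")
        if block is None:
            # Only an indented line starting with '---' opens a block.
            opens = stripped == "---" or stripped.startswith("--- ")
            if line != stripped and opens:
                indent = line[: len(line) - len(stripped)]
                block = [stripped]
        elif line.startswith(indent) and line[len(indent):] == "...":
            # The end marker lets us emit the block immediately.
            block.append("...")
            yield block
            block = None
        else:
            # Keep the content, dropping the block indent when present.
            block.append(line[len(indent):] if line.startswith(indent) else line)


if __name__ == "__main__":
    tap = [
        "not ok 1 - thing\n",
        "  ---\n",
        "  message: boom\n",
        "  ...\n",
        "    some stray indented output\n",
        "ok 2\n",
    ]
    for found in yamlish_blocks(tap):
        print(found)
```

Because the block is yielded as soon as ... is seen, a consumer does not need to wait for the "ok 2" line to know the diagnostics for test 1 are complete.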

