YAMLish

From Test Anything Protocol

YAMLish is a small subset of YAML that TAP producers may use to embed machine-readable information in TAP diagnostics. See TAP diagnostic syntax for information about how YAMLish is embedded in TAP.

Objectives

The main objectives for YAMLish are

  • small - the Perl reader is around 124 lines, 258 lines for the parser
  • portable - it should be reasonably easy to implement YAMLish in any language
  • able to encode arbitrary data structures
  • verifiable - it should be relatively easy to test that a YAMLish implementation conforms
  • JSON compatible - YAMLish should be a superset of JSON to allow TAP producers to make use of JSON libraries (objective not met)

Syntax

To avoid the burden of distributing a complete YAML parser with a TAP producer or consumer, YAMLish confines itself to a subset of YAML syntax.

These examples demonstrate the supported syntax.

All YAMLish documents must begin with '---' and end with a line containing '...'.

   --- Simple scalar
   ...

Unprintable characters are represented using standard escapes in double quoted strings.

   --- "\t\x01\x02\n"
   ...
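
As a rough illustration of the escaping rule above, here is a minimal Python sketch of a helper that writes a string as a double-quoted scalar. The name quote_scalar and the exact set of named escapes are assumptions made for this example, not part of any published YAMLish library; characters above 0x7F are passed through unchanged, which may or may not suit a given consumer.

```python
# Hypothetical helper for illustration only.
# Assumption: backslash, double quote, tab, newline and carriage return get
# named escapes; every other control character is written as \xNN.
_NAMED_ESCAPES = {
    "\\": "\\\\",
    '"': '\\"',
    "\t": "\\t",
    "\n": "\\n",
    "\r": "\\r",
}


def quote_scalar(text: str) -> str:
    """Return text as a double-quoted scalar with control characters escaped."""
    out = []
    for ch in text:
        if ch in _NAMED_ESCAPES:
            out.append(_NAMED_ESCAPES[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            out.append("\\x%02x" % ord(ch))
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'


# Reproduces the scalar from the example above.
print("--- " + quote_scalar("\t\x01\x02\n"))
print("...")
```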

Arrays and hashes are represented as follows

   ---
     - "This"
     - "is"
     - "an"
     - "array"
   ...
   ---
     This: is
     a: hash
   ...

Hash keys may be double quoted strings and may contain unprintable characters

   ---
     "\t\x00": 'My key is <tab><nul>'
     "Now is the time": "t'was ever thus"
   ...
   

Structures may nest arbitrarily

   ---
     -
       name: 'Hash one'
       value: 1
     -
       name: 'Hash two'
       value: 2
   ...

Undef is a tilde

   --- ~
   ...
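
The syntax shown so far is small enough that a writer can be sketched in a few dozen lines. The following Python sketch is one possible approach, not a reference implementation: the function names are hypothetical, strings and hash keys are always double-quoted for simplicity, numbers are written bare, and empty containers are written as [] or {}, all of which are assumptions. Booleans and multi-line block scalars are not handled.

```python
def _quote(text):
    # Always double-quote strings; escape backslash, quote and control characters.
    specials = {"\\": "\\\\", '"': '\\"', "\t": "\\t", "\n": "\\n"}
    return '"' + "".join(
        specials.get(ch) or (ch if ord(ch) >= 0x20 else "\\x%02x" % ord(ch))
        for ch in text
    ) + '"'


def _scalar(value):
    if value is None:
        return "~"              # undef is a tilde
    if isinstance(value, str):
        return _quote(value)
    return str(value)           # assumption: numbers (and empty [] / {}) are written bare


def _emit_entry(prefix, item, indent, lines):
    if isinstance(item, (dict, list)) and item:
        # Non-empty containers go on the following lines, indented further.
        lines.append(prefix)
        _emit(item, indent + 2, lines)
    else:
        lines.append(prefix + " " + _scalar(item))


def _emit(value, indent, lines):
    pad = " " * indent
    if isinstance(value, dict):
        for key, item in value.items():
            _emit_entry(pad + _quote(str(key)) + ":", item, indent, lines)
    else:  # list
        for item in value:
            _emit_entry(pad + "-", item, indent, lines)


def to_yamlish(value):
    """Serialize nested dicts, lists, strings, numbers and None."""
    if isinstance(value, (dict, list)) and value:
        lines = ["---"]
        _emit(value, 2, lines)
    else:
        lines = ["--- " + _scalar(value)]
    lines.append("...")
    return "\n".join(lines)


print(to_yamlish([{"name": "Hash one", "value": 1},
                  {"name": "Hash two", "value": 2}]))
```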

Root Namespace

When used with TAP the root element of an embedded YAMLish diagnostic is a hash containing keys from this set:

  • message - A textual message giving more detail about the failure (or success).
  • severity - The severity of the problem.
  • source - A URI describing the source of the TAP. This can be a file URL. See "file" for a special case.
  • datetime - The time the test was executed, in ISO 8601 or HTTP date format. This helps test runners do interesting things like run tests in order of most-recently-failed.
  • file - A filename representing the TAP source; really a special case of "source". Not possible for all TAP sources, but it spares producers from having to use file URIs.
  • line - The line number of the TAP source from which this test was produced. Not possible for all TAP sources.
  • name - Name of this test, if any.
  • extensions - A place to put any non-standard keys without worrying about conflicts with future standard keys.
  • actual - For comparison tests, what you got.
  • expected - For comparison tests, what you expected.
  • display - Suggested text to display representing this failure.
  • dump - A hash of variables to be pretty-printed by the harness.
  • error - An error or exception object.
  • backtrace - A stack backtrace in the case of an error or exception.

(please feel free to add to this list - it's provisional at the moment)
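
To illustrate how a producer might respect this key set, here is a small Python sketch that keeps the listed keys at the root and moves everything else under "extensions". The function name normalize_diagnostic is hypothetical, and treating "extensions" as a nested hash is an assumption; the list above only says it is a place for non-standard keys.

```python
# The provisional root keys listed above.
STANDARD_KEYS = frozenset([
    "message", "severity", "source", "datetime", "file", "line", "name",
    "extensions", "actual", "expected", "display", "dump", "error",
    "backtrace",
])


def normalize_diagnostic(diag):
    """Return a copy of diag with non-standard keys moved under 'extensions'."""
    result = {}
    extras = {}
    for key, value in diag.items():
        if key in STANDARD_KEYS:
            result[key] = value
        else:
            extras[key] = value
    if extras:
        # Assumption: 'extensions' is itself a hash of extra keys.
        merged = dict(result.get("extensions") or {})
        merged.update(extras)
        result["extensions"] = merged
    return result


print(normalize_diagnostic({
    "message": "values differ",
    "actual": "this",
    "expected": "that",
    "retries": 3,   # not in the standard set
}))
```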

Implementations

Because YAMLish is a subset of YAML, parsers that accept it already exist in a number of languages. It's also quite likely that existing YAML producers can be coerced into producing YAMLish-compliant YAML. Please be careful, though, to ensure that your YAMLish producer does in fact conform to the subset defined here. Just because your YAML happens to work with a particular test harness doesn't mean that it's valid YAMLish.

YAMLish is based on the subset of YAML supported by Adam Kennedy's YAML::Tiny Perl module. YAML::Tiny doesn't support quoted hash keys, which we need so that we can safely round-trip arbitrary data structures, so YAMLish extends Adam's de facto subset to include these.

If your concern is only to produce well-formed TAP (rather than to parse it), you should find that it's possible to implement a YAMLish writer in a couple of hundred lines of code.

Perl

  • TAP::Parser implements YAMLish support. You'll need the version from the Subversion repository, though; YAMLish support hasn't yet made it to CPAN.
  • Data::YAML is essentially the YAMLish engine from TAP::Parser packaged as a standalone module

PHP

  • YAMLishWriter is a simple PHP implementation of a YAMLish encoder

If you have a YAMLish implementation please list it here.

Q&A

Why YAML?

TAP diagnostics require a way to represent data structures from any language in a form that is both human and machine readable. It would be nice if we didn't have to write our own format. YAML, like TAP, is designed to be both human and machine readable as well as language independent. YAML generators and parsers already exist in many languages. YAML has already solved the hard problems facing a data serialization format (like character sets).


Why not JSON?

JSON was considered, and it has some of the characteristics of YAML, but it was ultimately rejected for several reasons.

JSON is, effectively, a subset of YAML. If your producer emits JSON then a YAML parser will read it. The converse is not true: a JSON parser will not read YAML.

JSON is more verbose and less human readable, and it requires more quoting. For example:

 # YAML
 ---
 got:      this
 expected: that
 ...
 
 # JSON
 {
   "got":      "this",
   "expected": "that"
 }

JSON lacks a WYSIWYG multi-line scalar value format. YAML has several. The | style allows the exact text to be presented, newlines and all. The > style "soft wraps" text to prevent long lines from spilling across the screen.

 # YAML
 ---
 got: >
    When in the course of human events,
    blah blah blah
 expected: >
    When, in the course of human events,
    it becomes necessary for one people to
    dissolve the political bonds which have 
    connected them with another...
 ...
 # JSON
 {
   "got":      "When in the course of human events, blah blah blah",
   "expected": "When, in the course of human events, it becomes necessary for one people to dissolve the political bonds which have connected them with another..."
 }
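
The relationship between the folded YAML text and the single-line JSON strings above can be approximated in a few lines of Python. This is a deliberately simplified sketch: the function name fold is hypothetical, and real YAML folding also keeps a trailing newline and treats blank or more-indented lines specially, none of which is modelled here.

```python
def fold(block: str) -> str:
    """Join the non-empty lines of an indented block with single spaces."""
    parts = [line.strip() for line in block.splitlines() if line.strip()]
    return " ".join(parts)


got = """
    When in the course of human events,
    blah blah blah
"""
print(fold(got))
```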

Why the --- and ... markers?

With the diagnostics indented to indicate they're diagnostics, why the --- and ... markers? TAP producers tend to spit a lot of junk to STDOUT, either explicitly as poorly written comments or accidentally because the thing they're testing prints to STDOUT. We don't want just any old indented text to be parsed, so we put the --- and ... markers around it. The --- indicates the start of a block. The ... indicates that it has ended, so the parser does not have to wait for the next test line (which could take a while) to know that no more diagnostics are forthcoming for the previous test.
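
The role of the markers can be sketched from the consumer's side. The Python below is a minimal illustration, not how any particular harness works: the function name extract_yamlish_blocks is hypothetical, only spaces are treated as indentation, and a block that is interrupted by a less-indented or blank line before its ... marker is simply discarded.

```python
def extract_yamlish_blocks(tap_lines):
    """Collect indented blocks that start with '---' and end with '...'."""
    blocks = []
    current = None   # lines of the block being collected, or None
    indent = 0       # indentation of the opening '---' line
    for raw in tap_lines:
        line = raw.rstrip("\r\n")
        if current is not None:
            if line.startswith(" " * indent):
                text = line[indent:]
                current.append(text)
                if text == "...":
                    # End marker seen: the block is complete right away.
                    blocks.append(current)
                    current = None
                continue
            current = None  # block was never terminated: discard it
        body = line.lstrip(" ")
        # Only an indented line beginning with '---' opens a block;
        # other indented output is ignored.
        if body != line and body.startswith("---"):
            indent = len(line) - len(body)
            current = [body]
    return blocks


tap = [
    "1..2",
    "ok 1 - first",
    "    some stray indented output",
    "not ok 2 - second",
    "  ---",
    "  message: 'values differ'",
    "  got: this",
    "  ...",
]
for block in extract_yamlish_blocks(tap):
    print("\n".join(block))
```

Because the block is complete as soon as the ... line is read, the caller can hand it to a YAMLish reader without waiting for the next test line.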
