The OA Community Group's February 2013 Open Annotation Data Model (this document) has been superseded by the following W3C Web Annotation Working Group Candidate Recommendations (July 2016):
Implementers are encouraged to begin using the newer specifications as soon as practicable.
Although the Open Annotation data model does not specify how Annotations should be transferred between systems at a network protocol level, there are some issues regarding publication in general that it must deal with for interoperability. These include how to embed resources within an Annotation rather than referencing them by external URIs, expressing equivalence between resources to assist with deduplication between multiple systems, and how to express an Annotation using a Named Graph structure.
The serialization of the Annotation MAY be in any format capable of expressing the RDF graph. It MAY be embedded within other resources, such as using RDFa to embed the Annotation within a web page.
If the Annotation has an HTTP URI, then a representation of the Annotation MUST be returned in an appropriate graph serialization format when that URI is dereferenced. When the serialization is embedded within other resources, such as when expressed in RDFa, this HTTP URI MUST continue to be expressed in the serialization. If the Annotation is not available from any dereferenceable URI, but only embedded within a containing resource, then it MUST have a globally unique URN identifier such as a UUID or tag URI.
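The identifier rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `annotation_identifier` is hypothetical, and a random UUID URN is chosen here although a tag URI would satisfy the requirement equally well.

```python
import uuid
from typing import Optional


def annotation_identifier(http_uri: Optional[str] = None) -> str:
    """Return the identifier to use for an Annotation.

    Hypothetical helper: keeps a dereferenceable HTTP URI when one exists,
    otherwise mints a globally unique UUID URN.
    """
    if http_uri:
        return http_uri
    # uuid4() is random-based; .urn renders it as "urn:uuid:<uuid>"
    return uuid.uuid4().urn


published = annotation_identifier("http://www.example.org/annotations/anno1")
embedded_only = annotation_identifier()
```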
The RECOMMENDED serialization format is JSON-LD. This is to enable web-browser based implementations to easily consume Annotations using tools and methods familiar to developers.
The Context presented below is RECOMMENDED to ensure consistency between implementations, and can be referenced as http://www.w3.org/ns/oa-context-20130208.json.
It is RECOMMENDED to support content negotiation for other serialization formats, including especially RDF/XML and Turtle.
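As a rough sketch of content negotiation, a server could pick a serialization from the request's Accept header as shown below. The function name `choose_serialization` is hypothetical, the parsing is deliberately simplified (exact media types and `*/*` only), and falling back to JSON-LD when nothing matches is an assumption of this sketch rather than a requirement of the model.

```python
from typing import List, Sequence, Tuple

# Media types for the serializations named above; JSON-LD comes first because
# it is the RECOMMENDED format.
SUPPORTED_FORMATS = ("application/ld+json", "text/turtle", "application/rdf+xml")


def choose_serialization(accept_header: str,
                         supported: Sequence[str] = SUPPORTED_FORMATS) -> str:
    """Pick a media type for the response (simplified Accept handling)."""
    ranked: List[Tuple[float, int, str]] = []
    for position, item in enumerate(accept_header.split(",")):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # unparsable weight: treat as not acceptable
        if media_type and quality > 0:
            # Sort by descending quality, then by order of appearance.
            ranked.append((-quality, position, media_type))

    for _, _, media_type in sorted(ranked):
        if media_type in supported:
            return media_type
        if media_type == "*/*":
            return supported[0]
    # Assumption of this sketch: fall back to JSON-LD instead of refusing.
    return supported[0]
```

A stricter server might answer with an HTTP 406 status instead of falling back; that choice is left to the implementation.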
{
  "@context": {
    "oa":      "http://www.w3.org/ns/oa#",
    "cnt":     "http://www.w3.org/2011/content#",
    "dc":      "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "dctypes": "http://purl.org/dc/dcmitype/",
    "foaf":    "http://xmlns.com/foaf/0.1/",
    "rdf":     "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs":    "http://www.w3.org/2000/01/rdf-schema#",
    "skos":    "http://www.w3.org/2004/02/skos/core#",

    "hasBody":      {"@type": "@id", "@id": "oa:hasBody"},
    "hasTarget":    {"@type": "@id", "@id": "oa:hasTarget"},
    "hasSource":    {"@type": "@id", "@id": "oa:hasSource"},
    "hasSelector":  {"@type": "@id", "@id": "oa:hasSelector"},
    "hasState":     {"@type": "@id", "@id": "oa:hasState"},
    "hasScope":     {"@type": "@id", "@id": "oa:hasScope"},
    "annotatedBy":  {"@type": "@id", "@id": "oa:annotatedBy"},
    "serializedBy": {"@type": "@id", "@id": "oa:serializedBy"},
    "motivatedBy":  {"@type": "@id", "@id": "oa:motivatedBy"},
    "equivalentTo": {"@type": "@id", "@id": "oa:equivalentTo"},
    "styledBy":     {"@type": "@id", "@id": "oa:styledBy"},
    "cachedSource": {"@type": "@id", "@id": "oa:cachedSource"},
    "conformsTo":   {"@type": "@id", "@id": "dcterms:conformsTo"},
    "default":      {"@type": "@id", "@id": "oa:default"},
    "item":         {"@type": "@id", "@id": "oa:item"},
    "first":        {"@type": "@id", "@id": "rdf:first"},
    "rest":         {"@type": "@id", "@id": "rdf:rest", "@container": "@list"},

    "chars":        "cnt:chars",
    "bytes":        "cnt:bytes",
    "format":       "dc:format",
    "annotatedAt":  "oa:annotatedAt",
    "serializedAt": "oa:serializedAt",
    "when":         "oa:when",
    "value":        "rdf:value",
    "start":        "oa:start",
    "end":          "oa:end",
    "exact":        "oa:exact",
    "prefix":       "oa:prefix",
    "suffix":       "oa:suffix",
    "label":        "rdfs:label",
    "name":         "foaf:name",
    "mbox":         "foaf:mbox",
    "styleClass":   "oa:styleClass"
  }
}
Its use results in serializations similar to the two examples below:
{
  "@context": "http://www.w3.org/ns/oa-context-20130208.json",
  "@type": "oa:Annotation",
  "hasBody": "http://www.example.org/body1",
  "hasTarget": "http://www.example.org/target1"
}
{
  "@context": "http://www.w3.org/ns/oa-context-20130208.json",
  "@id": "http://www.example.org/annotations/anno1",
  "@type": "oa:Annotation",
  "annotatedAt": "2012-11-10T09:08:07",
  "annotatedBy": {
    "@id": "http://www.example.org/people/person1",
    "@type": "foaf:Person",
    "mbox": { "@id": "mailto:person1@example.org" },
    "name": "Person One"
  },
  "hasBody": {
    "@id": "urn:uuid:1d823e02-60a1-47ae-ae7f-a02f2ac348f8",
    "@type": ["cnt:ContentAsText", "dctypes:Text"],
    "chars": "This is part of our logo"
  },
  "hasTarget": {
    "@id": "urn:uuid:cc2c8f08-3597-4d73-a529-1c5fed58268b",
    "@type": "oa:SpecificResource",
    "hasSelector": {
      "@id": "urn:uuid:7978fa7b-3e03-47e2-89d8-fa39d1280765",
      "@type": "oa:FragmentSelector",
      "conformsTo": "http://www.w3.org/TR/media-frags/",
      "value": "xywh=10,10,5,5"
    },
    "hasSource": {
      "@id": "http://www.example.org/images/logo.jpg",
      "@type": "dctypes:Image"
    }
  }
}
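For implementers producing such documents programmatically, the following minimal sketch builds the simpler of the two serializations with Python's standard `json` module. The helper name `make_annotation` is hypothetical; only the context URI and the term names come from the Context above.

```python
import json
from typing import Optional

OA_CONTEXT = "http://www.w3.org/ns/oa-context-20130208.json"


def make_annotation(body_uri: str, target_uri: str,
                    annotation_uri: Optional[str] = None) -> dict:
    """Build a minimal JSON-LD Annotation that references its Body and Target."""
    annotation = {"@context": OA_CONTEXT}
    if annotation_uri is not None:
        annotation["@id"] = annotation_uri
    annotation["@type"] = "oa:Annotation"
    # hasBody and hasTarget are typed as "@id" in the Context, so plain
    # strings are interpreted as URIs rather than as literals.
    annotation["hasBody"] = body_uri
    annotation["hasTarget"] = target_uri
    return annotation


doc = make_annotation("http://www.example.org/body1",
                      "http://www.example.org/target1")
print(json.dumps(doc, indent=2))
```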
The Open Annotation Core describes how to embed textual bodies within an Annotation; however, it is frequently useful to also embed Selectors, Styles and potentially even Targets to ensure that the representation is available.
The web architecture assumes that every resource has a URI, and furthermore the expectation is that they are available for retrieval from HTTP URIs. Some clients, however, may not be able to generate dereferenceable URIs on their own for all of the resources that are created as part of the annotation process. This includes the Body, any Specifiers, Styles or other user-generated information, but also potentially the Target if it is not available online.
It is important to have a model that deals gracefully and consistently with both online, dereferenceable resources, and embedded resources. In both cases, the content is expressed as a resource, rather than using only a string literal, as motivated in the section on embedded textual bodies. Both cases must also deal with any content type, including binary data, and deal with any class of resource within the Open Annotation model: Body, Target, Style, Selector, or State.
The Open Annotation model uses the Representing Content in RDF specification to include the representation of such resources directly within the Annotation graph. The resource SHOULD be assigned a non-resolvable URN, and an appropriate class from the Content in RDF ontology, such as cnt:ContentAsText or cnt:ContentAsBase64. If identity of the resource is not considered to be important, then an RDF blank node MAY be used instead of the URN.
For information about embedding serializations of RDF graphs within the Annotation, please see Embedding RDF Graphs.
| Vocabulary Item | Type | Description |
|---|---|---|
| cnt:ContentAsText | Class | The representation of a resource, expressed as plain text. |
| cnt:ContentAsBase64 | Class | The representation of a resource, expressed as Base 64 encoded text. |
| cnt:chars | Property | The property of a ContentAsText that contains the representation. There MUST be exactly 1 cnt:chars property for a ContentAsText resource. |
| cnt:bytes | Property | The property of a ContentAsBase64 that contains the representation. There MUST be exactly 1 cnt:bytes property for a ContentAsBase64 resource. |
| cnt:characterEncoding | Property | The character encoding of the content string in either cnt:chars or cnt:bytes. There SHOULD be exactly 1 cnt:characterEncoding for a ContentAsText or ContentAsBase64 resource. |
| dc:format | Property | The media type of the representation. There SHOULD be exactly 1 dc:format per embedded resource. |
<anno1> a oa:Annotation ;
    oa:hasBody <body1> ;
    oa:hasTarget <sptarget1> ;
    oa:styledBy <style1> .

<style1> a oa:CssStyle, cnt:ContentAsText ;
    cnt:characterEncoding "utf-8" ;
    cnt:chars ".red { color : red }" .

<sptarget1> a oa:SpecificResource ;
    oa:hasSource <source1> ;
    oa:styleClass "red" .
SELECT ?anno WHERE {
    ?anno oa:styledBy ?style .
    ?style a oa:CssStyle .
    ?style a cnt:ContentAsText
}
=> <anno1>
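The embedding pattern described in this section can be sketched as JSON-LD nodes built in Python. The helper names `embed_text` and `embed_binary` are hypothetical. The keys `chars`, `bytes` and `format` are terms from the RECOMMENDED Context; `cnt:characterEncoding` is written as a prefixed name because that Context defines no short term for it. Assigning a fresh UUID URN to every node is one option only, since a blank node MAY be used when identity is unimportant.

```python
import base64
import uuid
from typing import Optional


def embed_text(text: str, media_type: str = "text/plain") -> dict:
    """Describe textual content as a cnt:ContentAsText node."""
    return {
        "@id": uuid.uuid4().urn,        # non-resolvable URN
        "@type": "cnt:ContentAsText",
        "chars": text,                  # exactly one cnt:chars
        "format": media_type,           # dc:format
        "cnt:characterEncoding": "utf-8",
    }


def embed_binary(data: bytes, media_type: str,
                 character_encoding: Optional[str] = None) -> dict:
    """Describe arbitrary bytes as a cnt:ContentAsBase64 node."""
    node = {
        "@id": uuid.uuid4().urn,
        "@type": "cnt:ContentAsBase64",
        # exactly one cnt:bytes, holding the Base 64 encoded content
        "bytes": base64.b64encode(data).decode("ascii"),
        "format": media_type,
    }
    if character_encoding is not None:
        node["cnt:characterEncoding"] = character_encoding
    return node


style = embed_text(".red { color : red }", "text/css")
# Placeholder bytes standing in for real image data.
image = embed_binary(b"\x89PNG\r\n\x1a\n", "image/png")
```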
One particular case of embedding resources within an Annotation is embedding statements expressed as RDF graphs. The triples that make up these resources MUST NOT be simply put into the Annotation graph, as the triples must remain distinguishable as to authorship and provenance. If it were done otherwise, the metadata and identifier of the Body or Target graph would be lost and subsumed in the Annotation's graph.
The simplest method is to publish the graph as any other resource with a dereferenceable HTTP URI, and refer to this URI in the Annotation. This method is RECOMMENDED.
If a single document is required and thus the graph must be embedded within the Annotation, then there are two possibilities:
dc:format
and the class given of trig:Graph
| Vocabulary Item | Type | Description |
|---|---|---|
| trig:Graph | Class | The class of Named Graphs |
{
    <anno1> a oa:Annotation ;
        oa:hasBody :graph1 ;
        oa:hasTarget <target1> .
    :graph1 a trig:Graph .
}

:graph1 {
    <target1> relationship <thing1> .
}
SELECT ?anno ?what WHERE {
    ?anno oa:hasBody ?g .
    ?anno oa:hasTarget ?t .
    GRAPH ?g { ?t relationship ?what }
}
=> <anno1>, <thing1>
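To illustrate why the embedded graph's triples are kept apart from the Annotation's own triples, here is a toy in-memory model in plain Python, not a real RDF store. The `Dataset` layout, the function name `statements_about_targets` and the predicate `ex:relationship` (standing in for the placeholder `relationship` above) are all assumptions made for this sketch.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
# Graph name -> triples; the key None stands for the default (Annotation) graph.
Dataset = Dict[Optional[str], Set[Triple]]

dataset: Dataset = {
    None: {
        ("anno1", "rdf:type", "oa:Annotation"),
        ("anno1", "oa:hasBody", "graph1"),
        ("anno1", "oa:hasTarget", "target1"),
        ("graph1", "rdf:type", "trig:Graph"),
    },
    # Triples of the embedded Body stay in their own named graph.
    "graph1": {("target1", "ex:relationship", "thing1")},
}


def statements_about_targets(data: Dataset, predicate: str) -> List[Tuple[str, str]]:
    """Mirror the query above: find what each Body graph says about a Target."""
    default_graph = data.get(None, set())
    results = []
    for anno, prop, graph_name in default_graph:
        if prop != "oa:hasBody":
            continue
        targets = {o for s, p, o in default_graph
                   if s == anno and p == "oa:hasTarget"}
        # Only look inside the named graph that is the Annotation's Body.
        for subject, pred, obj in data.get(graph_name, set()):
            if subject in targets and pred == predicate:
                results.append((anno, obj))
    return sorted(results)


matches = statements_about_targets(dataset, "ex:relationship")
```

Because the Body's statement lives only under the key `"graph1"`, its authorship and provenance remain distinguishable from those of the Annotation.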
Although it is not a challenge unique to Annotations, the difficulty of deduplicating resources that have been syndicated between systems is greatly reduced by expressing the equivalence between multiple copies, or very close derivatives. A system that could not discover duplicate Annotations would naïvely present all of them, resulting in a very poor user experience. Systems that generate statistics, reputation models, spam filtering for annotations and similar would also produce very poor results without this capability. Given these requirements, the Open Annotation model includes a relationship to assert that while two resources are not absolutely identical, they are equivalent and hence should not both be maintained and processed separately.
If a system retrieves an Annotation and republishes it at a different HTTP URI, then it SHOULD express the oa:equivalentTo relationship between the original Annotation and the republished one. The system then SHOULD update the oa:serializedAt and oa:serializedBy properties, as the graph has changed by adding the oa:equivalentTo relationship.
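The republishing behaviour described above might look like the following sketch, which operates on a JSON-LD document that uses the RECOMMENDED Context. The function name `republish` and its parameters are hypothetical, and the sketch assumes that any existing `equivalentTo` value is a URI string or a list of URI strings.

```python
import copy
from datetime import datetime, timezone
from typing import Optional


def republish(annotation: dict, new_uri: str, serializer_uri: str,
              now: Optional[datetime] = None) -> dict:
    """Return a copy of the Annotation as republished at new_uri.

    `now` is expected to be a UTC datetime; it defaults to the current time.
    """
    republished = copy.deepcopy(annotation)
    original_uri = annotation.get("@id")
    republished["@id"] = new_uri

    # Keep any equivalences already asserted and add the original URI.
    equivalents = republished.get("equivalentTo", [])
    if isinstance(equivalents, str):
        equivalents = [equivalents]
    if original_uri is not None and original_uri not in equivalents:
        equivalents = equivalents + [original_uri]
    if equivalents:
        republished["equivalentTo"] = (
            equivalents[0] if len(equivalents) == 1 else equivalents
        )

    # The graph has changed, so the serialization metadata is refreshed.
    moment = now or datetime.now(timezone.utc)
    republished["serializedAt"] = moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    republished["serializedBy"] = serializer_uri
    return republished
```

The original document is left untouched, so a system can keep both versions and compare their serialization metadata.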
Embedded resources SHOULD be treated in the same way when republished with their own HTTP URIs. If a system publishes an embedded resource at a new HTTP URI, then it SHOULD express the oa:equivalentTo relationship between the resource's URN and the new URI from which it is available. If the embedded resource is conveyed as a blank node, then the Skolemization technique described in RDF Concepts 1.1 SHOULD be used. The system MAY also remove the embedded resource from the graph and reference only the dereferenceable URI, at its discretion. When this occurs it is possible for the Annotation's graph to change significantly from the initial version to its republished state, while still remaining equivalent.
| Vocabulary Item | Type | Description |
|---|---|---|
| oa:equivalentTo | Relationship [subProperty of prov:alternateOf] | The subject and object resources of the oa:equivalentTo relationship represent the same resource, but potentially have different metadata such as oa:serializedBy, oa:serializedAt and serialization format. oa:equivalentTo is a symmetrical and transitive relationship; if A oa:equivalentTo B, then it is also true that B oa:equivalentTo A; and if B oa:equivalentTo C, then it is also true that A oa:equivalentTo C. The Annotation MAY include 0 or more instances of the oa:equivalentTo relationship between copies of the Annotation or other resources, and SHOULD include as many as are available. |
<anno1> a oa:Annotation ;
    oa:hasBody <body1> ;
    oa:hasTarget <target1> ;
    oa:serializedAt "2012-12-12T12:12:12Z" ;
    oa:equivalentTo <anno2> .

<anno2> oa:serializedAt "2013-01-28T20:00:00Z" .
SELECT ?anno WHERE {
    <anno1> a oa:Annotation ;
        oa:equivalentTo ?anno
}
=> <anno2>
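Because oa:equivalentTo is symmetrical and transitive, a consuming system can group copies into equivalence classes before presenting or counting them. The following is a minimal sketch using a union-find structure; the function name `equivalence_classes` and the short identifiers are illustrative assumptions, and the model does not mandate this or any other algorithm.

```python
from typing import Dict, Iterable, List, Set, Tuple


def equivalence_classes(pairs: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group resources linked by oa:equivalentTo assertions.

    Each pair (a, b) stands for one asserted "a oa:equivalentTo b" triple.
    Symmetry and transitivity are applied implicitly by the union-find.
    """
    parent: Dict[str, str] = {}

    def find(node: str) -> str:
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    groups: Dict[str, Set[str]] = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


asserted = [("anno1", "anno2"), ("anno3", "anno2"), ("anno4", "anno5")]
classes = equivalence_classes(asserted)
```

A system could then present or process a single representative from each class, for example the copy with the most recent oa:serializedAt value.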