Workshop Report

Open Annotation Collaboration Spring 2011 Workshop Report

Initial draft and notes assembled by Jacob Jett & Kevin Trainor. Additional content and clarification added by Workshop participants.

I. Introduction

On March 24 & 25, 2011, the Open Annotation Collaboration (OAC) project held a one-and-a-half-day workshop at the Illini Center in downtown Chicago. The purpose of this workshop was to bring together scholars, librarians, and systems designers involved in ongoing digital content projects that use, or plan to implement, annotation tools and services. Participants were asked to talk about their annotation-related projects and to provide feedback to OAC on the Collaboration’s data modeling work to date.

The workshop began with an in-depth introduction to the OAC data model and ontology ([1]) for describing scholarly annotations of Web-accessible information resources. Following that, use cases involving a variety of scholarly annotation classes and target media types were presented to the attendees. Discussions of these use cases and of the OAC data model itself provided a number of action items for OAC to address and prepared attendees for OAC’s forthcoming Request for Proposals (RFP). Links to the presentations, along with notes on the presentations and subsequent discussions, are provided below.

A summary of workshop outcomes is provided at the end of this brief Workshop Report. It includes a list of items that need to be addressed before the Beta release of the OAC Data Model, a prioritized list of work to be done by the Collaboration over the next 12 months to encourage adoption, and an ordered list of issues of concern that the Collaboration needs to address.

II. Participants

Twenty-eight projects, initiatives, and institutions (32 individuals) were represented at the workshop, in addition to members of the core OAC team. Participants are listed below by project or institution; for a list ordered by name see: [2]

  • Alfalab
    • Alexander Witteveen (Data Archiving and Networked Services, Royal Netherlands Academy of Arts and Sciences)
  • Annotation Ontology
    • Paolo Ciccarese (Harvard Medical School)
  • ARTstor
    • William Ying
  • AustLit
    • Anna Gerber (The University of Queensland)
    • Jane Hunter (OAC co-PI) (The University of Queensland)
  • BioNLP
    • Karin Verspoor (University of Colorado Denver)
  • Canadian Writing Research Collaboratory
    • Susan Brown
    • James Chartrand
  • CATCHPlus Project
    • Hennie Brugman (Meertens Institute)
  • Center for Informatics Research in Science and Scholarship, University of Illinois
    • Allen Renear
  • CLARIN
    • Menzo Windhouwer (The Language Archive, Max Planck Institute for Psycholinguistics)
  • Coalition for Networked Information
    • Cliff Lynch
  • The Collaborative Annotation Tool
    • Philip Desenne (Academic Technology Group, Harvard University)
  • Elsevier
    • Ron Daniel
  • EVIADA
    • William Cowan (Institute for Digital Arts & Humanities, Indiana University)
  • Giacomo Leopardi’s Zibaldone
    • Silvia Stoyanova (Princeton University)
  • The Long Civil Rights Movement Project
    • Jenn Riley (University of North Carolina at Chapel Hill)
  • MARGOT Annotation Tool
    • Christine McWebb (University of Waterloo)
    • Ian Davis (University of Waterloo)
  • MediaThread
    • Jonah Bossewitch (Columbia Center for New Media Teaching and Learning)
    • Schuyler Duveen (Columbia Center for New Media Teaching and Learning)
  • Northwestern University Library
    • William Parod
  • The Nyangwe Diary of David Livingstone
    • Heather Ball (ASA Institute of Business & Computer Technology)
    • Adrian Wisnicki (Birkbeck University of London)
  • Old Dominion University
    • Michael Nelson
  • Open-Source Toolbox for Annotation
    • Shannon Bradshaw (Drew University)
  • The Pico Project
    • Andrew Ashton (Center for Digital Scholarship, Brown University)
    • Michael Park (Center for Digital Scholarship, Brown University)
  • Project MUSE
    • Brian Harrington (Johns Hopkins University Press)
  • ProQuest
    • John Burns
  • Shared Canvas
    • Robert Sanderson (Research Library, Los Alamos National Laboratory)
    • Benjamin Albritton (Stanford University Libraries)
  • Subscription Streaming Video Content
    • James Smith (Maryland Institute for Technology in the Humanities)
    • Aaron Wood (Alexander Street Press)
  • Text Encoding Initiative
    • Peter Gorman (University of Wisconsin Digital Collections Center)
  • YUMA
    • Bernhard Haslhofer (University of Vienna)


Also attending were Principal Investigators:

  • Tim Cole (Center for Informatics Research in Science and Scholarship, University of Illinois)
  • Anna Gerber (The University of Queensland)
  • Jane Hunter (OAC co-PI) (The University of Queensland)
  • Robert Sanderson (Research Library, Los Alamos National Laboratory)
  • James Smith (Maryland Institute for Technology in the Humanities)
  • Herbert Van de Sompel (Research Library, Los Alamos National Laboratory)

and CIRSS’s staff members:

  • Jacob Jett
  • Kevin Trainor

III. Workshop Sessions

The Workshop sessions are described below, including some highlights of the discussions. Complete documentation of these discussions was made and is available to OAC.

Introduction to the OAC Shareable Annotation Data Model & Ontology (Herbert Van de Sompel & Rob Sanderson)

  • Technical Overview ([3])
  • Machine Readable Annotations ([4])

Discussion: Questions regarding constraints arose, including, ‘Can the OAC model support multiple constraints on a target?’ (The consensus was that this would be useful and that the OAC Model should therefore accommodate it.) There was also discussion of conformance levels with respect to optional features of the data model.
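
To make the multiple-constraints question concrete, the following is a minimal sketch of an annotation whose target carries more than one constraint. The class names, field names, constraint labels, and example.org URIs are hypothetical illustrations chosen for this report; they are not terms from the OAC ontology.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Constraint:
    # "kind" and "value" are hypothetical labels, not OAC vocabulary.
    kind: str
    value: str


@dataclass
class Target:
    resource_uri: str
    # A target may carry zero, one, or several constraints.
    constraints: List[Constraint] = field(default_factory=list)


@dataclass
class Annotation:
    body_uri: str
    target: Target


# One target narrowed by two independent constraints (illustrative values).
annotation = Annotation(
    body_uri="http://example.org/notes/1",
    target=Target(
        resource_uri="http://example.org/page.html",
        constraints=[
            Constraint(kind="text-offset", value="120-160"),
            Constraint(kind="text-quote", value="open annotation"),
        ],
    ),
)

for constraint in annotation.target.constraints:
    print(constraint.kind, constraint.value)
```

Holding the constraints in a list leaves open how they combine (all must hold, or any one may), which is the precedence question noted in the workshop outcomes.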

USE CASES – Annotation of marked-up text, including TEI

  • Annotation Supporting Collaborative Development of Scholarly Editions (Jane Hunter & Anna Gerber) ([5])
  • Annotating Texts in the Brown Digital Repository (Andy Ashton) ([6])

Discussion: Several issues were brought to light in the discussion following these presentations. Text segmentation was noted as problematic, and there was discussion of how constraints might be used to resolve some of the resulting issues. There was also some discussion regarding the adoption of very specific semantics to describe sub-classes. Finally, the trust issue (how will the annotations be shared, who will use them, etc.) was raised.

USE CASES – Annotating Scientific Literature

  • Annotating the Biomedical Literature through Text Mining (Karin Verspoor) ([7])
  • Production Publishing Considerations for Annotating the Scientific Literature (Ron Daniel) ([8])
  • Annotation Ontology and SWAN Annotation Tool (Paolo Ciccarese) ([9])

Discussion: Difficulties caused by CSS when annotating HTML were raised. Difficulties mapping AO (Annotation Ontology) to OAC’s ontology were also noted. Rights issues were also discussed, including whether the act of annotating changes the content, whether any such change constitutes a breach of fair use, and whether constraints that include snippets of content (e.g., the text before and after the targeted text) might breach fair use.

USE CASES – Annotation of time-based media

Discussion: It was noted that segmentation issues in the annotation of videos and the annotation of text might indicate some variance in how the term “annotation” is used across media classes. Annotation of video sets a high bar for annotation targeting, e.g., when targeting a part of a frame over a time segment (especially if the targeted region of the frame changes during the time segment). It was suggested that there are automated techniques that support identification and tracking of the target of a video annotation. There was also discussion of how the RDF of an annotation’s body should look, and of whether that RDF could be generated automatically or whether the authors of annotations would need to generate their own RDF graphs.
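
The difficulty of targeting a changing frame region over a time segment can be illustrated with a small sketch. It assumes, purely for illustration, that the target is recorded as a time-ordered list of keyframes (a time plus a bounding box) and that the region between keyframes is approximated by linear interpolation. The names `Keyframe` and `region_at` are hypothetical; this is neither part of the OAC data model nor one of the automated tracking techniques mentioned in the discussion.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (x, y, width, height) in pixels; an assumed convention for this sketch.
Box = Tuple[float, float, float, float]


@dataclass
class Keyframe:
    time: float  # seconds from the start of the video
    box: Box


def region_at(keyframes: List[Keyframe], t: float) -> Optional[Box]:
    """Return the interpolated target region at time t.

    Keyframes are assumed to be sorted by time. Returns None when t falls
    outside the targeted time segment.
    """
    if not keyframes or t < keyframes[0].time or t > keyframes[-1].time:
        return None
    for start, end in zip(keyframes, keyframes[1:]):
        if start.time <= t <= end.time:
            span = end.time - start.time
            weight = 0.0 if span == 0 else (t - start.time) / span
            return tuple(a + weight * (b - a) for a, b in zip(start.box, end.box))
    # Only reached when there is a single keyframe and t equals its time.
    return keyframes[0].box


# A region that moves right and down between seconds 10 and 20.
track = [
    Keyframe(time=10.0, box=(0.0, 0.0, 100.0, 50.0)),
    Keyframe(time=20.0, box=(100.0, 40.0, 100.0, 50.0)),
]
print(region_at(track, 15.0))
print(region_at(track, 5.0))
```

Even this simplified form needs both a temporal and a spatial component for every target, which is part of what makes video targeting harder than targeting a static image or a text range.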

USE CASES – Annotation of manuscripts & other coordinated text & images

  • Shared Canvas: Interoperability for Digitized Medieval MSS Repositories Part 1 & Part 2 (Ben Albritton & Rob Sanderson) ([11]) & ([12])
  • Giacomo Leopardi's Zibaldone: a hypertext template for scholarly annotation [Zibaldone Sample] (Silvia Stoyanova) ([13]) & ([14])
  • MARGOT Annotation Tool (Christine McWebb) ([15])

Discussion: There was some discussion of what lay behind the HTML of the manuscripts shown in the presentations: whether or not it was a TEI text, and whether the annotations were targeting text fragments, parts of the image, or both. Questions were raised about what to do if a corpus of annotations accumulates to the point of obscuring the manuscript images being annotated. It was generally pointed out, however, that getting scholars to use web-based or electronic annotation tools is more a problem of increasing the flow than of coping with a flood. The provenance of annotations (i.e., how to determine who wrote a specific annotation) was also noted to be a difficult issue, since recording an annotation’s author would require some method of securing the annotations. Recording annotation authorship also raised the issue of an author’s right to privacy. One presentation highlighted the question of whether, and how, to represent as annotations the written annotations already present in the source material being digitized. Another highlighted the issues that arise when all that remains of a source to be digitized are fragments of the original, especially if there is scholarly debate over the exact relative placement and order of the remaining fragments.

USE CASES – Annotation of maps & geographic texts

  • Historic Map Annotations with YUMA (Bernhard Haslhofer) ([16])
  • An OAC-Compliant Toolbox (Shannon Bradshaw) ([17])

Discussion: This session brought out several technical issues, such as how to annotate events specific to a map, how to handle maps that change over time, and how to treat maps that are used not as maps but as illustrations within another work. How would annotations vary in these cases? How would it be possible to annotate annotations on a map? The issue of annotating annotations also extends back to audio and video media and, to some extent, to the annotation of images. It was evident that more work is needed to provide a more structured method of producing the bodies of annotations.

IV. Workshop Conclusion & Outcomes

As is clear from the summary above, the OAC Workshop generated a great deal of useful feedback and helpful guidance for the Collaboration going forward. A distillation and synthesis of this feedback and guidance is given here. Further steps and future OAC activities, such as development of the Beta spec for the data model and the RFP to recruit additional projects as collaborators, are also summarized, along with community-building activities resulting from the workshop.

Further work is needed to provide guidance on producing structured/machine-readable annotation bodies. Issues of constraint precedence and workflow within the data model need to be addressed. More guidance is needed from OAC on how to address inheritance, provenance, constraint typing, annotation typing, and target segmentation when applying the data model and ontology. From the discussion sessions, four priority issues and services for OAC to address were identified:

  • text segmentation
  • inheritance
  • provenance
  • sharing/interoperability

To facilitate the open annotation community in addressing these and other issues identified during the presentation sessions, OAC will find ways to make it easier to share tools and methods for sharing annotations. OAC will also establish best practice guidelines for creating structured bodies, dealing with multiple, aggregate, or discontinuous targets, typing annotations, typing constraints, and addressing architectural issues. OAC will look at services to facilitate sharing and building a repository of shareable annotations conformant to the OAC data model.

OAC is moving forward with the development of a Beta spec for the data model in order to refine areas of the data model that were identified as needing more work at the workshop. Those areas for refinement are:

  1. Structured/Machine-readable Body
  2. Constraint Precedence/Workflow
  3. Constraint Types
  4. Annotation Types

As work proceeds on the Beta spec, OAC is also moving forward with its RFP (Request for Proposals). This RFP will allow OAC to partner with four additional projects or institutions that have existing annotation tools or interesting scholarly annotation cases. These funded collaborations will provide opportunities to demonstrate the implementation of OAC’s data model and ontology.

Priority will be given to proposals for work that addresses the practicalities of implementing OAC’s data model and ontology noted in the workshop session summaries above. These practicalities include:

  • Viable methods for identifying arbitrary segments of text as annotation targets or bodies
  • Use cases involving annotations that propagate across formats and/or FRBR entities
  • Demonstrations of annotation portability, e.g., device and resolution independence
  • Implementations that incorporate provenance, fixity, and/or target context into annotations
  • Strategies for mapping from existing native formats for annotation to the OAC data model
  • Use cases demonstrating the relevance of OAC to Social Reading, etc.
  • Logic for dealing with alternate constraints and supporting graceful fall back
  • Experiments examining the practical utility of annotation and/or constraint typing
  • Use cases involving complex, structured bodies
  • Experiments focusing on interoperability across disciplines
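
Several of the items above (identifying arbitrary text segments, carrying target context, and falling back gracefully between alternate constraints) can be illustrated together. The following is a minimal sketch under stated assumptions: the target is described redundantly by character offsets and by the quoted text with a little surrounding context, and a resolver tries the most precise description first. The names `TextTarget` and `resolve` are hypothetical and are not defined by OAC.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TextTarget:
    start: int        # character offset where the segment began
    end: int          # character offset just past the segment
    exact: str        # the targeted text itself
    prefix: str = ""  # text immediately before the segment (context)
    suffix: str = ""  # text immediately after the segment (context)


def resolve(text: str, target: TextTarget) -> Optional[Tuple[int, int]]:
    """Locate the target in text, trying alternate constraints in order."""
    # 1. Offsets: cheapest, but only valid if the text has not shifted.
    if text[target.start:target.end] == target.exact:
        return (target.start, target.end)
    # 2. Quote with context: survives insertions elsewhere in the text.
    position = text.find(target.prefix + target.exact + target.suffix)
    if position != -1:
        start = position + len(target.prefix)
        return (start, start + len(target.exact))
    # 3. Bare quote: last resort, may be ambiguous if the text repeats.
    position = text.find(target.exact)
    if position != -1:
        return (position, position + len(target.exact))
    return None  # no constraint could be satisfied


original = "The OAC data model describes annotations."
revised = "Note: The OAC data model describes annotations."
target = TextTarget(start=4, end=7, exact="OAC", prefix="The ", suffix=" data")

print(resolve(original, target))
print(resolve(revised, target))
```

The order of the attempts is itself a design decision, and the stored prefix and suffix are the kind of content snippets whose fair-use status was questioned in the scientific literature session.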

The final outcome of the workshop was the identification of three “Birds of a Feather” groups, which met at the end of the Workshop:

  • Annotation for Education
  • Annotation of Video
  • Annotation of Editions of Texts/Manuscripts

These community efforts will foster further discussion of open annotation practices, guidelines, and issues.
