Anvil is a free video annotation tool used at research institutes worldwide (see the Anvil User Web). It offers frame-accurate, hierarchical, multi-layered annotation driven by user-defined annotation schemes. The intuitive annotation board shows color-coded elements on multiple time-aligned tracks. Special features include cross-level links, non-temporal objects and a project tool for managing multiple annotations. Originally developed for gesture research, Anvil has also proved suitable for research in human-computer interaction, linguistics, ethology, anthropology, psychotherapy, embodied agents, computer animation and many other fields.

Anvil can import data from the widely used, public-domain phonetic tools PRAAT and XWaves, which allow precise and comfortable speech transcription. It can also display the waveform and pitch contour. Anvil's data files are XML-based, and a special ASCII output can be imported into statistical toolkits such as SPSS. The Anvil system is written in Java and should run on Windows, Macintosh and Unix (Solaris/Linux) computers. Here is a full description of Anvil, including screenshots, and here is a page listing new features and bug fixes.
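
Because the data files are XML-based, they can in principle be processed with any standard XML library. The following Python sketch illustrates the idea of flattening timed annotation elements into a tab-separated table of the kind a statistics package can read. It is not Anvil's own export function, and the element and attribute names used here (annotation, body, track, el, attribute, start, end) are assumptions made for illustration; check them against the manual and your own files before relying on them.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample file content. The element and attribute names are
# assumptions for illustration, not a specification of the Anvil format.
SAMPLE_XML = """\
<annotation>
  <body>
    <track name="gesture.phase">
      <el index="0" start="1.20" end="2.44">
        <attribute name="type">stroke</attribute>
      </el>
      <el index="1" start="2.44" end="3.00">
        <attribute name="type">hold</attribute>
      </el>
    </track>
  </body>
</annotation>
"""

FIELDS = ["track", "start", "end", "duration", "type"]


def elements_to_rows(xml_text):
    """Return one dict per annotation element, with its track and timing."""
    root = ET.fromstring(xml_text)
    rows = []
    for track in root.iter("track"):
        for el in track.iter("el"):
            start = float(el.get("start"))
            end = float(el.get("end"))
            # Collect the element's named attribute values.
            labels = {a.get("name"): (a.text or "").strip()
                      for a in el.iter("attribute")}
            rows.append({
                "track": track.get("name"),
                "start": start,
                "end": end,
                "duration": round(end - start, 3),
                "type": labels.get("type", ""),
            })
    return rows


def rows_to_tsv(rows):
    """Serialize the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_tsv(elements_to_rows(SAMPLE_XML)), end="")
```

For real annotation files, the built-in ASCII output described above is likely the more reliable route; a script like this is mainly useful when a custom table layout is needed.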

Here you can download the official user manual describing the Anvil software (installation, concepts, user interface), including a quick-reference card describing essential elements at a glance:

  • Anvil 4.0 Manual (10 Apr 2003, 52 pages)
    Download manual as PDF (0.5 MB)
    Download manual as zipped PostScript (1.2 MB)

  • Anvil Quick-Reference Card (24 Jul 2002, 2 pages)
    Download as PDF

  • Anvil 3.6 Manual (24 Jul 2002, 51 pages)
    Download manual as PDF (1.9 MB)
    Download manual as zipped PostScript (1.3 MB)

For related scientific papers, see my publications page (years 2001 and 2004) and my PhD dissertation (December 2004). For examples of how Anvil is used, and for an introduction to the annotation of speech and gesture in general, I recommend Dan Loehr's excellent PhD thesis "Gesture and Intonation" (2004).