Date added 13 December 2006
Last edited 12 November 2009
Text Encoding Initiative
The Text Encoding Initiative (TEI) is a major international initiative within the academic community to provide a standard set of Standard Generalized Markup Language (SGML) and Extensible Markup Language (XML) tag definitions that can be used to represent all kinds of electronic information, in particular the datasets generated and used by research projects in linguistics, literature and the humanities in general.
TEI is highly modular and extensible and is particularly relevant for bibliographic material. Basic tag sets are provided for prose, verse, drama, speech, dictionaries and terminological databases, and a method has been defined for creating customized mixes from these basic sets.
Additional tag sets are provided to capture information related to linking, analysis (including feature structure analysis), certainty, transcriptions, critiques, names and dates, nets (graphs, digraphs, trees, etc.), figures and corpora.
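To make the idea of tag sets concrete, the following is a minimal sketch of a small TEI P5 document that uses the verse tags (`lg` for a line group, `l` for a verse line), read with Python's standard library. The sample text and the helper names `document_title` and `verse_lines` are invented for illustration and are not part of TEI. The sketch only checks that the XML is well formed; it does not validate the document against a TEI schema.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A tiny document: a header describing the file, then a body with one couplet.
# The content is invented for this example.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample verse</title></titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>Invented for this example.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <lg type="couplet">
        <l n="1">First line of the sample couplet,</l>
        <l n="2">second line that ends the set.</l>
      </lg>
    </body>
  </text>
</TEI>"""


def document_title(tei_xml):
    """Return the title recorded in the TEI header (hypothetical helper)."""
    root = ET.fromstring(tei_xml)
    return root.findtext(".//tei:titleStmt/tei:title", namespaces=TEI_NS)


def verse_lines(tei_xml):
    """Return (line number, text) pairs for every verse line (hypothetical helper)."""
    root = ET.fromstring(tei_xml)
    return [
        (line.get("n"), "".join(line.itertext()).strip())
        for line in root.iterfind(".//tei:l", TEI_NS)
    ]


if __name__ == "__main__":
    print(document_title(SAMPLE))
    for number, text in verse_lines(SAMPLE):
        print(number, text)
```

Because the structure is carried by the tags rather than by layout, the same approach would apply to the prose, drama or dictionary tag sets by querying their respective elements instead of `l`.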
Standards Developing Organisations
- Association for Literary and Linguistic Computing
- Association for Computers and the Humanities
- Association for Computational Linguistics
No information available.
- Access, Use and Reuse
- Create or Receive
- Description and Representation Information
- Digital Archive Standards
- Digital Repository Standards
- XML DTD and Schema
- 2008 - TEI P5
- Schema and implementation guidelines for downloading.
- Wikipedia entry for Text Encoding Initiative
Alternative Current Versions
- 1999 - TEI P3
- Documentation no longer available.
- 2002 - TEI P4
- DTD and implementation guidelines for downloading.