Introduction to EAD

Introduction to Encoded Archival Description <EAD>

What is EAD?

From SAA: EAD (Encoded Archival Description) is the de facto standard for the encoding of archival finding aids for use in an online environment. Finding aids are inventories, indexes, or guides that are created by archival and manuscript repositories to provide information about specific collections. While the finding aids may vary somewhat in style, their common purpose is to provide detailed description of the content and intellectual organization of collections of archival materials. EAD allows for the standardization of collection information in finding aids within and across repositories. For an example of how this works, see TARO (Texas Archival Resources Online: http://www.lib.utexas.edu/taro/index.html). Because the participating institutions, each of which holds somewhat different collections, all used EAD to mark up their finding aids, those finding aids are all accessible through a single portal.

EAD has been employed by the archival community for over ten years. Originally conceived as an application of SGML (Standard Generalized Markup Language), EAD instance documents are now encoded in XML (Extensible Markup Language), a simplified derivative of SGML. For more information about using XML, see this tutorial: http://www.ischool.utexas.edu/technology/instruction/handouts_f09/xml_20090219_print.pdf

EAD, like many encoding or markup languages such as HTML (itself another derivative of SGML), is formally defined by a DTD (Document Type Definition). This is a machine-readable set of rules that specifies how the EAD document, known as an “instance,” is to be written. However, unlike HTML, which is designed to present information graphically on a Web page, EAD uses XML markup to form a static text document intended to semantically identify units of information useful for archivists and researchers using primary source materials in archival collections. In other words, HTML determines what information appears on a Web page and how it looks; EAD, using XML, describes what types of information are in the document.
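
To make the contrast concrete, the sketch below marks up the same collection title twice: first with HTML-style presentational tags, then with EAD elements that say what each piece of information is. The collection name and dates are placeholders borrowed from the front matter example later in this guide, and the fragment is illustrative rather than a complete, valid instance.

```xml
<!-- HTML-style markup: says how the text should look -->
<p><b>Gibbons (Stuart C.) Papers</b>, <i>1955-1964</i></p>

<!-- EAD markup: says what each piece of text is -->
<did>
    <unittitle>Gibbons (Stuart C.) Papers</unittitle>
    <unitdate type="inclusive">1955-1964</unitdate>
</did>
```

Because the EAD version labels the title and the date range separately, software can search, sort, or display them independently of how they happen to be styled.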

The current version of EAD is known as EAD 2002. Over the last 10 years the EAD DTD has undergone many revisions, so you may hear talk of EAD version 1.0, and even of the beta version. These are superseded by EAD 2002. You may download the EAD DTD here: http://www.loc.gov/ead/ead2002.zip

A history of EAD and the motivations for its development can be found at the Library of Congress website: Development of the Encoded Archival Description DTD.

Implementing EAD:

Typically, encoded finding aids consist of three sections: the first describes the finding aid itself (<eadheader>), the second contains prefatory matter useful for the display or publication of the finding aid (<frontmatter>), and the third contains the description of the archival records or manuscript papers (<archdesc>). The Document Type Definition (DTD) defines the document structure, while elements constitute the informational units. Elements can be modified with attributes.
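
As a rough sketch of that three-part structure, the outline below shows the top-level elements in order, with a LEVEL attribute on <archdesc> as an example of an attribute modifying an element. The [...] placeholders stand in for real content, so this outline is not valid as written.

```xml
<ead>
    <!-- 1. Information about the finding aid itself -->
    <eadheader>
        <eadid>[...]</eadid>
        <filedesc>
            <titlestmt>
                <titleproper>[...]</titleproper>
            </titlestmt>
        </filedesc>
    </eadheader>

    <!-- 2. Prefatory matter for display or publication -->
    <frontmatter>
        <titlepage>[...]</titlepage>
    </frontmatter>

    <!-- 3. Description of the archival materials;
         level="collection" is an attribute on the element -->
    <archdesc level="collection">
        <did>[...]</did>
    </archdesc>
</ead>
```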

If you are at all familiar with XML or HTML, you may recall the concept of “nesting”. This means that there is a highly circumscribed hierarchy of tags, wherein certain tags may be used within other tags. The best thing to do when starting out with EAD is to consult the EAD manual which is now, helpfully, available online: http://www.loc.gov/ead/tglib/index.html. Having an electronic version is especially helpful since you can search within the document for tags in case you need to review the nesting arrangement.

Although in principle an EAD file is nothing more than a static text file and can be composed using any text editor or word processing software, there are several software applications designed specifically for creating and validating XML code. These programs generally display elements in different colors, highlight errors, and allow users to import DTD files or schemas against which to validate the code.

See http://www.xml.com/pub/pt/3 for a list of the many different XML editors. You’ll see that the first option is the <oXygen/> XML editor, which is very widely used; this software is available in the iSchool IT Lab. Adobe Dreamweaver is another application in the IT Lab that can be used for editing and validating XML.

Getting Started in oXygen:

1. Making a new document:

Create a new XML file in Oxygen

2. Link DTD:



[Screenshots: Make a new XML file in Oxygen]

3. Check header:


[Screenshot: Make a new XML file in Oxygen]

See your DTD? Once it is associated, Oxygen will look to the DTD to determine whether your code is valid.
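
For reference, once the DTD is associated, the top of the file usually carries a document type declaration along the lines of the sketch below. The public identifier shown is the one commonly used for EAD 2002. The system identifier ("ead.dtd") is an assumption that the DTD file sits in the same folder as your document, so adjust that path to wherever you saved it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumes ead.dtd was saved alongside this file; change the path if not -->
<!DOCTYPE ead PUBLIC
    "+//ISBN 1-931666-00-8//DTD ead.dtd (Encoded Archival Description (EAD) Version 2002)//EN"
    "ead.dtd">
<ead>
    [...]
</ead>
```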

Selected Elements in EAD:

<eadheader> EAD Header

Description:

A wrapper element for bibliographic and descriptive information about the finding aid document rather than the archival materials being described. The <eadheader> is modeled on the Text Encoding Initiative (TEI) header element to encourage uniformity in the provision of metadata across document types.

The <eadheader> is required, because information that was often unrecorded for a local paper finding aid is essential in a machine-readable environment. Four subelements are available, which must occur in the following order: <eadid> (required), <filedesc> (required), <profiledesc> (optional), and <revisiondesc> (optional). These elements and their subelements provide: a unique identification code for the finding aid; bibliographic information, such as the author and title of the finding aid; information about the encoding of the finding aid; and statements about significant revisions.

The FINDAIDSTATUS attribute can be used to indicate how complete or polished the information in the finding aid is. The COUNTRYENCODING, DATEENCODING, LANGENCODING, REPOSITORYENCODING, and SCRIPTENCODING attributes are used to specify the ISO standards from which code values for other attributes, such as COUNTRYCODE in <eadid> and <unitid>, are taken. Some or all of the <eadheader> subelements can be used to display title page information. Alternatively, the <eadheader> can be blocked from display by setting the AUDIENCE attribute to "internal" and using the <frontmatter> <titlepage> elements to create a title page.

May contain:

eadid, filedesc, profiledesc, revisiondesc

May occur within:

ead, eadgrp

Example:

<eadheader>
    <eadid>[...]</eadid>
    <filedesc>
        <titlestmt>
            <titleproper>[...]</titleproper>
        </titlestmt>
    </filedesc>
</eadheader>
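
The encoding attributes described above might be set as in the following sketch. The values shown are the ISO standard names that the EAD 2002 DTD is generally understood to expect for each attribute. The FINDAIDSTATUS value is only an illustration, since local practice determines which status terms a repository uses, and the <eadid> codes are borrowed from the <eadid> example in this guide.

```xml
<eadheader findaidstatus="edited-full-draft"
    countryencoding="iso3166-1"
    dateencoding="iso8601"
    langencoding="iso639-2b"
    repositoryencoding="iso15511"
    scriptencoding="iso15924">
    <eadid countrycode="us" mainagencycode="txu-hu">[...]</eadid>
    <filedesc>
        <titlestmt>
            <titleproper>[...]</titleproper>
        </titlestmt>
    </filedesc>
</eadheader>
```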



<eadid> EAD Identifier

Description:

A required subelement of <eadheader> that designates a unique code for a particular EAD finding aid document. Two of the attributes, COUNTRYCODE and MAINAGENCYCODE, are required to make the <eadid> compliant with ISAD(G) element 3.1.1. MAINAGENCYCODE provides the ISO 15511 code for the institution that maintains the finding aid (which may not be the same as the institution that is the custodian of the materials described). COUNTRYCODE supplies the ISO 3166-1 code for the country of the maintenance agency. In addition to these two attributes, it is recommended that repositories also use at least one of the following attributes: URL, PUBLICID, or IDENTIFIER to make the <eadid> globally unique. PUBLICID should be a Formal Public Identifier, URL an absolute or relative address, and IDENTIFIER a machine-readable unique identifier for the finding aid file. (The proper syntax for PUBLICID is defined in ISO/IEC 9070:1991 Information technology -- SGML support facilities -- Registration procedures for public text owner identifiers.)

May contain:

#PCDATA



May occur within:

eadheader


Examples:

<eadid countrycode="us" mainagencycode="txu-hu"
  publicid="-//us::txu-hu//TEXT us::txu-hu::hrc.00001//EN"
  url="www.lib.utexas.edu/taro/hrc/00001.xml">
    hrc.00001
</eadid>

<frontmatter> Front Matter

Description:

A wrapper element that bundles prefatory text found before the start of the Archival Description <archdesc>. It focuses on the creation, publication, or use of the finding aid rather than information about the materials being described. Examples include a title page, preface, dedication, and instructions for using a finding aid. The optional <titlepage> element within <frontmatter> can be used to repeat selected information from the <eadheader> to generate a title page that follows local preferences for sequencing information. The other <frontmatter> structures, such as a dedication, are encoded as Text Divisions <div>, with a <head> element containing word(s) that identify the nature of the text.



May contain:

div, titlepage



May occur within:

ead, eadgrp

Example:

<frontmatter>
    <titlepage>
        <titleproper>Register of the Gibbons (Stuart C.) Papers,
            <date>1955-1964</date>
        </titleproper>
        <num>Collection number: Ms28</num>
        <publisher>San Joaquin County Historical Society and Museum
            <lb/>
            <extptr actuate="onload" show="embed" entityref="sjmlogo"/>
            <lb/>
            Lodi, California</publisher>
        &tp-cstoh;
        <list type="deflist">
            <defitem>
                <label>Processed by: </label>
                <item>Don Walker</item>
            </defitem>
            <defitem>
                <label>Date Completed: </label>
                <item>1997</item>
            </defitem>
        </list>
    </titlepage>
</frontmatter>


<archdesc> Archival Description

Description:

A wrapper element for the bulk of an EAD document instance, which describes the content, context, and extent of a body of archival materials, including administrative and supplemental information that facilitates use of the materials. Information is organized in unfolding, hierarchical levels that allow for a descriptive overview of the whole to be followed by more detailed views of the parts, designated by the element Description of Subordinate Components <dsc>. Data elements available at the <archdesc> level are repeated at the various component levels within <dsc>, and information is inherited from one hierarchical level to the next.

The Descriptive Identification <did> element is required to appear in <archdesc> before presenting more detailed descriptions in <bioghist>, <scopecontent>, and <dsc>, in order to provide first a basic description of the archival materials. The <archdesc> element has several specialized attributes. The required LEVEL attribute identifies the character of the whole unit, for example, "class," "collection," "fonds," "recordgrp," "series," "subfonds," "subgrp," "subseries," or "otherlevel." This attribute is comparable to ISAD(G) data element 3.1.4 and MARC field 351 subfield c.
The TYPE attribute can be used to categorize the finding aid as an inventory, register, or other format.

May contain:

accessrestrict, accruals, acqinfo, altformavail, appraisal, arrangement, bibliography, bioghist, controlaccess, custodhist, dao, daogrp, descgrp, did, dsc, fileplan, index, note, odd, originalsloc, otherfindaid, phystech, prefercite, processinfo, relatedmaterial, runner, scopecontent, separatedmaterial, userestrict

May occur within:

ead

Example:

<archdesc level="collection" type="inventory">
    <did>
        <unittitle>[...]</unittitle>
        <unitdate type="inclusive">[...]</unitdate>
        <physdesc><extent>[...]</extent></physdesc>
        <repository>
            <corpname>[...]</corpname>
            <address>
                <addressline>[...]</addressline>
            </address>
        </repository>
        <abstract>[...]</abstract>
    </did>
    <bioghist>
        <p>[...]</p>
    </bioghist>
    <scopecontent>
        <p>[...]</p>
    </scopecontent>
    <dsc>
        <c01>
            <did>
                <unittitle>[...]</unittitle>
            </did>
        </c01>
    </dsc>
</archdesc>
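
To illustrate how the hierarchy unfolds within <dsc>, the sketch below nests a file-level component (<c02>) inside a series-level component (<c01>). The series title, folder title, dates, and box and folder numbers are invented for illustration. Each component repeats the <did> structure used at the <archdesc> level.

```xml
<dsc>
    <!-- Series-level component -->
    <c01 level="series">
        <did>
            <unittitle>Correspondence</unittitle>
            <unitdate type="inclusive">1955-1964</unitdate>
        </did>
        <!-- File-level component nested inside the series -->
        <c02 level="file">
            <did>
                <container type="box">1</container>
                <container type="folder">3</container>
                <unittitle>Letters received</unittitle>
                <unitdate type="inclusive">1955-1957</unitdate>
            </did>
        </c02>
    </c01>
</dsc>
```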

For more elements, see the EAD Tag Library:

http://www.loc.gov/ead/tglib/element_index.html

Resources:

Tools and helper files are available at the SAA EAD Roundtable’s web site: http://jefferson.village.virginia.edu/ead/eadfiles.html

MIT provides a very good description and tutorial: http://libraries.mit.edu/guides/subjects/metadata/standards/ead.html

Documentation on the official SGML EAD DTD: http://lcweb.loc.gov/ead/

FAQ regarding XML implementation of EAD concepts: http://jefferson.village.virginia.edu/ead/xml.html

The EAD Cookbook: http://jefferson.village.virginia.edu/ead/cookbookhelp.html

Crosswalks:

The EAD Application Guidelines, available at the official EAD web site (http://www.loc.gov/ead/), provide crosswalk tables for conversion between different metadata schemes:

  • ISAD(G) to EAD
  • EAD to ISAD(G)
  • Dublin Core to EAD
  • USMARC to EAD