JCDL 2003:: Tutorials

Tutorials

Tutorial Schedule

7:30-9:00	Registration (at Alice Pratt Brown Hall on 5/27, Duncan Hall on 5/31)
9:00-12:30	A.M. Sessions at Duncan Hall
12:30-1:30	Lunch at Duncan Hall
1:30-5:00	P.M. Sessions at Duncan Hall

Shuttles will go between the conference hotels, the registration site (Alice Pratt Brown Hall on 5/27), and Duncan Hall. If you have signed up for a tutorial, please pick up your registration materials before coming to your tutorial.

Full Day Tutorials, May 27, 2003

Usability Evaluation of Digital Libraries
Overview of Digital Libraries

Morning Tutorials, May 27, 2003

How to Build a Digital Library Using Open-Source Software (paired with Build Your Own Digital Library Collections: Hands-On Session)
Introduction to the Open Archives Initiative Protocol for Metadata Harvesting (paired with Advanced Tutorial on Open Archives Initiative)
Introduction to Georeferencing in Digital Libraries (paired with How to Build a Geospatial Digital Library)
Thesauri and Ontologies in Digital Libraries I (paired with Thesauri and Ontologies in Digital Libraries II)
Social Change, Digital Libraries, and the Inquiry Page [CANCELLED]
XML (paired with XSL)

Afternoon Tutorials, May 27, 2003

Build Your Own Digital Library Collections: Hands-On Session (paired with How to Build a Digital Library Using Open-Source Software)
Advanced Tutorial on Open Archives Initiative (paired with Introduction to the Open Archives Initiative Protocol for Metadata Harvesting)
How to Build a Geospatial Digital Library [CANCELLED] (paired with Introduction to Georeferencing in Digital Libraries)
Open Content Licenses and Copyright
XSL (paired with XML)

Full-Day Tutorial, May 31, 2003

Audio/Video Digital Libraries: designing, searching for documents, and generating Metadata

Morning Tutorial, May 31, 2003

SRW (Search Retrieve WebService): Z39.50 Next Generation

Register for tutorial (on site registration only after 5/20)

Tutorial 1

Title:	Usability Evaluation of Digital Libraries
Presenters:	Ann Blandford, UCL Interaction Centre, University College London; Bob Fields & Suzette Keith, Interaction Design Centre, Middlesex University
Email:	A.Blandford@ucl.ac.uk, {S.Keith, B.Fields}@mdx.ac.uk
Duration:	Full-day
Level:	Introductory / intermediate
Expected audience:	Medium (25-30)

Description: This one-day tutorial is an introduction to usability evaluation for Digital Libraries. Digital libraries are notoriously difficult to design well in terms of their eventual usability. In this tutorial, we will present an overview of usability issues affecting the users' interaction within digital libraries, including the inherent complexity of the information seeking activity. We will introduce Claims Analysis, an established approach which focuses on the designers' motivations and reasons for making particular design decisions and examines the effect on the user's interaction with the system. The general approach, as presented by Carroll and Rosson (1992), has been tailored specifically to the design of digital libraries. The presenters have applied this approach successfully in evaluations of various features of a commercial digital library in collaboration with the developers of that library. Through a graduated series of worked examples, participants will get hands-on experience of applying this approach to developing more usable digital libraries. This tutorial assumes no prior knowledge of usability evaluation, and is aimed at all those involved in the development and deployment of digital libraries.

Biographies of Presenters

Dr. Ann Blandford is a Senior Lecturer in UCL Interaction Centre, where she teaches cognitive and social aspects of HCI and usability evaluation methods on the MSc in HCI with Ergonomics. Previously, she was Director of Research in Computing Science at Middlesex University, where she taught various courses on HCI. She has presented tutorials on HCI topics at HCI’98, CHI’99, and EUPA 2002. All three tutorials were highly rated by participants. She leads several projects investigating usability of digital libraries, covering social and technical aspects of usability and user acceptance as well as the approaches being presented in this tutorial. She co-chaired a successful workshop on usability of digital libraries at JCDL’02, and is currently co-editing a special issue of the Journal of Digital Libraries on this topic.

Dr. Bob Fields is a Senior Lecturer in the School of Computing Sciences at Middlesex University. He has extensive experience of HCI evaluation methods, and of teaching HCI. He is Principal Investigator on the project that has developed and tested the approach being presented in this tutorial.

Suzette Keith has previous experience of working on user interface design issues with software developers in a number of commercial organisations. She is the researcher on the project that has developed and tested the approach being presented in this tutorial. She has worked closely with library and other staff at BT in the process of developing and testing the Claims Analysis approach.

Tutorial 2

Title:	Overview of Digital Libraries
Presenters:	Edward A. Fox, Dept. of Computer Science, Virginia Tech
Email:	fox@vt.edu
Duration:	Full-day
Level:	Introductory / intermediate
Expected audience:	20-100

Description: This tutorial will start with an overview of definitions, foundations, scenarios and perspectives. It will cover a variety of issues, including:

search, retrieval and resource discovery;
multimedia/hypermedia;
metadata (e.g., Dublin Core);
electronic publishing; SGML and XML;
document models and representations;
database approaches;
2D and 3D interfaces and visualizations;
architectures and interoperability (e.g., OAI); metrics;
educational (e.g., CSTC, NSDL, NDLTD) and social concerns.

Case studies of projects, initiatives, and systems will illustrate key concepts, including:

Computer Science Teaching Center (http://www.cstc.org/)
National STEM (Science, Technology, Engineering, and Mathematics) education Digital Library (NSF NSDL, http://www.nsdl.org)
Networked Digital Library of Theses and Dissertations (http://www.ndltd.org/)
Open Archives Initiative (http://www.openarchives.org/, http://www.dlib.vt.edu/projects/OAI/)
Systems and approaches to building digital libraries (MARIAN/5SLgen, ODL)

Biography of Presenter

Dr. Edward A. Fox holds a Ph.D. and M.S. in Computer Science from
Cornell University, and a B.S. from M.I.T. Since 1983 he has been at
Virginia Polytechnic Institute and State University (VPI&SU, also called
Virginia Tech), where he serves as Professor of Computer Science. He
directs the Digital Library Research Laboratory, the Internet Technology
Innovation Center at Virginia Tech, and varied projects (e.g.,
www.ndltd.org, www.citidel.org). He is chair of the NSDL (www.nsdl.org)
Policy Committee. He was general chair of the First ACM/IEEE Joint
Conference on Digital Libraries and program chair of ACM Digital
Libraries'96 and '99. He is co-editor-in-chief of the ACM Journal of
Educational Resources in Computing (JERIC) and serves on the editorial
boards of a number of journals. He has been involved in a number of
digital library efforts including TULIP, NCSTRL, NSDL, and Open Archives
Initiative. He served as vice-chair and then chair of ACM SIGIR over the
period 1987-95. He has authored or co-authored many publications in the
areas of digital libraries, information storage and retrieval,
hypertext/hypermedia/multimedia, computational linguistics, CD-ROM and
optical disc technology, electronic publishing, and expert systems.

Morning Tutorials, May 27, 2003

Tutorial 3

Title:	How to Build a Digital Library Using Open-Source Software
Presenters:	Ian H. Witten, Dept. of Computer Science, University of Waikato
Email:	ihw@cs.waikato.ac.nz
Duration:	Half-day (paired with Build Your Own Digital Library Collections: Hands-On Session)
Level:	Intermediate

Description: This tutorial describes how to build your own digital library using the Greenstone digital library software, a comprehensive, open-source system for constructing, presenting, and maintaining information collections. Collections built automatically include effective full-text searching and metadata-based browsing facilities that are attractive and easy to use. They are easily maintainable and can be rebuilt entirely automatically. Searching is full-text, and different indexes can be constructed (including metadata indexes). Browsing utilizes hierarchical structures that are created automatically from metadata associated with the source documents. Collections can include text, pictures, audio, and video, formed using an easy-to-use tool called the Collector. Documents can be in any language: Chinese and Arabic interfaces exist. Although primarily designed for Web access, collections can be made available, in precisely the same form, on CD-ROM or DVD. The system is extensible: software "plugins" accommodate different document and metadata types.

The Greenstone software runs under Unix, Windows and Mac OS X, and is issued as source code under the GNU General Public License. Attendees will learn enough to download the software and set up a digital library system. Those with programming skills should be able to extend and tailor the system extensively.

Biography of Presenter

Ian H. Witten is Professor of Computer Science at the University of Waikato in New Zealand, and directs the New Zealand Digital Library project (where the Greenstone software originates). He has published widely in the areas of digital libraries, data compression, information retrieval, and machine learning. He is co-author of "Managing Gigabytes: Compressing and indexing documents and images" (1999), "Data mining: Practical machine learning tools and techniques with Java implementations" (2000), and "How to build a digital library" (2003), as well as many journal articles and conference papers. He is a Fellow of the ACM and of the Royal Society of New Zealand, and a member of professional computing, information retrieval, and engineering associations in the UK, USA, Canada, and New Zealand.

Tutorial 4

Title:	Introduction to the Open Archives Initiative Protocol for Metadata Harvesting
Presenters:	Timothy W. Cole, Mathematics Library, University of Illinois at UC; William H. Mischo, Grainger Engineering Library, University of Illinois at UC; Thomas G. Habing, Grainger Engineering Library, University of Illinois at UC
Email:	t-cole3@uiuc.edu, w-mischo@uiuc.edu, thabing@uiuc.edu
Duration:	Half-day (paired with Advanced Tutorial on Open Archives Initiative)
Level:	Introductory
Expected audience:	25

Description: Implementations of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) continue to diversify and increase in number worldwide. Community initiatives such as NSDL, NDLTD, the Open Archives Forum, OLAC, and the IMLS Collection Registry and Metadata Repository project rely heavily on the protocol. OAI-PMH-based features are now incorporated in or have been announced for a wide range of digital library and digital content management products including GNU EPrints, CONTENTdm, MIT's DSpace, the University of Michigan's DLXS, and Endeavor's ENCompass. Publishers are beginning to provide subscriber access to OAI harvestable metadata (e.g., the Institute of Physics). This introductory tutorial will place the protocol in context and give participants an up-to-date introduction to protocol details. The tutorial will introduce participants to the basics of OAI-PMH architecture and operation, describe what is required to implement the protocol (both as a metadata provider and a metadata harvesting service provider), discuss common issues and problems that arise with basic implementations, and provide pointers to additional, more in-depth tools and resources. Examples and illustrations will be drawn from real-world implementations developed by the presenters. The tutorial will focus on requirements of and experience with the 2.0 version of the protocol (June 2002) and will cover the latest developments including the recent addition of an OAI-PMH Static Repository sub-protocol. This introductory tutorial is being developed in coordination with and is designed to complement the half-day Advanced OAI tutorial intended for advanced practitioners already familiar with the basic architecture and operation of OAI-PMH.
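As a rough illustration of the basic operation described above, the following Python sketch builds an OAI-PMH 2.0 request URL and pulls identifiers and Dublin Core titles out of a ListRecords response. It is only a sketch under stated assumptions: the repository address http://repository.example.org/oai, the sample record, and the helper names build_request and extract_titles are hypothetical and do not come from the tutorial materials. The response is embedded as a string so the example runs without network access.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and by Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_request(base_url, verb, **arguments):
    """Return the GET URL for an OAI-PMH request with the given verb."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Hypothetical response from a hypothetical repository, for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2003-05-27T09:00:00Z</responseDate>
  <request verb="ListRecords" metadataPrefix="oai_dc">http://repository.example.org/oai</request>
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
        <datestamp>2003-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample Record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_titles(response_xml):
    """Return (identifier, title) pairs for each record in a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iter("{%s}record" % OAI_NS):
        identifier = record.findtext("{%s}header/{%s}identifier" % (OAI_NS, OAI_NS))
        title = record.findtext(".//{%s}title" % DC_NS)
        pairs.append((identifier, title))
    return pairs


if __name__ == "__main__":
    print(build_request("http://repository.example.org/oai",
                        "ListRecords", metadataPrefix="oai_dc"))
    print(extract_titles(SAMPLE_RESPONSE))
```

A harvester would send the URL from build_request over HTTP and hand the response body to a parser like extract_titles; a metadata provider is the side that answers such requests.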

Biographies of Presenters

Timothy W. Cole is Mathematics Librarian and Associate Professor of Library Administration at the University of Illinois at Urbana-Champaign. He is currently principal investigator for the UIUC's OAI Metadata Harvesting Project (2001 – 2003, funded by the Andrew W. Mellon Foundation) and for an IMLS-funded project (2002 – 2005) that will use OAI-PMH to create an item-level metadata repository for metadata describing content digitized under the auspices of the IMLS National Leadership Grant program. He has been active in UIUC digital library projects since 1994, holding the position of Systems Librarian for Digital Projects from 1994-1999. He has published and presented on the use of OAI-PMH for diverse kinds of metadata including metadata describing both cultural heritage resources and digital resources in engineering and mathematics. He is a member of the OAI Technical Committee and is vice-chair of the NSDL Technology Standing Committee.

William H. Mischo is Engineering Librarian and Professor of Library Administration at the University of Illinois at Urbana-Champaign. Before joining the UIUC Library in 1982, he was at OCLC, Inc. and Iowa State University. He was the University Library's principal investigator for the National Science Foundation (NSF) Digital Library Initiatives (DLI-I) grant awarded to UIUC (1994-1998) and for a follow-on CNRI D-Lib Test Suite project that ended in August 2001. He is currently the PI on the Second Generation Digital Mathematics Resources with Innovative Content for Metadata Harvesting and Courseware Development project, an NSF-NSDL (National Science, Engineering, Technology, and Mathematics Digital Library) grant awarded to UIUC in September 2002. Bill has published over 45 articles in library and information science journals and conference proceedings. He has presented at numerous conferences, most recently at the 2002 International Conference on Digital Archive Technologies in Taipei, Taiwan – and gave the keynote address, entitled "XML Technologies and Scholarly Communication," at The XML Workshop for Electronic Journals held March 9, 2001 in Tokyo, Japan. In 2001, Bill received the Homer I. Bernhardt Distinguished Service Award from the American Society for Engineering Education Engineering Libraries Division.

Thomas G. Habing is a Research Programmer at the Grainger Engineering Library Information Center where for the past five years he has worked on various digital library projects including UIUC's OAI Metadata Harvesting Project (2001-2003, funded by the Andrew W. Mellon Foundation), UIUC's D-Lib Test Suite project funded by CNRI (1998-2001), and UIUC's Digital Library Initiatives I project (1994-1998, funded by NSF). He is currently providing programming support for the University Library's NSF-NSDL and IMLS grant projects. Prior to returning to the Midwest in 1997, Tom was a Senior Computing Methods and Technology Engineer for The Boeing Company in Seattle, Washington, where he had been employed since 1986 (with a two-year break to complete a Master's degree) doing systems analysis, programming, and graphical user interface design.

Tutorial 5

Title:	Introduction to Georeferencing in Digital Libraries
Presenters:	Linda Hill, Alexandria Digital Library Project, Department of Geography, University of California, Santa Barbara; Michael Freeston, Alexandria Digital Earth Prototype Project, Department of Computer Science, University of California, Santa Barbara
Email:	lhill@alexandria.ucsb.edu, freeston@alexandria.ucsb.edu
Duration:	Half-day (paired with How to Build a Geospatial Digital Library)
Level:	Introductory
Expected audience:	30

Description: Georeferencing is relating information (e.g., documents, datasets, maps, images, biographical information) to geographic locations through placenames (i.e., toponyms) and place codes (e.g., postal codes) or through geospatial referencing (e.g., longitude and latitude coordinates). The digital library perspective toward georeferencing is a blend of the focus of Geographic Information Systems (GIS) on geospatial coordinates, data layers, and mapping; of map librarianship; and of the traditional library focus on textual representation of location using placenames, administrative unit hierarchies, and other textual forms of spatial reference. This tutorial is not about GIS and it is not about cartography or map libraries; it is instead about the application of georeferencing to all types of information and the integration of geospatial description, searching, and evaluation into digital library practices. It is more about integrating geospatial description and capabilities into general digital library practices than about establishing special collections or issues of cartographic presentation. More broadly, it is about spatial literacy, meaning the ability to interpret problems and their solutions in spatial terms.

It is a powerful concept to think about designing library systems so that users can find all types of information about a location simply by identifying that location on a map or a georeferenced image, without supplying all of the relevant placenames or knowing the coordinates for the area. Map browsers provide not only a query interface but also an evaluation palette where the spatial coverage of a set of objects can be visualized and the user can see how the different objects are spatially related. When the geospatial “aboutness” of text documents can be determined by an analysis of place references and conversion of those references to coordinates, a vast store of information can be related to place and to the maps and remote sensing images and datasets that are also about that place.

This tutorial covers the broad scope of georeferencing, including an overview of types of georeferenced objects and their characteristics; fundamental concepts of geospatial referencing; georeferencing structures of metadata standards (MARC, FGDC, Dublin Core, and more); gazetteers and their role in translating between textual and geospatial location referencing; supporting database architectures; and geospatial matching in information retrieval. In the process, the major information management standards for geospatial description, retrieval, interoperability, and information exchange will be identified. The tutorial draws on the presenters' experience with the Alexandria Digital Library (ADL) Project at the University of California, Santa Barbara; it is not about the ADL itself, however, but about the broader principles and practices of georeferencing in digital libraries.
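To make the role of a gazetteer and of geospatial matching more concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the ADL implementation: the gazetteer entries, the collection items, and all coordinates are rough, made-up values, and the names BoundingBox, GAZETTEER, and search_by_placename are hypothetical. The gazetteer translates a placename into a bounding box, and retrieval then keeps the items whose footprint overlaps that box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """A footprint in decimal degrees (west/east = longitude, south/north = latitude)."""
    west: float
    south: float
    east: float
    north: float

    def overlaps(self, other):
        # Two boxes overlap when they intersect on both axes.
        # Simplification: boxes that cross the 180-degree meridian are not handled.
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


# Toy gazetteer: placename -> approximate footprint (illustrative values only).
GAZETTEER = {
    "santa barbara": BoundingBox(-119.9, 34.3, -119.6, 34.5),
    "houston": BoundingBox(-95.8, 29.5, -95.0, 30.1),
}

# Toy collection: (title, footprint) pairs for georeferenced objects.
ITEMS = [
    ("Aerial photo, Goleta coast", BoundingBox(-119.95, 34.38, -119.80, 34.45)),
    ("Street map, downtown Houston", BoundingBox(-95.40, 29.70, -95.30, 29.80)),
]


def search_by_placename(placename, items):
    """Translate a placename to coordinates, then match items spatially."""
    query_box = GAZETTEER.get(placename.lower().strip())
    if query_box is None:
        return []  # unknown placename: no spatial query can be formed
    return [title for title, footprint in items if footprint.overlaps(query_box)]


if __name__ == "__main__":
    print(search_by_placename("Santa Barbara", ITEMS))
```

The same overlap test would serve a map browser: the box drawn by the user replaces the gazetteer lookup, and no placename is needed at all.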

Biographies of Presenters

Linda L. Hill has been a research specialist with the Alexandria Digital Library Project at the University of California, Santa Barbara for six years. She received her Ph.D. in Library Science from the University of Pittsburgh in 1990. Her dissertation was on Access to Geographic Concepts in Online Bibliographic Files. This research interest was generated by previous work as a corporate librarian and as assistant director for an indexing and abstracting service for the exploration and production segment of the petroleum industry. With the ADL Project, she has been active in the development of the ADL Gazetteer and in the development and evaluation of the ADL's georeferenced digital library system. She has participated in an advisory capacity in the development of the Federal Geographic Data Committee's (FGDC) Content Standard for Digital Geospatial Metadata, the ISO's TC 211 Geographic Metadata Standard, and the metadata design for the Digital Library for Earth System Education (DLESE). She has also been an active participant in the Networked Knowledge Organization Systems (NKOS) group and has engaged in thesaurus development.

Michael Freeston brings thirty years of experience in academic and industrial computer science to his current position as Project Coordinator for the Alexandria Digital Earth Prototype Project. He specializes in database and knowledge-base systems and is best known in the international database community for his work on multi-dimensional indexing methods. For the last six years, he has been associated with the Alexandria Digital Library Project at the University of California, Santa Barbara, dividing his time between 1999 and 2001 with Aberdeen University in Scotland where he was a Professor of Information Science. He is also Principal Investigator of a new international project funded jointly by the US National Science Foundation and the UK Joint Information Systems Committee to develop digital library support for novel approaches to learning and teaching with Information Technology. He is a frequent reviewer of research proposals for both the US National Science Foundation and the European Commission, and holds a Visiting Professorship in the Department of Electronics and Computer Science at Southampton University. His current personal research focus envisions digital libraries as the foundation of a global information infrastructure (GRID) supporting the next generation of the World Wide Web.

Tutorial 6

Title:	Thesauri and Ontologies in Digital Libraries I
Presenters:	Dagobert Soergel, College of Information Studies, University of Maryland, College Park, MD
Email:	ds52@umail.umd.edu
Duration:	Half-day (paired with Thesauri and Ontologies in Digital Libraries II)
Level:	Introductory

Description: This introductory tutorial is intended for anyone concerned with subject access to digital libraries. It provides a bridge by presenting methods of subject access as treated in an information studies program for those coming to digital libraries from other fields. It will elucidate through examples the conceptual and vocabulary problems users face when searching digital libraries. It will then show how a well-structured thesaurus / ontology can be used as the knowledge base for an interface that can assist users with search topic clarification (for example through browsing well-structured hierarchies and guided facet analysis) and with finding good search terms (through query term mapping and query term expansion using synonyms and hierarchic inclusion). It will touch on cross-database and cross-language searching as natural extensions of these functions. The tutorial will cover the thesaurus structure needed to support these functions: concept-term relationships for vocabulary control and synonym expansion, and conceptual structure (semantic analysis, facets, and hierarchy) for topic clarification and hierarchic query term expansion. It will introduce a few sample thesauri and some thesaurus-supported digital libraries and Web sites to illustrate these principles.
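The query term mapping and expansion functions mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's method: the tiny thesaurus, its terms, and the function names map_term and expand_query are invented for the example. Each concept has a preferred term, a list of synonyms (entry terms), and a list of narrower concepts; a user's term is first mapped to its preferred term and then expanded with synonyms and everything hierarchically included beneath it.

```python
# Toy thesaurus (hypothetical content): preferred term -> synonyms and narrower concepts.
THESAURUS = {
    "alcohol abuse": {
        "synonyms": ["alcohol misuse", "problem drinking"],
        "narrower": ["binge drinking"],
    },
    "binge drinking": {
        "synonyms": ["heavy episodic drinking"],
        "narrower": [],
    },
}

# Reverse index for vocabulary control: entry term -> preferred term.
ENTRY_TERMS = {
    synonym: preferred
    for preferred, entry in THESAURUS.items()
    for synonym in entry["synonyms"]
}


def map_term(term):
    """Query term mapping: return the preferred term for a user's term, or None."""
    term = term.lower().strip()
    if term in THESAURUS:
        return term
    return ENTRY_TERMS.get(term)


def expand_query(term):
    """Query term expansion: add synonyms and all narrower concepts."""
    preferred = map_term(term)
    if preferred is None:
        return [term]  # term is outside the thesaurus; search it unchanged
    expanded, seen, pending = [], set(), [preferred]
    while pending:
        concept = pending.pop()
        if concept in seen:
            continue
        seen.add(concept)
        expanded.append(concept)
        expanded.extend(THESAURUS[concept]["synonyms"])
        pending.extend(THESAURUS[concept]["narrower"])
    return expanded


if __name__ == "__main__":
    print(expand_query("problem drinking"))
```

Cross-language searching fits the same pattern if the synonym lists also hold equivalent terms in other languages.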

Biography of Presenter: Dagobert Soergel holds an MS equivalent in mathematics and physics (1964) and a PhD in political science (1970), both from the University of Freiburg, Germany. He is Professor of Information Studies at the University of Maryland, where he teaches courses in information retrieval, thesaurus development, expert systems, and information technology, and is also an information systems consultant. He has been a visiting professor at the universities of Western Ontario, Chicago, and Konstanz, Germany. Among other books, he has authored Organizing Information (1985), which received the American Society for Information Science Best Book Award, and Indexing Languages and Thesauri: Construction and Maintenance (1974), as well as numerous papers. He has designed several thesauri, most recently the Alcohol and Other Drug Thesaurus (for which he chairs the advisory committee) and the Harvard-Stanford Business Thesaurus (under development). He is developing TermMaster, a thesaurus management software package. In 1997 he received the American Society for Information Science Award of Merit.

Tutorial 7

Title:	Social Change, Digital Libraries, and the Inquiry Page [CANCELLED]

Tutorial 8

Title:	XML
Presenters:	David Durand, Ingenta/Scholarly Technology Group & Department of Computer Science, Brown University
Email:	david.durand@prov.ingenta.com
Duration:	Half-day (paired with XSL)
Level:	Introductory to intermediate

Description: XML is a core document technology that is starting to revolutionize the way data and documents are managed and produced on the web and elsewhere by bringing years of text-encoding experience to bear on document representation and management. This course provides an introduction to the philosophy and technical details of XML. This course will comprise a technical overview of the language itself, as well as information about its applications (especially as they relate to document management and hypertext). We will discuss document analysis, and the place of XML in the larger group of related standards efforts, such as XLink, RDF, XSL, and DOM.

The course is targeted at anyone who thinks they need a technical and strategic overview of XML.

Biography of Presenter: David Durand is VP software architecture at Ingenta plc, Chief Scientist at the Scholarly Technology Group, and Adjunct Associate Professor at Brown's Department of Computer Science. Dr. Durand is a co-author, with Steve DeRose, of "Making Hypermedia Work: A User's Guide to HyTime." He has taken part in the Text Encoding Initiative, XML, HyTime, XLink and WebDAV standards efforts, and has been working with and on structured document representations and hypertext for the last 20 years, in academic and industrial contexts. This year he is teaching an experimental course in Document Engineering at the Department of Computer Science at Brown.

Afternoon Tutorials, May 27, 2003

Tutorial 9

Title:	Build Your Own Digital Library Collections: Hands-On Session
Presenters:	Ian H. Witten, Dept. of Computer Science, University of Waikato; David Bainbridge, Dept. of Computer Science, University of Waikato
Email:	ihw@cs.waikato.ac.nz, davidb@cs.waikato.ac.nz
Duration:	Half-day (paired with How to Build a Digital Library Using Open-Source Software)
Level:	Intermediate

Description: This is a hands-on laboratory-style workshop that follows on from the tutorial "How to build a digital library using open-source software". Attendees will first install the basic Greenstone system (described in the former tutorial) on their own computer. Then they will learn how to personalize its appearance, how to build their own collections, and how to take advantage of advanced features such as interactive phrase browsing. The primary goal is to enable attendees to create a collection of their own material that they bring along to the workshop, and leave the workshop with that collection (and others) installed in a digital library system on their own computer.

The workshop will begin with a brief software installation session from self-installing CD-ROMs that the instructors will bring along for the purpose (precisely the same software is available over the Web). Then the appearance of the digital library home page will be discussed, and attendees will learn to alter this by writing HTML that includes Greenstone macros.

In the main part of the workshop attendees will learn how to build their own collections, and experiment with the options available. The two instructors will provide individual advice and assistance, identifying common problems and presenting them to the class as a whole on an ad hoc basis.

At the end of the workshop, there will be an opportunity for any attendees who wish to give a brief presentation of their collection to the class, using a computer projector (to be provided by the conference organizers).

FACILITIES

Attendees are expected to bring:

1. Laptop PC running Windows, Linux, or Mac OS X

Must have a reasonable amount of available disk space. OS X users must have previously installed GDBM.

2. A substantial collection of documents in forms such as:

HTML
Word
PDF
PostScript
JPEG or GIF images
E-Mail
Other formats on a try-it-and-see basis

Attendees must have attended the associated tutorial "How to build a digital library using open-source software", either at this conference or previously.

Greenstone includes a collection-building facility that is designed for end users: no particular computing skills are necessary to use this. Some familiarity with HTML is necessary to personalize the Greenstone home page. To access more advanced facilities (which is optional), attendees should be familiar with executing programs, particularly Perl programs, from the DOS prompt.

Biographies of Presenters

Ian H. Witten (see above)

David Bainbridge is a faculty member in Computer Science at the University of Waikato, New Zealand. An active member of the New Zealand Digital Library project, he has worked with several United Nations Agencies, the BBC and various public libraries. He holds a PhD from the University of Canterbury, New Zealand, where he studied the problem of optical music recognition as a Commonwealth Scholar. Since moving to Waikato in 1996 he has continued and broadened his interest in computer music research, which has received international press and TV coverage, and he was a co-recipient of the Digital Libraries Vannevar Bush award in 1999. David has also worked as a research engineer for Thorn EMI in the area of photo-realistic imaging and graduated from the University of Edinburgh in 1991 as the class medallist in Computer Science. He is co-author of "How to build a digital library" (Morgan Kaufmann, 2003).

Tutorial 10

Title:	Advanced Tutorial on Open Archives Initiative
Presenters:	Michael L. Nelson, Dept. of Computer Science, Old Dominion University; Simeon Warner, Digital Libraries Group in Computing and Information Science, Cornell University; Herbert Van de Sompel, Los Alamos National Laboratory
Email:	mln@cs.odu.edu, simeon@cs.cornell.edu, herbertv@lanl.gov
Duration:	Half-day (paired with Introduction to the Open Archives Initiative Protocol for Metadata Harvesting)
Level:	Intermediate/Advanced

Description: This tutorial is based on the successful tutorial of the same name given at JCDL 2002 and is a follow-on to "Introduction to the Open Archives Initiative Protocol for Metadata Harvesting" (OAI-PMH), to be given earlier the same day. The introductory tutorial is led by Tim Cole (UIUC), and the contents of the two tutorials are being coordinated. This tutorial is appropriate for those who have completed the earlier tutorial or are already familiar with OAI-PMH. The tutorial will begin by highlighting the differences between versions 1.1 and 2.0 of the OAI-PMH. Possible migration strategies for 1.1 harvesters and repositories and techniques for mixed version harvesting will be discussed. Advanced topics and deployment scenarios will also be discussed, including: flow control, load balancing, error recovery, hierarchical harvesting, sets and alternate metadata formats.
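Flow control, one of the advanced topics listed above, is handled in OAI-PMH 2.0 through the resumptionToken element: a repository may return a list in several chunks, and the harvester repeats the request with the token until an empty or absent token marks the end. The Python sketch below shows that loop under stated assumptions. The names harvest_identifiers and make_fake_repository are hypothetical, the repository is simulated in memory so the example runs offline, and the numeric tokens are a shortcut, since real tokens are opaque strings chosen by the repository.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def harvest_identifiers(fetch, metadata_prefix="oai_dc"):
    """Collect identifiers from a ListIdentifiers list delivered in chunks.

    `fetch` takes a dict of request arguments and returns the response XML.
    """
    arguments = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    identifiers = []
    while True:
        root = ET.fromstring(fetch(arguments))
        for header in root.iter("{%s}header" % OAI_NS):
            identifiers.append(header.findtext("{%s}identifier" % OAI_NS))
        token = root.findtext(".//{%s}resumptionToken" % OAI_NS)
        if not token:
            break  # absent or empty token: the list is complete
        # resumptionToken is an exclusive argument: send it with the verb only.
        arguments = {"verb": "ListIdentifiers", "resumptionToken": token}
    return identifiers


def make_fake_repository(pages):
    """Simulate a repository that serves one page of identifiers per request."""
    def fetch(arguments):
        index = int(arguments.get("resumptionToken", "0"))
        headers = "".join(
            "<header><identifier>%s</identifier></header>" % identifier
            for identifier in pages[index]
        )
        next_token = str(index + 1) if index + 1 < len(pages) else ""
        return (
            '<OAI-PMH xmlns="%s"><ListIdentifiers>%s'
            "<resumptionToken>%s</resumptionToken>"
            "</ListIdentifiers></OAI-PMH>" % (OAI_NS, headers, next_token)
        )
    return fetch


if __name__ == "__main__":
    fetch = make_fake_repository([["oai:a:1", "oai:a:2"], ["oai:a:3"]])
    print(harvest_identifiers(fetch))
```

A production harvester would add error recovery around each request, for example retrying after a delay when the repository asks it to back off.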

Biographies of Presenters

Michael L. Nelson received his B.S. in computer science from Virginia Tech in 1991, and his M.S. and Ph.D. in computer science from Old Dominion University in 1997 and 2000. He worked at NASA Langley Research Center from 1991 - 2002, originally in distributed and parallel computing and then shifting to WWW and digital libraries in 1993. He was a visiting assistant professor at the School of Information and Library Science at the University of North Carolina at Chapel Hill for the 2000-2001 academic year. He joined the department of computer science at Old Dominion University in July 2002. Michael is a member of the OAI technical committee.

Simeon Warner is one of the maintainers and developers of the arXiv e-print archive (http://arXiv.org/). He is a member of the Digital Libraries Group in Computing and Information Science at Cornell University. Before that he worked at Los Alamos National Laboratory and the Physics Department at Syracuse University. He has implemented and maintains an OAI interface for arXiv, along with a test suite of harvesting software. Simeon is a member of the OAI technical committee.

Herbert Van de Sompel graduated in mathematics and computer science at Ghent University, and in 2000, obtained a Ph.D. from Ghent University for his research on dynamic and context-sensitive reference linking, now commonly known as the OpenURL framework. From 1982 to 1998 he worked as Head of Library Automation at Ghent University. While at the Los Alamos National Laboratory in 1999, Herbert started the Open Archives Initiative with Paul Ginsparg and Rick Luce. With Carl Lagoze, Herbert forms the executive committee of the OAI; he also is on the technical and the steering committees. Herbert was a Visiting Professor in Computer Science at Cornell University in 2000-2001. Afterwards, he was the Director of e-Strategy and Programmes at the British Library. Currently, he is a digital library researcher at the Los Alamos National Laboratory.

Tutorial 11

Title:	How to Build a Geospatial Digital Library [CANCELLED]
Presenters:	Gregory Janée, Department of Computer Science, University of California, Santa Barbara; Rudolf Nottrott, Department of Computer Science, University of California, Santa Barbara; James Frew, Donald Bren School of Environmental Science and Management, University of California, Santa Barbara; Catherine Masi, Map and Imagery Laboratory (MIL), University of California, Santa Barbara
Email:	gjanee@alexandria.ucsb.edu, RNott@alexandria.ucsb.edu, frew@bren.ucsb.edu, masi@library.ucsb.edu
Duration:	--
Level:	Intermediate to advanced
Expected audience:	20

Description: This tutorial will be of interest to individuals or institutions with geospatial digital content which they would like to publish for structured search and retrieval over the Web. The tutorial is based on software developed by the Alexandria Digital Library Project (ADL), which facilitates the creation and management of distributed digital library collections. ADL collections can operate stand-alone for use by individual users, or optionally and seamlessly switch into a distributed mode for web-based information sharing and publication. Geospatial collections are typically heterogeneous in content and can span items as diverse as maps, historical photographs, field data, remotely sensed images or archeological data. The ADL software allows structured search and retrieval on such heterogeneous data collections, combining the simplicity of Dublin Core with the specificity of a full Boolean query language. The aim of the tutorial is to familiarize participants with the overall technology and with the specific procedures and software involved in setting up a stand-alone or distributed ADL node. As a case study, we will focus on a collection of USGS Digital Raster Graphics (DRG) maps. However, the technology we present is much more general: it can be applied to collections of any georeferenced library objects and, further, to collections of any objects to which a structured discovery technique can be applied. Based on Open Source components and open protocol standards (including Java, Tomcat, XML, JDBC, SQL), the ADL software is freely available and can be installed on all common software and hardware platforms.

Biographies of Presenters

James Frew (http://www.esm.ucsb.edu/fac_staff/fac/frew/) is an Assistant Professor in the Donald Bren School of Environmental Science and Management at the University of California, Santa Barbara (UCSB), and a principal investigator in UCSB's Institute for Computational Earth System Science (ICESS). His research is centered on applications of computing technology to environmental science, particularly involving digital geolibraries and Earth science workflow management. Dr. Frew currently leads the Earth System Science Workbench project, part of NASA's Federation of Earth Science Information Partners (ESIPs). He is a co-PI on the Alexandria Project (part of NSF's Digital Libraries Initiative), where he directs the development of the Alexandria Digital Earth Prototype (ADEPT) testbed system. Dr. Frew also serves on the National Academy of Science's Committee on Earth Science Data Utilization (CESDU), and as consultant to NASA's Strategic Evolution of ESE Data Systems (SEEDS) activity.

Gregory A. Janée (http://alexandria.sdc.ucsb.edu/~gjanee/) is technical leader of the Alexandria Digital Library project and principal author of the library software.

Rudolf W. Nottrott (http://alexandria.ucsb.edu/~rnott/) brings twenty-five years of experience in academic, environmental and commercial information technology to his present position as software engineer with the Alexandria Digital Earth Prototype (ADEPT) project. His present work focuses on the implementation of distributed digital collections based on the Alexandria Digital Library (ADL) technology, and related software development. Prior to joining the ADEPT group, his work focused on information systems development for the University of California Natural Reserve System, and development of software tools for distributed data management at the National Center for Ecological Analysis and Synthesis (NCEAS). He founded the network information systems of the U.S. Long-Term Ecological Research Network (LTER) and served as the network's information systems manager from 1989 to 1997.

Catherine Masi joined the Map and Imagery Laboratory (MIL) in May 1998. Her previous professional experience includes systems programming and project management at Transamerica Corporation as well as business application programming at the UCLA Financial Aid Office. She earned her Master's in Library and Information Science from UCLA in 1997 and is now working with metadata creation and analysis while also coordinating the ADL project team. She has also provided data processing support to the Evaluation Team for statistical analysis of user registration and session log data.

Tutorial 12

Title:	Open Content Licenses and Copyright
Presenters:	Chris Kelty
Email:	ckelty@rice.edu
Duration:	Half-Day
Level:	Beginner to intermediate

Description: This tutorial covers the use of "open content licenses" (similar to the "open source" licenses used in software) for digital and online works. It covers the available licenses and the justifications and criteria for their use, with a special focus on the work of the non-profit organization Creative Commons. It will address issues related to copyright and credit that are likely to arise and touch on some of the historical, political and cultural issues of open licensing. This tutorial will not cover the use of so-called "End User License Agreements" and is not intended to provide legal advice.

Biography of Presenter

Christopher Kelty teaches in the Anthropology department at Rice University. He is trained in the social and historical study of science and technology and has researched open source and free software communities around the world.

Tutorial 13

Title:	Thesauri and Ontologies in Digital Libraries II
Presenters:	Dagobert Soergel, College of Information Studies, University of Maryland, College Park, MD
Email:	ds52@umail.umd.edu
Duration:	Half-day (paired with Thesauri and Ontologies I)
Level:	Intermediate

Description: This tutorial is intended for people who have a basic familiarity with the function and structure of thesauri and ontologies. It will introduce criteria for the design and evaluation of thesauri and ontologies and then deal with methods and tools for their development: locating sources; collecting concepts, terms, and relationships to reuse existing knowledge; developing and refining thesaurus/ontology structure; software and database structure for the development and maintenance of thesauri and ontologies; standards such as RDF and TopicMaps; collaborative development of thesauri and ontologies; developing crosswalks/mappings between thesauri/ontologies. In summing up, the tutorial will address the question of the amount of resources needed to develop and maintain a thesaurus or ontology.

Biography of Presenter

Please see above.

Tutorial 14

Title:	XSL
Presenters:	David Durand
Email:	david.durand@prov.ingenta.com
Duration:	Half-Day (paired with XML)
Level:	Intermediate

Description: XSLT (XSL Transformations, the W3C's language for transforming XML documents) is becoming a key tool in the XML implementor's toolbox. XSLT provides an XML processing language tuned to the creation of XML transformations, and transformation is a key technique in building most document systems. The course is targeted towards anyone who knows something about XML and wants to see how it can be manipulated and processed in useful ways. In addition to pure transformations, we will take a look at XSL-FO (XSL Formatting Objects), a set of XML objects used to produce formatted documents for printing or web delivery.

Biography of Presenter

Please see above.

Full Day Tutorial: Saturday, May 31, 2003

Tutorial 15

Title:	Audio/Video Digital Libraries: designing, searching for documents, and generating Metadata
Presenters:	Giuseppe Amato, ISTI-CNR; Claudio Gennaro, ISTI-CNR; Pasquale Savino, ISTI-CNR
Email:	G.Amato@iei.pi.cnr.it, C.Gennaro@iei.pi.cnr.it, P.Savino@iei.pi.cnr.it
Duration:	Full Day
Level:	Introductory-Intermediate
Expected audience:	50

Description: The aim is to provide a theoretical and experimental background on the techniques and the methodologies for the organization, creation, and management of an Audio/Video Digital Library (A/V DL). The frontier of DLs lies in the ability to manage multimedia documents, not only purely textual information. In particular, due to the large amount of A/V material that is available in digital form and due to its economic, environmental, health, cultural, and social significance, the management of A/V digital libraries is becoming crucially important. The course will illustrate the techniques and the methodologies to design, build and maintain an A/V DL, along with fundamentals of techniques for generating and searching metadata. Extensive examples will be offered of existing systems and approaches. As a running example we will refer to the ECHO system, which provides a DL service for historical films. It indexes and retrieves the A/V material by using speech transcripts, and enables video features to be automatically extracted from the video and metadata to be manually associated by the user. Metadata are described by using an A/V metadata model based on the IFLA-FRBR standard.

Biographies of Presenters

Pasquale Savino (born 1955) graduated in Physics from the University of Pisa, Italy, in 1979. From 1983 to 1995 he worked at the Olivetti Research Labs in Pisa; since 1996 he has been a member of the research staff at CNR-IEI in Pisa, working in the area of multimedia information systems and Digital Libraries. He has participated in and coordinated several EU-funded research projects in the multimedia area and Digital Libraries, among which MULTOS (Multimedia Office Systems), OSMOSE (Open Standard for Multimedia Optical Storage Environments), MALIBU (Multimedia And distance Learning In Banking and Business environments), HYTEA (HYperText Authoring), M-CUBE (Multiple Media Multiple Communication Workstation), MIMICS (Multiparty Interactive Multimedia Conferencing Services), HERMES (Foundations of High Performance Multimedia Information Management Systems). Currently he is the coordinator of the EU project ECHO (European Chronicle On line), aiming at the design, development and experimental use of an A/V DL for historical documentaries. He has given a tutorial on "Interactive Multimedia" at the 4th International Conference on Human-Computer Interaction and (jointly with C. Meghini and F. Sebastiani) a tutorial on "Multimedia Information Retrieval" at the First European Conference on Research and Advanced Technology for Digital Libraries, Pisa (1997), and with F. Sebastiani the same tutorial at the Second European Conference on Research and Advanced Technology for Digital Libraries, Crete (1998). He has published scientific papers in many international journals and conferences in the areas of multimedia document retrieval, information retrieval and digital libraries.

Giuseppe Amato (born 1968) graduated in Computer Science from the University of Pisa, Italy, in 1992 and received his PhD in Computer Science from the University of Dortmund, Germany, in 2002. In 1994 he was a member of the research staff at CNR-CNUCE in Pisa, working in the area of object-oriented databases and persistent programming languages. Since 1995 he has been a member of the research staff at CNR-IEI in Pisa, working in the area of Multimedia Information Systems and Digital Libraries. He has participated in several EU-funded research actions in the areas of object-oriented databases and content-based retrieval of multimedia data. He was involved in the EU-funded ESPRIT projects FIDE-2 (object-oriented persistent programming languages), HERMES (multimedia information systems), and EUROGATHERER (filtering systems). Currently he is involved in the EU-funded project ECHO, whose aim is to build a Digital Library of historical documentary films. He has given (jointly with F. Sebastiani) a full-day tutorial on "Information retrieval and Web Search Engines" at the Eighth International World Wide Web Conference, May 11-14, 1999, in Toronto, Canada. His main research interests are content-based retrieval of multimedia documents, access methods for similarity search of multimedia documents, metadata for multimedia documents, multimedia digital libraries, and XML databases. He has published several papers in journals and conferences in these areas.

Claudio Gennaro (born 1968) received the Laurea degree in Electronic Engineering from the University of Pisa in 1994 and the PhD degree in Computer and Automation Engineering from Politecnico di Milano in 1999. His PhD studies were in the field of Performance Evaluation of Computer Systems and Parallel Applications. Since 1999 he has been a member of the research staff at CNR-IEI in Pisa, working in the area of Multimedia Information Systems and Digital Libraries. He has participated in many research projects funded by the EU. He was involved in TRACS n. 6373 (Flexible Real time Environment for Traffic Control System) and HPCC/SEA (High Performance Parallel Computing/Software Engineering and Application, EUREKA 1063). Currently he is involved in the EU-funded project ECHO (European Chronicle On line), aiming at the design, development and experimental use of an Audio/Video digital library for historical documentaries. His current main research interests are performance evaluation, similarity retrieval, storage structures for multimedia information retrieval, multimedia document modelling, metadata for multimedia documents and digital libraries. He has published several papers in journals and conferences in these areas.

Morning Tutorial, May 31, 2003

Tutorial 16

Title:	SRW (Search Retrieve WebService): Z39.50 Next Generation
Presenters:	Matthew J. Dovey, Oxford University Computing Service; Robert Sanderson, University of Liverpool; Ralph LeVan, OCLC Online Computer Library Center, Inc.
Email:	matthew.dovey@oucs.ox.ac.uk, azaroth@liverpool.ac.uk
Duration:	Half-Day
Level:	Intermediate

Description: SRW is the "Search/Retrieve Web Service" protocol, which aims to integrate access to various networked resources, and to promote interoperability between distributed databases, by providing a common utilization framework. Typical applications include library systems and digital library systems. SRW is a web-service-based protocol whose underpinnings are formed by bringing together more than 20 years' experience from the collective implementers of the Z39.50 Information Retrieval protocol with recent developments in the web technologies arena. SRW features both SOAP and URL-based access mechanisms to provide for a wide variety of possible clients ranging from Microsoft's .Net initiative to simple JavaScript and XSLT transformations. SRW 1.0 was released in November 2002, having been developed by an international team. The tutorial covers developing clients and servers for this WebService under a number of environments including Java, .NET, Perl and Python.

Biographies of Presenters

Matthew Dovey is currently Technical Manager at the Oxford University e-Science Centre where he advises scientific research projects based on WebService and GridService architectures. Prior to this position he worked for the Oxford University Library Services implementing numerous library and digital library technologies including Z39.50, ISO ILL, 3M SIP and NCIP. He has also worked on a number of national and international projects including JAFER (a JISC funded Java Z39.50/XML toolkit), OMRAS (a JISC/NSF project on Music Information Retrieval) and CEDARS (a JISC project on preservation of digital material).

Robert Sanderson is the lead developer in the United Kingdom for the Cheshire 2 Information Retrieval System, an SGML/XML aware search engine providing access through a range of protocols from Z39.50 to SRW. Born in Canterbury, New Zealand, he has just completed his interdisciplinary PhD in Medieval French and Computer Science at the University of Liverpool.

Ralph LeVan has been in the bibliographic retrieval business since he left college. He spent ten years working at SDC (later Burroughs and Unisys) ORBIT. With the sale of ORBIT to Robert Maxwell, he chose to join OCLC and has been there for the last 16 years. During that time he led the development of the retrieval software that underlies the FirstSearch service (Newton). With the successful launch of FirstSearch, attention was shifted to database interfaces, the most significant of those being Z39.50. He continues to be active in the Z39.50 Implementors Group (ZIG). Moving up the interface stack, he led the development of OCLC's Z39.50-to-Web gateway (WebZ), which became the interface development environment for FirstSearch. WebZ, combined with the Newton database software, became the SiteSearch product. Most recently, he has produced a second generation of database software (Pears), which is available as Open Source from OCLC's Office of Research. Recapitulating previous work, he is looking again at database interfaces and is working on the initiative to implement Z39.50 as a Web Service (SRW).

last modified 3/18/2003 by JCDL2003@rice.edu