Tutorials and Workshops

Tutorials - June 21st

Introduction to Digital Libraries

Full Day

Organizer: Edward Fox. 

Abstract: This tutorial is a thorough and deep introduction to the DL field, providing a firm foundation that covers key concepts and terminology, as well as services, systems, technologies, methods, standards, projects, issues, and practices. It introduces and builds upon a firm theoretical foundation (starting with the '5S' set of intuitive aspects: Streams, Structures, Spaces, Scenarios, Societies), giving careful definitions and explanations of all the key parts of a 'minimal digital library', and expanding from that basis to cover key DL issues. Illustrations will come from a well-chosen set of case studies. Attendees will be exposed to four books that elaborate on 5S, published by Morgan & Claypool in 2012, 2013, and 2014. Complementing the coverage of '5S' will be an overview of key aspects of the DELOS Reference Model and DL.org activities. Further, use of a Hadoop cluster supporting DLs will be described.

Topic Exploration with the HTRC Data Capsule for Non-Consumptive Research 

Half Day

Organizers: Jaimie Murdock, Jiaan Zeng and Robert McDonald

Abstract: In this half-day tutorial, we will show 1) how the HathiTrust Research Center (HTRC) Data Capsule can be used for non-consumptive research over a collection of texts and 2) how integrated tools for LDA topic modeling and visualization can be used to drive the formulation of new research questions. Participants will be given an account in the HTRC Data Capsule and taught how to use the workset manager to create a corpus, and then use the VM’s secure mode to download texts and analyze their contents.

Automatic Methods for Disambiguating Author Names in Bibliographic Data Repositories

Half Day

Organizers: Anderson Ferreira, Marcos Andre Goncalves and Alberto Laender. 

Abstract: Name ambiguity in the context of bibliographic citation records is a hard problem that affects the quality of services and content in digital libraries and similar systems. This problem occurs when an author publishes works under distinct names or distinct authors publish works under similar names. The challenges of dealing with author name ambiguity have led to a myriad of disambiguation methods. Thus, in this tutorial we propose a taxonomy for characterizing such methods and present an overview of some of the most representative ones, as well as discuss some open challenges.

Digital Data Curation Essentials for Data Scientists, Data Curators and Librarians

Full Day

Organizers: Helen Tibbo and Carolyn Hank.

Abstract: As major funding agencies, publishers, and research institutions continue to issue data sharing, management, and archiving policies in increasing numbers, it is necessary for data scientists and information professionals, including data curators and data librarians, to “skill up to do data” by gaining the knowledge, skills, and competencies necessary to confront their growing, and increasingly complex, data management needs. With lecture, discussion, and hands-on exercises, this tutorial will explore the obligations of researchers to manage their data, identify the attributes of data that add to the complexity of data curation tasks, and introduce a range of standards, tools, and resources available to help data scientists and librarians effectively implement data management and curation services. Further, because an essential aspect of data management planning is ensuring that data is made available for future access and use, depositing data to an appropriate data or digital repository is either a required or highly recommended outcome.

This tutorial will also explore submission agreements that enable such deposits by making clear the expectations, roles, and responsibilities of data scientists and repository managers. This tutorial is being offered, in part, through the CRADLE (Curating Research Assets and Data Using Lifecycle Education) project, sponsored by the Institute of Museum and Library Services, under award #RE-06-13-0052-13.

Workshops - June 24th

4th International Workshop on Mining Scientific Publications (WOSP 2015)

Full Day

Co-Chairs: Petr Knoth, Kris Jack, Lucas Anastasiou, Nuno Freire, Nancy Pontika and Drahomira Herrmannova.

Abstract: Digital libraries that store scientific publications are becoming increasingly central to the research process. They are not only used for traditional tasks, such as finding and storing research outputs, but also as a source for discovering new research trends or evaluating research excellence. With the current growth of scientific publications deposited in digital libraries, it is no longer sufficient to provide only access to content. To aid research, it is especially important to leverage the potential of text and data mining technologies to improve the process of how research is being done.

This workshop aims to bring together people from different backgrounds who: (a) are interested in analysing and mining databases of scientific publications, (b) develop systems that enable such analysis and mining of scientific databases (especially those who run databases of publications), or (c) develop novel technologies that improve the way research is being done.

Digital Libraries for Musicology (DLfM) 2015

Half Day

Co-Chairs: Kevin Page and Benjamin Fields.

Abstract: Many Digital Libraries have long offered facilities to provide multimedia content, including music. However, there is now an ever more urgent need to specifically support the distinct multiple forms of music, the links between them, and the surrounding scholarly context, as required by the transformed and extended methods being applied to musicology and the wider Digital Humanities.

The Digital Libraries for Musicology (DLfM) workshop presents a venue specifically for those working on, and with, Digital Library systems and content in the domain of music and musicology. This includes Music Digital Library systems, their application and use in musicology, technologies for enhanced access and organisation of musics in Digital Libraries, bibliographic and metadata for music, intersections with music Linked Data, and the challenges of working with the multiple representations of music across large-scale digital collections such as the Internet Archive and HathiTrust.

This will be the second edition of DLfM, following a very successful and well-received workshop at Digital Libraries 2014, giving an opportunity for the community to present and discuss developments in the last year that tackle the agenda that emerged in London. In particular, we encourage participants to consider the theme of the main conference - “Large, Dynamic and Ubiquitous” - and how these properties are reflected in Music Digital Libraries and their application to musicology.

Web Archiving and Digital Libraries (WADL 2015)

Full Day

Co-Chairs: Edward Fox and Zhiwu Xie.

  • This workshop will explore the integration of web archiving and digital libraries over the complete life cycle: creation/authoring, uploading/publishing in the Web, …
  • It will cover all topics of interest, including but not limited to:
    Archiving (events); Big data; Classification;
    Community building; Crawling (focused); Curation, quality control;
    Databases / collections; Discovery; Extraction & analysis;
    Filling gaps; Globalization, languages; Social Sciences;
    Linking archives; Metadata; Mobile devices;
    Network science; Preservation; Resource description;
    Standards, protocols; Systems, tools; Tweet connections
  • The objectives of the workshop are:
    to continue to build the community of people integrating web archiving & DLs;
    to help attendees learn about useful methods, systems, and software in this area;
    to help chart future research and improved practice in this area;
    to promote synergistic efforts, including collaborative projects and proposals;
    to produce an archival publication that will help advance technology and practice;
    to identify work for a planned journal special issue in Springer’s IJDL

iSamplES – Internet of Samples in Earth Sciences

Half Day

Co-Chairs: Unmil Karadkar and Kerstin Lehnert.

Abstract: Research in the Earth Science disciplines depends on the availability of representative samples collected above, at, and beneath Earth’s surface, on the Moon and in space, or generated in experiments. These samples serve as fundamental references for generating new knowledge about the Earth and the entire universe and a deeper understanding of the processes that created and shaped it, the availability of natural resources, and the risk of natural hazards. Many samples have been collected at great cost and with substantial difficulty, and are rare or unique and irreplaceable. The EarthCube Research Coordination Network (RCN) iSamplES (Internet of Samples in the Earth Sciences) is intended to advance the use of innovative cyberinfrastructure to connect physical samples and sample collections across the Earth Sciences with digital data infrastructures to revolutionize their utility in the support of science. The goal of this RCN is to dramatically improve the discovery, access, sharing, analysis, and curation of physical samples and the data generated by their study for the benefit of science and society as part of the EarthCube program. The RCN hosted its first workshop in Austin, TX and is coordinating its efforts with other EarthCube RCNs, such as the Earth-Centered Communication for Cyberinfrastructure (EC3).

The proposed workshop will focus on designing the distributed data infrastructure required to make these samples easily accessible, to ensure persistent access to relevant sample metadata, and to allow unambiguous linking of the physical objects to the digital data. This workshop is intended to attract a broad audience comprising domain scientists, data curators, and computer and information scientists to learn from each other about the requirements of physical and digital sample and collection management. Attendees will address the issues and challenges in the creation, development, and maintenance of collections, system architectures, administration, access and management, user interfaces, requirements engineering, evaluation models, and policy implications for digital collections.

Knowledge Maps and Information Retrieval (KMIR)

Half Day

Co-Chairs: Peter Mutschke, Andrea Scharnhorst, Philipp Mayr, Aida Slavic and Preben Hansen.

Abstract: A particular point of failure in information systems is the vagueness between user search terms and the knowledge orders of the information space in question. Guided searching is therefore becoming increasingly important for discovering information precisely without knowing the right search terms. Knowledge maps are promising tools for visualizing the structure of large-scale information spaces. However, there is no continuous knowledge exchange between the “map makers” on the one hand and the Information Retrieval (IR) specialists on the other. Thus, knowledge maps are still far from being applicable for searching a digital library, due to a lack of models, explorations, and user studies that properly combine insights from the two strands. The half-day workshop aims to bring together these two communities: experts in IR reflecting on visually enhanced search interfaces and experts in knowledge mapping reflecting on interactive visualizations of the content of a digital library collection. The focus of the workshop is to discuss the potential of interactive knowledge maps for information exploration.