2017 event

The second face-to-face Mashcat event in North America was held on January 24, 2017 at Georgia State University Library in Atlanta, Georgia.


Time Session Speaker(s)
9:15-9:45 Mashcat facilitated discussion Erin Leach
After some logistical announcements, Erin Leach will lead a discussion of attendees’ current pressing concerns at the intersection of technology and library data.
9:45-10:15 Linking People: Collaborations Between Metadata Librarians and Programmers Jeremy Myntti, Anna Neatrour, Liz Woolcott
In early 2016, the University of Utah was awarded an IMLS Grant to explore the metadata, workflows, technical requirements, and software that could be used to implement a collaborative, regional controlled vocabulary for personal names and corporate bodies for digital collections within the Mountain West region. This four-phase project includes collaborations between metadata librarians and programmers to investigate tools that can be used to create and maintain this type of authority file. Currently, the project is evaluating tools to use for local or regional authority control. In this phase, we are developing a set of evaluation criteria that can be used to rank these tools in order to make sure that the needs of a pilot implementation of this type of vocabulary are being met. This presentation will have information about the full grant project along with details regarding the evaluation criteria developed by metadata librarians and programmers and how the different tools rank according to those criteria.

Slides [PPTX]
10:15-10:45 Using Big Data Techniques for Metadata Remediation Roy Tennant
The largest library metadata aggregation in the world also has the most metadata errors. With 380 million MARC records and growing, OCLC must use big data techniques to find and fix errors in WorldCat. Apache Hadoop and Spark are tools that OCLC Research uses to report on MARC errors that are fixed by WorldCat Quality control. The MapReduce paradigm will be explained and illustrated using a few simple examples of Python and/or Scala programs.
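As a rough illustration of the MapReduce paradigm mentioned in this abstract, the sketch below counts how often each field tag occurs across a handful of records. It is plain Python rather than Hadoop or Spark, and the record structure (lists of tag/value pairs) and all function names are assumptions made for this example, not OCLC's code or data.

```python
from collections import defaultdict

# Hypothetical, heavily simplified records: each is a list of (tag, value) pairs.
records = [
    [("245", "Title A"), ("650", "Cats"), ("650", "Dogs")],
    [("245", "Title B"), ("100", "Doe, Jane")],
    [("650", "Cats")],
]


def map_record(record):
    """Map step: emit a (tag, 1) pair for every field in one record."""
    return [(tag, 1) for tag, _value in record]


def shuffle(pairs):
    """Shuffle step: group the emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the grouped values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_tags(all_records):
    """Run map, shuffle, and reduce in sequence on a single machine."""
    pairs = [pair for record in all_records for pair in map_record(record)]
    return reduce_counts(shuffle(pairs))


tag_counts = count_tags(records)
```

In a real cluster the map and reduce steps would run in parallel across many machines; here they run sequentially only to show the shape of the computation.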
10:45-11:15 Reconciling Legacy Archival Metadata Greer Martin
Before our archive became open to the public one year ago, the catalog of nearly 4,000 records was findable by keyword search and little else. The data was devoid of authorities and controlled vocabularies, and had not been compiled into finding aids. Attempting a migration from our legacy museum software to ArchivesSpace involved a great deal of cleanup, and authority reconciliation proved to be the most challenging part. OpenRefine is a powerful tool for reconciliation, but what if Library of Congress Subject Headings (LCSH) or Name Authority File (NAF) was never consulted to begin with? This presentation will investigate the tools and decisions involved in resolving data to LCSH and NAF, creating local authorities, and incorporating URIs for future Linked Data projects.

Slides [PDF]
11:15-11:30 Coffee and tea break
11:30-12:30 Teaching Linked Data to Librarians: A Discussion of Pedagogical Methods Sonoe Nakasone, Jacob Shelby, Allison Jai O’Dell
How can libraries prepare their personnel for Linked Data modeling and Semantic Web technologies? Three panelists will share their experiences providing educational opportunities around Linked Data in libraries. These talks will be followed by an interactive question-and-answer session between the panel and audience, aimed at gaining insights to teach new skills in data management, analysis, and programming to a library audience.

North Carolina State University
North Carolina State University (NCSU) Libraries has experimented with Linked Data through various projects, such as its Organizational Names Linked Data project and adding schema.org metadata to web content. NCSU has used these projects to engage staff in learning about Linked Data through the familiar concepts of descriptive and authorities cataloging. By combining on-the-job learning with locally developed training sessions and educational presentations, NCSU Libraries Digital Projects and Partnerships Unit is building the experience and expertise to provide the library with services in semantic web technologies.

University of Florida
The Linked Data Working Group at the University of Florida’s George A. Smathers Libraries has developed a curriculum to teach Semantic Web technologies (with topics including the Linked Data principles and Resource Description Framework, various serializations, ontologies and ontology languages, and the SPARQL query language). These mini-workshops are aimed at library staff with a metadata background, but little-to-no programming experience. By combining these workshops with discussions aimed at teaching conceptual change and inspiring innovation, the Smathers Libraries are building capacity for new metadata services.

12:30-1:45 Lunch
Lunch is on your own.
1:45-1:55 #mashcat meets the org chart Holly Tomren
In 2016, Drexel University Libraries created a new division, Data & Digital Stewardship, which includes the Metadata, Discovery, and Archives units. The presenter will discuss the benefits of this new organizational structure in which metadata staff, archives staff & developers are now working together under the same director with common strategic goals.
1:55-2:05 Hyacinth: a new cataloging tool for an institutional repository Brian Luna Lucero
This talk will present the creation of Hyacinth, a Hydra-based cataloging tool, and its customization for Academic Commons, Columbia’s institutional repository. Additionally, it will cover the transformation of existing records and their subsequent migration to Hyacinth.

Hyacinth is a new cataloging tool coded by developers in CU Libraries’ Digital Projects Division to be used for multiple digital collections projects. The libraries supported development of new software for cataloging digital collections and exporting them to a Fedora repository in order to unify the workflows of several departments and ease the demands for maintenance of multiple platforms. Hyacinth also provides several forward-looking upgrades including a Hydra-based architecture that incorporates the Portland Common Data Model (PCDM) and a URI service capable of minting local URIs and linking to existing authorities. Creating one tool that suits the needs of different departments and projects presented its own technical challenges, however.

A new team was created to handle the implementation of Hyacinth in Academic Commons. Members from different departments brought expertise in repository infrastructure, metadata, software development and project management. In addition to configuring the necessary fields for repository records and mapping them to MODS XML, the team also implemented a template view to help novice catalogers create records, the ability to mint DOIs using the EZID service, and a separate application for processing SWORD deposits and delivering them to Hyacinth. Migrating existing records from Hypatia also included transformation of subject topics to the FAST vocabulary in order to fit Hyacinth’s linked data architecture.

Our presentation will describe the technical characteristics of the repository and the new cataloging tool and the challenges we faced in implementing Hyacinth. We will also present new possibilities for improving the repository in light of the upgrades provided by Hyacinth.
2:05-2:35 Automating controlled subject access from IR keyword strings Matthew Miguez
When moving from a proprietary and hosted IR solution to a local and open one, an intense migration schedule necessitated some time-saving measures, and ETDs and faculty publications were moved with only submitter-assigned keywords. After seeing the reduction of controlled subject access points, FSU’s Digital Library Center developed a Python script using direct matches between the submitted keywords and subject headings in LC’s linked data service to add subject elements to MODS records. Safely in post-migration, FSU Libraries can retroactively and automatically provide controlled subject access and linked data URIs to IR materials and integrate the script into the submission workflow for improved access to future materials.
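The direct-matching idea described in this abstract can be sketched roughly as follows. This is not FSU's script: instead of querying LC's linked data service, it assumes a small in-memory table of heading labels and URIs, and the labels, the example.org URIs, and the function names are all placeholders invented for illustration.

```python
def normalize(term):
    """Lowercase a term and collapse whitespace so trivial differences still match."""
    return " ".join(term.casefold().split())


def match_keywords(keywords, headings):
    """Split keywords into exact heading matches and leftovers.

    `headings` maps an authorized label to its URI (hypothetical data here).
    Returns (matched, unmatched), where matched holds (label, uri) tuples.
    """
    index = {normalize(label): (label, uri) for label, uri in headings.items()}
    matched, unmatched = [], []
    for keyword in keywords:
        hit = index.get(normalize(keyword))
        if hit is not None:
            matched.append(hit)
        else:
            unmatched.append(keyword)
    return matched, unmatched


# Placeholder headings and URIs, not real authority records.
sample_headings = {
    "Oceanography": "https://example.org/subjects/1",
    "Marine biology": "https://example.org/subjects/2",
}
sample_keywords = ["oceanography", "Marine  Biology", "sea stuff"]

matched, unmatched = match_keywords(sample_keywords, sample_headings)
```

Matched pairs could then be written out as subject elements carrying the URI, while unmatched keywords stay as uncontrolled terms for later review.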

2:35-3:05 Automating Cataloging Workflows with OCLC and Alma APIs Erin Grant, Alex Cooper
This session will detail how a cataloging manager and an application analyst worked together to develop two programs that automate and streamline cataloging workflows. One program utilizes Alma Analytics and SRU APIs and the WorldCat Metadata API to automatically remove OCLC holdings daily for withdrawn and deleted monographs. The second program prevents Promptcat records from overlaying previously cataloged records by extracting the OCLC numbers from the incoming file and searching for full records in Alma using the SRU API, and then splitting the file into two segments for use in different cataloging workflows. We will cover rationale, methodology, and technical specifications for both programs.
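The file-splitting step in the second program might look something like the sketch below. It is only an assumption about the general shape: records are modeled as dictionaries with an `oclc_number` key, and the hypothetical `has_full_record` callable stands in for the SRU search against Alma, which is not shown here.

```python
def split_by_existing(records, has_full_record):
    """Partition incoming records by whether a full record already exists.

    `has_full_record` is a hypothetical callable that takes an OCLC number
    and returns True when the local catalog already holds a full record.
    """
    already_cataloged, new_records = [], []
    for record in records:
        if has_full_record(record["oclc_number"]):
            already_cataloged.append(record)
        else:
            new_records.append(record)
    return already_cataloged, new_records


# Stand-in for the catalog lookup: a fixed set of made-up OCLC numbers.
known_numbers = {"1001", "1003"}
incoming = [
    {"oclc_number": "1001", "title": "First"},
    {"oclc_number": "1002", "title": "Second"},
    {"oclc_number": "1003", "title": "Third"},
]

existing, fresh = split_by_existing(incoming, lambda number: number in known_numbers)
```

Each of the two resulting lists could then be written to its own file for the corresponding cataloging workflow.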

Slides [PPTX]
3:05-3:20 Coffee and tea break
3:20-4:05 Searching for sound: implementing a discovery layer for music Kyle Shockey, Yamil Suarez, Mary Jinglewski, Ellie Collier
Music discovery at local, library, and web scales presents unique challenges that exist on the periphery of most academic institutions’ needs. Consequently, few use cases for music discovery exist in librarianship literature and those that do focus on large academic libraries with significant research collections. However, conservatory-style institutions make up a significant portion of academic music training; these institutions often have significant, sometimes unique music-specific holdings which students utilize in different ways than their research counterparts for different educational purposes. This panel brings together the stakeholders of an implementation of a popular bibliographic discovery layer at one such institution to discuss their process, successes, and failures.

Using the Music Library Association’s Music Discovery Requirements (MLA MDR) document as a starting point for music discovery best practices, the panelists will discuss the complexities of normalizing MARC music metadata for use in a discovery layer. In particular, the panelists will discuss challenges such as those presented by incorporating two sets of MARC metadata with different bibliographic and authority control workflows and by mapping the aforementioned MARC metadata to an XML serialization designed for text. In addition, the panelists will discuss the pitfalls of using best practices designed for Western art music at an institution that is primarily focused on other types of musical performance and production. Specifically, the panelists will discuss a collection with higher-than-normal instances of corporate body authorities and compilations, as well as special needs for content/carrier disambiguation and MARC subfield indexes.

4:05-4:35 Developing a Metadata Consultation Service Program Jacob Shelby, Sonoe Nakasone
The role of the metadata librarian/unit varies widely among institutions. It is often up to the metadata librarian/unit to define their role in the institution. Since there is so much variation, defining one’s role can be rather challenging.

In this talk we will discuss our efforts to develop a metadata consulting service program at NCSU Libraries. We begin by sharing how we reviewed our unit’s existing services, such as data management and data reconciliation. We then explore strategies for redefining services and our service model. Next, we detail how we have developed training for unit staff in order to prepare for new services. We will also describe our project management strategies. Finally, we share our endeavors to create a web presence and conduct outreach in order to promote our services.

Slides [PDF]
4:35-4:45 Closing remarks

Planning Team

  • Galen Charlton (gmcharlt AT gmail.com)
  • Erin Grant (erin.grant AT emory.edu)
  • Mary Jinglewski (mary.jinglewski AT gmail.com)
  • Erin Leach (eleach AT uga.edu)
  • Emily Williams (ewill220 AT kennesaw.edu)
  • Susan Wynne (swynne AT gsu.edu)
