Friday, September 14 • 9:45am - 10:30am
Session 6 - Lightning Talks

How reliable are our forensic tools?
Working with born-digital files requires the use of various tools, and there is an expectation that the software will perform as advertised. The tools do not always work properly, but problems usually surface as error messages or other clear indicators. Unfortunately, some failures are silent. In this lightning talk I will discuss my work using FTK Imager and IsoBuster to image optical discs containing project files and videos of lectures at the Getty. In the process of exporting files from the images, I discovered that FTK Imager and IsoBuster sometimes generate corrupt ISO files without producing error messages. In some cases the ISO files were completely unusable, while in others the videos in the ISO images could be played but were missing portions of the original. In addition, I found that using FTK Imager's file export feature on non-corrupt ISO images sometimes produced files whose checksums differed from those of the same files on the mounted image. I will discuss this process of discovery and how the Getty's Institutional Records and Archives adjusted its workflow in response.
Lorain Wang, J. Paul Getty Trust
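
The checksum comparison described in this abstract can be illustrated with a minimal sketch. This is not the Getty's workflow and does not call FTK Imager or IsoBuster; it assumes only that the ISO image has been mounted at one directory and that the exported files sit in another. The function names (`sha256_of`, `compare_trees`) and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(mounted_root, exported_root):
    """Check each file under mounted_root against its exported counterpart.

    Both directory names are hypothetical inputs: one is where the image
    is mounted, the other is where the exported files were written.
    Returns a dict mapping relative path -> "missing" or "mismatch".
    """
    mounted_root = Path(mounted_root)
    exported_root = Path(exported_root)
    problems = {}
    for source in sorted(mounted_root.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(mounted_root)
        target = exported_root / relative
        if not target.is_file():
            problems[relative.as_posix()] = "missing"
        elif sha256_of(source) != sha256_of(target):
            problems[relative.as_posix()] = "mismatch"
    return problems
```

An empty result means every file on the mounted image has an exported copy with an identical digest. The sketch does not report extra files that exist only in the export directory.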

Managing the Environmental Impact of Digital Preservation
Environmental sustainability is an imperative that has engaged the cultural heritage community for many years. This has taken numerous forms, such as reducing environmental impacts from the built environment, disaster planning and adaptation in the face of climate change, and reevaluating purchasing decisions based on products’ environmental impacts. However, the drive toward environmental sustainability has not been thoroughly explored in relation to digital preservation activities. In this lightning talk, the speakers will present a summary of their research on the environmental impact of digital preservation, and argue for a shift in the way that digital preservation activities are evaluated. The authors propose that sustainable practice will come only from critical examination of the underlying motivations and assumptions of digital preservation practice, and not from improvements in technological efficiencies. The speakers will briefly explore the paradigm shift that is needed in three areas of digital preservation practice: appraisal, permanence, and availability.
Tim Walsh, Canadian Centre for Architecture
Laura Alagna, Northwestern University
Keith Pendergrass, Harvard Business School
Walker Sampson, University of Colorado Boulder

The Case of the QIC Data Cartridge Tapes
As an intern at the NASA-Caltech Jet Propulsion Laboratory (JPL), I worked on a digital repository focused on capturing and preserving Entry, Descent, and Landing (EDL) records. I was given two QIC data cartridge tapes as contributions to the repository. My talk will outline the steps I took to try to recover the data on the tapes, which I was ultimately unable to do.
Sara Bond, UCLA GSEIS Information Studies

BitCurator.edu - Introduction and Overview
BitCurator.edu is a project funded by the Institute of Museum and Library Services (IMLS) to study and advance the adoption of digital forensics tools and methods in libraries and archives through professional education efforts. This project will address two primary research questions: What are the primary institutional and technological factors that influence adoption of digital forensics tools and methods in library and information science (LIS) classes in different educational settings? What are the most viable mechanisms for sustaining collaboration among LIS programs on the adoption of digital forensics tools and methods? This lightning talk will summarize the project rationale, scope, and projected deliverables.
Cal Lee, University of North Carolina at Chapel Hill School of Information and Library Science

BitCurator NLP
The BitCurator NLP project is developing software for collecting institutions to extract, analyze, and produce reports on features of interest in text identified in born-digital materials. The software uses existing natural language processing software libraries to identify and report on those items likely to be relevant to ongoing preservation, information organization, and access activities. These may include entities (e.g. persons, places, and organizations), potential relationships among entities, and topic models to provide insight into how concepts are naturally clustered within the documents. This presentation will focus on two software services. The first, BitCurator Access Webtools, allows users to create customized web-accessible views from groups of raw and forensically packaged disk images identified within collections. Selected disk images are automatically processed in a background service that identifies candidate file types (common document formats), extracts and indexes text identified in relevant files, and generates statistical reports for each group of images. A web interface allows users to browse the contents of file systems, examine text extracted from files, and view automatically tagged features including entities. The second, bitcurator-nlp-gentm, uses a similar text-extraction method to prepare candidate materials identified within disk images for topic modeling. Abstract topics generated from these materials can provide insight into term clustering and into differences in term distribution within particular disk images versus the group, and can assist in identifying outliers or unrelated materials. The tool incorporates a widely used topic modeling technique (LDA) and leverages existing visualization platforms (including PyLDAvis) to support visualization. BitCurator NLP is supported by a grant from The Andrew W. Mellon Foundation.
Cal Lee, University of North Carolina at Chapel Hill School of Information and Library Science
Kam Woods, University of North Carolina at Chapel Hill School of Information and Library Science

Normalizing partition system analysis to understand disk images
The objective of disk imaging software is to make a faithful representation of original source media. The storage system parsing software used in archival workflows should likewise faithfully reproduce all of the files on the disk image. However, default workflows for extracting files from disk media may not be designed to recognize all available file systems and may miss entire partitioning systems, potentially resulting in significant data omission. File system detection, a different problem from parsing, is a foundational problem in file extraction workflows that deserves further attention. There are numerous examples of archival material that would be adversely affected by workflows that rely on a single parse of a disk image, including hard drives of desktop computers (especially dual-boot computers or drives with recovery partitions), USB keys formatted for multiple operating systems, and older software installers (e.g. hybrid Mac/PC optical media). To assist with file system detection, we have released tooling that brings independent partition system parsing perspectives to file extraction workflows: a Disktype output parser that generates DFXML to represent container layers more foundational than file systems. This work includes DFXML core language and library updates to describe the path to discovery of file systems. These releases can enable archival workflows to discover file systems in a mechanically parseable way. Overall, we hope this talk gives institutions more information to support decisions about retaining disk images acquired through archival processing.
Alex Nelson, National Institute of Standards and Technology
Dianne Dietrich, Cornell University
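
To make the idea of a container layer beneath the file system concrete, here is a minimal sketch that lists the partitions recorded in a classic MBR partition table. It is not the authors' Disktype parser and produces no DFXML; it handles only the four primary MBR entries (no GPT, Apple Partition Map, or extended partitions) and assumes 512-byte sectors. The function name `read_mbr_partitions` is hypothetical.

```python
import struct

SECTOR_SIZE = 512    # assumption: 512-byte sectors
TABLE_OFFSET = 446   # the MBR partition table starts at byte 446
ENTRY_SIZE = 16      # each of the four primary entries is 16 bytes


def read_mbr_partitions(sector):
    """Return the non-empty primary partitions found in an MBR sector.

    `sector` is the first 512 bytes of a raw disk image. An empty list
    means no MBR signature or no entries; it does not prove the image
    holds no file systems, only that this one view found none.
    """
    if len(sector) < SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        return []
    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        # Layout: status, CHS start, type code, CHS end, first LBA, sector count
        _status, _chs_first, type_code, _chs_last, first_lba, count = struct.unpack(
            "<B3sB3sII", sector[start:start + ENTRY_SIZE]
        )
        if type_code == 0:
            continue  # unused slot
        partitions.append({
            "index": index,
            "type_code": type_code,
            "byte_offset": first_lba * SECTOR_SIZE,
            "byte_length": count * SECTOR_SIZE,
        })
    return partitions
```

A workflow could pass the first sector of an image (for example, `open(image_path, "rb").read(512)`, where `image_path` is a placeholder) and then attempt file system detection at each reported `byte_offset` instead of only at the start of the image.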


Speakers

Laura Alagna

Digital Preservation Librarian, Northwestern University
Laura Alagna is the digital preservation librarian at Northwestern University Libraries, where she develops and implements policies and workflows for preserving born-digital and digitized content. Her research interests include repository interoperability, sustainability in digital...

Sara Bond

UCLA GSEIS Information Studies

Dianne Dietrich

Cornell University

Cal Lee

Professor, University of North Carolina
Christopher (Cal) Lee is Professor at the School of Information and Library Science at UNC, Chapel Hill. He teaches courses and workshops in archives and records management. He is a Fellow of SAA, and he serves as editor of American Archivist.

Alex Nelson

Computer Scientist, National Institute of Standards and Technology

Keith Pendergrass

Digital Archivist, Harvard Business School
Keith Pendergrass is the digital archivist for Baker Library Special Collections at Harvard Business School, where he develops and oversees born-digital content workflows. He is also the Library's representative on the HBS Green Team, a School-wide staff group coordinating grassroots...

Walker Sampson

Digital Archivist, CU Boulder
Walker Sampson is an assistant professor and digital archivist at the University of Colorado Boulder, where he leads the management of the Special Collections and Archives' born-digital accessions.

Tim Walsh

Canadian Centre for Architecture

Lorain Wang

Digital Archivist, Institutional Archives, J. Paul Getty Trust

Kam Woods

Research Scientist, University of North Carolina
Research Scientist @ UNC SILS. RATOM Technical Lead. @kamwoods. he/him/his


Friday September 14, 2018 9:45am - 10:30am PDT
Main Conference Room
