Thursday, September 13
 

8:00am PDT

Registration and Coffee
Thursday September 13, 2018 8:00am - 8:45am PDT
Main Conference Room

8:45am PDT

Welcome and Opening Remarks
Thursday September 13, 2018 8:45am - 9:00am PDT
Main Conference Room

9:00am PDT

Introduction to Digital Forensics - Part 1
Speakers

Matthew Farrell

Duke University


Thursday September 13, 2018 9:00am - 10:30am PDT
Main Conference Room

9:00am PDT

Session 1 - Nobody Puts Access in the Corner
At the DLF Forum 2017, members of the DLF Born-Digital Access Group held a working lunch and discussed issues related to preparing and providing access to born-digital archival materials. Following that meeting, a subset of the group formed to create, as one of its main activities, a document proposing “Levels of Access to Born-Digital Archival Material.” Structured similarly to the NDSA “Levels of Preservation,” the document includes various access-related areas of work (Description, Distribution, Tools, Researcher support, Policy and Documentation, Security, and Accessibility), and proposes a tiered set of recommendations for decision making and actionable progress in each of these areas.

This work is already well underway, and by the time BUF-LA takes place in September, the group will have a complete initial draft of its Levels document. In this proposed Let’s-Do-This-A-Thon session, members of the DLF B-D Access group will share the group’s output to date with BUF-LA attendees. Following a brief presentation on the project’s background, we will break into small groups, with each group focusing on one or two access-related areas. Community feedback will be incorporated into a future iteration of the draft. The session’s goals are to solicit feedback from BUF-LA attendees while the project is still in a draft phase; include more voices from the archival community in the discussion; and clarify and improve the proposed Levels ahead of a wider release to the digital archives community. Facilitators will include Shira Peltzman, Brian Dietz, Elvia Arroyo-Ramírez, Kelly Bolding, and Jessica Venlet of the DLF B-D Access group.

Speakers

Elvia Arroyo-Ramírez

Assistant University Archivist, UC Irvine

Kelly Bolding

Project Archivist, Princeton University Library
Kelly Bolding is the Project Archivist for Americana Manuscript Collections at Princeton University Library, where she works with 18th and 19th century American history collections, as well as on developing workflows for processing born-digital and audiovisual materials. She is a...

Brian Dietz

NC State University

Shira Peltzman

Digital Archivist, UCLA Library
Shira is the Digital Archivist for UCLA Library Special Collections where she leads the development of a preservation program for born-digital archival material.

Jessica Venlet

Assistant University Archivist for Digital Records, UNC at Chapel Hill Libraries


Thursday September 13, 2018 9:00am - 12:00pm PDT
Presentation Room

10:30am PDT

Coffee Break
Thursday September 13, 2018 10:30am - 11:00am PDT
Hallway Area

11:00am PDT

Introduction to Digital Forensics - Part 2
Speakers

Matthew Farrell

Duke University


Thursday September 13, 2018 11:00am - 12:00pm PDT
Main Conference Room

12:00pm PDT

Lunch
Thursday September 13, 2018 12:00pm - 1:00pm PDT
Main Conference Room

1:00pm PDT

Session 0 - Scripts
Speakers

Dianne Dietrich

Cornell University


Thursday September 13, 2018 1:00pm - 2:00pm PDT
Presentation Room

1:00pm PDT

Introduction to Digital Forensics - Part 3
Speakers

Matthew Farrell

Duke University


Thursday September 13, 2018 1:00pm - 2:30pm PDT
Main Conference Room

1:00pm PDT

Session 3 - Journalism and Digital Preservation
Several incidents in recent years have demonstrated the necessity of digital preservation in the media: the deletion of the Gothamist, DNAinfo, and LAist sites, and the battle for ownership of Gawker’s content have made it clear that, although journalists and information professionals are in agreement that such things should be preserved, we have not yet shared knowledge or collaborated on how to make this happen. The purpose of this let’s-do-this-a-thon is to bring together digital preservation practitioners and journalists to brainstorm, collaborate, and explore the intersection of journalism and digital forensics. Specifically, we’d like to know: how can we work together to prevent “mass deletions”? What knowledge or skills can we share to facilitate the preservation of digital journalism? How can digital archiving become a part of journalists’ process?

Speakers

Laura Alagna

Digital Preservation Librarian, Northwestern University
Laura Alagna is the digital preservation librarian at Northwestern University Libraries, where she develops and implements policies and workflows for preserving born-digital and digitized content. Her research interests include repository interoperability, sustainability in digital...


Thursday September 13, 2018 1:00pm - 4:00pm PDT
Presentation Room

1:00pm PDT

Session 4 - PII and You
Thursday September 13, 2018 1:00pm - 4:00pm PDT
West Electronic Classroom

2:30pm PDT

Coffee Break
Thursday September 13, 2018 2:30pm - 3:00pm PDT
Hallway Area

3:00pm PDT

Introduction to Digital Forensics - Part 4
Speakers

Matthew Farrell

Duke University


Thursday September 13, 2018 3:00pm - 4:00pm PDT
Main Conference Room

4:00pm PDT

Reception
Thursday September 13, 2018 4:00pm - 6:00pm PDT
Powell Rotunda
 
Friday, September 14
 

8:00am PDT

Registration and Coffee
Friday September 14, 2018 8:00am - 8:45am PDT
Hallway Area

8:45am PDT

Opening Remarks
Friday September 14, 2018 8:45am - 9:00am PDT
Main Conference Room

9:45am PDT

Session 6 - Lightning Talks
How reliable are our forensic tools?
Working with born-digital files requires the use of various tools, and there is an expectation that the software will perform as advertised. The tools do not always work properly, but problems are usually apparent in the form of error messages or other clear indicators. Unfortunately, this is not always the case. In this lightning talk I will discuss my work using FTK Imager and IsoBuster to image optical discs containing project files and videos of lectures at the Getty. In the process of exporting files from the images, I discovered that FTK Imager and IsoBuster sometimes generate corrupt ISO files without producing error messages. In some cases the ISO files were completely unusable, while in other instances videos in the ISO images could be played but were missing portions from the original. In addition, I found that using FTK Imager’s file export feature on non-corrupt ISO images sometimes produced files with checksums different from those on the mounted image. I will discuss this process of discovery and how the Getty’s Institutional Records and Archives adjusted its workflow in response.
Lorain Wang, J. Paul Getty Trust
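
The checksum discrepancy described in this abstract can be checked for mechanically. The following is a minimal Python sketch, not the Getty's actual workflow: it assumes the image has been mounted at one directory and the exported files written to another with the same relative paths. The function names `sha256_of` and `find_mismatches` are hypothetical and used only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(mounted_root: Path, exported_root: Path) -> list[str]:
    """List relative paths whose exported copy is missing or has a different checksum.

    Assumption: exported files keep the same relative paths as on the mounted image.
    """
    mismatches = []
    for source in sorted(mounted_root.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(mounted_root)
        exported = exported_root / relative
        if not exported.is_file() or sha256_of(source) != sha256_of(exported):
            mismatches.append(relative.as_posix())
    return mismatches
```

An empty result means every file on the mounted image has a byte-identical exported copy. Any listed path is a candidate for re-export or closer inspection, even if the export tool reported no error.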

Managing the Environmental Impact of Digital Preservation
Environmental sustainability is an imperative that has engaged the cultural heritage community for many years. This has taken numerous forms, such as reducing environmental impacts from the built environment, disaster planning and adaptation in the face of climate change, and reevaluating purchasing decisions based on products’ environmental impacts. However, the drive toward environmental sustainability has not been thoroughly explored in relation to digital preservation activities. In this lightning talk, the speakers will present a summary of their research on the environmental impact of digital preservation, and argue for a shift in the way that digital preservation activities are evaluated. The authors propose that sustainable practice will come only from critical examination of the underlying motivations and assumptions of digital preservation practice, and not from improvements in technological efficiencies. The speakers will briefly explore the paradigm shift that is needed in three areas of digital preservation practice: appraisal, permanence, and availability.
Tim Walsh, Canadian Centre for Architecture
Laura Alagna, Northwestern University
Keith Pendergrass, Harvard Business School
Walker Sampson, University of Colorado Boulder

The Case of the QIC Data Cartridge Tapes
As an intern at the NASA-Caltech Jet Propulsion Laboratory (JPL), I worked on a digital repository focused on capturing and preserving Entry, Descent, and Landing (EDL) records. I was given two QIC data cartridge tapes as contributions to the repository, and my talk will outline the steps I took to try to recover the data on the cartridge tapes, which ultimately I was unable to do.
Sara Bond, UCLA GSEIS Information Studies

BitCurator.edu - Introduction and Overview
BitCurator.edu is a project funded by the Institute of Museum and Library Services (IMLS) to study and advance the adoption of digital forensics tools and methods in libraries and archives through professional education efforts. This project will address two primary research questions: (1) What are the primary institutional and technological factors that influence adoption of digital forensics tools and methods in library and information science (LIS) classes in different educational settings? (2) What are the most viable mechanisms for sustaining collaboration among LIS programs on the adoption of digital forensics tools and methods? This lightning talk will summarize the project rationale, scope, and projected deliverables.
Cal Lee, University of North Carolina at Chapel Hill School of Information and Library Science

BitCurator NLP
The BitCurator NLP project is developing software for collecting institutions to extract, analyze, and produce reports on features of interest in text identified in born-digital materials. The software uses existing natural language processing software libraries to identify and report on those items likely to be relevant to ongoing preservation, information organization, and access activities. These may include entities (e.g. persons, places, and organizations), potential relationships among entities, and topic models to provide insight into how concepts are naturally clustered within the documents. This presentation will focus on two software services. The first, BitCurator Access Webtools, allows users to create customized web-accessible views from groups of raw and forensically packaged disk images identified within collections. Selected disk images are automatically processed in a background service that identifies candidate file types (common document formats), extracts and indexes text identified in relevant files, and generates statistical reports for each group of images. A web interface allows users to browse the contents of file systems, examine text extracted from files, and view automatically tagged features including entities. The second, bitcurator-nlp-gentm, uses a similar text-extraction method to prepare candidate materials identified within disk images for topic modeling. Abstract topics generated from these materials can provide insight into term clustering and differences in term distribution within particular disk images versus the group, and can assist in identifying outliers or unrelated materials. The tool incorporates a widely used topic modeling technique (LDA), and leverages existing visualization platforms (including PyLDAvis) to support visualization. BitCurator NLP is supported by a grant from The Andrew W. Mellon Foundation.
Cal Lee, University of North Carolina at Chapel Hill School of Information and Library Science
Kam Woods, University of North Carolina at Chapel Hill School of Information and Library Science

Normalizing partition system analysis to understand disk images
The objective of disk imaging software is to make a faithful representation of original source media. The storage system parsing software that comprises archival workflows should equally make a faithful reproduction of all of the files on the disk image. However, default workflows for extracting files from disk media may not be designed to recognize all available file systems and may miss entire partitioning systems, potentially resulting in significant data omission. File system detection, a different problem from parsing, is a foundational problem in file extraction workflows that deserves further attention. There are numerous examples of archival material that would be adversely affected by workflows that rely on a single parse of a disk image, including hard drives of desktop computers (especially considering dual-boot computers or drives with recovery partitions), USB keys formatted for multiple operating systems, or older software installers (e.g. hybrid Mac/PC optical media). To assist with file system detection, we have released supporting tooling to bring independent partition system parsing perspectives to file extraction workflows. We released a Disktype output parser that generates DFXML to represent container layers more foundational than file systems. This work includes DFXML core language and library updates to describe the path to discovery of file systems. What we released can enable archival workflows to discover file systems in a mechanically parseable way. Overall, we hope this talk gives institutions more information to support decisions about retaining disk images acquired through archival processing.
Alex Nelson, National Institute of Standards and Technology
Dianne Dietrich, Cornell University
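
To illustrate why a single parse can miss partitions, here is a minimal Python sketch that lists the primary partition entries in a classic MBR boot sector. It is not the Disktype parser or the DFXML tooling described above. It assumes the first 512 bytes of the image have already been read, and it does not follow extended partitions or look for GPT, Apple Partition Map, or other partitioning systems, which is exactly the kind of gap the talk describes. The function name `parse_mbr` is hypothetical.

```python
import struct

MBR_SIZE = 512            # size of the boot sector examined here
TABLE_OFFSET = 446        # the partition table starts after the boot code
ENTRY_SIZE = 16           # four 16-byte entries follow
SIGNATURE = b"\x55\xaa"   # boot signature stored at bytes 510-511


def parse_mbr(sector: bytes) -> list[dict]:
    """Return the non-empty primary partition entries of an MBR boot sector.

    An empty list means the sector has no valid MBR signature or no entries in use.
    """
    if len(sector) < MBR_SIZE or sector[510:512] != SIGNATURE:
        return []
    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        # Fields: status byte, 3 CHS bytes (skipped), type byte, 3 CHS bytes (skipped),
        # starting LBA and sector count as little-endian 32-bit integers.
        status, type_code, start_lba, sector_count = struct.unpack("<B3xB3xII", entry)
        if type_code == 0:
            continue  # type 0x00 marks an unused entry
        partitions.append({
            "index": index,
            "bootable": status == 0x80,
            "type": type_code,
            "start_lba": start_lba,
            "sectors": sector_count,
        })
    return partitions
```

A workflow that extracts files only from the first entry returned here would silently skip a recovery partition or a second operating system's volume. A type code of 0xEE signals a protective MBR in front of a GPT, which needs a separate parser.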


Speakers

Laura Alagna

Digital Preservation Librarian, Northwestern University
Laura Alagna is the digital preservation librarian at Northwestern University Libraries, where she develops and implements policies and workflows for preserving born-digital and digitized content. Her research interests include repository interoperability, sustainability in digital...

Sara Bond

UCLA GSEIS Information Studies

Dianne Dietrich

Cornell University

Cal Lee

Professor, University of North Carolina
Christopher (Cal) Lee is Professor at the School of Information and Library Science at UNC, Chapel Hill. He teaches courses and workshops in archives and records management. He is a Fellow of SAA, and he serves as editor of American Archivist.

Alex Nelson

Computer Scientist, National Institute of Standards and Technology

Keith Pendergrass

Digital Archivist, Harvard Business School
Keith Pendergrass is the digital archivist for Baker Library Special Collections at Harvard Business School, where he develops and oversees born-digital content workflows. He is also the Library's representative on the HBS Green Team, a School-wide staff group coordinating grassroots...

Walker Sampson

Digital Archivist, CU Boulder
Walker Sampson is an assistant professor and digital archivist at the University of Colorado Boulder, where he leads the management of the Special Collections and Archives' born-digital accessions.

Tim Walsh

Canadian Centre for Architecture

Lorain Wang

Digital Archivist, Institutional Archives, J. Paul Getty Trust

Kam Woods

Research Scientist, University of North Carolina
Research Scientist @ UNC SILS. RATOM Technical Lead. @kamwoods. he/him/his


Friday September 14, 2018 9:45am - 10:30am PDT
Main Conference Room

10:30am PDT

Coffee Break
Friday September 14, 2018 10:30am - 11:00am PDT
Hallway Area

11:00am PDT

Session 7 - Data Journalism
This panel discussion follows the “Let’s-do-this-a-thon” session on the intersection of journalism and digital preservation from Day 1 of the BitCurator Users Forum. Leaders in this area will present and analyze challenges in preserving digital journalism, as well as describe current work in this area, including blockchain-based efforts, peer-to-peer networks, and other alternative methods of preservation. This panel will feature Maria Bustillos of Popula, Katherine Boss of New York University, and Ben Welsh of the Los Angeles Times.

Speakers

Katherine Boss

New York University

Maria Bustillos

Founder and editor in chief, Popula
I'm the founder and editor in chief of Popula.com, an online magazine publishing global perspectives with an alt-weekly sensibility. We are using ETH-based cryptoeconomics, *right now*, to archive our work directly to the Ethereum blockchain, protect journalism and speech rights...

Ben Welsh

Editor, Data Desk, Los Angeles Times


Friday September 14, 2018 11:00am - 12:00pm PDT
Main Conference Room

12:00pm PDT

Lunch
Friday September 14, 2018 12:00pm - 1:00pm PDT
Main Conference Room

1:00pm PDT

Session 8 - Work Work Work Workflows
Communities of Practice: Building a Foundation for Born-Digital Processing 
When I arrived at UNC Wilson Special Collections Library in 2016, newly hired in a digital archivist role, I found that roles for born-digital processing work were complex and distributed. This led me to start thinking about communities of practice and consistency for workflows and collections that were often a bit beyond my control. In order to work towards clarity, as well as boost my colleagues’ confidence and skills with born-digital work assigned to them, I have tried, and continue to try, a variety of training and in-reach techniques. In this presentation, I will share some of these experiences, including some successes, things that didn’t work so well, and new opportunities on the horizon.
Jessica Venlet, University of North Carolina at Chapel Hill Libraries

BitCurator: Beyond Environment
The NCSU Libraries has recently revised its digital archiving ingest and workflows. Until recently, we processed born digital materials using a hybrid Windows/BitCurator virtual machine environment. We now perform most of our work at the command line on a Mac. To support these new workflows, we have also updated our processing “wizard,” DAEV (Digital Assets of Enduring Value), which guides processors through workflows, produces preservation metadata, and integrates with ArchivesSpace. While we have moved away from the BitCurator environment for ingesting and processing data, the leap would not have been possible without the jumpstart provided through our experience using BitCurator. We also will continue to look to BitCurator for access and discovery tools and, more importantly, as a central connection point in a federation of digital archival practices. In this talk, we will present on our new working environment, including the rationale for the move; introduce DAEV, including its past and future, and how its built-in documentation supports staff in processing archival packages; and discuss how this all relates to the BitCurator community.
Brian Dietz, NCSU Libraries

Time to clean out our closet: challenges accessioning fugitive media
The American Folklife Center at the Library of Congress is one of the world’s largest ethnographic archives and, from its inception in 1972, has specialized in multiformat content. While AFC’s born-digital accessions have outnumbered analog ones for years, in previous decades the flash drives, optical disks, and hard drives accompanying analog collections were not migrated and ingested. In fact, non-standard and uncontrolled vocabularies made it impossible to estimate the extent of the problem. Over approximately 7 months, AFC’s digital assets specialist led a pilot study addressing this issue, first pulling and identifying the “fugitive media” spread across its 4 disparate physical locations. During the course of the pilot, over 5 TB from 8 prioritized optical and floppy disk collections were migrated and ingested into the digital repository. More importantly, AFC gained a much better accounting of the remainder of its fugitive media backlog and of the minimal baseline workflows, practices, and policies necessary to better address it. The digital assets specialist will share some of the outcomes and unexpected lessons learned from this process.
Julia Kim, American Folklife Center, Library of Congress


Speakers

Brian Dietz

NC State University

Julia Kim

American Folklife Center, Library of Congress

Jessica Venlet

Assistant University Archivist for Digital Records, UNC at Chapel Hill Libraries


Friday September 14, 2018 1:00pm - 2:00pm PDT
Main Conference Room

2:00pm PDT

Session 9 - Nuts and Bolts of Digital Archives
A practical approach to working with proprietary file formats
At The New York Public Library, archival collections increasingly contain proprietary file formats related to music and video editing, desktop publishing, and design and drafting software programs. This presentation will discuss approaches the Digital Archives Program at NYPL has been developing to systematically acquire, appraise, arrange and describe proprietary file formats, as well as prepare the files for preservation. Topics that will be touched on include: what information is needed about the software and corresponding file formats to appraise and describe files, identification of gaps in digital preservation resources and how to fill them, and strategies for knowledge sharing with other institutions.
Susan Malsbury, NYPL

A DIY Approach to Data Recovery from Damaged 5.25” Floppy Disks
At RAND’s Corporate Archive, work is ongoing to recover data from a collection of damaged 5.25” floppy disks. The fragility of the medium itself poses unique challenges during the capturing process, and the end result is often that files are captured with minor or significant corruption. A single disk is captured 3 times in order to control for the variations in the capture process, but further preservation tasks on these files are inhibited due to the quality of some captures. This presentation will lay out the hardware and software tools, along with the cleaning methods for disks and drives, that I have found result in the most successful captures, while still being an affordable in-house DIY workflow. The presentation will also describe the ongoing work to “merge” the successfully captured information within files together into a “modified master” copy.
David Tenenholtz, RAND Corporation
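
One way to picture merging three captures into a “modified master” is a byte-by-byte majority vote. The Python sketch below shows that general idea only. It is not necessarily the method used at RAND, and it assumes the captures are the same length and aligned byte for byte, which real floppy captures may not be. The function name `merge_captures` is hypothetical.

```python
from collections import Counter


def merge_captures(captures: list[bytes]) -> tuple[bytes, list[int]]:
    """Merge same-length captures byte by byte using a majority vote.

    Returns the merged bytes and the offsets where no value appeared in more
    than half of the captures, so those positions can be reviewed by hand.
    """
    if not captures:
        raise ValueError("at least one capture is required")
    length = len(captures[0])
    if any(len(capture) != length for capture in captures):
        raise ValueError("captures must all have the same length")
    merged = bytearray(length)
    unresolved = []
    for offset in range(length):
        votes = Counter(capture[offset] for capture in captures)
        value, count = votes.most_common(1)[0]
        merged[offset] = value  # on a tie this is just one of the tied values
        if count * 2 <= len(captures):
            unresolved.append(offset)
    return bytes(merged), unresolved
```

With three captures, any byte that two captures agree on is accepted, and offsets where all three disagree are reported rather than silently resolved.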

Computer Hardware for the Digital Archivist
Parallel ports, VGA cables, and floppy drives, oh my! This presentation will focus on the nuts and bolts of hardware needed to understand (mostly older) PC systems. Drawing from my own collection of orphaned computer parts and various contemporaneous hardware guides, I will outline the basic inner workings of a desktop computer and explain how all of the pieces fit together. I’ll also talk about how the core components of PC hardware have changed over time, and how to identify what you’re looking at. The aim of this presentation is to offer a starting point for understanding computer hardware, which can help in a variety of archival contexts: from tackling data recovery from older media (e.g., various internal hard drive types) to contextualizing emulation strategies for older software and file formats.
Dianne Dietrich, Cornell University

Speakers

Dianne Dietrich

Cornell University

Susan Malsbury

New York Public Library

David Tenenholtz

RAND Corporation


Friday September 14, 2018 2:00pm - 3:00pm PDT
Main Conference Room

3:00pm PDT

Coffee Break
Friday September 14, 2018 3:00pm - 3:30pm PDT
Main Conference Room

3:30pm PDT

Session 10 - Closing Discussion: Where Does Digital Forensics Fit in the Digital Curation Workflow?
This panel will discuss the substantial findings and documentation produced in the first year of the OSSArcFlow project, a collaborative effort of the Educopia Institute, the University of North Carolina at Chapel Hill School of Information and Library Science (UNC SILS), LYRASIS, and Artefactual, Inc. In this project, 12 archives and libraries of different sizes and sectors are investigating, synchronizing, and modeling a range of workflows to increase the capacity of libraries and archives to curate born-digital content. These archival workflows incorporate three leading open source software (OSS) platforms (BitCurator, Archivematica, and ArchivesSpace), but the project is designed such that all of these workflows are institutionally driven rather than functioning as part of a "one-size-fits-all" approach.

From digital dossiers to aspirational workflows, our 12 partners have thoroughly documented both their current practices and their visions for the future. In this panel, we will discuss some of the trends and distinctions we see thus far in the workflows of our partner institutions, including 1) where current standards, models, and “best practices” guides have fallen short in guiding their digital curation workflow development; 2) the impact of broader organizational conventions and policies upon local workflow instantiations; and 3) how junctures between software (BitCurator and Archivematica, or BitCurator and ArchivesSpace) are handled, and what specific pieces of metadata (like IDs) or artifacts are used to tie local systems together.

Speakers

Alex Chassanoff

Assistant Professor, NCCU
archives, digital preservation, cybernetics

Sam Meister

Educopia Institute


Friday September 14, 2018 3:30pm - 4:30pm PDT
Main Conference Room