People – BitCurator NLP Project


BitCurator NLP Team

Christopher (Cal) Lee (PI), University of North Carolina at Chapel Hill (Additional Info)

Christopher (Cal) Lee is Associate Professor at the School of Information and Library Science at the University of North Carolina, Chapel Hill. He teaches archival administration; records management; digital curation; understanding information technology for managing digital collections; and digital forensics. He is a lead organizer and instructor for the DigCCurr Professional Institute, and he teaches professional workshops on the application of digital forensics methods and principles.

Cal’s primary area of research is curation of digital collections. He is particularly interested in the professionalization of this work and the diffusion of existing tools and methods into professional practice. Cal developed “A Framework for Contextual Information in Digital Collections,” and edited and contributed several chapters to I, Digital: Personal Collections in the Digital Era, published by the Society of American Archivists.

Cal is Principal Investigator of BitCurator Access and was Principal Investigator of BitCurator; both projects have developed and disseminated open-source digital forensics tools for use by archivists and librarians. He was also Principal Investigator of the Digital Acquisition Learning Laboratory (DALL) project and is Senior Personnel on the DataNet Federation Consortium funded by the National Science Foundation. Cal has served as Co-PI on several projects focused on digital curation education: Preserving Access to Our Digital Future: Building an International Digital Curation Curriculum (DigCCurr); DigCCurr II: Extending an International Digital Curation Curriculum to Doctoral Students and Practitioners; Educating Stewards of Public Information for the 21st Century (ESOPI-21); Educating Stewards of the Public Information Infrastructure (ESOPI2); and Closing the Digital Curation Gap (CDCG).

Kam Woods (Co-PI, Technical Lead), University of North Carolina at Chapel Hill @kamwoods (Additional Info)

Kam Woods is a Research Scientist in the School of Information and Library Science at the University of North Carolina at Chapel Hill. His research focuses on long-term preservation of born-digital materials.

Sunitha Misra (Software Developer), University of North Carolina at Chapel Hill @ortsalexip (Additional Info)

Sunitha Misra is a Software Developer for the BitCurator NLP project in the School of Information and Library Science at the University of North Carolina at Chapel Hill. She holds a Master's in Information Science from UNC SILS and an MS in Computer Science from the University of Alabama in Huntsville. Previously, she worked as a Software Developer for major networking and operating systems companies in the San Francisco Bay Area and in Research Triangle Park.

Jacob Hill (Project Manager), University of North Carolina at Chapel Hill (Additional Info)

Jacob Hill is the Project Manager for the BitCurator NLP project in the School of Information and Library Science at the University of North Carolina at Chapel Hill. He holds a BA in History from the University of Nevada, Reno and an MSIS from North Carolina Central University. His research interests include knowledge organization, digital humanities, Baha’i studies, and Arabic & Persian manuscripts.


Advisory Group

Mary Elings, University of California, Berkeley @maryelings (Additional Info)

Mary W. Elings is the Interim Head of Technical Services and Principal Archivist for Digital Collections at The Bancroft Library at the University of California, Berkeley. She leads the acquisitions, cataloging, and processing units and is responsible for all aspects of the digital collections, including digital initiatives and the born-digital archives program. Prior to coming to the Bancroft, Ms. Elings worked in museums focusing on art conservation, collection documentation, conservation imaging, information and asset management, and digitization initiatives. Her current work concentrates on issues surrounding born-digital materials, supporting digital humanities and digital social sciences, and research data management. Ms. Elings has taught as an adjunct professor in the School of Information Studies at Syracuse University, New York and the School of Library and Information Science, Catholic University, Washington, DC, and is a regular guest lecturer in the John F. Kennedy University Museum Studies program.

Mark Matienzo, Stanford University Libraries @anarchivist (Additional Info)

Mark A. Matienzo is the Collaboration & Interoperability Architect at the Stanford University Libraries, serving as a technologist, advocate, and facilitator for cross-institutional projects. Prior to joining Stanford, Mark worked as an archivist, technologist, and strategist specializing in born-digital materials and metadata management at institutions including the Digital Public Library of America, Yale University Library, The New York Public Library, and the American Institute of Physics. Mark received an MSI from the University of Michigan School of Information and a BA in Philosophy from the College of Wooster, and was a recipient of the Emerging Leader Award from the Society of American Archivists in 2012.

Don Mennerich, New York University @mennerich (Additional Info)

Don Mennerich joined NYU's Digital Library Technology Services (DLTS) in January 2014 as a Digital Archivist, working primarily with forensic tools and their relationship to managing born-digital archives. Prior to working at NYU, he held positions at The New York Public Library, Beinecke Rare Book and Manuscript Library, and Yale University Library. Don holds an MS in Information Systems from Pace University and an MLS from Simmons College. Don is a member of both DLTS and the Archival Collections Management unit.

Michael Piotrowski, Leibniz Institute of European History @true_mxp (Additional Info)

Michael Piotrowski is head of the Digital Humanities research group at the Leibniz Institute of European History (IEG) in Mainz, Germany. His main research interests are language technology for historical texts, document engineering, interactive editing and authoring aids (LingURed), and e-learning technology. He is the author of the first textbook on NLP for historical texts (published by Morgan & Claypool). At the IEG, he leads several projects, including the institute’s contributions to DARIAH-DE (funded by the German Federal Ministry of Education and Research [BMBF]).

Piotrowski received his doctoral degree in Computer Science in 2009 from Otto von Guericke University Magdeburg, Germany. His dissertation was titled Document-Oriented E-Learning Components. He also holds an M.A. in Computational Linguistics, English Philology, and Applied Linguistics from Friedrich Alexander University Erlangen-Nuremberg, Germany. His master’s thesis was titled NLP-Supported Full-Text Retrieval.

Daniel Pitti, University of Virginia (Additional Info)

Daniel Pitti is Associate Director of the Institute for Advanced Technology in the Humanities (IATH) at the University of Virginia. Pitti currently serves as the chair/président of the International Council on Archives Experts Group on Archival Description, charged with developing an archival description conceptual model called Records in Contexts (RiC). From 1993 to 2010, Pitti served as the chief technical architect of Encoded Archival Description (EAD), an international standard for encoding archival guides, and Encoded Archival Context-Corporate Bodies, Persons, and Families (EAC-CPF), an international standard for encoding archival descriptions of persons, organizations, and families. Pitti is project director of Social Networks and Archival Context (SNAC). At IATH, Pitti collaborates with faculty fellows and other scholars in humanities research projects that employ innovative methods based on computer and network technologies. Among the many humanities projects are the William Blake Archive; the Walt Whitman Archive; Leonardo’s Treatise on Painting; Arapesh Grammar and Digital Language Archive; and Collective Biographies of Women.

Josh Schneider, Stanford University (Additional Info)

Josh Schneider is Assistant University Archivist at Stanford University, where he acquires and supports researcher use of Stanford University records, faculty papers, and materials documenting campus and student life. His case study on appraisal of electronic records appeared in the latest volume of the Society of American Archivists’ Trends in Archival Practice series. Josh is also Community Manager for ePADD, an open-source software package that uses named entity recognition and other NLP-driven processes to support the appraisal, processing, discovery, and delivery of email archives. Josh serves on the editorial boards of The American Archivist, Journal of Western Archives, and the blog of SAA’s Electronic Records Section (BloggERS!). He received an MLIS from Simmons College and a BA in Philosophy from Brown University.

Ryan Shaw, University of North Carolina at Chapel Hill @rybesh (Additional Info)

Ryan Shaw received his Ph.D. in 2010 from the University of California, Berkeley School of Information, where he wrote his dissertation on how events and periods function as concepts for organizing historical knowledge. He is also the author of the LODE (Linking Open Descriptions of Events) ontology, recently adopted by the UK Archives Hub for their Linked Data effort. In 2012 he received a three-year Early Career Development grant from the Institute of Museum and Library Services to invent new tools for applying computational text processing techniques to organize large collections of civil rights histories. He is also a co-PI of the Editors’ Notes project, a Mellon Foundation-funded effort to develop open, collaborative notebooks for humanists, and the PeriodO project, an NEH-funded gazetteer of scholarly assertions about the extents of historical, art-historical, and archaeological periods. In the past he has been involved in a number of digital humanities projects through his work with the Electronic Cultural Atlas Initiative. In a previous life, he worked as a software engineer in Tokyo, Japan.

Stéfan Sinclair, McGill University @sgsinclair (Additional Info)

Stéfan Sinclair is an Associate Professor of Digital Humanities at McGill University. His primary area of research is the design, development, usage, and theorization of tools for the digital humanities, especially for text analysis and visualization. He has led or contributed significantly to projects such as Voyant Tools, the Text Analysis Portal for Research (TAPoR), the MONK Project, the Simulated Environment for Theatre, the Mandala Browser, and BonPatron. In addition to his work developing sophisticated scholarly tools, Sinclair has numerous publications related to research and teaching in the Digital Humanities, including Visual Interface Design for Digital Cultural Heritage, co-authored with Stan Ruecker and Milena Radzikowska (Ashgate 2011).

Other professional activities include serving as President of the Association for Computers and the Humanities (ACH), on executive committees of the Canadian Society for Digital Humanities / Société pour l’étude de médias interactifs (CSDH/SCHN), the Alliance of Digital Humanities Associations (ADHO) and centerNET, and as an editor of Digital Humanities Quarterly. Prior to moving to McGill University, Sinclair was Associate Professor in the Department of Communication Studies and Multimedia at McMaster University from 2004 to 2011, where he was also Director of the Sherman Centre for Digital Scholarship. Before joining McMaster University, he was at the University of Alberta, where he was co-responsible for the creation and development of the M.A. in Humanities Computing program from 2001 to 2004. His Ph.D. in French Literature is from Queen’s University (2000), his M.A. in French literature is from the University of Victoria (1995), and his honors B.A. in French is from the University of British Columbia (1994).

Carl Wilson, Open Preservation Foundation @carlscode (Additional Info)

Carl is the Technical Lead for the Open Preservation Foundation (OPF), overseeing all of OPF’s technical activities. He is an experienced software engineer with a focus on software quality through testing. He is an open source enthusiast, both as a user and developer. His professional interest is using virtualisation, automation and continuous delivery techniques to improve the software development process. Carl is leading the development team for veraPDF and is responsible for the software quality, website development and continuous integration / delivery. Before this he was responsible for OPF’s technical contribution to the SCAPE project. Prior to joining OPF, Carl worked for The British Library’s Digital Preservation Team on internal and external projects.