Extracting process information from archival records

By Isto Huvila, 20 May, 2022

Date

Wednesday, August 24, 2022 - 08:00

Until

Friday, August 26, 2022 - 13:00

Presentation together with Ekta Vats, Zanna Friberg, Lisa Börjesson, Jessica Kaiser and Olle Sköld at the final conference of the international network Digitization and the Future of Archives: Digital archives, Big Data and Memory, in Copenhagen.

Abstract

Apart from the lack of information on what archival records are about (described using metadata), there is an increasing awareness that the lack of understanding of the contexts and processes of how records were created and how they have been manipulated (i.e. data about creation, curation and use processes, or paradata) poses a significant hindrance to their effective management, preservation, findability and use. Typically, however, the records themselves contain a lot of information that qualifies as paradata. The problem is that it is dispersed throughout the material and can be difficult to find and use. Moreover, paradata can be found in text, images (incl. photographs and drawings) and tabular data in the records. This presentation reports findings from a pilot project that investigates how AI-based text and image analysis techniques can be used for mining paradata from archival records pertaining to archaeological excavations. The talk describes how the developed approach shows promise in extracting meaningful information on how records and their contents have been created and processed. Further, the presentation outlines key lessons learned during the development and implementation of the analysis workflow. The heterogeneity of records, and especially that of the expressions of paradata, causes problems for computational analysis, but considering that this heterogeneity also slows down manual processing of the data, the approach discussed in the project emerges as successful. The reported work is a part of the research project CApturing Paradata for documenTing data creation and Use for the REsearch of the future (CAPTURE), which has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 818210), and of InterPARES Trust AI, which is funded by a Canadian SSHRC grant. The work has also received funding from the Centre for Digital Humanities Uppsala (CDHU) pilot project scheme.

File attachments

HuvilaEtAl-IRFD2022-handout.pdf (1.99 MB)

