#paradata — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #paradata, aggregated by home.social.
-
Porting SafeText and analyzing digital content with Apache Tika
by @beet_keeper
Last year I wrote about pitfalls in modern journalism, especially with regard to receiving documents and information from whistleblowers without offering them adequate protection.
The tl;dr is that you, as a whistleblower, need to protect yourself; and you, as an editor or journalist, need to protect your whistleblowers.
Steganographic fingerprints might be one method adopted to detect someone leaking information. They replace common textual characters with unusual but hard-to-detect variants: characters that look the same to the human eye, or that are invisible altogether. Using a tool called SafeText by David Jacobson we can identify these hidden fingerprints in the content that you share.
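As a rough sketch of what such a check involves (this is not SafeText's actual code, and the `HOMOGLYPHS` table is a tiny hypothetical sample rather than a real confusables list), the Python below flags invisible format characters and a few look-alikes, and can normalise them away:

```python
import unicodedata

# Assumption: a tiny illustrative table; a real tool ships a far larger one.
HOMOGLYPHS = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u00a0": " ",  # NO-BREAK SPACE
    "\u2009": " ",  # THIN SPACE
}


def find_suspicious(text):
    """Return (index, code point, Unicode name, reason) for each suspicious character."""
    hits = []
    for index, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf":
            # "Cf" is the Unicode format category: zero-width spaces, joiners, etc.
            reason = "invisible"
        elif ch in HOMOGLYPHS:
            reason = "look-alike for " + repr(HOMOGLYPHS[ch])
        else:
            continue
        hits.append((index, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"), reason))
    return hits


def clean(text):
    """Drop invisible format characters and map known look-alikes to plain ones."""
    kept = []
    for ch in text:
        if unicodedata.category(ch) == "Cf":
            continue
        kept.append(HOMOGLYPHS.get(ch, ch))
    return "".join(kept)
```

Note that stripping every format character is deliberately blunt: some scripts and emoji sequences use zero-width joiners legitimately, so a report of what was found is usually more useful than silent cleaning.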
I firmly believe we can find clues about what is important to preserve, or learn to preserve, when we analyse the content of the digital record and not just the (file) format of the digital record.
A file can contain many different features, and each of these is a challenge to its future interpretation, and thus its preservation.
I wanted to use SafeText in some of my other non-Python tooling and so I decided to port the code to Golang as a composable module and binary.
By coincidence, at the time I started writing this I had also just written about revisiting tikalinkextract, so I thought I would write this small explanation of how you might combine Tika and SafeText to perform some content analysis of your own.
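One way the combination could look, sketched in Python: ask a Tika server for the plain text of a document, then profile the unusual characters in it. This assumes a tika-server instance running locally on its default port 9998; the function names are hypothetical, and the character count is only a crude stand-in for a real SafeText pass.

```python
import collections
import unicodedata
import urllib.request

# Assumption: a local tika-server listening on its default port.
TIKA_URL = "http://localhost:9998/tika"


def build_tika_request(data, url=TIKA_URL):
    """Build a PUT request asking Tika for the plain-text rendering of a document."""
    return urllib.request.Request(
        url, data=data, method="PUT", headers={"Accept": "text/plain"}
    )


def extract_text(path, url=TIKA_URL):
    """Send a file to the Tika server and return the text it extracts."""
    with open(path, "rb") as handle:
        request = build_tika_request(handle.read(), url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


def non_ascii_profile(text):
    """Count every non-ASCII character in the text by its Unicode name."""
    counts = collections.Counter(
        unicodedata.name(ch, f"U+{ord(ch):04X}") for ch in text if ord(ch) > 127
    )
    return dict(counts)
```

Calling `non_ascii_profile(extract_text("report.docx"))` (with a file name of your own) would then give a quick overview of which unusual characters a document contains, and how often.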
Who knows, maybe we will find a conspiracy. Maybe we’ll find secret codes in our own digital records. Maybe we’ll learn something new about our records…
Let's have a look at putting Tika and SafeText together and see where it goes.
#ApacheTika #authenticity #Code #Coding #ContentAnalysis #Data #DigitalHumanities #digitalLiteracy #DigitalPreservation #Golang #integrity #Metadata #Paradata #SafeText #steganography -
New article w Olle Sköld, Dydimus Zengenene, Lisa Andersson ”What a standard makes out of a process? Data-documentation standards and their consequences to process documentation” out in JDOC https://doi.org/10.1108/JD-10-2025-0324 #OpenAccess #paradata #CAPTURE_ERC
-
I am really excited to be a part of the Disentangling the Intertwinement of Digitalisation and Decolonisation conference at the Royal Danish Academy of Sciences and Letters, org’d by Eleanor Q. Neil and Rubina Raja (Aarhus University), with a talk on ”Digital Dataset as an Archive” #archaeology #data #archives #paradata https://urbnet.au.dk/news/events/2025/disentangling -
⏰ Reminder: Tomorrow!
ENLIGHT & Arqus Alliance OS Webinar
Topic: Data makers’ and users’ views on useful paradata
🗓️ Mon, Sept 29 | 10:00–11:00 CET
💻 Online | 🎙️ Prof. dr. Isto Huvila (Uppsala University)
Don’t miss insights on what info about data creation, curation & use (paradata) makes data reusable!
🔗 Register: https://us05web.zoom.us/meeting/register/jKSxX6mJRvGpEaYmTqqKKg#/registration
ℹ️ More info: https://enlight-eu.org/landing-research-and-innovation/open-science/1040-enlight-rise-and-arqus-alliance-ambassador-webinar-series-on-open-science
#OpenScience #Paradata #ResearchData -
📢 Upcoming ENLIGHT & Arqus Alliance OS Webinar!
Topic: Data makers’ and users’ views on useful paradata
🗓️ Monday, Sept 29, 10:00–11:00 CET
💻 Online
🎙️ Prof. dr. Isto Huvila (Uppsala University)
Paradata = the metadata about how data is created, curated, manipulated, and used — crucial for reusability.
🔗 Register: https://us05web.zoom.us/meeting/register/jKSxX6mJRvGpEaYmTqqKKg#/registration
ℹ️ More info: https://enlight-eu.org/landing-research-and-innovation/open-science/1040-enlight-rise-and-arqus-alliance-ambassador-webinar-series-on-open-science
📺 Previous webinars: https://www.youtube.com/playlist?list=PLnfetl7rb1WIhBuY-OuOU6B_G8yquro55
#OpenScience #Paradata #ResearchData -
This week in Leiden at Lorentz Center working on #3D #paradata with @cpapadopoulos and a fantastic group of colleagues https://www.lorentzcenter.nl/paradata-in-3d-scholarship.html
-
Slides for today’s talk on Documenting the unruly AI: Capturing sociotechnical practices with paradata at #WORK2025 conf available at https://istohuvila.se/content/documenting-unruly-ai-capturing-sociotechnical-practices-paradata #CAPTURE_ERC #paradata #AI #AIResearch
-
Out now! A book length introduction to and comprehensive exploration of #paradata ”Paradata: Documenting Data Creation, Curation and Use” from #CAPTURE_ERC available #openaccess from Cambridge University Press https://www.cambridge.org/fi/universitypress/subjects/computer-science/computing-and-society/paradata-documenting-data-creation-curation-and-use with Zanna Friberg, Olle Sköld, Lisa Andersson & Ying-Hsang Liu
-
New article from #CAPTURE_ERC with Lisa Andersson and Olle Sköld : Researchers engage in #paradata generation both as integrated in their research work and as a discrete standalone activity, both with implications to generated paradata https://doi.org/10.1002/asi.70003 #openaccess #ERC_research
-
Difficult to Know If You Can Rely on Or Use Your Data? Look for Paradata to Understand Better https://informationmatters.org/2024/11/difficult-to-know-if-you-can-rely-on-or-use-your-data-look-for-paradata-to-understand-better/ #paradata #CAPTURE_ERC #RDM #opendata #data #researchdata #ercresearch
-
Recording of the Perspectives on Paradata book https://link.springer.com/book/10.1007/978-3-031-53946-6 launch available at https://www.uu.se/en/department/alm/research/research-projects/ongoing-projects/capture/capture-events/capture-talks #paradata #openaccess #ercresearch
-
#CfP EAA Rome, Aug 28-31 2024 Session Understanding the Research Process as a Chaîne Opératoire on the impact of digitization and born-digital data collection on data curation and reuse. We welcome papers on all aspects of the research process. Deadline February 8th, 2024. Submit your contribution (150-300 words) at https://submissions.e-a-a.org/eaa2024/ #Opendata #Paradata, #digitalmethods #FAIR #Dataarchiving For more information, contact [email protected] or [email protected]
-
Excited to be part of a super interesting session on Artificial data https://external.invajo.com/events/7b34da60-8f31-4282-a65c-768158fe708f/scheduling/caff45ea-afdf-4029-ac3d-6b2263e41e18/dates/caeb0ee7-825d-4a35-9d10-14ac3367c882/scheduling-overview?session=7d6a0a84-80da-405d-8353-58dd41de77a4 tomorrow at #NordicSTS http://www.nordicsts.se with paper "Following the footsteps of synthetic data: documenting epistemological justifications in paradata"
https://www.istohuvila.se/content/following-footsteps-synthetic-data-documenting-epistemological-justifications-paradata #paradata #syntheticdata #CAPTURE_ERC -
#Cfp Abstract deadline for the CAPTURE_ERC and ASIS&T European Chapter conf on Information Science Perspectives to Documenting Processes and Practices extended to April 25. Read more and submit at https://abm.uu.se/research/Ongoing+Research+Projects/capture/events/conference--information-science-perspectives-to-documenting-processes-and-practices/
#paradata #asist -
A new piece on #paradata and #AI: Cameron, S., Franks, P., Huvila, I., & Mooradian, N. (in print). Navigating Accountability: The Role of Paradata in AI Documentation and Governance. Journal of Documentation. https://doi.org/10.1108/JD-01-2025-0009 #interparestrustai #interpares #CAPTURE_ERC
-
Talking about #paradata "Approaching prefigurative heritage data practices" at NordicTAG Conference in Turku/Åbo in the session on No more lost futures; Postcapitalism and Cultural Heritage #NordicTAG Slides at https://istohuvila.se/content/approaching-prefigurative-heritage-data-practices #CAPTURE_ERC
-
#CAPTURE_ERC work on the nexus of #paradata and #dataliteracy presented at #CoLIS2025 conference in Glasgow. The paper is available #OA at https://doi.org/10.47989/ir30CoLIS52324
-
Super exciting discussions on paradata in the context of highly multimodal multidisciplinary research on rhythm, time and motion with researchers at RITMO @ University of Oslo https://www.uio.no/ritmo/english/news-and-events/events/workshops/2025/paradata/index.html Recording of my talk available at https://youtu.be/eRA9biUnPWU #paradata #CAPTURE_ERC
-
"Vital to continue our quest to understand datasets better in their own right, incl. flaws but also social / historical aspects of which they are formed and the extent to which they form us as research communities" Also #paradata mentioned by Chris Green (Universität Kiel) at the SweDigArch conf, Stockholm as a facilitating factor to make big data work in #archaeology
-
Today discussing #paradata being even more important with #syntheticdata but also useful for problematizing what is data, good data, how synthetic is synthetic etc. at a wonderful liquid Tema Datalab on what is good (synthetic data) @ Linköping University https://liu.se/event/what-is-good-data-
-
I have lived two years without coming across the term "paradata" (discussed in https://doi.org/10.1515/opis-2022-0129), until Semantic Scholar suggested https://doi.org/10.1007/978-3-031-53946-6_8 to me. (Oh, that is a chapter in a book edited by the person introducing paradata.)
My initial reaction was "sounds like a rebrand of 'provenance metadata'", but I should probably read on.
-
Today some more #paradata at #EuroMed2024 Workshop 1: Paradata, Metadata, and Data in 3D Digital Documentation for Cultural Heritage: #DigitalTwins or #MemoryTwins https://euromed2024.eu/workshops/workshop-1/ with a paper "Generating paradata by asking questions or telling stories" from #CAPTURE_ERC
-
Two #CAPTURE_ERC papers on Nov 7 at Finnish Information Studies Days: Paradata literacy and the challenges of research data management by Isto Huvila, Jessica Kaiser, Olle Sköld & Lisa Andersson (https://journal.fi/inf/article/view/148594) and The Data Creation Practices of Archaeologists in the Field by Michael Olsson (https://journal.fi/inf/article/view/148607), orgd at Åbo Akademi University, Programme: http://www.informaatiotutkimus.fi/?Informaatiotutkimuksen_p%E4iv%E4t_2024 #paradata
-
An online book launch of "Perspectives on Paradata: Research and Practice of Documenting Process Knowledge" on Nov 13 at 1-2 pm CET. Register free at https://uu-se.zoom.us/meeting/register/u5Upf-CurzwiGNKtITesGFwPPntLDP8r3rVM More info https://www.uu.se/en/department/alm/research/research-projects/ongoing-projects/capture/capture-events/capture-talks Download the book at https://doi.org/10.1007/978-3-031-53946-6
#paradata #openaccess -
Talked about paradata and transmission of practices at the Swedish STS conference https://liu.se/forskning/svenska-sts-konferensen-2024 in Norrköping earlier today. A super-interesting conference with loads of exciting presentations and colleagues. My slides can be found at https://istohuvila.se/content/scientific-and-scholarly-practice-and-its-documentation-transmission #paradata
-
Perspectives on Paradata https://link.springer.com/book/10.1007/978-3-031-53946-6 has a magnificent line-up of chapter authors incl. Richel Bilderbeek, Sarah Buchanan, Jenny Bunn, Megan Cohen, Ian Dawson, Wout Dillen, Lena Enqvist, Pekka Henttonen, @jameshodges, Theresa Huntsman, Kevin Matthew Jones, Jardi A. M., Saara Packalén, @cpapadopoulos, Alexandria Rayburn, @PaulReilly, Simone Reuss, Patrick Oliver Schenk, M. Scott Sotebeer, Michael Stiber & Andrea Thomer #openaccess #OA #paradata #CAPTURE_ERC
-
Perspectives on Paradata: Research and Practice of Documenting Process Knowledge is out https://link.springer.com/book/10.1007/978-3-031-53946-6 #openaccess #OA #CAPTURE_ERC with chapters offering interdisciplinary perspectives on #paradata, edited with Olle Sköld & Lisa Andersson
-
Soon the Session on Understanding the Research process as a Chaîne Opératoire @ EAA 2024 in Rome and online https://www.e-a-a.org/EAA2024 #paradata #chainedoperatoire #archaeology #CAPTURE_ERC
-
New research on #paradata w Olle Sköld & Lisa Andersson incl. findings: data-makers and data reusers have different preferences regarding paradata; usefulness of paradata for data reuse; and five clusters of paradata types associated to data practices were identified in Huvila, I., Andersson, L., & Sköld, O. (2024). Patterns in paradata preferences among the makers and reusers of archaeological data. Data and Information Management. doi.org/10.1016/j.dim.2024.100077 #CAPTURE_ERC #ERC_research
-
Slides for yesterday's talk at #ESOF2024 can be found at https://istohuvila.se/content/keynote-do-you-know-how-your-data-was-made-you-should @CAPTURE_ERC #ERC_Research #paradata
-
Thrilled to be part of Euroscience Open Forum with a keynote on CAPTURE research "Do you know how your data was made? You should." https://www.esof.eu/keynote-speakers and in a panel on ERC research #CAPTURE_ERC #ERC_Research #paradata www.uu.se/en/research/capture
-
Slides from today Huvila & Ekman ”Documentation of data making, processing and use facilitates future reuse of research data: the CAPTURE project” at https://istohuvila.se/content/documentation-data-making-processing-and-use-facilitates-future-reuse-research-data-capture #CAPTURE_ERC #paradata
-
Zanna Friberg and I are presenting a paper on management and archiving of #paradata at #CAA2023Ams in Amsterdam, in a session on archiving process documentation org'd by Jessica Kaiser and me #archaelogy #ERC_research https://2023.caaconference.org => Session 16 -
1-2 open positions for senior #researchers and/or #postdocs in CAPTURE ERC-COG project with background e.g. in information studies, science studies, research data management etc. https://www.jobb.uu.se/details/?positionId=637476 Deadline on Aug 14, 2023 #postdocposition #postdoctoralfellowship #job #hiring #paradata #archaeology
-