#webarchiving — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #webarchiving, aggregated by home.social.
-
National Library of Finland: Principles for Finnish Web Archive content selection published. “The National Library of Finland is responsible for the diverse and representative preservation of online material. To make this work more transparent, we produced a document entitled Content selection for the Finnish Web Archive, outlining the principles for content selection in thematic and continuous […]
https://rbfirehose.com/2026/05/13/national-library-of-finland-principles-for-finnish-web-archive-content-selection-published/ -
The web never stands still 🌐 ... and neither do the challenges of preserving it.
The #DPC is preparing for the return of its Web Archiving Special Interest Group (WA-SIG), bringing DPC Members together in a welcoming and transparent space where Members can exchange ideas, surface challenges, and learn from one another’s approaches.
The renewed WA-SIG gets together on 7 July.
Read more & join us 😊: https://www.dpconline.org/news/dpc-prepares-return-of-web-archiving-special-interest-group
#DigitalPreservation #Coalition #DPC #WebArchiving #Archives
-
Tom’s Hardware: Internet archival sites struggling to preserve the internet because of skyrocketing hard drive prices due to the AI boom — Wayback Machine and Wikimedia punished by stratospheric storage pricing and stricter anti-scraping measures blocking the wrong bots. “The internet is getting harder to archive because the AI boom has caused a storage crisis, with both NAND and mechanical […]
https://rbfirehose.com/2026/05/09/toms-hardware-internet-archival-sites-struggling-to-preserve-the-internet-because-of-skyrocketing-hard-drive-prices-due-to-the-ai-boom-wayback-machine-and-wikimedia-punished-by-stratosphe/ -
The poster “The CiVers-Project: Bridging research texts and fine-grained research data”, presented at #CAA2026 in Vienna, is now available #openaccess.
👉 Check it out here: https://doi.org/10.34780/cifa5ycy
#CiVers #DigitalHumanities
#ResearchData #OpenScience #opensource #research #digitalpreservation #webarchiving -
We presented #CiVers at #CAA2026 in Vienna (Mar 31–Apr 4) with a poster! 🎉
🔍 By combining #WebArchiving, change detection, and #metadata extraction, CiVers enables reliable citation of versioned web pages using persistent identifiers (PIDs).
🥅 Our goal: make web-based #research resources citable, traceable, and reproducible.
Let’s connect! 🤲
Photo Credits: Lisa Steinmann.
-
A timely panel on why web archiving is a civic duty — researchers, archivists, activists and citizens discuss preserving our digital history. Practical tips, ethics, and why this matters for memory and democracy. Inspiring and actionable! #WebArchiving #DigitalPreservation #Archives #CivicDuty #OpenAccess #DigitalRights #InternetHistory #Archiving #English
https://video.rhizome.org/videos/watch/77f762bd-1322-4c86-b8af-52428909f993 -
New-to-me, from Library of Congress: Preserving U.S. Indigenous Government Websites: From Directory to Digital Archive. “As a 2025 Junior Fellow, Maggie Jones helped build the United States Indigenous Government Websites Web Archive with the guidance of her mentor, Giselle Aviles. In this interview, they describe how the collection developed from a list of over 500 tribes and what that process […]
https://rbfirehose.com/2026/03/07/preserving-u-s-indigenous-government-websites-from-directory-to-digital-archive-library-of-congress/ -
What website archiving services do people use [multiple choice]?
#WaybackMachine #Megalodon #GhostArchive #ArchiveToday #WebArchiving #Archivism #Poll
-
Arndt, Tracy; Arndt, Natanael: How to describe the past Web? A data model for web archiving. SWIB25 - Semantic Web in Libraries, ZBW - Leibniz-Informationszentrum Wirtschaft et al., 2025. https://doi.org/10.5446/72405
-
Popular Science: The Internet Archive records its 1 trillionth website. “The Internet Archive—one of cyberspace’s most essential library projects—has achieved a feat that’s hard to even conceptualize. After nearly 30 years of painstaking work, the nonprofit has preserved its trillionth webpage.”
https://rbfirehose.com/2026/02/23/popular-science-the-internet-archive-records-its-1-trillionth-website/ -
Ars Technica: Wikipedia blacklists Archive.today, starts removing 695,000 archive links. “In the course of discussing whether Archive.today should be deprecated because of the DDoS, Wikipedia editors discovered that the archive site altered snapshots of webpages to insert the name of the blogger who was targeted by the DDoS. The alterations were apparently fueled by a grudge against the blogger […]
https://rbfirehose.com/2026/02/21/ars-technica-wikipedia-blacklists-archive-today-starts-removing-695000-archive-links/ -
#WaybackMachine Director Pushes Back on AI Scraping Fears Driving Archive Blocks
https://blog.archive.org/2026/02/18/wayback-machine-director-pushes-back
As reported by Nieman Lab last month, some major media organizations—including The #NewYorkTimes, #TheGuardian, and #Reddit—have started blocking the Wayback Machine from archiving their sites over unfounded concerns about AI scraping.
Mike Masnick in #Techdirt explained why this is “a mistake we’re going to regret for generations.”
Limiting #webarchiving threatens our shared #digitalhistory. -
Library of Congress: From Print Volumes to Digital Scholarship: The Handbook of Latin American Studies Web Archive. “Since the 1930s, the Handbook of Latin American Studies has documented scholarship on Latin America and the Caribbean. In this interview, Tracy North describes how that long-standing mission now extends to web archiving, ensuring long-term access to web-based research materials. […]
https://rbfirehose.com/2026/02/09/from-print-volumes-to-digital-scholarship-the-handbook-of-latin-american-studies-web-archive-library-of-congress/ -
Alex Chan: Hard problems in social media archiving. “Institutional archiving has different constraints to individual collections – institutions serve a much wider audience, so their decisions need consistency and boundaries. My own scrapbook is tiny and personal, and comparing it alongside institutional efforts really highlights the differences and difficulties. It’s why I usually call it a […]
https://rbfirehose.com/2025/12/15/alex-chan-hard-problems-in-social-media-archiving/
-
Library of Congress Blogs: Where Science Meets Storytelling: Twelve Years of the Science Blogs Web Archive. “More than a decade after its launch, the Science Blogs Web Archive continues to grow and evolve. In this interview, Jennifer ‘JJ’ Harbster reflects on building and maintaining the collection, while intern Yahir Brito brings a fresh perspective on updating and expanding it. Together, […]
-
Common Crawl - Setting the Record Straight: Common Crawl’s Commitment to Transparency, Fair Use, and the Public Good commoncrawl.org/blog/setting-t… #AI #CommonCrawl #data #WebArchiving (wow, that Atlantic piece was bad, needing this rebuttal)
-
Web Archive Archive.today: FBI Subpoenas Web Registrar Tucows to Unmask Operators
#ArchiveToday #FBI #Subpoena #Tucows #OnlinePrivacy #Anonymity #WebArchiving #Surveillance #WarrantCanary #DigitalRights #Cybersecurity #Investigation #Privacy #InternetFreedom #Web
-
📣 New blog post! 📝
On October 14, we hosted our first #CiVers workshop at the @dai_weltweit in Berlin 🏛️ This was a great opportunity to exchange ideas on citing versioned web resources and managing research data in #archaeology and the #humanities.
Read more about what we discussed 👇
🔗 https://www.dainst.org/blogs/noslug/253
#Metadata #Research #OpenScience #DigitalPreservation #WebArchiving #DigitalHumanities
-
‼️ ATTENTION ‼️
⏰ Proposals for #iipcWAC26, "Sustainable #WebArchiving," in Brussels are due in 1 WEEK (15 OCT): http://netpreserve.org/ga2026/CfP
First-time submitters encouraged! Need inspiration?
https://www.youtube.com/@iipc8855/featured
#WebArchives #WebArchiveWednesday #DigitalPreservation #DigitalHumanities
@webarchives -
#WAAM – #WebArchiving Aix Marseille #AMU
https://pba.mmsh.fr/?p=35306
WAAM – Web Archiving Aix #Marseille is the name given to an archiving instance maintained by the CEntre de formation et de soutien aux DOnnées de la REcherche #CEDRE (the research data training and support centre) at the request of the #WebLab. The platform makes it possible to collect web pages or complete sites and keep an archived version of them (a file in .wacz format), and to share those archived versions online so that they can be “replayed”.
-
📣This #WebArchiveWednesday, plan your proposal for #iipcWAC26, “Sustainable #WebArchiving,” at KBR, Royal Library of Belgium! http://netpreserve.org/ga2026/CfP
🗓️ Deadline for proposals: OCT 15
#webarchives #DigitalPreservation #DigitalHumanities
@webarchives -
This #WebArchiveWednesday, check out these travel reports from attendees of #iipcWAC25: https://netpreserveblog.wordpress.com/tag/WAC2025/
Inspired? Start planning your proposal for #iipcWAC26, “Sustainable #WebArchiving,” at KBR, Royal Library of Belgium! http://netpreserve.org/ga2026/CfP
#webarchives | #DigitalPreservation | #DigitalHumanities | @webarchives
-
📚 First publication from CiVers!
🤔 How can we reliably cite resources of web-based research databases in archaeology and the humanities?
💡 In our new article, we present the CiVers approach: creating versioned, citable web resources using Persistent Identifiers (PIDs).
🧠 Read the full open-access paper here:
🔗 https://doi.org/10.34780/6k764r03
#CiVers #DigitalHumanities #WebArchiving #OpenScience #PID #DigitalPreservation
-
📣Start planning your proposal for #iipcWAC26, “Sustainable #WebArchiving,” today! 📣
🗓️ Proposals due October 15
🇧🇪 20-23 APR 2026 at KBR, Royal Library of Belgium
For more info: https://netpreserve.org/ga2026/cfp/
Need inspiration? Check out past presentations: https://www.youtube.com/@iipc8855/featured
#webarchives | #DigitalPreservation | #DigitalHumanities | @webarchives
-
📢 Hello Mastodon! 👋
We’re CiVers: Citation of Versioned Web Pages by Persistent Identifier
Web pages change. Links rot. Academic references break. We’re fixing that. 🛠️
💻 CiVers develops software and methodologies to make web content reliably citable with PIDs and versioning
🔗 DFG-funded @dfg_public project at the DAI Berlin @dai_weltweit with Heidelberg University Library @uniheidelberg, GBV @vzg_gbv and DataCite @datacite
#WebArchiving #OpenScience #DigitalHumanities #PID #DataCite #CiVers
-
What can hacked websites tell us about the history of political activism of the web? My latest article explores the political, cultural, and archival value of over 10,000 web defacements from attrition.org—now made available for research.
🔗 https://journalofdigitalhistory.org/en/article/hXsgcT9BZ5jP
-
[halshs-05113368] Neglect, Stammering, Focus: Processes of an #Archival Experience of #Archive Collections and #Audiovisual Projects at #AMU Posted on the Web Over the Past 30 Years
https://shs.hal.science/halshs-05113368v1
#webarchiving
#soundarchives
#resaw2025 -
Looking after your URLs: tikalinkextract eight years on
by @beet_keeper
We might not have a second life, but what if I told you there was a second internet? Not the deep web, but another web that we engage with nearly every day?
Think about it, that QR code you scanned for more information? That payment link you followed on your electricity bill? The website you’re told to visit at the end of a television ad?
The antipodes of the internet are these terminal endpoints, material and not necessarily material objects that represent the end of the freely navigable web — the QR code on a concert poster is the web printed onto the physical world. There is every chance it will be scanned and followed by someone from a mobile device, but it’s a transient object, something that will exist for a short amount of time, and then disappear into the palimpsest of the poster board or wall it was pasted on until it eventually disappears.
This is part of the materiality of the internet that has long fascinated me. Perhaps it comes from being a student of material culture, but if we look around, we see the Internet everywhere!
#Archives #digipres #DigitalArchiving #digitalContinuity #DigitalPreservation #httpreserve #Memento #outreach #RobustLinks #RobustWebLinks #WebArchives #webArchiving
-
Quicker, better, robuster,... this is ZIMit 2.0! Our scraper able to make an offline version of any Web site is only a few days away from its release! Stay tuned! https://github.com/openzim/zimit #webscraping #webarchiving #zim #offline #kiwix #warc
-
Wow! #TIL about #ArchiveBox, your #selfhosted #alternativeTo @internetarchive!
Runs on #Python (OS-packaged or #dockered) and saves either single pages or whole website crawls in every format you could wish for:
✅ self-contained single-page HTML
✅ PDF
✅ PNG screenshot
✅ plaintext
✅ DOM-dump
✅ priv./publ. #archive
✅ media audio/video included (+yt-dlp)
✅ #WARC compat.
🌐 https://archivebox.io
📜 https://github.com/ArchiveBox/ArchiveBox
▶ https://demo.archivebox.io -
We had an awesome time at #JCDL2023 this summer. Our proceedings are now online! Enjoy!
#RethinkingDigitalRecords
Proceedings: https://doi.org/10.1109/JCDL57899.2023
#DigitalLibraries #Proceedings #Research #DigitalPreservation #WebArchiving #InformationRetrieval #MachineLearning #Conferences
-
✍️ Blog post: Software prototype #ArtDocArchive (work in progress)
https://nullmuseum.hypotheses.org/602
Summary: Art Doc Archive will be a set of tools for digital #archive care, for all the #artdocumentation on websites and #socialmedia. How to save this archival material and analyze it for #digitalarthistory?
The project is accompanied by a blog titled 𝙍𝙚𝙘𝙡𝙖𝙞𝙢 𝙮𝙤𝙪𝙧 𝘼𝙧𝙘𝙝𝙞𝙫𝙚, open source software prototyping happens until end of February 2023
Follow our project blog at https://reclaim.hypotheses.org
#webarchiving #digitalhumanities #dataviz #Datavisualization #mirrors #semanticdata #structureddata, #namedentityrecognition, #dataviz
-
A very nice contribution to #StormCrawler improving the generation of #WARC files