#webarchiving — Public Fediverse posts on home.social

Tommi 🤯 @[email protected] · 2026-05-28 · 11:03 UTC

I was the first person to archive a webpage from Internet Archive Europe on the Internet Archive’s Wayback Machine.

LoL

#InternetArchive #InternetArchiveEurope #WaybackMachine #archive #archiving #WebArchiving #WebPreservation #inception

internetarchive @[email protected] · 2026-05-27 · 23:14 UTC

“People aren’t sure what’s true, and what libraries are here for is to help with that.”

Brewster Kahle, digital librarian of the Internet Archive, discusses the future of the #WaybackMachine in ABC Radio National (🇦🇺 Australia)’s “Wayback Machine: The internet’s archive in peril,” a look at how media companies are restricting the preservation of the web itself.

🎧 Listen ⤵️
https://www.abc.net.au/listen/programs/sundayextra/wayback-machine/106604988

#InternetHistory #WebArchiving @abcaustraliarss @brewsterkahle

raffaele @[email protected] · 2026-05-27 · 05:48 UTC

"Common Crawl mirrors its monthly crawl archive to the Hugging Face Hub as a Storage Bucket. Alongside the raw pages, it now publishes the columnar URL index — one parquet row per crawled page (host, language, MIME type, fetch status, and a pointer to the page's bytes). That makes the whole crawl queryable without touching the petabytes of underlying WARCs."
https://huggingface.co/spaces/davanstrien/common-crawl-april-2026
#webarchiving

ResearchBuzz: Firehose @[email protected] · 2026-05-21 · 20:53 UTC

NiemanLab: More than 340 local news outlets are limiting the Internet Archive’s access to their journalism. “Our new analysis shows that more than 340 local news sites across the United States are now limiting the Internet Archive’s ability to access and preserve their stories. Many sites in our sample are owned by five of the seven largest local news publishers in the country: USA Today […]

https://rbfirehose.com/2026/05/21/niemanlab-more-than-340-local-news-outlets-are-limiting-the-internet-archives-access-to-their-journalism/

#archives #digitalimpermanence #endangeredarchives #internetarchives #journalism #media

raffaele @[email protected] · 2026-05-19 · 16:00 UTC

RE: https://fedihum.org/@aiucd/116600978409273225

#webarchiving

ResearchBuzz: Firehose @[email protected] · 2026-05-13 · 16:38 UTC

National Library of Finland: Principles for Finnish Web Archive content selection published. “The National Library of Finland is responsible for the diverse and representative preservation of online material. To make this work more transparent, we produced a document entitled Content selection for the Finnish Web Archive, outlining the principles for content selection in thematic and continuous […]

https://rbfirehose.com/2026/05/13/national-library-of-finland-principles-for-finnish-web-archive-content-selection-published/

#bestpractices #borndigitalarchives #borndigitalarchiving #contentcuration #culturalheritage #gallerieslibrariesarchivesmuseumsglam

Digital Preservation Coalition @[email protected] · 2026-05-12 · 12:18 UTC

The web never stands still 🌐 ... and neither do the challenges of preserving it.

The #DPC is preparing for the return of its Web Archiving Special Interest Group (WA-SIG), bringing DPC Members together in a welcoming and transparent space where Members can exchange ideas, surface challenges, and learn from one another’s approaches.

The renewed WA-SIG gets together on 7 July.

#DigitalPreservation #Coalition #DPC #WebArchiving #Archives

ResearchBuzz: Firehose @[email protected] · 2026-05-09 · 15:29 UTC

Tom’s Hardware: Internet archival sites struggling to preserve the internet because of skyrocketing hard drive prices due to the AI boom — Wayback Machine and Wikimedia punished by stratospheric storage pricing and stricter anti-scraping measures blocking the wrong bots. “The internet is getting harder to archive because the AI boom has caused a storage crisis, with both NAND and mechanical […]

https://rbfirehose.com/2026/05/09/toms-hardware-internet-archival-sites-struggling-to-preserve-the-internet-because-of-skyrocketing-hard-drive-prices-due-to-the-ai-boom-wayback-machine-and-wikimedia-punished-by-stratosphe/

#borndigital #borndigitalarchives #borndigitalarchiving #digitalimpermanence #endangeredarchives #harddrives
