home.social

#httparchive — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #httparchive, aggregated by home.social.

  1. What are the longest HTTP header names and values? I dug into the HTTP Archive to find out: fastly.com/blog/the-lengthiest

  2. Congrats to our founder @MichaelLewittes, as well as @dwsmart, @jammer_volts, Mikael Araujo, and @tunetheweb for their work on the SEO chapter of the HTTP Archive's Web Almanac!

    You can read it here: almanac.httparchive.org/en/202

    #WebAlmanac #HTTPArchive #SEO

  3. The 2024 Web Almanac has been published:

    almanac.httparchive.org/en/202

    I have contributed to the chapters on accessibility and sustainability.

    #WebAlmanac #WebAlmanac24 #HTTPArchive

  4. 🔥 Microcks 1.8.0 is out in the wild! 🚀

    1st release since we joined the #cncf, and our theme is definitely #Open! Open to #community, open to new usages with #AI and #HttpArchive, open to new #shiftleft #devexp with @Testcontainers, open to you!

    👉 See microcks.io/blog/microcks-1.8.

  5. And some more:

    Content-Type: image / png
    Content-Type: image/$JPG
    Content-Type: image%2Fjpeg
    Content-Type: images/gif
    Content-Type: max-age=1555200
    Content-Type: plain/txt
    Content-Type: test/plain
    Content-Type: text/htmml
    Content-Type: text/javasciprt
    Content-Type: text/javascriipt
    Content-Type: text\html
    Content-Type: type
    Content-Type: TYPE/SUBTYPE
    Content-Type: UTC
    Content-Type: UTF-8
    Content-Type: width="1280" height="720"

  6. Interesting Content-Types I have noticed in the 2023-07 mobile HTTP Archive crawl:

    Content-Type: [*/*]
    Content-Type: */*
    Content-Type: #<Mime::NullType:0x0000000cf50828>
    Content-Type: <content-typeheader>
    Content-Type: <img/>
    Content-Type: $MIMETYPE
    Content-Type: 2
    Content-Type: AddType font/woff
    Content-Type: all/all
    Content-Type: application/jason
    Content-Type: application/jon
    Content-Type: Default
    Content-Type: FALSE
    Content-Type: IMAGE
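
The values in these two posts fail in two different ways: some are not shaped like `type/subtype` at all, while others are well-formed but name a type nobody registered (typos such as `text/htmml`). A minimal sketch of how that split could be checked follows. It assumes the RFC 9110 `token` grammar for the type and subtype; `KNOWN_TYPES` is a small hypothetical allowlist for illustration, not the IANA registry, and `classify` is an illustrative name rather than anything used by the HTTP Archive.

```python
import re

# RFC 9110 "token" characters, allowed in both the type and the subtype.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"

# type "/" subtype, optionally followed by ";"-separated parameters.
MEDIA_TYPE = re.compile(rf"^({TOKEN})/({TOKEN})\s*(;.*)?$")

# Hypothetical allowlist for illustration only; a real check would use
# a fuller list of registered media types.
KNOWN_TYPES = {
    "text/html", "text/plain", "text/javascript",
    "image/png", "image/jpeg", "image/gif",
    "application/json", "font/woff",
}


def classify(value: str) -> str:
    """Classify a Content-Type value as 'known', 'unrecognized' or 'malformed'."""
    match = MEDIA_TYPE.match(value.strip())
    if match is None:
        return "malformed"  # not shaped like type/subtype
    essence = f"{match.group(1)}/{match.group(2)}".lower()
    return "known" if essence in KNOWN_TYPES else "unrecognized"
```

Under these assumptions, `image / png`, `image%2Fjpeg` and `text\html` come out as malformed, while `text/htmml`, `image/$JPG` and `*/*` are syntactically valid but unrecognized, because `$` and `*` are legal token characters.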

  7. Has anyone recently configured the HTTP Archive dataset on GCP BigQuery? I'm struggling to do so based on this guide: github.com/HTTPArchive/httparc github.com/HTTPArchive/httparc

    If anyone can help, or run 1-2 queries for me, that would be fab! Thanks a ton!

    #httparchive

  8. Recently I've been building reports based on HTTP Archive data. Rather than call BigQuery, I export the data I'm interested in to Parquet format and then query it locally on my laptop using DuckDB. Here's how I did it: discuss.httparchive.org/t/quer

  9. And my favourite is:
    Last-Modified: Invalid Date
    ... which was seen on 119 responses from the 2023-04-01 mobile run.
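
A value like `Invalid Date` is what JavaScript produces when an invalid `Date` object is converted to a string, which is one plausible origin for this header. A minimal sketch of how such values could be flagged, assuming Python's standard `email.utils.parsedate_to_datetime` is an acceptable stand-in for an HTTP-date parser (it is more lenient than the strict HTTP-date grammar); `parse_http_date` is an illustrative helper name:

```python
from email.utils import parsedate_to_datetime


def parse_http_date(value: str):
    """Return a datetime for a parseable date header value, else None."""
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError for unparseable input,
        # newer ones raise ValueError; treat both as "not a date".
        return None
```

With this helper, `parse_http_date("Invalid Date")` returns `None`, while a value such as `Wed, 21 Oct 2015 07:28:00 GMT` parses to a timezone-aware datetime.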

  10. Is this common? I queried the HTTP Archive and found 121,058 resources in the 2022-12-01 dataset with double-weak ETags, that is, ETags that start with W/W/. That's 0.008% of all the resources. Good news: I found no triple-weak validators.
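
For reference, the entity-tag grammar in RFC 9110 allows at most one `W/` prefix before the quoted value, so `W/W/"..."` is not a valid ETag. A rough sketch of how the repeated prefixes could be counted, with `weak_prefix_count` and `is_valid_etag` as illustrative names and not the query used in the post:

```python
import re

# entity-tag = [ "W/" ] DQUOTE *etagc DQUOTE   (RFC 9110)
# etagc      = %x21 / %x23-7E / obs-text (%x80-FF)
ETAG = re.compile(r'^(W/)?"[\x21\x23-\x7E\x80-\xFF]*"$')

# One or more case-sensitive "W/" prefixes at the start of the value.
WEAK_PREFIXES = re.compile(r"^(?:W/)+")


def weak_prefix_count(etag: str) -> int:
    """Count how many consecutive W/ prefixes the value starts with."""
    match = WEAK_PREFIXES.match(etag)
    return len(match.group(0)) // 2 if match else 0


def is_valid_etag(etag: str) -> bool:
    """Check the value against the entity-tag grammar."""
    return ETAG.match(etag) is not None
```

A count of 2 corresponds to the double-weak case described above, and a count of 3 or more would be the triple-weak case that did not show up.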