home.social

#hyperloglog — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #hyperloglog, aggregated by home.social.

  1. RE: wisskomm.social/@ioer/11589933

    I really took a deep dive into #datashader with this map: Locals & Tourists in Germany, derived from 67 million geo-social media posts (2007-2022). The data includes publicly shared posts from Instagram, Flickr, Twitter and iNaturalist.

    I always wanted to create such a map, following in the footsteps of Eric Fischer's Locals & Tourists dataset from 2011 [1].

    I shared the code for producing this map here [2]. The repository is available here [3]. This includes some neat methods for various #geospatial processing tasks in #Python, such as exporting a datashader map to a #GeoTiff [4] with the help of #Xarray and #Rasterio.

    Finally, all of this was created in a privacy-preserving way using #HyperLogLog, which allowed me to share the code and abstracted data publicly for full reproducibility and transparency. [6] #FAIR

    Below you'll find the link to the (quite succinct) publication in Natur und Landschaft in Karten (#NuL) [5].

    [1]: flickr.com/photos/walkingsf/al
    [2]: code.ad.ioer.info/wip/digital_
    [3]: gitlab.hrz.tu-chemnitz.de/ad/d
    [4]: gitlab.hrz.tu-chemnitz.de/s739
    [5]: nul-online.de/article-7301410-
    [6]: doi.org/10.71830/VDMUWW

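  3. Redis's space-saving implementation of HyperLogLog

    HyperLogLog (HLL) is a data structure and algorithm that solves the count-distinct problem statistically: it does not aim for an exact answer, only an approximate count.

    The algorithm itself is not hard to follow. The original 2007 paper, "HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm", presents it like this:

    You can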

    blog.gslin.org/archives/2024/0

    #Computer #Murmuring #Software #algorithm #count #data #distinct #hyperloglog #problem #redis #structure
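
For readers unfamiliar with the count-distinct estimation mentioned in the posts above, here is a minimal Python sketch of the general HyperLogLog idea. It is an illustration under stated assumptions, not Redis's implementation and not a transcription of the paper's pseudocode: the class name `HyperLogLog`, the precision parameter `p`, the 64-bit SHA-1-derived hash, and the small-range (linear counting) fallback threshold are all choices made here for the example.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog sketch with m = 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        # Bias-correction constant for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    @staticmethod
    def _hash64(item):
        # Assumption: any well-mixed hash works; SHA-1 truncated to 64 bits is used here.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, item):
        h = self._hash64(item)
        width = 64 - self.p
        idx = h >> width                    # first p bits select a register
        rest = h & ((1 << width) - 1)       # remaining bits
        # Rank = 1-based position of the leftmost 1-bit in `rest`
        # (width + 1 if `rest` is all zeros).
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # The union of two sketches is the register-wise maximum.
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self):
        # Harmonic-mean estimate over all registers.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return estimate


if __name__ == "__main__":
    hll = HyperLogLog()
    for i in range(10_000):
        hll.add(f"user-{i}")
        hll.add(f"user-{i}")  # duplicates do not change the sketch
    print(round(hll.count()))
```

Only the register array is stored, never the items themselves, and two sketches can be combined with `merge` without access to the original data. With `p=12` the relative standard error is roughly 1.04 / sqrt(4096), or about 1.6%.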