#hyperloglog — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #hyperloglog, aggregated by home.social.
-
RE: https://wisskomm.social/@ioer/115899330915763542
I really took a deep dive into #datashader with this map: Locals & Tourists in Germany, derived from 67 million geo-social media posts (2007-2022). The data includes publicly shared posts from Instagram, Flickr, Twitter and iNaturalist.
I always wanted to create such a map, following in the footsteps of Eric Fischer's Locals & Tourists dataset from 2011 [1].
I shared the code for producing this map here [2]. The repository is available here [3]. It includes some neat methods for various #geospatial processing tasks in #Python, such as exporting a datashader map to a #GeoTiff [4] with the help of #Xarray and #Rasterio.
Finally, all of this was created in a privacy-preserving way using #HyperLogLog, which allowed me to share the code and abstracted data publicly for full reproducibility and transparency [6]. #FAIR
Below you'll find the link to the (quite succinct) publication in Natur und Landschaft in Karten (#NuL) [5].
[1]: https://www.flickr.com/photos/walkingsf/albums/72157624209158632
[2]: https://code.ad.ioer.info/wip/digital_traces_map/html/03_visualization.html
[3]: https://gitlab.hrz.tu-chemnitz.de/ad/digital_traces_map/
[4]: https://gitlab.hrz.tu-chemnitz.de/s7398234--tu-dresden.de/base_modules/-/blob/main/raster.py?ref_type=heads#L78
[5]: https://www.nul-online.de/article-7301410-1111/landschaft-und-natur-in-karten-.html
[6]: https://doi.org/10.71830/VDMUWW
-
#FOSS breaks down barriers and makes innovation more accessible to everyone, worldwide. Roberto Luna Rojas from #Valkey shares why #opensource matters to him.
Learn more about #vectors, #hyperloglog, #Redis, and how to improve your observability with key-value datastores: https://t.ly/ZnTNX
#Linux #observability #kubernetes #softwarelibre #freesoftware
-
Counting Millions of Things with Kilobytes
A Hands-On Quarkus Tutorial for Scalable Unique Counting with HyperLogLog
https://myfear.substack.com/p/quarkus-hyperloglog-unique-counting-java
#Java #Quarkus #GitHub #HyperLogLog
-
@bkastl Hm, feel you!
I do work in this field and have so far always been a big fan of the Ethikrat (the German Ethics Council).
Perhaps she should become an advocate of #HyperLogLog.
https://media.ccc.de/v/38c3-privacy-preserving-health-data-processing-is-possible
-
Completed the first assignment of #645 @CMUDB. HyperLogLog was an interesting data structure to learn about.
-
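Google's HyperLogLog++
This is more or less a follow-up to yesterday's post, "Redis's Space-Saving Implementation of HyperLogLog". The Redis HyperLogLog implementation mentions Google's paper "HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", which proposes HyperLogLog++ (HLL++).
The paper proposes three main improvements. The first is the use of a 64-bit hash function: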
5.1 Using a 64 Bit Hash Fu
https://blog.gslin.org/archives/2024/03/21/11709/google-%e7%9a%84-hyperloglog/
#Computer #Murmuring #Programming #algorithm #data #google #hll #hyperloglog #structure
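A rough illustration of why the hash width matters, as a sketch rather than anything taken from the paper or from Redis: when many distinct items are hashed into a 32-bit space, some of them collide and become indistinguishable to any sketch built on that hash. The function name `expected_distinct_hashes` and the one-billion-item workload are assumptions made for this example, and the formula is the usual approximation for uniform hashing.

```python
import math


def expected_distinct_hashes(n_items: int, hash_bits: int) -> float:
    """Approximate expected number of distinct hash values when n_items
    distinct inputs are hashed uniformly into 2**hash_bits values."""
    space = 2.0 ** hash_bits
    # space * (1 - exp(-n / space)); expm1 keeps precision when n / space is tiny.
    return -space * math.expm1(-n_items / space)


n = 1_000_000_000  # hypothetical workload: one billion distinct items
for bits in (32, 64):
    distinct = expected_distinct_hashes(n, bits)
    print(f"{bits}-bit hash: distinct hash values are about {distinct / n:.1%} of the items")
```

With a 32-bit hash a noticeable share of the items is lost to collisions at this scale, while with a 64-bit hash the loss is negligible.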
-
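Redis's Space-Saving Implementation of HyperLogLog
HyperLogLog (HLL) is a data structure and algorithm that solves the count-distinct problem statistically. It does not aim for an exact answer, only an approximate count.
The algorithm is actually not that hard to understand. In the original 2007 paper, "HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm", you can see that the algorithm looks like this:
You can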
#Computer #Murmuring #Software #algorithm #count #data #distinct #hyperloglog #problem #redis #structure
-
#HyperLogLog is super clever.
It can estimate the number of unique values in constant space (i.e. without storing the values), within a specified margin of error.
And HLLs can be merged to estimate the number of unique values across both sets! So you can quickly count something like "unique number of requests per day", and combine these into "per month" and "per year", without storing a year's worth of history.
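The merge property can be sketched in Python. This is a toy HyperLogLog and not any library's implementation: the class name `HyperLogLog`, the precision parameter `p`, the 64-bit hash derived from SHA-1, and the sample user IDs are all assumptions chosen for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog with 2**p registers (illustrative sketch only)."""

    def __init__(self, p: int = 12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value: str) -> None:
        # Assumed hash: the first 64 bits of SHA-1.
        h = int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:8], "big")
        index = h >> (64 - self.p)               # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)    # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other: "HyperLogLog") -> "HyperLogLog":
        # The union of two sketches is the register-wise maximum.
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)    # bias constant, valid for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


# Two hypothetical "days" with 10,000 users in common.
day1, day2 = HyperLogLog(), HyperLogLog()
for i in range(20_000):
    day1.add(f"user-{i}")
for i in range(10_000, 30_000):
    day2.add(f"user-{i}")

both_days = day1.merge(day2)
print(round(both_days.count()))  # an estimate of the 30,000 distinct users
```

Because merging only takes the maximum of each register pair, the merged sketch is identical to one built from all values directly, so daily sketches can be rolled up into monthly or yearly ones without keeping the raw values.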
-
Fantastic explanation of the #HyperLogLog algorithm: https://www.youtube.com/watch?v=lJYufx0bfpw. What's great about this video is that it uses very basic concepts, so that even non-programmers will understand it. On the other hand, CS terms like hash functions or sorted sets are mentioned in fine print, so the video doesn't sound childish.
-
#TIL about the #HyperLogLog algorithm and I think it's a damn brilliant way to estimate the number of unique elements in a potentially gargantuan set of items while running in only O(n) time and O(1) space. The fact that variants of the algorithm can run in parallel makes it even more awesome!
#algorithms #ComputerScience #SoME3 #mathematics #maths #statistics
-
Something a little different today on the channel: HyperLogLog!
It's one of my favorite algorithms, used to estimate the cardinality of a set. It's typically used in environments with very large datasets (spread across many servers in a cluster) where a true, accurate distinct count would be very expensive.
HLL uses a simple observation about coin flipping probabilities, and extends that to cardinality estimation. Really clever algo, and it provides a very fast and compact data structure with reasonably small errors (<2% across billions of unique elements, typically in just a few KB of memory).
https://www.youtube.com/watch?v=lJYufx0bfpw
#programming #algorithm #hyperloglog #cardinality #datastructures
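The coin-flipping observation can be illustrated with a small Python experiment. A hash that starts with k zero bits is as unlikely as flipping k heads in a row, so the longest run of leading zeros seen so far hints at roughly 2^k distinct items. The helper names, the SHA-1-based 64-bit hash, and the synthetic `item-N` values are assumptions made for this sketch.

```python
import hashlib


def hash64(value: str) -> int:
    # Assumed hash: the first 64 bits of SHA-1.
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:8], "big")


def leading_zeros(x: int, width: int = 64) -> int:
    # Number of zero bits before the first 1-bit in a width-bit value.
    return width - x.bit_length()


def longest_zero_run(n: int) -> int:
    # Longest run of leading zeros among the hashes of n distinct items.
    return max(leading_zeros(hash64(f"item-{i}")) for i in range(n))


for n in (100, 10_000, 100_000):
    k = longest_zero_run(n)
    print(f"n={n}: longest run of leading zeros = {k}, crude estimate 2**k = {2 ** k}")
```

A single run like this is a very noisy estimator, which is why HyperLogLog splits the input across many registers and combines their values instead of relying on one maximum.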