#blosc2 - Public Fediverse posts
Live and recent posts from across the Fediverse tagged #blosc2, aggregated by home.social.
-
Our #PyDataGlobal 2025 tutorial recording on modern #Blosc2 & #Caterva2 features is out!
We show how compression is more than just a space saver, boosting performance for large in-memory & out-of-memory arrays via auto-chunking & parallelism.
We also cover:
* Serving Blosc2/#HDF5 data online with Caterva2
* Computing directly in the cloud (no downloads needed!)
Watch here: https://www.youtube.com/watch?v=tUvSI3EpTBQ&list=PLGVZCDnMOq0qmerwB1eITnr5AfYRGm0DF&index=80
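To make the chunking and parallelism idea concrete, here is a minimal sketch using only the Python standard library. It is not Blosc2's API or implementation: zlib stands in for Blosc2's codecs, and the 1 MiB chunk size and the helper names are assumptions chosen for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 20  # 1 MiB per chunk; an arbitrary choice for this sketch


def split_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut the buffer into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def compress_chunks(data: bytes, workers: int = 4) -> list[bytes]:
    """Compress every chunk independently, using a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # zlib releases the GIL while compressing, so threads can overlap.
        return list(pool.map(zlib.compress, split_chunks(data)))


def read_chunk(compressed: list[bytes], index: int) -> bytes:
    """Decompress a single chunk without touching the others."""
    return zlib.decompress(compressed[index])


data = bytes(range(256)) * (16 * 1024)  # 4 MiB of repetitive sample data
compressed = compress_chunks(data)
stored = sum(len(c) for c in compressed)
print(f"{len(data)} bytes stored in {stored} bytes across {len(compressed)} chunks")
```

Because each chunk is compressed on its own, a reader can decompress only the chunk it needs, and several chunks can be processed at the same time.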
-
Interact with your vast remote datasets right from your phone!
I've built a demo Jupyter notebook that connects to a Cat2Cloud server from an Android phone and slices into an 8 TB dataset, downloading a 1 MB chunk in under 100 milliseconds.
The 8 TB dataset is from the Gaia DR3 catalogue. As it turns out, there are ~1000 stars in a cube of 100 light-years in our vicinity; the space is mostly empty.
Try this out by visiting: https://cat2.cloud/demo/roots/@public/large/slice-gaia-3d.ipynb
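One way to see why a small slice of a huge array needs only a small download: in chunked storage, a slice overlaps just a few chunks. The sketch below works out which ones. It is a generic illustration, not Cat2Cloud's actual code; the function name, the chunk shape and the slice are hypothetical.

```python
from itertools import product


def chunks_for_slice(slices, chunk_shape):
    """Return the grid coordinates of every chunk a slice overlaps.

    `slices` holds one (start, stop) pair per dimension, stop exclusive.
    """
    ranges = []
    for (start, stop), chunk_len in zip(slices, chunk_shape):
        first = start // chunk_len       # chunk holding the first element
        last = (stop - 1) // chunk_len   # chunk holding the last element
        ranges.append(range(first, last + 1))
    return list(product(*ranges))


# Hypothetical 3D array split into 100 x 100 x 100 chunks.
chunk_shape = (100, 100, 100)
wanted = [(250, 260), (0, 50), (990, 1010)]
needed = chunks_for_slice(wanted, chunk_shape)
print(needed)  # [(2, 0, 9), (2, 0, 10)]
```

Only the chunks listed in `needed` have to be fetched and decompressed; the rest of the dataset stays on the server.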
-
IronPill 2
In the second of our series of short videos ("ironPills") showcasing ironArray's work, we see how Blosc2 can be used to power heavy-duty linear algebra (100 GB!) workflows:
* 1.5-2x faster than PyTorch + h5py!
* Automated chunking optimised for your machine's cache hierarchy
* Simple one-line syntax: blosc2.matmul(A, B, urlpath='….b2nd')
See blog here: https://ironarray.io/blog/la-blosc
-
IronPill 1
In the first of a series of short videos ("ironPills") showcasing ironArray's work, we see how Blosc2 can be used to calculate Fourier approximations:
* 5x faster than NumPy
* A fraction of the memory footprint
* Pythonic one-line syntax: sum(… * …(…) + … * …(…), axis=1)
See full notebook here: https://github.com/Blosc/python-blosc2/blob/main/examples/ndarray/ironpill1.ipynb
(inspired by this blog post: https://towardsdatascience.com/numexpr-the-faster-than-numpy-library-that-no-ones-heard-of/)
-
Announcing Python-Blosc2 3.8.0
A step closer to compliance with the array-api standard: data-apis.org/array-api!
This is an effort across all array-based libraries so that your code works (e.g. for both blosc2 and NumPy) by simply changing the import statement!
Highlights:
* C-Blosc2 updated to latest 2.21.2
* Incorporate isnan, isfinite, isinf
* Better indexing coverage
* linspace and arange functions more numerically stable
* Improved array-api compliance
-
#Blosc2 now runs directly in your browser! Leveraging the power of #WASM, #Pyodide, and #JupyterLite, you can harness efficient, adaptable compression through the web's universal interface. Experience the future of large-scale data processing without leaving your browser window.
Compress Better, Compute Bigger, Share Faster
-
We are pleased to announce the integration of a new stack feature in #Blosc2, which allows for stacking large arrays along a new axis.
Performance benchmarks show that while aligned chunks yield the best results, #Blosc2 with unaligned chunks can still outperform #NumPy, a welcome discovery!
Many thanks to Luke Shaw for his excellent work on this new functionality.
We've updated our recent blog post. Check it out! https://www.blosc.org/posts/blosc2-new-concatenate/#stacking-arrays
-
Excited to share more about Caterva2, your ultimate gateway to Blosc2/HDF5 repositories!
Caterva2 is designed to redefine how you interact with large datasets.
Want to see it in action? We've just released a new introductory video showcasing Caterva2's main functionalities!
https://ironarray.io/caterva2
#Caterva2 #Blosc2 #HDF5 #BigData #DataManagement #FreeSoftware #Python #DataScience #Tech
-
Now it's @FrancescAlted's turn to introduce the #Blosc2 #compression algorithm to reduce #HDF5 file size.
-
Big news! #Caterva2 enters advanced beta stage
Caterva2 is a FOSS distributed system written in Python meant for sharing Blosc2 datasets (either native or converted on-the-fly from HDF5) among different hosts.
It follows the pub-sub paradigm, so it can publish data once and allow multiple subscribers to access it, saving time and resources. It comes with a Python API and a Web interface for easy browsing.
Learn more at https://ironarray.io/caterva2
Make Compression Better
#blosc2 #ironarray
-
Happy after showing how real-time exploration of the Milky Way stars in an array cube of 7.3 TB can be done on a laptop with 8 GB RAM and 15 GB of free disk space, thanks to #ESAGaia data and the magic of #compression via #Blosc2. I also showed how #Btune helps immensely in determining the best combination of filters and codecs for achieving the best #performance or compression ratio (cratio).
Thanks to the attendees; it has been a great experience! #SciPy2023
Slides available at: https://www.blosc.org/docs/Exploring-MilkyWay-SciPy2023.pdf
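For a feel of the search space a tuning tool has to cover, the sketch below tries every combination by hand and ranks them by compression ratio (uncompressed size divided by compressed size). It is not Btune and it does not use Blosc2: the standard-library codecs and the small `byte_shuffle` helper are stand-ins, with the helper written here only in the spirit of a byte-shuffle filter.

```python
import bz2
import lzma
import struct
import zlib

# Standard-library codecs used as stand-ins for a real codec list.
CODECS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}


def byte_shuffle(data: bytes, typesize: int) -> bytes:
    """Group the first bytes of every item, then the second bytes, and so on."""
    return b"".join(data[i::typesize] for i in range(typesize))


def rank_combinations(data: bytes, typesize: int):
    """Return (ratio, codec, filter) tuples, best compression ratio first."""
    filters = {"none": data, "shuffle": byte_shuffle(data, typesize)}
    results = []
    for filter_name, filtered in filters.items():
        for codec_name, compress in CODECS.items():
            ratio = len(data) / len(compress(filtered))
            results.append((ratio, codec_name, filter_name))
    return sorted(results, reverse=True)


# Sample data: 100,000 consecutive 32-bit little-endian integers.
data = struct.pack("<100000i", *range(100000))
ranking = rank_combinations(data, typesize=4)
for ratio, codec_name, filter_name in ranking:
    print(f"{codec_name:5s} + {filter_name:7s} -> ratio {ratio:.1f}")
```

Even with three codecs and one filter this means six full compression runs per dataset; with more codecs, filters and compression levels the exhaustive approach quickly becomes the headache described above.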
-
Excited to travel to #SciPy2023 to present our approach to exploring the Milky Way efficiently by using the #gaia dataset and leveraging #compression. I'll also introduce #Btune (btune.blosc.org), an AI tool for improving the compression process without the headaches of trying the many combinations of filters and codecs that are possible in #Blosc2. Looking forward to seeing you if you are coming too!