Search
698 results for “pydata_helsinki”
-
The #QuartoLive extension lets you plug interactive (editable) #RStats code directly into your Quarto documents. And you can easily build exercises into your docs, too! #PositConf2024 #RStats #PyData 🧵 12/17
-
The TargetEncoder PR has been merged into the scikit-learn main branch!
https://github.com/scikit-learn/scikit-learn/pull/25334
It's a very efficient way to deal with high cardinality categorical variables for supervised machine learning tasks. See the following quick tutorial to compare its performance with one-hot encoding, ordinal encoding and native support of categorical variables in Gradient Boosted Trees:
https://scikit-learn.org/dev/auto_examples/preprocessing/plot_target_encoder.html
It will be part of scikit-learn 1.3.
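To illustrate the general idea behind target encoding mentioned above, here is a minimal standard-library sketch. It is not scikit-learn's implementation: the function names and the fixed `smoothing` weight are hypothetical, chosen only for illustration, and the real estimator does more (for example, internal cross fitting). Each category is replaced by its mean target value, shrunk toward the global mean so that rare categories do not get extreme values.

```python
from collections import defaultdict


def fit_target_means(categories, targets, smoothing=2.0):
    """Map each category to a smoothed mean of the target (hypothetical helper)."""
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, y in zip(categories, targets):
        sums[category] += y
        counts[category] += 1
    # Shrink each category mean toward the global mean; `smoothing` acts
    # like a number of pseudo-observations placed at the global mean.
    encoding = {
        category: (sums[category] + smoothing * global_mean)
        / (counts[category] + smoothing)
        for category in counts
    }
    return encoding, global_mean


def transform(categories, encoding, global_mean):
    """Encode categories; unseen categories fall back to the global mean."""
    return [encoding.get(category, global_mean) for category in categories]


cities = ["a", "a", "a", "b"]
clicked = [1, 1, 0, 0]
encoding, global_mean = fit_target_means(cities, clicked, smoothing=2.0)
encoded = transform(["a", "b", "unseen"], encoding, global_mean)
```

Unlike one-hot encoding, this produces a single numeric column no matter how many distinct categories there are, which is why the approach suits high cardinality variables.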
-
Python and data news of the week, episode 72 🐍⚙️🐼
In short: pandas 2.0! New versions of Polars and Great Expectations, annotating birdsong, and signing off for a short vacation
https://astrojuanlu.substack.com/p/episodio-72-pandas-20
Support the newsletter by subscribing by email 📬
And follow @pandas_dev!
#pandas #polars #python #pydata #pycamp #noticieropythonydatos
-
https://DSLC.io welcomes you to week 14 of #TidyTuesday! We're exploring Repair Cafes Worldwide!
📁 https://tidytues.day/2026/2026-04-07
📰 https://insideclimatenews.org/news/11112025/todays-climate-repair-cafe-consumer-waste/
Submit a dataset! https://github.com/rfordatascience/tidytuesday/blob/main/.github/CONTRIBUTING.md
-
https://DSLC.io welcomes you to week 13 of #TidyTuesday! We're exploring Coastal Ocean Temperature by Depth!
📁 https://tidytues.day/2026/2026-03-31
🗞️ https://data.novascotia.ca/stories/s/a25g-piws
Submit a dataset! https://github.com/rfordatascience/tidytuesday/blob/main/.github/CONTRIBUTING.md
-
https://DSLC.io welcomes you to week 12 of #TidyTuesday! We're exploring One Million Digits of Pi!
📂 https://tidytues.day/2026/2026-03-24
🗞️ https://www.jpl.nasa.gov/edu/news/how-many-decimals-of-pi-do-we-really-need/
Submit a dataset! https://github.com/rfordatascience/tidytuesday/blob/main/.github/CONTRIBUTING.md
-
At Jacob Tomlinson's PyData NYC tutorial, I learned that it is now possible to access the same array on the GPU from both pytorch and cupy. I love how this lets you use the strengths of different libraries without dealing with extra memory copies.
https://nyc2024.pydata.org/cfp/talk/VAVRYW/
#PyDataNYC2024
#pytorch
#cupy
-
@melissawm @jni @simon_brooke @hynek @napari
I build them for #PyQtGraph. You use the html builder with a little bit of (emphasis on little) custom css and the sphinx pydata theme looks amazing as a docset. I also disabled sidebars which makes for better viewing in dash.
The longest part was going through all the docs to identify areas that were problematic. I would occasionally identify oddities.
-
Many of the keynote and session videos for PyData Global 2022 went up online today, and here's my talk:
https://youtu.be/IKFGFFtxgow?t=5463
-
I'll present at PyData Global, Thu Dec 01 13:30 US Pacific:
"Data Prep for Graphs"
https://global2022.pydata.org/cfp/talk/AH9DJD/
TL;DR: data prep phase in #graphdatascience work involves tools/techniques vastly different than data science in general. This stage of work is computationally expensive, and ironically much must be performed *prior* to loading into a graph DB.
Here's a sampler.
Also, we'll cover the https://github.com/DerwenAI/pynock proposal for Parquet serialization of graph data.
-
This library for matching power plant data from different sources looks very cool. From #PyPSA via Max Parzen:
#EnergyTransition #OpenData #PyData #EnergyData #Python
https://powerplantmatching.readthedocs.io/en/latest/index.html
-
OSSci will be in beautiful Prague this Thursday, May 16. Thanks to the PyData Prague team for putting together a great agenda. 50+ people registered. Please share with your networks. Thanks!
-
Can you name that algorithm based on this dataflow representation?
It's Linear Discriminant Analysis as implemented by Scikit Learn!
I finished up a notebook showing how you can build an Array API compatible library with the egglog e-graph library in Python and use that to optimize a #scikit-learn algorithm with #numba
https://egg-smol-python.readthedocs.io/en/stable/tutorials/sklearn.html
For more context, I gave a talk on the broader goals this summer:
https://egg-smol-python.readthedocs.io/en/stable/explanation/2023_07_presentation.html
-
In an open source project called `kglab` (since 2020) we've worked to build integration paths between these different camps, making them more compatible with PyData approaches, and providing tutorials with examples.
https://github.com/DerwenAI/kglab
https://derwen.ai/docs/kgl/tutorial/
-
Video is now available from our talk at Ray Summit 2022 "Graphs at scale with Ray, for AI in Manufacturing"
https://www.anyscale.com/ray-summit-2022/agenda/sessions/232
Lots of details discussed!
(free, requires registration details)
#graphthinking #graphdatascience #ai #manufacturing #ray #pydata
-
PyData Venezia: PyDataVE #26 - #ClimateNetworks & #Trading
April 30, 2026, 7:00:00 PM CEST - GMT+2
https://mobilizon.it/events/b8b25b8d-008a-4f9a-9f06-568d99c083e1
-
Really excited to attend #JupyterCon in Paris next month!
@vincent_m and I will give a full day tutorial on predictive Survival Analysis and Competing Risks modeling with a Gradient Boosting model assembled from generic scikit-learn building blocks. We will also introduce many concepts and model evaluation methodology using specialized libraries such as lifelines and scikit-survival.
Here is the full agenda for this session:
-
🧠 Masterclass spotlight: Decoupled Data (April 17)
Build a production-grade Python API with clean, reliable database connections.
In this full-day, hands-on masterclass, Dr. Kristian Rother explores the Repository Pattern and compares SQL, ORM, and NoSQL approaches in real systems.
🎟️ Space is limited.
👉 https://2026.pycon.de/masterclasses/decoupled-data-kristian-rother/
-
🧠 Masterclasses are published!
Our new Masterclass Day (April 17) features hands-on, group sessions — from AI and testing to data, security, and Python internals.
More masterclasses coming soon 👀
👉 https://2026.pycon.de/masterclasses/
-
I finally got plone-sphinx-theme to build using Sphinx Theme Builder, while it inherits from Sphinx Book Theme, which in turn inherits from PyData Sphinx Theme. Next steps include tidying it up, writing some docs, and packaging it for its first release. Then all #PloneCMS docs will have a single theme across all its projects. @plone
Thanks to @choldgraf, @pydata, and @pradyunsg for being the giants upon whose shoulders I can stand.
-
Asked to help debug an online maths question about #BoxPlots, and learnt there are conflicting interpretations of the 1st and 3rd #quartiles which matter with small datasets like teaching examples!
https://en.wikipedia.org/wiki/Quartile
It is also not easy to see from the documentation which interpretation any given plotting library actually uses, e.g. seaborn https://seaborn.pydata.org/generated/seaborn.boxplot.html
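A small sketch of how the quartile conventions can disagree on a teaching-sized dataset, using only Python's standard library. The dataset and the `median_of_halves` helper are made up for illustration, and nothing here claims to be what seaborn or any other plotting library does internally.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8]

# Two interpolation conventions offered by the standard library.
q_exclusive = statistics.quantiles(data, n=4, method="exclusive")
q_inclusive = statistics.quantiles(data, n=4, method="inclusive")


def median_of_halves(values):
    """Quartiles as medians of the lower and upper halves.

    Hypothetical helper: for an odd number of values the middle
    value is left out of both halves, which is only one of the
    conventions found in textbooks.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[-half:]
    return [
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
    ]


q_halves = median_of_halves(data)

print(q_exclusive)  # [2.25, 4.5, 6.75]
print(q_inclusive)  # [2.75, 4.5, 6.25]
print(q_halves)     # [2.5, 4.5, 6.5]
```

All three agree on the median but give different first and third quartiles, so the box edges of a box plot would land in different places for the same eight values.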
-
Really looking forward to PyData Global 2024 (online) !!
I'll be presenting
"Catching Bad Guys using open data and open models for graphs"
Thu Dec 5, 14:30-15:00 BST
https://global2024.pydata.org/cfp/talk/XMU9X9/
#PyDataGlobal #Senzing #ERKG #knowledgegraphs #AI #darkmoney #AML #entityresolution #opendata
-
#PyData #Pittsburgh is hosting a live Q&A tomorrow with the @pycon and @ThePSF team about how the community can get involved with #PyConUS24 in Pittsburgh!
What questions should we ask? Let us know in the thread!
-
Registration is now open for our October meetup: "🗄️ SQL generated from natural language, and MLFlow for putting models into production", this month at the Cepsa offices
https://www.meetup.com/pydata-madrid/events/296678892/
See you on Thursday the 19th at 19:00! And afterwards, off to the bar for some networking 🍻
#PyDataMadrid #PyData #Python #datascience #machinelearning #text2sql #llms #mlflow