#pudl — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #pudl, aggregated by home.social.
-
Everything is a lot right now. Want to tune it all out for ~10 minutes and help us better understand your energy data needs? We're kicking off our first (hopefully annual!) PUDL community survey. Please boost! Both current and possible future PUDL users encouraged.
-
For the energy data nerds: we've got a new data release out. PUDL v2024.11.0 includes quarterly updates to EIA 860M, EIA 923 year-to-date, EIA 930, EPA CEMS, and final 2023 data for the EIA 861. Comment in this GitHub discussion if you find anything weird. (or just to say Hi 👋)
https://github.com/orgs/catalyst-cooperative/discussions/3967
-
We want to apply to the Google Season of Docs for #PUDL but have never worked with an outside technical writer before. Does anybody have someone to recommend? It's a #Python project focused on producing open data describing the US energy system.
Cc: @turingway @choldgraf @yabellini @leahawasser
#PyData #WriteTheDocs #EnergyTransition #OpenData #OpenSource #EnergyMastodon
-
@catalystcoop @ZaneSelvans are there any other open utility databases/projects besides #PUDL?
-
Now that we're putting all our denormalized output tables and analyses into the #PUDL DB, we've got a lot more #metadata to manage, and are trying to figure out how to best combine existing tools to do it.
GitHub Discussion: https://github.com/orgs/catalyst-cooperative/discussions/2546
Currently we store column, table, and dataset level information in big JSON-ish #python data structures, which are converted into objects using @pydantic models based (loosely) on the #FrictionlessData tabular data package abstractions.
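For readers unfamiliar with the pattern, here is a minimal sketch of turning nested dictionary metadata into validated objects. It uses standard-library dataclasses as a stand-in for pydantic so that it runs without extra dependencies. The class names, table names, column names, and the list of allowed types are all illustrative assumptions, not PUDL's actual metadata or code.

```python
from dataclasses import dataclass

# Hypothetical metadata, loosely shaped like a Frictionless tabular data package.
# None of these names come from PUDL itself.
RAW_PACKAGE = {
    "name": "example_package",
    "resources": [
        {
            "name": "plants",
            "description": "One row per power plant.",
            "fields": [
                {"name": "plant_id", "type": "integer", "description": "Plant identifier."},
                {"name": "plant_name", "type": "string", "description": "Plant name."},
            ],
            "primary_key": ["plant_id"],
        }
    ],
}

# Assumed set of column types for this sketch.
ALLOWED_TYPES = {"integer", "number", "string", "boolean", "date", "datetime"}


@dataclass(frozen=True)
class Field:
    """Column-level metadata."""
    name: str
    type: str
    description: str = ""

    def __post_init__(self):
        # Reject column types outside the assumed vocabulary.
        if self.type not in ALLOWED_TYPES:
            raise ValueError(f"Unknown type {self.type!r} for column {self.name!r}")


@dataclass(frozen=True)
class Resource:
    """Table-level metadata."""
    name: str
    fields: tuple
    description: str = ""
    primary_key: tuple = ()

    def __post_init__(self):
        # Every primary key column must be a declared field.
        names = {f.name for f in self.fields}
        missing = [key for key in self.primary_key if key not in names]
        if missing:
            raise ValueError(f"Primary key columns {missing} not in table {self.name!r}")

    @classmethod
    def from_dict(cls, raw):
        return cls(
            name=raw["name"],
            fields=tuple(Field(**f) for f in raw["fields"]),
            description=raw.get("description", ""),
            primary_key=tuple(raw.get("primary_key", ())),
        )


@dataclass(frozen=True)
class Package:
    """Dataset-level metadata."""
    name: str
    resources: tuple

    @classmethod
    def from_dict(cls, raw):
        return cls(
            name=raw["name"],
            resources=tuple(Resource.from_dict(r) for r in raw["resources"]),
        )


package = Package.from_dict(RAW_PACKAGE)
print(package.resources[0].fields[0])
```

With pydantic, the type checks would come from the model's field annotations and validators rather than hand-written `__post_init__` methods, but the overall shape (dict in, validated object out) is the same.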
-
The @dagster folks interviewed us and did a write-up of our migration of #PUDL from a messy DIY #Python ETL to their orchestration framework, which has thus far been a very positive experience. Unlike most of their users, we are producing #OpenData outputs. Very curious to see if other non-profit / open-data users will adopt the platform:
https://dagster.io/blog/catalyst-cooperative-case-study
#DataEngineering #datadon #EnergyMastodon #OpenSource #EnergyTransition
-
As #PUDL moves toward distributing only data (and much more of it) rather than expecting everyone to run the software (with its 500+ dependencies...) we're going to deprecate our output management layer.
We see two possible deprecation paths. Should we go slow? Or rip the band-aid off now?
#OpenData #EnergyTransition #OpenSource #EnergyMastodon #datadon #pydata #EnergyTwitter
Discussion on GitHub here: https://github.com/orgs/catalyst-cooperative/discussions/2503
-
I did not realize you can post up to 100GB of data to #Kaggle and they provide access to computational resources and #Jupyter notebooks.
We're thinking about automatically posting all our #PUDL data there, and maybe running community competitions to help solve entity matching, anomaly detection, and imputation problems. Is there any downside to doing this?
#OpenData #MachineLearning #DataScience #EnergyTransition #EnergyTwitter #EnergyMastodon
https://www.kaggle.com/datasets/zaneselvans/catalyst-cooperative-pudl
-
A few #PUDL announcements!
https://github.com/orgs/catalyst-cooperative/discussions/2475
Our migration to @dagster is progressing rapidly. If you use PUDL and run the ETL yourself, and need help getting Dagster set up, feel free to sign up for office hours:
https://calendly.com/catalyst-cooperative/pudl-office-hours
Or ask for help in our GitHub discussions:
https://github.com/orgs/catalyst-cooperative/discussions
#OpenSource #OpenData #EnergyTransition #EnergyMastodon #datadon #DataEngineering #EnergyTwitter
-
We finally have the whole #PUDL data pipeline running in @dagster and the visualizations make it very clear where we need to parallelize stuff. 🐌 #datadon
Anybody else running particularly large or complex #OpenData / #OpenSource DAGs with these tools? We'd love to compare notes.
It would also be cool if there were some way to expose all this information to our users in a read-only form, so they can see what's happening with the nightly builds too.
-
This kind of analysis has been done for the US ISO/RTO regions before, but this is the first publicly available analysis of non-ISO/RTO regions like the Southeast and West. The data illuminates massive opportunities for reduced reliance on coal and increased customer savings over time. A lot of the ongoing plant capital expenses and non-fuel O&M costs come from FERC Form 1 data liberated by #CatalystCoop #PUDL
-
Catalyst Cooperative has brought together a bunch of data for the Deployment Gap Education Fund to track anti-renewable energy policies at the county and local level, and understand how they intersect with and constrain deployment. Public Tableau dashboard here:
#EnergyTransition #EnergyMastodon #EnergyTwitter #ClimateChange #Solar #CatalystCoop #PUDL #OpenData #datascience
-
The Open Grid Emissions Initiative, which uses #CatalystCoop #PUDL data as one of its main inputs, is trying to do this for historical analyses. Even if it's never useful for dispatching demand, it'll be hugely valuable for modeling the feasibility of 24/7 renewables.
-
Yo, here is a new PUDL data release, including updates to all the data through the end of 2021. #FERC Form 1, #EIA 860/923, #EPA CEMS, etc. Tarball includes the #SQLite DBs + #ApacheParquet for the CEMS, a #Docker container with the software environment used to create them, and some example #Jupyter notebooks!
#OpenData #EnergyMastodon #EnergyTwitter #Energy #PyData #ClimateChange #Policy #CatalystCoop #pudl #DataEngineering #DataScience
-
Woo! Draft deployment of our extracted #FERC #XBRL data using @simon's #Datasette. Including 2021 data for FERC forms 1, 2, 6, 60, and 714. Still need to clean up the metadata, but hey it's a relational DB usable for analysis and transformation... not a bunch-o-gigabytes of XML. #sqlite #OpenData #EnergyTwitter #EnergyMastodon #PUDL
https://data.catalyst.coop/
-
The 2021 #FERC Form 1 data has been more recalcitrant, since they've switched to using #XBRL for reporting (after 27 years of Visual FoxPro...), but we're close!
It looks like there's enough structured information in the XBRL taxonomies that we can reproduce all the calculations and tag the data with the relevant FERC accounting categories.
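As a rough illustration of what "reproducing the calculations" can mean: XBRL calculation relationships describe a parent value as a weighted sum of child values, with weights that are typically +1 or -1. The sketch below checks reported totals against those sums. The concept names, values, and tolerance are hypothetical, and this is not the code used for the FERC extraction.

```python
# Hypothetical calculation relationships: parent -> list of (child, weight).
CALCULATIONS = {
    "total_operating_expenses": [
        ("fuel_expense", 1.0),
        ("maintenance_expense", 1.0),
        ("expense_credits", -1.0),
    ],
}

# Hypothetical reported facts for a single filing.
FACTS = {
    "fuel_expense": 120.0,
    "maintenance_expense": 45.0,
    "expense_credits": 5.0,
    "total_operating_expenses": 160.0,
}


def check_calculations(facts, calculations, tolerance=0.5):
    """Return {parent: (reported, computed)} for totals that disagree
    with the weighted sum of their children by more than `tolerance`."""
    mismatches = {}
    for parent, children in calculations.items():
        # Skip relationships that cannot be checked because a value is missing.
        if parent not in facts or any(child not in facts for child, _ in children):
            continue
        computed = sum(weight * facts[child] for child, weight in children)
        if abs(computed - facts[parent]) > tolerance:
            mismatches[parent] = (facts[parent], computed)
    return mismatches


print(check_calculations(FACTS, CALCULATIONS))
```

An empty result means every checkable total matches its components within the tolerance; any entries point at values worth inspecting by hand.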