#data-science — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #data-science, aggregated by home.social.
-
Season 1 Lesson 21 Part 5 - Your First Steps in Python: Duplicates Behaviour (Dictionary) #vibecoding #softwaredeveloper #jupyternotebook #dataengineer #codingtutorial #pythonprogramming #dataanalysis #datascience #pythoncode #learncoding #softwarengineer #machinelearning #python
-
Regex vs. LLM for B2B document extraction. This week, I tried out both.
:blobcoffee: The rule-based pipeline with pytesseract + regex worked perfectly for Layout A. For Layout B? Every single field returned None.
:blobcoffee: Because "PO Number" and "Order Reference" are the same thing for a human. Not for a regex pattern.
:blobcoffee: The LLM-based approach (pytesseract + Ollama + LLaMA 3) extracted both layouts correctly, without touching a single rule. It even normalized the date format automatically.
:blobcoffee: But LLMs aren't always the right answer. If your documents are stable, speed matters at scale, or explainability is required, regex might still win.
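A minimal sketch of why the rule-based approach breaks on Layout B. The sample texts, labels, and function names here are hypothetical stand-ins for OCR output and are not taken from the linked article:

```python
import re

# Hypothetical OCR output for two layouts (illustrative only).
LAYOUT_A = "PO Number: 12345"
LAYOUT_B = "Order Reference: 98765"

# A rule written against Layout A's label only.
PO_PATTERN = re.compile(r"PO Number:\s*(\d+)")

# Widening the rule by hand: every synonym has to be listed explicitly.
PO_PATTERN_WIDE = re.compile(r"(?:PO Number|Order Reference):\s*(\d+)")


def extract_po(text, pattern=PO_PATTERN):
    """Return the captured number as a string, or None if no label matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


print(extract_po(LAYOUT_A))                   # 12345
print(extract_po(LAYOUT_B))                   # None
print(extract_po(LAYOUT_B, PO_PATTERN_WIDE))  # 98765
```

The regex stays fast and easy to explain, but each new label variant means another hand-written rule.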
Full comparison with code and trade-off breakdown on TDS: https://shorturl.at/v4gdl
#Python #DataScience #business #technology #dataengineering #LLM #Automation #OCR
-
1/5
The "algorithmic ultimatum" has arrived. Prediction markets just pushed "Power Plants Day"—the total neutralization of the Iranian electrical grid—to a 94% probability for Friday, May 15. When the "smart money" hits this level of certainty, we aren't looking at a guess; we're looking at a countdown.
#Geopolitics #DataScience #War #Economy #Intelligence #Technology #Politics #Defense #Strategy #GlobalNews
-
🌑 Astrobites (M. Ogborn, Penn State): wandering supermassive black holes can be revealed via tidal disruption events (TDEs) – stars ripped apart by an SMBH. The event AT2024tvd lies ~0.8 kpc off the host galaxy's nucleus – the SMBH likely drifted away. Future key tool: Rubin/LSST.
📅 May 13, 2026
👉 https://astrobites.org/2026/05/13/wandering-smbh/
-
💧 Interstellar comet 3I/ATLAS carries water from another planetary system: the ALMA radio interferometer measured ~30× more semi-heavy water (HDO) than Solar System comets and ~40× more than Earth's oceans. It formed in extreme cold below 30 K. Published in Nature Astronomy.
📅 April 24, 2026
👉 https://www.almaobservatory.org/en/press-releases/alma-reveals-interstellar-comet-3i-atlas-formed-in-a-far-colder-world-than-our-own/
-
☀️ A University of Sheffield team (R. Jain) has published in Solar Physics: an AI decodes the Sun's p-modes – acoustic waves carrying info from deep inside our star. 30 years of helioseismic data + machine learning = an independent forecaster of solar activity, key to protecting satellites & power grids.
📅 May 12, 2026
👉 https://phys.org/news/2026-05-scientists-ai-sun-acoustic-heartbeat.html
-
🔭 Citizen scientists from Backyard Worlds: Planet 9 have DOUBLED the known population of brown dwarfs! A new paper (Schneider et al., Astronomical Journal) reports 3,000+ motion-confirmed L & T dwarf candidates found by volunteers in WISE/NEOWISE-R data via Zooniverse — over the project's 10 years.
📅 May 13, 2026
👉 https://science.nasa.gov/get-involved/citizen-science/nasa-volunteers-double-known-population-of-brown-dwarfs/
-
Season 1 Lesson 21 Part 4 - Your First Steps in Python: Useful Dictionary Methods #learncoding #softwarengineer #machinelearning #vibecoding #softwaredeveloper #jupyternotebook #dataengineer #python #codingtutorial #pythonprogramming #dataanalysis #datascience #pythoncode