#datascience — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #datascience, aggregated by home.social.
-
Season 1 Lesson 21 Part 5 - Your First Steps in Python Duplicates Behaviour Dictionary #vibecoding #softwaredeveloper #jupyternotebook #dataengineer #codingtutorial #pythonprogramming #dataanalysis #datascience #pythoncode #learncoding #softwarengineer #machinelearning #python
-
Regex vs. LLM for B2B document extraction. This week, I tried out both.
:blobcoffee: The rule-based pipeline with pytesseract + regex worked perfectly for Layout A. For Layout B? Every single field returned None.
:blobcoffee: Because "PO Number" and "Order Reference" are the same thing for a human. Not for a regex pattern.
:blobcoffee: The LLM-based approach (pytesseract + Ollama + LLaMA 3) extracted both layouts correctly, without touching a single rule. It even normalized the date format automatically.
:blobcoffee: But LLMs aren't always the right answer. If your documents are stable, speed matters at scale, or explainability is required, regex might still win.
Full comparison with code and trade-off breakdown on TDS: https://shorturl.at/v4gdl
#Python #DataScience #business #technology #dataengineering #LLM #Automation #OCR
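A minimal sketch of the failure mode described in the post above, using only Python's standard `re` module. The sample strings, the patterns, and the helper names `extract_po` and `extract_po_wide` are hypothetical illustrations, not the author's pipeline; the OCR step (pytesseract) and the LLM step are left out entirely.

```python
import re

# Hypothetical OCR output for two invoice layouts (illustrative text, not from the post).
LAYOUT_A = "PO Number: 12345\nDate: 2026-05-01"
LAYOUT_B = "Order Reference: 67890\nDate: 01/05/2026"

# A rule written against Layout A's label only.
PO_PATTERN = re.compile(r"PO Number:\s*(\d+)")

# The same rule widened by hand: every label synonym needs its own alternative.
PO_PATTERN_WIDE = re.compile(r"(?:PO Number|Order Reference):\s*(\d+)")


def extract_po(text):
    """Return the number after the Layout A label, or None if the label is absent."""
    match = PO_PATTERN.search(text)
    return match.group(1) if match else None


def extract_po_wide(text):
    """Return the number after either known label, or None if neither is present."""
    match = PO_PATTERN_WIDE.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_po(LAYOUT_A))       # matches the Layout A label
    print(extract_po(LAYOUT_B))       # label differs, so the rule finds nothing
    print(extract_po_wide(LAYOUT_B))  # works only because the synonym was added by hand
```

The narrow rule is fast and easy to audit, but each new layout means another hand-written alternative, which is the trade-off the post weighs against an LLM-based extractor.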
-
1/5
The "algorithmic ultimatum" has arrived. Prediction markets just pushed "Power Plants Day"—the total neutralization of the Iranian electrical grid—to a 94% probability for Friday, May 15. When the "smart money" hits this level of certainty, we aren't looking at a guess; we're looking at a countdown.
#Geopolitics #DataScience #War #Economy #Intelligence #Technology #Politics #Defense #Strategy #GlobalNews
-
🌑 Astrobites (M. Ogborn, Penn State): wandering supermassive black holes can be revealed via tidal disruption events (TDEs) – stars ripped apart by an SMBH. The event AT2024tvd lies ~0.8 kpc off the host galaxy's nucleus – the SMBH likely drifted away. Future key tool: Rubin/LSST.
📅 May 13, 2026
👉 https://astrobites.org/2026/05/13/wandering-smbh/
-
💧 Interstellar comet 3I/ATLAS carries water from another planetary system: the ALMA radio interferometer measured ~30× more semi-heavy water (HDO) than Solar System comets and ~40× more than Earth's oceans. It formed in extreme cold below 30 K. Published in Nature Astronomy.
📅 April 24, 2026
👉 https://www.almaobservatory.org/en/press-releases/alma-reveals-interstellar-comet-3i-atlas-formed-in-a-far-colder-world-than-our-own/
-
☀️ A University of Sheffield team (R. Jain) has published in Solar Physics: an AI decodes the Sun's p-modes – acoustic waves carrying info from deep inside our star. 30 years of helioseismic data + machine learning = an independent forecaster of solar activity, key to protecting satellites & power grids.
📅 May 12, 2026
👉 https://phys.org/news/2026-05-scientists-ai-sun-acoustic-heartbeat.html
-
🔭 Citizen scientists from Backyard Worlds: Planet 9 have DOUBLED the known population of brown dwarfs! A new paper (Schneider et al., Astronomical Journal) reports 3,000+ motion-confirmed L & T dwarf candidates found by volunteers in WISE/NEOWISE-R data via Zooniverse — over the project's 10 years.
📅 May 13, 2026
👉 https://science.nasa.gov/get-involved/citizen-science/nasa-volunteers-double-known-population-of-brown-dwarfs/
-
Season 1 Lesson 21 Part 4 - Your First Steps in Python Useful Dictionary Methods #learncoding #softwarengineer #machinelearning #vibecoding #softwaredeveloper #jupyternotebook #dataengineer #python #codingtutorial #pythonprogramming #dataanalysis #datascience #pythoncode
-
AI doesn’t create bias, it inherits it – how do we ensure fairness when it comes to automated decisions?
#AI #Tech #MachineLearning #Ethics #Bias #Automation #DataScience #DigitalRights #HumanRights #Innovation #Data #AIEthics #Algorithms #AIBias #Fairness
https://the-14.com/ai-doesnt-create-bias-it-inherits-it-how-do-we-ensure-fairness-when-it-comes-to-automated-decisions/ -
-
Season 1 Lesson 21 Part 3 - Your First Steps in Python Merge Dictionaries #pythoncode #codingtutorial #pythonprogramming #learncoding #softwarengineer #machinelearning #vibecoding #softwaredeveloper #jupyternotebook #dataengineer #python #dataanalysis #datascience
-
Registration for the Statistics Globe Hub and all Statistics Globe courses will close in one week: https://statisticsglobe.com/courses
I’ll be on vacation until June 15 and won’t be able to process enrollments during that time.
#statistics #datascience #rstats #python #machinelearning #ai
-
Running Posit Connect, Workbench, or Package Manager in production?
The Posit Health Check (PHC) runs 100+ checks across your entire setup, identifying critical issues, security gaps, and configuration drift, then gives you tailored recommendations to sort it.
Learn more and contact us: https://www.jumpingrivers.com/posit/health-check/
-
Season 1 Lesson 21 Part 2 - Your First Steps in Python Nested Dictionaries #datascience #machinelearning #dataanalysis #pythoncode #codingtutorial #pythonprogramming #learncoding #softwarengineer #vibecoding #softwaredeveloper #jupyternotebook #dataengineer #python
-
Hm OK I just noticed the automatic git blame displayed in the status bar of Positron for wherever your cursor is. That's pretty cool.
#Positron #git #dataScience
-
RE: https://zirk.us/@MidniteMikeWrites/116523605434204476
I didn’t set out to write about LLMs but felt compelled to given the media’s abject failure to explain the tech.
I think there’s still a need for good information, but I’m not an expert. I’m looking for recommendations on technical people to talk to about this from a social impact lens. Is this a rare combo or are there such people?
#technology #research #STS #computerscience #AI #datascience #machinelearning
-
Season 1 Lesson 21 Part 1 - Your First Steps in Python Loop Dictionaries Finally Explained #softwaredeveloper #vibecoding #datascience #machinelearning #dataanalysis #pythoncode #codingtutorial #pythonprogramming #learncoding #softwarengineer #jupyternotebook #python #dataengineer
-
https://winbuzzer.com/2026/05/12/microsoft-opens-48-state-grid-dataset-for-power-research-xcxwbn/
Microsoft has released a 48-state U.S. grid dataset for power research, giving teams a shared model to test congestion, siting and transmission upgrades.
#MicrosoftResearch #Microsoft #Energy #DataCenters #DataScience #ITInfrastructure #CriticalInfrastructure