home.social

#bigquery — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #bigquery, aggregated by home.social.

  1. One day at #Google Munich for the new services from Google.

    Lots of new stuff. But one thing I see throughout the industry is that so many #aiagents examples don't require AI at all; they are pure automation.

    Also interesting to see that AI services can be called directly in #bigquery... But thinking about the #aiact and the respective #governance, this means that a lot of applications should appear in the company's AI register. 😕

    Not a hot topic, but grounds for more scanning of the query log. And... automation, of course with an agent 😉

    #googlecloudnext2026
    #googlecloudnext

  2. CloudSync MLBridge and Its Impact…

    CloudSync MLBridge is a tool designed to make it easier to synchronize data between Google Cloud Datastore and BigQuery. It lets companies integrate their systems efficiently, cutting the time needed to keep data current in real time.

    norvik.tech/news/analisis-clou

    #Technology #Cloudsync #GoogleCloud #Bigquery #Datastore #NorvikTech #DesarrolloSoftware #TechInnovation

  3. I'm hiring an Analytics Engineer (GCP) to join my team at RHR International.
    What you'd actually be doing: building and owning our analytics foundation in a GCP-first Google Cloud environment — BigQuery, Data/Looker Studio, Python, SQL, GitHub, Docker. Real production work, version-controlled and documented, not throwaway queries.
    RHR is a leadership consulting firm that's been around for 80+ years. We're cloud-first, SaaS-only, no on-prem. Small IT team, which means your work matters immediately.
    What I'm looking for beyond the technical skills: curiosity, self-direction, and the ability to explain what you built and why to people who don't write code. Bonus points if you've fixed something nobody asked you to fix.
    Hybrid in Chicago preferred, remote considered.
    Apply here: linkedin.com/jobs/view/4399748
    If you know someone who fits, I'd appreciate the tag or share.

    #Hiring
    #AnalyticsEngineer
    #GCP
    #BigQuery
    #DataEngineering
    #Chicago
    #RHRInternational
    #Google
    #GoogleCloud
    #GoogleCloudPlatform

  4. Data Studio is back: Google kills Looker Studio name for good: Google reversed its 2022 Looker Studio rebrand on April 11, 2026, restoring the Data Studio name and expanding the platform with BigQuery agents and Colab apps. ppc.land/data-studio-is-back-g #DataStudio #Google #BigQuery #LookerStudio #DataAnalytics

  6. geoparquet-io: Fast #GeoParquet tool: geoparquet-io is an open-source #CLI tool and #Python library for converting, inspecting, optimizing, and partitioning #GeoParquet files, automatically applying GeoParquet performance best practices along the way. Its extract command can pull geodata from sources such as #WFS, #Esri ArcGIS Feature Services, or #BigQuery into GeoParquet.
    spatialists.ch/posts/2026/04/0 #GIS #GISchat #geospatial #SwissGIS

  7. TCO, or the Total Cost of Ownership of Modern ETL Approaches for MPP Databases

    What this article is about: I want to compare the TCO of good old ETL tools such as Informatica, ODI, MarkitEDM and the like vs. dbt + Airflow and the like. It is easy to compare the cost of licenses, or of compute and storage in the case of a cloud database, but TCO is much harder: the cost of building a single feature, the cost of support, of maintenance, of changes. It is tempting to count only license and compute costs and assume everything else is equal, but it is not. Cloud MPP databases are usually cheaper for storage and compute and carry no license fee, so it is tempting to take the same license-free approach to ETL as well, but there are drawbacks:

    habr.com/ru/articles/1014362/

    #mppбазы #informatica #dbt #etl #airflow #oracle #bigquery

  8. Claude Code + BigQuery → an analytics agent that works on your data 24/7

    No copying queries around. No middlemen. No switching between tools.

    All of this by connecting Claude directly to BigQuery via MCP.

    #iToSięLiczy
    #AI #BigQuery #GoogleCloud #GA4 #DataDrivenMarketing #Automatyzacja #MarketingAnalytics

  9. Building an Intelligent Q&A System and Corporate Knowledge Base on StarRocks + DeepSeek

    Typical scenarios built on StarRocks + DeepSeek. DeepSeek generates high-quality embeddings and answers; StarRocks provides highly efficient vector search and storage. Together they form the foundation for accurate and scalable AI solutions.

    habr.com/ru/articles/980410/

    #starrocks #deepseek #vector_index #rag #bigdata #bigquery
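
    The embedding-plus-vector-search split above can be sketched in a few lines. This is a hedged illustration only: the trivial bag-of-words `embed` stands in for a DeepSeek embedding call, and a linear scan over a list stands in for a StarRocks vector index.

```python
import math

# Stand-in for a DeepSeek embedding call: a trivial bag-of-words vector.
VOCAB = ["invoice", "refund", "shipping", "contract", "salary"]

def embed(text: str) -> list[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Stand-in for a StarRocks vector index: a list scanned by cosine similarity.
knowledge_base = [
    "how to request a refund for a damaged item",
    "shipping times for international orders",
    "salary review process for employees",
]
index = [(doc, embed(doc)) for doc in knowledge_base]

def search(query: str, top_k: int = 1) -> list[str]:
    """Return the top_k most similar knowledge-base entries for a query."""
    qv = embed(query)
    ranked = sorted(index, key=lambda d: cosine(qv, d[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

print(search("where is my refund"))
```

    In a real deployment, `embed` would call the embedding model and `search` would run an approximate-nearest-neighbour query against the vector index rather than a full scan.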

  13. #ITByte: Amazon #Redshift and Google #BigQuery are two of the most popular fully managed, petabyte-scale cloud #Data warehouses.

    Here is a short comparison between the two.

    knowledgezone.co.in/posts/Clou

  14. RE: saptodon.org/@nextlytics/11550

    Our #webinar from last week is available as an on-demand recording for anyone who missed it. How can #SAP Business Data Cloud interact with a wider ecosystem of modern data platforms like #Databricks, #Snowflake, #BigQuery, and (new this week) #Fabric? Where does this trend lead?

    Spoiler: maybe truly open players have the advantage in the future interoperable data ecosystem over old-fashioned proprietary-first vendors...

    #datascience #dataengineering #datawarehouse #datalakehouse #lakehouse

  15. 🚀 TopicWatchdog – Week 3: Stable Topics with BERTopic

    KMeans worked, but cluster IDs kept jumping across retrains. This week I added a Python BERTopic stage with a BigQuery registry → stable topic IDs!

    🟢 UMAP + HDBSCAN
    🟢 Stable IDs via registry
    🟢 Auto-labels with Gemini
    🟢 Looker Studio dashboards

    📊 3,802 topics → 2,472 mapped, top clusters: migration, economy, climate, politics.
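
    The stable-ID idea above can be sketched roughly like this. It is a hedged illustration, not the author's actual scheme: a plain dict stands in for the BigQuery registry table, and topics are matched by normalized label rather than by embedding distance.

```python
# Minimal sketch of a topic registry that keeps IDs stable across retrains.
registry: dict[str, int] = {}   # normalized label -> stable topic id
next_id = 0

def stable_id(label: str) -> int:
    """Return the registered ID for a topic label, minting a new one if unseen."""
    global next_id
    key = label.strip().lower()
    if key not in registry:
        registry[key] = next_id
        next_id += 1
    return registry[key]

# Retrain 1 assigns arbitrary cluster numbers; we map them to stable IDs.
run1 = {0: "Migration", 1: "Economy", 2: "Climate"}
ids1 = {stable_id(lbl) for lbl in run1.values()}

# Retrain 2 shuffles the cluster numbers, but the stable IDs stay put.
run2 = {0: "Climate", 1: "Migration", 2: "Economy"}
ids2 = {stable_id(lbl) for lbl in run2.values()}

print(ids1 == ids2)  # same stable IDs despite reshuffled cluster numbers
```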

    👉 Blog: dracoblue.net/dev/topicwatchdo

    #TopicWatchdog #BERTopic #BigQuery
    #Clustering
    #MachineLearning
    #FediScience

  16. „Kickoff (Week 1): Extracting Topics & Claims from German Politics Videos“

    dracoblue.net/dev/kickoff-topi

    „Limitations: small sample size (≈175 videos), no stable Topic IDs yet, clustering not applied, and claims only minimally canonicalized.“

    #LLM #gcp #bigquery #YouTube #Politics #Research #transcripts #gemini

  17. How messy is #terraform with #gcp?

    I'm trying to build a system where a #rust worker ingests HTTP request data via #cloudrun and passes it into #bigtable, from which a further ingestion recipe exports the data into #bigquery.

    I tried to write a complete Terraform declaration for this but ran into permission issues. Then I tried a setup that first generates all the artefacts, like service accounts and the Docker image, and then refers to them from the Terraform builds, but I hardly see the value of doing it that way.

    Does anyone have an example of #cloudrun #cdc? I am new to this and I feel really slow.

    #lazyweb #askfedi

  18. New geospatial data in Google BigQuery: #Google is adding geospatial content to its #DWH solution #BigQuery. Additions encompass annotated Street View #imagery, Places (#POI) data, and #traffic data, among others.
    spatialists.ch/posts/2025/04-1 #GIS #GISchat #geospatial #SwissGIS

  19. Diving into #Vermont wildlife for the #30DayChartChallenge "circle" day! 🦌 Using #Python & #plotly to compare monthly #Moose and #BlackBear sightings 🐻 Data wrangled with #BigQuery and #SQL. Any guesses which animal is seen more consistently throughout the year? 😜 #DataViz #Wildlife #RadialChart

  20. GA4 intraday exports and cookieless pings

    I build a lot of reports for clients that use the GA4 BigQuery export as a source.

    Now.. that works like a charm. But.. you will need to wait some time to get processed data from the events_ tables.

    More recent data will appear in the streaming _intraday_ tables, if you have that enabled. But.. that data is not always complete! Especially when your site has consent mode enabled, and does not set a cookie until after consent.

    Here’s how it works:

    The scenario

    Someone visits the site for the first time (source: some campaign), gets confronted with the cookie banner, and then clicks accept.

    We tagged the site correctly, so this is what happens:

    1. a page_view event triggers (with URL parameters) – and notices analytics consent is denied (the default)
    2. the tracker attaches some parameters to this hit, to help processing
      • a session is started
      • this is the first visit
    3. there is an item list on the page: view_item_list event is triggered
    4. the cookiebanner pops up (event: cookiebar_view)
    5. the visitor clicks accept (event: cookiebar_accept) and the tracker gets sent a granted signal
    6. now the cookie can be used, and is attached to an automatic user_engagement event

    Sounds simple. Now, let's see what is streamed into BigQuery:

    The streaming data gap

    Basically, the intraday tables store what happens, as it happens.

    • cookie field ( user_pseudo_id ) is filled in on hits on/after consent
    • cookie field is NULL for hits before consent

    As it should be, right? But there’s a third bullet:

    • first batch of events will not appear in the intraday table!

    Here’s what we see (most recent hit first, read from bottom to top)

    1. the page_view is missing in the streaming table
    2. the collected_traffic_source information is missing (it is always only filled in on the first batch of events)
    3. As a byproduct, we also do not see the session start and first visit
    4. the other events are all sent without a cookie
    5. after consent, we see the user_pseudo_id – finally
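
    The gap can be illustrated with a tiny simulation. This is a hedged sketch, not GA4 code: it drops the first batch and blanks the cookie on pre-consent hits, mirroring the behaviour described above; the `abc123` pseudo-ID is made up.

```python
# Toy event stream: (batch_number, event_name, consent_granted_at_this_hit)
stream = [
    (1, "page_view", False),        # first batch, before consent
    (2, "view_item_list", False),
    (2, "cookiebar_view", False),
    (3, "cookiebar_accept", False),
    (4, "user_engagement", True),   # first hit after consent
]

def intraday_view(events):
    """What the streaming intraday table shows: the first batch is missing,
    and user_pseudo_id is NULL on all hits before consent."""
    rows = []
    for batch, name, consented in events:
        if batch == 1:
            continue  # the first batch never reaches the intraday table
        rows.append({"event_name": name,
                     "user_pseudo_id": "abc123" if consented else None})
    return rows

def processed_view(events):
    """What the next-day processed table shows: every event has a row,
    and Google has backfilled the cookie on pre-consent hits."""
    return [{"event_name": name, "user_pseudo_id": "abc123"}
            for _, name, _ in events]

print(len(intraday_view(stream)), len(processed_view(stream)))
```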

    The next day.. Google has glued it all together

    Processed data: every event has a row

    The following is in the processed data: (most recent hit first, read from bottom to top)

    • The page_view event and all other events leading up to the consent have a cookie attached to it! Google rescued that information
    • the “Attached” parameters to the hit expand to two extra rows
      • session_start
      • first_visit
    • we have source information: collected_traffic_source is present – on the first batch, as normal

    Not visible in the screenshot: session_traffic_source_last_click – the session information is properly filled in.

    The consequences

    If you decide to use intraday tables in your BigQuery reports: be aware that although the information is fresh (no pun intended, GA360 users), it's incomplete.

    • intraday misses crucial events, namely the first batch (most often a page_view)
      • bye bye landing_page reports based on page_views
      • bye bye traffic source reports based on session_traffic_source_last_click or collected_traffic_source
    • intraday misses cookies on some events
      • which is not too much of an issue, really
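
    A common workaround, with the above caveats in mind, is to union the processed daily tables with the intraday tables, taking intraday rows only for dates that have not been processed yet. Here is a hedged sketch that just builds the SQL string; the dataset name is hypothetical, while `events_*` and `events_intraday_*` are the standard GA4 export table names.

```python
def ga4_union_query(dataset: str, processed_until: str) -> str:
    """Build a query that reads processed daily tables up to a cutoff date
    and falls back to the streaming intraday tables after it.
    `processed_until` is a YYYYMMDD table suffix, e.g. '20260410'."""
    return f"""
SELECT event_date, event_name, user_pseudo_id
FROM `{dataset}.events_*`
WHERE _TABLE_SUFFIX <= '{processed_until}'
UNION ALL
SELECT event_date, event_name, user_pseudo_id
FROM `{dataset}.events_intraday_*`
WHERE _TABLE_SUFFIX > '{processed_until}'
""".strip()

print(ga4_union_query("my-project.analytics_123456", "20260410"))
```

    Keep the caveat above in mind: the intraday side of this union will still be missing the first batch of events for consent-mode visitors.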

    Your experiences?

    Do you use intraday tables in your models? Have you found clever workarounds to get the correct data in?

    Let me know! Drop a comment here, or send me a bluesky message!

    Still here?

    Check out GA4Dataform – a product I’ve helped build that turns the GA4 BigQuery exports into usable tables!

    Related posts:

    • Google Analytics 4 truncates page location
    • Making sense of Event Parameters in GA4
    • Make your GA4 life easier: Some powertips!
    • Smart incremental GA4 tables in Dataform

    #bigQuery #consentMode #cookies #ga4 #tagging

  21. Today is DBA Appreciation Day!

    If you have a DBA in your company who relentlessly takes care that your databases are humming along and delivering query results, today is the day to say Thank You!

    #PostgreSQL #MySQL #MariaDB #Oracle #Greenplum #SQLite #SQLServer #MongoDB #Redis #Snowflake #DB2 #Elasticsearch #Teradata #InfluxDB #Firebird #Informix #Couchbase #CouchDB #Vertica #DuckDB #CockroachDB #SAPHana #Splunk #DynamoDB #BigQuery #Hive #Neo4j ...

    dbaday.org/

  22. #LookerStudio - Blog post

    Usually, BigQuery helps so much to create high-level Looker Studio reports.

    In this article it is the opposite: Looker Studio comes to the aid of BigQuery:
    How to explore the schemas of #BigQuery tables quickly with Looker Studio

    bit.ly/3u56gOB
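
    For readers who prefer SQL over a dashboard, BigQuery's INFORMATION_SCHEMA views expose the same schema information, and the result makes a convenient Looker Studio data source. A hedged Python sketch that only builds the query string; the project and dataset names are placeholders.

```python
def schema_query(project: str, dataset: str) -> str:
    """Build a query listing every column of every table in a dataset
    via BigQuery's INFORMATION_SCHEMA.COLUMNS view."""
    return (
        f"SELECT table_name, column_name, data_type, is_nullable\n"
        f"FROM `{project}.{dataset}.INFORMATION_SCHEMA.COLUMNS`\n"
        f"ORDER BY table_name, ordinal_position"
    )

print(schema_query("my-project", "analytics"))
```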

  23. Tracking the #Fake #GitHub #Star #BlackMarket with #Dagster, #dbt and #BigQuery | #DagsterBlog

    "We knew there were dubious services out there offering #StarsForCash, so we set up a dummy repo (frasermarlow/tap-bls) and purchased a bunch of stars. From these, we devised a profile for fake accounts and ran a number of #repos through a test using the GitHub REST API (via pygithub) and the GitHub Archive database."

    dagster.io/blog/fake-stars

  24. It was about time... Finally!

    Item-scoped custom dimensions in #GA4 are being rolled out in explorations...

    No signs of them in #LookerStudio, #DataAPI or #BigQuery export yet...

  27. RT @isb_cgc: Proteomic Data Commons release V2.15 case, file, and quant data are now available in #BigQuery. Check out the new tables using our BigQuery Table Search tool at isb-cgc.appspot.com/bq_meta_se and filter Source by PDC.
    #proteome #NCIProteomics #CancerResearch #proteomic twitter.com/isb_cgc/status/162