#bigquery — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #bigquery, aggregated by home.social.
-
One day at #Google Munich for the new services from Google.
Lots of new stuff. But one thing that I see throughout the industry is that so many #aiagents examples don't require AI but are pure automation.
Also interesting to see that AI services can be called directly in #bigquery... But thinking about the #aiact and the respective #governance, this means that a lot of applications should appear in the company's AI register. 😕
Not a hot topic, but subject to more scanning of the query log. And.. Automation. Of course with an agent 😉
-
CloudSync MLBridge and its impact…
CloudSync MLBridge is a tool designed to make it easier to synchronize data between Google Cloud Datastore and BigQuery. It lets companies integrate their systems efficiently, reducing the time needed to keep data updated in real time.
https://norvik.tech/news/analisis-cloudsync-mlbridge-google-cloud
#Technology #Cloudsync #GoogleCloud #Bigquery #Datastore #NorvikTech #DesarrolloSoftware #TechInnovation
-
I'm hiring an Analytics Engineer (GCP) to join my team at RHR International.
What you'd actually be doing: building and owning our analytics foundation in a Google Cloud (GCP)-first environment — BigQuery, Data/Looker Studio, Python, SQL, GitHub, Docker. Real production work, version-controlled and documented, not throwaway queries.
RHR is a leadership consulting firm that's been around for 80+ years. We're cloud-first, SaaS-only, no on-prem. Small IT team, which means your work matters immediately.
What I'm looking for beyond the technical skills: curiosity, self-direction, and the ability to explain what you built and why to people who don't write code. Bonus points if you've fixed something nobody asked you to fix.
Hybrid in Chicago preferred, remote considered.
Apply here: https://www.linkedin.com/jobs/view/4399748962/
If you know someone who fits, I'd appreciate the tag or share.
#Hiring
#AnalyticsEngineer
#GCP
#BigQuery
#DataEngineering
#Chicago
#RHRInternational
#Google
#GoogleCloud
#GoogleCloudPlatform -
Data Studio is back: Google kills Looker Studio name for good: Google reversed its 2022 Looker Studio rebrand on April 11, 2026, restoring the Data Studio name and expanding the platform with BigQuery agents and Colab apps. https://ppc.land/data-studio-is-back-google-kills-looker-studio-name-for-good/ #DataStudio #Google #BigQuery #LookerStudio #DataAnalytics
-
geoparquet-io: Fast #GeoParquet tool: geoparquet-io is an open-source #CLI tool and #Python library for converting, inspecting, optimizing, and partitioning #GeoParquet files, automatically applying GeoParquet performance best practices along the way. Its extract command can pull geodata from sources such as #WFS, #Esri ArcGIS Feature Services, or #BigQuery into GeoParquet.
https://spatialists.ch/posts/2026/04/06-geoparquet-io-fast-geoparquet-tool/ #GIS #GISchat #geospatial #SwissGIS -
TCO, or the Total Cost of Ownership of modern ETL approaches for MPP databases
What this article is about: I want to compare the TCO of the good old ETL tools such as Informatica, ODI, MarkitEDM and the like vs DBT + Airflow and the like. It is very easy to analyze the cost of licenses, or of compute and storage in the case of a cloud database, but very hard to analyze TCO: the cost of developing a single feature, the cost of support, of maintenance, of change. It is very tempting to count only license and compute expenses and assume all the other expenses are the same, although they are not. Out of the box, cloud MPP databases are usually cheaper in storage and compute and carry no license fee, so there is a temptation to use the same license-free approach for ETL, but there are drawbacks:
-
Claude Code + BigQuery → an analytics agent that works on your data 24/7
No copying queries. No middlemen. No switching between tools.
All of this thanks to connecting Claude directly to BigQuery via MCP.
#iToSięLiczy
#AI #BigQuery #GoogleCloud #GA4 #DataDrivenMarketing #Automatyzacja #MarketingAnalytics -
RE: https://saptodon.org/@nextlytics/116028370585178197
The job posting for Senior Data Scientist (gn) can now also be found on our website!
https://www.nextlytics.com/de/karriere/jobs/senior-data-science-consultant-w/m/d
#fedihire #fedihire_de #fedijobs #fedijobs_de #datasciencejobs #datascience #databricks #snowflake #bigquery
-
Building an intelligent question-answering system and a corporate knowledge base on StarRocks + DeepSeek
Typical scenarios based on StarRocks + DeepSeek. DeepSeek: generation of high-quality embeddings and answers; StarRocks: highly efficient vector search and storage. Together they form the foundation for accurate and scalable AI solutions.
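The division of labor described above (an embedding model for vectors, a store for similarity search) can be illustrated in plain Python. Everything below is a toy stand-in: the `embed` function mimics an embedding model, and the brute-force `VectorStore` mimics a vector database. Neither is the real DeepSeek or StarRocks API.

```python
import math

def embed(text):
    # Toy stand-in for a real embedding model (e.g. a DeepSeek API call):
    # a bag-of-characters vector, good enough to illustrate the flow.
    vec = [0.0] * 26
    for ch in text.lower():
        if 'a' <= ch <= 'z':
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """Brute-force stand-in for a vector database such as StarRocks."""
    def __init__(self):
        self.docs = []  # list of (text, vector) pairs

    def add(self, text):
        self.docs.append((text, embed(text)))

    def search(self, query, k=1):
        qv = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = VectorStore()
store.add("How to reset your corporate password")
store.add("Quarterly revenue report 2024")
print(store.search("password reset help")[0])
```

A real deployment would swap `embed` for an embedding-model call and `VectorStore` for the database's vector index; the retrieve-then-answer flow stays the same.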
-
RE: https://saptodon.org/@nextlytics/115501853415430874
Our #webinar from last week is available as an on-demand recording for anyone who missed it. How can #SAP Business Data Cloud interact with a wider ecosystem of modern data platforms like #Databricks, #Snowflake, #BigQuery, and (new this week) #Fabric? Where does this trend lead?
Spoiler: maybe truly open players have the advantage in the future interoperable data ecosystem over old-fashioned proprietary-first vendors...
#datascience #dataengineering #datawarehouse #datalakehouse #lakehouse
-
BigQuery TABLESAMPLE: Faster, Cheaper Queries
#BigQuery #DataAnalytics #CloudComputing #DataEngineering #SQL #CostOptimization
-
🚀 TopicWatchdog – Week 3: Stable Topics with BERTopic
KMeans worked, but cluster IDs kept jumping across retrains. This week I added a Python BERTopic stage with a BigQuery registry → stable topic IDs!
🟢 UMAP + HDBSCAN
🟢 Stable IDs via registry
🟢 Auto-labels with Gemini
🟢 Looker Studio dashboards
📊 3,802 topics → 2,472 mapped, top clusters: migration, economy, climate, politics.
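A guess at how the "stable IDs via registry" step might work (this is not the project's actual code): after each retrain, match the new cluster centroids against a registry of known topic centroids, and only mint a new stable ID when nothing is close enough. A minimal sketch:

```python
def assign_stable_ids(new_centroids, registry, threshold=0.9):
    """Map each retrain's (unstable) cluster centroids to stable topic IDs.

    registry: dict stable_id -> centroid (2-d points here for brevity).
    Returns dict unstable_id -> stable_id, updating the registry in place.
    """
    def similarity(a, b):
        # Inverse-distance similarity in (0, 1]; a real system would use
        # cosine similarity on embedding centroids.
        dist = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
        return 1.0 / (1.0 + dist)

    mapping = {}
    next_id = max(registry, default=-1) + 1
    for unstable_id, centroid in enumerate(new_centroids):
        best_id, best_sim = None, 0.0
        for stable_id, known in registry.items():
            sim = similarity(centroid, known)
            if sim > best_sim:
                best_id, best_sim = stable_id, sim
        if best_id is not None and best_sim >= threshold:
            mapping[unstable_id] = best_id   # re-use the stable ID
        else:
            registry[next_id] = centroid     # genuinely new topic
            mapping[unstable_id] = next_id
            next_id += 1
    return mapping

registry = {0: (0.0, 0.0), 1: (5.0, 5.0)}
# After a retrain: cluster 0 landed near old topic 1, cluster 1 is new.
mapping = assign_stable_ids([(5.0, 5.1), (9.0, 0.0)], registry)
print(mapping)
```

Persisting `registry` in a warehouse table would make the mapping survive across retrains, which is what a BigQuery-backed registry buys you.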
👉 Blog: https://dracoblue.net/dev/topicwatchdog-stable-topics-with-bertopic/
#TopicWatchdog #BERTopic #BigQuery
#Clustering
#MachineLearning
#FediScience -
How messy is #terraform with #gcp?
I'm trying to build a system where a #rust worker ingests HTTP request data via #cloudrun and passes it into #bigtable, which uses a further ingestion recipe to export the data into #bigquery.
I tried to write a complete Terraform declaration for all of this but ran into permission issues. Then I tried a setup that generates all the artefacts (service accounts, the Docker image) first and refers to those from the Terraform builds, but I hardly see the value of doing it that way.
Does anyone have an example of #cloudrun #cdc? I am new to this and I feel really slow.
-
New geospatial data in Google BigQuery: #Google is adding geospatial content to its #DWH solution #BigQuery. Additions encompass annotated Street View #imagery, Places (#POI) data, and #traffic data, among others.
https://spatialists.ch/posts/2025/04-14-new-geospatial-data-in-google-bigquery/ #GIS #GISchat #geospatial #SwissGIS -
Diving into #Vermont wildlife for the #30DayChartChallenge "circle" day! 🦌 Using #Python & #plotly to compare monthly #Moose and #BlackBear sightings 🐻 Data wrangled with #BigQuery and #SQL. Any guesses which animal is seen more consistently throughout the year? 😜 #DataViz #Wildlife #RadialChart
-
GA4 intraday exports and cookieless pings
I build a lot of reports for clients that use the GA4 BigQuery export as a source.
Now.. that works like a charm. But.. you will need to wait some time to get processed data from the `events_` tables. More recent data will appear in the streaming `intraday` tables, if you have that enabled. But.. that data is not always complete! Especially when your site has consent mode enabled, and does not set a cookie until after consent. Here's how it works:
The scenario
Someone visits the site for the first time (source: some campaign), gets confronted with the cookie banner, and then clicks accept.
We tagged the site correctly, so this is what happens:
- a `page_view` event triggers (with URL parameters) and notices analytics consent is denied (the default)
- the tracker attaches some parameters to this hit, to help processing
- a session is started
- this is the first visit
- there is an item list on the page: a `view_item_list` event is triggered
- the cookie banner pops up (event: `cookiebar_view`)
- the visitor clicks accept (event: `cookiebar_accept`) and the tracker gets sent a `granted` signal
- now the cookie can be used, and is attached to an automatic `user_engagement` event

Sounds simple. Now, let's see what is streamed into BigQuery:
The streaming data gap
Basically, the intraday tables store what happens, as it happens.
- the cookie field (`user_pseudo_id`) is filled in on hits on/after consent
- the cookie field is `NULL` for hits before consent

As it should be, right? But there's a third bullet:

- the first batch of events will not appear in the `intraday` table!
Here's what we see (most recent hit first, read from bottom to top):
- the `page_view` is missing in the streaming table
- the `collected_traffic_source` information is missing (it is always only filled in on the first batch of events)
- as a byproduct, we also do not see the session start and first visit
- the other events are all sent without a cookie
- after consent, we see the `user_pseudo_id` – finally
The next day.. Google has glued it all together
Processed data: every event has a row
The following is in the processed data: (most recent hit first, read from bottom to top)
- the `page_view` event and all other events leading up to the consent have a cookie attached to them! Google rescued that information
- the "Attached" parameters to the hit expand to two extra rows:
  - `session_start`
  - `first_visit`
- we have source information: `collected_traffic_source` is present – on the first batch, as normal

Not visible in the screenshot: `session_traffic_source_last_click` – the session information is properly filled in.

The consequences

If you decide to use `intraday` tables in your BigQuery reports: be aware that although the information is fresh (no pun intended, GA360 users), it's incomplete:
- intraday misses crucial events, namely the first batch (most often a `page_view`)
  - bye bye landing_page reports based on page_views
  - bye bye traffic source reports based on `session_traffic_source_last_click` or `collected_traffic_source`
- intraday misses cookies on some events
  - which is not too much of an issue, really
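For readers who want to poke at this: the gap described above can be modeled in a few lines of Python. This is a toy reconstruction of the post's scenario (event names taken from the post, everything else invented), not GA4's actual export logic.

```python
# Toy model of the streaming gap described above. Field names follow the
# post (user_pseudo_id, page_view, ...); the filtering rule encodes the
# post's observation, not documented GA4 behavior.

# The full hit sequence of the scenario, in order, with consent state.
hits = [
    {"event": "page_view",        "consent": False, "first_batch": True},
    {"event": "session_start",    "consent": False, "first_batch": True},
    {"event": "first_visit",      "consent": False, "first_batch": True},
    {"event": "view_item_list",   "consent": False, "first_batch": False},
    {"event": "cookiebar_view",   "consent": False, "first_batch": False},
    {"event": "cookiebar_accept", "consent": False, "first_batch": False},
    {"event": "user_engagement",  "consent": True,  "first_batch": False},
]

def intraday_rows(hits):
    """What the streaming (intraday) table shows: the first batch of
    events is dropped entirely, and pre-consent hits carry no cookie."""
    return [
        {"event": h["event"],
         "user_pseudo_id": "cookie-123" if h["consent"] else None}
        for h in hits if not h["first_batch"]
    ]

def processed_rows(hits):
    """What the next-day processed table shows: every hit has a row,
    and the cookie is back-filled onto the pre-consent hits."""
    return [{"event": h["event"], "user_pseudo_id": "cookie-123"}
            for h in hits]

streamed = intraday_rows(hits)
assert all(r["event"] != "page_view" for r in streamed)  # landing page gone
assert len(processed_rows(hits)) == len(hits)            # next day: complete
```

The two assertions are exactly the two consequences listed above: intraday loses the first batch, while the processed table recovers every row.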
Your experiences?
Do you use intraday tables in your models? Have you found clever workarounds to get the correct data in?
Let me know! Drop a comment here, or send me a bluesky message!
Still here?
Check out GA4Dataform – a product I've helped build that turns the GA4 BigQuery exports into usable tables!
Related posts:
- Google Analytics 4 truncates page location
- Making sense of Event Parameters in GA4
- Make your GA4 life easier: Some powertips!
- Smart incremental GA4 tables in Dataform
-
Welp, I was laid off today. Anyone need an SQL monkey?
[UPDATED to include location and link to résumé]
Physical location: #madison #wisconsin #usa
Link to résumé: https://drive.google.com/file/d/1RkZ_buZxopuJOAgOh0Mp0ck0onF0Scex/view?usp=sharing
#SQL #bigquery #lookerstudio #gcp #downtowork #laidoff #opentowork #remote #getfedihired
-
Today is DBA Appreciation Day!
If you have a DBA in your company who relentlessly takes care that your databases are humming along and delivering query results, today is the day to say Thank You!
#PostgreSQL #MySQL #MariaDB #Oracle #Greenplum #SQLite #SQLServer #MongoDB #Redis #Snowflake #DB2 #Elasticsearch #Teradata #InfluxDB #Firebird #Informix #Couchbase #CouchDB #Vertica #DuckDB #CockroachDB #SAPHana #Splunk #DynamoDB #BigQuery #Hive #Neo4j ...
-
#LookerStudio - Blog post
Usually, BigQuery helps a great deal in creating high-level Looker Studio reports.
In this article it is the opposite: LS comes to assist BigQuery:
How to quickly explore the schemas of #BigQuery tables with Looker Studio -
Intro to BigQuery and HttpArchive with Rick Viscomi
-
Tracking the #Fake #GitHub #Star #BlackMarket with #Dagster, #dbt and #BigQuery | #DagsterBlog
"We knew there were dubious services out there offering #StarsCorCash, so we set up a dummy repo (frasermarlow/tap-bls) and purchased a bunch of stars. From these, we devised a profile for fake accounts and ran a number of #repos through a test using the GitHub REST API (via pygithub) and the GitHub Archive database."
-
It was about time... Finally!
Item-scoped custom dimensions in #GA4 are being rolled out in explorations...
No signs of them in #LookerStudio, #DataAPI or #BigQuery export yet...
-
RT @isb_cgc: Proteomic Data Commons release V2.15 case, file, and quant data are now available in #BigQuery. Check out the new tables using our BigQuery Table Search tool at https://isb-cgc.appspot.com/bq_meta_search/ and filter Source by PDC.
#proteome #NCIProteomics #CancerResearch #proteomic https://twitter.com/isb_cgc/status/1622624155496480771/photo/1 -
Google Cloud opens its Seoul region - Google Cloud today announced that its new Seoul region, its first in Korea, is now open for busines... more: http://feedproxy.google.com/~r/Techcrunch/~3/bs2AWMnO6wM/ #businessintelligence #cloudinfrastructure #googlecloudplatform #googlecomputeengine #cloudcomputing #cloudservices #webservices #enterprise #southkorea #computing #indonesia #netmarble #bigquery #bigtable #lasvegas #jakarta #google #taiwan #cloud #japan #seoul #asia #tc
-
Google makes moving data to its cloud easier - Google Cloud today announced Transfer Service, a new service for enterprises that want to move thei... more: http://feedproxy.google.com/~r/Techcrunch/~3/ciWwX7MMKtE/ #artificialintelligence #cloudinfrastructure #softwareasaservice #machinelearning #cloudanalytics #cloudcomputing #webservices #computing #bigquery #petabyte #google #cloud #fedex #tc