home.social

#datapiplines — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #datapiplines, aggregated by home.social.

  1. Just completed a project building an end-to-end data pipeline for NYC taxi data using dlt 🚕📊! What a ride! 😅 The REST API extraction was particularly fun (in a challenging way) but dlt's modular design made it manageable. Here’s what I learned:

    ✅ Full lifecycle: from REST API extraction to DuckDB loading, all in one framework
    ✅ Reproducibility: Tracked every transformation with dlt's lineage features
    ✅ Modular design: Defined reusable components for extracting and normalizing data
    ✅ Handles complexity: Seamlessly handled pagination from the API
    Big takeaway: dlt isn't just tooling; it's a framework for thinking about data pipelines that emphasizes transparency and reproducibility, both of which are essential for any modern data stack.
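
The pagination point above can be illustrated with a minimal, standard-library-only sketch. It is not the post author's code and not dlt's internals. `paginate`, `fake_fetch_page`, and the `trip_id`/`fare` fields are hypothetical names chosen for illustration, and the "stop on the first empty page" rule is an assumption about how the API signals the end of the data.

```python
from typing import Callable, Iterator

Page = list[dict]


def paginate(fetch_page: Callable[[int], Page], start_page: int = 1) -> Iterator[dict]:
    """Yield records page by page until the API returns an empty page."""
    page = start_page
    while True:
        records = fetch_page(page)
        if not records:
            # Assumption: an empty page marks the end of the data set
            break
        yield from records
        page += 1


def fake_fetch_page(page: int) -> Page:
    """Hypothetical stand-in for an HTTP call to a paginated taxi-trip API."""
    data = {
        1: [{"trip_id": 1, "fare": 12.5}, {"trip_id": 2, "fare": 7.0}],
        2: [{"trip_id": 3, "fare": 20.0}, {"trip_id": 4, "fare": 5.5}],
        3: [{"trip_id": 5, "fare": 9.25}],
    }
    return data.get(page, [])


# Collect every record across all pages into one flat list
rows = list(paginate(fake_fetch_page))
print(len(rows), "rows extracted")
```

A generator of dictionaries like this is the kind of input that could then be handed to a loader; in dlt that would presumably mean wrapping it as a resource and running it through a pipeline with DuckDB as the destination, though the exact setup in the original project is not shown in the post.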