home.social

#ai-engineering — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #ai-engineering, aggregated by home.social.

  1. Agents do not need RAG or vector databases for most real world work. They need structure and semantics.

    Agent Knowledge Graphs turn mixed repositories of code, docs, configs, and PDFs into a connected model that agents can reason over. This often replaces entire retrieval pipelines.
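
As a rough illustration of the idea in this post (not the linked article's implementation and not the API of any particular tool), a repository can be modeled as typed nodes and labeled edges that an agent walks instead of running a similarity search. Every file name, edge label, and the `RepoGraph` class below are hypothetical.

```python
from collections import defaultdict


class RepoGraph:
    """Tiny in-memory graph: typed nodes plus labeled, directed edges."""

    def __init__(self):
        self.kinds = {}                 # node id -> kind ("code", "doc", ...)
        self.edges = defaultdict(list)  # node id -> [(label, target id)]

    def add_node(self, node_id, kind):
        self.kinds[node_id] = kind

    def add_edge(self, source, label, target):
        self.edges[source].append((label, target))

    def related(self, start, max_hops=2):
        """Breadth-first walk; returns {node id: hops from start}."""
        seen = {start: 0}
        frontier = [start]
        for hop in range(1, max_hops + 1):
            next_frontier = []
            for node in frontier:
                for _label, target in self.edges[node]:
                    if target not in seen:
                        seen[target] = hop
                        next_frontier.append(target)
            frontier = next_frontier
        del seen[start]  # the start node is not its own context
        return seen


# Hypothetical mixed repository: code, docs, config, and a PDF.
graph = RepoGraph()
graph.add_node("billing/invoice.py", "code")
graph.add_node("docs/invoicing.md", "doc")
graph.add_node("config/billing.yaml", "config")
graph.add_node("specs/tax-rules.pdf", "pdf")
graph.add_edge("billing/invoice.py", "documented_by", "docs/invoicing.md")
graph.add_edge("billing/invoice.py", "configured_by", "config/billing.yaml")
graph.add_edge("docs/invoicing.md", "cites", "specs/tax-rules.pdf")

context = graph.related("billing/invoice.py")
print(context)
```

The agent asks for everything within two hops of the file it is editing, so the PDF is reached through the doc that cites it rather than through text similarity.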

    antaoalmada.dev/posts/Code-Age

    #AIEngineering #KnowledgeGraphs #CodingAgents #AgentWorkflows #SoftwareArchitecture #Graphify

  3. 🧠 From Simple Indexing to Semantic Understanding: Why I Layered Both Approaches

    Finishing LLM Zoomcamp Module 2 felt like leveling up my RAG system. I was already doing agentic RAG in Module 1, but vector search opened a whole new layer of retrieval flexibility. Here's why the technical decisions matter:

    - **Gained exposure to various vector databases including pgvector, sqlitesearch, and minsearch** - Each tool carries distinct tradeoffs: pgvector for PostgreSQL integration, SQLite for lightweight local workloads, minsearch for in-memory prototyping. Knowing which fits where matters more than the technology itself.
    - **Embedding actual lesson content with ONNX library** - Lightweight CPU inference means this stacks directly on existing infrastructure without needing GPU dependencies or scaling headaches
    - **Chunking 72 lesson pages into ~300 chunks with 50% overlap** - Sliding window preserves context across topic boundaries while reducing prompt token usage compared to whole-page indexing
    - **Building the same query against both vector and keyword indexes to compare scores** - Quantifies semantic vs lexical retrieval so you can decide when each method adds value
    - **Using hybrid search (RRF fusion) to blend vector and keyword search results intelligently** - Captures both conceptual meaning and precise terminology, which matters when queries span multiple technical domains
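
The sliding-window chunking mentioned in the list above can be sketched as follows. The post does not say whether chunks are measured in characters, words, or tokens; splitting on words is an assumption here, and `sliding_window_chunks` is a hypothetical helper, not code from the linked project.

```python
def sliding_window_chunks(words, size, overlap=0.5):
    """Split `words` into windows of `size` items.

    Consecutive windows share roughly `overlap` (a fraction in [0, 1))
    of their items, so text near a boundary appears in two chunks.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    step = max(1, int(size * (1 - overlap)))  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Assumed toy input standing in for one lesson page.
page = "one two three four five six seven eight nine ten".split()
chunks = sliding_window_chunks(page, size=4, overlap=0.5)
for chunk in chunks:
    print(" ".join(chunk))
```

With a window of four words and 50% overlap, each chunk repeats the last two words of the previous one, which is what keeps a sentence that straddles a boundary retrievable from either side.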

    One thing that stuck: even queries like "How do I store vectors in PostgreSQL?" returned meaningful results because I was comparing semantic similarity, not just matching words. That is the practical difference between lexical and semantic search. Hybrid search isn't just a nice-to-have; it's practical engineering when you care about both retrieval precision and coverage.
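
For readers unfamiliar with RRF (reciprocal rank fusion), here is a minimal sketch of how two ranked result lists can be blended. Each document scores 1 / (k + rank) in every list where it appears, and the scores are summed. The constant k = 60 is a commonly used default; the document IDs are made up, and this is not the code from the linked project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Only ranks are used, so the raw vector and keyword scores do not
    need to be on the same scale.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for the same query from the two indexes.
vector_hits = ["pgvector-setup", "embeddings-intro", "index-types"]
keyword_hits = ["postgres-install", "pgvector-setup", "index-types"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that both indexes return rise to the top, while a document found by only one index still stays in the fused list.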

    Project is live if you're curious to see how the pieces fit together: github.com/ammartin8/llm_zoomc

    Huge thanks again to Alexey Grigorev for putting this together. Open-source learning at this level matters more than most realize. Anyone else finishing up Module 2 or working with hybrid retrieval themselves?

    #ai #localai #llm #mastodon #fediverse #buildinpublic #linux #github #aiengineering #DataEngineering #agentic #rag #vector #openai

  5. I have been following the field of harness engineering for some time now. This article distills the essence of harness engineering from the testimonials and shared experiences of practitioners.

    dev.to/gitaroktato/harness-eng

    #aiengineering #genai #llm #harnessengineering

  7. Stop measuring AI performance without measuring resilience. High benchmark scores often mask fragile backend logic that fails silently under pressure.

    We break down the invisible machinery: models rerouted from broken providers, responses caught before reaching users, and metrics refusing to penalize failure unfairly. Reliability isn't hoped for; it's engineered. ⚙️
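
The rerouting and response-checking described in this post can be sketched roughly as below. The linked analysis may work quite differently; the provider functions, `ProviderError`, and the validation rule are all hypothetical stand-ins.

```python
class ProviderError(Exception):
    """Raised when a (hypothetical) model provider cannot serve a request."""


def call_with_fallback(prompt, providers, is_valid):
    """Try providers in order and return the first response that passes validation.

    Returns (provider_name, response, failures). `failures` records why each
    earlier provider was skipped, so failures can be attributed to a provider
    instead of being lost.
    """
    failures = []
    for name, call in providers:
        try:
            response = call(prompt)
        except ProviderError as exc:
            failures.append((name, f"error: {exc}"))
            continue
        if not is_valid(response):
            # Caught before reaching the user: reroute instead of returning it.
            failures.append((name, "rejected by validation"))
            continue
        return name, response, failures
    raise ProviderError(f"all providers failed: {failures}")


# Hypothetical providers used only to exercise the routing logic.
def broken_provider(prompt):
    raise ProviderError("503 from upstream")


def empty_provider(prompt):
    return ""


def healthy_provider(prompt):
    return f"echo: {prompt}"


providers = [
    ("primary", broken_provider),
    ("secondary", empty_provider),
    ("tertiary", healthy_provider),
]
used, answer, failures = call_with_fallback(
    "ping", providers, is_valid=lambda text: bool(text.strip())
)
print(used, answer, failures)
```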

    Read the full analysis: post.kapualabs.com/yckr6746

    #AIEngineering #ModelReliability #TechInfrastructure #LLM

  8. Your Agent Failed in Prod. Good Luck Reproducing It. - Tisha Chawla & Susheem Koul, Microsoft

    video.ut0pia.org/w/wNUPQCXVMDq

  10. The Unix philosophy for AI agents: each tool does one thing well, then chain them.

    Phillip Merrick made this case on TalkDev's Enterprise Unlocked: AI is probabilistic now, not deterministic. 5 agents at 95% each >> 1 monolithic agent doing 5 steps. Probabilities compound - in your favor when you go modular, against you when you don't. That's the design shift.
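
The post does not show the arithmetic behind "probabilities compound". Assuming each step succeeds or fails independently, five chained steps at 95% each succeed end to end only about 77% of the time, so compounding works against any five-step chain by default. One reading of the modular argument, which is an assumption on my part and not stated in the post, is that small single-purpose steps can be checked and retried individually. The sketch below compares the two cases.

```python
def chain_success(step_success, steps):
    """Probability that `steps` independent steps all succeed."""
    return step_success ** steps


def with_retries(step_success, attempts):
    """Probability a step succeeds within `attempts` independent tries,
    assuming each failure is detected so the step can be retried."""
    return 1 - (1 - step_success) ** attempts


plain = chain_success(0.95, 5)
retried = chain_success(with_retries(0.95, attempts=2), 5)

print(f"five steps, no checks:        {plain:.3f}")    # 0.774
print(f"five steps, one retry each:   {retried:.3f}")  # 0.988
```

The gain comes entirely from being able to detect a failed step and rerun it; splitting work into five agents without per-step checks leaves the 0.774 figure unchanged.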

    Shaun Thomas went deeper on this for the pgEdge blog. 📖

    hubs.la/Q04mJmPW0

    #AgenticAI #AIEngineering #Programming #OpenSource

  12. AI systems fail silently and at massive scale, yet the field building them has no licensure, no inspection, and no shared code of practice. Here is what I think hackernoon.com/ai-engineering- #aiengineering

  14. Module 1 of LLM Zoomcamp is done! 🎉

    I turned my original RAG pipeline into an Agent!

    I spent these last few days diving deep into Agentic RAG. It's been fascinating to build it step by step. Every time I ask the LLM to learn about something new, I see how it naturally figures out which tools to use, when to search, and how many times to gather info before giving me a solid answer.

    What exactly is Agentic RAG?
    It’s like giving the AI a brain that can actually act. Instead of just retrieving from a fixed knowledge base, the model decides whether it needs external tools first, gathers what it needs, and then answers. It’s pretty interesting to understand how it actually works behind the scenes!
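
The decide, gather, answer loop described above can be sketched without any real LLM. Here `stub_model` stands in for the model and `search_knowledge_base` for a tool; both are hypothetical, and a real agent would replace the stub with an API call that returns tool-call requests.

```python
def search_knowledge_base(query):
    """Hypothetical tool: substring lookup in a tiny in-memory knowledge base."""
    notes = {
        "pgvector": "pgvector is a PostgreSQL extension for vector similarity search.",
    }
    return [text for key, text in notes.items() if key in query.lower()]


def stub_model(question, gathered):
    """Stand-in for an LLM: asks for a search until it has context, then answers."""
    if not gathered:
        return {"action": "search", "query": question}
    return {"action": "answer", "text": f"Based on {len(gathered)} note(s): {gathered[0]}"}


def run_agent(question, model, max_steps=3):
    """Loop until the model answers or the step budget runs out."""
    gathered = []  # context collected from tool calls so far
    trace = []     # which actions the model chose, in order
    for _ in range(max_steps):
        decision = model(question, gathered)
        trace.append(decision["action"])
        if decision["action"] == "answer":
            return decision["text"], trace
        gathered.extend(search_knowledge_base(decision["query"]))
    return "No answer within the step budget.", trace


answer, trace = run_agent("What is pgvector?", stub_model)
print(trace)
print(answer)
```

The `max_steps` budget is the guard against the token burn the post warns about: the loop stops even if the model never decides it has enough context.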

    Why does this matter?
    A few days ago I asked for a detailed guide on using the OpenAI Python library with the Chat Completions API. The local LLM called web search multiple times until it had enough context and built something useful from those pieces. Now that I am building these systems, I can finally understand why it does what it does.

    💡 Insights from this week:
    - Building a static pipeline is a great start, but to make something truly flexible, you need function or tool calling. It lets the LLM look at the question first and decide whether it needs to search a knowledge base before answering.
    - I used to think "chunking" was just about breaking up text. Turns out it can reduce token input by 3x! 🤯
    - You have to learn how to walk before you run. Starting small, understanding each component manually, and seeing how the pieces fit together… it felt slow at first but worth it. Now I’m able to accelerate with agent frameworks like toyaikit, LangChain, PydanticAI, or OpenAI Agents.
    - There is definitely a learning curve with the API syntax. Between the new Responses API and Chat Completions, tool responses are structured differently and you have to adjust your code accordingly. Frustrating at times, but also a great way to learn!
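
To make the last point concrete, here is how a tool result is sent back in the two OpenAI formats as I understand them at the time of writing; the field names should be checked against the current API reference. The helper names, the `search` tool, and the call ID are hypothetical, and the sketch only builds plain dictionaries without calling any API.

```python
import json


def chat_completions_tool_result(tool_call, result):
    """Chat Completions: the result goes back as a message with role 'tool'."""
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }


def responses_api_tool_result(function_call, result):
    """Responses API: the result goes back as a 'function_call_output' input item."""
    return {
        "type": "function_call_output",
        "call_id": function_call["call_id"],
        "output": json.dumps(result),
    }


# The same (made-up) tool request as each API would present it.
chat_tool_call = {
    "id": "call_123",
    "type": "function",
    "function": {"name": "search", "arguments": json.dumps({"query": "pgvector"})},
}
responses_function_call = {
    "type": "function_call",
    "call_id": "call_123",
    "name": "search",
    "arguments": json.dumps({"query": "pgvector"}),
}

result = {"hits": ["doc-1"]}
chat_message = chat_completions_tool_result(chat_tool_call, result)
responses_item = responses_api_tool_result(responses_function_call, result)
print(chat_message)
print(responses_item)
```

The payload is the same in both cases; what changes is the envelope (`role`/`tool_call_id`/`content` versus `type`/`call_id`/`output`) and where the tool name and arguments sit in the incoming request.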

    Quick takeaway:
    It is best to start simple, then add complexity only when needed. Sometimes an agent can burn tokens unnecessarily, so only add that layer if your problem really needs it!

    Had a lot of fun with this module and I’m already curious about what’s next. If you’re interested in learning along, this is the full free course from Alexey at DataTalks.Club: github.com/DataTalksClub/llm-z

    Anyone else tinkering with LLM agents lately? What kind of projects are you exploring or trying out? Would love to hear where your journey is heading!

    #ai #localai #llm #mastodon #fediverse #buildinpublic #linux #github #aiengineering #DataEngineering

  16. Today, we are introducing Upsun Dispatch. 🚀

    AI made engineers faster. It didn't make teams faster. The constraint was never typing; it was everything around it. The SDLC is being rewritten, and we'd like to rewrite it with you.

    Upsun Dispatch is our platform for the agentic software development lifecycle, launching in September 2026. Workflow is the primitive, not the agent. 😎

    👉 Read the full story: upsun.com/blog/introducing-ups

    #SDLC #AIEngineering #EngineeringLeadership #DeveloperTools

  18. Most developers 🧑‍💻 use Claude Code wrong 🤔.
    Instead of asking it to write code immediately:

    1. Ask it to create a plan
    2. Review assumptions
    3. Refine the plan
    4. Then implement

    The quality difference is massive.

    This simple workflow has saved me countless hours and improved output consistency across projects.

    I cover more practical Claude Code workflows, MCP, etc.

    Link in Bio 🔗
    moonpiecreates.gumroad.com/l/h

    #ClaudeCode #AIEngineering #BuildInPublic #DeveloperTools #ArtificialIntelligence
