home.social

#rags — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #rags, aggregated by home.social.

  1. "Context engineering goes beyond earlier approaches to refining agent behavior in software development, such as prompt engineering or retrieval-augmented generation (RAG). The latter primarily helps AI retrieve one-off documents when generating a response.

    At a technical level, context engineering boils down to which information and tools you expose to the large language model (LLM) at the heart of an agent. This helps the LLM enrich its responses and programmatically decide its next course of action.

    The easiest way to enact context engineering is by using system prompts. These are found in most AI tools and accept instructions that help define an agent’s role, goals, and constraints. System prompts can also include few-shot examples that demonstrate target input and output behaviors.

    According to experts, establishing context for AI agents involves a mix of structured and unstructured data types. Core areas include:

    - System behaviors: code and documentation.
    - System architecture: database schemas and deployment configurations.
    - Code events: commits, pull requests, and review threads.
    - Error information: tickets, failure logs, build output, and feedback from linters or compilers.
    - Rationale: chat histories and design documentation.
    - Business rules: compliance policies and operating procedures.
    - Team behaviors: common workflows and execution patterns.

    “This data is used to inform reasoning, guide execution, align with goals, and enable adaptive learning,” said Babak Hodjat, chief AI officer at Cognizant, an IT consulting company that recently announced plans to deploy over 1,000 context engineers within the next year."

    leaddev.com/ai/what-is-context

    #AI #GenerativeAI #LLMs #RAGs #ContextEngineering #PromptEngineering
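The system-prompt pattern the excerpt describes (role, goals, constraints, plus few-shot examples) can be sketched as a small helper that assembles a chat-message list. This is a minimal illustration, not code from the article; the role text, example pairs, and `build_messages` helper are all invented here, and the message format assumed is the generic `{"role": ..., "content": ...}` shape common to most LLM chat APIs.

```python
# Minimal sketch of context engineering via a system prompt:
# the system message defines role, goals, and constraints, and
# few-shot pairs demonstrate target input/output behavior.
# All names and example text below are illustrative assumptions.

def build_messages(system_role, goals, constraints, few_shot, user_input):
    """Assemble: system prompt, then few-shot turns, then the real user turn."""
    system_prompt = "\n".join([
        f"Role: {system_role}",
        "Goals: " + "; ".join(goals),
        "Constraints: " + "; ".join(constraints),
    ])
    messages = [{"role": "system", "content": system_prompt}]
    for example_in, example_out in few_shot:
        # Each pair demonstrates the target behavior to the model.
        messages.append({"role": "user", "content": example_in})
        messages.append({"role": "assistant", "content": example_out})
    messages.append({"role": "user", "content": user_input})
    return messages

msgs = build_messages(
    system_role="Code-review assistant",
    goals=["explain the defect", "suggest a minimal fix"],
    constraints=["do not rewrite unrelated code"],
    few_shot=[("def f(x): return x +",
               "Incomplete expression: the `+` needs a right-hand operand.")],
    user_input="def g(y): retrun y * 2",
)
print(len(msgs))  # system + one few-shot pair + user turn = 4 messages
```

The same message list would then be passed to whatever LLM client is in use; the point is only that the context an agent sees is constructed deliberately rather than ad hoc.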

  2. "Without clear signposting, an agent might miss your API entirely. If it does discover it, large language models (LLMs) may stumble with undocumented behaviors, hallucinate methods or flood your servers with random calls.

    So, it’s important to get it right. Thankfully, strategies are emerging to position APIs for AI agents, from new standards to underground tricks. And, it’s more than just “get an MCP server” (though that’s a crucial step).

    The jury’s out on what strategy will be most effective. So I’ve structured this guide to start with broadly agreed-upon best practices, then explore ones still taking shape. Most tips apply equally to public, partner and private APIs."

    thenewstack.io/how-to-prepare-

    #AI #GenerativeAI #AIAgents #AgenticAI #LLMs #API #OpenAPI #RAGs #MCP
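One concrete form of the "clear signposting" the excerpt calls for is rich, machine-readable API metadata of the kind OpenAPI specifies: stable `operationId`s, summaries, and descriptions that spell out behavior and limits. The fragment and lint helper below are a hedged sketch; the endpoint, rate limit, and `undocumented_operations` function are invented for illustration and are not from the article.

```python
# Sketch: an OpenAPI-style fragment with the metadata agents rely on,
# plus a tiny lint pass that flags operations missing that signposting.
# Endpoint names and values are illustrative assumptions.
openapi_fragment = {
    "paths": {
        "/v1/orders/{id}": {
            "get": {
                "operationId": "getOrderById",  # stable name an agent can cite
                "summary": "Fetch a single order",
                "description": (
                    "Returns the order, or 404 if the id is unknown. "
                    "Rate limited to 10 requests/second; documenting limits "
                    "discourages agents from flooding the server."
                ),
                "parameters": [{
                    "name": "id", "in": "path", "required": True,
                    "schema": {"type": "string"},
                }],
            }
        }
    }
}

def undocumented_operations(spec):
    """Return (VERB, path, missing_field) tuples for under-documented operations."""
    missing = []
    for path, ops in spec["paths"].items():
        for verb, op in ops.items():
            for field in ("operationId", "summary", "description"):
                if field not in op:
                    missing.append((verb.upper(), path, field))
    return missing

print(undocumented_operations(openapi_fragment))  # [] -> fully signposted
```

A check like this could run in CI so that every operation an agent might discover carries enough documentation to be called correctly rather than hallucinated around.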

  3. #AI #GenerativeAI #Journalism #News #LLMs #RAGs #Science #ScienceJournalism #Media: "Surprisingly, GPT-4 with the abstracts performs a little bit better than GPT-4 with RAG over the article text, both in terms of accuracy (96.6% vs. 93.5%) and win percentage (29.2% vs. 27.8% — the rest were ties). This suggests that more context from the scientific article did not necessarily lead to higher accuracy or better understandability (i.e. our first assumption about using RAG here as described above was not met). A deeper investigation of the retrieved snippets and their actual relevance to the jargon term may help understand if this is an issue with the quality of the context, or if there may be other causes such as the similarity threshold we used. To apply RAG successfully, it’s essential to explore and test different parameters. Without this kind of careful experimentation, RAG on its own might not provide the desired results.

    It may also be the case that the large size of GPT-4’s pre-training dataset enables it to draw from other sources to generate definitions. This can be as much a concern as a benefit though — it can make it harder to override irrelevant information from the pre-training data, or eschew its limitations based on cutoff dates for model training.

    We also found that the effectiveness of these approaches varied based on the reader’s expertise. For instance, the less experienced annotator found similar value in both methods (higher tie percentage), while the more expert reader noticed more differences. Further evaluation with a larger set of annotators may help to replicate and understand these differences."

    generative-ai-newsroom.com/mak
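The excerpt's point about sweeping parameters such as the similarity threshold can be illustrated with a toy retrieval step. Everything here is an assumption for demonstration: the two-dimensional "embeddings", the snippet texts, and the `retrieve` helper are made up, and real systems would use a proper embedding model and vector index.

```python
# Toy sketch of threshold-sensitive retrieval: the same query returns
# different context depending on the similarity cutoff, which is why
# the article recommends exploring and testing such parameters.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, snippets, threshold):
    """Keep only snippets whose embedding clears the similarity threshold."""
    return [text for text, vec in snippets
            if cosine(query_vec, vec) >= threshold]

# Invented snippets: one directly relevant, one tangential.
snippets = [
    ("defines the jargon term directly", [0.9, 0.1]),
    ("tangential background section",    [0.4, 0.6]),
]
query = [1.0, 0.0]

# A loose threshold admits both snippets; a strict one keeps only the
# direct match. Sweeping this value (and inspecting what each setting
# retrieves) is the kind of experimentation the excerpt describes.
print(len(retrieve(query, snippets, threshold=0.5)))  # 2
print(len(retrieve(query, snippets, threshold=0.9)))  # 1
```

Inspecting which snippets each threshold admits, as done here, mirrors the "deeper investigation of the retrieved snippets" the authors suggest for diagnosing whether poor results stem from context quality or from the threshold itself.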