home.social

273 results for “Giskard”

  1. Our latest article covers:
    - How the TAP technique uses tree search to find successful jailbreaks
    - An example showing how corporate agents can be attacked
    - How we use the TAP probe to test agents' robustness

    Link to article: giskard.ai/knowledge/tree-of-a

  2. 🤔 If your organization handles sensitive data, from healthcare records to financial information, then you need proactive security testing, not reactive damage control. 🚨

    This quick explainer by our CTO breaks down:
    - What AI red teaming actually means
    - How it exposes system vulnerabilities before bad actors do
    - Why controlled testing saves you from real-world disasters

    Request a trial: giskard.ai/contact

  3. Watch the replay of our last interview at BFM Business 🎙️🍿

    Our CEO Alex Combessie joined Frédéric Simottel at the AWS Summit Paris to discuss the challenges of detecting vulnerabilities in AI agents.

    During the interview, Alex highlighted how continuous red teaming helps organizations maintain trust in their AI systems by identifying new risks and providing actionable alerts when potential issues arise.

    Watch the replay here 👉 bfmtv.com/economie/replay-emis

  5. Phare is developed by Giskard with Google DeepMind, the European Commission and Bpifrance as research & funding partners.

    👉 Full analysis: giskard.ai/knowledge/good-answ
    Benchmark results: phare.giskard.ai

    #AISecurity #LLMBenchmark #LLMs

  9. David Berenstein has joined the Giskard team as DevRel ⭐️🐢

    David brings valuable experience from his previous roles at Argilla and Hugging Face, where he helped developers discover the joys of working with (synthetic) data. He loves cooking things up with data but also commits a lot of his time to cooking in real life 👨‍🍳 His expertise will be key as we build our LLM Evaluation Hub.

    Welcome to the team, David! 🚀

  10. Can we trust DeepSeek R1? A Giskard evaluation 🐳🐢

    With all the hype around DeepSeek R1, our LLM safety research team decided to conduct an evaluation to check if R1 is as good as it claims. While it impresses in some areas, we found critical limitations that raise concerns for real-world applications. Here are some unexpected examples 👇

  11. 🐝 OWASP has just released their AI Security Solution Landscape Guide as part of their expanded LLM security initiatives!

    You'll find Giskard listed in the Test & Evaluation category, offering LLM scanning capabilities in:
    - Vulnerability scanning
    - Adversarial testing
    - Bias and fairness testing
    - LLM benchmarking

    Check out the full guide here 🔗 gisk.ar/4hNbR0r

  12. 🎉 Recognized in Gartner's latest research "Emerging Tech: Techscape for Early-Stage Startups in GenAI TRiSM"!

    The report examines key early-stage startups addressing the critical challenges of Generative AI security, trust and risk management. Giskard was highlighted for our AI testing platform that helps enterprises manage and control risks in AI implementations.

    Download the document: lnkd.in/ehwS73Ne

  13. 🤝 Join our upcoming roundtable with NVIDIA on AI Risk Management!

    In this discussion, our CEO Alex Combessie will explore the practical implications of AI Risk Management in Banking. By combining Giskard's AI testing capabilities with NVIDIA NeMo Guardrails, we'll showcase how organizations can shield against hallucinations, prompt injections, and other emerging threats while ensuring regulatory compliance.
    [1/2]

  14. @Giskard defends the vision of a responsible AI that serves the business performance of companies and respects the rights of citizens. Browse open positions at the company on opensourcejobhub.com/company/7

  15. How to explain the predictions of your model? 🤔

    📊 In this tutorial we'll explore how to use SHAP values to explain and improve models, delving deeper into specific use cases.

    📚 Full tutorial: giskard.ai/knowledge/opening-t

  16. 🎥 Just released: 3rd tutorial on with Giskard!

    Dive into the to explore:
    📝 The collection of items
    🔪 functions
    💡 functions
    and that your models are both robust and efficient. 💪

    Watch now ▶️ youtube.com/watch?v=aL3064qJo0w

  17. Last week, the Giskard team attended @defcon 31 in Las Vegas 🏴‍☠️ 🇺🇸

    🥷 This year saw a focus at the AI Village, which organized the largest-ever Generative Red Team challenge (GRT). The objective was to identify vulnerabilities in Large Language Models (LLMs). [1/5]

  18. Greetings from ! 👋

    🐢 The Giskard team is now at and we'll be happy to meet you. Join us at the for the .

    📩 DM us if you want to meet and discuss about , safety, and .

  20. 🐢 At Giskard, we're creating a robust #ML framework for #testing ML #models effectively. We help identify #biases and #errors in AI models, from #tabular to #LLMs. Participating in DEFCON allows us to collaborate with leading experts and share our commitment to #AISafety [3/4]

  23. 🔥 In this tutorial, we'll show you how to install Giskard. In just 4 lines of code, you will discover vulnerabilities, such as:
    biases.
    leakage.
    ✅ Spurious correlations.
    issues.
    issues.

    [2/4]

  24. Giskard 1.4 is out! What's new in this version? ⭐

    🔪 With Giskard’s new Slice feature, we introduce the possibility to identify business areas in which your models underperform. This will make it easier to debug performance or identify spurious correlations. We have also added an export/import feature to share your projects, as well as other minor improvements.

    giskard.ai/knowledge/new-versi

  25. ✅ Pitched Giskard (Security for #LLM agents) to 2 French ministers... A bit stressful but check 😄

    I had a great time at the #AISummit Business Day in Paris, meeting an impressive variety (> 4000 people in #stationf!) of politicians, entrepreneurs, researchers and enterprise AI leaders.

    The AI ecosystem is vibrant, and France is playing the locomotive role for the EU to catch up with the US and China!

    It's just the beginning, we have much to prove and deliver.

  26. [Translation] LLM evaluation: complex evaluators and evaluation frameworks

    This article describes in detail the complex statistical and domain-specific evaluators that can be used to assess the performance of large language models. It also covers the most widely used LLM evaluation frameworks, which will help you get started evaluating model performance.

    habr.com/ru/articles/855644/

    #llm #BLEU #ROUGE #METEOR #BERTScore #MoverScore #DeepEval #Giskard #promptfoo #LangFuse

  27. Featured Jobs @fosdem: Defending the vision of responsible AI, @Giskard has an opening for a senior data scientist to detect hidden vulnerabilities in ML models. Learn more on opensourcejobhub.com/job/12809

  28. It's TIME!!! Pintopia 2025 is now OFFICIALLY live and so is the Positronic Visions Pin Collection! Based on the tellings of Isaac Asimov, specifically about the two lovable bots Giskard and Daneel. 🤖 Designs are a unique mix of hard and soft enamels, with some special effects mixed in. These limited edition pins will only have 100 of each design available, so make sure to secure yours by backing today! backerkit.com/c/projects/aimee

    #Pintopia2025 #Pins #EnamelPins
