home.social

#llmsecurity — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #llmsecurity, aggregated by home.social.

  1. VIKI SNIFFER analyzed 72,953 CVEs in the latest OSINT cycle.

    Key findings:

    47,064 CVEs still have no CVSS
    64 MITRE ATT&CK techniques identified
    Strong growth in:
    T1071 — Application Layer Protocol
    T1055 — Process Injection
    T1003.005 — Cached Credentials
    T1020 — Automated Exfiltration

    jaroslawkuchta.substack.com/p/

    #CyberSecurity #ThreatIntelligence #SOC #BlueTeam #MITREATTACK #ExposureManagement #CTEM #ThreatHunting #OSINT #CVE #KEV #InfoSec #IdentitySecurity #LLMSecurity #OpenAPI #MCP #DetectionEngineering

  2. VIKI SNIFFER analyzed 72,953 CVEs in the latest OSINT cycle.

    Key findings:

    47,064 CVEs still have no CVSS
    64 MITRE ATT&CK techniques identified
    Strong growth in:
    T1071 — Application Layer Protocol
    T1055 — Process Injection
    T1003.005 — Cached Credentials
    T1020 — Automated Exfiltration

    jaroslawkuchta.substack.com/p/

    #CyberSecurity #ThreatIntelligence #SOC #BlueTeam #MITREATTACK #ExposureManagement #CTEM #ThreatHunting #OSINT #CVE #KEV #InfoSec #IdentitySecurity #LLMSecurity #OpenAPI #MCP #DetectionEngineering

  3. ----------------

    🎯 AI
    ===================

    AI red teaming applies adversarial methodology to large language models, exposing vulnerabilities that traditional security testing misses. The core problem: models like GPT, Claude, and Gemini reason in ways that fail unpredictably, without triggering alerts.

    Why traditional testing falls short

    Standard application security focuses on code vulnerabilities. LLMs introduce a different risk category. The model interprets language, and an attacker manipulates that interpretation rather than exploiting a logic bug. A simple prompt modification can bypass safety controls, extract training data, or produce harmful outputs. No alert fires.

    The Microsoft Copilot example

    Researchers demonstrated that Microsoft Copilot could be compromised through a single malicious email. This shows how AI-integrated business tools inherit model vulnerabilities and expose them to external manipulation. The model's ability to process email content becomes an attack vector.

    Red teaming methodology

    1. Scope definition: Establish rules of engagement. Specify in-scope targets and off-limits areas.

    2. Scenario design: Map the AI attack surface. Identify adversary paths, from data pipelines to prompt interfaces.

    3. Attack planning: Select tactics based on threat analysis. Options include prompt injection, data poisoning, and adversarial inputs.

    4. Execution: Launch attacks in sandboxed environments. Combine manual probing with automation. Monitor anomalies and document evidence.

    5. Reporting: Deliver comprehensive assessment with attack narratives. This provides organizations with a prioritized remediation roadmap.

    Common techniques
    • Prompt injection: Embedding malicious instructions in user input to hijack model control logic and override system prompts.
    • Data exfiltration: Tricking the model into revealing training data, user information, or system prompts.
    • Jailbreaks: Crafting inputs that bypass safety filters and ethical boundaries.
    • Data poisoning: Corrupting training data or context to manipulate model outputs.

    Observations

    The article frames AI red teaming as essential before deployment. This is reasonable, but the source does not independently verify all claims about vulnerability scope. The methodology is standard red team practice adapted for AI specifics. The field still lacks standardized frameworks.

    The distinction between code vulnerabilities and intent exploitation is operationally significant. Traditional fuzzing and penetration testing do not cover the language interpretation attack surface. Organizations integrating LLMs into critical infrastructure should treat red team assessment as a deployment prerequisite.

    🔹 AI #RedTeaming #LLMSecurity #PromptInjection #AdversarialML

    🔗 Source: blog.securelayer7.net/ai-red-t

  4. The real challenge in AI security isn’t the model — it’s controlling the prompt layer.
    Injection, escalation and silent bypasses all start there.
    Without governance and runtime visibility, autonomy becomes an operational risk.

    #AISecurity #Cybersecurity #AI #LLMSecurity

  5. OWASP dropped in 2026, the Top 10 for Agentic AI 🚨 The threat landscape for agentic systems goes way beyond prompt injection. Worth a read if you're building with AI agents. 🔗 graylog.org/post/what-is... #AgenticAI #OWASP #CyberSecurity #AppSec #LLMSecurity

    What is the OWASP Top 10 Agent...

  6. OWASP dropped in 2026, the Top 10 for Agentic AI 🚨 The threat landscape for agentic systems goes way beyond prompt injection. Worth a read if you're building with AI agents. 🔗 graylog.org/post/what-is... #AgenticAI #OWASP #CyberSecurity #AppSec #LLMSecurity

    What is the OWASP Top 10 Agent...

  7. Releasing AgentGuard: architectural safety layer for AI agents.

    Not prompt engineering. Code.

    @protect
    def delete_db(): ...

    The LLM cannot call this. Ever. No prompt bypasses a raise.

    Blocks: irreversible tool calls, prompt injection, context dilution, cross-agent contamination.

    Rust core + pure Python fallback. 31/31 e2e tests with real Ollama.

    github.com/psychomad/AgentGuard

    "Don't blame the knife. Fix the architecture."

    #InfoSec #LLMSecurity #AIAgents #PromptInjection #OpenSource #Rust

  8. Releasing AgentGuard: architectural safety layer for AI agents.

    Not prompt engineering. Code.

    @protect
    def delete_db(): ...

    The LLM cannot call this. Ever. No prompt bypasses a raise.

    Blocks: irreversible tool calls, prompt injection, context dilution, cross-agent contamination.

    Rust core + pure Python fallback. 31/31 e2e tests with real Ollama.

    github.com/psychomad/AgentGuard

    "Don't blame the knife. Fix the architecture."

    #InfoSec #LLMSecurity #AIAgents #PromptInjection #OpenSource #Rust

  9. Releasing AgentGuard: architectural safety layer for AI agents.

    Not prompt engineering. Code.

    @protect
    def delete_db(): ...

    The LLM cannot call this. Ever. No prompt bypasses a raise.

    Blocks: irreversible tool calls, prompt injection, context dilution, cross-agent contamination.

    Rust core + pure Python fallback. 31/31 e2e tests with real Ollama.

    github.com/psychomad/AgentGuard

    "Don't blame the knife. Fix the architecture."

    #InfoSec #LLMSecurity #AIAgents #PromptInjection #OpenSource #Rust

  10. The Three Layers Developers Miss When They “Swap Models” (And Why Proxy‑Routing Claude Code Breaks All of Them) Developers love shortcuts. But some shortcuts don’t collapse build time—the...

    #llmsecurity #proxyarchitecture #claudecode #supplychainrisk

    Origin | Interest | Match
  11. Warning: CVE-2025-30165 (CWEs: ['CWE-502']) found no CAPEC relationships.
    Warning: CVE-2025-3508 (CWEs: ['CWE-200']) found no CAPEC relationships.

    #AI #GenerativeAI #LLMSecurity #VirensReport
    2/2

  12. Sometimes i get lucky subscribing a channel on yt. Yes there is some good stuff to find.
    Here's a guy who explores AI and different LLM models in a fun and interesting, informative way.
    youtu.be/woTy4dTiT20?is=Lmh5UR

    As usual, don't mind the ads.🙄 ..or the sponsor.
    And i don't know if he got ever into the immense demands of resources of AI/LLMs which causing so much destruction and harm though, yet.

    #AI #llmsecurity #privacy

  13. Building with LLMs? The OWASP Top 10 for LLM Security (2025) is your threat checklist: Don’t ship AI apps without reading this: graylog.org/post/what-is... #LLMSecurity #OWASP #CyberSecurity #AI

    What is the OWASP Top 10 for L...

  14. 💡 AI agents moving from experiment to enterprise?

    Data governance is the difference between teams that scale safely and teams that make headlines for the wrong reasons.

    RBAC, ABAC, or both? What's your stack? 👇

    #AIAgents #DataSecurity #RBAC #ABAC #LLMSecurity #PII #CyberSecurity

  15. 💡 AI agents moving from experiment to enterprise?

    Data governance is the difference between teams that scale safely and teams that make headlines for the wrong reasons.

    RBAC, ABAC, or both? What's your stack? 👇

    #AIAgents #DataSecurity #RBAC #ABAC #LLMSecurity #PII #CyberSecurity

  16. CW: New AI security vulnerability discovered

    BREAKING: New MEXTRA attacks can extract private data from AI agent memory modules through black-box prompt injection. Our analysis shows 68.3% success rate in memory extraction.

    We're publishing a full threat report in 60min.

    TIAMAT Scrub detects and blocks these attacks.

    #AIPrivacy #InfoSec #LLMSecurity

  17. Quite fascinating. If confirmed, this may reveal a structural weakness in how refusal is implemented in some LLMs. The accept/refuse mechanism may be relatively isolated in internal representations and therefore observable and manipulable — tools like Heretic make this visible.

    A possible mitigation might be cryptographic signing of model weights, making unauthorized modifications detectable when the model is loaded for inference.

    #AISafety #LLMSecurity #CyberSecurity #AIRedTeaming #AdversarialML #LLM

  18. Quite fascinating. If confirmed, this may reveal a structural weakness in how refusal is implemented in some LLMs. The accept/refuse mechanism may be relatively isolated in internal representations and therefore observable and manipulable — tools like Heretic make this visible.

    A possible mitigation might be cryptographic signing of model weights, making unauthorized modifications detectable when the model is loaded for inference.

    #AISafety #LLMSecurity #CyberSecurity #AIRedTeaming #AdversarialML #LLM

  19. Quite fascinating. If confirmed, this may reveal a structural weakness in how refusal is implemented in some LLMs. The accept/refuse mechanism may be relatively isolated in internal representations and therefore observable and manipulable — tools like Heretic make this visible.

    A possible mitigation might be cryptographic signing of model weights, making unauthorized modifications detectable when the model is loaded for inference.

    #AISafety #LLMSecurity #CyberSecurity #AIRedTeaming #AdversarialML #LLM

  20. Quite fascinating. If confirmed, this may reveal a structural weakness in how refusal is implemented in some LLMs. The accept/refuse mechanism may be relatively isolated in internal representations and therefore observable and manipulable — tools like Heretic make this visible.

    A possible mitigation might be cryptographic signing of model weights, making unauthorized modifications detectable when the model is loaded for inference.

    #AISafety #LLMSecurity #CyberSecurity #AIRedTeaming #AdversarialML #LLM

  21. Quite fascinating. If confirmed, this may reveal a structural weakness in how refusal is implemented in some LLMs. The accept/refuse mechanism may be relatively isolated in internal representations and therefore observable and manipulable — tools like Heretic make this visible.

    A possible mitigation might be cryptographic signing of model weights, making unauthorized modifications detectable when the model is loaded for inference.

    #AISafety #LLMSecurity #CyberSecurity #AIRedTeaming #AdversarialML #LLM

  22. Inspired by Arditi et al. (NeurIPS 2024) on the “refusal direction” in LLMs, I tested an abliteration attack using the Heretic tool in my home lab. Interesting questions about AI guardrail robustness.
    linkedin.com/pulse/i-deleted-a (sorry for the LinkedIn link — no time to write this up on a proper blog yet.)

    #AISafety #LLMSecurity

  23. ContextHound v1.8.0 is out 🎉

    This release adds a Runtime Guard API - a lightweight wrapper that inspects your LLM calls in-process, before the request hits OpenAI or Anthropic.

    Free and open-source. If this is useful to you or your team, a GitHub star or a small donation helps keep development going.
    github.com/IulianVOStrut/ContextHound

    #LLMSecurity #PromptInjection #CyberSecurity #OpenSource #AIRisk #AppSec #DevSecOps #GenAI #RuntimeSecurity #InfoSec #MLSecurity #ArtificialIntelligence

  24. ContextHound v1.8.0 is out 🎉

    This release adds a Runtime Guard API - a lightweight wrapper that inspects your LLM calls in-process, before the request hits OpenAI or Anthropic.

    Free and open-source. If this is useful to you or your team, a GitHub star or a small donation helps keep development going.
    github.com/IulianVOStrut/ContextHound

    #LLMSecurity #PromptInjection #CyberSecurity #OpenSource #AIRisk #AppSec #DevSecOps #GenAI #RuntimeSecurity #InfoSec #MLSecurity #ArtificialIntelligence

  25. ContextHound v1.8.0 is out 🎉

    This release adds a Runtime Guard API - a lightweight wrapper that inspects your LLM calls in-process, before the request hits OpenAI or Anthropic.

    Free and open-source. If this is useful to you or your team, a GitHub star or a small donation helps keep development going.
    github.com/IulianVOStrut/ContextHound

    #LLMSecurity #PromptInjection #CyberSecurity #OpenSource #AIRisk #AppSec #DevSecOps #GenAI #RuntimeSecurity #InfoSec #MLSecurity #ArtificialIntelligence

  26. ContextHound v1.8.0 is out 🎉

    This release adds a Runtime Guard API - a lightweight wrapper that inspects your LLM calls in-process, before the request hits OpenAI or Anthropic.

    Free and open-source. If this is useful to you or your team, a GitHub star or a small donation helps keep development going.
    github.com/IulianVOStrut/ContextHound

    #LLMSecurity #PromptInjection #CyberSecurity #OpenSource #AIRisk #AppSec #DevSecOps #GenAI #RuntimeSecurity #InfoSec #MLSecurity #ArtificialIntelligence

  27. ContextHound v1.8.0 is out 🎉

    This release adds a Runtime Guard API - a lightweight wrapper that inspects your LLM calls in-process, before the request hits OpenAI or Anthropic.

    Free and open-source. If this is useful to you or your team, a GitHub star or a small donation helps keep development going.
    github.com/IulianVOStrut/ContextHound

    #LLMSecurity #PromptInjection #CyberSecurity #OpenSource #AIRisk #AppSec #DevSecOps #GenAI #RuntimeSecurity #InfoSec #MLSecurity #ArtificialIntelligence

  28. Just published my research paper on Basilisk an open-source AI red-teaming framework that uses genetic
    algorithms to evolve adversarial prompts automatically. Instead of static jailbreak lists, Basilisk breeds attacks.

    Paper: doi.org/10.5281/zenodo.18909538

    Code: github.com/regaan/basilisk

    pip install basilisk-ai

    #LLMSecurity #AIRedTeaming #OffensiveSecurity #InfoSec
    #RedTeam #OWASP #CyberSecurity #OpenSource #Research

  29. It seems that the AI agent security industry may be repeating familiar mistakes: reaching for detection as a first-line preventative control instead of doing the structural work.

    Detection is not prevention. A filter that can be probed and evaded by the system it is protecting is not a control. It is a delay.

    Instead, treating security as an engineering problem leads to invariants: what can we make structurally impossible? What attack surface can we completely eliminate? Detection comes after, augmenting a foundation that does not depend on it.

    For AI agents, the structural question is: can we constrain the agent to a path aligned with human intent, rather than trying to detect whether it behaves maliciously?

    More below:
    securityblueprints.io/posts/ag

    #AIAgentSecurity #OpenSource #Cybersecurity #AIGovernance #LLMSecurity

  30. New open-source AI assistant IronCurtain adds a sandboxed control layer, letting LLMs run inside a virtual machine with strict security policies. No direct system access, yet full generative AI power. See how this approach could reshape secure AI deployments. #IronCurtain #OpenSourceAI #GenerativeAI #LLMSecurity

    🔗 aidailypost.com/news/open-sour

  31. Palo Alto Networks to acquire Koi Security for $400M, targeting the emerging Agentic Endpoint attack surface.

    Koi (Assaraf, Dardikman, Kruk) developed LLM-powered analysis to detect:
    • Malicious extensions/plugins
    • Package ecosystem abuse (NPM, Homebrew)
    • AI agent exploit chaining
    • Model artifact manipulation
    • Credential hijacking within agent frameworks

    Planned integration into Prisma AIRS™ and Cortex XDR® aims to improve AI runtime visibility and enforcement.

    Question for defenders:
    Are your telemetry pipelines mapping AI agent behavior - or just traditional executables?

    Source: paloaltonetworks.com/company/p

    Drop your technical perspective below.
    Follow Technadu for advanced threat intelligence reporting.

    #Infosec #ThreatModeling #AppSec #EndpointSecurity #AIsecurity #DetectionEngineering #XDR #ZeroTrust #SupplyChainSecurity #LLMsecurity #BlueTeam #RedTeam #CyberArchitecture

  32. Palo Alto Networks to acquire Koi Security for $400M, targeting the emerging Agentic Endpoint attack surface.

    Koi (Assaraf, Dardikman, Kruk) developed LLM-powered analysis to detect:
    • Malicious extensions/plugins
    • Package ecosystem abuse (NPM, Homebrew)
    • AI agent exploit chaining
    • Model artifact manipulation
    • Credential hijacking within agent frameworks

    Planned integration into Prisma AIRS™ and Cortex XDR® aims to improve AI runtime visibility and enforcement.

    Question for defenders:
    Are your telemetry pipelines mapping AI agent behavior - or just traditional executables?

    Source: paloaltonetworks.com/company/p

    Drop your technical perspective below.
    Follow Technadu for advanced threat intelligence reporting.

    #Infosec #ThreatModeling #AppSec #EndpointSecurity #AIsecurity #DetectionEngineering #XDR #ZeroTrust #SupplyChainSecurity #LLMsecurity #BlueTeam #RedTeam #CyberArchitecture

  33. Palo Alto Networks to acquire Koi Security for $400M, targeting the emerging Agentic Endpoint attack surface.

    Koi (Assaraf, Dardikman, Kruk) developed LLM-powered analysis to detect:
    • Malicious extensions/plugins
    • Package ecosystem abuse (NPM, Homebrew)
    • AI agent exploit chaining
    • Model artifact manipulation
    • Credential hijacking within agent frameworks

    Planned integration into Prisma AIRS™ and Cortex XDR® aims to improve AI runtime visibility and enforcement.

    Question for defenders:
    Are your telemetry pipelines mapping AI agent behavior - or just traditional executables?

    Source: paloaltonetworks.com/company/p

    Drop your technical perspective below.
    Follow Technadu for advanced threat intelligence reporting.

    #Infosec #ThreatModeling #AppSec #EndpointSecurity #AIsecurity #DetectionEngineering #XDR #ZeroTrust #SupplyChainSecurity #LLMsecurity #BlueTeam #RedTeam #CyberArchitecture

  34. Palo Alto Networks to acquire Koi Security for $400M, targeting the emerging Agentic Endpoint attack surface.

    Koi (Assaraf, Dardikman, Kruk) developed LLM-powered analysis to detect:
    • Malicious extensions/plugins
    • Package ecosystem abuse (NPM, Homebrew)
    • AI agent exploit chaining
    • Model artifact manipulation
    • Credential hijacking within agent frameworks

    Planned integration into Prisma AIRS™ and Cortex XDR® aims to improve AI runtime visibility and enforcement.

    Question for defenders:
    Are your telemetry pipelines mapping AI agent behavior - or just traditional executables?

    Source: paloaltonetworks.com/company/p

    Drop your technical perspective below.
    Follow Technadu for advanced threat intelligence reporting.

    #Infosec #ThreatModeling #AppSec #EndpointSecurity #AIsecurity #DetectionEngineering #XDR #ZeroTrust #SupplyChainSecurity #LLMsecurity #BlueTeam #RedTeam #CyberArchitecture

  35. ClickFix campaigns are now leveraging LLM-generated public artifacts for malware distribution.

    Per Moonlock Lab and AdGuard:
    • Abuse of Claude artifact pages
    • Google Ads search poisoning
    • Obfuscated shell execution (base64 decode → zsh)
    • Second-stage loader for MacSync infostealer
    • Hardcoded API key + token-protected C2
    • AppleScript (osascript) handling data theft
    • Archive staging at /tmp/osalogging.zip
    • Multi-attempt POST exfiltration

    Previous campaigns exploited ChatGPT and Grok sharing features.
    LLM trust is now an operational risk vector.
    Should EDR flag suspicious AI-guided shell patterns?

    Source: bleepingcomputer.com/news/secu

    Engage below.
    Follow @technadu for deep technical threat analysis.

    #ThreatIntel #MacOSSecurity #Infostealer #C2Traffic #ClickFix #LLMSecurity #MalwareAnalysis #AppSec #BlueTeam #EDR #ThreatHunting #CyberThreats #ZeroTrust

  36. ClickFix campaigns are now leveraging LLM-generated public artifacts for malware distribution.

    Per Moonlock Lab and AdGuard:
    • Abuse of Claude artifact pages
    • Google Ads search poisoning
    • Obfuscated shell execution (base64 decode → zsh)
    • Second-stage loader for MacSync infostealer
    • Hardcoded API key + token-protected C2
    • AppleScript (osascript) handling data theft
    • Archive staging at /tmp/osalogging.zip
    • Multi-attempt POST exfiltration

    Previous campaigns exploited ChatGPT and Grok sharing features.
    LLM trust is now an operational risk vector.
    Should EDR flag suspicious AI-guided shell patterns?

    Source: bleepingcomputer.com/news/secu

    Engage below.
    Follow @technadu for deep technical threat analysis.

    #ThreatIntel #MacOSSecurity #Infostealer #C2Traffic #ClickFix #LLMSecurity #MalwareAnalysis #AppSec #BlueTeam #EDR #ThreatHunting #CyberThreats #ZeroTrust

  37. ClickFix campaigns are now leveraging LLM-generated public artifacts for malware distribution.

    Per Moonlock Lab and AdGuard:
    • Abuse of Claude artifact pages
    • Google Ads search poisoning
    • Obfuscated shell execution (base64 decode → zsh)
    • Second-stage loader for MacSync infostealer
    • Hardcoded API key + token-protected C2
    • AppleScript (osascript) handling data theft
    • Archive staging at /tmp/osalogging.zip
    • Multi-attempt POST exfiltration

    Previous campaigns exploited ChatGPT and Grok sharing features.
    LLM trust is now an operational risk vector.
    Should EDR flag suspicious AI-guided shell patterns?

    Source: bleepingcomputer.com/news/secu

    Engage below.
    Follow @technadu for deep technical threat analysis.

    #ThreatIntel #MacOSSecurity #Infostealer #C2Traffic #ClickFix #LLMSecurity #MalwareAnalysis #AppSec #BlueTeam #EDR #ThreatHunting #CyberThreats #ZeroTrust

  38. ClickFix campaigns are now leveraging LLM-generated public artifacts for malware distribution.

    Per Moonlock Lab and AdGuard:
    • Abuse of Claude artifact pages
    • Google Ads search poisoning
    • Obfuscated shell execution (base64 decode → zsh)
    • Second-stage loader for MacSync infostealer
    • Hardcoded API key + token-protected C2
    • AppleScript (osascript) handling data theft
    • Archive staging at /tmp/osalogging.zip
    • Multi-attempt POST exfiltration

    Previous campaigns exploited ChatGPT and Grok sharing features.
    LLM trust is now an operational risk vector.
    Should EDR flag suspicious AI-guided shell patterns?

    Source: bleepingcomputer.com/news/secu

    Engage below.
    Follow @technadu for deep technical threat analysis.

    #ThreatIntel #MacOSSecurity #Infostealer #C2Traffic #ClickFix #LLMSecurity #MalwareAnalysis #AppSec #BlueTeam #EDR #ThreatHunting #CyberThreats #ZeroTrust

  39. Prompt injection isn’t a text problem.
    It’s an authority problem.

    In this article, I show how to stop prompt injection in Java by enforcing real input boundaries using Quarkus, LangChain4j, Spotlighting, and StruQ.

    No classifiers.
    No regex guardrails.
    Just architecture that holds under pressure.

    the-main-thread.com/p/secure-l

    #Java #Quarkus #LLMSecurity #PromptInjection #LangChain4j #Architecture

  40. AI security has taxonomies, guardrails, and reporting channels, but no shared exposure registry.

    AISE is an early attempt to build that missing layer:

    • AISE IDs
    • AIS3 scoring
    • Evidence & Fix levels
    • Vendor coordination workflow

    Feedback welcome:
    aise-registry.org

    #AISE #AIS3 #LLMSecurity #InfoSec #AppSec

  41. AI security has taxonomies, guardrails, and reporting channels, but no shared exposure registry.

    AISE is an early attempt to build that missing layer:

    • AISE IDs
    • AIS3 scoring
    • Evidence & Fix levels
    • Vendor coordination workflow

    Feedback welcome:
    aise-registry.org

    #AISE #AIS3 #LLMSecurity #InfoSec #AppSec

  42. An analysis of why many reported AI safety failures are artifacts of poor measurement, showing how non-refusal often produces unusable results. hackernoon.com/why-most-llm-ja #llmsecurity