#malhaus — Public Fediverse posts on home.social

----------------

🛠️ Tool
===================

Opening: Malhaus is a self‑hosted static malware triage platform that aggregates the output of established static analysis tools and has a user‑selected LLM interpret it to produce a structured triage report. The project targets analysts who need fast, explainable static assessments without executing the sample.

Key Features:
• Aggregates outputs from radare2, YARA, strings, objdump, oletools, floss, binwalk, exiftool and optional Ghidra headless decompilation for PE/ELF.
• Supports multiple LLM backends including Gemini, OpenAI, Azure AI Foundry, Claude, DeepSeek and OpenAI‑compatible servers (Ollama, vLLM, LM Studio).
• Produces a structured verdict with a confidence score, key reasoning points and full raw tool outputs.
• Exposes a REST API with bearer‑token authentication and per‑key rate limiting, and includes an MCP server so that AI agents can call analyze natively.
• Implements mathematical visualizations: entropy profile, compression curves, a 256×256 bigram matrix and an experimental byte‑trigram point cloud clustered with HDBSCAN.
• Caches results by SHA‑256 to make re‑submissions instant.
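
As a rough illustration of two of the visualizations listed above, the sketch below computes a windowed entropy profile and a 256×256 bigram count matrix using only the Python standard library. It is not taken from the Malhaus code base: the function names, the window size and the step size are assumptions chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum -p * log2(p) over the observed byte values.
    return sum(-(n / total) * math.log2(n / total) for n in Counter(data).values())


def entropy_profile(data: bytes, window: int = 256, step: int = 256) -> list[float]:
    """Entropy of each window; window and step sizes are illustrative assumptions."""
    return [shannon_entropy(data[i:i + window]) for i in range(0, len(data), step)]


def bigram_matrix(data: bytes) -> list[list[int]]:
    """256x256 matrix where cell [a][b] counts how often byte a is followed by byte b."""
    matrix = [[0] * 256 for _ in range(256)]
    for a, b in zip(data, data[1:]):
        matrix[a][b] += 1
    return matrix


if __name__ == "__main__":
    sample = b"\x00" * 256 + bytes(range(256))
    print(entropy_profile(sample))        # low entropy for padding, high for varied bytes
    print(bigram_matrix(b"ABAB")[65][66])  # count of "A" followed by "B"
```

Flat, low‑entropy windows typically correspond to padding, while windows close to 8 bits per byte often indicate compressed or encrypted regions, which is why such a profile is useful as evidence rather than as a verdict on its own.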

Technical Implementation:
• The pipeline ingests the uploaded file, runs a configurable suite of static analyzers, computes byte‑sequence visualizations, and passes the aggregated evidence to an LLM prompt template that returns a JSON‑structured verdict.
• Visualization outputs (entropy, compression ratios, bigram heatmap, PCA‑reduced trigram point cloud) are treated as evidence rather than classifiers; the LLM synthesizes these signals into human‑readable conclusions.
• Security controls include a captcha‑protected web UI and token‑based API access; caching avoids repeating heavy analysis for files that have already been processed.
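
The caching and JSON‑verdict steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `TriageCache`, `parse_verdict`, the field names `verdict`, `confidence` and `reasons`, and the 0 to 1 confidence range are all hypothetical, and the `analyze` callable stands in for the whole analyzer and LLM pipeline.

```python
import hashlib
import json
from typing import Callable


class TriageCache:
    """In-memory result cache keyed by the SHA-256 of the sample (hypothetical helper)."""

    def __init__(self, analyze: Callable[[bytes], dict]) -> None:
        self._analyze = analyze  # stand-in for the analyzers plus the LLM call
        self._results: dict[str, dict] = {}

    def submit(self, sample: bytes) -> tuple[str, dict, bool]:
        """Return (sha256 hex digest, report, cache_hit)."""
        digest = hashlib.sha256(sample).hexdigest()
        if digest in self._results:
            return digest, self._results[digest], True
        report = self._analyze(sample)
        self._results[digest] = report
        return digest, report, False


def parse_verdict(raw: str) -> dict:
    """Validate an LLM reply as a JSON verdict; field names are assumptions."""
    verdict = json.loads(raw)
    for field in ("verdict", "confidence", "reasons"):
        if field not in verdict:
            raise ValueError(f"missing field: {field}")
    # Assumed convention: confidence is a number between 0 and 1.
    if not 0.0 <= float(verdict["confidence"]) <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return verdict


if __name__ == "__main__":
    def fake_analyze(sample: bytes) -> dict:
        reply = '{"verdict": "suspicious", "confidence": 0.8, "reasons": ["packed"]}'
        return parse_verdict(reply)

    cache = TriageCache(fake_analyze)
    print(cache.submit(b"abc")[2])  # first submission runs the analysis
    print(cache.submit(b"abc")[2])  # identical bytes are served from the cache
```

Keying on the content hash rather than the file name means a renamed copy of the same sample still hits the cache, while rejecting malformed replies keeps an unparseable LLM answer from being stored as a result.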

Use Cases:
• Rapid triage of suspicious samples to determine priority for dynamic analysis.
• Augmenting human analysts with LLM‑summarized rationale and full tool outputs for auditability.
• Bulk triage workflows where SHA‑256 caching reduces redundant processing.

Limitations:
• The byte‑trigram visualization is experimental and not a validated classifier.
• LLM outputs depend on prompt design and model choice; false positives and false negatives remain possible.
• Static triage cannot replace behavioral analysis for runtime behaviors, C2 activity or evasive techniques.

🔹 tool #malhaus #malwareanalysis #LLM #entropy

🔗 Source: https://github.com/toorandom/malhaus?tab=readme-ov-file