AutomationSystemsBackend

Ghost-Admin

Air-gapped Linux daemon that heals servers through semantic reasoning — not blind thresholds.

7-stage MAPE-K< 30s detectionRAG memoryZero cloud calls

The Problem

Kubernetes probes kill processes when RAM hits a threshold. Nagios fires alerts engineers silence at 3AM. Static bash scripts restart services mid-transaction. These tools are syntactic — they read numbers. They do not reason. The result: $9,000/minute average cost for large enterprise outages, 25 unplanned incidents per month in the average industrial plant.

Approach

Built the first air-gapped, semantically-aware server healing daemon that classifies process intent — not just resource state. Implemented a strict 7-stage MAPE-K closed-loop: DETECT → ISOLATE → PROFILE → EXTRACT → REASON → EXECUTE → AUDIT. Every stage has a defined interface; the AI only sees fully-fused context, never raw metrics.

Architecture

DETECT: 60-second rolling RAM window triggers on growth trajectory before crisis (fires at 60% climbing, not 85% already critical). ISOLATE: Whitelist guardrail immunizes systemd, sshd, and critical processes. PROFILE: 24-hour behavioral fingerprinting builds per-process P95 baselines — an ML training job hitting 78% RAM is normal; an HTTP server hitting 55% for the first time is not. EXTRACT: Multi-signal context fusion assembles RAM, CPU, file handles, thread count, network connections, full process ancestry tree, journalctl logs, and RAG-retrieved similar past incidents. REASON: llama3.2:3b via Ollama (CUDA) classifies intent across 5 categories: WORKING_AS_INTENDED, DEGRADED_BUT_FUNCTIONAL, LEAKING, UNDER_ATTACK, UNKNOWN — making it the first self-healing daemon with rudimentary intrusion detection built in. EXECUTE: 4-step graceful degradation ladder (SIGTERM → cgroup cap → SIGSTOP → SIGKILL) with a pre-kill forensic gcore dump for post-mortem analysis. Cascade Correlation halts individual kill decisions when 2+ processes spike within 60 seconds. AUDIT: Append-only JSONL audit log directly ingestible by SIEM systems (Splunk, Elastic, Wazuh).

Results

Mean time to detection reduced from ~15 minutes (on-call engineer) to under 30 seconds. Confidence gate below 0.70 escalates rather than kills — false positive rate under 5%. Pre-kill forensic memory dumps enable post-mortem root cause analysis. RAG memory layer means the daemon improves with every incident it handles. Zero bytes of operational data leave the machine.

Invoice Automation POC

Fully local AI pipeline that extracts 29 invoice fields (including line items) from PDFs and scanned images — zero cloud, zero data leakage.

→

SentinelMesh

Autonomous drone fleet that detects, debates, and dispatches — without human input.

→