Building intelligent systems that ship securely.

AI systems that prove they work. Cloud platforms hardened from day one. DevSecOps pipelines that never skip a scan.

Systems in Production

Built, validated, and running

900+ AI Agents
30+ E2E Test Suites
7,700+ KB Documents
24/7 Autonomous Ops

Michael Thola

Technology professional building at the intersection of AI, security, and infrastructure.

DevSecOps Engineer. AI Systems Builder. Problem solver.

I build systems that work autonomously and reduce workload by design. Everything I ship follows real DevSecOps practice: continuously validated, scanned, and hardened. Even rapid prototypes get a full security scan before they deploy.

Currently I lead development on EMBER, an autonomous AI companion that runs 24/7 on edge devices and mathematically proves its own improvement, and on Crucible Cloud, a platform for deploying and security-auditing any GitHub repository in isolated sandboxes.

I believe the best software is secure by default, observable in production, and built to run without hand-holding.

DevSecOps · AI Systems · Cloud Architecture · Security Engineering · Edge Computing · Full-Stack

What I Build

End-to-end systems engineering, from architecture through production hardening.

🤖 Autonomous AI Systems

Multi-agent orchestration with self-improvement pipelines. Systems that detect their own gaps, generate new capabilities, and mathematically prove they got better. Not by asking an LLM to grade itself, but through Wilson score intervals and hard statistical thresholds.
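
For the curious, here is the core statistic in miniature: a Python sketch of the Wilson lower bound. The function name and the assertion harness are illustrative, not EMBER's actual code.

```python
import math

def wilson_lower_bound(successes: int, trials: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval (95% confidence by default).

    A conservative estimate of the true success rate: with few trials the
    bound stays low, so a lucky streak can't outrank a well-tested baseline.
    """
    if trials == 0:
        return 0.0
    p = successes / trials
    denom = 1 + z * z / trials
    center = p + z * z / (2 * trials)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * trials)) / trials)
    return (center - margin) / denom

# 3/3 looks perfect but is weak evidence; 90/100 wins on the lower bound.
assert wilson_lower_bound(3, 3) < wilson_lower_bound(90, 100)
```

That asymmetry is the whole point: a raw percentage rewards small lucky samples, while the lower bound demands evidence.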

🛡 DevSecOps Pipelines

Security scanning built into every step. SonarQube SAST, Trivy container scanning, CrowdSec threat intelligence, fail2ban intrusion prevention. No code ships without passing the scan. No exceptions.
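
What "no code ships without passing the scan" looks like as a gate, sketched in Python around Trivy's real --exit-code flag. The wrapper and image name are illustrative, not a pipeline excerpt.

```python
import subprocess
import sys

def trivy_gate(image: str) -> None:
    """Block the deploy if Trivy finds HIGH or CRITICAL vulnerabilities.

    --exit-code 1 makes Trivy itself return non-zero on findings, so the
    gate is the scanner's own verdict, not a re-parsed report.
    """
    result = subprocess.run(
        ["trivy", "image", "--exit-code", "1",
         "--severity", "HIGH,CRITICAL", image]
    )
    if result.returncode != 0:
        sys.exit(f"BLOCKED: {image} failed the vulnerability scan")

if __name__ == "__main__":
    trivy_gate("registry.example.com/app:candidate")  # illustrative image
```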

☁️ Cloud Security Platforms

Isolated sandbox environments for deploying and evaluating any codebase. Automated compliance scoring, dependency auditing, and production readiness assessment. Built for teams that need to move fast without cutting security corners.

💻 Edge Computing & IoT

Production systems running on Raspberry Pi clusters with systemd hardening, watchdog timers, blue/green deployments, and out-of-process self-healing. Real infrastructure at the edge, not lab demos.

🔍 MCP Integration & Tool Orchestration

Model Context Protocol servers that give AI agents live access to databases, knowledge bases, and APIs. Secure by design with scoped tokens, secret scrubbing, table blocklists, and full audit logging on every tool call.
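
A minimal sketch of the guardrails on one tool call. The scope name, blocklist, regex, and stub executor are all assumptions for illustration; the order of checks is the point.

```python
import logging
import re

logging.basicConfig(level=logging.INFO)
AUDIT = logging.getLogger("tool_audit")

BLOCKED_TABLES = {"users", "credentials"}        # illustrative blocklist
SECRET = re.compile(r"sk-[A-Za-z0-9]{16,}")      # one common API-key shape

def run_query(table: str, sql: str) -> str:
    """Stub executor; the real server talks to the actual database."""
    return f"rows from {table} (key sk-abcdefghijklmnop1234 leaked in a row)"

def guarded_call(scopes: set[str], table: str, sql: str) -> str:
    """The checks wrapped around every agent tool call, in order."""
    if "kb:read" not in scopes:                  # 1. scoped-token auth
        raise PermissionError("token lacks kb:read")
    if table in BLOCKED_TABLES:                  # 2. table blocklist
        raise PermissionError(f"table '{table}' is blocked")
    AUDIT.info("tool=query table=%s sql=%s", table, sql)    # 3. audit log
    return SECRET.sub("[REDACTED]", run_query(table, sql))  # 4. secret scrub

print(guarded_call({"kb:read"}, "kb_documents", "SELECT title FROM kb_documents"))
```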

📊 Observability & Monitoring

Prometheus metrics, Grafana dashboards, Uptime Kuma health checks, and custom security posture monitoring. Systems that know when they're sick and can tell you exactly what's wrong.
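
A taste of the instrumentation, using the real prometheus_client API. Metric names and the port are illustrative, not the production config.

```python
import time
from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("ember_requests_total", "Requests handled", ["channel"])
LATENCY = Histogram("ember_response_seconds", "End-to-end response latency")

@LATENCY.time()
def handle(channel: str, message: str) -> str:
    REQUESTS.labels(channel=channel).inc()
    return f"ack: {message}"          # stand-in for the real handler

if __name__ == "__main__":
    start_http_server(9100)           # Prometheus scrapes /metrics here
    while True:
        handle("web", "ping")
        time.sleep(5)
```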

Featured Projects

Systems built from scratch, shipping in production.

AI Companion

EMBER

A self-evolving AI companion that runs 24/7 on a Raspberry Pi 5. EMBER doesn't just improve herself. She proves it with math, not AI opinion. Every self-generated improvement is measured using real statistics: Wilson score confidence intervals, P50/P95 latency from raw execution data, and hard degradation thresholds with automatic rollback. No LLM grades its own homework here.

  • Self-evolution measured by math: Wilson score intervals, percentile latency, auto-rollback, grounded in published methods (Wilson 1927, Google SRE handbook, Kohavi et al. 2009)
  • 900+ autonomous AI agents and 900+ reusable skills organized into specialized teams with DAG-based orchestration, generating new agents and skills on the fly as demand requires
  • EMBER CLI for running the agent stack, dispatching tasks, and managing the multi-process companion (backend, guardian, web UI) from any terminal
  • MCP server with scoped token auth, secret scrubbing, and per-tool authorization so agents can query the knowledge base, threads, and domains during reasoning
  • Multi-channel conversation memory with per-topic threading across Pin voice, Telegram chat, and web UI, all maintaining parallel conversation threads
  • Cost-tiered routing: Ollama (free, local GPU) handles 90% of requests; Claude escalation only when accuracy requires it
  • Post-response self-evaluation: detects when EMBER admits failure in her own words and auto-creates improvement tickets
  • Production-grade systemd hardening: out-of-process stability guardian, watchdog, auto-restart, blue/green deploys
Claude API · MCP · Ollama · CrewAI · Python · Raspberry Pi
Cloud Platform

Crucible Cloud

A self-service cloud platform for deploying any GitHub repository into isolated, security-audited sandboxes. One-click deployments with built-in compliance scoring, cost analysis, and network monitoring. Per-stack build agents auto-detect the tech stack and generate the right build pipeline.

  • Fully automated multi-phase assessment pipeline: clone, detect stack, build, scan, run, collect artifacts, and draft a deployable showcase (phase runner sketched below)
  • Per-stack build agents for Docker, Python, Node.js, Rust, Go, Zig, and more, with a meta-agent that generates new stack agents on demand
  • EMBER compliance engine: SonarQube + Trivy + multi-model AI risk scoring (0-100) grounded in real scan findings
  • User feedback loop: approve/reject decisions adjust stack priority scores via learned deltas
  • Keycloak SSO with OIDC/SAML, full audit trail, PDF compliance reports
  • 36 passing E2E tests (Playwright) with real service UI validation
React · Node.js · AWS · Docker · Keycloak · SonarQube
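
For a feel of the assessment pipeline's shape, a minimal phase-runner sketch. The stub phases here are toys; the real per-stack agents do the work.

```python
from collections.abc import Callable

Ctx = dict
Phase = Callable[[Ctx], Ctx]

# Toy phases; real counterparts exist for clone, detect, build, scan,
# run, collect artifacts, and draft showcase.
def clone(ctx: Ctx) -> Ctx:
    ctx["workdir"] = "/tmp/" + ctx["repo"].rsplit("/", 1)[-1]
    return ctx

def detect_stack(ctx: Ctx) -> Ctx:
    ctx["stack"] = "python"   # real detection inspects the repo contents
    return ctx

def assess(repo: str, phases: list[Phase]) -> Ctx:
    """Run phases in order, threading one shared context through them all."""
    ctx: Ctx = {"repo": repo}
    for phase in phases:
        ctx = phase(ctx)      # a raising phase aborts the assessment
    return ctx

print(assess("https://github.com/example/repo", [clone, detect_stack]))
```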

Mathematically Proven Improvement

Most AI systems that claim "self-improvement" use the LLM to score whether it got better. That's the AI grading its own homework. We don't do that.

The Problem

Most AI "self-improvement" systems ask the model: "did you get better?" The model says yes. The score goes up. Nobody checks if it actually did. These scores reflect what the model thinks happened, not what actually happened. They're unfounded, unreproducible, and indistinguishable from hallucination.

Our Method

Every improvement EMBER deploys is measured with real statistics: Wilson score confidence intervals for success rates (not raw percentages), P50/P95 latency from actual execution data (not averages), and hard degradation thresholds. If quality drops 10% or latency increases 30%, the change auto-rolls back. No AI judgment call. Just math.
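
The rollback rule as code: a minimal sketch wired to the exact thresholds above. The Snapshot bundle and its field names are assumptions for illustration, not EMBER's internals.

```python
from dataclasses import dataclass

@dataclass
class Snapshot:                       # illustrative metric bundle
    success_lb: float                 # Wilson lower bound of success rate
    p95_ms: float                     # 95th-percentile latency, milliseconds

def should_rollback(baseline: Snapshot, candidate: Snapshot) -> bool:
    """The hard thresholds above: quality down >10% or P95 up >30% = revert."""
    quality_drop = (baseline.success_lb - candidate.success_lb) / baseline.success_lb
    latency_rise = (candidate.p95_ms - baseline.p95_ms) / baseline.p95_ms
    return quality_drop > 0.10 or latency_rise > 0.30

# A candidate that answers slightly better but 40% slower still gets reverted.
print(should_rollback(Snapshot(0.82, 900.0), Snapshot(0.84, 1260.0)))  # True
```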

30-Test Self-Evaluation Suite

Runs automatically on every deploy. Covers routing accuracy, domain classification, goal tracking, voice quality, and secret scrubbing. Regressions create backlog tickets automatically. Current baseline: 100% pass rate.

Post-Response Gap Detection

Scans EMBER's own output for admissions of failure like "I'm unable to fetch", placeholder text, or redirecting the user to check manually. Each detected gap auto-creates a Kanban story for the self-improvement pipeline to fix.
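
The detector in miniature, with a deliberately short, illustrative pattern list:

```python
import re

# Illustrative patterns; the production list is longer and tuned over time.
GAP_PATTERNS = [
    re.compile(r"\bI'?m unable to fetch\b", re.IGNORECASE),
    re.compile(r"\bcheck(?: it)? manually\b", re.IGNORECASE),
    re.compile(r"\[(?:TODO|PLACEHOLDER)\]"),
]

def detect_gaps(response: str) -> list[str]:
    """Return every failure admission found in the model's own output."""
    return [p.pattern for p in GAP_PATTERNS if p.search(response)]

for gap in detect_gaps("I'm unable to fetch your calendar; please check manually."):
    # In production each hit files a Kanban story for the improvement pipeline.
    print(f"gap detected -> {gap}")
```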

Cost-Tiered Escalation

90% of routing decisions use free local models (Ollama on GPU). Only when the local model's confidence is low or the routing evaluator detects a wrong domain does the system escalate to a paid API with MCP tools for live database access. Every escalation is logged with cost tracking.
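
The routing decision in miniature. Stub functions stand in for the Ollama and Claude calls, and the confidence floor is illustrative rather than the tuned value.

```python
CONFIDENCE_FLOOR = 0.75   # illustrative threshold

def local_answer(msg: str) -> tuple[str, float]:
    """Stand-in for the free Ollama path; real code returns model confidence."""
    confidence = 0.9 if "calendar" not in msg else 0.4   # toy heuristic
    return f"local: {msg}", confidence

def paid_answer(msg: str) -> str:
    """Stand-in for the Claude + MCP path, reached only on low confidence."""
    return f"claude(+MCP): {msg}"

def route(msg: str) -> str:
    answer, confidence = local_answer(msg)
    if confidence >= CONFIDENCE_FLOOR:
        return answer                                    # ~90% of traffic, free
    print(f"escalating, confidence={confidence:.2f}")    # logged with cost
    return paid_answer(msg)

print(route("summarize this note"))          # stays local
print(route("what's on my calendar?"))       # escalates
```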

Technologies

Tools and platforms I build with daily.

🧠 AI & ML

Claude API · OpenAI · Gemini · CrewAI · Mastra · LiteLLM · MCP · Ollama · WhisperX · ComfyUI

☁️ Cloud & Infra

AWS · Route 53 · Docker · Nginx · Terraform · systemd

🛡 Security

SonarQube · Trivy · CrowdSec · fail2ban · Keycloak · Fernet · OIDC/SAML

💻 Development

Python · Node.js · React · Flask · Playwright · PyQt6

📊 Observability

Prometheus · Grafana · Uptime Kuma · Structured Logging

🔌 Edge & Hardware

Raspberry Pi 5 · Intel Arc A770 · NVIDIA RTX 3070 · Pi Zero 2W

Let's Talk

Interested in working together or have questions about these systems? Reach out.