Senior freelance SRE for observability-heavy platforms

I help engineering teams debug faster, reduce operational noise, and harden the platforms behind critical services: Grafana/Loki/Thanos, Kubernetes, Proxmox/Ceph, and AI-assisted SRE workflows.

Available for freelance missions - Paris area / remote - 4 days/week
            
10+ years infra/SRE
TB/day observability ingestion
10 DCs, multi-datacenter delivery
Proxmox virtualization
Ceph distributed storage

How I can help

Senior SRE work, grounded in real production operations. Not a generic DevOps contractor, but a focused operator for teams that need reliability, not tickets.

Observability at scale

Grafana, Loki, Thanos, Vector, CloudWatch. Stack audits, migrations, query performance work, SLO dashboards, and platform hygiene.

Grafana, Loki, Thanos, Vector, CloudWatch

Platform & Kubernetes engineering

Kubernetes operations, Helm, Argo CD, CI/CD (Jenkins, GitLab), Terraform and Ansible automation for safer, repeatable delivery.

Kubernetes, Helm, Argo CD, Terraform, Ansible

Proxmox / Ceph / on-prem HA

Virtualization and distributed storage on bare metal. Multi-datacenter clusters, migrations from legacy virtualization, HA and network design.

Proxmox, Ceph, HA clusters, Migrations

AI-assisted SRE tooling

Graphia, a domain-specific SRE agent for Grafana diagnosis. MCP-based workflows with RBAC-aware safeguards, built for real operations, not demos.

Graphia, MCP, RBAC-aware, Diagnosis

On-prem AI: run models on your own infra

For teams that need AI capabilities without sending data to third-party clouds: local LLM serving, private RAG pipelines, and AI-assisted diagnosis running entirely on your Proxmox/Kubernetes infrastructure. GDPR-compliant by design, full control over models and data.

Local LLM serving, GDPR-compliant, Private RAG pipelines, Data sovereignty

Selected work

A few receipts from 10+ years of operating infrastructure that can't quietly fail.

High-volume observability platform

TB/day ingestion - multi-cluster

Operated multi-cluster observability at multi-TB/day ingestion across logs, metrics, and traces, with Kafka pipelines feeding SIEM, logging, EDR, APM, and uptime monitoring.

Proxmox / Ceph HA platform

4 racks - 2 datacenters - 25 Gb/host

Integrated a highly available Proxmox cluster across 4 racks and 2 datacenters with Ceph storage, PXE automation, and 25 Gb networking per host.

Multi-provider platform automation

10 datacenters - 4 providers

Led automated VM and application delivery across 10 datacenters and 4 providers (OpenStack, Proxmox, vSphere, NetBox) from shared Terraform templates.

See full CV →

Open Source

Tools I build around real operational pain points, kept practical, open, and useful beyond my own environment.

SSHplex

Python, TUI, SSH, NetBox, Consul

Modern SSH multiplexing with multi-source inventory and tmux or iTerm2 backends.

OpenClaw Audit TUI

TypeScript Audit TUI OpenClaw Observability

Terminal audit UI for OpenClaw sessions with live events and real-time streaming.

terraform-provider-centreon

Go, Terraform, Monitoring, IaC, Centreon

Terraform provider for the Centreon API v2: monitoring configuration managed as infrastructure as code.

See all open source work →

Let's talk

If your team runs critical infrastructure and needs SRE work that doesn't add theatre, get in touch.

Email me