Zernel

The first operating system where the kernel itself understands machine learning.

Faster training. Lower energy costs. Quantum-secure models. Zero code changes.

19.0 TFLOPS on A100 (97.4% of theoretical FP32 peak)
1.8x FP16 speedup with mixed precision
10-20% energy savings via phase-aware power management
50+ built-in ML tools, zero external dependencies

The kernel knows your workload

Zernel's sched_ext scheduler detects five ML workload phases in real time and applies different kernel policies to each. No code changes needed.

# Training step timeline — what the kernel sees:
#
# Phase:         [Data Loading] [Forward Pass] [Backward Pass] [All-Reduce] [Optimizer]
# CPU priority:   HIGH           LOW (yield)    LOW (yield)     HIGH (tc)    HIGH
# GPU clocks:     33%            100%           100%            50%          100%
# Power limit:    60%            100%           100%            70%          100%
#
# Result: GPU never starves for data. NCCL gets network priority.
# Energy reduced 10-20% with <1% throughput impact.
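A minimal sketch of how phases like these could be inferred from coarse telemetry — the thresholds, phase names, and the merged forward/backward case are illustrative assumptions, not Zernel's actual sched_ext internals:

```python
# Hypothetical phase classifier in the spirit of Zernel's phase-aware
# policies. Inputs are coarse utilization signals; thresholds are
# illustrative assumptions, and forward/backward are merged because they
# are hard to distinguish from utilization alone.

def classify_phase(gpu_util, pcie_rx_gbs, nic_gbs, cpu_util):
    """Map utilization signals to a likely training-step phase."""
    if nic_gbs > 5.0:                        # heavy inter-node traffic
        return "all_reduce"
    if gpu_util < 20 and pcie_rx_gbs > 1.0:  # GPU idle, host feeding data
        return "data_loading"
    if gpu_util > 80 and cpu_util < 30:      # compute-bound kernels
        return "forward_backward"
    if gpu_util > 40:
        return "optimizer"
    return "idle"

print(classify_phase(gpu_util=5, pcie_rx_gbs=3.2, nic_gbs=0.1, cpu_util=90))
# -> data_loading
```

A scheduler acting on these labels would then apply the CPU-priority and GPU-clock policies from the timeline above to each detected phase.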

What you can do right now

GPU Management

Clean nvidia-smi replacement. Real-time process viewer, memory tracking, GPU locking, health checks.

zernel gpu top
zernel gpu kill 0

Experiment Tracking

Automatic metric extraction from training output. SQLite-backed, zero config, ZQL queries.

zernel run train.py
zernel exp compare a b
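The mechanism behind this kind of tracking can be sketched in a few lines: scan training output for metric patterns and append rows to SQLite. The regex, schema, and function names here are illustrative assumptions, not Zernel's implementation:

```python
# Hypothetical sketch of automatic metric extraction: pull key=value
# metrics out of a training log line and store them in SQLite. The
# pattern and schema are assumptions for illustration.
import re
import sqlite3

METRIC_RE = re.compile(r"(loss|acc|lr)[=:]\s*([0-9.eE+-]+)")

def log_metrics(db, run_id, line, step):
    """Extract recognized metrics from one stdout line into the DB."""
    for name, value in METRIC_RE.findall(line):
        db.execute("INSERT INTO metrics VALUES (?, ?, ?, ?)",
                   (run_id, step, name, float(value)))

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE metrics (run TEXT, step INT, name TEXT, value REAL)")
log_metrics(db, "run-a", "step 10: loss=0.532 acc=0.81", step=10)
rows = db.execute("SELECT name, value FROM metrics").fetchall()
print(rows)  # -> [('loss', 0.532), ('acc', 0.81)]
```

Because the store is just SQLite, comparing runs reduces to SQL over the `metrics` table, which is what a query layer like ZQL can sit on top of.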

ML Benchmarks

Prove your hardware performance. GPU TFLOPS, memory bandwidth, DataLoader throughput, ResNet-50.

zernel bench all
zernel bench e2e
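The TFLOPS figure quoted above follows from standard matmul accounting: a square N x N GEMM costs 2N³ floating-point operations, so achieved TFLOPS is that count divided by measured kernel time. A pure-arithmetic sketch (the 7.23 ms timing is back-derived for illustration, not a measurement):

```python
# How a matmul TFLOPS figure is derived: FLOP count of an N x N GEMM
# divided by measured time. No GPU needed for the arithmetic itself.

def matmul_tflops(n, seconds):
    flops = 2 * n ** 3        # multiply-adds in an N x N x N matmul
    return flops / seconds / 1e12

# An A100 finishing a 4096x4096 FP32 matmul in ~7.23 ms:
print(round(matmul_tflops(4096, 7.23e-3), 1))  # -> 19.0
```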

Training Debugger

One command to diagnose why training is slow. GPU, CPU, memory, I/O analysis with fix suggestions.

zernel debug why-slow
zernel debug oom
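A rule-based diagnosis of this kind can be sketched as a small decision function — the signals, thresholds, and suggested fixes below are assumptions for illustration, not the actual `zernel debug` rules:

```python
# Hypothetical sketch of automated slow-training diagnosis: map a few
# utilization signals to a likely bottleneck plus a suggested fix.
# Thresholds are illustrative assumptions.

def diagnose(gpu_util, dataloader_wait_frac, host_to_dev_gbs):
    """Return a one-line bottleneck hypothesis with a fix suggestion."""
    if gpu_util < 50 and dataloader_wait_frac > 0.3:
        return "input-bound: raise DataLoader num_workers / enable pin_memory"
    if gpu_util < 50 and host_to_dev_gbs > 3.0:
        return "transfer-bound: overlap H2D copies with compute"
    if gpu_util > 90:
        return "compute-bound: try FP16/AMP or a larger batch size"
    return "no dominant bottleneck detected"

print(diagnose(gpu_util=35, dataloader_wait_frac=0.45, host_to_dev_gbs=0.5))
# -> input-bound: raise DataLoader num_workers / enable pin_memory
```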

Model Deployment

Deploy to local vLLM, Docker container, or AWS SageMaker. Multi-GPU inference with quantization.

zernel serve start ./model

PQC Security

Quantum-resistant encryption and signatures for model weights. AES-256-GCM + ML-KEM compatible.

zernel pqc sign ./model
zernel pqc encrypt
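The integrity half of this scheme follows a familiar envelope pattern: hash the checkpoint, sign the digest, verify before loading. A toy stdlib sketch — HMAC-SHA256 stands in for a real PQC signature here, and nothing below implements AES-256-GCM or ML-KEM:

```python
# Toy sketch of the sign/verify pattern behind model-weight protection.
# HMAC-SHA256 is a stand-in for a quantum-resistant signature scheme;
# the real commands above use AES-256-GCM and ML-KEM-compatible crypto.
import hashlib
import hmac
import os

def sign(key, weight_bytes):
    """Tag the checkpoint bytes; verification recomputes and compares."""
    return hmac.new(key, weight_bytes, hashlib.sha256).hexdigest()

key = os.urandom(32)
weights = b"\x00" * 1024                  # pretend checkpoint
tag = sign(key, weights)

assert hmac.compare_digest(tag, sign(key, weights))   # verifies
assert tag != sign(key, weights + b"\x01")            # tamper detected
```

The point of signing weights is exactly the tamper case: a single flipped byte in the checkpoint fails verification before the model is ever loaded.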

Fleet Management

GPU fleet dashboard, cost attribution per team, idle detection, capacity planning.

zernel fleet status
zernel fleet costs

Distributed Training

One command for local, SSH multi-node, or Kubernetes PyTorchJob. Automatic NCCL configuration.

zernel job submit --target ssh --nodes 4
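"Automatic NCCL configuration" boils down to setting the standard torch.distributed rendezvous and NCCL environment variables on every node. A sketch of what a launcher must compute per node — the helper itself and the `eth0` interface name are assumptions, but the variable names are standard PyTorch/NCCL settings:

```python
# Hypothetical sketch of the per-node environment a multi-node launcher
# sets up for PyTorch + NCCL. Variable names are standard
# torch.distributed / NCCL settings; values here are illustrative.

def rendezvous_env(master_addr, node_rank, nnodes, gpus_per_node):
    """Environment for one node of an nnodes-wide training job."""
    return {
        "MASTER_ADDR": master_addr,            # rank-0 host
        "MASTER_PORT": "29500",                # torch.distributed default
        "WORLD_SIZE": str(nnodes * gpus_per_node),
        "NODE_RANK": str(node_rank),
        "NCCL_SOCKET_IFNAME": "eth0",          # assumption: NIC name
        "NCCL_DEBUG": "WARN",
    }

env = rendezvous_env("10.0.0.1", node_rank=1, nnodes=4, gpus_per_node=8)
print(env["WORLD_SIZE"])  # -> 32
```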

Compliance

SOC 2 and HIPAA compliance reports. Immutable audit trail, data lineage, model provenance.

zernel audit report --standard soc2

Verified A100 benchmarks

NVIDIA A100-SXM4-80GB | PyTorch 2.10 | CUDA 12.8
GPU Compute (4096x4096)     19.0 TFLOPS
Memory Bandwidth            690 GB/s
Host→Device Transfer        4.5 GB/s
DataLoader (4 workers)      3,413 samples/s
Training Step (FP32)        4.48 ms/step
Training Step (FP16 AMP)    2.49 ms/step (1.8x)
ResNet-50 Training          942 images/s
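The 1.8x AMP figure is just the ratio of the two step times above:

```python
# Sanity check on the quoted AMP speedup: FP32 step time over FP16
# step time, from the benchmark table.
fp32_ms, fp16_ms = 4.48, 2.49
speedup = fp32_ms / fp16_ms
print(f"{speedup:.1f}x")  # -> 1.8x
```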

Why not just use Ubuntu?

                     Stock Ubuntu                        Zernel
CPU Scheduler        CFS (generic)                       sched_ext with 5 ML phase types
GPU Observability    nvidia-smi (poll-based)             5 eBPF probes (real-time, near-zero overhead)
Power Management     Static limits                       Phase-aware dynamic clocks (10-20% savings)
Model Security       Plaintext, RSA keys                 PQC encryption + quantum-resistant signatures
Setup Time           Days (drivers, CUDA, frameworks)    Minutes (pre-installed, pre-validated)
Experiment Tracking  Install MLflow separately           Built-in, zero config
Debugging            nvidia-smi + guessing               zernel debug why-slow (automated)
Cost Tracking        Custom scripts                      zernel fleet costs (built-in)
Compliance           Manual audit                        zernel audit report (one command)

Get started in 5 minutes

$ git clone https://github.com/dyber-pqc/Zernel.git && cd Zernel
$ cargo build --workspace --release
$ export PATH=$PWD/target/release:$PATH

$ zernel doctor               # Check your environment
$ zernel gpu status           # See your GPUs
$ zernel bench quick          # 5-minute benchmark
$ zernel init my-project      # Create a project
$ zernel run train.py         # Train with auto-tracking
$ zernel watch                # Live GPU dashboard

Download Zernel

Bootable ISO — install on bare metal or VM

Server: headless, max GPU memory | Desktop: GNOME + GPU dashboard
Both include: CUDA, PyTorch, JAX, vLLM, Ollama, 50+ Zernel tools

Pre-installed ML stack

Frameworks

PyTorch + CUDA
JAX + CUDA
TensorFlow

LLM

vLLM
Transformers
PEFT, TRL, bitsandbytes

Distributed

DeepSpeed
FairScale
ColossalAI

Developer

JupyterLab
TensorBoard
W&B, MLflow, Gradio

RAG

LangChain
ChromaDB
FAISS-GPU

Local LLM

Ollama
Llama 3.1 8B
(works offline)