AI-Native Systems Research¶
What if a system could observe its own behavior, hypothesize improvements, validate them, and deploy — continuously, at machine speed?
The Vision¶
Modern software systems serving AI workloads are extraordinarily complex and must evolve under relentless pressure — new models, new hardware, changing usage patterns, shifting objectives. Today, even with powerful AI tools, every improvement is mediated by humans step by step. This human-mediated loop has become the bottleneck.
AI-Native Systems close this loop. In an AI-Native System, AI is the primary agent of continuous creation, evolution, and operation. Humans define objectives, constraints, and governance — while the system continuously executes within those boundaries.
```mermaid
graph LR
    O[Observe] --> R[Reason]
    R --> C[Change]
    C --> E[Experiment]
    E --> V[Validate]
    V --> D[Deploy]
    D --> O
```
The continuous meta-loop: from observation to deployment, then back again.
Architecture¶
An AI-native system consists of two parts:
System Under Control¶
The software system that delivers business value and is subject to continuous evolution — inference platforms, kernel pipelines, storage systems. It is not necessarily an AI system itself.
Controlling System¶
The agentic, AI-driven layer that continuously improves the System Under Control. It comprises two components: a Reasoner (observes, hypothesizes, proposes goals) and a Changer (plans, experiments, produces artifacts).
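To make the split concrete, here is a minimal sketch of one iteration of the meta-loop, with the Reasoner and Changer as separate components. All names (`Proposal`, `Reasoner.observe`, `Changer.experiment`, the latency heuristic) are illustrative assumptions, not part of any actual AI-Native Systems codebase:

```python
from dataclasses import dataclass, field

@dataclass
class Proposal:
    """A hypothesized improvement with its provenance: what, why, and evidence."""
    goal: str
    rationale: str
    evidence: list = field(default_factory=list)

class Reasoner:
    """Observes the System Under Control, hypothesizes, and proposes goals."""
    def observe(self, metrics: dict):
        # Toy heuristic: treat high tail latency as a latent optimization opportunity.
        if metrics.get("p99_latency_ms", 0) > 200:
            return Proposal(
                goal="reduce p99 latency below 200 ms",
                rationale=f"observed p99 = {metrics['p99_latency_ms']} ms",
            )
        return None  # nothing to improve this cycle

class Changer:
    """Plans experiments for a proposal and produces candidate artifacts."""
    def experiment(self, proposal: Proposal) -> dict:
        # Placeholder: a real Changer would generate a change, run it in a
        # simulation environment, and attach the resulting evidence.
        proposal.evidence.append("simulated run: p99 fell to 180 ms")
        return {"change": "tune cache configuration", "goal": proposal.goal}

def control_loop(metrics: dict):
    """One pass: observe -> reason -> change -> experiment.

    Validation and deployment would gate on proposal.evidence before rollout.
    """
    reasoner, changer = Reasoner(), Changer()
    proposal = reasoner.observe(metrics)
    if proposal is None:
        return None
    return changer.experiment(proposal)
```

In a real system the heuristic inside `Reasoner.observe` would be an agentic AI model and `Changer.experiment` would drive actual experiments, but the governed hand-off — a proposal with rationale and evidence crossing the boundary between the two components — is the point of the architecture.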
Key Principles¶
- Continuous, proactive evolution — not just reactive to failures, but seeking latent optimization opportunities
- Governed autonomy — every change has complete provenance: what, why, and evidence
- Spec-driven development — specifications are live documents that evolve with the system
- Experimentation as a first-class activity — exploring a space of possibilities, not relying on single proposed fixes
- Hyper-specialization — systems optimized for how they are actually used in each specific deployment
- Simulation environments — enabling rapid evolution and verification when real-world experimentation is too costly (e.g., BLIS, our high-fidelity llm-d simulator)
Technical Domains¶
We are applying the AI-native vision to three initial domains:
llm-d¶
A Kubernetes-native distributed LLM inference framework. AI-native approaches drive intelligent scheduling, KV-cache optimization, and continuous performance tuning.
AI-Generated Kernels¶
Autonomous generation and optimization of compute kernels for GPUs and accelerators — driven by workload observations, evolutionary techniques, and continuous experimentation.
Storage Systems¶
Applying spec-driven development and AI-native continuous improvement to storage infrastructure — enabling domain-specific, self-optimizing, and workload-aware storage systems.
Latest from the Blog¶
Check our blog for the latest posts on AI-Native Systems research, progress updates, and deep dives into specific domains.
AI-Native Systems Research · Apache 2.0