
AI-Native Systems Research

What if a system could observe its own behavior, hypothesize improvements, validate them, and deploy — continuously, at machine speed?


The Vision

Modern software systems serving AI workloads are extraordinarily complex and must evolve under relentless pressure — new models, new hardware, changing usage patterns, shifting objectives. Today, even with powerful AI tools, every improvement is mediated by humans step by step. This human-mediated loop has become the bottleneck.

AI-native systems close this loop. In an AI-native system, AI is the primary agent of continuous creation, evolution, and operation. Humans define objectives, constraints, and governance — while the system continuously executes within those boundaries.

```mermaid
graph LR
    O[Observe] --> R[Reason]
    R --> C[Change]
    C --> E[Experiment]
    E --> V[Validate]
    V --> D[Deploy]
    D --> O
```

The continuous meta-loop: from observation to deployment, then back again.
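The loop above can be sketched in code. This is a minimal, hypothetical illustration — stub stages and a toy latency metric, with all names invented for this sketch — not an actual controlling-system implementation:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the Observe -> Reason -> Change -> Experiment
# -> Validate -> Deploy loop. In a real controlling system each stage
# would be an agentic AI component; here each is a stub so the control
# flow is runnable end to end.

@dataclass
class SystemUnderControl:
    latency_ms: float = 120.0
    history: list = field(default_factory=list)

    def observe(self) -> dict:
        return {"latency_ms": self.latency_ms}

def reason(metrics: dict):
    # Hypothesize: if latency exceeds a target, propose a tuning goal.
    if metrics["latency_ms"] > 100.0:
        return {"goal": "reduce latency", "proposed_delta_ms": -30.0}
    return None

def experiment(system: SystemUnderControl, change: dict) -> float:
    # Trial the change without deploying it (e.g., in a simulator).
    return system.latency_ms + change["proposed_delta_ms"]

def validate(trial_latency: float, metrics: dict) -> bool:
    # Accept only changes whose evidence shows an improvement.
    return trial_latency < metrics["latency_ms"]

def meta_loop(system: SystemUnderControl, max_iters: int = 5) -> SystemUnderControl:
    for _ in range(max_iters):
        metrics = system.observe()          # Observe
        change = reason(metrics)            # Reason
        if change is None:
            break                           # objectives met; keep observing
        trial = experiment(system, change)  # Change + Experiment
        if validate(trial, metrics):        # Validate
            system.latency_ms = trial       # Deploy
            system.history.append(change)   # record what changed and why
    return system

sys_under_control = meta_loop(SystemUnderControl())
print(sys_under_control.latency_ms)  # 90.0 after one accepted change
```

One accepted change brings latency under the target, after which the reasoner stops proposing changes — the loop keeps observing rather than churning.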


Architecture

An AI-native system consists of two parts:

System Under Control

The software system that delivers business value and is subject to continuous evolution — inference platforms, kernel pipelines, storage systems. It is not necessarily an AI system itself.

Controlling System

The agentic AI-driven layer that continuously improves the System Under Control. It has two functions: a Reasoner (observes, hypothesizes, proposes goals) and a Changer (plans, experiments, produces artifacts).
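The division of labor between the two functions can be sketched with hypothetical stubs — every class, method, and field name here is an illustrative assumption, not an existing API:

```python
# Illustrative sketch of the Controlling System's two functions. A real
# Reasoner and Changer would be agentic AI components; these stubs only
# show the division of responsibility.

class Reasoner:
    """Observes telemetry, hypothesizes improvements, proposes goals."""

    def propose_goal(self, telemetry: dict):
        # Toy hypothesis: high tail latency suggests a scheduling problem.
        if telemetry.get("p99_latency_ms", 0) > 200:
            return "reduce p99 latency below 200 ms"
        return None

class Changer:
    """Plans, experiments, and produces artifacts for a given goal."""

    def produce_artifact(self, goal: str) -> dict:
        # In practice: generate code/config, run experiments, validate,
        # and emit an artifact with its supporting evidence.
        return {"goal": goal, "artifact": "tuned-scheduler-config.yaml"}

reasoner, changer = Reasoner(), Changer()
goal = reasoner.propose_goal({"p99_latency_ms": 250})
artifact = changer.produce_artifact(goal)
print(artifact["artifact"])  # tuned-scheduler-config.yaml
```

The point of the split is that the Reasoner decides *what* should improve while the Changer decides *how* — either side can be upgraded independently.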


Key Principles

  • Continuous, proactive evolution — not just reactive to failures, but seeking latent optimization opportunities
  • Governed autonomy — every change has complete provenance: what, why, and evidence
  • Spec-driven development — specifications are live documents that evolve with the system
  • Experimentation as a first-class activity — exploring a space of possibilities, not relying on single proposed fixes
  • Hyper-specialization — systems optimized for how they are actually used in each specific deployment
  • Simulation environments — enabling rapid evolution and verification when real-world experimentation is too costly (e.g., BLIS, our high-fidelity llm-d simulator)
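
Governed autonomy implies that every accepted change carries a provenance record: what changed, why, and the evidence behind it. A minimal sketch of such a record, with illustrative field names and example values (nothing here reflects an actual schema):

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical provenance record for "governed autonomy": every change
# the controlling system deploys carries what changed, why it changed,
# and the experimental evidence that justified it.

@dataclass(frozen=True)
class ProvenanceRecord:
    what: str        # the artifact or parameter that changed
    why: str         # the goal/hypothesis that motivated the change
    evidence: dict   # experiment results supporting the change
    timestamp: str   # when the change was deployed

record = ProvenanceRecord(
    what="scheduler.batch_size: 8 -> 16",
    why="hypothesis: larger batches raise GPU utilization under current load",
    evidence={"trial_throughput_tps": 1450, "baseline_throughput_tps": 1210},
    timestamp=datetime.now(timezone.utc).isoformat(),
)
print(record.what)
```

Because the record is immutable and attached to the change itself, an auditor (human or machine) can reconstruct the full chain of reasoning behind any state of the system.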

Technical Domains

We are applying the AI-native vision to three initial domains:

llm-d

A Kubernetes-native distributed LLM inference framework. AI-native approaches drive intelligent scheduling, KV-cache optimization, and continuous performance tuning.


AI-Generated Kernels

Autonomous generation and optimization of compute kernels for GPUs and accelerators — driven by workload observations, evolutionary techniques, and continuous experimentation.


Storage Systems

Applying spec-driven development and AI-native continuous improvement to storage infrastructure — enabling domain-specific, self-optimizing, and workload-aware storage systems.



Latest from the Blog

Check our blog for the latest posts on AI-Native Systems research, progress updates, and deep dives into specific domains.


AI-Native Systems Research · Apache 2.0
