SambaNova Partners with Intel to Launch Disaggregated Agentic AI Inference Solution

Time: 2026.04.14

Published: 2026-04-14 | Global Semiconductor & AI Industry News

SambaNova is partnering with Intel to build a disaggregated inference architecture for agentic AI. The joint solution distributes LLM prefill and decode workloads across Xeon 6 CPUs, SN50 RDUs, and existing GPUs, with the aim of improving inference efficiency for enterprise AI deployments. It is scheduled to launch in the second half of 2026.

Building on a previously reached strategic cooperation agreement, SambaNova is working with Intel on a new disaggregated inference solution aimed primarily at fast-growing agentic AI workloads. The architecture uses a hybrid deployment model that combines SambaNova RDU server racks, Intel Xeon 6 CPU server racks, and the general-purpose GPUs already installed in data centers.

Figure: SambaNova & Intel Disaggregated Inference Solution vs. Competitors (Source: SambaNova Official Public Materials)

Highly interactive applications such as AI-assisted code generation place greater demands on inference response speed and system compatibility, which is making disaggregated inference a mainstream trend in the industry. Rodrigo Liang, CEO of SambaNova, said that as AI models continue to grow in scale, market demand for low-latency, high-efficiency inference will keep rising.

Within the joint solution, Intel Xeon 6 CPUs handle core tasks such as agent scheduling, system resource orchestration, and integration with traditional enterprise business systems. They connect AI agents to enterprise databases, transaction systems, and long-running business platforms, so that new AI applications can work alongside existing enterprise systems.

The industry has converged on a clear approach to splitting large language model inference work. The prefill stage, which parallelizes well, can run on the GPUs already installed in data centers, while the SambaNova SN50 RDU is dedicated to the decode stage. This division lets each type of hardware play to its strengths.
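
The prefill/decode split described above can be illustrated with a toy Python sketch. It is not SambaNova's or Intel's implementation, and it uses no real serving framework: the class names (`PrefillWorker`, `DecodeWorker`, `Orchestrator`, `KVCache`) and the placeholder "model" are hypothetical, chosen only to show how a coordinator might hand a request from a prefill device to a decode device.

```python
from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Stand-in for the attention key/value state produced by prefill."""
    request_id: str
    entries: list = field(default_factory=list)


class PrefillWorker:
    """Hypothetical GPU-side worker: processes the whole prompt in one pass."""

    def run(self, request_id, prompt_tokens):
        # Real prefill attends over all prompt tokens in parallel; here each
        # prompt token simply contributes one placeholder cache entry.
        return KVCache(request_id, [("kv", t) for t in prompt_tokens])


class DecodeWorker:
    """Hypothetical RDU-side worker: generates one token per step."""

    def run(self, cache, max_new_tokens):
        generated = []
        for _ in range(max_new_tokens):
            # Toy "model": the next token is the current cache length.
            next_token = len(cache.entries)
            cache.entries.append(("kv", next_token))
            generated.append(next_token)
        return generated


class Orchestrator:
    """Hypothetical CPU-side coordinator that routes each phase."""

    def __init__(self, prefill, decode):
        self.prefill = prefill
        self.decode = decode

    def generate(self, request_id, prompt_tokens, max_new_tokens):
        # Phase 1: build the cache on the prefill device.
        cache = self.prefill.run(request_id, prompt_tokens)
        # A real system would transfer the cache between devices here.
        # Phase 2: generate tokens on the decode device.
        return self.decode.run(cache, max_new_tokens)


orchestrator = Orchestrator(PrefillWorker(), DecodeWorker())
tokens = orchestrator.generate("req-1", [101, 102, 103], max_new_tokens=4)
print(tokens)
```

In this sketch the only data crossing the boundary between the two workers is the cache object, which is the handoff a disaggregated deployment has to make efficient.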

Unlike the mainstream inference architectures on the market today, SambaNova runs the entire decode stage on its in-house RDU chips. This design avoids frequent data exchange and transfer overhead between different types of accelerators. The RDU systems are also air-cooled and draw less than 30 kilowatts per rack, which gives operators more flexibility in data center layout and deployment.

To encourage adoption of heterogeneous disaggregated architectures, the industry is working to standardize software interfaces. Open-source tools such as vLLM and SGLang, along with the vendor-neutral NIXL transfer library, are steadily maturing, which greatly reduces the difficulty of connecting prefill and decode modules that run on hardware from different vendors.

On the hardware side, the Intel Xeon 6 CPU will also serve as the host processor for SambaNova SN50 RDU products, replacing the previous host configuration. After basic system scheduling is complete, the solution can independently move all core compute workloads onto the RDU, eliminating the performance bottleneck caused by PCIe bus transfers.

The two companies have begun joint software debugging and scenario validation, and have deployed SambaNova hardware in Intel's test environment. Once all rounds of performance optimization and compatibility testing are complete, the joint disaggregated AI inference system is scheduled to reach the market in the second half of 2026.