Chip manufacturer Intel announced its AI strategy at the 2025 OCP Summit, emphasizing that it will redouble its investment in the AI field with a comprehensive "open approach" to cope with the major transformation the industry is current...
Chip manufacturer Intel announced its AI strategy at the 2025 OCP Summit, emphasizing that it will redouble its investment in the AI field with a comprehensive "open approach" to cope with the major transformation the industry is currently undergoing. Intel points out that AI is a huge disruption that only occurs once in decades. The goal is to work with partners across industry and ecosystem to build an open, modular, and scalable AI platform that delivers the scale needed to monetize AI and transform daily life and business operations.
Focusing on inference and agentic AI to monetize investment
Intel made it clear that its strategy is strictly focused on two specific workloads: inference and agentic AI. Intel firmly believes that the hundreds of billions of dollars in AI investment will be monetized through agentic intelligence that transforms business operations and people's daily lives.
Intel pointed out that industry demand for tokens is growing at an alarming rate. A recent report from a major cloud service provider shows that the volume of tokens processed per month has reached 1.4 quadrillion, an increase of more than 100 times in a little over a year. This exponential growth means the industry urgently needs to solve the problem of "token economics": how to deliver intelligence and services efficiently and at scale.
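To put that growth rate in perspective, a quick back-of-envelope calculation shows the compound monthly growth implied by the report's figures. The 14-month window below is an assumption (the article says only "a little over a year"); the numbers are illustrative, not Intel's.

    # Back-of-envelope: implied compound monthly growth if token volume
    # grew 100x in roughly 14 months (assumed window, illustrative only).
    months = 14
    total_growth = 100.0
    monthly_rate = total_growth ** (1 / months) - 1
    print(f"Implied growth: {monthly_rate:.1%} per month")
    # ~38.9% per month, i.e. volume doubles roughly every two months.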
Criticizing vertically integrated architecture, calling for open heterogeneous systems
However, current AI applications (such as chatbots) are mostly deployed on homogeneous, vertically integrated systems that rely on proprietary networks and software. Intel believes this architecture cannot scale effectively. Furthermore, agentic AI involves multiple models, tool calls, data processing, and environment/sandbox requirements (such as virtual machines or API calls), potentially generating two to three orders of magnitude more tokens than a simple chatbot interaction.
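A rough illustration of where that two-to-three-orders-of-magnitude figure can come from: an agent task chains many model calls, each re-reading context and emitting output. All figures below are assumed for illustration, not from Intel.

    # Illustrative estimate (assumed figures): token amplification of an
    # agentic task versus a single chatbot turn.
    chat_tokens = 1_000          # one chatbot response
    steps = 30                   # planning/tool-call iterations per agent task
    models_per_step = 2          # e.g. a planner model plus a worker model
    tokens_per_call = 4_000      # context re-read plus generated output
    agent_tokens = steps * models_per_step * tokens_per_call
    print(agent_tokens / chat_tokens)   # ~240x, i.e. 2+ orders of magnitude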
Agentic AI workloads are extremely diverse. Even a single large language model inference call contains two stages, "prefill" and "decode": prefill requires a compute-optimized accelerator, while decode requires a memory-bandwidth-optimized GPU. In addition, agents require a CPU to handle environment operations (such as executing a coding agent's test code), tool calls, and security protection. Since each component has different requirements for compute, memory bandwidth, and networking, a "one size fits all" homogeneous, vertically integrated architecture no longer applies. What the industry really needs is a more flexible, open, and heterogeneous infrastructure.
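A minimal sketch of the disaggregation idea described above: each stage of a request is routed to the hardware class it is bound by. The pool names and Stage type are hypothetical illustrations, not Intel's APIs.

    from dataclasses import dataclass

    # Hypothetical hardware pools; labels are illustrative, not Intel's.
    POOLS = {
        "prefill": "compute-optimized accelerator",  # FLOPS-bound: reads the whole prompt
        "decode":  "bandwidth-optimized GPU",        # memory-bound: one weight sweep per token
        "tools":   "CPU",                            # sandboxed tool calls, agent environment
    }

    @dataclass
    class Stage:
        name: str
        kind: str   # which pool this stage needs

    def place(stages):
        """Map each stage of an inference/agent request to a hardware pool."""
        return {s.name: POOLS[s.kind] for s in stages}

    request = [
        Stage("prompt prefill", "prefill"),
        Stage("token decode", "decode"),
        Stage("run unit tests", "tools"),
    ]
    print(place(request))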
A unified software stack and orchestration are key to heterogeneity
Intel pointed out that the key challenge in achieving this flexibility and heterogeneity lies in software. Software must hide the complexity of the underlying heterogeneity and provide a frictionless experience for applications and developers. Intel is therefore building a unified software stack, compiler, and orchestration infrastructure, with the goal that developers need not change their code: applications will "just run" whether they are built with PyTorch, Hugging Face, or LangChain. The system is responsible for profiling agentic workloads, placing different components on the correct hardware type, and coordinating them to meet end-to-end service-level agreements (SLAs). Intel expects to roll out this infrastructure in the fourth quarter.
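As a toy illustration of the "just run" goal, the PyTorch snippet below keeps the application code device-agnostic and lets a trivial stand-in for the orchestration layer pick the backend. This is a sketch of the concept, not Intel's actual stack; the best_device helper is hypothetical.

    import torch

    def best_device() -> torch.device:
        # Trivial stand-in for an orchestration layer choosing a backend.
        if torch.cuda.is_available():                             # e.g. an NVIDIA node
            return torch.device("cuda")
        if hasattr(torch, "xpu") and torch.xpu.is_available():    # Intel GPU backend
            return torch.device("xpu")
        return torch.device("cpu")

    device = best_device()
    model = torch.nn.Linear(4096, 4096).to(device)   # application code unchanged
    x = torch.randn(1, 4096, device=device)
    y = model(x)
    print(y.shape, device)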
Intel and its partners have run benchmarks to demonstrate the benefits of an open system architecture. They connected an Nvidia GPU system (handling the compute-optimized prefill stage) with an Intel accelerator system (handling the memory-bandwidth-optimized decode stage) over an Ethernet network. On the same workload, this simple combination of heterogeneous systems delivered at least a 1.7x performance-per-dollar advantage over a homogeneous, vertically integrated system.
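The metric behind that claim is straightforward. The sketch below shows how such a comparison is computed; all throughput and cost numbers are assumed for illustration and are not from the benchmark itself.

    # Illustrative performance-per-dollar comparison (all figures assumed).
    def perf_per_dollar(tokens_per_sec: float, system_cost: float) -> float:
        return tokens_per_sec / system_cost

    homogeneous = perf_per_dollar(tokens_per_sec=10_000, system_cost=1_000_000)
    heterogeneous = perf_per_dollar(tokens_per_sec=8_500, system_cost=500_000)
    print(f"advantage: {heterogeneous / homogeneous:.2f}x")   # 1.70x in this example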
Next-generation data center GPU "Crescent Island" to sample in 2026
In addition to its software efforts, Intel will continue to innovate in the underlying hardware to provide more options. Intel announced its next-generation data center GPU, codenamed "Crescent Island," which is optimized for inference and agentic AI and designed to deliver the best token economics and performance per dollar.
Intel emphasized that Crescent Island is built on its general-purpose, programmable Xe3P GPU IP. It will be power-efficient, equipped with LPDDR memory, and strike a good balance between memory capacity and bandwidth. The GPU, combining just the right mix of compute with memory capacity and bandwidth for its target workloads, is expected to be available for sampling in the second half of 2026.
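Why memory bandwidth matters so much here: decode is memory-bound, so a common rule of thumb caps per-stream token rate at roughly bandwidth divided by model size in bytes. The figures below are assumed for illustration and are not Crescent Island specifications.

    # Rule-of-thumb decode ceiling (assumed figures, not product specs):
    # each generated token requires streaming the model weights once.
    model_params = 70e9        # a 70B-parameter model
    bytes_per_param = 1        # int8/FP8 weights
    mem_bandwidth = 1.0e12     # 1 TB/s aggregate, illustrative
    tokens_per_sec = mem_bandwidth / (model_params * bytes_per_param)
    print(f"~{tokens_per_sec:.0f} tokens/s per stream upper bound")  # ~14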
In addition, CPUs continue to play a key role in agentic AI workloads, and Intel emphasizes the importance of x86 as the mainstream CPU platform. Through the x86 Ecosystem Advisory Group, Intel is working openly with AMD and many other partners to promote standardization and compatibility across the x86 ecosystem, such as standardizing interrupt handling and AVX vector instruction extensions. On networking, Intel reiterated its support for open Ethernet standards, which are critical to stitching all the different components together into an optimized system, and it actively participates in OCP and the "Open AI System" effort.