Cerebras Systems announced it’s now running Kimi K2.6 in customer trials, marking the first time the chip company has operated a one-trillion-parameter open-weight model while reaching speeds close to 1,000 tokens per second.
The Sunnyvale, California-based firm went public last Thursday in the biggest U.S. technology offering so far this year. Shares were priced at $185 each, well above the $115 to $125 range initially planned.
When trading started on May 14, the stock jumped to $350 before settling at $311.07 by day’s end, a 68 percent increase from the offering price.
Demand exceeded the shares on offer by more than 20 times, as Cryptopolitan previously reported, and the listing gave Cerebras a market value of roughly $64 billion.
After the opening-day surge, shares dropped 10.1 percent on Friday to close at $279.72. The stock recovered Monday with a 6.1 percent gain to $296.65, then added another 2.4 percent Tuesday to finish at $303.63.
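For readers who want to check the daily moves, the percentages follow directly from the closing prices quoted above. The short Python sketch below is only an illustration; the `pct_change` helper and the `closes` list are names introduced here, and the prices are the ones cited in this article.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100.0

# Offering price followed by the closing prices cited in the article.
closes = [
    ("IPO price", 185.00),
    ("Thursday", 311.07),
    ("Friday", 279.72),
    ("Monday", 296.65),
    ("Tuesday", 303.63),
]

# Compare each close with the previous entry in the list.
moves = [
    (label, round(pct_change(prev, cur), 1))
    for (_, prev), (label, cur) in zip(closes, closes[1:])
]

for label, move in moves:
    print(f"{label}: {move:+.1f}%")
```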
The Tuesday gain came after S&P Dow Jones Indices approved Cerebras for early inclusion in its indexes because of the company’s large market capitalization.
ARK Investment Management, led by Cathie Wood, purchased 149,176 shares worth approximately $46.4 million through its ARK Innovation and ARK Next Generation Internet funds, according to trading records released Friday.
The firm had made a smaller purchase the day before.
Testing firm Artificial Analysis recorded Cerebras running K2.6 at 981 output tokens per second, 6.7 times faster than the closest GPU-based cloud competitor and 23 times faster than the typical provider.
For a request involving 10,000 input tokens that includes prompt processing, reasoning, and producing 500 output tokens, Cerebras completed the full response in 5.6 seconds compared to 163.7 seconds on the official Kimi system, a 29-fold improvement.
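The 29-fold figure is simply the ratio of the two response times. As a rough illustration, the Python sketch below reproduces it and also estimates how long 500 output tokens would take at the 981 tokens-per-second rate. Combining those two measurements is an assumption made here, not something Artificial Analysis reported, and the function names are hypothetical.

```python
def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds

def decode_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to emit the output tokens at a steady generation rate."""
    return output_tokens / tokens_per_second

# Figures cited in the article.
cerebras_total = 5.6      # seconds, full response on Cerebras
official_total = 163.7    # seconds, full response on the official Kimi system

ratio = speedup(official_total, cerebras_total)

# Assumption: the 981 tokens/s rate applies to the 500 final output tokens.
decode = decode_seconds(500, 981)

print(f"Speedup: {ratio:.1f}x")
print(f"Estimated time for 500 output tokens: {decode:.2f} s")
```

Under that assumption, the final answer accounts for only about half a second of the 5.6-second total, with the remainder going to prompt processing and reasoning.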
K2.6 ranks at the top of SWE-Bench Pro with a score of 58.6, performing better than Claude Opus 4.6 and matching GPT-5.4. The model also leads on tasks involving agents, including Humanity’s Last Exam and DeepSearchQA benchmarks.
Cerebras builds its technology around the Wafer-Scale Engine, which differs from typical chip designs. The third-generation system uses an entire silicon wafer instead of cutting it into smaller individual chips, making it about 57 times larger than the biggest graphics processing unit available today.
The wafer holds 4 trillion transistors, 900,000 cores, and 44 gigabytes of built-in memory.
Cerebras says this design can run workloads more than 15 times faster than conventional GPU-based systems.
Revenue climbed 76 percent in 2025 to $510 million, up from $290 million in 2024, according to documents filed for the public offering.
Cerebras also turned profitable, reporting net income of $238 million in 2025 after losing nearly $482 million the previous year.
Major customers have signed on recently. In January, Cerebras reached a multi-year agreement with OpenAI worth more than $20 billion. Under the deal, OpenAI will use 750 megawatts of Cerebras computing capacity through 2028. OpenAI also extended a $1 billion working capital loan to help fund the infrastructure buildout.
In March, Amazon signed a binding agreement making Amazon Web Services the first major cloud provider to put Cerebras systems in its data centers, with the service available through Bedrock.
“OpenAI and Cerebras have agreed to co-design future models for future Cerebras hardware,” CEO Andrew Feldman wrote in a letter included with the offering documents.