TL;DR
- Nebius (NBIS) has entered into an agreement to purchase Eigen AI, a model optimization and inference specialist, for roughly $643 million through a combination of cash and Class A shares.
- The acquisition will bring Eigen AI’s optimization capabilities into Nebius Token Factory, the company’s enterprise-focused managed inference solution.
- Eigen AI’s founders, who come from MIT’s HAN Lab, will lead the establishment of Nebius’s first Bay Area engineering and research hub.
- Collaborative optimization work between both organizations has already achieved top rankings on Artificial Analysis performance benchmarks.
- NBIS shares climbed 8.51% to reach $150.00 following the announcement, recovering from a 6.07% weekly decline.
On May 1, 2026, Nebius (NBIS) revealed its intention to purchase Eigen AI in a transaction valued at approximately $643 million. The consideration includes both cash and Nebius Class A shares, calculated using the company’s 30-day volume-weighted average share price at the time of agreement. Following the disclosure, NBIS shares surged 8.51% to $150.00.
The deal is expected to close in the coming weeks, subject to antitrust approval and customary closing conditions.
Eigen AI specializes in inference optimization and model performance enhancement. The company’s platform enables AI development teams to deploy open-source models with superior speed and reduced costs in production environments, eliminating the need for internal optimization infrastructure.
Nebius plans to integrate Eigen AI’s technology into Token Factory, its managed inference service. Token Factory provides auto-scaling API endpoints and fine-tuning for leading open-source models, including Llama, DeepSeek, Qwen, and Gemma.
The partnership between these organizations predates the acquisition announcement. Prior collaboration resulted in jointly optimized model versions that achieved top-tier speed rankings on Artificial Analysis, a prominent independent AI performance evaluation platform.
What Eigen AI Brings to the Table
Eigen AI emerged from MIT’s HAN Lab research group. The company’s co-founders, Ryan Hanrui Wang and Wei-Chen Wang, developed two breakthrough methodologies now fundamental to modern AI deployment practices.
Ryan’s sparse-attention architecture, SpAtten, is the most-cited HPCA paper published since 2020. Wei-Chen’s Activation-aware Weight Quantization (AWQ) won the MLSys 2024 Best Paper Award and has become the de facto industry standard for 4-bit model deployment.
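To make the 4-bit deployment idea concrete, here is a minimal sketch of group-wise symmetric 4-bit weight quantization, the basic mechanism AWQ builds on. AWQ itself goes further, using activation statistics to rescale salient weight channels before quantizing; the function names and group size below are illustrative, not Eigen AI’s implementation.

```python
import numpy as np

def quantize_4bit(weights, group_size=128):
    """Symmetric group-wise 4-bit quantization: each group of
    `group_size` consecutive weights shares one scale factor.
    (Illustrative sketch only; AWQ additionally rescales
    activation-salient channels before this step.)"""
    flat = weights.reshape(-1, group_size)
    # Signed 4-bit integers cover [-8, 7]; map each group's max
    # absolute value onto 7 so the full range is representable.
    scales = np.maximum(np.abs(flat).max(axis=1, keepdims=True) / 7.0, 1e-8)
    q = np.clip(np.round(flat / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize_4bit(q, scales, shape):
    """Recover approximate float weights from the 4-bit integers."""
    return (q.astype(np.float32) * scales).reshape(shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 256)).astype(np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s, w.shape)
max_err = float(np.abs(w - w_hat).max())  # bounded by scale / 2 per group
```

Each group of 128 weights shares one float scale, so storage falls from 32 bits to roughly 4.25 bits per weight, at the cost of a bounded rounding error; that trade-off is what makes 4-bit inference attractive on bandwidth-limited GPUs.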
Third co-founder Di Jin earned his PhD from MIT CSAIL and played a key role in developing Meta’s Llama 3 and Llama 4 post-training processes. He also co-developed the CGPO reinforcement learning from human feedback methodology.
Upon transaction completion, the Eigen AI team will establish operations in the San Francisco Bay Area, marking Nebius’s first engineering and research footprint in the United States.
The Inference Market Context
Inference workloads are the fastest-growing segment of AI computing. Industry projections suggest inference will account for roughly two-thirds of total AI compute demand in 2026.
Running inference efficiently is technically demanding: it spans model-representation optimization, GPU kernel tuning, and dynamic workload scheduling, specialized capabilities that most in-house engineering teams lack.
Open-source models compound the difficulty because they typically ship without performance optimization. Emerging architectures such as Mixture-of-Experts and Compressed Sparse Attention add memory-bandwidth and compute-efficiency obstacles that demand specialized expertise.
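As one illustration of why such architectures strain memory bandwidth: a Mixture-of-Experts layer routes each token to only a few experts, yet every expert’s weights must remain resident and be fetched on demand. A generic top-k routing sketch follows; it is not any particular model’s router, and the names and sizes are illustrative.

```python
import numpy as np

def top_k_route(router_logits, k=2):
    """Generic MoE gating: keep the k highest-scoring experts per
    token and softmax-normalize their gate weights."""
    idx = np.argsort(router_logits, axis=-1)[:, -k:]          # chosen experts
    picked = np.take_along_axis(router_logits, idx, axis=-1)
    gates = np.exp(picked - picked.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)
    return idx, gates

rng = np.random.default_rng(1)
num_tokens, num_experts = 4, 8
logits = rng.normal(size=(num_tokens, num_experts))
expert_idx, gate_weights = top_k_route(logits)
# Only 2 of the 8 experts compute per token, but all 8 experts'
# weight matrices must be held in (and streamed from) GPU memory --
# the bandwidth pressure the article alludes to.
```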
Eigen AI’s optimization stack spans post-training enhancement, fine-tuning workflows, and production inference across the major open-source model families. Its kernel-level and model-specific techniques extract more performance from existing hardware without added engineering effort.