Nebius has announced plans to acquire Eigen AI, a company focused on inference and model optimization, in a transaction valued at approximately $643 million. The move reflects a broader shift in ...
Hyperscalers and AI companies have been turning toward specialized processors to run inference workloads in the cloud. Arm Holdings' chip design architectures have gained immense popularity among ...
Google said this week that its research on a new compression method could reduce the amount of memory required to run large language models by a factor of six. SK Hynix, Samsung and Micron shares fell as ...
Stanford adjunct professor and successfully exited founder Zain Asgar just raised an $80 million Series A for a startup that solves the AI inference bottleneck problem in an astute way. The round was ...
Nvidia, a manufacturer of computer chips, is the most valuable company in the world. It owes its success to the versatility of the graphics processing unit (GPU), a chip it pioneered in the late 1990s ...
A significant shift is under way in artificial intelligence, and it has huge implications for technology companies big and small. For the past half-decade, most of the focus in AI has been on training ...
Nvidia CEO Jensen Huang debuted a new AI inference system during his GTC conference keynote. The product incorporates technology from Groq, with which Nvidia made a $20 billion deal. The chip can ...
Every GPU cluster has dead time. Training jobs finish, workloads shift and hardware sits dark while power and cooling costs keep running. For neocloud operators, those empty cycles are lost margin.
Google has officially released TensorFlow 2.21. The most significant update in this release is the graduation of LiteRT from its preview stage to a fully production-ready stack. Moving forward, LiteRT ...
As training costs soar, Microsoft is betting its latest chip on running models efficiently, not teaching them. Maia 200 is a custom application-specific integrated ...
The creators of the open source project vLLM have announced that they transitioned the popular tool into a VC-backed startup, Inferact, raising $150 million in seed funding at an $800 million ...