The design can run a large neural network more efficiently than banks of interconnected GPUs. But manufacturing and operating the chip posed challenges of their own, requiring new methods for etching features into silicon, a design that builds in redundancy to tolerate manufacturing defects, and a new water-cooling system to keep the giant chip cool.
To build a cluster of WSE-2 chips that can run record-breaking AI models, Cerebras had to solve another engineering challenge: getting data in and out of the chip efficiently. Ordinary chips keep their memory on board, but Cerebras developed an off-chip memory box called MemoryX. The company also created software that lets a neural network be stored partly in that off-chip memory, with only the computations streamed to the silicon chip. And it built a hardware-and-software system called SwarmX that ties everything together.
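The idea of keeping a model in external memory and streaming it to the compute device one piece at a time can be illustrated with a minimal sketch. This is not Cerebras's actual software; the class and function names (`ExternalWeightStore`, `forward`) are hypothetical, and the "device" here is just ordinary Python code:

```python
class ExternalWeightStore:
    """Stands in for an off-chip memory unit like MemoryX:
    all of the model's weights live here, not on the device."""
    def __init__(self, layer_weights):
        self._layers = layer_weights

    def fetch(self, layer_idx):
        # Stream a single layer's weights to the device on demand.
        return self._layers[layer_idx]

def matvec(weights, vec):
    """One dense layer: weights is a list of rows, vec a list of floats."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def forward(store, num_layers, activations):
    # Only one layer's weights occupy "on-chip" memory at a time,
    # so the full model never has to fit on the device.
    for i in range(num_layers):
        weights = store.fetch(i)
        activations = matvec(weights, activations)
    return activations

store = ExternalWeightStore([
    [[1.0, 0.0], [0.0, 1.0]],   # layer 0: identity
    [[2.0, 0.0], [0.0, 2.0]],   # layer 1: scale by 2
])
print(forward(store, 2, [3.0, 4.0]))  # -> [6.0, 8.0]
```

The point of the pattern is that device memory bounds only the largest single layer, not the whole model, at the cost of moving weights over the interconnect on every pass.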
“They can extend the scalability of training to enormous dimensions, beyond what anyone is doing today,” says Mike Demler, a senior analyst at the Linley Group and senior editor of The Microprocessor Report.
Demler says it is not yet clear how big a market there will be for the cluster, especially since some potential customers are already designing their own, more specialized chips in-house. He adds that the chip’s real performance, in terms of speed, efficiency, and cost, remains unclear. Cerebras has not published any benchmark results so far.
“There is a lot of impressive engineering in the new MemoryX and SwarmX technologies,” says Demler. “But just like the processor, these are highly specialized things; they only make sense for training the very largest models.”
Cerebras’s chips have so far been adopted by labs that need supercomputer-class power. Early customers include Argonne National Laboratory, Lawrence Livermore National Laboratory, pharmaceutical companies including GlaxoSmithKline and AstraZeneca, and what Feldman describes as “military intelligence” organizations.
This shows that the Cerebras chip can be used for more than just running neural networks; the computations these labs run involve similarly massive parallel mathematical operations. “And they are always thirsty for more compute power,” says Demler, who adds that the chip could be important for the future of supercomputing.
David Kanter, an analyst at Real World Technologies and executive director of MLCommons, an organization that measures the performance of various AI algorithms and hardware, says he sees a future market for much larger AI models. “I generally tend to believe in data-centric ML [machine learning], so we want larger datasets that make it possible to build larger models with more parameters,” says Kanter.