Back in 2019, Tesla first hinted at its Dojo supercomputer project, and Elon Musk talked it up in several interviews. Just recently, the D1 Dojo chip was unveiled at Tesla AI Day.
We believe that Tesla’s Dojo supercomputer might one day become the world’s most powerful supercomputer! You might be wondering why. Though Tesla is an automobile manufacturer, it has long had an interest in supercomputing. Today’s fastest machines reach roughly 500 petaflops, but Tesla’s Dojo is expected to reach up to 1 exaflop, which is one quintillion (10^18) floating-point operations per second. That is how it could become the world’s fastest AI-focused computer.
For your reference:
- 1 FLOPS = 1 floating-point operation per second
- 1 PetaFlop = 1,000 TeraFlops
- 1 ExaFlop = 1,000,000 TeraFlops
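As a quick sanity check, the conversions above can be expressed in code. This is a minimal sketch; the Fugaku and Dojo figures are the ones quoted later in this article:

```python
# FLOPS unit conversions: 1 PFLOPS = 1,000 TFLOPS; 1 EFLOPS = 1,000,000 TFLOPS
TFLOPS = 1e12                 # floating-point operations per second
PFLOPS = 1_000 * TFLOPS
EFLOPS = 1_000_000 * TFLOPS

fugaku = 442 * PFLOPS         # Fugaku's record benchmark result, cited below
dojo_target = 1.1 * EFLOPS    # Tesla's stated ExaPOD target

print(f"Fugaku:      {fugaku / EFLOPS:.3f} exaflops")
print(f"Dojo target: {dojo_target / EFLOPS:.3f} exaflops")
```

Even the current record holder is still well under half an exaflop, which shows how bold Tesla’s 1-exaflop target really is.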
Currently, Tesla is dealing with a plethora of video data collected from its fleet of roughly one million cars. According to Tesla, it wasn’t happy with its existing hardware for training computer-vision neural nets. Thus began the creation of the D1 Dojo chip.
“Necessity is the mother of invention.” Tesla always takes this saying seriously. The D1 Dojo chip is built on a 7-nanometer process that delivers groundbreaking bandwidth and performance. This isn’t the first chip Tesla has designed: it also made the FSD chip found in the Hardware 3 computer of its cars.
Ganesh Venkataramanan, the chief engineer behind the Dojo supercomputer, said:
“This was entirely designed by the Tesla team internally, all the way from the architecture to the package. This chip is like GPU-level compute with CPU-level flexibility and twice the network-chip-level I/O bandwidth.”
Can Tesla’s Dojo Supercomputer be a breakthrough? Let’s Discuss!
The likely answer is YES! However, it’s important to note that Tesla’s Dojo supercomputer is mainly centered around the company’s in-house operations. As mentioned above, it will be used for neural-net video training only.
Tesla’s Dojo supercomputer is capable of running at 442,010 teraflops and, in theory, can push up to 537,212 teraflops. These numbers approach an exaflop, which no supercomputer has achieved to date.
Currently, the most powerful supercomputer sits in Japan. Called “Fugaku,” it has set a world record of 442 petaflops. Jumpstart magazine listed Tesla’s Dojo as the sixth most powerful supercomputer in the world based on its growth curve.
Does this mean that Tesla made that claim impulsively? Can Dojo really achieve 1 exaflop? Stay tuned to find out!
What’s the Deal with the Tiles?
Let’s talk about some hardware that will form the exascale system of Dojo.
Tesla created training tiles as the building block of the D1 compute system. Each tile is an integrated multi-chip module of 25 D1 chips, providing nine petaflops of compute and 36 TB/s of off-tile bandwidth. Tesla says some 500,000 training nodes are connected across this architecture, which is how nine petaflops per tile becomes possible.
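Working backwards from the tile figures gives a feel for what each chip contributes. This is an illustrative calculation based only on the numbers quoted above, not Tesla’s own breakdown:

```python
# Per-tile figures quoted above: 25 D1 chips, nine petaflops of compute
chips_per_tile = 25
tile_pflops = 9

# Implied per-chip throughput, in teraflops (1 petaflop = 1,000 teraflops)
chip_tflops = tile_pflops * 1_000 / chips_per_tile
print(f"Each D1 chip contributes about {chip_tflops:.0f} teraflops")
```

That puts a single D1 chip in the same ballpark as a high-end datacenter GPU, which lines up with Venkataramanan’s “GPU-level compute” remark.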
Tesla plans on building a housing cabinet for the tiles that can scale to much larger systems. The tiles will be contained in trays holding six Dojo tiles each, and a cabinet of these trays will give an output of 100+ petaflops.
Tesla plans to develop an ExaPOD containing a total of 10 Dojo cabinets in the near future. This very well could give the 1.1 exaflops mark as mentioned above.
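Chaining the quoted numbers together shows how the 1.1-exaflop figure roughly falls out. This is a back-of-the-envelope sketch; the two-trays-per-cabinet count is an assumption chosen to match the 100+ petaflop cabinet figure above:

```python
# Scaling up from tile to ExaPOD, using the figures quoted in the article
tile_pflops = 9          # petaflops per training tile
tiles_per_tray = 6
trays_per_cabinet = 2    # assumption: two trays to exceed 100 petaflops per cabinet
cabinets_per_exapod = 10

cabinet_pflops = tile_pflops * tiles_per_tray * trays_per_cabinet
exapod_pflops = cabinet_pflops * cabinets_per_exapod

print(f"Cabinet: {cabinet_pflops} petaflops")          # comfortably over 100
print(f"ExaPOD:  {exapod_pflops / 1_000:.2f} exaflops")  # close to the quoted 1.1
```

The product lands just over one exaflop, consistent with the 1.1-exaflop mark Tesla mentioned.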
Are the Claims Realistic?
The main concern with Tesla’s Dojo supercomputer is that it hasn’t been through the rigorous testing phase that Fugaku has. To put it in fruit terms: the standard supercomputer benchmarks ask machines to peel apples, and Tesla flipped the table by peeling oranges instead. Dojo might only be the sixth-fastest supercomputer at peeling apples (general-purpose benchmarks), but it is the world’s fastest at peeling oranges (its specialized machine-learning workload).
Regardless, the Dojo is designed mainly to accelerate machine-learning and deep-learning workloads. Tesla uses unsupervised learning for the Autopilot feature in its cars, so it is no wonder the D1 Dojo chip was designed to aid ML.
On top of all this, the Dojo supercomputer has no RAM. Confused? The D1 Dojo chip instead leans entirely on cache, carrying 424.8 MB of on-chip memory and beating IBM, which previously held the record at 120 MB of cache. An interesting point worth mentioning: when the CPU calls on RAM, it takes about 60 nanoseconds to respond, whereas L3 cache has a lower response time of around 10 nanoseconds.
It gets stranger here: while the training tile has no RAM, it also has no shared L3 cache on its SoC. During the event, Tesla demonstrated the node architecture, which hints at an L1 cache being used in the SoC. For reference, L1 cache has a response time of about 0.5 nanoseconds, while L2 responds in 3-4 nanoseconds.
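To see why those latency gaps matter at scale, here is a rough sketch of how total access time stacks up across memory levels. The latencies are the ballpark figures quoted above, and the workload size is a hypothetical example, not a measurement:

```python
# Ballpark access latencies from the discussion above, in nanoseconds
latency_ns = {"RAM": 60.0, "L3": 10.0, "L2": 3.5, "L1": 0.5}

accesses = 1_000_000  # hypothetical workload: one million memory reads

for level, ns in latency_ns.items():
    total_ms = accesses * ns / 1e6  # convert total nanoseconds to milliseconds
    print(f"{level}: {total_ms:5.1f} ms spent waiting on memory")
```

A million reads spend 60 ms waiting on RAM versus half a millisecond on L1, which is why keeping training data in on-chip memory pays off so handsomely for neural-net workloads.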
Just keep in mind that this system is fine-tuned for a specific purpose. While most systems include a wide array of components to preserve flexibility, the Dojo appears to have been designed for a narrow use case only.
Calling Tesla’s Dojo supercomputer impressive is an understatement. It’s a groundbreaking invention and a massive leap forward for Tesla, a company with a track record of living up to its wildest claims. Only time will tell how it enhances the training of their neural nets.
Can’t wait for it to happen!