The Nvidia H200, according to the company's press release, is the first GPU to offer HBM3e — faster, larger memory to fuel the acceleration of generative AI and large language models, while advancing scientific computing for HPC workloads. With HBM3e, the Nvidia H200 delivers 141GB of memory at 4.8 terabytes per second, nearly double the capacity and 2.4x more bandwidth compared with its predecessor, the Nvidia A100, the company noted.
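As a rough check, those multiples are consistent with the published A100 80GB figures (80GB of capacity and roughly 2TB/s of memory bandwidth — numbers not stated in this article):

\[ \frac{141\ \mathrm{GB}}{80\ \mathrm{GB}} \approx 1.76 \ \text{(nearly double)}, \qquad \frac{4.8\ \mathrm{TB/s}}{2.0\ \mathrm{TB/s}} = 2.4 \]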
When will the chip be made available?
The Nvidia H200 will be available from global system manufacturers and cloud service providers starting in the second quarter of 2024. Amazon Web Services, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure will be among the first cloud service providers to deploy H200-based instances starting next year.
Powered by Nvidia NVLink and NVSwitch high-speed interconnects, the HGX H200 provides the highest performance on various application workloads, including LLM training and inference for the largest models beyond 175 billion parameters. An eight-way HGX H200 provides over 32 petaflops of FP8 deep learning compute and 1.1TB of aggregate high-bandwidth memory for the highest performance in generative AI and HPC applications, Nvidia said.
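Those aggregate figures follow from the per-GPU numbers, assuming roughly 4 petaflops of FP8 throughput per Hopper-class GPU (an assumption drawn from Nvidia's H100 specification, not from this article):

\[ 8 \times 141\ \mathrm{GB} = 1128\ \mathrm{GB} \approx 1.1\ \mathrm{TB}, \qquad 8 \times {\sim}4\ \mathrm{PFLOPS\ (FP8)} \approx 32\ \mathrm{PFLOPS} \]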
Nvidia's graphics processing units (GPUs) are playing an increasingly important role in the development and deployment of generative AI models. GPUs are designed to handle the massive parallel computations required for training and running these models, making them well suited for tasks such as image generation and natural language processing, among others. The company's GPUs can accelerate the training and running of generative AI models by several orders of magnitude, thanks to a parallel processing architecture that allows them to perform many calculations simultaneously.