Given some of the pre-show announcements last week, you might be forgiven for thinking that SC23 is going to be all about AI!

Here in the UK we had two huge announcements, the first being the award of Isambard-AI to a consortium led by Bristol University (headed by Prof. Simon McIntosh-Smith, Director of the Isambard National Research Facility at the University of Bristol) and due in 2024. Powered by NVIDIA GH200s and integrated by HPE, the £225m system will likely end up being the UK’s largest HPC/AI system when it is fully operational in late 2024 (a smaller Isambard-3 system will be installed in March ’24).

The second is Dawn (with Phase 1 being installed as I write) at the Cambridge Open Zettascale Lab (led by Dr Paul Calleja, Director of Research Computing Services at the University of Cambridge, and in conjunction with UKRI and UKAEA), powered by Intel Max CPUs and GPUs and integrated by Dell (running on an OpenStack infrastructure).

Both are part of the UK Government’s AI Research Resource (AIRR) initiative and are funded by UKRI. We’re a little light on precise details at the moment but expect a few more to emerge at SC23.

HPC feels like it has been a little overshadowed by the Generative AI feeding frenzy of the last 6-9 months, and the mood music is that everyone wants a system that is capable of at least running, if not developing, LLMs. This has had a number of interesting knock-on effects in the procurement space, with some elements of a more ‘traditional’ HPC system being increasingly hard to find without eye-wateringly long lead times.

Some see this as a chance for AMD to solidify its position in the HPC market while NVIDIA is firmly going after the AI market. Of course, NVIDIA isn’t going to have it all its own way, with plenty of pretenders to the crown jockeying to see if they can steal a slice of the pie. I’m told that NVIDIA won’t have its traditional mega presence at SC this year, which is a shame, but you can understand why.

AMD and Microsoft are likely to be making continued noise with AMD-based SKUs in the Azure Cloud, and behind closed doors I imagine we will see what performance AMD has imbued EPYC 5 (Turin) with. I expect to see MI300 in the flesh and it will be interesting to see how it compares to the currently all-conquering NVIDIA H100.

As always, the CPUs and GPUs tend to steal the focus, but arguably of equal importance is the system-level packaging that the vendors bring to the table. I haven’t been to SC in a few years, so I’m really looking forward to seeing the cooling system and enclosure advances made in that time. HPC systems also don’t run in isolation, so I’ll also be very interested to kick the tyres on performance storage systems and take the pulse of the HPC software ecosystem.

Intel has been ‘through some stuff’ in recent years and on the hardware front that seems set to continue for a little longer, but it’s doing some interesting work with the oneAPI ecosystem. AMD has also hugely upped its game on that front, with ROCm finally getting to the point where it is a credible competitor to (though certainly not as complete as) the NVIDIA CUDA ecosystem. Add in the latest work in the OpenMP and MPI spaces and we’re seeing some serious advances in the productivity and viability of the cross-platform accelerated software ecosystem. There are lots of interesting BoFs and workshops on this over the week, with a couple on performance portability being of particular interest.

I’m really interested to speak with people who are integrating AI/ML into their HPC workflows too. My gut tells me that there is some amazing progress being made in some areas (weather being a case in point), but I fully expect many other domain areas are going to go down this route, and hopefully we’ll see details of this at the conference.

Feel free to say hi (I’ll be wearing my Guru badge 😉). Hope to see many of you (old and new friends) on the show floor or at the various gatherings!!

Dairsie Latimer
Technology Fellow, Red Oak Consulting