EE World has organized this “virtual conversation,” hosted by Jeff Shepard (JS), with Gary Bronner (GB), Senior Vice President with Rambus Labs. Mr. Bronner has generously agreed to share with us his experience and insights into AI applications and emerging computing architectures.
JS: When benchmarking AI performance, how would you rank the importance of throughput, latency, and accuracy?
GB: There really isn’t a one-size-fits-all answer to this. Those metrics are valued very differently depending on the specific application. It also depends on whether you are looking at training or inference.
For training, throughput is key. For inference, one normally wants low latency, which makes it difficult to optimize for throughput, so a trade-off must be made. Training is all about batching many samples and running them through together, which is efficient from a throughput perspective but gives longer latency. Inference is all about looking at a sample of one and quickly making a decision, which implies low latency.
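A minimal sketch of that batching trade-off, using a toy NumPy “model” (a single dense layer; the sizes and batch counts are arbitrary assumptions, not anything specific to real training or inference workloads):

```python
import time
import numpy as np

# Toy "model": one 1024x1024 dense layer standing in for a real network.
rng = np.random.default_rng(0)
weights = rng.standard_normal((1024, 1024)).astype(np.float32)

def run(batch_size: int, n_samples: int = 4096) -> None:
    """Push n_samples through the layer in batches; report latency and throughput."""
    x = rng.standard_normal((batch_size, 1024)).astype(np.float32)
    start = time.perf_counter()
    for _ in range(n_samples // batch_size):
        _ = np.maximum(x @ weights, 0.0)  # dense layer + ReLU
    elapsed = time.perf_counter() - start
    per_batch = elapsed / (n_samples // batch_size)
    print(f"batch={batch_size:4d}  latency/batch={per_batch * 1e3:7.2f} ms  "
          f"throughput={n_samples / elapsed:8.0f} samples/s")

run(1)   # inference-style: each result arrives quickly, but total throughput is lower
run(64)  # training-style: each batch takes longer, but overall throughput is higher
```

The batch-of-64 call typically reports higher samples-per-second but a longer wait for any individual result, which is exactly the trade-off described above.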
The third important measurement to keep in mind is accuracy. You can improve throughput and latency yet dramatically hurt accuracy. For example, I have seen situations where someone has built a really bad model. It doesn’t require much computation or much data, so it runs very quickly and with very low latency, but it isn’t very accurate, and the results are not acceptable. One needs to find an acceptable accuracy level for one’s application. It is a tough trade-off since accuracy also affects energy: there is more power available in a data center than on a phone, so accuracy can be much higher there.
JS: How is the perception of the environmental impact of AI, so-called “red AI,” changing, and how might that perception change future developments in AI?
GB: The environmental impact of the world’s collective “computing” is hugely important. According to results published in 2018, the world’s data centers consume ~1% of the world’s energy. Special attention will be needed to keep that number from growing. In the cloud, AI applications are immensely power-hungry, which makes efficiency even more important. And more and more AI is expected to be done at the edge, where many devices are battery-powered. As a result, I would expect to see a move away from “red AI” except for the most critical applications, where results trump power consumption – for example, in the race to find a vaccine for COVID-19.
JS: What are the societal implications, if any, of bias that may exist in AI inference engines?
GB: This isn’t a hardware issue, so it falls outside Rambus’ purview as a hardware company. This is a challenge that needs to be addressed by the people coding and deploying the AI algorithms.
JS: Where do you expect AI and cognitive computing to have the largest impact in the near term? In the longer term?
GB: AI has made the most progress in deep learning applications, which are based on pattern matching. These are the applications where a computer can learn from large amounts of data and recognize patterns in it. Such applications are already starting to have a large impact with significant positive results. For example, with Microsoft’s translation app, you can talk to your phone, and it will repeat what you said in a different language.
In the longer term, if we could get to something more in the direction of what’s called artificial general intelligence (AGI), meaning intelligence more like a human’s, I think that would have a very large impact. But it is very unclear when we will actually get there. Estimates in the community range anywhere from five years to a few hundred.
JS: Thank you to Mr. Bronner for sharing his insights and experience – another great conversation! You might also be interested in reading “AI applications and emerging computing architectures” – Virtual Conversation (part 1 of 2).