Untether AI Announces speedAI Accelerator Cards As World’s Highest Performing, Most Energy Efficient AI Accelerators According to MLPerf Benchmarks

Verified by MLPerf Inference 4.1 benchmarks, speedAI accelerator cards deliver industry-leading performance, with up to 6X the energy efficiency and 3X lower latency of other submitters


Media Contact for Untether AI:
Michelle Clancy Fuller, Cayenne Global, LLC
Michelle.clancy@cayennecom.com
1-503-702-4732

Company Contact:
Robert Beachler, Untether AI
beach@untether.ai
+1.650.793.8219

Untether AI®, the leader in energy-centric AI inference acceleration, today announced its world-leading verified submission to the MLPerf® Inference benchmarks, version 4.1. The speedAI®240 Preview accelerator card submission demonstrated the highest throughput of any single PCIe card in the Datacenter and Edge categories for the image classification benchmark, ResNet-50[1]. The speedAI240 Slim submission exhibited greater than 3X the energy efficiency of other accelerators in the Datacenter Closed Power category[2], 6X the energy efficiency in the Edge Closed category[3], and 3X lower latency in the Edge Closed category[4].

The MLPerf benchmarks are the only objective, peer-reviewed set of AI performance and power benchmarks in existence. MLPerf is supported by MLCommons®, a consortium of industry-leading AI chip developers such as Nvidia, Google, and Intel. Benchmark submissions are audited and peer-reviewed by all submitters for performance, accuracy, latency, and power consumption, ensuring fair and accurate reporting. The barrier to entry is so high that few startups have attempted to submit results. Untether AI not only submitted, but demonstrated that it has the highest-performance, most energy-efficient, and lowest-latency PCIe AI accelerator cards.

“AI’s potential is being hamstrung by technologies that force customers to choose between performance and power efficiency. Demonstrating leadership across both vectors in a stringent test environment like MLPerf speaks volumes about the unique value of Untether AI's At-Memory compute architecture as well as the maturity of our hardware and software," said Chris Walker, CEO of Untether AI.

“MLPerf submission requires operating hardware, shipments to customers, computational accuracy, and a mature software stack that can be peer reviewed. It also requires companies to declare how many accelerators are used in their submission. These factors are what make the benchmarks so objective, but they also create a high bar that many companies can’t meet in performance, accuracy, or transparency of their solution,” said Bob Beachler, VP of Product at Untether AI.

Untether AI submitted two different cards and multiple system configurations in the Datacenter Closed, Datacenter Power, Edge Closed, and Edge Power categories for the MLPerf v4.1 ResNet-50 benchmark. In the Datacenter Closed offline performance submission of ResNet-50, it had a verified result of 70,348 samples/s[1], the highest throughput of any PCIe card submitted. In the Datacenter Closed Power category, its 309,752 server queries/s at 986 watts was 3X more energy efficient than the closest competitor[5].
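For readers unfamiliar with MLPerf power results, the efficiency comparison above is simply throughput divided by measured wall power. The sketch below reproduces that arithmetic using the figures quoted in this release (309,752 queries/s at 986 W); the implied competitor figure is derived from the release's stated 3X factor, not an official MLPerf number.

```python
# Sketch: deriving energy efficiency (queries/s per watt) from the MLPerf
# Datacenter Closed Power figures quoted in this press release.

def efficiency(queries_per_sec: float, watts: float) -> float:
    """Throughput per watt of measured system power."""
    return queries_per_sec / watts

# Reported speedAI240 Slim result from the release.
speedai_eff = efficiency(309_752, 986)
print(f"speedAI240 Slim: {speedai_eff:.1f} queries/s/W")

# The release claims ~3X the efficiency of the closest competitor,
# which implies that competitor scored roughly a third as much.
print(f"implied competitor: ~{speedai_eff / 3:.0f} queries/s/W")
```

Note that official MLPerf power results compare whole measured systems; per-card normalizations (as used elsewhere in this release) are not official MLPerf benchmarks.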

In the Edge submission, it had a verified result of 0.12 ms for single-stream latency, 0.17 ms for multi-stream latency, and 70,348 samples/s for offline throughput[4]. These latency values are the fastest ever recorded for an MLPerf ResNet-50 submission. In the Edge Closed Power category, Untether AI was the only submitter, so no direct comparison is available. However, normalizing to single cards and their published TDPs, the speedAI240 Slim card demonstrated a 6X improvement in energy efficiency[3].

Untether AI enlisted the aid of Krai, a company that provides AI computer systems and has been trusted to submit MLPerf benchmarks by companies such as Dell, Lenovo, Qualcomm, Google, and HPE. Anton Lokhmotov, CEO of Krai, said, “We were impressed not only by the record-breaking performance and energy efficiency of the speedAI 240 accelerators, but also the readiness and robustness of the imAIgine SDK which facilitated creating the benchmark submissions.”

Untether AI is excited to have its At-Memory technology available in shipping hardware and verified by MLPerf. The speedAI240 accelerator cards’ world-leading performance and energy efficiency will transform AI inference, making it faster and more energy efficient for markets including datacenters, video surveillance, vision guided robotics, agricultural technology, machine inspection, and autonomous vehicles. To find out more about Untether AI acceleration solutions and its recent MLPerf benchmark scores, please visit https://www.untether.ai/mlperf-results/.

About Untether AI

Untether AI® provides energy-centric AI inference acceleration from the edge to the cloud, supporting any type of neural network model. With its At-Memory compute architecture, Untether AI has solved the data movement bottleneck that costs energy and performance in traditional CPUs and GPUs, resulting in high-performance, low-latency neural network inference acceleration without sacrificing accuracy. Untether AI embodies its technology in its speedAI® devices, acceleration cards, and its imAIgine® Software Development Kit. More information can be found at www.untether.ai.

All references to Untether AI trademarks are the property of Untether AI. The MLPerf name and logo are registered and unregistered trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use is strictly prohibited. See www.mlcommons.org for more information. All other trademarks mentioned herein are the property of their respective owners.

Notes and Disclaimers

[1] Verified MLPerf™ score of v4.1 Inference Datacenter Closed ResNet-50. Results taken from https://mlcommons.org/benchmarks/inference-datacenter/ on August 28th, 2024. Result 4.1-0076. Normalized per card performance is not an official MLPerf benchmark.

[2] Verified MLPerf™ score of v4.1 Inference Datacenter Closed Power ResNet-50. Results taken from https://mlcommons.org/benchmarks/inference-datacenter/ on August 28th, 2024. Results 4.1-0066 and 4.1-0049.

[3] Verified MLPerf™ score of v4.1 Inference Edge Closed ResNet-50. Results taken from https://mlcommons.org/benchmarks/inference-edge/ on August 28th, 2024. Results 4.1-0067 and 4.1-0041. Normalized per-card performance and TDP power comparison is not an official MLPerf benchmark.

[4] Verified MLPerf™ score of v4.1 Inference Edge Closed ResNet-50. Results taken from https://mlcommons.org/benchmarks/inference-edge/ on August 28th, 2024. Results 4.1-0069 and 4.1-0041.

[5] Verified MLPerf™ score of v4.1 Inference Datacenter Closed Power ResNet-50. Results taken from https://mlcommons.org/benchmarks/inference-datacenter/ on August 28th, 2024. Results 4.1-0066 and 4.1-0049.

