Trying out a Coral TPU

A few years ago, Google released a neat little product called Coral, a “tensor processing unit” (TPU), i.e. an AI accelerator. Targeted at IoT/embedded devices such as the Raspberry Pi, Coral runs models using TensorFlow Lite and has enough performance to let these devices do some AI in a reasonable amount of time.

Of course, this isn’t without its limitations. TensorFlow Lite on the Coral only supports models quantized to int8 (signed 8-bit integers), so no floating-point numbers. Additionally, not all types of neural network operations are supported. For example, Recurrent Neural Networks (RNNs) are limited to LSTM, and even then there are limitations and known issues when converting.
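To get a feel for what int8 quantization means, here’s a minimal sketch (my own, not from the TensorFlow Lite codebase) of the affine scheme TensorFlow Lite uses, where a real value is approximated as (int8_value − zero_point) × scale:

```python
import numpy as np

def quantize_int8(x, scale, zero_point):
    # TFLite-style affine scheme: real_value ≈ (int8_value - zero_point) * scale
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)  # saturate to the int8 range

def dequantize_int8(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

# Quantize a few floats with a scale covering roughly [-1, 1].
x = np.array([-1.0, 0.0, 0.25, 1.0, 2.0], dtype=np.float32)
scale, zero_point = 1.0 / 127, 0
q = quantize_int8(x, scale, zero_point)
print(q)                                      # note that 2.0 saturates to 127
print(dequantize_int8(q, scale, zero_point))  # lossy reconstruction
```

The round-trip shows why quantization is lossy: every weight and activation gets snapped to one of 256 levels, and anything outside the calibrated range saturates.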

Finally, the biggest limitation is that the hardware isn’t that powerful. However, that’s somewhat by design: Coral is designed to run efficiently on very little power, and TensorFlow Lite also targets mobile devices. So, if you have a GPU, even a modest one will run circles around the Coral. But for an embedded/IoT use case, Coral will perform significantly better than the CPU will (also, back when Coral was released, ARM chips didn’t have any sort of neural processing; while some do now in 2023, it’s still not common).

Recently, I got my hands on the USB version of the Coral, so let’s try out the getting started guide as a way to learn about it.

Following the guide, I spun up a VM running Debian 10, since that’s one of the supported operating systems. After that, I set up USB pass-through to access the Coral from the VM. This was the first hiccup I encountered: the Coral switches between two hardware IDs (it seems the device re-enumerates under a different ID once the runtime uploads its firmware):

1a6e:089a Global Unichip Corp.
18d1:9302 Google Inc.

Thankfully the solution seems to be as simple as just passing both IDs through to the VM.
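For reference, with plain QEMU that amounts to listing both vendor:product pairs on the command line (a sketch, not my exact invocation; libvirt has an equivalent hostdev config):

```shell
# Pass both of the Coral's USB identities through to the guest,
# since the device re-enumerates with a different ID at runtime.
qemu-system-x86_64 \
  -usb \
  -device usb-host,vendorid=0x1a6e,productid=0x089a \
  -device usb-host,vendorid=0x18d1,productid=0x9302 \
  ... # rest of the VM configuration
```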

The second issue I encountered was that the kernel driver wants privileged access, and I was using an unprivileged user. A quick search yielded this GitHub issue, and rather than always running commands using sudo, I opted for the simpler (and more secure) option of adding my user to the plugdev group.

sudo usermod -aG plugdev $USER
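The Edge TPU runtime package also installs a udev rule granting the plugdev group access; a hand-rolled equivalent would look something like this (a sketch based on the two IDs above; the filename is my own choice):

```shell
# /etc/udev/rules.d/99-edgetpu.rules (hypothetical filename)
# Make both of the Coral's USB identities accessible to the plugdev group.
SUBSYSTEM=="usb", ATTRS{idVendor}=="1a6e", ATTRS{idProduct}=="089a", GROUP="plugdev", MODE="0660"
SUBSYSTEM=="usb", ATTRS{idVendor}=="18d1", ATTRS{idProduct}=="9302", GROUP="plugdev", MODE="0660"
```

After editing rules, reload them with `sudo udevadm control --reload-rules && sudo udevadm trigger` (and re-log so the group change takes effect).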

After dealing with those two quirks, the rest of the process was seamless and I was able to run the MobileNet image classification example:

python3 examples/ --model test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite --labels test_data/inat_bird_labels.txt --input test_data/parrot.jpg
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
Ara macao (Scarlet Macaw): 0.75781

Oddly, this is a lot slower than the getting started guide’s results. The accuracy was about the same, but their times were ~12ms for the first run and ~3ms for the subsequent runs. Google does provide an overclocked driver for better performance, but warns that it causes the hardware to get very hot.

Interestingly, the guide lists identical results for the M.2/PCIe variant of the Coral, even though it should perform better thanks to higher bandwidth (USB has more overhead, and Coral’s USB variant is categorized as a “prototyping” device). Those numbers also align with the benchmarks page, which notes it uses C++ instead of Python because C++ has lower overhead. So, part of me wonders if they copy-pasted and the getting started guide’s performance numbers aren’t accurate.

Either way, this was a fun little experiment and gives me a new toy to play with. I was also pleasantly surprised that the getting started guide worked with minimal issues, given that it’s been a few years since the product was released and the operating system support is a bit dated.

Catch you next time!