ai

LLMs accelerated with eGPU on a Raspberry Pi 5

After a long journey getting AMD graphics cards working on the Raspberry Pi 5, we finally have a stable patch for the amdgpu Linux kernel driver, and it works on AMD RX 400, 500, 6000, and (current-generation) 7000-series GPUs.

With that, we also have stable Vulkan graphics and compute API support.

When I wrote about getting a Radeon Pro W7700 running on the Pi, I also mentioned AMD is not planning on supporting Arm with their ROCm GPU acceleration framework. At least not anytime soon.

Luckily, the Vulkan SDK can be used in its place, and in some cases even outperforms ROCm—especially on consumer cards where ROCm isn't even supported on x86!
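
For a concrete idea of what that looks like, here's a minimal sketch using the llama-cpp-python bindings (assuming they were compiled against llama.cpp's Vulkan backend; the model filename is a placeholder):

    # Sketch: run a GGUF model on the eGPU via llama.cpp's Vulkan backend.
    # Assumes llama-cpp-python was built with Vulkan enabled, e.g.:
    #   CMAKE_ARGS="-DGGML_VULKAN=on" pip install llama-cpp-python
    from llama_cpp import Llama

    llm = Llama(
        model_path="llama-3-8b-q4_k_m.gguf",  # placeholder model file
        n_gpu_layers=-1,  # offload all layers to the GPU
    )

    output = llm("Q: Why Vulkan instead of ROCm on Arm? A:", max_tokens=64)
    print(output["choices"][0]["text"])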

They stole my voice with AI

UPDATE 9/23: The CEO of Elecrow responded. I've posted a follow-up blog post with my reaction to the response and some other thoughts on AI voice cloning.

Listen to this clip:


I don't know about you, but that sounds pretty familiar. I mean I would like you to subscribe to my YouTube channel. But that's the Jeff Geerling channel, not Elecrow, where the clip above is from. I never said the words that are in that video.

If AI chatbots are the future, I hate it

AT&T Fiber Internet - speedtest graph

About a week ago, my home Internet (AT&T Fiber) went from the ~1 Gbps I pay for down to about 100 Mbps (see how I monitor my home Internet with a Pi). It wasn't too inconvenient, and I considered waiting it out to see if the speed recovered at some point, because latency was fine.

But as you can see around 7/7 on that graph, the 100 Mbps went down to about eight, and that's the point where my wife starts noticing how slow the Internet is. Action level.

So I fired up AT&T's support chat. I'm a programmer, I can usually find ways around the wily ways of chatbots.

Except AT&T's AI-powered chatbot seems to have a fiendish tendency to equate 'WiFi' with 'Internet', no doubt due to so many people thinking they are one and the same.

55 TOPS Raspberry Pi AI PC - 4 TPUs, 2 NPUs

I'm in full-on procrastination mode with Open Sauce coming up in 10 days and a project I haven't started on for it, so I decided to try building a stable AI PC with all the AI accelerator chips I own:

  • Hailo-8 (26 TOPS)
  • Hailo-8L (13 TOPS)
  • Coral Dual Edge TPU (4+4 = 8 TOPS)
  • 2x Coral Edge TPU (4+4 = 8 TOPS)

After my first faltering attempt in testing Raspberry Pi's new AI Kit, I decided to try building it again, but with a more 'proper' PCIe setup: external 12V power to the PCIe devices, courtesy of an uPCIty Lite PCIe HAT for the Pi 5.

Raspberry Pi 55 TOPS AI Board

I'm... not sure it's that much less janky, but at least I had one board with a bunch of M.2 cards, instead of many boards precariously stacked on top of each other!

Testing Raspberry Pi's AI Kit - 13 TOPS for $70

Raspberry Pi today launched the AI Kit, a $70 add-on which straps a Hailo-8L on top of a Raspberry Pi 5, using the recently-launched M.2 HAT (the Hailo-8L is of the M.2 M-key variety, and comes preinstalled).

Raspberry Pi AI Kit

The Hailo-8L's claim to fame is 3-4 TOPS/W efficiency, which, along with the Pi's 3-4W idle power consumption, puts it alongside Nvidia's edge devices like the Jetson Orin in terms of both TOPS/$ and TOPS/W.

Google's Coral TPU has been a popular choice for a machine learning/AI accelerator for the Pi for years now, but Google seems to have left the project on life support, after the Coral hardware was scalped for a couple years about as badly as the Raspberry Pi itself!

Testing object detection (yolo, mobilenet, etc.) with picamera2 on Pi 5

Besides being approximately 2.5x faster for general compute, the Pi 5's upgrade to Cortex-A76 cores adds newer blocks of the Arm architecture that promise to speed up other tasks, too.

Jeff Geerling person object detection on Pi 5

On the Pi 4, popular image processing models for object detection, pose detection, etc. would top out at 2-5 fps using the built-in CPU. Accessories like the Google Coral TPU speed things up considerably (and are eminently useful in builds like my Frigate NVR), but a Coral adds $60 to the cost of your Pi project.

With the Pi 5, if I can double or triple inference speed—even at the expense of maxing out CPU usage—it could be worth it for some things.
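
As a rough sketch of that CPU-only path, here's what detection on a captured frame looks like with picamera2 and the TensorFlow Lite runtime (the model path is a placeholder; any quantized SSD-style .tflite model follows the same pattern):

    # Sketch: CPU-only inference on a single picamera2 frame.
    import numpy as np
    from picamera2 import Picamera2
    from tflite_runtime.interpreter import Interpreter

    interpreter = Interpreter(model_path="mobilenet_ssd_v2.tflite")  # placeholder
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    height, width = input_details[0]["shape"][1:3]

    picam2 = Picamera2()
    picam2.configure(picam2.create_preview_configuration(
        main={"size": (int(width), int(height)), "format": "RGB888"}))
    picam2.start()

    frame = picam2.capture_array()  # HxWx3 uint8 array from the camera
    interpreter.set_tensor(input_details[0]["index"], np.expand_dims(frame, 0))
    interpreter.invoke()

    # Output tensor order (boxes/classes/scores/count) varies by model export.
    boxes = interpreter.get_tensor(interpreter.get_output_details()[0]["index"])
    print(boxes)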

A PCIe Coral TPU FINALLY works on Raspberry Pi 5

Coral.ai TPUs are AI accelerators used for tasks like machine vision and audio processing. Raspberry Pis are often integrated into small robotics and IoT products—or used to analyze live video feeds with Frigate.

Until today, nobody I know of had been able to get a PCI Express Coral TPU working on the Raspberry Pi. The Compute Module 4, unfortunately, had some quirks in its PCIe implementation, preventing use of the Coral over PCIe.

Google Coral TPU running over PCIe on Raspberry Pi 5

The Raspberry Pi 5 has a much improved PCIe bus—capable of reaching Gen 3 speeds even!—and I've already tested the first PCIe NVMe HATs for Pi 5.

So can the Pi 5 handle the Coral TPU natively over PCIe?

Yes. Though currently, you need to tweak a few things to get it working.
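
Once the driver situation is sorted, a quick sanity check from Python (a sketch assuming the pycoral library is installed) confirms the TPU is enumerated:

    # Sketch: list the Edge TPUs the Coral runtime can see.
    # A PCIe Coral should appear with type 'pci' and a /dev/apex_* path.
    from pycoral.utils.edgetpu import list_edge_tpus

    for tpu in list_edge_tpus():
        print(tpu)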

Testing the Coral TPU Accelerator (M.2 or PCIe) in Docker

Google Coral TPU in PCIe carrier

I recently tried setting up an M.2 Coral TPU on a machine running Debian 12 'Bookworm', which ships with Python 3.11, making the installation of the pyCoral library very difficult (maybe impossible for now?).

Some of the devs responded 'just install an older Ubuntu or Debian release' in the GitHub issues, as that would give me a compatible Python version (3.9 or earlier)... but in this case I didn't want to do that.
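
For reference, the pyCoral inference flow I was after looks roughly like this (a sketch with placeholder model, label, and image filenames, following pycoral's classification adapters):

    # Sketch: image classification on the Edge TPU with pycoral.
    from PIL import Image
    from pycoral.adapters import classify, common
    from pycoral.utils.dataset import read_label_file
    from pycoral.utils.edgetpu import make_interpreter

    interpreter = make_interpreter("model_edgetpu.tflite")  # placeholder model
    interpreter.allocate_tensors()

    image = Image.open("bird.jpg").resize(common.input_size(interpreter))
    common.set_input(interpreter, image)
    interpreter.invoke()

    labels = read_label_file("labels.txt")  # placeholder labels file
    for c in classify.get_classes(interpreter, top_k=3):
        print(labels.get(c.id, c.id), c.score)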

Transcribing recorded audio and video to text using Whisper AI on a Mac

2024 Update: I have a short video outlining my end-to-end process for subtitling all my videos on YouTube using Whisper/MacWhisper.


Late last year, OpenAI announced Whisper, a new speech-to-text language model that is extremely accurate in translating many spoken languages into text. The whisper repository contains instructions for installation and use.

tl;dr:
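
A minimal sketch of that workflow using Whisper's Python API (the filename is a placeholder; ffmpeg must be installed, and the model downloads on first use):

    # Sketch: transcribe a recording with Whisper's Python API.
    # Requires: pip install -U openai-whisper (plus ffmpeg on the PATH)
    import whisper

    model = whisper.load_model("base")  # larger models are slower, more accurate
    result = model.transcribe("recording.mp3")  # placeholder filename
    print(result["text"])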