Testing object detection (YOLO, MobileNet, etc.) with picamera2 on Pi 5

Besides the Pi 5 being approximately 2.5x faster for general compute, the upgrade to A76 cores brings newer pieces of the Arm architecture that promise to speed up other tasks, too.

Jeff Geerling person object detection on Pi 5

On the Pi 4, popular image processing models for object detection, pose detection, etc. would top out at 2-5 fps using the built-in CPU. Accessories like the Google Coral TPU speed things up considerably (and are eminently useful in builds like my Frigate NVR), but a Coral adds $60 to the cost of your Pi project.

With the Pi 5, if I can double or triple inference speed—even at the expense of maxing out CPU usage—it could be worth it for some things.

To benchmark it, I wanted something I could easily replicate across my Pi 4 and Pi 5, and luckily, the picamera2 library has examples that I can deploy to any of my Pis easily.

Using TensorFlow Lite, I can feed in the example YOLOv5 or MobileNetV2 models, and see how performance compares between various Pi models.
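For reference, this is roughly what those examples do under the hood: load a .tflite model with the tflite_runtime Interpreter and run inference on each frame. Here's a minimal sketch, assuming the mobilenet_v2.tflite file from the example command below is sitting in the working directory (the real examples also handle capture, preprocessing, labels, and drawing overlays):

# Minimal TensorFlow Lite inference sketch (assumes mobilenet_v2.tflite
# is in the current directory; the picamera2 examples do much more).
import numpy as np
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="mobilenet_v2.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Fake a frame matching the model's expected input shape (e.g. 1x300x300x3).
_, height, width, channels = input_details[0]['shape']
frame = np.zeros((1, height, width, channels), dtype=input_details[0]['dtype'])

interpreter.set_tensor(input_details[0]['index'], frame)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])
print(output.shape)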

Installing dependencies

You need to have picamera2 and a few other dependencies installed for the examples to run. Some of them are pre-installed, but check the documentation at the top of the example file for a full listing.

# Install OpenCV and Pip.
sudo apt install build-essential libatlas-base-dev python3-opencv python3-pip

# Install the TensorFlow Lite runtime.
pip3 install tflite-runtime

# Clone the picamera2 project locally.
git clone https://github.com/raspberrypi/picamera2
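One note: on Raspberry Pi OS Bookworm, pip may refuse to install packages system-wide because Python is marked 'externally managed'. If the tflite-runtime install fails with that error, a virtual environment created with --system-site-packages still sees the preinstalled picamera2 and the apt-installed OpenCV (the venv path below is just an example):

# Only needed if pip3 complains about an externally-managed environment.
python3 -m venv --system-site-packages ~/tflite-venv
source ~/tflite-venv/bin/activate
pip3 install tflite-runtime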

Running the models

Raspberry Pi Camera module 3 on Tripod

To use the picamera2 examples, you should have a Pi camera plugged into one of the CSI/DSI ports on your Pi 5 (or the camera connector on the Pi 4 or older). I'm using the Pi Camera V3 for my testing.
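Before running any models, it's worth confirming the camera is detected at all. A quick preview with rpicam-hello (libcamera-hello on older OS releases) does the trick:

# Show a 5-second preview to confirm the camera is working.
rpicam-hello -t 5000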

Then go into the tensorflow examples directory in the picamera2 project you cloned earlier:

cd picamera2/examples/tensorflow

Run the real-time YOLOv5 model with labels:

python3 yolo_v5_real_time_with_labels.py --model yolov5s-fp16.tflite --label coco_labels_yolov5.txt

Run the real-time MobileNetV2 model with labels:

python3 real_time_with_labels.py --model mobilenet_v2.tflite --label coco_labels.txt

I would like to find an easy way to calculate FPS for these models, so I can compare raw numbers directly, instead of just looking at a recording and estimating how many FPS I'm getting for the raw TensorFlow processing.
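In the meantime, a rough number can be had by timing the model directly, outside the camera pipeline. Here's a minimal sketch, again assuming the same mobilenet_v2.tflite model is in the working directory, that measures raw inference throughput while ignoring capture and drawing overhead:

# Rough benchmark: time N invocations of a TFLite model on a dummy frame.
import time
import numpy as np
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="mobilenet_v2.tflite", num_threads=4)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
frame = np.zeros(inp['shape'], dtype=inp['dtype'])

runs = 100
start = time.monotonic()
for _ in range(runs):
    interpreter.set_tensor(inp['index'], frame)
    interpreter.invoke()
elapsed = time.monotonic() - start
print(f"{runs / elapsed:.1f} inferences/sec ({elapsed / runs * 1000:.1f} ms each)")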

Watching htop, the CPU certainly gets a workout!

Using rpicam-apps instead

It seems like rpicam-apps has some more advanced image processing stages available through its --post-process-file option.

For example, object_classify_tf.json contains an example with MobileNetV1, which relies on the mobilenet_v1_1.0_224_quant.tflite and labels.txt files being present in a models directory in your home folder.
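In practice that looks something like the commands below. The asset path is an assumption based on where Raspberry Pi OS typically installs the rpicam-apps JSON files, and you'll still need to supply the MobileNet model and labels files yourself:

# Put the model files where the classification stage expects them.
mkdir -p ~/models
# (copy mobilenet_v1_1.0_224_quant.tflite and labels.txt into ~/models)

# Run a preview with the classification post-processing stage enabled.
rpicam-hello -t 0 --post-process-file /usr/share/rpi-camera-assets/object_classify_tf.json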

Or for an extremely basic example, the negate.json stage just inverts the pixel values, giving a negative view of the camera feed:

Jeff Geerling in Negative on Pi 5 camera feed
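Running it is the same sort of one-liner, just pointing at a different stage file (same asset path assumption as above):

rpicam-hello -t 0 --post-process-file /usr/share/rpi-camera-assets/negate.json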

I haven't experimented as much with this, but it might be easier to introduce image processing this way than using Python, especially for demo/comparison purposes!

Comments

lol, just noticed in my picture of the screen, one of the detections said it found 'horse', with a 58% score :D

Love your videos and posts. You're doing the hacks I love and could never do myself.
I've been playing with OpenCV on a Pi 5 for a while and I just can't see performance improvements with PCIe cards, either on USB or M.2 HAT installs. Even converting Python to C++ hasn't produced noticeable processing speed changes. The only thing that was a definite improvement was raising the top CPU and GPU clock speeds.
If it's a CPU/GPU bottleneck, could you see any benefits to these TPU/NPU products for OpenCV operations? No LLMs.