Continued Adventures in CheapML for IoT
I have probably bitten off more than I can chew with this project. However, the progress on automated labeling and model conversion has been a game-changer for my local-first AI experiments.
What is hard (or time-consuming) about this?
- Generating the training and test data sets. I ended up using a VLM for the initial labeling, with manual overrides for edge cases. I started with Moondream but eventually switched to Qwen2-VL, which proved far more robust for object identification. In retrospect, I should also have tried Molmo rather than LLaVA, for its superior zero-shot performance on high-resolution snapshots.
- Running this with limited resources on a Raspberry Pi 4 (4 GB RAM) means building the ML models from scratch and pruning and quantizing them aggressively.
- Trying to leave something generalizable enough for other cameras in the house, rather than giving up and paying the $20/mo for Frigate/Scrypted Cloud. The goal of "CheapML" is to see how far we can get with purely open-source software and local hardware.
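To make the "VLM labels plus manual overrides" step concrete, here is a minimal sketch of how the two label sources could be merged. It is not my actual labeling code: the `Label` class, the `merge_labels` helper, and the filenames and classes are all hypothetical, and it assumes one class label per snapshot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    image: str   # snapshot filename
    name: str    # object class, e.g. "person"
    source: str  # "vlm" or "manual"


def merge_labels(vlm_labels: dict[str, str], overrides: dict[str, str]) -> list[Label]:
    """Return one label per image, preferring a manual override when present."""
    merged = []
    for image in sorted(vlm_labels.keys() | overrides.keys()):
        if image in overrides:
            merged.append(Label(image, overrides[image], "manual"))
        else:
            merged.append(Label(image, vlm_labels[image], "vlm"))
    return merged


# Hypothetical data: filenames and classes are made up for illustration.
vlm_labels = {
    "cam1_0001.jpg": "person",
    "cam1_0002.jpg": "dog",
    "cam1_0003.jpg": "car",
}
overrides = {"cam1_0002.jpg": "cat"}  # an edge case corrected by hand

labels = merge_labels(vlm_labels, overrides)
for label in labels:
    print(label.image, label.name, label.source)
```

Keeping the `source` field around makes it easy to count how often the VLM needed correcting, which is a rough signal of whether a different VLM is worth trying.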
The Conversion Pipeline
The current pipeline takes a PyTorch-trained model, converts it to ONNX for optimization, and finally converts it to TFLite to squeeze it onto the RPi 4's CPU or an attached TPU. Each step requires careful calibration to avoid significant accuracy loss during quantization.
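Why calibration matters is easier to see with the arithmetic written out. The sketch below is plain Python, not the converter's API. It illustrates the affine int8 scheme (a scale and a zero point derived from observed value ranges) that post-training quantization is based on. The function names and sample values are hypothetical.

```python
def calibrate(values, num_bits=8):
    """Derive an affine (scale, zero_point) pair from observed min/max values."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # The representable range must include 0.0 so that zero maps to an integer.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    return scale, zero_point, qmin, qmax


def quantize(x, scale, zero_point, qmin, qmax):
    """Map a float to the nearest integer level, clamping to the int range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer level back to its approximate float value."""
    return (q - zero_point) * scale


# Hypothetical calibration samples standing in for real activations.
samples = [-1.0, -0.25, 0.0, 0.5, 2.0]
scale, zero_point, qmin, qmax = calibrate(samples)

# Values inside the calibrated range round-trip with a small error.
errors = [
    abs(dequantize(quantize(x, scale, zero_point, qmin, qmax), scale, zero_point) - x)
    for x in samples
]

# A value outside the calibrated range is clamped and loses much more.
outlier = 3.0
clamped = dequantize(quantize(outlier, scale, zero_point, qmin, qmax), scale, zero_point)
print(max(errors), abs(clamped - outlier))
```

If the calibration data does not cover the values the model sees in practice, everything outside the range is clamped like the outlier here, which is one way accuracy gets lost.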
