CamAI
Using recurrent neural networks (LSTMs) to understand temporal visual data for action recognition.
CONTRIBUTORS
Owolabi Timilehin
Miracle Jesse-Paul
SKILLS
Python Programming, AI & Deep Learning, Electronics, Project Management
PUBLISHED
24 July 2023
Introduction
CamAI combines artificial intelligence (AI) with conventional camera technology to change how a camera perceives and interacts with its surroundings. Our AI camera performs real-time action recognition: instead of merely recording video, it classifies the action happening in front of it as it happens.
Hardware
For testing, the hardware is a Pimoroni Pan-Tilt HAT with a camera and an onboard microcontroller that drives the two servos (pan and tilt) independently. The module pans and tilts through 180 degrees on each axis and is compatible with all 40-pin-header Raspberry Pi models.
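As a sketch of how the servos can be driven, the snippet below assumes Pimoroni's `pantilthat` Python library, whose `pan()` and `tilt()` calls accept angles in the HAT's ±90 degree range; `clamp_angle` and `point_camera` are illustrative helper names, not part of the library.

```python
def clamp_angle(angle, lo=-90, hi=90):
    """Keep a requested servo angle inside the HAT's +/-90 degree range."""
    return max(lo, min(hi, angle))

def point_camera(pan_deg, tilt_deg):
    """Aim the camera; a minimal sketch assuming the pantilthat driver."""
    import pantilthat  # imported lazily: the driver needs the HAT's I2C bus
    pantilthat.pan(clamp_angle(pan_deg))
    pantilthat.tilt(clamp_angle(tilt_deg))
```

Clamping first means a caller can pass raw tracking offsets without worrying about exceeding the servos' mechanical limits.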
System Design
Feature Extraction
- Get the dataset (Videos of actions you want to predict)
- Select 10 frames, equally spaced across the span of the video
- Pass the frames through the MediaPipe Pose model to extract the pose landmarks
- Save the landmarks
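The feature-extraction steps above can be sketched as follows. This assumes OpenCV and MediaPipe are installed; MediaPipe Pose returns 33 landmarks, each with x, y, z and visibility, so a video becomes a [10, 132] array. `sample_indices` and `extract_landmarks` are illustrative names.

```python
import numpy as np

NUM_FRAMES = 10
NUM_FEATURES = 33 * 4  # 33 pose landmarks x (x, y, z, visibility)

def sample_indices(total_frames, n=NUM_FRAMES):
    """Indices of n frames equally spaced across the span of the video."""
    return np.linspace(0, total_frames - 1, num=n).astype(int)

def extract_landmarks(video_path):
    """Return a [10, 132] array of pose landmarks for one video."""
    import cv2                    # heavy deps imported lazily
    import mediapipe as mp
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    wanted = set(sample_indices(total).tolist())
    pose = mp.solutions.pose.Pose(static_image_mode=True)
    rows, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx in wanted:
            result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if result.pose_landmarks:
                rows.append([v for lm in result.pose_landmarks.landmark
                             for v in (lm.x, lm.y, lm.z, lm.visibility)])
            else:
                rows.append([0.0] * NUM_FEATURES)  # no person detected
        idx += 1
    cap.release()
    pose.close()
    return np.array(rows)
```

Each saved array, paired with its action label, becomes one training sample for the LSTM.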
Training
- Using the extracted landmarks and the label of the action in each video, train an LSTM model (each sample has shape [10, 132]: 10 frames, each with 33 pose landmarks × 4 values (x, y, z, visibility))
Testing
- Get the video feed from any camera
- Keep the most recent 10 frames in a queue, which is updated as each new frame arrives
- Pass the frames in the queue through MediaPipe Pose to extract the landmarks
- Pass the landmarks through the trained model
- Get the classified result and overlay it on the video feed
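The testing loop above can be sketched as follows. The rolling 10-frame queue is a `collections.deque` with `maxlen=10`, so the oldest frame drops out automatically; `FrameWindow`, `run`, and the `actions` list are illustrative names, and OpenCV and MediaPipe are assumed to be installed.

```python
from collections import deque

import numpy as np

class FrameWindow:
    """Rolling queue of the last 10 landmark rows; oldest row drops out."""
    def __init__(self, size=10):
        self.size = size
        self.rows = deque(maxlen=size)

    def push(self, row):
        self.rows.append(row)

    def ready(self):
        return len(self.rows) == self.size

    def batch(self):
        # Shape [1, 10, 132]: one sequence for the model
        return np.expand_dims(np.array(self.rows), axis=0)

def run(model, actions, camera=0):
    """Classify actions live from any camera feed; requires cv2/mediapipe."""
    import cv2                    # lazy so FrameWindow stays importable anywhere
    import mediapipe as mp
    window = FrameWindow()
    pose = mp.solutions.pose.Pose()
    cap = cv2.VideoCapture(camera)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.pose_landmarks:
            window.push([v for lm in result.pose_landmarks.landmark
                         for v in (lm.x, lm.y, lm.z, lm.visibility)])
        if window.ready():
            probs = model.predict(window.batch(), verbose=0)[0]
            label = actions[int(np.argmax(probs))]
            cv2.putText(frame, label, (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("CamAI", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
```

Because the queue always holds the 10 most recent frames, the model re-classifies on every new frame once the window has filled, giving a continuously updated label on the feed.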