Published August 27, 2025

BW21-CBV-Kit: Home Video Monitoring and Playback System

The BW21-CBV-Kit launched by Ai-Thinker is an open-source hardware development platform for intelligent sensing applications.

Things used in this project

Hardware components

BW21-CBV-Kit

Story

1. Background

BW21-CBV-Kit's core features include:

AI Vision Capabilities:

Supports on-device AI image recognition algorithms, enabling scenarios such as face recognition, gesture detection, and object identification, with HD image capture via a 2MP camera.

Processing Performance:

Equipped with dual-band Wi-Fi (2.4GHz + 5GHz) and a high-performance RTL8735B processor (500MHz), meeting real-time image processing and wireless transmission requirements.

Open-source Ecosystem:

Built on the Arduino development environment, providing rich interfaces such as PWM control and sensor drivers, enabling users to develop custom network camera systems.

Compared with traditional closed-source smart cameras, the BW21-CBV-Kit provides unique advantages:

Freedom from System Restrictions:

Developers can bypass vendor constraints to create local storage or private cloud solutions, avoiding cloud data leakage risks.

Extended Scenario Integration:

Supports peripherals such as DHT temperature/humidity sensors and ultrasonic ranging modules, allowing smart home security, environmental monitoring, and other multi-functional applications.

Development Efficiency:

Provides HomeAssistant integration and pre-built AI model calls, significantly lowering the development barrier for home-grade smart cameras.

This device has strong potential applications in smart homes, industrial visual inspection, and new retail behavior analysis.

Its open-source nature is particularly suitable for maker communities and SMEs seeking customized visual solutions.

Many manufacturers currently offer smart cameras, but they are closed-source and tied to specific software and cloud services. The BW21-CBV-Kit, by contrast, can be used to build a home network camera that runs a custom software system.

2. Objectives

Using the BW21-CBV-Kit, this project builds a home security camera capable of 24/7 video recording and real-time storage, with web-based playback of the surveillance footage. The web interface should be as user-friendly as possible and allow video files to be selected by date.

The core goal of this project is to develop an intelligent home security monitoring system based on the XiaoAnPai hardware platform, focusing on three key functional modules:

24/7 audio and video acquisition system

Records video around the clock without interruption, using H.264 encoding at 1080p HD quality;

Integrates on-device face detection to mark faces in the stream, as a basis for intelligent monitoring of suspicious targets;

Tiered storage architecture

Builds a centralized storage mechanism: NAS subscribes to RTSP audio and video streams and writes them to disk.

Cloud-based interactive system

Builds a streaming media service platform based on a browser/server (B/S) architecture, supporting m3u8 (HLS) video playback.

Provides a retrieval module for looking up surveillance video by day.

The overall system design adheres to a collaborative "end-edge-cloud" architecture, ultimately creating a complete home security solution while ensuring data security.

3. Design Approach

3.1 Hardware Design

Development Board: BW21-CBV-Kit connected to a GC2053 camera.

Camera Enclosure: A DIY casing made from the packaging box and assembled by hand.

3.2 Software Design

3.2.1 Microcontroller

On the microcontroller side, the board outputs an RTSP video stream and marks detected faces in the video.

#include "WiFi.h"
#include "StreamIO.h"
#include "VideoStream.h"
#include "RTSP.h"
#include "NNFaceDetection.h"
#include "VideoStreamOverlay.h"

#define CHANNEL   0
#define CHANNELNN 3

// Lower resolution for NN processing
#define NNWIDTH  576
#define NNHEIGHT 320

VideoSetting config(VIDEO_FHD, 30, VIDEO_H264, 0);
VideoSetting configNN(NNWIDTH, NNHEIGHT, 10, VIDEO_RGB, 0);
NNFaceDetection facedet;
RTSP rtsp;
StreamIO videoStreamer(1, 1);
StreamIO videoStreamerNN(1, 1);

char ssid[] = "Network_SSID";    // your network SSID (name)
char pass[] = "Password";        // your network password
int status = WL_IDLE_STATUS;

IPAddress ip;
int rtsp_portnum;

void setup()
{
    Serial.begin(115200);

    // attempt to connect to Wifi network:
    while (status != WL_CONNECTED) {
        Serial.print("Attempting to connect to WPA SSID: ");
        Serial.println(ssid);
        status = WiFi.begin(ssid, pass);

        // wait 2 seconds for connection:
        delay(2000);
    }
    ip = WiFi.localIP();

    // Configure camera video channels with video format information
    // Adjust the bitrate based on your WiFi network quality
    config.setBitrate(2 * 1024 * 1024);    // Recommend to use 2Mbps for RTSP streaming to prevent network congestion
    Camera.configVideoChannel(CHANNEL, config);
    Camera.configVideoChannel(CHANNELNN, configNN);
    Camera.videoInit();

    // Configure RTSP with corresponding video format information
    rtsp.configVideo(config);
    rtsp.begin();
    rtsp_portnum = rtsp.getPort();

    // Configure face detection with corresponding video format information
    // Select Neural Network(NN) task and models
    facedet.configVideo(configNN);
    facedet.setResultCallback(FDPostProcess);
    facedet.modelSelect(FACE_DETECTION, NA_MODEL, DEFAULT_SCRFD, NA_MODEL);
    facedet.begin();

    // Configure StreamIO object to stream data from video channel to RTSP
    videoStreamer.registerInput(Camera.getStream(CHANNEL));
    videoStreamer.registerOutput(rtsp);
    if (videoStreamer.begin() != 0) {
        Serial.println("StreamIO link start failed");
    }

    // Start data stream from video channel
    Camera.channelBegin(CHANNEL);

    // Configure StreamIO object to stream data from RGB video channel to face detection
    videoStreamerNN.registerInput(Camera.getStream(CHANNELNN));
    videoStreamerNN.setStackSize();
    videoStreamerNN.setTaskPriority();
    videoStreamerNN.registerOutput(facedet);
    if (videoStreamerNN.begin() != 0) {
        Serial.println("StreamIO link start failed");
    }

    // Start video channel for NN
    Camera.channelBegin(CHANNELNN);

    // Start OSD drawing on RTSP video channel
    OSD.configVideo(CHANNEL, config);
    OSD.begin();
}

void loop()
{
    // Do nothing
}

// User callback function for post processing of face detection results
void FDPostProcess(std::vector<FaceDetectionResult> results)
{
    int count = 0;

    uint16_t im_h = config.height();
    uint16_t im_w = config.width();

    Serial.print("Network URL for RTSP Streaming: ");
    Serial.print("rtsp://");
    Serial.print(ip);
    Serial.print(":");
    Serial.println(rtsp_portnum);
    Serial.println(" ");

    printf("Total number of faces detected = %d\r\n", facedet.getResultCount());
    OSD.createBitmap(CHANNEL);

    if (facedet.getResultCount() > 0) {
        for (int i = 0; i < facedet.getResultCount(); i++) {
            FaceDetectionResult item = results[i];
            // Result coordinates are floats ranging from 0.00 to 1.00
            // Multiply with RTSP resolution to get coordinates in pixels
            int xmin = (int)(item.xMin() * im_w);
            int xmax = (int)(item.xMax() * im_w);
            int ymin = (int)(item.yMin() * im_h);
            int ymax = (int)(item.yMax() * im_h);

            // Draw boundary box (i is an int, so print it with %d)
            printf("Face %d confidence %d:\t%d %d %d %d\n\r", i, item.score(), xmin, xmax, ymin, ymax);
            OSD.drawRect(CHANNEL, xmin, ymin, xmax, ymax, 3, OSD_COLOR_WHITE);

            // Print identification text above boundary box
            char text_str[40];
            snprintf(text_str, sizeof(text_str), "%s %d", item.name(), item.score());
            OSD.drawText(CHANNEL, xmin, ymin - OSD.getTextHeight(CHANNEL), text_str, OSD_COLOR_CYAN);

            // Draw facial feature points
            for (int j = 0; j < 5; j++) {
                int x = (int)(item.xFeature(j) * im_w);
                int y = (int)(item.yFeature(j) * im_h);
                OSD.drawPoint(CHANNEL, x, y, 8, OSD_COLOR_RED);
                count++;
                if (count == MAX_FACE_DET) {
                    goto OSDUpdate;
                }
            }
        }
    }

OSDUpdate:
    OSD.update(CHANNEL);
}
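The coordinate step in FDPostProcess is easy to get wrong: the detector runs on the small RGB channel but reports normalized coordinates (0.0 to 1.0), which are then scaled to the 1920x1080 RTSP frame for drawing. The following Python sketch illustrates only that arithmetic. The function name to_pixel_box is hypothetical, and the clamping to the frame edges is an added assumption that the Arduino sketch itself does not perform.

```python
def to_pixel_box(x_min, y_min, x_max, y_max, width, height):
    """Scale a normalized box (0.0..1.0) to integer pixel coordinates."""
    def scale(value, size):
        # Truncate like the (int) cast in the sketch, then clamp to the frame
        # (the clamp is an assumption added here for illustration).
        return max(0, min(size - 1, int(value * size)))

    return (scale(x_min, width), scale(y_min, height),
            scale(x_max, width), scale(y_max, height))


# A face covering the middle-left of a 1920x1080 (FHD) frame.
box = to_pixel_box(0.25, 0.125, 0.5, 0.5, 1920, 1080)
print(box)  # (480, 135, 960, 540)
```

Because the result is normalized, the same detection can be drawn on any output resolution by changing only width and height.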

3.2.2 Centralized Storage

Use ffmpeg to subscribe to RTSP streams from the XiaoAnPai board and save them locally in segments:

ffmpeg -rtsp_transport tcp \
-protocol_whitelist file,rtsp,tcp \
-rw_timeout 5000000 \
-i rtsp://192.168.123.6:554/mystream \
-c copy \
-f hls \
-strftime 1 \
-strftime_mkdir 1 \
-hls_time 60 \
-hls_list_size 0 \
-hls_flags delete_segments+append_list \
-hls_segment_filename "./data/%Y%m%d/%Y%m%d_%H%M%S.ts" \
"./data/20250323/playlist.m3u8"
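Note that the command writes the playlist to a fixed date directory (20250323), while the segment names follow the current date. For continuous recording, one option is to generate the command per day from a small wrapper. The sketch below is a minimal illustration under stated assumptions: the function names build_ffmpeg_args and start_recording are hypothetical, only options already present in the command are used, and a supervising process (not shown) is assumed to restart ffmpeg at midnight.

```python
import subprocess
from datetime import date
from pathlib import Path


def build_ffmpeg_args(rtsp_url, data_dir, day, segment_seconds=60):
    """Return the ffmpeg argument list that records one day's HLS output."""
    day_dir = Path(data_dir) / day.strftime("%Y%m%d")
    return [
        "ffmpeg",
        # Input options must come before -i.
        "-rtsp_transport", "tcp",
        "-protocol_whitelist", "file,rtsp,tcp",
        "-rw_timeout", "5000000",
        "-i", rtsp_url,
        "-c", "copy",
        "-f", "hls",
        "-strftime", "1",
        "-hls_time", str(segment_seconds),
        "-hls_list_size", "0",
        "-hls_flags", "delete_segments+append_list",
        # The strftime pattern is left for ffmpeg to expand per segment.
        "-hls_segment_filename", str(day_dir / "%Y%m%d_%H%M%S.ts"),
        str(day_dir / "playlist.m3u8"),
    ]


def start_recording(rtsp_url, data_dir, day):
    """Create the day's directory and launch ffmpeg (must be on PATH)."""
    (Path(data_dir) / day.strftime("%Y%m%d")).mkdir(parents=True, exist_ok=True)
    return subprocess.Popen(build_ffmpeg_args(rtsp_url, data_dir, day))


# Show the command for one day without running it.
print(" ".join(build_ffmpeg_args(
    "rtsp://192.168.123.6:554/mystream", "./data", date(2025, 3, 23))))
```

Keeping the playlist and its segments in the same per-day directory is what lets the back-end treat each date directory as one self-contained recording.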

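Key functionalities:

1. RTSP to HLS Conversion:

Converts the RTSP live stream from the camera to HLS format for web playback.

2. Segmented Storage:

Generates .ts segments (about 60 seconds each, as set by -hls_time 60) stored in ./data/YYYYMMDD/. Because -c copy does not re-encode, segments are cut at keyframes, so actual durations vary slightly.

3. Playlist Management:

playlist.m3u8 lists all segments for on-demand playback (-hls_list_size 0 keeps every entry).

4. Automatic Cleanup:

The delete_segments flag removes segments once they drop out of the playlist. With -hls_list_size 0 nothing drops out, so to prevent the disk from filling up, either set a finite list size or remove old footage separately.

5. Time-based Directory Structure:

Dynamically generates directories and filenames based on the timestamp, for example:

./data/20250323/20250323_120000.ts
./data/20250323/20250323_120100.ts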
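Since the per-day directory layout makes retention straightforward, disk usage can also be bounded outside ffmpeg. The following is a minimal sketch, assuming a retention policy expressed in days. The function names expired_day_dirs and remove_expired and the keep_days parameter are hypothetical and not part of the original project.

```python
import shutil
from datetime import date, datetime, timedelta
from pathlib import Path


def expired_day_dirs(data_dir, today, keep_days):
    """Return YYYYMMDD directories under data_dir older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for entry in sorted(Path(data_dir).iterdir()):
        name = entry.name
        # Only consider directories named with exactly eight digits.
        if not entry.is_dir() or len(name) != 8 or not name.isdigit():
            continue
        try:
            day = datetime.strptime(name, "%Y%m%d").date()
        except ValueError:
            continue  # eight digits, but not a real calendar date
        if day < cutoff:
            expired.append(entry)
    return expired


def remove_expired(data_dir, today, keep_days):
    """Delete expired day directories and return their names."""
    removed = []
    for entry in expired_day_dirs(data_dir, today, keep_days):
        shutil.rmtree(entry)
        removed.append(entry.name)
    return removed
```

A call such as remove_expired("./data", date.today(), keep_days=30) could be run once a day, for example from cron. Directories that do not match the date naming scheme are left untouched.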

3.2.3 Video Playback

At this point, only video recording and storage are complete. A front-end page is still needed to display the video, and serving the data requires back-end support.

3.2.3.1 Backend

The video streaming service backend, built on the Flask framework, provides structured HLS streaming services. Through automated directory scanning and dynamic routing, it manages and distributes video resources categorized by date, primarily serving scenarios requiring time-based retrieval of video content.

1. Data Discovery Layer

Directory Scanning: Dynamically explores the ../ffmpeg/data directory structure using the get_available_dates() function.
Validation: Verifies both directory naming compliance (YYYYMMDD format) and playlist integrity (presence of playlist.m3u8).
Timeline Sorting: Sorts directories with valid dates in ascending chronological order to ensure accurate timing in the front-end timeline display.

2. Interface Service Layer

Visualization Portal: The root route serves the index.html template to support front-end interface integration.
Structured Data Interface: The /api/dates endpoint returns a standardized date list (YYYY-MM-DD format), separating front-end and back-end data.
Intelligent Format Conversion: Automatically converts stored date identifiers into a user-friendly display format.

3. Streaming Service Layer

Protocol Adaptation Routing: The /video/ route implements HLS streaming.
M3U8 Index Distribution: Directly returns the playlist file for the specified date.
TS Segment Processing: The serve_ts_files function locates segment resources and verifies them before serving.
Four-fold Security Verification: Includes file name length verification, date format verification, a directory existence check, and a file presence check.
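The conversion between the storage format (YYYYMMDD) and the display format (YYYY-MM-DD) described above can be isolated into two small helpers. This is a minimal sketch. The names storage_to_display and display_to_storage are hypothetical, and the explicit length and digit check is an added assumption to reject loosely formatted names that strptime alone might accept.

```python
from datetime import datetime


def storage_to_display(dirname):
    """Convert '20250323' to '2025-03-23'; raise ValueError if invalid."""
    if len(dirname) != 8 or not dirname.isdigit():
        raise ValueError(f"not a YYYYMMDD name: {dirname!r}")
    return datetime.strptime(dirname, "%Y%m%d").strftime("%Y-%m-%d")


def display_to_storage(text):
    """Convert '2025-03-23' to '20250323'; raise ValueError if invalid."""
    return datetime.strptime(text, "%Y-%m-%d").strftime("%Y%m%d")


print(storage_to_display("20250323"))    # 2025-03-23
print(display_to_storage("2025-03-23"))  # 20250323
```

Routing both directions through datetime means impossible dates such as February 30 are rejected instead of being passed on as directory names.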

from flask import Flask, render_template, jsonify, send_from_directory
import os
from datetime import datetime

app = Flask(__name__)
# Resolve the data directory relative to this file so that os.listdir()
# and send_from_directory() agree regardless of the working directory
DATA_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__), '../ffmpeg/data'))

def get_available_dates():
    """Return YYYYMMDD directory names that contain a playlist, oldest first."""
    dates = []
    for dirname in os.listdir(DATA_DIR):
        dir_path = os.path.join(DATA_DIR, dirname)
        m3u8_path = os.path.join(dir_path, 'playlist.m3u8')
        if os.path.isdir(dir_path) and os.path.exists(m3u8_path):
            try:
                datetime.strptime(dirname, '%Y%m%d')
                dates.append(dirname)
            except ValueError:
                continue
    dates.sort(key=lambda x: datetime.strptime(x, '%Y%m%d'))
    return dates

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/api/dates')
def api_dates():
    dates = get_available_dates()
    # Convert YYYYMMDD to YYYY-MM-DD for display
    formatted_dates = [f"{d[:4]}-{d[4:6]}-{d[6:8]}" for d in dates]
    return jsonify({'dates': formatted_dates})

@app.route('/video/<filename>')
def serve_m3u8(filename):
    # Segment URIs in the playlist resolve relative to /video/,
    # so .ts requests also arrive on this route
    if filename.endswith('.ts'):
        return serve_ts_files(filename)

    # Otherwise the path component is a date in YYYYMMDD format
    try:
        datetime.strptime(filename, '%Y%m%d')
    except ValueError:
        return "Invalid date format", 400

    dir_path = os.path.join(DATA_DIR, filename)
    return send_from_directory(dir_path, 'playlist.m3u8')


def serve_ts_files(filename):
    # Extract the date portion (first 8 characters) from the file name
    if len(filename) < 8:
        return "Invalid filename", 400

    date_str = filename[:8]
    try:
        # Validate date format
        datetime.strptime(date_str, '%Y%m%d')
    except ValueError:
        return "Invalid date format in filename", 400

    # Construct the directory and file paths
    dir_path = os.path.join(DATA_DIR, date_str)
    if not os.path.isdir(dir_path):
        return "Date directory not found", 404

    file_path = os.path.join(dir_path, filename)
    if not os.path.exists(file_path):
        return "File not found", 404

    return send_from_directory(dir_path, filename)

if __name__ == '__main__':
    # debug=True is for local development only
    app.run(debug=True)
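The back-end treats the mere presence of playlist.m3u8 as proof that a day is playable. A stricter check could also confirm that every segment the playlist references is still on disk. The sketch below is one possible approach using only the standard library. The function name check_playlist is hypothetical, and it assumes that segments sit next to the playlist and can be matched by base name.

```python
from pathlib import Path


def check_playlist(day_dir):
    """Return (total_seconds, missing_segments) for day_dir/playlist.m3u8."""
    day_dir = Path(day_dir)
    total = 0.0
    missing = []
    for raw in (day_dir / "playlist.m3u8").read_text().splitlines():
        line = raw.strip()
        if line.startswith("#EXTINF:"):
            # Tag format: "#EXTINF:<duration>,<optional title>"
            total += float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line and not line.startswith("#"):
            # Any non-comment line is a segment URI; compare by base name
            # (assumption: segments are stored beside the playlist).
            name = Path(line).name
            if not (day_dir / name).is_file():
                missing.append(name)
    return total, missing
```

The returned total duration could feed the front-end status display, and a non-empty list of missing segments could be used to hide or flag a damaged day in the date list.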

3.2.3.2 Front-end

The front-end primarily handles human-computer interaction, using the HLS streaming protocol to enable cross-platform video playback. A visual user interface is built for date selection, playback control, and status feedback, forming a complete streaming solution in conjunction with the back-end services.

1. Presentation Layer Design

Responsive Layout:

Desktop adaptation achieved through container maximum width constraints and automatic margins

Visual Hierarchy:

The control bar uses Flex layout to maintain element spacing, and the video player features a full-width black background to enhance visual focus

Status Visualization:

Dynamically displays playback progress and total duration for enhanced feedback

2. Streaming Adaptation Layer

Dual-Mode Compatibility:

The Hls.js library is preferred for advanced feature support

Native HTML5 HLS playback is used as a fallback in browsers such as Safari, where Hls.js is not supported

Intelligent Instance Management:

Dynamically destroys and recreates HLS objects to avoid multi-stream memory leaks

Event-Driven Loading:

Loading order is ensured through the MEDIA_ATTACHED and MANIFEST_PARSED event chains

3. Control Logic Layer

Timed Operation Flow:

Select a date → Generate a standardized request → Initialize the playback engine → Automatically play

Playback Status Linkage:

User actions (play/pause) are synchronized with the video element status in real time

Intelligent Date Conversion:

Automatically converts display format dates to storage format parameters

<!-- templates/index.html -->
<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>Video surveillance system</title>
    <script src="https://cdn.jsdelivr.net/npm/hls.js@1.1.5"></script>
    <style>
        .container {
            max-width: 800px;
            margin: 20px auto;
            padding: 20px;
        }
        .controls {
            margin-bottom: 20px;
            display: flex;
            gap: 10px;
            align-items: center;
        }
        #videoPlayer {
            width: 100%;
            background: #000;
        }
        button {
            padding: 5px 15px;
            cursor: pointer;
        }
    </style>
</head>
<body>
    <div class="container">
        <div class="controls">
            <select id="dateSelect">
                <option value="">Select Date</option>
            </select>
            <button id="playBtn">Play</button>
            <button id="pauseBtn">Pause</button>
            <span id="status">Ready</span>
        </div>
        <video id="videoPlayer" controls></video>
    </div>

    <script>
        const video = document.getElementById('videoPlayer');
        let hls = null;
        let currentDate = '';

        // Initialize HLS support detection
        if (Hls.isSupported()) {
            hls = new Hls();
            hls.attachMedia(video);
        }

        // Load available dates into the drop-down
        fetch('/api/dates')
            .then(res => res.json())
            .then(data => {
                const select = document.getElementById('dateSelect');
                data.dates.forEach(date => {
                    const option = document.createElement('option');
                    option.value = date;
                    option.textContent = date;
                    select.appendChild(option);
                });
            });

        // Date selection event
        document.getElementById('dateSelect').addEventListener('change', function() {
            const selectedDate = this.value;
            if (!selectedDate) return;

            // Convert display format (YYYY-MM-DD) to storage format (YYYYMMDD)
            currentDate = selectedDate.replace(/-/g, '');
            const m3u8Url = `/video/${currentDate}`;

            if (Hls.isSupported()) {
                // Destroy the previous instance before loading another stream
                if (hls) {
                    hls.destroy();
                }
                hls = new Hls();
                hls.attachMedia(video);
                hls.on(Hls.Events.MEDIA_ATTACHED, () => {
                    hls.loadSource(m3u8Url);
                    hls.on(Hls.Events.MANIFEST_PARSED, () => {
                        video.play();
                    });
                });
            } else if (video.canPlayType('application/vnd.apple.mpegurl')) {
                // Native HLS playback (e.g., Safari)
                video.src = m3u8Url;
                video.addEventListener('loadedmetadata', () => {
                    video.play();
                }, { once: true });
            }
        });

        // Playback controls
        document.getElementById('playBtn').addEventListener('click', () => {
            video.play();
        });

        document.getElementById('pauseBtn').addEventListener('click', () => {
            video.pause();
        });

        // Update status display
        video.addEventListener('timeupdate', () => {
            const status = document.getElementById('status');
            status.textContent = `Playing - ${formatTime(video.currentTime)}/${formatTime(video.duration)}`;
        });

        function formatTime(seconds) {
            // duration is NaN until the metadata has loaded
            if (!Number.isFinite(seconds)) return '--:--:--';
            return new Date(seconds * 1000).toISOString().slice(11, 19);
        }
    </script>
</body>
</html>

Demo video: https://www.bilibili.com/video/BV1zSXYYAEhJ/

4. Future Prospects

4.1 Event Detection and Quick Navigation

Currently, the front-end only supports manual seeking through the video.

Future improvements: Use face recognition to identify family members and detect strangers, and add event tracks so that users can jump quickly to events on the playback timeline.
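One building block for such event navigation is mapping the wall-clock time of an event to a seek position in the day's playlist. Because recording may have gaps, the position cannot be obtained by simple subtraction from midnight. The sketch below is a hedged illustration, not part of the current implementation. The function names segment_start and playback_offset are hypothetical, and the segment list is assumed to be derived from the playlist in playback order.

```python
from datetime import datetime


def segment_start(filename):
    """Parse the start time from a name like '20250323_120000.ts'."""
    return datetime.strptime(filename[:15], "%Y%m%d_%H%M%S")


def playback_offset(segments, event_time):
    """Map a wall-clock event time to a seek position in seconds.

    segments: list of (filename, duration_seconds) in playlist order.
    Returns None when no recorded segment covers the event time.
    """
    elapsed = 0.0
    for filename, duration in segments:
        delta = (event_time - segment_start(filename)).total_seconds()
        if 0 <= delta < duration:
            return elapsed + delta
        elapsed += duration
    return None


# Two one-minute segments with a recording gap between them.
segments = [("20250323_120000.ts", 60.0), ("20250323_120500.ts", 60.0)]
print(playback_offset(segments, datetime(2025, 3, 23, 12, 5, 30)))  # 90.0
```

The front-end could assign the returned value to the video element's currentTime to jump directly to the event.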

4.2 Integration with HomeAssistant

The current implementation is standalone, with its own custom front-end and back-end.

Embedding the system into HomeAssistant (HASS) would broaden its application and accessibility.

Ai-Thinker

Ai-Thinker is a leading supplier of IoT wireless products and solutions, including antennas, modules, and RF lab services.
