Reverse Engineering eufy Security Camera Videos - From .zxvideo to MP4
- Author: Christian Guevara (@cgTheDev)
Content
- The Problem
- A Note on AI Collaboration
- The Journey
- The Final Script
- Results
- Key Takeaways
- What About That AES Key?
- Final Thoughts
The Problem
I have a eufy Indoor Cam S350 with a 128GB microSD card full of recordings. After months of footage, I wanted to bulk download everything to my Mac. Sounds simple, right?
Plot twist: eufy doesn't make this easy.
The recordings are stored in a proprietary .zxvideo format that no media player recognizes. The only "official" way to get your videos is downloading them one-by-one through the eufy app. With 5,656 videos on my SD card, that wasn't happening.
So I did what any reasonable developer would do - I reverse engineered the format.
A Note on AI Collaboration
Full transparency: I used AI extensively for this project—specifically Claude Opus 4.5 with extended thinking.
The approach and ideas were mine: extracting headers from app-downloaded videos, finding the audio sync offset by comparison, the validation logic for AAC frames. But I have zero experience with FFmpeg, hex manipulation, or video container formats. I couldn't have written this script from scratch.
What I did was ask Claude to explain concepts—NAL units, ADTS headers, H.265 structure, why HEVC needs VPS/SPS/PPS—so I could actually understand what I was doing (and explain it here). The script itself was largely AI-generated based on my requirements and iterative testing.
Curiosity takes you to interesting places. I went from "why won't this file play" to understanding video codecs at the byte level. Not because I needed to become a video format expert, but because I refused to manually download 5,656 videos through an app. I'm a lazy mf...
The Journey
First Obstacle: ext4 File System
When I plugged the SD card into my Mac... nothing. macOS doesn't natively support ext4 (the Linux filesystem eufy uses).
Solution: Paragon extFS for Mac ($39.95, free trial). After installation, the card mounted and revealed the folder structure:
/Volumes/8416P0023340562/
├── Camera00/
│ ├── continue/ # Continuous recordings
│ │ └── 202403/20240301/20240301001359.zxvideo
│ └── event/ # Motion events
│ └── 202403/20240301/
│ ├── 20240301140220.zxvideo
│ ├── 20240301140220.txt
│ ├── 20240301140220_snapshot.jpg
│ └── 20240301140220_crop_zx_*.jpg
109GB of data with multiple file types:
| Type | Count | What It Is |
|---|---|---|
| .zxvideo | 5,656 | Video recordings |
| .txt | 5,657 | JSON metadata |
| .jpg | 2,634 | Snapshots & detection crops |
| .stats/.evt/.crop/.lst | ~50 | Internal indexes (skip) |
Plot twist #2: The .jpg files are NOT standard images. They're encrypted:
eufysecurity:T8416P0023340562:0184391229:<encrypted data>
Unlike the video data (which is unencrypted), the thumbnails are all encrypted with what I think is a proprietary scheme. Not worth trying to crack: just generate new thumbnails from the videos.
The two image types:
- *_snapshot.jpg - Full frame at event start
- *_crop_zx_*.jpg - The detected person/pet that triggered the event (the bounding box crop)
What we do:
- Copy encrypted images - preserved in case someone figures it out later
- Generate thumbnails from videos - extract frames at 1s and 5s
ffmpeg -ss 00:00:01 -i video.mp4 -vframes 1 -q:v 2 thumb_1s.jpg
ffmpeg -ss 00:00:05 -i video.mp4 -vframes 1 -q:v 2 thumb_5s.jpg
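For a whole folder of videos, the two commands above can be generated rather than typed. This is a minimal sketch, not the actual script: the helper name `thumbnail_commands` and its `offsets` parameter are made up for illustration, and the `_thumb_1s.jpg` naming simply follows the output layout shown later in this post.

```python
from pathlib import Path

def thumbnail_commands(video_path, offsets=(1, 5)):
    """Build one ffmpeg command per thumbnail offset (in seconds).

    Hypothetical helper: it only builds the argument lists, it does not run them.
    """
    video = Path(video_path)
    commands = []
    for seconds in offsets:
        out = video.with_name(f"{video.stem}_thumb_{seconds}s.jpg")
        commands.append([
            "ffmpeg", "-y",
            "-ss", str(seconds),   # seek to the offset before decoding
            "-i", str(video),
            "-vframes", "1",       # grab a single frame
            "-q:v", "2",           # JPEG quality (lower is better)
            str(out),
        ])
    return commands
```

Each returned list can be handed to `subprocess.run(cmd, check=True)`. Videos shorter than 5 seconds would probably need a fallback offset, which this sketch does not handle.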
Now to figure out what's inside those .zxvideo files.
Analyzing the Format
Time to break out the hex editor (well, xxd):
xxd video.zxvideo | head -20
00000000: 585a 5948 1405 29c6 0000 0300 0001 0068 XZYH..)........h
00000010: 13c6 0000 0102 72e5 0f00 000f 7008 cc5b ......r.....p..[
...
000000bc: 0000 0001 2601 ac20 c01a 0d97 d663 9b5f ....&.. .....c._
Key findings:
- XZYH - Magic bytes (eufy's signature)
- 00 00 00 01 26 at offset 0xBC - That's an H.265/HEVC NAL start code (00 00 00 01) followed by the first byte of an IDR slice header (0x26 = NAL type 19)!
- The video is standard H.265, just wrapped in a custom container
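The same checks can be done in a few lines of Python instead of reading hex by eye. This is a rough sketch under a couple of assumptions: `inspect_zxvideo` is a hypothetical name, the search starts at 0x10 as the converter later in this post does, and the sample bytes are a synthetic stand-in built from the dump above rather than a real recording.

```python
MAGIC = b"XZYH"
START_CODE = b"\x00\x00\x00\x01"

def inspect_zxvideo(data, search_from=0x10):
    """Return (has_magic, offset_of_first_start_code, nal_type)."""
    has_magic = data[:4] == MAGIC
    offset = data.find(START_CODE, search_from)
    if offset == -1 or offset + 4 >= len(data):
        return has_magic, None, None
    # HEVC NAL header: 1 forbidden bit, then a 6-bit type
    nal_type = (data[offset + 4] >> 1) & 0x3F
    return has_magic, offset, nal_type

# Synthetic file: magic, filler up to 0xBC, then the bytes seen in the dump
sample = MAGIC + b"\xAA" * (0xBC - 4) + bytes.fromhex("0000000126 01ac20")
print(inspect_zxvideo(sample))  # (True, 188, 19)
```

188 is 0xBC, and type 19 is an IDR slice, so the very first NAL unit in the file is already picture data.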
The metadata .txt files were JSON goldmines:
{
  "res_best_width": 3840,
  "res_best_height": 2160,
  "frame_num": 1801,
  "start_time": "2024-04-25 18:26:50",
  "mic_status": 1,
  "aes": "OkppPUttOlB6MU93KFppMQ=="
}
4K video, 1801 frames, and... wait, is that an AES key? 🤔
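Here is a small sketch of pulling the useful bits out of one of these files. Two things in it are assumptions rather than facts from the metadata: the 15 fps figure is borrowed from the `-r 15` passed to ffmpeg elsewhere in this post, and `summarize_metadata` is just an illustrative name.

```python
import json

ASSUMED_FPS = 15  # assumption: the frame rate given to ffmpeg, not a metadata field

def summarize_metadata(text):
    """Turn the JSON sidecar into a short human-readable summary dict."""
    meta = json.loads(text)
    return {
        "resolution": f'{meta["res_best_width"]}x{meta["res_best_height"]}',
        "start_time": meta["start_time"],
        "frames": meta["frame_num"],
        "approx_seconds": meta["frame_num"] / ASSUMED_FPS,
    }

sample = """{
  "res_best_width": 3840,
  "res_best_height": 2160,
  "frame_num": 1801,
  "start_time": "2024-04-25 18:26:50",
  "mic_status": 1,
  "aes": "OkppPUttOlB6MU93KFppMQ=="
}"""
print(summarize_metadata(sample))
```

At 15 fps, 1801 frames works out to roughly two minutes of footage.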
The Missing Headers Problem
I extracted the video NAL units and tried to play them with ffmpeg:
ffprobe extracted.h265
[hevc] PPS id out of range
[hevc] Skipping invalid undecodable NALU
Could not find codec parameters for stream 0 (Video: hevc)
The issue: H.265/HEVC requires three initialization headers (VPS, SPS, PPS) that define the video's resolution and encoding parameters. The .zxvideo files don't include them - the camera injects them during playback.
The solution: Download ANY video through the eufy app (which exports as MP4) and extract the headers from there:
# Extract headers from app-downloaded video
# app_video.h265 is the raw Annex B stream from the app's MP4, e.g. via:
#   ffmpeg -i app_video.mp4 -c:v copy -an -bsf:v hevc_mp4toannexb app_video.h265
with open('app_video.h265', 'rb') as f:
    data = f.read()

# Find first IDR frame (NAL type 19) - headers are everything before it
headers = None
for i in range(len(data) - 4):
    if data[i:i+4] == b'\x00\x00\x00\x01':
        nal_type = (data[i+4] >> 1) & 0x3F
        if nal_type == 19:  # IDR frame
            headers = data[:i]  # 238 bytes of VPS/SPS/PPS
            break
These 238 bytes work for ALL videos from the same camera since they use the same encoding settings.
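Before reusing a header blob across thousands of files, it is worth checking that it really contains all three parameter sets. A minimal sketch, assuming 4-byte start codes only (3-byte start codes would be missed) and using made-up helper names; the NAL type numbers 32, 33 and 34 are the standard HEVC values for VPS, SPS and PPS.

```python
START_CODE = b"\x00\x00\x00\x01"
VPS, SPS, PPS = 32, 33, 34  # HEVC NAL unit types for the parameter sets

def nal_types(stream):
    """List the NAL unit type following each 4-byte start code."""
    types = []
    pos = stream.find(START_CODE)
    while pos != -1:
        header = pos + len(START_CODE)
        if header < len(stream):
            types.append((stream[header] >> 1) & 0x3F)
        pos = stream.find(START_CODE, header)
    return types

def headers_look_complete(headers):
    """True if the blob contains at least one VPS, SPS and PPS."""
    return {VPS, SPS, PPS} <= set(nal_types(headers))

# Synthetic blob: three tiny fake parameter sets (payload bytes are filler)
fake_headers = (START_CODE + b"\x40\x01\xAA" +
                START_CODE + b"\x42\x01\xBB" +
                START_CODE + b"\x44\x01\xCC")
print(nal_types(fake_headers))            # [32, 33, 34]
print(headers_look_complete(fake_headers))  # True
```

Running `headers_look_complete` on the extracted bytes is a cheap sanity check; it says nothing about whether the parameters match a different camera model or resolution.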
Finding the Audio
After getting video working, I noticed... no audio. The videos should have sound!
Searching through the hex dump, I found AAC ADTS frames scattered throughout the file:
# AAC ADTS sync word: 12 set bits, so a frame starts with 0xFFF1 or 0xFFF9
if data[i] == 0xFF and (data[i+1] & 0xF0) == 0xF0:
    pass  # Found an AAC frame!
The audio is interleaved with video throughout the file, not in a separate section. Here's the extraction with validation:
def extract_audio(data):
    audio_data = bytearray()
    i = 0
    while i < len(data) - 7:
        if data[i] == 0xFF and (data[i+1] & 0xF0) == 0xF0:
            # Parse ADTS header
            profile = ((data[i+2] >> 6) & 0x03) + 1
            sample_rate_idx = (data[i+2] >> 2) & 0x0F
            frame_len = ((data[i+3] & 0x03) << 11) | (data[i+4] << 3) | ((data[i+5] >> 5) & 0x07)
            # Validate: AAC-LC (profile 2), 16 kHz (index 8), reasonable frame size
            if profile == 2 and sample_rate_idx == 8 and 7 <= frame_len <= 1024:
                audio_data.extend(data[i:i+frame_len])
                i += frame_len
                continue
        i += 1
    return bytes(audio_data)
Audio specs: AAC-LC, 16000 Hz, mono - standard for security cameras.
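Those specs can be read straight out of the 7-byte ADTS header. The sketch below decodes the fields that matter here; `parse_adts_header` is an illustrative name, the sample-rate table is the standard MPEG-4 one, and the example header is hand-built for the test, not copied from a recording.

```python
# Standard MPEG-4 sampling frequency table (index 8 is 16000 Hz)
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                24000, 22050, 16000, 12000, 11025, 8000, 7350]

def parse_adts_header(header):
    """Decode the main ADTS fields, or return None if this is not a header."""
    if len(header) < 7 or header[0] != 0xFF or (header[1] & 0xF0) != 0xF0:
        return None
    rate_idx = (header[2] >> 2) & 0x0F
    if rate_idx >= len(SAMPLE_RATES):
        return None
    return {
        "object_type": ((header[2] >> 6) & 0x03) + 1,   # 2 = AAC-LC
        "sample_rate": SAMPLE_RATES[rate_idx],
        "channel_config": ((header[2] & 0x01) << 2) | (header[3] >> 6),  # 1 = mono
        "frame_length": ((header[3] & 0x03) << 11) | (header[4] << 3) | (header[5] >> 5),
    }

# Hand-built header: AAC-LC, 16 kHz, mono, 100-byte frame
example = bytes([0xFF, 0xF1, 0x60, 0x40, 0x0C, 0x9F, 0xFC])
print(parse_adts_header(example))
# {'object_type': 2, 'sample_rate': 16000, 'channel_config': 1, 'frame_length': 100}
```

Checking more header fields like this makes false positives less likely, since a bare 0xFFFx pattern can also turn up by chance inside compressed video data.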
The Audio Sync Problem
Combining video + audio produced a video where... the audio was noticeably delayed. Lips moved, then sound came 0.5 seconds later. Not great.
After comparing with an app-downloaded video, I discovered eufy applies a -0.127 second audio offset. Adding this to ffmpeg fixed sync perfectly:
# -itsoffset applies to the input that follows it (the audio): the magic offset!
ffmpeg -y \
  -f hevc -r 15 -i video.h265 \
  -itsoffset -0.127 -i audio.aac \
  -c:v libx265 -crf 32 -tag:v hvc1 \
  -c:a aac -b:a 64k \
  -movflags +faststart \
  output.mp4
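One hypothetical way to measure an offset like this is to compare the stream start times that ffprobe reports for an app-downloaded MP4. This is a sketch of that idea and not necessarily how the -0.127 figure was found; the sample JSON below uses made-up values purely to show the arithmetic.

```python
import json

# Command that would produce the JSON on a real file:
#   ffprobe -v error -show_entries stream=codec_type,start_time -of json app_video.mp4

def audio_offset(ffprobe_json):
    """Return audio start time minus video start time, in seconds."""
    streams = json.loads(ffprobe_json)["streams"]
    starts = {s["codec_type"]: float(s["start_time"]) for s in streams}
    return starts["audio"] - starts["video"]

# Illustrative ffprobe output (values invented for the example)
sample = """{
  "streams": [
    {"codec_type": "video", "start_time": "0.127000"},
    {"codec_type": "audio", "start_time": "0.000000"}
  ]
}"""
print(audio_offset(sample))  # -0.127
```

If the app's files do not encode the shift in their start times, this shows nothing, and lining up a sharp sound against the picture by ear is the fallback.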
Quality Tuning
I compared different CRF (quality) settings against the eufy app's output:
| CRF | File Size | SSIM vs App | Notes |
|---|---|---|---|
| 18 | 144 MB | 99.54% | Maximum quality |
| 23 | 80 MB | 99.36% | Balanced |
| 28 | 51 MB | 99.19% | Good compression |
| 32 | 33 MB | 98.64% | Matches app |
CRF 32 produces files nearly identical to the app (98.6% SSIM) at the same file size. The quality difference is imperceptible to human eyes.
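A comparison like the one in the table can be scripted with ffmpeg's `ssim` filter. This is a sketch under assumptions: it is not necessarily the tool used for the numbers above, the helper names are invented, both files need the same resolution and frame count for the score to mean much, and the log line in the example is an illustrative sample of the filter's summary format.

```python
import re
import subprocess

def ssim_command(candidate, reference):
    """ffmpeg invocation that compares two videos and discards the output."""
    return ["ffmpeg", "-i", str(candidate), "-i", str(reference),
            "-lavfi", "ssim", "-f", "null", "-"]

def parse_ssim(ffmpeg_log):
    """Pull the overall ('All:') SSIM score out of ffmpeg's log text."""
    match = re.search(r"SSIM .*All:([0-9.]+)", ffmpeg_log)
    return float(match.group(1)) if match else None

def measure_ssim(candidate, reference):
    """Run the comparison; ffmpeg writes the summary line to stderr."""
    result = subprocess.run(ssim_command(candidate, reference),
                            capture_output=True, text=True, check=True)
    return parse_ssim(result.stderr)

# Illustrative summary line, not real output from these videos
log = "[Parsed_ssim_0 @ 0x1234] SSIM Y:0.985100 (18.27) U:0.991200 (20.56) V:0.990800 (20.36) All:0.986400 (18.66)"
print(parse_ssim(log))  # 0.9864
```

Looping `measure_ssim` over a few CRF values against one app-exported reference would give a table like the one above.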
The Final Script
After all that reverse engineering, I built a full extraction tool that handles everything:
What it extracts:
- .zxvideo → .mp4 (converted with synced audio)
- .jpg → copied (snapshots + detection crops)
- .txt → copied (JSON metadata with timestamps)
What it skips (internal camera indexes):
- .stats, .evt, .crop, .lst
Here's the core conversion logic:
#!/usr/bin/env python3
"""eufy .zxvideo to MP4 Converter"""
import os
import subprocess
import tempfile
from pathlib import Path

AUDIO_OFFSET = -0.127  # Sync correction
VIDEO_CRF = 32         # Matches app quality

def convert(zxvideo_path, output_path, headers):
    data = Path(zxvideo_path).read_bytes()

    # Extract video (prepend headers)
    video_start = data.find(b'\x00\x00\x00\x01', 0x10)
    if video_start == -1:
        raise ValueError(f"no NAL start code found in {zxvideo_path}")
    video_data = headers + data[video_start:]

    # Extract audio (find AAC frames)
    audio_data = extract_audio(data)

    # Write to temp files
    with tempfile.NamedTemporaryFile(suffix='.h265', delete=False) as vf:
        vf.write(video_data)
        video_temp = vf.name
    with tempfile.NamedTemporaryFile(suffix='.aac', delete=False) as af:
        af.write(audio_data)
        audio_temp = af.name

    try:
        # FFmpeg: combine with sync offset
        subprocess.run([
            'ffmpeg', '-y',
            '-f', 'hevc', '-r', '15', '-i', video_temp,
            '-itsoffset', str(AUDIO_OFFSET), '-i', audio_temp,
            '-c:v', 'libx265', '-crf', str(VIDEO_CRF), '-tag:v', 'hvc1',
            '-c:a', 'aac', '-b:a', '64k',
            '-movflags', '+faststart', '-shortest',
            str(output_path)
        ], check=True)
    finally:
        # delete=False leaves the temp files behind, so remove them ourselves
        os.unlink(video_temp)
        os.unlink(audio_temp)
The full script runs in 3 phases:
- Phase 1: Copy all .jpg files (fast, ~2,600 files)
- Phase 2: Copy all .txt metadata (fast, ~5,600 files)
- Phase 3: Convert all .zxvideo → .mp4 (slow, ~12 hours for 5,600 videos)
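The three phases boil down to walking the card once and sorting files by extension. Here is a minimal sketch of that planning step; it is not the actual script, `plan_extraction` and the dictionary keys are invented for illustration, and it only decides what to do without copying or converting anything.

```python
from pathlib import Path

def plan_extraction(source_root, dest_root):
    """Sort every file under source_root into copy / convert / skip buckets.

    Destination paths mirror the source layout under dest_root.
    """
    source_root, dest_root = Path(source_root), Path(dest_root)
    plan = {"copy_jpg": [], "copy_txt": [], "convert": [], "skipped": []}
    for path in sorted(source_root.rglob("*")):
        if not path.is_file():
            continue
        target = dest_root / path.relative_to(source_root)
        suffix = path.suffix.lower()
        if suffix == ".jpg":
            plan["copy_jpg"].append((path, target))        # phase 1
        elif suffix == ".txt":
            plan["copy_txt"].append((path, target))        # phase 2
        elif suffix == ".zxvideo":
            plan["convert"].append((path, target.with_suffix(".mp4")))  # phase 3
        else:
            plan["skipped"].append(path)                   # .stats, .evt, .crop, .lst
    return plan
```

Planning first and acting second makes it easy to print the counts up front and to resume a long conversion run by skipping targets that already exist.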
Results
| Metric | Value |
|---|---|
| Videos converted | 5,656 |
| Thumbnails generated | 11,312 (2 per video at 1s & 5s) |
| Encrypted images | 2,634 (copied, not viewable) |
| Metadata files | 5,657 |
| Total source size | 109 GB |
| Output size | ~200 GB (CRF 32) |
| Time per video | ~2 minutes |
| Audio sync | Perfect ✓ |
| QuickTime compatible | Yes ✓ |
Output folder structure mirrors the original:
~/Downloads/eufy_converted/
├── event/202404/20240425/
│ ├── 20240425182650.mp4 ← converted video
│ ├── 20240425182650.txt ← metadata
│ ├── 20240425182650_thumb_1s.jpg ← generated from video
│ ├── 20240425182650_thumb_5s.jpg ← generated from video
│ ├── 20240425182650_snapshot.jpg ← encrypted (preserved)
│ └── 20240425182650_crop_zx_*.jpg ← encrypted (preserved)
├── continue/...
└── extraction_summary.json ← stats & any failures
Key Takeaways
- Video data isn't encrypted - despite the aes field in metadata, the actual video/audio data is unencrypted
- Images ARE encrypted - proprietary format, not worth cracking, just generate new thumbnails
- HEVC headers are reusable - same camera = same encoding = same headers
- Audio sync is consistent - -0.127s offset works for all videos
- ext4 is the main barrier - once you can read the filesystem, video extraction is straightforward
- Preserve what you can't decrypt - encrypted images are copied for potential future decryption
What About That AES Key?
The metadata contains Base64-encoded AES keys:
aes: "OkppPUttOlB6MU93KFppMQ==" → ":Ji=Km:Pz1Ow(Zi1"
These encrypt a small header region (bytes 0x10-0xA6), but not the actual video/audio data. The camera probably uses this for DRM or authentication, but it's not needed for playback.
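Decoding the field takes one call to the standard library. The sketch below uses the value from the metadata example earlier in this post; `decode_aes_field` is just an illustrative name, and nothing here decrypts anything.

```python
import base64

def decode_aes_field(value):
    """Base64-decode the 'aes' metadata field into raw key bytes."""
    return base64.b64decode(value)

key = decode_aes_field("OkppPUttOlB6MU93KFppMQ==")
print(len(key))             # 16
print(key.decode("ascii"))  # :Ji=Km:Pz1Ow(Zi1
```

Sixteen bytes is the right length for an AES-128 key, and in this sample every byte happens to be printable ASCII.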
Final Thoughts
What started as "I just want my videos" turned into a fun reverse engineering project. The .zxvideo format is actually quite simple once you understand it:
- Standard H.265 video (missing headers)
- Standard AAC audio (interleaved)
- Proprietary container (easily bypassed)
If you have a eufy camera and want to bulk export your local recordings, the tools are now available. No cloud, no app, just your videos.
Download the tools:
- convert_eufy_videos.py - The full extraction script
- hevc_headers.bin - HEVC headers for eufy Indoor Cam S350
Usage:
# Batch mode - extract everything from SD card
python3 convert_eufy_videos.py --source "/Volumes/YOUR_SD_CARD/Camera00" --dest ~/Downloads/eufy_converted
# Single file mode
python3 convert_eufy_videos.py --source "/path/to/video.zxvideo" --dest ~/output
Hit me up on Twitter @cgTheDev if you try this!
🖖