Cutting videos in the terminal with chafa and ffmpeg


I've been working on a video editor for the terminal: demo.webm. This might be my favorite project yet. I'm excited to share a progress update.

SUCCESSES

It looks pretty! Terminal graphics libraries always look pretty. AA-lib, for example, makes ASCII art in the terminal. And catimg makes pixel art. But I'm using a library called chafa. Chafa looks better, especially at small sizes, because it uses a variety of symbols. I recommend browsing chafa's blog posts and gallery to see all the pretty examples.

It's performant! At first, I was getting only a few frames per second with full CPU usage. But after tweaking some chafa and ffmpeg parameters, I can play videos at 2x speed, at 20-30 frames per second. That's enough for smooth video playback.

I'm happy! I wanted to make a terminal video editor two years ago, but that was too ambitious for me at the time. Now it's a reality. I love making my own tools. That's the thrill of programming -- you wish something existed, and then you make it happen.

SOME FFMPEG RECIPES

The whole program centers around two ffmpeg commands. The first command decodes a video into frames of pixels:

ffmpeg \
  -ss 0.000 \
  -i video.mp4 \
  -vf scale=iw/2:ih/2 \
  -f rawvideo \
  -pix_fmt rgb24 \
  pipe:
  1. Input any video: mp4, mkv, mov, etc., or even a URL to a video
  2. Downsize the video with -vf scale=iw/2:ih/2 to make the program faster
  3. Start anywhere; for example, to jump to the middle of a 60-minute video, start decoding with -ss 1800.000
  4. Output raw pixels with -f rawvideo and -pix_fmt rgb24
  5. Stream the pixels through pipe: (ffmpeg's stdout) into the main program

In short: ffmpeg does all the heavy lifting. It converts the video to pixels, chafa converts the pixels into symbols, and the symbols are printed to the terminal.
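To give a sense of how the main program consumes this stream, here's a minimal Rust sketch (Rust is my assumption; the post never names the implementation language, though the cargo install mention later hints at it). The key detail is that each rgb24 frame is exactly width × height × 3 bytes, so the pipe can be read in frame-sized chunks:

use std::io::Read;
use std::process::{Command, Stdio};

fn main() -> std::io::Result<()> {
    // Hypothetical dimensions: half of a 1280x720 input, per scale=iw/2:ih/2.
    let (width, height): (usize, usize) = (640, 360);
    let frame_bytes = width * height * 3; // rgb24 = 3 bytes per pixel

    let mut child = Command::new("ffmpeg")
        .args([
            "-ss", "0.000",
            "-i", "video.mp4",
            "-vf", "scale=iw/2:ih/2",
            "-f", "rawvideo",
            "-pix_fmt", "rgb24",
            "pipe:",
        ])
        .stdout(Stdio::piped())
        .stderr(Stdio::null()) // silence ffmpeg's chatter (it matters -- see below)
        .spawn()?;

    let mut stdout = child.stdout.take().unwrap();
    let mut frame = vec![0u8; frame_bytes];

    // Each successful read_exact yields one complete frame of pixels,
    // ready to hand to chafa for conversion into symbols.
    while stdout.read_exact(&mut frame).is_ok() {
        // render(&frame) -- chafa's job
    }
    child.wait()?;
    Ok(())
}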

The second ffmpeg command is for cutting videos. Say you recorded a screencast, and you wanted to trim away the ends of the video. If you knew the starting and ending timestamps, you could run something like:

ffmpeg \
  -ss 5.000 \
  -i screencast.mp4 \
  -to 10.000 \
  -c copy \
  trimmed.mp4

But you'd have to watch the video in a media player, find the cut points, write down those timestamps, and type out this command. Or you'd have to upload your video to a cloud service and use their web frontend. That's too much friction. That's why I made this program: you can watch the video from the command line, mark the points to cut, and it will run this ffmpeg command for you.
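For illustration, here's a hedged Rust sketch of that wrapper (the function name and signature are mine, not the actual source): it turns two timestamps marked during playback into exactly the ffmpeg invocation above.

use std::process::Command;

// Hypothetical helper: trim `input` between two timestamps (in seconds)
// marked during playback, writing the result to `output`.
fn cut(input: &str, start: f64, end: f64, output: &str) -> std::io::Result<()> {
    let status = Command::new("ffmpeg")
        .arg("-ss").arg(format!("{start:.3}"))
        .arg("-i").arg(input)
        .arg("-to").arg(format!("{end:.3}"))
        .args(["-c", "copy"]) // copy streams without re-encoding
        .arg(output)
        .status()?; // run ffmpeg to completion
    if !status.success() {
        return Err(std::io::Error::new(
            std::io::ErrorKind::Other,
            "ffmpeg exited with an error",
        ));
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    cut("screencast.mp4", 5.0, 10.0, "trimmed.mp4")
}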

I also wanted to share an interesting bug. I noticed that the program became unresponsive during longer videos. The strange part is that it always froze at the 270-second mark, give or take.

I tried different videos, slowing playback down, speeding it up, and skipping frames. But the program still froze after 270 seconds, endlessly waiting for the next frame. I was stumped. There were no errors; chafa worked fine, and ffmpeg worked fine. How could the program fail so consistently, regardless of the number of bytes read?

After some digging, I found the answer: pipe capacity.

I had the ffmpeg process connected to two pipes: one for stdout, which produced pixels, and another for stderr, which produced error messages. I was reading bytes from the stdout pipe. But I was ignoring the stderr pipe.

                 ____________________
                /                    \        
      stdout -->  always reading pixels --> main program
     /          \____________________/                
    /  
ffmpeg              
    \            ____________________
     \          /  lots of messages  \       
      stderr -->    kept piling up     
                \____________________/
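In code terms, the wiring looked something like this (a Rust sketch under the same assumption as before; the post doesn't show the actual source):

use std::process::{Command, Stdio};

fn main() -> std::io::Result<()> {
    let mut child = Command::new("ffmpeg")
        .args(["-i", "video.mp4", "-f", "rawvideo", "-pix_fmt", "rgb24", "pipe:"])
        .stdout(Stdio::piped()) // read in a loop: fine
        .stderr(Stdio::piped()) // captured, but never read
        .spawn()?;

    let _stdout = child.stdout.take().unwrap();
    // ... read pixels from `_stdout` forever, while child.stderr sits untouched ...
    child.wait()?;
    Ok(())
}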

If you've used ffmpeg, you know how verbose its stderr messages are. The messages look like this:

frame=  495 fps=329 q=-0.0 size= 1503562kB time=00:00:08.25 bitrate=1492992.0kbits/s speed=5.49x
frame=  624 fps=311 q=-0.0 size= 1895400kB time=00:00:10.40 bitrate=1492992.0kbits/s speed=5.19x
frame=  752 fps=300 q=-0.0 size= 2284200kB time=00:00:12.53 bitrate=1492992.0kbits/s speed=   5x
frame=  874 fps=291 q=-0.0 size= 2654775kB time=00:00:14.56 bitrate=1492992.0kbits/s speed=4.84x
frame= 1005 fps=286 q=-0.0 size= 3052688kB time=00:00:16.75 bitrate=1492992.0kbits/s speed=4.76x

Those messages were silently accumulating in the stderr pipe. Since ffmpeg prints 2-3 messages per second, and each message is ~100 bytes, that means there were roughly 2.5 × 100 × 270 ≈ 67500 bytes in the pipe after 270 seconds.

67500 bytes... that's right around the pipe capacity on my system, 65536 bytes! 270 seconds was no coincidence. That was how long it took for those ffmpeg messages to fill the pipe. And once the pipe was full, ffmpeg's next write to stderr blocked, ffmpeg stopped decoding, and my program sat waiting forever for a frame that would never arrive.

You can reproduce this. Here's a script that runs ffmpeg continuously, redirecting stderr to a pipe, and printing the number of bytes in the pipe after 270 seconds:

#!/bin/bash

ffmpeg \
  -f lavfi -i color=black \
  -loop 1 \
  -f null /dev/null \
  2>&1 \
  | tee >(sleep 270) \
  | wc -c

# Returns ~60500 for me.
# If you account for extra stderr messages in production,
# like video metadata and encoding information,
# that's right around 65536.

If you increase sleep 270 to something longer like sleep 500, you'll see that the pipe stays capped at 65536 bytes (or whatever the pipe capacity is on your system).

That bug was a real head-scratcher for me. I'm glad I figured it out. The simple fix is to pipe stderr to /dev/null.
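In the same Rust sketch terms, the fix is a one-liner with Stdio::null(). And if you'd rather keep the messages around for debugging, draining stderr on a background thread works too:

use std::io::{BufRead, BufReader};
use std::process::{Command, Stdio};
use std::thread;

fn main() -> std::io::Result<()> {
    let mut child = Command::new("ffmpeg")
        .args(["-i", "video.mp4", "-f", "rawvideo", "-pix_fmt", "rgb24", "pipe:"])
        .stdout(Stdio::piped())
        // The simple fix would be .stderr(Stdio::null()).
        // The alternative: keep the pipe, but make sure someone reads it.
        .stderr(Stdio::piped())
        .spawn()?;

    let stderr = child.stderr.take().unwrap();
    thread::spawn(move || {
        // Continuously draining the pipe means ffmpeg can never block on stderr.
        for line in BufReader::new(stderr).lines().map_while(Result::ok) {
            let _ = line; // keep, log, or drop each message
        }
    });

    // ... read frames from child.stdout as before ...
    child.wait()?;
    Ok(())
}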

NEXT STEPS

You can download the executable (1.1 MB) I've been using. You'll need an x86_64 Linux machine with chafa ^1.14 and ffmpeg ^3.4. Best case, you'll only need to run apt install chafa ffmpeg or a similar command on your system. Worst case, you'll fiddle with dependencies for a day and it still might not work :(

I think that's ugly. I want to slim down the installation to a one-liner command like cargo install or curl. And I want it to work for anyone, whether they're on Linux, Windows, or Mac.

I also need to make videos play at a normal speed. Right now, the program shows a frame, sleeps a few milliseconds, and repeats. This is usable but naïve: it doesn't account for the processing time between frames, and playback drifts slower because sleep() is not precise.
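One common fix, sketched below (the general technique, not necessarily what this program will end up doing): schedule every frame against an absolute deadline derived from the start time, so processing time and sleep overshoot can't accumulate.

use std::thread::sleep;
use std::time::{Duration, Instant};

fn main() {
    let fps = 30.0; // hypothetical target frame rate
    let frame_dur = Duration::from_secs_f64(1.0 / fps);
    let start = Instant::now();

    for n in 0u32..300 {
        // ... decode and render frame n here ...

        // Sleep until frame n's absolute deadline instead of a fixed
        // interval. Even if one sleep overshoots or a frame renders
        // slowly, the next deadline is still measured from `start`,
        // so errors never accumulate.
        let deadline = start + frame_dur * (n + 1);
        let now = Instant::now();
        if deadline > now {
            sleep(deadline - now);
        }
        // If `now` is already past the deadline, the player is behind
        // schedule and could drop frames to catch up.
    }
}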

Once I make videos play at the proper speed, I can sync the renderer to audio playback. That would make it doubly useful. Sometimes I cut videos based on audio cues, not just visual cues. When audio is supported, it might earn the name "video editor".

For now, though, it's more of a "video cutter". I've tentatively named it vic. I've also considered vedi, vici, and vicu. Naming is hard! I always think about the unwritten rules of naming command-line programs, especially regarding finger travel and searchability.

There are tons of small improvements to work on, too: fixing flickering labels, centering videos, capping video height, enabling segment removal, adding control over playback speed, cleaning up error handling, yada yada yada... I'm not sure how much progress I'll make now that LMT2 is over. I really needed the weekly accountability with other developers. Maybe I'll join the next LMT2 cohort, or maybe I'll commit to weekly updates on this page.