For the past few years, my workflow for editing videos for my YouTube channel was the following:
1. Write and record narration / 'A-roll' using a teleprompter
2. Import recording into timeline, and chop out silent portions manually using the blade and/or range tools
3. Work on the rest of the edit (adding 'B-roll' and inserts).
Step 3 is where the vast majority of editing time is spent, especially when I need to add in charts, motion graphics, etc.
But Step 2 is mind-numbingly boring: for a typical 10-minute video, I sit there for 30 minutes or so tweaking the cuts around the silent portions, trying to make the audio flow from one section of the recorded text to the next.
And it's not just for teleprompter recordings. If you're editing screencasts, streaming VODs, vlogs, or interviews, there's a good chance there are a lot of silent portions that need to be cut out before the full editing process begins.
There are some great apps out there that automate some or all of this for you, like:
- Recut ($99, no subscription required)
- Timebolt ($17/month and up, depending on subscription)
- Descript ($12/month and up, depending on subscription)
But I figured, Final Cut Pro X is a professional video editing application used by tons of content creators around the world... surely there's a way I can do this without buying separate software that spits out an edit decision list I have to import into Final Cut, right?
Well... sort of. After a good deal of research and testing, my new method for cutting out gaps of silence is this osascript from jashmenn. It needed a little tweaking to fit my workflow, but it combines ffmpeg's silencedetect filter with a little OSA/AppleScript automation to make all the cuts for me.
Step 2 goes like this, now:
- Run ffmpeg's silencedetect filter on the recording and save its output to silence.txt:
  ffmpeg -i [video.mp4] -af silencedetect=n=-35dB:d=800ms -f s16le -y /dev/null 2>&1 | tee silence.txt
- Make sure Final Cut Pro is open to a timeline (or compound clip) with the same video portion visible.
The script runs through the video and makes cuts at all the silent portion boundaries, then goes back and deletes all those portions.
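To make the silence.txt step a little more concrete, here is a minimal sketch of how the silencedetect log lines could be turned into a list of silent ranges. It is not the code from jashmenn's script; the function name `parseSilences` and the sample log are hypothetical, and it only assumes the `silence_start:` / `silence_end:` lines that ffmpeg's silencedetect filter prints.

```javascript
// Hypothetical helper (not taken from the original script): extract silent
// ranges, in seconds, from the text ffmpeg's silencedetect filter logs.
function parseSilences(logText) {
  const silences = [];
  let pendingStart = null;

  // ffmpeg mixes progress output (terminated by \r) into stderr, so split on both.
  for (const line of logText.split(/[\r\n]+/)) {
    const startMatch = line.match(/silence_start:\s*(-?\d+(?:\.\d+)?)/);
    if (startMatch) {
      pendingStart = parseFloat(startMatch[1]);
      continue;
    }
    const endMatch = line.match(/silence_end:\s*(-?\d+(?:\.\d+)?)/);
    if (endMatch && pendingStart !== null) {
      silences.push({ start: pendingStart, end: parseFloat(endMatch[1]) });
      pendingStart = null;
    }
  }
  return silences;
}

// Made-up sample in the shape silencedetect prints.
const sampleLog = [
  "[silencedetect @ 0x7f8] silence_start: 1.5",
  "[silencedetect @ 0x7f8] silence_end: 3.25 | silence_duration: 1.75",
  "[silencedetect @ 0x7f8] silence_start: 10",
  "[silencedetect @ 0x7f8] silence_end: 11.2 | silence_duration: 1.2",
].join("\n");

console.log(parseSilences(sampleLog));
```

A silence that runs to the end of the file without a matching `silence_end` line is simply dropped in this sketch.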
It's not perfect. It would be nice to have a few more robust features, like a real noise gate (attack, decay, etc.), so I don't end up with tiny bits where there is a pop or I set something down. But this makes it so I can just run through and delete the bad takes, adjust the timings for some of the gap cuts, and be on my way!
I should note I changed the moveToTimecode portion of the code using rlau1115's changes for 23.98p footage.
I also set a noise threshold of -35dB and a minimum silence duration of 800ms, since that seems to offer the best results for my type of speech.
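For context on why 23.98p footage needs special handling: the timeline counts 24 frames per timecode second, but the frames actually play at 24000/1001 fps, so timecode slowly drifts from wall-clock seconds. The sketch below is not rlau1115's change; `secondsToTimecode` is a hypothetical helper that assumes non-drop-frame timecode and a 24000/1001 frame rate.

```javascript
// Hypothetical helper: convert a time in seconds (as reported by ffmpeg) to
// non-drop-frame HH:MM:SS:FF timecode. Defaults assume 23.976 fps footage.
function secondsToTimecode(seconds, fpsNum = 24000, fpsDen = 1001) {
  const nominalFps = Math.round(fpsNum / fpsDen);              // 24 frames per timecode second
  const totalFrames = Math.round((seconds * fpsNum) / fpsDen); // nearest real frame
  const frames = totalFrames % nominalFps;
  const totalSecs = Math.floor(totalFrames / nominalFps);
  const pad = (n) => String(n).padStart(2, "0");
  const h = Math.floor(totalSecs / 3600);
  const m = Math.floor(totalSecs / 60) % 60;
  const s = totalSecs % 60;
  return `${pad(h)}:${pad(m)}:${pad(s)}:${pad(frames)}`;
}

console.log(secondsToTimecode(10.5)); // 00:00:10:12
console.log(secondsToTimecode(3600)); // 00:59:56:10
```

The second call shows the drift: an hour of real time lands about 3.6 seconds short of 01:00:00:00 in timecode, which is enough to put cuts in the wrong place on a long recording.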
Finally, I also adjusted the margins to give the right amount of padding for the flow of my speech:
const startMargin = 0.175;
const endMargin = 0.200;
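As a rough illustration of what those margins do, the sketch below shrinks each detected silence by the margins before cutting, so a little breathing room is left after one phrase and before the next. This is my reading of the padding idea, not a copy of the script's logic; `applyMargins` and the sample ranges are hypothetical.

```javascript
// Margin values from the post; everything else here is an illustrative assumption.
const startMargin = 0.175; // seconds of silence kept after speech stops
const endMargin = 0.200;   // seconds of silence kept before speech resumes

// Hypothetical helper: shrink each silent range by the margins and drop any
// range that becomes empty, leaving only the portions that should be deleted.
function applyMargins(silences, start = startMargin, end = endMargin) {
  return silences
    .map((s) => ({ start: s.start + start, end: s.end - end }))
    .filter((s) => s.end > s.start);
}

// A 2-second silence keeps 0.375 s of padding; a 0.3-second one is left alone.
console.log(applyMargins([
  { start: 4.0, end: 6.0 },
  { start: 9.0, end: 9.3 },
]));
```

Larger margins give a more relaxed pace between sentences; smaller ones tighten the edit but risk clipping the tail of a word.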
Your mileage may vary. I've actually forked the Gist into a separate GitHub repository, final-cut-it-out, since I would like to work on improving it and making it more flexible for different framerates and margins.