Removing Ums and Ahs from Podcasts: When, How, and When to Leave Them
TL;DR: Don't remove every um and ah. Speech stripped of every filler sounds robotic and is exhausting to hear. Remove filler words that cluster together, disrupt flow, or become distracting patterns. Leave isolated fillers that give listeners processing time. Automated tools work for heavy cleanup; manual editing gives the most natural results.
Table of Contents
- The Filler Word Dilemma
- When to Remove Filler Words
- When to Keep Filler Words
- Manual Filler Word Removal
- Automated Filler Removal Tools
- Creating Natural-Sounding Results
- FAQ
The Filler Word Dilemma
Filler words feel like the enemy of polished podcasts. The temptation is to eliminate them all.
Here's the thing: completely filler-free speech sounds unnatural, even unsettling. Listeners subconsciously expect occasional verbal pauses. When they're all gone, something feels wrong—even if listeners can't identify what.
Why We Use Filler Words
Filler words serve communication purposes:
Cognitive processing: "Um" signals the speaker is thinking. Listeners' brains use this pause to process what was just said.
Turn-holding: "Uh" indicates the speaker isn't finished. Without it, listeners might interrupt.
Hedging: Fillers soften statements, making speakers sound less aggressive or less absolute.
Natural rhythm: Speech has a cadence. Removing all pauses flattens that rhythm.
The Over-Edited Problem
Aggressive filler removal creates new problems:
Machine-gun speech: Ideas hit listeners without breaks, overwhelming comprehension.
Lost context: Fillers often signal topic transitions or emphasis shifts.
Uncanny valley: The speech pattern feels "off" for no identifiable reason.
Listener fatigue: No breathing room makes extended listening exhausting.
Finding the Balance
The goal: remove distracting fillers while maintaining natural speech patterns.
Remove: Clusters, patterns, and extended ums that disrupt flow.
Keep: Isolated fillers, transition pauses, and thinking moments that aid comprehension.
When to Remove Filler Words
Target filler words that actively harm the listening experience.
Clusters of Fillers
Multiple filler words in sequence disrupt comprehension.
Problem example: "So, um, you know, like, basically, um..."
Why it's distracting: The listener is waiting for content, getting only verbal noise.
Action: Remove the cluster, leaving one natural pause or a single filler if it serves as transition.
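If you work from a transcript, cluster spotting can be roughed out in a few lines. The sketch below is only an illustration: the `FILLERS` vocabulary, the token list, and the function name `find_filler_clusters` are assumptions made for this example, not any editing tool's API. It flags runs of two or more consecutive fillers and ignores isolated ones.

```python
# Hypothetical filler vocabulary; adjust it for the speaker.
FILLERS = {"um", "uh", "so", "like", "basically", "you know"}

def normalize(token):
    """Lowercase a token and strip surrounding punctuation."""
    return token.lower().strip(".,!?;:\"'")

def find_filler_clusters(tokens, min_run=2):
    """Return (start, end) index pairs for runs of consecutive fillers.

    `end` is exclusive. Runs shorter than `min_run` are ignored, so
    isolated fillers are left alone.
    """
    clusters = []
    run_start = None
    for i, token in enumerate(tokens):
        if normalize(token) in FILLERS:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                clusters.append((run_start, i))
            run_start = None
    # Handle a run that reaches the end of the transcript.
    if run_start is not None and len(tokens) - run_start >= min_run:
        clusters.append((run_start, len(tokens)))
    return clusters

tokens = ["So,", "um,", "you know,", "like,", "basically,", "um...",
          "we", "shipped", "it.", "Um,", "it", "worked."]
print(find_filler_clusters(tokens))  # [(0, 6)]
```

The six-filler run at the start is flagged, while the single "Um," later on is not. A match is only a candidate: words such as "so" and "like" are often meaningful, so listen before cutting.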
Recurring Patterns
Speakers often have signature filler patterns that become noticeable over time.
Common patterns:
- "You know" every third sentence
- "Um" at the start of every response
- "Like" before every adjective
- "So" beginning every new thought
Why patterns are distracting: Once listeners notice the pattern, they can't unhear it. It becomes a distracting verbal tic.
Action: Remove enough instances that the pattern breaks. You don't need to eliminate every occurrence—just reduce frequency until it's no longer noticeable.
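One way to think about breaking a pattern is as thinning: keep an occurrence only when enough time has passed since the last one you kept. The sketch below assumes you already have timestamps for each occurrence (for example, from markers or a transcript). The function name `thin_pattern` and the 60-second spacing are illustrative assumptions, not a rule from any tool.

```python
def thin_pattern(timestamps, min_spacing=60.0):
    """Split occurrence times (seconds) into (keep, remove) lists.

    An occurrence is kept only if at least `min_spacing` seconds have
    passed since the last kept one; the rest are marked for removal.
    """
    keep, remove = [], []
    last_kept = None
    for t in sorted(timestamps):
        if last_kept is None or t - last_kept >= min_spacing:
            keep.append(t)
            last_kept = t
        else:
            remove.append(t)
    return keep, remove

# Hypothetical times (in seconds) where the speaker says "you know".
you_know_times = [12.0, 31.5, 48.0, 95.0, 110.2, 171.0]
keep, remove = thin_pattern(you_know_times, min_spacing=60.0)
print(keep)    # [12.0, 95.0, 171.0]
print(remove)  # [31.5, 48.0, 110.2]
```

The spacing is a starting point, not a target. Your ear decides whether the pattern is still noticeable after the cuts.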
Extended Filler Sounds
A brief "um" feels natural. A three-second "uuuuuum" draws attention.
Why long fillers are distracting: They break the rhythm and suggest the speaker is lost or unprepared.
Action: Either remove entirely or shorten to typical filler length (under one second).
Fillers Before Important Points
Filler words just before key information weaken impact.
Problem example: "The most important thing is, um, uh, you know, commitment."
Why it's problematic: The sentence builds anticipation, then deflates it with filler before the payoff.
Action: Tighten so the key point lands with impact.
Failed Sentence Starts
Speakers sometimes start sentences, realize they're going wrong, and restart.
Problem example: "I think the, um, the thing is that, well actually, what I mean is..."
Why it's distracting: Listeners have to untangle what the speaker is actually saying.
Action: Remove false starts, keep the successful version.
When to Keep Filler Words
Some fillers serve the communication and should stay.
Isolated Fillers Between Ideas
A single "um" between thoughts gives listeners processing time.
Example: "We launched the feature in March. Um, the response was immediate."
Why to keep it: The pause separates two distinct thoughts. Removing it would create jarring immediacy.
Fillers Indicating Thought
When speakers pause to think, the filler signals that process.
Example: "What's the most important lesson? Um... probably persistence."
Why to keep it: The "um" and pause show genuine reflection, making the answer feel more authentic than an instant response.
Fillers in Emotional Moments
Hesitation during emotional content signals authenticity.
Example: "When she told me she was sick, I, um, I didn't know what to say."
Why to keep it: The filler conveys emotional weight. Removing it makes the delivery feel rehearsed.
Conversational Turn Signals
Fillers that hold the floor during conversation belong to natural dialogue.
Example during crosstalk: "Well, uh, that's not exactly—" [other speaker interjects]
Why to keep it: The filler is part of the conversational dance. Removing it creates unnatural precision.
Guest Comfort Indicators
Especially with nervous guests, a gradual drop in fillers signals that they are relaxing.
Pattern: More fillers early in interview, fewer as guest warms up.
Why to keep some: A completely filler-free guest from the start sounds like they're reading a script. Natural improvement through the conversation is authentic.
Manual Filler Word Removal
Manual editing gives the most control and natural results.
Basic Technique
- Identify the filler in the waveform (short pause, low-energy sound)
- Select the filler including surrounding silence
- Delete and close the gap (ripple delete)
- Listen to the result in context
- Undo if it sounds unnatural
Selecting Cleanly
Poor selections create audible problems.
Include in your selection:
- The filler sound itself
- Breath before the filler (if present)
- Part of the silence after the filler
Never cut into adjacent words.
Finding selection points:
- Look for zero crossings (where waveform crosses the center line)
- Find natural silence gaps
- Zoom in far enough to see individual waveform cycles
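For readers who process audio in code rather than in a DAW, here is a minimal sketch of snapping a rough cut point to the nearest zero crossing. It assumes the audio is a plain list of float samples; the function name `nearest_zero_crossing` is hypothetical.

```python
def nearest_zero_crossing(samples, index):
    """Return the index of the sample just before the zero crossing
    closest to `index`.

    A crossing is any place where the sign changes between one sample
    and the next (zero counts as non-negative). Returns `index`
    unchanged if the audio never crosses zero.
    """
    crossings = [
        i for i in range(len(samples) - 1)
        if (samples[i] < 0.0) != (samples[i + 1] < 0.0)
    ]
    if not crossings:
        return index
    return min(crossings, key=lambda i: abs(i - index))

# Tiny hand-written waveform with sign changes after indices 2, 5, and 8.
samples = [0.5, 0.8, 0.3, -0.2, -0.7, -0.4, 0.1, 0.6, 0.2, -0.3]
print(nearest_zero_crossing(samples, 6))  # 5
print(nearest_zero_crossing(samples, 7))  # 8
```

Cutting where the waveform sits near zero avoids the sudden jump in level that listeners hear as a click.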
Preserving Natural Gaps
Don't eliminate all space where fillers were removed.
The mistake: Selecting filler plus all surrounding silence, leaving no gap.
The result: Words run together unnaturally.
Better approach: Leave 0.3-0.5 seconds of silence where the filler was. Speech needs breathing room.
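The arithmetic behind that gap can be sketched as follows. This is an illustration under assumptions: you know where the previous word ends, where the filler starts and ends, and where the next word starts (all in seconds). The function name `plan_cut` and the 0.4-second default, the midpoint of the range above, are choices made for this example.

```python
def plan_cut(prev_word_end, filler_start, filler_end, next_word_start,
             target_gap=0.4):
    """Return the (cut_start, cut_end) range to delete, in seconds.

    The cut always covers the whole filler. Up to half of `target_gap`
    is kept as silence before the cut and the remainder after it. If
    the recording has less silence than the target, all of it is kept.
    """
    lead = filler_start - prev_word_end    # silence before the filler
    tail = next_word_start - filler_end    # silence after the filler
    keep_lead = min(lead, target_gap / 2)
    keep_tail = min(tail, target_gap - keep_lead)
    return (prev_word_end + keep_lead, next_word_start - keep_tail)

# Previous word ends at 10.0 s, filler runs 10.25-10.75 s, next word at 11.5 s.
cut_start, cut_end = plan_cut(10.0, 10.25, 10.75, 11.5, target_gap=0.5)
print(cut_start, cut_end)  # 10.25 11.25
```

Here the cut removes one second of audio and leaves 0.25 seconds of silence on each side, for a 0.5-second gap in total.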
Workflow for Efficient Manual Editing
Don't hunt for every filler. Instead:
- Play through at normal speed
- Note (or mark) only the distracting fillers
- Return and edit only those
- Leave unobtrusive fillers alone
This is faster than stopping for every "um" and produces more natural results.
Using Markers
Most DAWs support markers—use them to tag fillers for batch editing.
Marker workflow:
- Play through episode once, adding markers at problem fillers
- Return to markers and edit each
- Clear markers as you complete edits
Separating identification (listening) from editing (technical work) makes both passes more efficient. This systematic approach fits well within a broader editing workflow.
Automated Filler Removal Tools
Software can identify and remove filler words automatically.
How Automated Tools Work
Tools analyze audio for:
- Specific phonetic patterns ("um," "uh," "like")
- Low-energy sections matching filler characteristics
- Silence patterns typical of filler word placement
They then remove or silence the identified sections.
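To make the "low-energy sections" idea concrete, here is a deliberately simplified sketch. It is not how any specific product works; commercial tools typically combine speech recognition with signals like this one. The function names and both threshold values are assumptions for illustration.

```python
import math

def frame_rms(samples, frame_size):
    """Root-mean-square level of each non-overlapping frame."""
    return [
        math.sqrt(sum(s * s for s in samples[i:i + frame_size]) / frame_size)
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]

def low_energy_frames(samples, frame_size, silence_level, speech_level):
    """Indices of frames louder than silence but quieter than speech."""
    return [
        i for i, level in enumerate(frame_rms(samples, frame_size))
        if silence_level < level < speech_level
    ]

# Synthetic signal: loud "speech", a quiet "filler", then silence.
samples = [0.5, -0.5] * 4 + [0.1, -0.1] * 4 + [0.0] * 8
candidates = low_energy_frames(samples, frame_size=8,
                               silence_level=0.01, speech_level=0.3)
print(candidates)  # [1]
```

Energy alone cannot tell an "um" from a softly spoken word, which is one reason automated results still need a listening pass.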
Popular Tools and Options
Descript: Transcription-based editing with filler word identification and one-click removal.
Adobe Podcast: Browser-based tool best known for "Enhance Speech" noise and echo cleanup; confirm filler removal support in the current version before relying on it.
Auphonic: Automatic post-production including filler reduction.
CapCut: Video editor with audio filler word removal.
DAW plugins: Various plugins detect and remove or reduce fillers in traditional DAWs.
Setting Automated Tool Sensitivity
Most tools have sensitivity or aggressiveness settings.
High sensitivity: Catches more fillers but may remove non-filler content.
Low sensitivity: Misses some fillers but produces fewer false positives.
Recommendation: Start with medium-low sensitivity. You can always remove remaining fillers manually—it's harder to restore content that was incorrectly removed.
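The trade-off is easier to see with numbers. The sketch below assumes a detector that attaches a confidence score to each suspected filler, and treats "sensitivity" as a score threshold. Both the data and that mapping are hypothetical; real tools expose this setting in their own ways.

```python
# Hypothetical detector output: (word, confidence that it is a filler).
detections = [
    ("um", 0.97),
    ("uh", 0.91),
    ("like", 0.62),   # a genuine filler
    ("like", 0.55),   # actually meaningful: "I like this approach"
    ("so", 0.40),     # meaningful connective
]

def select_for_removal(detections, threshold):
    """Return the words scored at or above `threshold`.

    A lower threshold corresponds to higher sensitivity.
    """
    return [word for word, score in detections if score >= threshold]

conservative = select_for_removal(detections, threshold=0.9)
aggressive = select_for_removal(detections, threshold=0.5)
print(conservative)  # ['um', 'uh']
print(aggressive)    # ['um', 'uh', 'like', 'like']
```

The conservative setting misses one real filler, which is easy to fix by hand. The aggressive setting removes a meaningful "like", which is harder to notice and restore.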
Pros and Cons of Automation
Advantages:
- Much faster than manual editing
- Consistent application across episodes
- Good for heavy filler use
- Catches fillers you might miss
Disadvantages:
- May remove legitimate words
- Can create unnatural gaps or transitions
- Less control over which fillers stay
- May cut meaningful uses of words that double as fillers, such as "like"
Best Use of Automation
Use automated tools for initial cleanup, then listen through and:
- Remove remaining obvious fillers manually
- Restore any incorrectly removed content
- Adjust gaps for natural flow
Automation as first pass + manual refinement = efficient natural results.
Creating Natural-Sounding Results
Technical removal is only half the challenge. Results need to sound human.
Listening in Context
Always evaluate edits in context, not isolation.
Test method:
- Start playback 5-10 seconds before the edit
- Play through the edited section
- Continue 5-10 seconds after
Edits that sound fine in isolation sometimes create flow problems in context.
Maintaining Rhythm
Speech has natural cadence. Edits should preserve it.
Signs rhythm is broken:
- Words feel rushed together
- Pauses feel artificially uniform
- Energy doesn't match surrounding speech
Fix: Adjust gap length after edits. Different thought transitions need different pause lengths.
Matching Energy
Filler words often correlate with energy changes.
Problem: Removing a filler that bridged two energy levels creates an abrupt shift.
Solution: If removing a filler creates a jarring energy jump, either:
- Keep the filler
- Add a short crossfade or silence to smooth the transition
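Your DAW applies crossfades for you, but the idea is simple enough to sketch. The example below assumes audio held as plain lists of float samples and uses a linear fade; the function name `crossfade_join` is hypothetical, and real editors often offer other fade curves.

```python
def crossfade_join(before, after, fade_len):
    """Join two lists of samples with a linear crossfade.

    The last `fade_len` samples of `before` fade out while the first
    `fade_len` samples of `after` fade in, and the two are summed.
    The result is `fade_len` samples shorter than a plain join.
    """
    fade_len = min(fade_len, len(before), len(after))
    if fade_len == 0:
        return before + after
    head = before[:len(before) - fade_len]
    tail = after[fade_len:]
    mixed = []
    for i in range(fade_len):
        gain_in = (i + 1) / (fade_len + 1)
        out_sample = before[len(before) - fade_len + i]
        in_sample = after[i]
        mixed.append(out_sample * (1.0 - gain_in) + in_sample * gain_in)
    return head + mixed + tail

# A loud segment meeting a silent one: the level steps down gradually.
joined = crossfade_join([1.0] * 6, [0.0] * 6, fade_len=3)
print(joined)  # [1.0, 1.0, 1.0, 0.75, 0.5, 0.25, 0.0, 0.0, 0.0]
```

Instead of dropping from full level to silence in one sample, the join ramps down over three. In real audio the fade would span a few milliseconds, not three samples.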
The Less-Is-More Principle
When in doubt, remove fewer fillers.
Conservative approach:
- Remove clusters and obvious problems
- Leave isolated fillers
- Keep anything that sounds worse when removed
Listeners accept occasional fillers much more readily than they accept obviously over-edited audio.
FAQ
How many fillers should I remove per minute of audio?
There's no specific number. Judge by how distracting the fillers are, not by how many there are. A speaker whose few "ums" always land in the same spot might need some of them removed to break the pattern. Another speaker using ten isolated fillers might not need any removed. Focus on removing distractions, not hitting a removal quota.
Should I remove "you know" and "like" the same way as "um"?
Mostly. Apply the same principles: remove clusters and patterns; keep isolated instances that aid natural flow. But "you know" and "like" often serve conversational purposes (requesting agreement, softening statements) that pure filler sounds like "um" don't, so be even more conservative with phrases that carry some meaning.
Will listeners notice if I don't remove filler words?
Most listeners don't consciously notice moderate filler use unless you've trained them to expect polished speech. Podcast audiences are generally more tolerant of natural speech patterns than, say, audiobook listeners. Only chronic heavy filler use becomes noticeably distracting to typical listeners.
How do I remove fillers without making edits audible?
Cut at zero crossings, maintain appropriate gap lengths, use short crossfades when needed, and always check edits in context. Most audible edits come from cutting into words, removing too much silence, or creating jarring energy transitions. If an edit sounds obvious, undo and try a different approach.
Should I mention filler words to guests before recording?
Generally, no. Making guests self-conscious about fillers usually increases filler use as they monitor their own speech. If a guest uses fillers very heavily, you might gently mention that you'll clean up the audio, which can help them relax knowing the recording will be edited.
Ready to Polish Your Podcast Speech?
Filler word editing walks a line between polished and natural. Remove what distracts; keep what serves comprehension. The goal is speech that sounds human and unhurried, not robotic perfection.
Your thoughtfully edited episodes deserve permanent preservation. Transcription transforms natural-sounding audio into searchable archives where every carefully preserved word becomes findable.
Try PodRewind free and make every well-edited episode part of a searchable archive.