Basic Lip Sync Tips

Learn how to assess and correct lip sync using DeepEditor's refine tool.

Basic Lip Sync Guidelines
Assessing Lip Sync
Terminology
How to Fix the Most Common Issues in a Vub

Basic Lip Sync Guidelines

Use a “light touch”

You should never need to add a large number of keyframes to the timeline; DeepEditor should derive the majority of the final vub result automatically.

Heavy movements can look unnatural. Start with a value of 10 or -10. Blend shape adjustments should rarely go beyond 30 in either direction (above 30 or below -30).

Don’t overwork the shot. Too many refinements can make the output media look worse, not better.

Light Moves Faster Than Sound

Starting a movement a frame or two early may look better to the viewer.
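
As a rough illustration of how small this lead is, the sketch below converts a lead of one or two frames into milliseconds. The function name and the frame rates are assumptions chosen for the example, not DeepEditor settings; check the frame rate of your own media.

```python
def lead_time_ms(frames_early: int, fps: float) -> float:
    """Return how many milliseconds `frames_early` frames last at `fps`."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames_early * 1000.0 / fps


# Example frame rates are assumptions, not values taken from DeepEditor.
for fps in (24, 25, 30):
    for frames in (1, 2):
        ms = lead_time_ms(frames, fps)
        print(f"{frames} frame(s) early at {fps} fps = {ms:.1f} ms")
```

At these frame rates, a lead of one or two frames is well under a tenth of a second, so the adjustment stays subtle.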

Mirror Tip

Consider grabbing a handheld mirror or opening your phone camera. Watch your own lips and jaw to see how they move for the specific word you're working on.

Jaw Movement

Another tip is to place your hand under your jaw. As you say the line, you'll be better able to identify when the jaw moves.

Interpolate When Possible 

  • Use the Source media as much as possible by using interpolation.
  • When a character’s mouth movement does not need to be changed from the original performance, bring the interpolation down to 0. 
  • When changing interpolation or blend shape values from 100 to 0 or vice versa, make the change appear as seamless as possible. We recommend that the transition span at least 4 frames.
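
To picture what a transition of at least 4 frames looks like, here is a minimal sketch of the in-between values for an even ramp. It is only an illustration of the arithmetic: the function name is hypothetical, it is not a DeepEditor API, and it assumes both that "4 frames" means four frame steps between the first and last value and that the change is linear.

```python
MIN_SPAN_FRAMES = 4  # minimum transition length recommended above


def ramp_values(start_frame, start_value, end_value, span_frames=MIN_SPAN_FRAMES):
    """Return (frame, value) pairs for an even ramp over `span_frames` frame steps."""
    if span_frames < MIN_SPAN_FRAMES:
        raise ValueError(f"transition should span at least {MIN_SPAN_FRAMES} frames")
    step = (end_value - start_value) / span_frames
    return [
        (start_frame + i, start_value + step * i)
        for i in range(span_frames + 1)
    ]


# Ramp interpolation from 100 down to 0, starting at frame 10.
for frame, value in ramp_values(10, 100, 0):
    print(frame, value)
```

With the default span, the value drops by 25 per frame, reaching 0 at frame 14. A longer span gives a gentler change.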

Assessing Lip Sync

When assessing lip sync, focus on the following questions:

  • Do you believe the character is genuinely delivering the dialogue?
  • Does the output resemble natural speech?
  • Is the output synchronized?
  • Are any mouth movements absent?
  • Are any mouth movements extraneous?
  • Do the lips close and open appropriately with the sounds?
  • Is anything happening that is physically impossible? If so, remove it.

Terminology

Each entry below gives a key term, followed by its definition and an example.

phoneme

the smallest unit of sound perceived as meaningful within a given language

In English, /s/ (as in sip) and /z/ (as in zip) are both phonemes, because swapping one for the other can change the meaning of a word: replace the /s/ in sip with /z/ and it becomes a new word, zip. Other languages may contain both sounds, but if the sounds are interchangeable and never alter the meaning of a word, they are phones (actual sounds) rather than separate phonemes. For instance, Swedish has no /z/ phoneme, so native Swedish speakers often pronounce /z/ as /s/ when speaking English, because their native language makes no meaningful distinction between the two.

viseme

what a given phoneme or set of phonemes looks like

The phonemes /f/ and /v/ differ only in voicing: /f/ is voiceless and /v/ is voiced. Because they are otherwise produced the same way, they look the same on screen, so /f/ and /v/ are often grouped into one viseme category.

articulator

any of the organs or structures involved in speech production, e.g. lips, teeth, tongue, and throat

The articulators involved in the /f/ and /v/ phonemes are the upper teeth and lower lip, which is why linguists classify these two phonemes as “labiodentals”.

fricative

a phoneme produced with turbulent, restricted airflow; the restriction comes from articulators working together to create a narrow passage that air is forced through

/s/, /z/, /f/, and /v/ are just some examples of fricatives. /s/ and /z/ restrict airflow by keeping the teeth close together. /f/ and /v/, on the other hand, use the upper teeth and lower lip to restrict airflow.
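
The viseme entry above can be pictured as a lookup from phonemes to shared mouth shapes. The sketch below is only an illustration: the group labels are hypothetical, only the /f/ and /v/ grouping comes from the text above, and the /p/, /b/, /m/ group is added as another commonly used example. It does not describe how DeepEditor works internally.

```python
# Hypothetical viseme labels mapped to the phonemes that share that mouth shape.
VISEME_GROUPS = {
    "lower lip to upper teeth": {"f", "v"},  # grouping described above
    "lips closed": {"p", "b", "m"},          # assumed extra example group
}

# Invert the groups into a phoneme -> viseme lookup.
PHONEME_TO_VISEME = {
    phoneme: label
    for label, phonemes in VISEME_GROUPS.items()
    for phoneme in phonemes
}


def same_viseme(a: str, b: str) -> bool:
    """Return True when both phonemes are known and share a viseme group."""
    viseme_a = PHONEME_TO_VISEME.get(a)
    viseme_b = PHONEME_TO_VISEME.get(b)
    return viseme_a is not None and viseme_a == viseme_b


print(same_viseme("f", "v"))  # True
print(same_viseme("f", "p"))  # False
```

When two phonemes share a viseme, the same mouth shape can serve both sounds, which is worth remembering before adding a refinement for a sound that already looks correct.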