
Lip Sync Theory (Advanced)

Master the science of lip sync to evaluate your vub and make precise, detailed adjustments.

Visemes & Phonemes Definitions
Consonants Overview

Consonants to Prioritize

Visemes & Phonemes Definitions

  • Phoneme: The smallest unit of sound in a given language that affects meaning within a word. It cannot be analyzed into smaller linear units, and it can distinguish one word from another.

  • Viseme: A “visual phoneme,” that is, a group of phonemes that are visually indistinct from one another because they look the same (e.g. same lip position, jaw position, etc.) when produced. In the Consonants and Vowels section, you will find various phonemes listed together as a single viseme group. These groupings are based on the phonemes’ visual similarities.
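The many-to-one relationship between phonemes and visemes can be sketched as a simple lookup table. This is only an illustration, not a Flawless tool: the names `VISEME_GROUPS`, `viseme_for`, and `same_viseme` are hypothetical, the group labels are the consonant groups used in this article, and the IPA symbols chosen for each group are an assumption.

```python
from typing import Optional

# Hypothetical table: each viseme group (labelled as in this article)
# lists the phonemes that look alike when produced.
VISEME_GROUPS = {
    "m/b/p": ["m", "b", "p"],
    "w/r/oo": ["w", "r", "uː"],  # /r/ only counts when syllable-initial
    "f/v": ["f", "v"],
    "s/z": ["s", "z"],
    "ch/sh/dge/zh": ["tʃ", "ʃ", "dʒ", "ʒ"],
    "th": ["θ", "ð"],
    "n/l": ["n", "l"],
    "t/d": ["t", "d"],
    "k/g": ["k", "g"],
    "h": ["h"],
}

# Invert the table so each phoneme points at its viseme group.
PHONEME_TO_VISEME = {
    phoneme: group
    for group, phonemes in VISEME_GROUPS.items()
    for phoneme in phonemes
}


def viseme_for(phoneme: str) -> Optional[str]:
    """Return the viseme group for a phoneme, or None if it is not listed."""
    return PHONEME_TO_VISEME.get(phoneme)


def same_viseme(a: str, b: str) -> bool:
    """True when two listed phonemes fall in the same viseme group."""
    group = viseme_for(a)
    return group is not None and group == viseme_for(b)
```

For example, `same_viseme("m", "p")` is true because both belong to m/b/p, while `same_viseme("m", "f")` is false.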

Consonants Overview

A consonant is a speech sound produced by some form of airflow restriction. This restriction can be created by the lips, teeth, and/or tongue. M, b, and p, for example, require the lips to meet (a form of airflow restriction) in order to be produced. These sounds are referred to as bilabials, more specifically, full closure bilabials. F and v, on the other hand, require lip-to-tooth contact (another form of airflow restriction) in order to produce enough friction to create the target sound. We will cover the remaining forms of restriction as we go through each group.

Below is a diagram showing which articulators (vocal organs like the teeth, tongue, lips, throat, etc.) each consonant viseme is primarily driven by. 

[Diagram: consonant visemes grouped by primary articulator (lips, teeth, tongue)]

 

Consonants to Prioritize

  • priority visemes:  viseme groups made of phonemes that have reliable properties to assess.

  • stop: a speech sound created by completely blocking the flow of air and then releasing it.

  • fricative: a speech sound created by air escaping from the mouth through a narrow passageway.

  • affricate:  a sound that starts as a stop and then releases as a fricative.

  • bilabial: a speech sound that requires both lips. In linguistics, the term covers both w and m/b/p; at Flawless, however, when we refer to bilabials we are exclusively referring to m/b/p.

Priority visemes refer to viseme groups made of phonemes that have more robust properties. These viseme groups are less subject to context-based variation, making them more reliable to assess.

Priority Group: Level 1

m/b/p

  • This viseme group consists of phonemes that require the upper and lower lips to be fully closed.

  • We typically refer to these phonemes as bilabials. Bilabials are sounds that require both lips, which must come together to create a full or near closure. m/b/p requires a full closure, whereas w is an example of a bilabial that requires only partial closure.

w/r/oo

  • This viseme group consists of phonemes that require the upper and lower lips to be rounded and nearly closed.

  • Here at Flawless, “bilabial” almost exclusively means m/b/p. Technically, however, w’s are bilabials as well; they are simply bilabials with partial closure rather than full closure.

  • Though /r/ and oo (IPA symbol /uː/) are included in this group as well, neither is considered a bilabial, despite having lip configurations indistinguishable from that of /w/. Because oo is a vowel, it is instead classified as a tense, rounded vowel.

  • It is important to note that in this viseme group, we are only interested in the /r/ sound when it occurs at the beginning of a word or syllable. When /r/ occurs at the beginning of a word or syllable, it takes on a strong rounded shape. For example, in the word “red,” the /r/ position is strongly rounded; however, when /r/ occurs at the end of a word or syllable, as it does in the word “father,” it occurs in a much looser form and is significantly more subject to context-based variation. 

  • As noted, this group contains a vowel, oo (formal IPA symbol /uː/). It is included with the w’s and r’s because its outward appearance is indistinguishable from theirs.
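The position rule for /r/ can be sketched as a small check. This is a minimal illustration under simplifying assumptions: a syllable is represented as a list of phoneme symbols, “beginning of a word or syllable” is read as index 0, and the function name `in_w_r_oo_group` is hypothetical.

```python
from typing import Sequence


def in_w_r_oo_group(syllable: Sequence[str], index: int) -> bool:
    """Decide whether the phoneme at `index` counts toward the w/r/oo viseme.

    Assumption: /w/ and /uː/ always count, while /r/ counts only when it
    is the first phoneme of the syllable (the strongly rounded form).
    """
    symbol = syllable[index]
    if symbol in ("w", "uː"):
        return True
    if symbol == "r":
        return index == 0
    return False


# "red": /r/ opens the syllable, so it is assessed as w/r/oo.
red = ["r", "ɛ", "d"]
# Second syllable of "father": /r/ closes the syllable, so it is skipped.
ther = ["ð", "ə", "r"]
```

Here `in_w_r_oo_group(red, 0)` is true and `in_w_r_oo_group(ther, 2)` is false, matching the “red” versus “father” contrast above.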

f/v

  • This viseme group consists of phonemes that require the lower lip to interact with the teeth to create restricted airflow. 

  • /f/ and /v/ are both labiodental, fricative sounds. Labiodental refers to sounds created via lip-to-tooth interaction, and fricative refers to speech sounds created by air escaping from the mouth through a narrow passageway.

  • Though it may be possible to produce a similar sound in reverse, i.e. with the upper lip and lower teeth, you should not expect to ever see such a variation.

NOTE 1: The “fully closed” nature of m/b/p is not as absolute as it may seem. There are many cases in natural speech when we can produce an intelligible m/b/p without fully closing the lips. These almost-closed configurations can occur in a variety of circumstances but are most notable when:

  • someone is smiling while speaking (because the lips are more separated, making it take more energy to fully close the lips)

  • a /p/ is followed immediately by an /f/ - e.g. in words like “helpful,” “hopeful,” “stepfather,” etc. (because the consonants blend together to create a hybrid between /p/ and /f/ to maximize efficiency)

  • someone is slurring their speech or not enunciating (Happens more than you might think!)

Priority Group: Level 2

s/z

  • Like f/v, the s/z viseme group is made up of fricatives. s and z belong to a subset of fricatives known as sibilants. Sibilants are created by forcing air through a narrow channel while also curling the tongue to direct air over the edge of the teeth.

  • Though s/z does not have a distinct lip shape or position (beyond the requirement that the lips not be closed), its fricative nature makes it semi-distinctive: it requires the upper and lower teeth to remain in close proximity.

ch/sh/dge/zh, designated in IPA as tʃ/ʃ/dʒ/ʒ

  • This viseme group consists of a mixture of affricates and fricatives: sh and zh are fricatives, whereas ch and dge are affricates. An affricate is a sound that starts as a stop and then releases as a fricative; this is why the IPA designations for ch and dge each have two symbols: tʃ for ch and dʒ for dge. In the former, the t represents the t-like stop; in the latter, the d represents the d-like stop. For example, when we say the word “reach,” during the “ch” part we make a t-like stop and then transition into the sh-style fricative. In the word “judge,” during the “dge” part we start with a d-like stop and end in a zh-style fricative.

  • fricatives:

    • sh / ʃ

    • zh / ʒ

  • affricates: 

    • ch / tʃ

    • dge / dʒ
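The stop-plus-fricative structure of the affricates in this group can be written out as data. The sketch below is illustrative only; the names `AFFRICATES`, `FRICATIVES`, `manner`, and `components` are hypothetical, and it covers only the four phonemes in this viseme group.

```python
# Each affricate is a stop that releases into a fricative,
# which is why its IPA designation has two symbols.
AFFRICATES = {
    "tʃ": ("t", "ʃ"),  # ch: t-like stop, then sh-style fricative
    "dʒ": ("d", "ʒ"),  # dge: d-like stop, then zh-style fricative
}
FRICATIVES = {"ʃ", "ʒ"}  # sh and zh


def manner(phoneme: str) -> str:
    """Classify a phoneme from this viseme group as affricate or fricative."""
    if phoneme in AFFRICATES:
        return "affricate"
    if phoneme in FRICATIVES:
        return "fricative"
    raise ValueError(f"{phoneme!r} is not in the ch/sh/dge/zh group")


def components(phoneme: str) -> tuple:
    """Split an affricate into (stop, fricative); other phonemes stay whole."""
    return AFFRICATES.get(phoneme, (phoneme,))
```

For the “ch” in “reach,” `components("tʃ")` gives the t-like stop followed by the sh-style fricative, while a plain fricative such as ʃ is returned unchanged as a one-element tuple.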

th

  • This viseme group consists of two different th sounds: voiced th (ð) and unvoiced th (θ). Both th’s are referred to as interdental fricatives; interdental fricatives are consonants that are created by placing the tongue between the teeth.

  • Because fricatives require narrow air passageways, producing the th group requires the teeth to remain relatively close to each other and to the tongue. The more space there is between the teeth, the more the tongue needs to compensate for the larger opening.

n/l

  • This viseme group consists of sounds that require the tongue to be positioned upward, pressing against the area behind the front teeth. The tongue positions for n and l are similar, and any difference between them is not visually distinguishable because the teeth occlude the tongue. Their main differences in sound come from the direction of airflow: for n, the tongue blocks the air so it exits through the nose, and for l, the air goes around the side of the tongue.

t/d

  • This viseme group consists of two stop consonants that share the same lip, tongue, and jaw positions but differ in that /t/ is voiceless and /d/ is voiced.

  • A stop consonant is a speech sound created by completely blocking the flow of air and then releasing it.

  • Compared to visemes in Priority Groups 1 and 2, t/d does not possess any highly distinguishing properties. This group does not offer visual contrast from other speech sounds and only requires the lips and jaw to be at least slightly parted.

  • In most cases, t/d is undetectable (not able to be discerned from visual inspection). Only in cases when the lips and jaw are open enough to clearly see the tongue can t and d be distinguished from other phonemes. When these conditions are present, the tongue must lift up to tap the area behind the top teeth.

k/g

  • Even harder to detect than t/d, this viseme group consists of phonemes that experienced lip readers call invisible. Though k/g and t/d both only require the lips and jaw to be at least slightly open, k/g does not have a visible tongue position. Instead, its defining feature is a stop in the back of the throat, which is not observable under general conditions.

h

  • Almost as imperceptible as k/g is the h viseme. h does not require any particular lip position, but it does require moderate jaw opening. To produce an h sound, air must pass through a constriction between the tongue and the roof of the mouth, or at the back of the throat. h is unvoiced.
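One way to use the priority levels when reviewing a shot is to sort the viseme groups so the most reliable ones are checked first. The sketch below is a hypothetical helper, not a Flawless tool. It assumes the Level 1 and Level 2 memberships listed above, and because this article does not assign a level to t/d, k/g, or h, it leaves them unranked and sorts them last (an assumption).

```python
from typing import List, Optional

# Levels as listed in this article; t/d, k/g and h are intentionally
# absent because the article gives them no explicit level.
PRIORITY_LEVELS = {
    "m/b/p": 1,
    "w/r/oo": 1,
    "f/v": 1,
    "s/z": 2,
    "ch/sh/dge/zh": 2,
    "th": 2,
    "n/l": 2,
}


def priority_of(group: str) -> Optional[int]:
    """Return 1 or 2 for a priority group, or None if it is unranked."""
    return PRIORITY_LEVELS.get(group)


def review_order(groups: List[str]) -> List[str]:
    """Sort groups by priority level, unranked groups last.

    The sort is stable, so groups with the same level keep their
    original relative order.
    """
    def key(group: str):
        level = priority_of(group)
        return (level is None, level or 0)

    return sorted(groups, key=key)
```

Given `["h", "s/z", "m/b/p", "t/d", "f/v"]`, the Level 1 groups come first, then s/z, then the unranked h and t/d in their original order.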