Playing with Praat, part 1: Cardinal vowels and vowel space

Phonetics is great fun. In this article series, I will share some self-experimentation in Chinese phonetics that I simply think is too nerdy to share on Hacking Chinese (perhaps I will find some way of publishing something about this there later, but this is the unedited director’s cut). This is the first of several articles in which I discuss Chinese phonetics and some related experiments I’ve done with my own pronunciation. Before we get to the actual Chinese, we need some basic knowledge of phonetics, so I will talk mostly about vowels in general first and start talking about Chinese vowels next time.

About this article

A word of warning: don’t expect this to be helpful or useful, but do expect it to be interesting. I do think that a deep understanding of phonetics can really help with learning to pronounce a foreign language, but it’s certainly not the most efficient method of learning. If you’re interested in what I spend my spare time doing at the moment, read on. If you want quick fixes to your own pronunciation, go mimic a native speaker instead or read articles about pronunciation on Hacking Chinese.

If you find anything wrong or dubious in this article, just leave a comment. I don’t pretend that I actually know these things well, so there might be errors here and there. Since part of the goal is to learn more about phonetics, pointing out a mistake I’ve made is equivalent to doing me a big favour!

This article is going to contain some jargon and it will require you to already understand some basic theory. Rather than spending hours explaining these things, I will simply link to Wikipedia articles whenever necessary. I will try to make the narrative understandable even if you haven’t taken several courses in phonetics, but I might be a bit blind to what uninitiated people find hard.

If you want to, you can do everything I’ve done here yourself. You just need a microphone and Praat, which is a program developed for speech analysis and is free of charge. I’m not going to go into any details about how to use Praat now, but it’s fairly easy to find tutorials online.

Vowels and vowel space

Basically, vowels can be defined in a two-dimensional space determined by how the tongue separates the oral cavity into two compartments, which results in a signal with different formant frequencies. This means that if you look at the spectrogram of a vowel, you can actually see these formant frequencies and thereby roughly determine the place of articulation of this vowel. The picture below is from my pronunciation of [i]. The lower line of red dots represents the first formant frequency, F1, and the second line of red dots represents the second formant frequency, F2.

[i]

The value of F1 is related to the openness of the vowel, i.e. how much you open your mouth and lower your tongue when pronouncing it. Try pronouncing “bin” and “ban” in English and you should feel a big difference in openness. A low F1 means that the vowel is closed, so the [i] above is a closed vowel because F1 is very low. The opposite would be [a], which is an open vowel and has a relatively high F1. See the spectrogram below for [a].

[a]

The value of F2 is related to the back-front aspect of the vowel, i.e. how far forward or backward your tongue is positioned. Try pronouncing “beat” and then “boot” in English and you will feel the difference between a front vowel in “beat” and a back vowel in “boot”. F2 decreases as the tongue retracts, so the [i] in “beat” has a very high F2, whereas the [u] in “boot” has a lower one (although not as low as the cardinal [u] described below). Compare the formant frequencies of [a] above and [u] below. Note that F1 and F2 overlap in this diagram; the formant at around 2100 Hz is F3, not F2.

[u]

Cardinal vowels and my personal vowel space

What the above means is that there is a range of possible vowels and that vowel quality can be defined in terms of the location in this space. In phonetics, there are eight cardinal vowels that occupy the corners and edges of this space, and they can be represented in what’s called a vowel chart. You can check the IPA vowel chart on Wikipedia, which also has audio recordings, or York University’s site, which also contains a neat chart with audio. Four of the cardinal vowels are front and four are back, and each set covers different degrees of openness.

(Actually, there is a third dimension I have mostly ignored and will continue to ignore, and that is lip rounding. As you can see in the Wikipedia article above, there is a second set of cardinal vowels that matches the first eight, but with the opposite lip rounding. This is too complicated for this article and I will ignore anything beyond the basics for now.)

One problem with these charts is that they are schematic rather than accurate representations of the oral cavity, the produced sound or the perceived sound. For instance, since the shape and size of the oral cavity and other resonance cavities vary between individuals, you can’t just compare someone’s formant frequencies for one vowel with those of someone else and conclude that A’s vowels are farther back than B’s.

One way of approaching the issue is to draw your own vowel space and see what the cardinal vowels look like when you pronounce them. This is very simple to do in theory:

  1. Record the eight cardinal vowels
  2. Measure F1 and F2 for these vowels
  3. Plot them on a formant diagram (F1 against F2)

None of the steps is as easy as it looks, but more about that in a moment. I’ll show you my results first. This diagram shows F1 plotted against F2. Note that actual frequency is not the same as perceived frequency, so the scales aren’t linear.

cardinal vowel chart olle

These are the eight cardinal vowels and their F1 and F2 frequencies. Here are the relevant numbers:

Vowel F1/F2 (Hz)
[i] 253/2309
[e] 335/2094
[ɛ] 461/1702
[a] 636/1404
[ɑ] 580/1007
[ɔ] 424/672
[o] 361/609
[u] 245/446
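Since the chart uses a perceptual rather than a linear frequency scale, it can be useful to see what the numbers look like after such a conversion. I haven’t stated which scale the chart is drawn on, so the sketch below simply assumes the mel scale with the commonly used formula mel = 2595 · log10(1 + f/700). The dictionary name and function name are made up for this illustration, and the first row is treated as cardinal vowel number one, [i].

```python
import math

# F1/F2 in Hz from the table of cardinal vowels
# (the first row is cardinal vowel number one, [i]).
CARDINAL_VOWELS = {
    "i": (253, 2309),
    "e": (335, 2094),
    "ɛ": (461, 1702),
    "a": (636, 1404),
    "ɑ": (580, 1007),
    "ɔ": (424, 672),
    "o": (361, 609),
    "u": (245, 446),
}


def hz_to_mel(freq_hz: float) -> float:
    """Convert Hz to mel. Assumption: the mel scale is only one of several
    perceptual scales and may not be the one used in the chart."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


for vowel, (f1, f2) in CARDINAL_VOWELS.items():
    print(f"[{vowel}] F1 = {hz_to_mel(f1):5.0f} mel, F2 = {hz_to_mel(f2):5.0f} mel")
```

The conversion compresses high frequencies more than low ones, which is why equal steps in Hz do not look equally large on a perceptual chart.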

You can also plot the frequencies of F1 and F2 for each vowel. In my case, this gives the graph below, which is fairly close to what it’s supposed to look like. Remember, the order of the cardinal vowels is from closed front via open front and open back to closed back. Thus, we expect F1 values to first increase and then decrease. We also expect F2 values to fall throughout the cardinal vowel sequence. This is also what we find.

cardinal vowel formant graph
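The two expectations (F1 rises and then falls, F2 falls all the way) can also be checked directly on the numbers instead of by eye. This is just a small sketch using the values from the table, listed in cardinal vowel order. The function names are my own invention for the example.

```python
# F1 and F2 in Hz, in cardinal vowel order (closed front to closed back).
F1 = [253, 335, 461, 636, 580, 424, 361, 245]
F2 = [2309, 2094, 1702, 1404, 1007, 672, 609, 446]


def rises_then_falls(values):
    """True if values strictly increase to one interior peak, then strictly decrease."""
    peak = values.index(max(values))
    rising = all(a < b for a, b in zip(values[:peak], values[1:peak + 1]))
    falling = all(a > b for a, b in zip(values[peak:], values[peak + 1:]))
    return 0 < peak < len(values) - 1 and rising and falling


def strictly_falling(values):
    """True if every value is lower than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


print("F1 rises then falls:", rises_then_falls(F1))
print("F2 falls throughout:", strictly_falling(F2))
```

With the numbers above, both checks come out true, with the F1 peak at the fourth vowel, the open front one.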

I don’t think much can be said about this, even though my own rendering of the cardinal vowels isn’t perfect. It would be interesting to see what the model talker on York University’s site would look like plotted in a similar way to what I have done above. Still, I think the blue polygon in the first graph shows pretty well the limits of my articulation. I have tried to produce even more extreme vowels in each direction without succeeding. Brief checks show that my vowels in actual languages (Swedish, English, Chinese) fall within this range, but more about this later (especially Chinese, of course).

I want to be as cool as you, what should I do?

As promised, I won’t go into details on how to use Praat, but I will describe the general process briefly based on the three steps above. The first thing you need to do is record the cardinal vowels. This can be quite hard if you have no experience with trying to pronounce sounds other than those in your native language. Note that even though the same letters might be used in your alphabet, the cardinal vowels typically don’t match the vowels of your native language. For instance, “i” in English can represent two sounds, /i/ and /ɪ/, but neither of them is as closed and fronted as the cardinal vowel [i]. Therefore, some practice is required. Start by mimicking the audio charts I linked to above.

Second, you need to measure the frequencies of F1 and F2 in Praat. You’ll have to figure out how to install and use the program on your own, but I’ll give some suggestions for measuring F1 and F2 for the vowels. The main problem is where to measure, and there are several ways of doing this. The key is to be consistent. You can either choose the time where the intensity is the highest or the time where the vowel looks the most stable (i.e. F1 and F2 aren’t fluctuating). I don’t think it matters much which method you choose in this case, but I usually go with the highest intensity since that’s much more objective than the idea of stability.
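To make the two ways of choosing a measurement point a bit more concrete, here is a rough sketch. It assumes you have already read off time, intensity, F1 and F2 at a handful of points in Praat and typed them in by hand. The frame values below are made up for illustration and are not from my recordings, and the "stability" measure (summed change in F1 and F2 relative to the neighbouring frames) is only one possible interpretation of what "most stable" means.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    time_s: float
    intensity_db: float
    f1_hz: float
    f2_hz: float


# Hypothetical frames for a single vowel, invented for this example.
FRAMES = [
    Frame(0.05, 58.0, 310.0, 2150.0),
    Frame(0.10, 66.0, 270.0, 2280.0),
    Frame(0.15, 71.0, 255.0, 2305.0),
    Frame(0.20, 69.0, 252.0, 2310.0),
    Frame(0.25, 60.0, 290.0, 2200.0),
]


def loudest_frame(frames):
    """Method 1: the frame where the intensity is highest."""
    return max(frames, key=lambda frame: frame.intensity_db)


def most_stable_frame(frames):
    """Method 2: the interior frame whose F1 and F2 differ least from both
    neighbours. Needs at least three frames."""
    def wobble(i):
        prev, cur, nxt = frames[i - 1], frames[i], frames[i + 1]
        return (abs(cur.f1_hz - prev.f1_hz) + abs(cur.f1_hz - nxt.f1_hz)
                + abs(cur.f2_hz - prev.f2_hz) + abs(cur.f2_hz - nxt.f2_hz))

    best = min(range(1, len(frames) - 1), key=wobble)
    return frames[best]


print("Loudest:    ", loudest_frame(FRAMES))
print("Most stable:", most_stable_frame(FRAMES))
```

In this invented example both methods pick the same frame, but with real recordings they can disagree, which is why sticking to one method matters.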

Third, plot F1 against F2 in a graph. The easiest way is probably to do what I did and simply take a picture of a chart and then manually plot your vowels in any decent image editing program. Creating a graph like my cardinal vowel graph is pretty easy with any spreadsheet software.
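If you would rather not plot by hand, a few lines of code can do a crude version of the same thing. The sketch below is not how I made my charts. It places the eight vowels on a text grid with high F2 to the left and low F1 at the top, which is the usual orientation of a vowel chart. For simplicity it uses linear axes, and the axis ranges, grid size and function names are arbitrary choices made for this example.

```python
# F1/F2 in Hz for the eight cardinal vowels, in cardinal vowel order.
VOWELS = {
    "i": (253, 2309), "e": (335, 2094), "ɛ": (461, 1702), "a": (636, 1404),
    "ɑ": (580, 1007), "ɔ": (424, 672), "o": (361, 609), "u": (245, 446),
}


def grid_position(f1, f2, width=50, height=12,
                  f1_range=(200, 700), f2_range=(400, 2400)):
    """Map (F1, F2) in Hz to a (row, column) cell on a text grid.

    High F2 ends up on the left and low F1 at the top. The ranges are
    assumptions that happen to fit these eight values.
    """
    f1_lo, f1_hi = f1_range
    f2_lo, f2_hi = f2_range
    col = round((f2_hi - f2) / (f2_hi - f2_lo) * (width - 1))
    row = round((f1 - f1_lo) / (f1_hi - f1_lo) * (height - 1))
    return row, col


def draw_chart(vowels, width=50, height=12):
    """Return the chart as a multi-line string, one symbol per vowel."""
    grid = [[" "] * width for _ in range(height)]
    for symbol, (f1, f2) in vowels.items():
        row, col = grid_position(f1, f2, width, height)
        grid[row][col] = symbol
    return "\n".join("".join(line).rstrip() for line in grid)


print(draw_chart(VOWELS))
```

The output is rough, but [i] lands in the top left corner, [u] in the top right and [a] near the bottom, which is the shape you should expect from any proper plot of the same numbers.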

Conclusion

The main point of writing this article is that I enjoy it. There are also secondary reasons, like sharing what I have done with others and the fact that I learn a lot about this simply by being forced to write about it rather than just doing it. This is just the first article in this series. Next time I’ll look at monophthongs in Mandarin Chinese and how these relate to the vowel space I drew in this little experiment. I will then move on to diphthongs and triphthongs before leaving vowels altogether and starting to look at tones, consonants and so on. Stay tuned!
