Mandarin has four tones — 1st: high & level (mā), 2nd: rising (má), 3rd: dipping/low (mǎ), 4th: falling (mà) — plus a neutral tone (ma). Tone is part of the word, not decoration: change the tone and you change the meaning entirely.
The single sound “ma” can mean mother, hemp, horse, or scold — depending only on its pitch. That’s why tones aren’t optional polish in Chinese; they’re the difference between words. The good news: there are only five pitch patterns to master, the rules behind them are short, and you can train your ear far faster than you think. This guide walks through each tone, the changes that happen in real speech, the mistakes that trip up beginners, and a daily routine to fix your ear.
The classic example: one syllable, four meanings
Why tones exist (and why English speakers miss them)
English does use pitch — but for emotion and emphasis, not to change which word you mean. “Really?” rising sounds skeptical; “Really.” falling sounds flat. The word is the same either way. In Mandarin, pitch is lexical: it’s baked into the word the way a vowel is. So when a beginner says “ma” with whatever pitch feels natural, they’re unknowingly switching between mother, horse and scold.
This is why tones feel hard at first: your brain has spent your whole life treating pitch as irrelevant to which word was said. The skill isn’t physical (most people can produce all the pitches easily); it’s perceptual and habitual. You have to learn to notice tone and to store it as part of every word.
The 5-level pitch system
Linguists describe Mandarin pitch on a scale from 1 (lowest) to 5 (highest). Every tone is just a path through that scale:
| Tone | Contour (5-level) | Shape |
|---|---|---|
| 1st | 5 → 5 | High and flat, held steady |
| 2nd | 3 → 5 | Rises from mid to high |
| 3rd | 2 → 1 → 4 | Dips low, then rises (in isolation) |
| 4th | 5 → 1 | Drops sharply from high to low |
| neutral | light | Short, unstressed, pitch set by the syllable before it |
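If it helps to see the scale as data, here is a minimal Python sketch of the table above. The names `TONE_CONTOURS` and `contour_shape` are made up for illustration, and the neutral tone is left out because it has no pitch path of its own.

```python
# Contours from the table above, as points on the 1 (lowest) to 5 (highest) scale.
# The neutral tone is omitted: its pitch depends on the syllable before it.
TONE_CONTOURS = {
    1: (5, 5),     # high and flat
    2: (3, 5),     # mid to high
    3: (2, 1, 4),  # dips low, then rises (in isolation)
    4: (5, 1),     # high to low
}


def contour_shape(contour):
    """Classify a contour as level, rising, falling or dipping."""
    start, end = contour[0], contour[-1]
    # A dip means the lowest point sits below both the start and the end.
    if min(contour) < min(start, end):
        return "dipping"
    if end > start:
        return "rising"
    if end < start:
        return "falling"
    return "level"


for tone, contour in TONE_CONTOURS.items():
    print(tone, contour, contour_shape(contour))
```

Reading a tone as "a path through the scale" is all the function does: it compares where the pitch starts, where it ends, and whether it goes lower in between.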
Tone by tone — how each one moves
| Tone | Pitch shape | Feels like | Example |
|---|---|---|---|
| 1st — high level | Flat & high (5→5) | Holding a steady note, like a doctor’s “ahh” | 高 gāo — tall |
| 2nd — rising | Mid up to high (3→5) | A surprised “huh?” | 来 lái — to come |
| 3rd — dipping | Low dip (2→1→4) | A doubtful “wellll…” | 好 hǎo — good |
| 4th — falling | High down to low (5→1) | A sharp command “Stop!” | 去 qù — to go |
| neutral | Light & quick | Unstressed, drops onto the previous tone | 吗 ma — question particle |
Production tips for each tone
How to actually make the sound
- 1st tone: sing it. Pick a comfortably high note and hold it flat — no drifting up or down. Think of singing “laaa” on one pitch.
- 2nd tone: say it like a genuine question — “What?” The pitch climbs from the middle to the top. Don’t start too low or you’ll run out of room.
- 3rd tone: in real speech it’s mostly a low tone — let your voice drop to the bottom of your range and stay there. The full dip-then-rise only happens when the word stands completely alone.
- 4th tone: be decisive — a short, sharp drop, like snapping “No!” or stamping a foot. It’s the most forceful tone.
- Neutral tone: say it quick and quiet, with no pitch target of its own. 妈妈 māma is a 1st tone followed by a light, low neutral.
The skill nobody trains: hearing tones
Most beginners practice producing tones but never perceiving them — then wonder why native speech sounds like a blur. Pitch perception is a trainable skill. The fastest method is minimal pairs: listen to two words that differ only by tone and force a choice.
| Pair | Word A | Word B |
|---|---|---|
| 3 vs 4 | 买 mǎi — buy | 卖 mài — sell |
| 1 vs 4 | 汤 tāng — soup | 烫 tàng — scalding hot |
| 1 vs 4 | 书 shū — book | 树 shù — tree |
| 2 vs 4 | 床 chuáng — bed | 创 chuàng — to create |
| 4 vs 3 | 问 wèn — ask | 吻 wěn — kiss |
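A minimal pair is simply "same syllable, different tone", and that check is easy to sketch in Python if you want to build your own drill list. The helper names below (`split_tone`, `is_minimal_pair`) and the choice of 0 for the neutral tone are assumptions for this example, not part of any standard tool. The sketch reads the tone from the pinyin accent mark using Python's standard `unicodedata` module.

```python
import unicodedata

# Combining accent marks used for pinyin tones, mapped to tone numbers.
TONE_MARKS = {
    "\u0304": 1,  # macron, as in mā
    "\u0301": 2,  # acute, as in má
    "\u030c": 3,  # caron, as in mǎ
    "\u0300": 4,  # grave, as in mà
}


def split_tone(syllable):
    """Return (toneless syllable, tone number); 0 means no mark (neutral)."""
    tone = 0
    base = []
    # NFD splits a letter like "ǎ" into "a" plus a separate combining mark.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            base.append(ch)
    # NFC puts any remaining marks (such as the dots on ü) back on their letter.
    return unicodedata.normalize("NFC", "".join(base)), tone


def is_minimal_pair(a, b):
    """True when two syllables share the same sounds but differ in tone."""
    base_a, tone_a = split_tone(a)
    base_b, tone_b = split_tone(b)
    return base_a == base_b and tone_a != tone_b


print(split_tone("mǎi"))              # ('mai', 3)
print(is_minimal_pair("mǎi", "mài"))  # True
print(is_minimal_pair("tāng", "shù")) # False
```

It only handles one syllable at a time, which is enough for the single-syllable pairs in the table above.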
Ear-training that works
- Drill minimal pairs daily until you can pick the right word with your eyes closed.
- Always learn a word with its tone — never store “ma” and the tone separately.
- Shadow native audio: play, pause, imitate the pitch contour out loud, then compare.
- Use color: associating each tone with a fixed color makes recognition close to automatic over time.
Tone pairs — the real unit of speech
Natives don’t produce tones one isolated syllable at a time; they produce them in pairs and chunks. Practicing two-syllable words trains the transitions, which is where most beginner errors actually happen. A few worth drilling:
| Pattern | Example | Meaning |
|---|---|---|
| 1 + 1 | 飞机 fēijī | airplane |
| 2 + 1 | 明天 míngtiān | tomorrow |
| 3 + 1 | 老师 lǎoshī | teacher |
| 4 + 4 | 再见 zàijiàn | goodbye |
| 2 + 4 | 学校 xuéxiào | school |
| 3 + neutral | 喜欢 xǐhuan | to like |
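One way to organize this kind of drilling is to sort your vocabulary by tone pattern, so you practice one transition at a time. The sketch below is a rough illustration: the word list format (pinyin plus a tuple of tone numbers, with 0 standing for neutral) and the function names are assumptions made for this example.

```python
from collections import defaultdict


def pattern_label(tones):
    """Turn tone numbers into a label like '3 + 1'; 0 is treated as neutral."""
    return " + ".join("neutral" if t == 0 else str(t) for t in tones)


def group_by_pattern(words):
    """Group (pinyin, tones) entries by their tone pattern."""
    groups = defaultdict(list)
    for pinyin, tones in words:
        groups[pattern_label(tones)].append(pinyin)
    return dict(groups)


# A few entries from the table above, with tones typed in by hand.
words = [
    ("fēijī", (1, 1)),
    ("lǎoshī", (3, 1)),
    ("zàijiàn", (4, 4)),
    ("xuéxiào", (2, 4)),
    ("xǐhuan", (3, 0)),
]

for pattern, items in group_by_pattern(words).items():
    print(pattern, items)
```

Once the list is grouped, a drill session can pull a handful of words from a single pattern and repeat them back to back.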
Tone change rules (sandhi) you must know
Tones shift in connected speech. Three rules cover almost everything:
| Rule | What happens | Example |
|---|---|---|
| 3 + 3 | First 3rd tone becomes a 2nd (rising) | 你好 nǐ hǎo → ní hǎo |
| 不 (bù) + 4th | bù becomes bú | 不是 bù shì → bú shì |
| 一 (yī) shift | yì before 1/2/3, yí before 4th | 一个 yī gè → yí gè |
A note on writing: pinyin is usually written with the original tone marks even when sandhi changes the pronunciation, so 你好 stays “nǐ hǎo” on the page but is said “ní hǎo”. Don’t memorize these as abstract tables — learn them inside real words, where they’ll feel natural.
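The three rules are mechanical enough to write down as code, which can be a useful check on your own understanding. This is a simplified Python sketch, not a full sandhi engine: the function name `spoken_tones` and the input format (character plus dictionary tone, 0 for neutral) are assumptions, it only looks at adjacent pairs, and it ignores details the rules above don't cover, such as 一 used when counting or longer runs of 3rd tones.

```python
def spoken_tones(syllables):
    """Apply the three rules above to a list of (character, dictionary tone).

    Returns the tones as pronounced. Each syllable is checked only against
    the dictionary tone of the syllable right after it.
    """
    spoken = [tone for _, tone in syllables]
    for i in range(len(syllables) - 1):
        char, tone = syllables[i]
        next_tone = syllables[i + 1][1]
        if char == "不" and next_tone == 4:
            spoken[i] = 2  # bù becomes bú before a 4th tone
        elif char == "一" and next_tone == 4:
            spoken[i] = 2  # yī becomes yí before a 4th tone
        elif char == "一" and next_tone in (1, 2, 3):
            spoken[i] = 4  # yī becomes yì before 1st, 2nd or 3rd
        elif tone == 3 and next_tone == 3:
            spoken[i] = 2  # first of two 3rd tones rises
    return spoken


print(spoken_tones([("你", 3), ("好", 3)]))  # [2, 3]
print(spoken_tones([("不", 4), ("是", 4)]))  # [2, 4]
print(spoken_tones([("一", 1), ("个", 4)]))  # [2, 4]
```

The input keeps the dictionary tones and the output gives the spoken ones, which mirrors how pinyin is written versus how it is said.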
The 3rd tone deep dive (the one everyone gets wrong)
The 3rd tone causes more trouble than the other three combined, because textbooks teach its dramatic dip-then-rise shape — but that full shape only appears when the syllable is said completely alone. In actual sentences:
- Before a 1st, 2nd, 4th or neutral tone, it’s a low half-tone: just drop your voice and stay low. No rise.
- Before another 3rd tone, it becomes a 2nd (rising) tone (the sandhi rule above).
- Only at the very end of a phrase, or in isolation, does the full low-then-rising contour show up.
The classic beginner error is over-dipping every 3rd tone into a big U-shape, which sounds unnatural and slow. Fix: think “low”, not “dip”.
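The three cases can be summed up as a small lookup, sketched here in Python. The function name is hypothetical, and the 2 → 1 contour for the low half-tone is an assumption based on the usual description of it as the dip without the rise; the other two contours come from the 5-level table earlier in this guide.

```python
def third_tone_contour(next_tone=None):
    """Return the pitch path of a 3rd tone, given the tone that follows it.

    next_tone is 1-4, 0 for neutral, or None when nothing follows
    (the syllable is alone or at the very end of a phrase).
    """
    if next_tone is None:
        return (2, 1, 4)  # full dip-then-rise
    if next_tone == 3:
        return (3, 5)     # sandhi: pronounced like a 2nd tone
    return (2, 1)         # low half-tone (assumed contour): drop and stay low


print(third_tone_contour())   # (2, 1, 4)
print(third_tone_contour(3))  # (3, 5)
print(third_tone_contour(1))  # (2, 1)
```

Notice that the full contour is the fallback for one narrow case only, which is the point of thinking "low" rather than "dip".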
Common mistakes to avoid
- Learning the word first, the tone later. The tone is the word. Store them together from the start.
- Adding English intonation. Raising your pitch at the end of a question overrides the real tones. Keep each word’s tone intact and mark the question with 吗 (ma) instead.
- Practicing only single syllables. Drill words and short phrases so you train the transitions.
- Ignoring perception. If you can’t hear the difference, you can’t reliably produce it. Train your ear first.
Train tones the way your ear learns
Hanzijo color-codes all five tones everywhere, pairs every word with native audio, and includes a dedicated tone trainer with minimal-pair drills. Tone sandhi is built into the audio, so you hear the real shift — not the textbook one. Hear it, repeat it, lock it in with SRS.
A 10-minute daily tone routine
| Minutes | Activity |
|---|---|
| 0–3 | Minimal-pair listening: choose A or B, check, repeat |
| 3–6 | Shadow 5 two-syllable words — copy the contour out loud |
| 6–8 | Record yourself saying today’s words; compare to native audio |
| 8–10 | SRS review of any tones you missed this week |
Frequently asked questions
How many tones does Mandarin have?
Four main tones plus a neutral tone. Cantonese has more, but standard Mandarin (the one HSK tests) uses five pitch categories total.
Do I really need perfect tones to be understood?
Context carries you far, but wrong tones cause real misunderstandings and make you harder to follow. Aim for accurate tone pairs in words rather than perfect isolated syllables — that’s how natives actually produce them.
What’s the hardest tone?
The 3rd tone, because its full dipping shape only appears in isolation; in connected speech it’s usually a low half-tone. Learning it inside words avoids the classic over-dipping mistake.
Do tones change in questions?
No. Unlike English, you don’t raise your pitch at the end of a question. The tones stay the same, and the question is marked grammatically — for example by adding 吗 (ma) or using a question word.