What makes a voice sound natural?🤔| Intonation Analysis of Google Assistant | American Accent

86,025 views

2018-05-14 ・ Accent's Way English with Hadar



00:00
Hey guys, it's Hadar, and this is the Accent's Way. A few days ago, Google's CEO, Sundar Pichai, presented
00:08
an extraordinary demo showing the new capabilities of the Google Assistant.
00:20
In the demo, he played a real conversation between the Google Assistant,
00:24
which is a robot, AI technology, and a real human being.
00:29
The stunning thing was that in the conversation the person could not detect
00:34
that she was not speaking to a real person but to a machine.
00:38
And the reason for that is, of course, the algorithm and the ability to use the right sentences according to the
00:44
nuances in the conversation, but also the way it was executed, the sound of the voice, the intonation.
00:53
What makes a voice sound natural?
00:56
What did they do, over there at Google, that made the voice sound so natural that the person could not
01:03
imagine they were speaking to a robot?
01:06
So what we're going to do, is we're gonna analyze the
01:08
conversation together and I'm going to pinpoint the places where the Google voice sounds so natural
01:15
and explain why it makes it sound like a real human being.
01:24
Okay, so she starts with... that's the Google Assistant, right? It's a machine, it's not a real person. I know, it's crazy.
01:32
'Hi'
01:33
'Hi', right.
01:35
The way of saying 'Hi' this way, is a very welcoming, nice, warm, friendly way of saying it.
01:41
There's always a glide from high to low.
01:43
'Hi'
01:44
And listen to the ending.
01:46
'Hi'
01:48
I'm going down. It's not 'Hi'.
01:51
I'm going up
01:52
'Hi'
01:54
And then there is a little tail going up at the end.
01:57
'Hi'
01:58
That means that something else is coming up, I'm not done. And then she says something like this.
02:03
'Hi'
02:04
'I'm calling to book a woman's haircut for a client'
02:08
Now in English, when you start a new idea, when you start a conversation, when you have a question, you
02:13
kind of start high in pitch.
02:16
'I'm calling...'
02:17
It's not
02:18
'I'm calling to book a woman's haircut for a client.'
02:21
'I'm calling to book a....'
02:23
It's like asking for permission or telling you something new.
02:27
'Hi'
02:28
And notice it: now, like, start listening to how people start asking questions
02:32
or starting sentences or new ideas.
02:35
There's always this wavy thing at the beginning, like a really high-pitched tone that they begin with
02:40
regardless of what words they're choosing to stress.
02:43
Now in the sentence
02:45
'Hi, I'm calling to book a woman's haircut for a client.'
02:50
'I'm calling to book a woman's haircut for a client.'
02:54
So there is this rise in pitch at the beginning.
02:58
'I'm calling...'
02:59
And 'calling' is a stressed word.
03:01
'...to book...'
03:02
That's a little less stressed, so it goes down.
03:04
'...a woman's haircut...'
03:06
Right, that's the subject, that's what I'm calling to book, that goes higher in pitch
03:10
'...for a client...'
03:11
And then there is this rising-rising intonation, the up-speak, where I go up.
03:16
That means that there is something else coming up, and then she continues
03:24
'I'm looking for something on May 3rd'
03:26
So she stresses the word 'looking', and she starts again high in pitch at the beginning of the sentence.
03:31
'I'm looking for something on May 3rd.'
03:33
And then she goes up in pitch at the end.
03:35
Now, look, it's totally okay, and sometimes even better to end it like a statement.
03:41
'I'm looking for something on May 3rd'
03:43
Right, and then it's a rising intonation and then you drop it down.
03:47
However, this rising-rising intonation at the end of a sentence
03:51
even if it's not a question, it's a very common speech pattern in America nowadays.
03:57
Which made it sound even more natural than just a regular ending statement.
04:03
'I'm looking for something on May 3rd'
04:05
And that open ending leaves more room for an answer.
04:09
It means that I'm waiting for an answer from you, but it's sort of like a question.
04:14
And then there is this part
04:22
'Mm-hmm'
04:24
Which is fantastic. What sounds more natural than
04:27
'Mm-hmm'
04:28
That's what we say, notice even here there is this glide in intonation.
04:33
'Mm-hmm'
04:34
Again going up in pitch, making it sound more natural.
04:37
Like someone would actually say it like that.
04:45
'At 12 pm'
04:47
Now we can learn a lot just from this one statement.
04:51
Notice that every syllable hits a different note. It's not all on the same note.
04:57
'At 12 pm'
04:58
'At 12 pm'
05:00
'At 12 pm'
05:03
Right and even the 'm' is kind of like gliding down.
05:08
Okay, so it goes up in pitch and then it goes down.
05:11
'At 12 pm'
05:24
'Do you have anything between...'
05:26
'Do you have anything between...'
05:28
A question
05:29
'Do you have...'
05:29
Reduction at the beginning
05:31
'Do you have anything between...'
05:32
Again starting with a higher pitch.
05:34
'Do you have anything between 10 am...'
05:37
Pause
05:39
Because people pause, they want to think about what they want to say
05:42
'...and 12 pm'
05:44
Okay, so it's not 'between 10 am and 12 pm'
05:46
The system knows what hours it's going to suggest, but it takes that little pause to make it sound more natural.
05:52
So phrasing is crucial when we speak English.
05:56
Phrasing, filler words, intonation patterns, stressed words, so the rising-rising intonation.
06:04
But then also the falling intonation at the end, to indicate that I'm done.
06:16
'Just a woman's haircut for now'
06:18
So again, this glide at the beginning, this high pitch at the beginning, just a woman's, and then she goes down
06:25
'...haircut for now.'
06:26
The assistant could have answered 'a woman's haircut'
06:29
but they added the 'just' and the 'for now'.
06:33
So 'just a woman's haircut'
06:35
the 'just' is not an essential word here
06:37
but it's a filler word that a lot of people use, which made it sound more natural.
06:41
'Just a woman's haircut for now'
06:43
And 'for now' is just another filler phrase that says, 'Well, let's begin with that and see where we go.'
06:48
It's a polite way of saying 'that's it'. I don't need anything else.
06:52
'just a woman's haircut for now'
06:55
So those extra words
06:57
extra phrases, extra sounds, make it sound more natural and not like a robot.
07:03
And the thing is that these extra sounds and extra words are not usually used by non-native
07:09
speakers because we use efficient English. The way English is taught is through very concise sentences:
07:16
'this is how you say it'
07:17
and then you learn that people use all these extra phrases and sounds
07:21
'hmm'
07:22
'aah'
07:23
'well'
07:24
'for now'
07:25
'just'
07:26
Okay, all these extra phrases that make it sound more conversational and that's a way to communicate
07:32
and make it sound more friendly and polite.
07:40
'10 am is fine'
07:42
Again, that rising, rising intonation. She could have said
07:45
'10 am is fine.'
07:47
'10 am is fine.'
07:48
but
07:49
'10 am is fine.'
07:50
makes it sound a little more friendly, a little less aggressive, a little less determined
07:57
'10 am is fine.'
07:58
I'm still waiting for an answer. I need you to approve it still.
08:02
'10 am is fine.'
08:03
And again notice that high pitch at the beginning
08:06
'10 am is fine.'
08:12
Again, up-speak at the end.
08:14
'The first name is Lisa'
08:16
It's not a question. So why does she go up in pitch?
08:20
Because that's a common speech pattern which makes it sound so natural.
08:24
You as a non-native speaker don't have to use it.
08:27
You can definitely go high in pitch and drop down at the end.
08:31
'The first name is Lisa.'
08:33
I'm fond of this kind of conversation, where you go up and close it at the end.
08:37
But notice that these are the patterns that they chose to use, knowing that it would make it sound more natural.
08:52
'Okay, great!'
08:53
'Okay, great!'
08:55
She could have said just
08:56
'Thank you!'
08:57
'Okay, great!'
08:58
That's how people comment on something that they're happy about.
09:01
'Okay, great!'
09:02
'Thanks!'
09:02
And there is a build up here in terms of the intonation.
09:05
That shows that one thing is a little more important than the other.
09:10
'Okay, great!'
09:11
'Thanks!'
09:12
Rising, falling and then rising intonation at the end.
09:15
So to conclude, in order to answer our question
09:18
What makes a voice sound more natural?
09:21
We looked at what the people at Google did to make their Google Assistant sound like a real human being.
09:26
So when it comes to intonation, it wasn't monotonous.
09:29
'Hi, I'd like to book a woman's haircut'
09:31
But it had that nice glide
09:33
'Hi, I'd like to book a woman's haircut'
09:36
So every syllable had a different note.
09:39
Also, at the beginning of an idea or a sentence, it started high in pitch.
09:45
Every important word stuck out.
09:47
So it was a little higher in pitch and longer.
09:49
And every sentence ending ended up with rising-rising intonation.
09:55
Almost like a question even though it wasn't always a question.
09:58
Why? Because up-speak is a common speech pattern in the U.S. today, whether you like it or not.
10:06
Another thing they added is those extra words
10:09
'just'
10:09
'for now'
10:10
'hmm'
10:11
Extra sounds.
10:12
'Mm-hmm', that made it sound more natural and even here intonation played a major role.
10:19
Because it wasn't flat.
10:20
'Mm-hmm'
10:21
'Mm-hmm'
10:22
Right, it was really like music.
10:25
'hmm'
10:26
And the last thing was phrasing, taking small pauses to indicate that the person is thinking
10:32
I mean the machine is thinking, I mean the assistant is thinking.
10:37
I don't even know what to call it anymore. This is how people actually speak. They take small pauses between
10:43
chunks, parts of the sentence, not between words and not only at the end of the sentence.
10:49
As I said, we want to recognize these patterns as we just did today and recognize what makes it sound more
10:55
natural, more conversational and then take these elements and add them to our speech in English.
11:02
And it's also great for you as a speaker, because sometimes you need to come up with the right words
11:07
so it doesn't have to be 100% concise.
11:11
Because it's not concise for American speakers either, and it can give you time, those extra filler words like
11:18
'hmm' and 'well'
11:20
And the phrases and the pauses and the extra words like
11:24
'just' and 'okay'
11:26
That can buy you some time to come up with the right word, in order to
11:32
convey what you want to say.
11:33
And as a side note, to all you non-native speakers out there
11:38
when we look at the presentation, we see that Sundar, Google's CEO, is not a native English speaker.
11:44
And he is a phenomenal presenter.
11:47
This is to say, that you don't have to lose your accent to be a great speaker in English.
11:53
In fact, the accent is an advantage. It reveals some layers that you have as a speaker.
11:59
It shows that you carry your history behind you, that you have an interesting story.
12:05
You don't want to lose your accent. You don't want to hide your accent.
12:08
You do want to use the elements of speech to sound great.
12:13
To convey your message, to be a strong speaker, to speak slowly, to be clear, to be understood.
12:19
But it doesn't mean that you need to lose your accent.
12:22
So when you work on your accent, and intonation, and rhythm, and stress, your goal should not necessarily be
12:29
'lose your accent, speak like a native speaker.'
12:32
But be the best speaker that you can.
12:36
With or without a foreign accent, because that doesn't really matter.
12:41
What matters is how you feel about yourself and how you convey your message
12:46
and if you're clear and communicative.
12:49
Now I have a question for you.
12:51
What other elements of speech, whether it's specific words or phrases or intonation patterns, do people use
12:58
that make them sound more natural? What have you noticed? What are you using?
13:04
So let me know in the comments below
13:06
'So' is one of them, I use 'so' all the time, you've probably noticed.
13:10
That's it! Thank you so much for watching.
13:13
Please share this video with your friends if you liked it and don't forget to subscribe to my YouTube channel
13:18
and click on the bell to get notifications.
13:20
There are a lot more videos coming up about American intonation, so you don't want to miss out.
13:26
Have a wonderful week and I'll see you next week, in the next video.
13:31
Bye.