Peter Donnelly: How stats fool juries

246,556 views ・ 2007-01-12

TED

Translation: Dae-Ki Kang. Review: John Han

00:25

As other speakers have said, it's a rather daunting experience --

25000

2000

다른 연설자께서 말씀하신 것처럼, 이건 상당히 기죽을만한 경험이군요.

00:27

a particularly daunting experience -- to be speaking in front of this audience.

27000

3000

특히 여러분과 같은 청중들 앞에서 발표를 하는 건, 특히 위축될만한 경험입니다.

00:30

But unlike the other speakers, I'm not going to tell you about

30000

3000

그럼에도, 다른 연설자들과 달리, 저는 여러분에게

00:33

the mysteries of the universe, or the wonders of evolution,

33000

2000

우주의 신비나, 또는 진화의 경이로움,

00:35

or the really clever, innovative ways people are attacking

35000

4000

또는 사람들이 우리 세계의 심각한 불평등들을 공략하고자 하기 위한

00:39

the major inequalities in our world.

39000

2000

정말로 지혜롭고 혁신적인 방안들에 대해서 얘기하진 않을 것입니다.

00:41

Or even the challenges of nation-states in the modern global economy.

41000

5000

또는 현대 글로벌 경제에서 민족국가들이 직면한 문제들에 대해서도 얘기하지 않을 것입니다.

00:46

My brief, as you've just heard, is to tell you about statistics --

46000

4000

여러분이 방금 들은 것처럼, 전 간단히 여러분께 통계학에 대해 말씀드리고

00:50

and, to be more precise, to tell you some exciting things about statistics.

50000

3000

그리고, 정확히 말하자면, 여러분께 통계학에 관한 재미있는 것들을 알려드리겠습니다.

00:53

And that's --

53000

1000

그리고 그건

00:54

(Laughter)

54000

1000

(웃음)

00:55

-- that's rather more challenging

55000

2000

그건 약간 더 난감한 것입니다.

00:57

than all the speakers before me and all the ones coming after me.

57000

2000

저보다 먼저 연설했던 모든 사람들과 앞으로 연설할 모든 사람들보다도 말입니다.

00:59

(Laughter)

59000

1000

(웃음)

01:01

One of my senior colleagues told me, when I was a youngster in this profession,

61000

5000

제가 이 분야에서 초보자였을때, 선배 중 한명이 저에게 이렇게 말했습니다.

01:06

rather proudly, that statisticians were people who liked figures

66000

4000

상당히 자랑스럽게 말하기를, 통계학자들은 수치를 좋아하는 사람들인데

01:10

but didn't have the personality skills to become accountants.

70000

3000

그들은 회계사가 될만한 사교성은 가지고 있지 않다고 말이죠.

01:13

(Laughter)

73000

2000

(웃음)

01:15

And there's another in-joke among statisticians, and that's,

75000

3000

그리고 통계학자들에 대한 그들만의 다른 농담도 있는 데요.

01:18

"How do you tell the introverted statistician from the extroverted statistician?"

78000

3000

"내성적인 통계학자와 외향적인 통계학자를 어떻게 구별하는지 아십니까?"

01:21

To which the answer is,

81000

2000

답은 말이죠.

01:23

"The extroverted statistician's the one who looks at the other person's shoes."

83000

5000

외향적인 통계학자는 다른 사람의 신발까지는 쳐다볼 수 있다는 겁니다.

01:28

(Laughter)

88000

3000

(웃음)

01:31

But I want to tell you something useful -- and here it is, so concentrate now.

91000

5000

그렇지만, 전 여러분에게 뭔가 유용한 걸 얘기하고 싶고 -- 그걸 가지고 왔습니다. 그러니, 이젠 집중해 주시기 바랍니다.

01:36

This evening, there's a reception in the University's Museum of Natural History.

96000

3000

오늘 저녁, 대학의 자연사 박물관에서 리셉션이 있습니다.

01:39

And it's a wonderful setting, as I hope you'll find,

99000

2000

그리고, 그건 매우 훌륭하게 준비되어 있다는 걸, 여러분들도 알게 되길 원합니다.

01:41

and a great icon to the best of the Victorian tradition.

101000

5000

그리고 그건 빅토리아 시대의 전통 중 최고 수준의 표상입니다.

01:46

It's very unlikely -- in this special setting, and this collection of people --

106000

5000

이렇게 특별한 설정과 사람들의 모임에서 -- 잘 발생할 거 같지 않은 일이지만

01:51

but you might just find yourself talking to someone you'd rather wish that you weren't.

111000

3000

당신은, 당신이 별로 얘기하고 싶어하지 않는 사람과 얘기하게 된다고 합시다.

01:54

So here's what you do.

114000

2000

그런 경우, 당신은 이렇게 할 수 있습니다.

01:56

When they say to you, "What do you do?" -- you say, "I'm a statistician."

116000

4000

그들이 당신에게 "직업이 뭡니까"라고 물었을 때, "통계학자입니다"라고 대답하는 거죠.

02:00

(Laughter)

120000

1000

(웃음)

02:01

Well, except they've been pre-warned now, and they'll know you're making it up.

121000

4000

뭐... 다만.. 여기서 예외는 그들이 이러한 상황에 대해 미리 주의를 했고, 당신이 거짓말을 했다는 사실을 눈치채는 경우겠죠.

02:05

And then one of two things will happen.

125000

2000

아무튼 그런 대답을 듣고 나면, 다음 두가지 중 하나가 발생할 겁니다.

02:07

They'll either discover their long-lost cousin in the other corner of the room

127000

2000

그들은 갑자기 방의 한 구석에서 오랜동안 헤어졌던 사촌을 발견하고

02:09

and run over and talk to them.

129000

2000

그 사촌에게 달려가서 얘기를 한다든지...

02:11

Or they'll suddenly become parched and/or hungry -- and often both --

131000

3000

아니면, 갑자기 목이 마르거나 배가 고파지거나 -- 때로는 둘 다가 되서 --

02:14

and sprint off for a drink and some food.

134000

2000

물을 마시고 음식을 먹기 위해 달려가겠죠.

02:16

And you'll be left in peace to talk to the person you really want to talk to.

136000

4000

그리고 당신은 다시 평화롭게 남겨져서, 당신이 정말 대화하고 싶은 사람과 대화를 할 수 있게 됩니다.

02:20

It's one of the challenges in our profession to try and explain what we do.

140000

3000

이게 바로 우리같은 직업을 가진 사람들이 우리가 뭘 하는지 설명하려면 받게 되는 도전입니다.

02:23

We're not top on people's lists for dinner party guests and conversations and so on.

143000

5000

우리는 디너 파티 초대 손님이나 대화 상대 등등의 리스트에서, 상위에 있는 인기있는 사람들은 아닙니다.

02:28

And it's something I've never really found a good way of doing.

148000

2000

그리고 이러한 문제는 저로서는 절대 좋은 해결 방법을 찾아내지 못하는 것이기도 합니다.

02:30

But my wife -- who was then my girlfriend --

150000

3000

그런데, 제 아내는 -- 당시에는 여자 친구였는데요. --

02:33

managed it much better than I've ever been able to.

153000

3000

결국 제가 할 수 있는 것보다 더 나은 방법을 찾아냈습니다.

02:36

Many years ago, when we first started going out, she was working for the BBC in Britain,

156000

3000

오래 전에, 우리가 처음 데이트를 하기 시작했을 때, 그녀는 영국의 BBC에서 일하고 있었습니다.

02:39

and I was, at that stage, working in America.

159000

2000

그리고 그 당시 저는 미국에서 일하고 있었죠.

02:41

I was coming back to visit her.

161000

2000

제가 여자친구를 만나기 위해 돌아왔는데요.

02:43

She told this to one of her colleagues, who said, "Well, what does your boyfriend do?"

163000

6000

그녀가 이 사실을 동료에게 말하자, 동료는 "음, 네 남자친구는 뭐하는 사람인데?"라고 물었습니다.

02:49

Sarah thought quite hard about the things I'd explained --

169000

2000

사라는 제가 설명했던 것들을 매우 열심히 생각했죠.

02:51

and she concentrated, in those days, on listening.

171000

4000

그녀는, 적어도 그 당시에는, 제 말을 집중해서 들었었습니다.

02:55

(Laughter)

175000

2000

(웃음)

02:58

Don't tell her I said that.

178000

2000

제가 이렇게 얘기했다고 사라에게 말하지 마세요.

03:00

And she was thinking about the work I did developing mathematical models

180000

4000

그리고, 그녀는 제가 개발하고 있는 수학적 모델들에 관한 작업에 대해 생각했죠.

03:04

for understanding evolution and modern genetics.

184000

3000

그 수학적 모델들은 진화와 현대 유전학을 이해하기 위한 것들이었습니다.

03:07

So when her colleague said, "What does he do?"

187000

3000

따라서, 그녀의 동료가 "네 남자친구는 뭐하는 사람이야?"라고 묻자,

03:10

She paused and said, "He models things."

190000

4000

그녀는 잠시 생각하고는 이렇게 말했습니다. "그 사람은 뭔가 모델링하는 사람이야."

03:14

(Laughter)

194000

1000

(웃음)

03:15

Well, her colleague suddenly got much more interested than I had any right to expect

195000

4000

음, 그 동료는 갑자기 제가 예상할 수 있는 것보다 훨씬 더 흥미를 느꼈습니다.

03:19

and went on and said, "What does he model?"

199000

3000

그래서 계속해서 말했죠. "그 사람은 뭘 모델링하는 데?"

03:22

Well, Sarah thought a little bit more about my work and said, "Genes."

202000

3000

뭐, 사라는 제가 하는 작업에 대해 더 생각해 보고는 이렇게 말했습니다. "유전자들".

03:25

(Laughter)

205000

4000

(웃음)

03:29

"He models genes."

209000

2000

"그는 유전자들을 모델링해."

03:31

That is my first love, and that's what I'll tell you a little bit about.

211000

4000

이게 바로 제 첫사랑 얘기이고, 제가 여러분에게 약간 얘기하게 될 것입니다.

03:35

What I want to do more generally is to get you thinking about

215000

4000

제가 좀더 일반적으로 하고 싶은 것은 여러분이 생각하게 하고 싶습니다.

03:39

the place of uncertainty and randomness and chance in our world,

219000

3000

우리 세계 안의 불확실성(uncertainty)과 무작위성(randomness) 그리고 가능성(chance)의 장소에 대해서 말입니다.

03:42

and how we react to that, and how well we do or don't think about it.

222000

5000

그리고 우리가 그것에 대해 어떻게 반응하는지, 그리고 우리가 그것을 얼마나 잘 또는 잘못 생각하는지에 대해서 말입니다.

03:47

So you've had a pretty easy time up till now --

227000

2000

자, 지금까지는 여러분에게 매우 쉬웠습니다.

03:49

a few laughs, and all that kind of thing -- in the talks to date.

229000

2000

본 발표에서 지금까지는, 약간 웃고, 뭐 그런 것들이었습니다.

03:51

You've got to think, and I'm going to ask you some questions.

231000

3000

여러분은 이제 생각해야 합니다. 그리고 저는 여러분에게 몇가지 질문을 하겠습니다.

03:54

So here's the scene for the first question I'm going to ask you.

234000

2000

여기에 제가 여러분에게 드리는 첫번째 질문이 있습니다.

03:56

Can you imagine tossing a coin successively?

236000

3000

동전을 계속해서 던지는 경우를 상상할 수 있겠습니까?

03:59

And for some reason -- which shall remain rather vague --

239000

3000

그리고 뭔가 모호하게 남아있을 이유 때문에

04:02

we're interested in a particular pattern.

242000

2000

우리는 특정 패턴에 관심을 가지고 있다고 합시다.

04:04

Here's one -- a head, followed by a tail, followed by a tail.

244000

3000

하나는 앞면(head), 뒷면(tail), 뒷면(tail)이 나오는 경우입니다.

04:07

So suppose we toss a coin repeatedly.

247000

3000

즉 우리가 동전을 계속해서 던진다고 합시다.

04:10

Then the pattern, head-tail-tail, that we've suddenly become fixated with happens here.

250000

5000

그러면, 그 패턴, 앞뒤뒤, 즉 HTT가 나오는 여기에서 우리가 갑자기 고정됩니다.

04:15

And you can count: one, two, three, four, five, six, seven, eight, nine, 10 --

255000

4000

그러면 여러분은 이렇게 셀 수 있지요: 하나, 둘, 셋, 넷, 다섯, 여섯, 일곱, 여덟, 아홉, 열 --

04:19

it happens after the 10th toss.

259000

2000

즉 이 패턴은 10 번째 던졌을 때 나왔습니다.

04:21

So you might think there are more interesting things to do, but humor me for the moment.

261000

3000

따라서, 여러분들은 아마 이런 거보다는 뭔가 더 재미있는 일이 있겠지 하고 생각하시겠지만, 일단 제 비위를 좀 더 맞춰 주시기 바랍니다.

04:24

Imagine this half of the audience each get out coins, and they toss them

264000

4000

여기 청중분들 중 절반이 각각 동전을 꺼내들고 던진다고 합시다.

04:28

until they first see the pattern head-tail-tail.

268000

3000

HTT 패턴이 나올 때까지 반복해서 던져 봅니다.

04:31

The first time they do it, maybe it happens after the 10th toss, as here.

271000

2000

처음에는 여기서처럼 아마 10번째에 그런 패턴이 나오겠죠.

04:33

The second time, maybe it's after the fourth toss.

273000

2000

두번째에는 네번째에서 그런 패턴이 나옵다고 합시다.

04:35

The next time, after the 15th toss.

275000

2000

다음에는 15번째...

04:37

So you do that lots and lots of times, and you average those numbers.

277000

3000

즉, 여러분은 이런 실험을 여러번 해보고는 몇번째에 나오는지에 대한 숫자들의 평균을 내봅니다.

04:40

That's what I want this side to think about.

280000

3000

즉, 여러분들 중 이 쪽 절반은 이러한 실험에 대해 생각을 해보길 바랍니다.

04:43

The other half of the audience doesn't like head-tail-tail --

283000

2000

이제 여러분들 중 다른 절반은 HTT 패턴을 싫어한다고 합시다.

04:45

they think, for deep cultural reasons, that's boring --

285000

3000

다른 절반인 여러분들은 심오한 문화적인 차이 때문에 HTT 패턴은 매우 따분하다고 생각합니다.

04:48

and they're much more interested in a different pattern -- head-tail-head.

288000

3000

그리고 다른 패턴에 매우 관심이 많습니다. -- HTH 즉 앞뒤앞 입니다.

04:51

So, on this side, you get out your coins, and you toss and toss and toss.

291000

3000

따라서 다른 한 쪽에서 여러분은 동전을 꺼내서, 던지고, 던지고, 또 던집니다.

04:54

And you count the number of times until the pattern head-tail-head appears

294000

3000

그리고 HTH 패턴이 처음 나오는 때의 횟수를 세고요.

04:57

and you average them. OK?

297000

3000

그것들의 평균을 내는 겁니다. 아시겠죠?

05:00

So on this side, you've got a number --

300000

2000

따라서 이 쪽에서 여러분들은 평균값 하나을 얻었고 --

05:02

you've done it lots of times, so you get it accurately --

302000

2000

여러분들은 이 실험을 충분히 매우 많이 했다고 합시다. 따라서 그 값은 정확하겠죠.

05:04

which is the average number of tosses until head-tail-tail.

304000

3000

그 값은 HTT (앞뒤뒤) 패턴이 처음 발생하는 동전 던지기 횟수의 평균입니다.

05:07

On this side, you've got a number -- the average number of tosses until head-tail-head.

307000

4000

다른 쪽에서는, HTH (앞뒤앞) 패턴이 처음 나오는 동전 던지기 회수의 평균값을 얻었습니다.

05:11

So here's a deep mathematical fact --

311000

2000

자 여기에 수학적으로 심오한 사실 하나가 있습니다.

05:13

if you've got two numbers, one of three things must be true.

313000

3000

두 개의 숫자가 있으므로, 다음 세 가지 중 하나가 사실이어야 합니다.

05:16

Either they're the same, or this one's bigger than this one,

316000

3000

그 두 숫자가 같거나, 아니면 이것이 요것보다 크거나,

05:19

or this one's bigger than that one.

319000

1000

아니면 요것이 이것보다 커야 합니다.

05:20

So what's going on here?

100

320000

3000

자, 어떨까요?

05:23

So you've all got to think about this, and you've all got to vote --

101

323000

2000

여러분 모두 이것에 대해 생각해 보시고, 투표를 합시다.

05:25

and we're not moving on.

102

325000

1000

그리고, 여기서 더이상 진도를 안나갑니다.

05:26

And I don't want to end up in the two-minute silence

103

326000

2000

그리고 여러분 모두가 의사를 표명하도록 여러분에게 시간을 더 주고

05:28

to give you more time to think about it, until everyone's expressed a view. OK.

104

328000

4000

대신 저는 2분이 넘게 침묵하고 있고 싶지는 않군요. 괜찮죠?

05:32

So what you want to do is compare the average number of tosses until we first see

105

332000

4000

따라서 여러분이 할 일은 HTH를 처음 보게 되는 동전 던지기 회수의 평균과

05:36

head-tail-head with the average number of tosses until we first see head-tail-tail.

106

336000

4000

HTT를 처음 보게 되는 동전 던지기 회수의 평균을 비교하는 겁니다.

05:41

Who thinks that A is true --

107

341000

2000

A가 맞다고 생각하시는 분들 있습니까?

05:43

that, on average, it'll take longer to see head-tail-head than head-tail-tail?

108

343000

4000

즉, 평균적으로 HTH를 발견하는 게 HTT를 발견하는 것보다 시간이 더 걸린다는 거죠?

05:47

Who thinks that B is true -- that on average, they're the same?

109

347000

3000

B가 맞다고 생각하시는 분들은요? 즉, 평균적으로 같은 시간이 걸린다는 거죠.

05:51

Who thinks that C is true -- that, on average, it'll take less time

110

351000

2000

C가 맞다고 생각하시는 분들은요? 즉 평균적으로 HTH를 발견하는 게

05:53

to see head-tail-head than head-tail-tail?

111

353000

3000

HTT보다 시간이 덜 걸린다는 겁니다.

05:57

OK, who hasn't voted yet? Because that's really naughty -- I said you had to.

112

357000

3000

좋습니다. 아직 투표 안한 사람 있나요? 왜냐면 안했다면 꽤 무례한 것이거든요. 제가 여러분에게 투표하라고 했잖아요.

06:00

(Laughter)

113

360000

1000

(웃음)

06:02

OK. So most people think B is true.

114

362000

3000

알겠습니다, 대부분 B가 맞다고 생각하는군요.

06:05

And you might be relieved to know even rather distinguished mathematicians think that.

115

365000

3000

그리고 여러분이 알면 안심하실만한 사실은, 상당히 유명한 수학자들도 그렇게 생각한다는 겁니다.

06:08

It's not. A is true here.

116

368000

4000

B가 답이 아니고, A가 답입니다.

06:12

It takes longer, on average.

117

372000

2000

평균적으로 시간이 더 걸립니다.

06:14

In fact, the average number of tosses till head-tail-head is 10

118

374000

2000

사실 HTH가 나오는 평균 던지기는 10 이고요.

06:16

and the average number of tosses until head-tail-tail is eight.

119

376000

5000

HTT가 나오는 평균 던지기는 8 입니다.
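The two averages quoted here (10 tosses for head-tail-head, 8 for head-tail-tail) can be checked by simulating the experiment the audience is asked to imagine. The following is a minimal sketch, assuming a fair coin; the function names, the number of trials, and the seed are illustrative choices, not part of the talk.

```python
import random

def tosses_until(pattern, rng):
    """Toss a fair coin until `pattern` first appears; return the toss count."""
    recent = ""
    count = 0
    while not recent.endswith(pattern):
        # Only the last len(pattern) tosses matter for spotting the pattern.
        recent = (recent + rng.choice("HT"))[-len(pattern):]
        count += 1
    return count

def average_wait(pattern, trials=50_000, seed=0):
    """Average the waiting time over many independent repetitions."""
    rng = random.Random(seed)
    return sum(tosses_until(pattern, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    for pattern in ("HTH", "HTT"):
        print(pattern, round(average_wait(pattern), 2))
```

With enough trials the two averages should settle close to 10 and 8 respectively; the exact printed values depend on the seed.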

06:21

How could that be?

120

381000

2000

어째서일까요?

06:24

Anything different about the two patterns?

121

384000

3000

이 두 패턴 간의 차이점이라도 있나요?

06:30

There is. Head-tail-head overlaps itself.

122

390000

5000

있습니다. HTH는 그 자신 스스로가 겹쳐집니다.

06:35

If you went head-tail-head-tail-head, you can cunningly get two occurrences

123

395000

4000

만일 HTHTH 가 나왔다고 하면, 겨우 다섯번 던졌는 데도

06:39

of the pattern in only five tosses.

124

399000

3000

패턴이 두 번 발생합니다.

06:42

You can't do that with head-tail-tail.

125

402000

2000

HTT의 경우엔 그럴 수 없습니다.

06:44

That turns out to be important.

126

404000

2000

이게 매우 중요한 사실이라는 게 드러났습니다.

06:46

There are two ways of thinking about this.

127

406000

2000

이 점에 대해 생각할 두 가지 방법이 있습니다.

06:48

I'll give you one of them.

128

408000

2000

그 중 하나를 알려드리겠습니다.

06:50

So imagine -- let's suppose we're doing it.

129

410000

2000

우리가 그것을 한다고 상상해 봅시다.

06:52

On this side -- remember, you're excited about head-tail-tail;

130

412000

2000

한 쪽에서는 -- 기억하듯이, HTT 패턴을 좋아하고 있고요.

06:54

you're excited about head-tail-head.

131

414000

2000

여러분들은 HTH 패턴을 좋아한다고 합시다.

06:56

We start tossing a coin, and we get a head --

132

416000

3000

동전을 던지고 H, 즉 앞면이 나왔습니다.

06:59

and you start sitting on the edge of your seat

133

419000

1000

그러면 여러분은 의자 가장자리에 앉습니다.

07:00

because something great and wonderful, or awesome, might be about to happen.

134

420000

5000

왜냐면 뭔가 아름다운, 대단한 일이 발생할 거 같거든요.

07:05

The next toss is a tail -- you get really excited.

135

425000

2000

다시 던집니다. T, 뒷면이 나왔습니다. 여러분들은 흥분합니다.

07:07

The champagne's on ice just next to you; you've got the glasses chilled to celebrate.

136

427000

4000

얼음에 잠긴 샴페인 병이 옆에 있습니다. 샴페인 잔을 차게 해서 축하할 준비를 합니다.

07:11

You're waiting with bated breath for the final toss.

137

431000

2000

마지막 던지기를 숨을 죽이며 기다립니다.

07:13

And if it comes down a head, that's great.

138

433000

2000

그리고 H 앞면이 나왔습니다. 멋집니다.

07:15

You're done, and you celebrate.

139

435000

2000

해냈습니다. 그래서 축하합니다.

07:17

If it's a tail -- well, rather disappointedly, you put the glasses away

140

437000

2000

만일 T 즉 뒷면이 나왔다면 -- 네, 상당히 실망해서, 여러분은 샴페인 잔을 다시 갔다 놓습니다.

07:19

and put the champagne back.

141

439000

2000

그리고 샴페인도 도로 갔다 놓습니다.

07:21

And you keep tossing, to wait for the next head, to get excited.

142

441000

3000

그리고 다시 H, 즉 앞면이 나올 때까지 계속 던집니다.

07:25

On this side, there's a different experience.

143

445000

2000

다른 한 쪽에서는 다른 경험을 합니다.

07:27

It's the same for the first two parts of the sequence.

144

447000

3000

처음 두 번의 던지기에서는 동일한 경험입니다.

07:30

You're a little bit excited with the first head --

145

450000

2000

처음 H가 나오면 약간 흥분하고 --

07:32

you get rather more excited with the next tail.

146

452000

2000

다음에 T가 나오면 상당히 흥분하고

07:34

Then you toss the coin.

147

454000

2000

그리고 나서 동전을 던집니다.

07:36

If it's a tail, you crack open the champagne.

148

456000

3000

만일 T가 나오면 샴페인을 터뜨립니다.

07:39

If it's a head you're disappointed,

149

459000

2000

만일 H가 나오면 실망하겠죠.

07:41

but you're still a third of the way to your pattern again.

150

461000

3000

그러나 그 H 자체로 이미 여러분이 찾고자 하는 패턴의 3분의 1은 왔습니다.

07:44

And that's an informal way of presenting it -- that's why there's a difference.

151

464000

4000

이게 바로 그 생각할 방법을 일상적인 방법으로 제시한 겁니다. 그래서 차이가 있는 겁니다.
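The informal argument above (a failed head-tail-head attempt sends you back to the start, while a failed head-tail-tail attempt leaves you a third of the way in) can be turned into an exact calculation. The sketch below is one way to do it, assuming a fair coin: each state is the number of pattern characters currently matched, and the expected remaining tosses from each state satisfy one linear equation. The helper names are hypothetical.

```python
from fractions import Fraction

def next_state(pattern, matched, toss):
    """Longest prefix of `pattern` that is a suffix of the tosses seen so far."""
    seen = pattern[:matched] + toss
    for k in range(min(len(seen), len(pattern)), -1, -1):
        if seen.endswith(pattern[:k]):
            return k

def expected_tosses(pattern):
    """Expected tosses of a fair coin until `pattern` first appears (exact)."""
    n = len(pattern)
    half = Fraction(1, 2)
    # Row i encodes E[i] - 1/2 E[after H] - 1/2 E[after T] = 1, with E[n] = 0.
    rows = []
    for i in range(n):
        row = [Fraction(0)] * (n + 1)   # n unknowns plus the right-hand side
        row[i] += 1
        for toss in "HT":
            j = next_state(pattern, i, toss)
            if j < n:
                row[j] -= half
        row[n] = Fraction(1)
        rows.append(row)
    # Gauss-Jordan elimination over exact fractions.
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return rows[0][n]   # expected wait starting with nothing matched

print(expected_tosses("HTH"), expected_tosses("HTT"))   # 10 8
```

The only difference between the two patterns is where a failed third toss leaves you: `next_state("HTH", 2, "T")` is 0, whereas `next_state("HTT", 2, "H")` is 1, and that one-step head start accounts for the gap between 10 and 8.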

07:48

Another way of thinking about it --

152

468000

2000

이것에 대해 생각하는 다른 방법으로는 --

07:50

if we tossed a coin eight million times,

153

470000

2000

만일 동전을 팔백만번 던졌다고 합시다.

07:52

then we'd expect a million head-tail-heads

154

472000

2000

그러면 HTH 패턴을 백만번 정도 기대할 수 있습니다. (2의 3제곱이 8이므로)

07:54

and a million head-tail-tails -- but the head-tail-heads could occur in clumps.

155

474000

7000

그리고 HTT도 백만번 정도입니다 -- 그러나 HTH는 무리 지어 발생할 수도 있습니다.

08:01

So if you want to put a million things down amongst eight million positions

156

481000

2000

따라서 팔백만 개의 위치 중에서 백만개를 내려놓고 싶다면

08:03

and you can have some of them overlapping, the clumps will be further apart.

157

483000

5000

어떤 것들은 서로 겹쳐지게 되므로, 그 무리들은 갈라지게 됩니다.

08:08

It's another way of getting the intuition.

158

488000

2000

이게 바로 직관적으로 이해하는 다른 방법입니다.
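The clumping argument can also be illustrated by counting. The sketch below is an illustration under assumptions: it uses 800,000 simulated fair tosses rather than the eight million mentioned in the talk (so roughly 100,000 occurrences of each pattern are expected), and the helper names and seed are arbitrary.

```python
import random

def start_positions(sequence, pattern):
    """Indices where `pattern` starts, overlapping occurrences included."""
    width = len(pattern)
    return [i for i in range(len(sequence) - width + 1)
            if sequence[i:i + width] == pattern]

def summarize(sequence, pattern):
    """Return (number of occurrences, number that overlap the previous one)."""
    starts = start_positions(sequence, pattern)
    # An occurrence overlaps the previous one if it starts before that one ends.
    overlapping = sum(1 for a, b in zip(starts, starts[1:]) if b - a < len(pattern))
    return len(starts), overlapping

rng = random.Random(1)
tosses = "".join(rng.choice("HT") for _ in range(800_000))
for pattern in ("HTH", "HTT"):
    count, overlapping = summarize(tosses, pattern)
    print(pattern, count, overlapping)
```

Both patterns turn up about equally often, but only head-tail-head can overlap itself, so its occurrences bunch together and the gaps between bunches have to be longer to compensate. Head-tail-tail can never overlap itself, so its overlap count is always zero.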

08:10

What's the point I want to make?

159

490000

2000

제가 말하고자 하는 핵심이 뭘까요.

08:12

It's a very, very simple example, an easily stated question in probability,

160

492000

4000

이건 바로, 확률에서 쉽게 기술할 수 있는 문제이며, 여러분들같이

08:16

which every -- you're in good company -- everybody gets wrong.

161

496000

3000

거의 모든 사람들이 틀리는, 매우 매우 단순한 예입니다.

08:19

This is my little diversion into my real passion, which is genetics.

162

499000

4000

이건 저의 진정한 열정인 유전학에서 살짝 벗어난 것입니다.

08:23

There's a connection between head-tail-heads and head-tail-tails in genetics,

163

503000

3000

유전학에서 HTH 와 HTT 같의 연결된 바가 있습니다.

08:26

and it's the following.

164

506000

3000

그건 다음과 같습니다.

08:29

When you toss a coin, you get a sequence of heads and tails.

165

509000

3000

동전을 던지면, 앞면과 뒷면의 순열을 얻게 됩니다.

08:32

When you look at DNA, there's a sequence of not two things -- heads and tails --

166

512000

3000

DNA를 보시면, 앞면과 뒷면 두 가지의 순열이 아닙니다.

08:35

but four letters -- As, Gs, Cs and Ts.

167

515000

3000

A, G, C, T 라는 네 문자들의 순열입니다.

08:38

And there are little chemical scissors, called restriction enzymes

168

518000

3000

그리고 제한 효소라 불리는 작은 화학적인 가위들이 있습니다.

08:41

which cut DNA whenever they see particular patterns.

169

521000

2000

이 가위들은 DNA에서 특정 패턴을 만나면 자릅니다.

08:43

And they're an enormously useful tool in modern molecular biology.

170

523000

4000

이것들은 현대 분자 생물학에서 매우 유용한 도구입니다.

08:48

And instead of asking the question, "How long until I see a head-tail-head?" --

171

528000

3000

그리고 "앞면, 뒷면, 앞면을 보려면 얼마나 기다려야 하나?"라고 묻는 대신에

08:51

you can ask, "How big will the chunks be when I use a restriction enzyme

172

531000

3000

"G-A-A-G 라는 패턴을 보면 자르는 제한 효소를 사용한다면, "

08:54

which cuts whenever it sees G-A-A-G, for example?

173

534000

4000

"그 잘라진 덩어리의 길이는 어느정도일까?"라고 물어볼 수 있습니다.

08:58

How long will those chunks be?"

174

538000

2000

그 잘라진 것의 길이는 어느 정도일까요?
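The chunk-length question can be tried on a toy sequence. This is a rough sketch only: it assumes the four letters are equally likely and independent, which real DNA is not, and it cuts at the start of each G-A-A-G site, whereas real enzymes typically cut at a fixed position within or near the site they recognise. The function name and seed are made up for the example.

```python
import random
import re

def chunk_lengths(dna, site):
    """Lengths of the pieces left after cutting at every occurrence of `site`."""
    # The lookahead finds overlapping sites as well (e.g. GAAGAAG holds two).
    cuts = [m.start() for m in re.finditer(f"(?={re.escape(site)})", dna)]
    edges = [0] + cuts + [len(dna)]
    return [b - a for a, b in zip(edges, edges[1:]) if b > a]

rng = random.Random(2)
dna = "".join(rng.choices("ACGT", k=1_000_000))   # toy sequence, not real DNA
lengths = chunk_lengths(dna, "GAAG")
print(len(lengths), sum(lengths) / len(lengths))
```

Under these assumptions a four-letter site appears at about one position in 4^4 = 256, so the average chunk should come out at roughly 256 letters.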

09:00

That's a rather trivial connection between probability and genetics.

175

540000

5000

이건 확률과 유전학 간의 다소 단순한 연결 고리입니다.

09:05

There's a much deeper connection, which I don't have time to go into

176

545000

3000

훨씬 더 깊은 연결이 있습니다만, 제가 여기서 다루기에는 다소 시간이 걸립니다.

09:08

and that is that modern genetics is a really exciting area of science.

177

548000

3000

그리고 현대 유전학은 매우 재미있는 과학 분야입니다.

09:11

And we'll hear some talks later in the conference specifically about that.

178

551000

4000

본 컨퍼런스의 나중에 나올 발표 몇 개에서 특히 이 분야에 대해 애기할 것입니다.

09:15

But it turns out that unlocking the secrets in the information generated by modern

179

555000

4000

그러나, 이러한 것은 현대의 실험 기술이 생성하는 정보의 비밀을 푸는 것이라는

09:19

experimental technologies, a key part of that has to do with fairly sophisticated --

180

559000

5000

사실이 드러났고요. 핵심적 부분은 상당히 복잡한 데 --

09:24

you'll be relieved to know that I do something useful in my day job,

181

564000

3000

근데, 제가 제 직업에서 뭔가 유용한 것을 한다는 걸 아셔서 안심하셨을 겁니다만,

09:27

rather more sophisticated than the head-tail-head story --

182

567000

2000

단순한 앞면 뒷면 앞면 얘기보다는 더 복잡하고 --

09:29

but quite sophisticated computer modelings and mathematical modelings

183

569000

4000

상당히 복잡한 컴퓨터 모델링과 수학 모델링

09:33

and modern statistical techniques.

184

573000

2000

그리고 현대 통계 기법들입니다.

09:35

And I will give you two little snippets -- two examples --

185

575000

3000

이제 여러분에게 두 개의 작은 예로

09:38

of projects we're involved in in my group in Oxford,

186

578000

3000

옥스포드의 제 그룹에서 하고 있는 프로젝트들에 대해 소개하겠습니다.

09:41

both of which I think are rather exciting.

187

581000

2000

둘 다 제 생각엔 상당히 재미있습니다.

09:43

You know about the Human Genome Project.

188

583000

2000

휴먼 게놈 프로젝트를 아실 겁니다.

09:45

That was a project which aimed to read one copy of the human genome.

189

585000

4000

그건 사람의 게놈의 한 복사본을 읽으려는 목표의 프로젝트입니다.

09:51

The natural thing to do after you've done that --

190

591000

2000

이것을 해내고 나면 자연스럽게 하고 싶어지는 일은 --

09:53

and that's what this project, the International HapMap Project,

191

593000

2000

그게 바로 국제 햅맵 (HapMap) 프로젝트인데,

09:55

which is a collaboration between labs in five or six different countries.

192

595000

5000

대여섯 개의 서로 다른 나라의 연구실들 간의 협동 과제입니다.

10:00

Think of the Human Genome Project as learning what we've got in common,

193

600000

4000

휴먼 게놈 프로젝트는 우리가 공통적으로 가지고 있는 게 뭔지를 알고자 하는 거라 생각하시고요.

10:04

and the HapMap Project is trying to understand

194

604000

2000

햅맵 프로젝트는 서로 다른 사람들 간의

10:06

where there are differences between different people.

195

606000

2000

차이점이 어디에 있는지를 이해하기 위한 프로젝트입니다.

10:08

Why do we care about that?

196

608000

2000

왜 우리가 그런 것들에 신경써야 할까요?

10:10

Well, there are lots of reasons.

197

610000

2000

글쎄요, 여러 가지 이유가 있습니다.

10:12

The most pressing one is that we want to understand how some differences

198

612000

4000

가장 절박한 이유는 우리는 어떠한 차이가 어떤 사람에게

10:16

make some people susceptible to one disease -- type-2 diabetes, for example --

199

616000

4000

특정 질병에 더 잘 걸리게 하는지를 이해하고 싶어 합니다. 예를 들어 당뇨병 제2형이 그렇습니다.

10:20

and other differences make people more susceptible to heart disease,

200

620000

5000

또한 어떠한 차이가 사람들로 하여금 심장병이나 발작, 자폐증 등에

10:25

or stroke, or autism and so on.

201

625000

2000

더 잘 걸리게 하는지 이해하고 싶습니다.

10:27

That's one big project.

202

627000

2000

그건 하나의 큰 프로젝트입니다.

10:29

There's a second big project,

203

629000

2000

두 번째로 큰 프로젝트가 있는 데,

10:31

recently funded by the Wellcome Trust in this country,

204

631000

2000

최근 미국의 Wellcome Trust 에서 자금을 댄 과제인데요,

10:33

involving very large studies --

205

633000

2000

매우 큰 연구들이 관련되어 있습니다.

10:35

thousands of individuals, with each of eight different diseases,

206

635000

3000

8 개의 서로 다른 질병을 각각 가지고 있는 수천 명도 관련되어 있고요.

10:38

common diseases like type-1 and type-2 diabetes, and coronary heart disease,

207

638000

4000

이 질병들은 당뇨병 제1형, 제2형과 같은 흔한 병들과 관상동맥성 심장질환,

10:42

bipolar disease and so on -- to try and understand the genetics.

208

642000

4000

조울증 등등으로, 유전학을 이해하고자 하는 시도입니다.

10:46

To try and understand what it is about genetic differences that causes the diseases.

209

646000

3000

질병들을 초래하는 유전적인 차이들이 무엇인지를 이해하려는 시도입니다.

10:49

Why do we want to do that?

210

649000

2000

우린 왜 이런 걸 할까요?

10:51

Because we understand very little about most human diseases.

211

651000

3000

왜냐면, 사람의 질병 대부분에 대해 우린 거의 이해하지 못하고 있기 때문입니다.

10:54

We don't know what causes them.

212

654000

2000

우린 뭐가 질병들을 초래하는지 모릅니다.

10:56

And if we can get in at the bottom and understand the genetics,

213

656000

2000

만일 우리가 바닥까지 가서 유전학을 이해한다면,

10:58

we'll have a window on the way the disease works,

214

658000

3000

질병이 작동하는 길로 향하는 창문을 얻을 수 있을 겁니다.

11:01

and a whole new way about thinking about disease therapies

215

661000

2000

또한 질병 치료법, 질병 예방법 등등에 대해 완전히 새롭게

11:03

and preventative treatment and so on.

216

663000

3000

사고하는 방법을 알 수 있을 겁니다.

11:06

So that's, as I said, the little diversion on my main love.

217

666000

3000

따라서, 제가 얘기했듯이, 그건 저의 주된 관심사에서 약간 벗어난 것입니다.

11:09

Back to some of the more mundane issues of thinking about uncertainty.

218

669000

5000

불확실성에 대해 생각하는 좀더 재미없는 사안들 중 일부로 돌아가 보겠습니다.

11:14

Here's another quiz for you --

219

674000

2000

여기 여러분에게 다른 퀴즈를 내겠습니다.

11:16

now suppose we've got a test for a disease

220

676000

2000

우리가 특정 질병에 대한 검사를 한다고 합시다.

11:18

which isn't infallible, but it's pretty good.

221

678000

2000

이 검사는 절대 안틀리는 건 아니지만, 상당히 좋은 검사입니다.

11:20

It gets it right 99 percent of the time.

222

680000

3000

99 퍼센트의 경우로 맞는 답을 제시합니다.

11:23

And I take one of you, or I take someone off the street,

223

683000

3000

그리고 제가 여러분 중 하나를 골라서 또는 길에서 한 사람을 골라서

11:26

and I test them for the disease in question.

224

686000

2000

그 질병을 검사합니다.

11:28

Let's suppose there's a test for HIV -- the virus that causes AIDS --

225

688000

4000

AIDS를 일으키는 바이러스인 HIV 테스트라고 가정하고

11:32

and the test says the person has the disease.

226

692000

3000

테스트 결과 그 사람이 그 병이 있다고 합시다.

11:35

What's the chance that they do?

227

695000

3000

그 사람이 그 병이 있을 가능성은 얼마일까요?

11:38

The test gets it right 99 percent of the time.

228

698000

2000

검사는 99 퍼센트의 경우로 맞는 답을 제시한다고 했습니다.

11:40

So a natural answer is 99 percent.

229

700000

4000

따라서 자연스런 대답은 99 퍼센트입니다.

11:44

Who likes that answer?

230

704000

2000

이 대답이 맘에 드는 분이 있습니까?

11:46

Come on -- everyone's got to get involved.

231

706000

1000

자자 -- 모든 분들이 참여해야 합니다.

11:47

Don't think you don't trust me anymore.

232

707000

2000

저를 더이상 믿지 않겠다고 생각하지 마시기 바랍니다.

11:49

(Laughter)

233

709000

1000

(웃음)

11:50

Well, you're right to be a bit skeptical, because that's not the answer.

234

710000

3000

사실, 회의적인 게 맞습니다. 왜냐면 맞는 답이 아니거든요.

11:53

That's what you might think.

235

713000

2000

그게 바로 여러분이 생각하시고 있는 것일 겁니다.

11:55

It's not the answer, and it's not because it's only part of the story.

236

715000

3000

이게 정답이 아닌 이유는 단지 이야기의 일부일 뿐이라서가 아닙니다.

11:58

It actually depends on how common or how rare the disease is.

237

718000

3000

그건 실제로 병이 얼마나 흔한지 아니면 희귀한지에 따라 달라지기 때문입니다.

12:01

So let me try and illustrate that.

238

721000

2000

따라서, 설명해 보겠습니다.

12:03

Here's a little caricature of a million individuals.

239

723000

4000

여기 백만 명에 대한 작은 그림이 있습니다.

12:07

So let's think about a disease that affects --

240

727000

3000

이런 질병에 대해 생각해 봅시다.

12:10

it's pretty rare, it affects one person in 10,000.

241

730000

2000

이건 매우 희귀한 거라 만 명 중 한 명에게만 영향을 줍니다.

12:12

Amongst these million individuals, most of them are healthy

242

732000

3000

이 백만 명 중, 대부분은 건강합니다.

12:15

and some of them will have the disease.

243

735000

2000

그리고 일부는 그 질병을 가지고 있습니다.

12:17

And in fact, if this is the prevalence of the disease,

244

737000

3000

그리고, 만일 그 질병이 유행한다면, 사실은

12:20

about 100 will have the disease and the rest won't.

245

740000

3000

100명이 질병에 걸리고, 나머지는 그렇지 않다는 겁니다.

12:23

So now suppose we test them all.

246

743000

2000

그러므로, 우리가 그 백만명을 전부 테스트한다고 합시다.

12:25

What happens?

247

745000

2000

어떻게 될까요?

12:27

Well, amongst the 100 who do have the disease,

248

747000

2000

자, 질병을 가진 100 명 중에서

12:29

the test will get it right 99 percent of the time, and 99 will test positive.

249

749000

5000

테스트는 99 퍼센트 맞으므로, 99명은 양성으로 나오고

12:34

Amongst all these other people who don't have the disease,

250

754000

2000

질병이 없는 다른 사람들 중에서는

12:36

the test will get it right 99 percent of the time.

251

756000

3000

테스트는 역시 99 퍼센트 맞으므로

12:39

It'll only get it wrong one percent of the time.

252

759000

2000

단지 1퍼센트만 잘못 결과를 낼 것입니다.

12:41

But there are so many of them that there'll be an enormous number of false positives.

253

761000

4000

그러나 질병이 없는 사람들이 훨씬 많으므로, 가짜 양성(false positive)이 엄청나게 나올 것입니다.

12:45

Put that another way --

254

765000

2000

다시 말해서 --

12:47

of all of them who test positive -- so here they are, the individuals involved --

255

767000

5000

양성으로 판정된 사람들 중에서, -- 여기에 그들이 있죠 -- 관련된 사람들 중에

12:52

less than one in 100 actually have the disease.

256

772000

5000

100 분의 1 이하가 실제로 질병을 가지고 있다는 겁니다.
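The head count in this example can be reproduced directly. The sketch below uses the figures from the talk (a million people, one in 10,000 affected, a test that is right 99 percent of the time for both sick and healthy people); the variable names are illustrative.

```python
from fractions import Fraction

population = 1_000_000
prevalence = Fraction(1, 10_000)   # one person in 10,000 has the disease
accuracy = Fraction(99, 100)       # the test is right 99 percent of the time

sick = population * prevalence                 # 100 people
healthy = population - sick                    # 999,900 people
true_positives = sick * accuracy               # 99 correctly flagged
false_positives = healthy * (1 - accuracy)     # 9,999 wrongly flagged

# Share of all positive results that belong to people who really are sick.
share_truly_sick = true_positives / (true_positives + false_positives)
print(true_positives, false_positives, share_truly_sick)   # 99 9999 1/102
```

So 99 of the 10,098 positive results are genuine, which is exactly 1 in 102, the "less than one in 100" quoted above.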

12:57

So even though we think the test is accurate, the important part of the story is

257

777000

4000

따라서, 우리가 그 테스트가 정확하다고 생각하더라도, 이 이야기의 중요한 부분은

13:01

there's another bit of information we need.

258

781000

3000

우리는 다른 정보가 필요하다는 겁니다.

13:04

Here's the key intuition.

259

784000

2000

여기에 중요한 직감이 있습니다.

13:07

What we have to do, once we know the test is positive,

260

787000

3000

일단 결과가 양성임을 안다면, 반드시 해야 할 일은

13:10

is to weigh up the plausibility, or the likelihood, of two competing explanations.

261

790000

6000

두가지 가능한 설명에 대한 타당성 또는 가능성를 재봐야 한다는 겁니다.

13:16

Each of those explanations has a likely bit and an unlikely bit.

262

796000

3000

각각의 설명에는 가능성이 있는 부분과 그렇지 않은 부분이 있습니다.

13:19

One explanation is that the person doesn't have the disease --

263

799000

3000

첫번째 설명은 그 사람이 질병을 가지고 있지 않다는 건데 --

13:22

that's overwhelmingly likely, if you pick someone at random --

264

802000

3000

만일 그 사람을 무작위로 뽑은 거라면 그럴 가능성이 매우 있습니다.

13:25

but the test gets it wrong, which is unlikely.

265

805000

3000

그러나, 그건 테스트가 틀렸다는 건 데, 그럴 가능성이 없어 보입니다.

13:29

The other explanation is that the person does have the disease -- that's unlikely --

266

809000

3000

다른 설명은 그 사람이 질병을 가진 건데, -- 그럴 가능성은 없어 보이지만 --

13:32

but the test gets it right, which is likely.

267

812000

3000

테스트가 제대로 맞추었다는 것으로 그럴 가능성은 있어 보입니다.

13:35

And the number we end up with --

268

815000

2000

그리고 결국 우리가 계산을 마친 숫자를 보면 --

13:37

that number which is a little bit less than one in 100 --

269

817000

3000

100 분의 1보다 약간 더 작은 숫자인데 --

13:40

is to do with how likely one of those explanations is relative to the other.

270

820000

6000

이 두 설명들 중 하나가 다른 하나와 비교하여 얼마나 가능성을 가졌는지와 관계가 있습니다.

13:46

Each of them taken together is unlikely.

271

826000

2000

이 두 설명을 같이 고려하는 것은 가능성이 없어 보입니다.
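The weighing of the two explanations can be written as odds. This is a sketch using the numbers from the example, with $D$ standing for "has the disease" and $+$ for "tests positive" (notation introduced here, not in the talk):

```latex
\frac{P(D \mid +)}{P(\neg D \mid +)}
  = \underbrace{\frac{P(D)}{P(\neg D)}}_{\text{how common}}
    \times
    \underbrace{\frac{P(+ \mid D)}{P(+ \mid \neg D)}}_{\text{how good the test is}}
  = \frac{1/10000}{9999/10000} \times \frac{0.99}{0.01}
  = \frac{99}{9999}
  = \frac{1}{101},
\qquad
P(D \mid +) = \frac{1}{1 + 101} = \frac{1}{102} \approx 0.0098 .
```

The test shifts the odds in favour of disease by a factor of 99, but the starting odds of 1 to 9,999 are so long that "healthy, and the test got it wrong" remains about a hundred times more plausible than "sick, and the test got it right".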

13:49

Here's a more topical example of exactly the same thing.

272

829000

3000

이와 정확히 같은 경우로 좀더 시사적인 예를 보도록 합시다.

13:52

Those of you in Britain will know about what's become rather a celebrated case

273

832000

4000

이 중 영국에 계신 분은 이제 어느 정도 유명해진 사건으로 자신의 두 아이가 갑자기 사망한

13:56

of a woman called Sally Clark, who had two babies who died suddenly.

274

836000

5000

샐리 클라크의 경우입니다. (변호사이며, MSbP에 의한 유아 살해 혐의로 3년간 복역하다 무죄로 풀려났으나, 그로 인한 알콜 중독으로 사망)

14:01

And initially, it was thought that they died of what's known informally as "cot death,"

275

841000

4000

그리고 초기에는, 이 아이들이 유아 돌연사로 죽었다고 여겨졌습니다.

14:05

and more formally as "Sudden Infant Death Syndrome."

276

845000

3000

더 정확히 말하면 유아 돌연사 증후군이죠.

14:08

For various reasons, she was later charged with murder.

277

848000

2000

여러 가지 이유로, 그녀는 나중에 살인 혐의를 받게 됩니다.

14:10

And at the trial, her trial, a very distinguished pediatrician gave evidence

278

850000

4000

그리고 재판에서, 그녀의 재판에서, 매우 유명한 소아과 의사가 (Roy Meadow: MSbP 증상을 최초로 주장한 사람)

14:14

that the chance of two cot deaths, innocent deaths, in a family like hers --

279

854000

5000

두 건의 유아 돌연사, 즉 누구도 죄가 없는 사망 사건이 그녀의 가족같이 전문직에 종사하고 흡연을 안하는 가족에서

14:19

which was professional and non-smoking -- was one in 73 million.

280

859000

6000

일어날 우연성은 7천3백만 분의 1이라는 증거를 제시합니다.

14:26

To cut a long story short, she was convicted at the time.

281

866000

3000

긴 얘기를 짧게 말하자면, 그녀는 그 당시에는 유죄를 선고받았습니다.

14:29

Later, and fairly recently, acquitted on appeal -- in fact, on the second appeal.

282

869000

5000

나중에, 아주 최근 들어, 항소심에서 무죄를 인정받았습니다. 실제로는 두 번째 항소심이었죠.

14:34

And just to set it in context, you can imagine how awful it is for someone

283

874000

4000

그리고 이 맥락을 고려하자면, 이것이 얼마나 끔찍한 것인지 이해할 수 있을텐데요.

14:38

to have lost one child, and then two, if they're innocent,

284

878000

3000

아무 죄도 없는 한 사람이 자신의 첫 아이를 잃고, 두 번째 아이를 잃고,

14:41

to be convicted of murdering them.

285

881000

2000

그들을 살해했다고 유죄를 선고 받은 겁니다.

14:43

To be put through the stress of the trial, convicted of murdering them --

286

883000

2000

재판 과정과 아기들을 죽였다고 유죄 선고를 받은 정신적 고통과

14:45

and to spend time in a women's prison, where all the other prisoners

287

885000

3000

여성 감옥에서 당신을 자식들을 죽인 사람으로 간주할 다른 죄수들과

14:48

think you killed your children -- is a really awful thing to happen to someone.

288

888000

5000

지내야 하는 스트레스는 -- 사람에게 일어날 수 있는 진정으로 끔찍한 일입니다.

14:53

And it happened in large part here because the expert got the statistics

289

893000

5000

그리고 이 사건에서 이런 일이 벌어진 것은 상당 부분 그 전문가가 통계를

14:58

horribly wrong, in two different ways.

290

898000

3000

두 가지 다른 방식으로 지독하게 잘못 다루었기 때문입니다.

15:01

So where did he get the one in 73 million number?

291

901000

4000

자, 그 사람은 어디서 7천3백만 분의 1이라는 숫자를 얻었을까요?

15:05

He looked at some research, which said the chance of one cot death in a family

292

905000

3000

그 사람은 어떤 연구를 참고했는 데, 그 연구에서는 샐리 클라크와 비슷한 가족에서

15:08

like Sally Clark's is about one in 8,500.

293

908000

5000

아이가 유아 돌연사할 가능성이 8,500 분의 1이라는 걸 본 겁니다.

15:13

So he said, "I'll assume that if you have one cot death in a family,

294

913000

4000

따라서, 그는 "만일 가족에서 유아 돌연사가 한 번 일어난 다면, "

15:17

the chance of a second child dying from cot death aren't changed."

295

917000

4000

"또 한 번 일어날 확률은 변하지 않는다고 가정할 수 있다."라고 말한 겁니다.

15:21

So that's what statisticians would call an assumption of independence.

296

921000

3000

그게 바로 통계학자들이 말하는 이른바 독립성의 가정입니다.

15:24

It's like saying, "If you toss a coin and get a head the first time,

297

924000

2000

이건 마치 "만일 당신이 동전을 던지고, 처음에 앞면이 나왔다면 이 사실은"

15:26

that won't affect the chance of getting a head the second time."

298

926000

3000

"두 번째에 동전의 앞면이 나올 가능성에 영향을 주지 않는다"는 말과 같습니다.

15:29

So if you toss a coin twice, the chance of getting a head twice are a half --

299

929000

5000

따라서, 만일 동전을 두 번 던진다면, 앞면이 두번 나올 확률은 2분의 1인 첫번째 던졌을 때의 가능성에

15:34

that's the chance the first time -- times a half -- the chance a second time.

300

934000

3000

역시 2분의 1인 두번째 던졌을 때의 가능성을 곱한 것이 됩니다.

15:37

So he said, "Here,

301

937000

2000

따라서 그 소아과 의사는 말하길 "자, 가정해 봅시다 --

15:39

I'll assume that these events are independent.

302

939000

4000

이러한 사건들이 서로 독립적이라고 가정하겠습니다.

15:43

When you multiply 8,500 together twice,

303

943000

2000

그러면, 8,500 을 두 번 곱하게 되는 데,

15:45

you get about 73 million."

304

945000

2000

7천3백만을 얻게 되는 겁니다."
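
The arithmetic behind the expert's figure is just the multiplication rule for independent events. A minimal sketch, using only the numbers quoted in the talk:

```python
from fractions import Fraction

# Coin tosses really are independent, so the probabilities multiply.
p_head = Fraction(1, 2)
p_two_heads = p_head * p_head
print(p_two_heads)  # 1/4

# The expert applied the same rule to cot deaths.
p_one_cot_death = Fraction(1, 8_500)  # figure quoted in the talk
p_two_cot_deaths_naive = p_one_cot_death * p_one_cot_death
print(p_two_cot_deaths_naive.denominator)  # 72250000
```

The rounded 8,500 gives one in 72.25 million, which the talk rounds to "about 73 million"; the quoted figure presumably started from a less rounded single-death rate. Either way, the product is only valid if the two deaths are independent.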

15:47

And none of this was stated to the court as an assumption

305

947000

2000

그리고 이 주장이 가정이라는 점은, 법정에서나 배심원들에게

15:49

or presented to the jury that way.

306

949000

2000

제시되지 않았습니다.

15:52

Unfortunately here -- and, really, regrettably --

307

952000

3000

여기서 불행히도 -- 그리고 너무도 유감스럽게도 --

15:55

first of all, in a situation like this you'd have to verify it empirically.

308

955000

4000

무엇보다 먼저, 이러한 상황이라면, 제시된 주장을 실험적으로 확인해야 합니다.

15:59

And secondly, it's palpably false.

309

959000

2000

두번째로, 그 주장은 명백히 거짓입니다.

16:02

There are lots and lots of things that we don't know about sudden infant deaths.

310

962000

5000

우리가 유아 돌연사에 대해 모르는 건 너무도 너무도 많습니다.

16:07

It might well be that there are environmental factors that we're not aware of,

311

967000

3000

우리가 모르는 환경적인 요소가 존재할 수도 있습니다.

16:10

and it's pretty likely to be the case that there are

312

970000

2000

또한 우리가 모르는 유전적인 요소가 존재할 가능성도

16:12

genetic factors we're not aware of.

313

972000

2000

매우 높습니다.

16:14

So if a family suffers from one cot death, you'd put them in a high-risk group.

314

974000

3000

따라서, 어떤 가족이 유아 돌연사로 고통받는다면, 그 가족은 고위험군으로 분류해야 합니다.

16:17

They've probably got these environmental risk factors

315

977000

2000

그 가족은 우리가 모르는 환경적 위험 요소를 가지고 있고

16:19

and/or genetic risk factors we don't know about.

316

979000

3000

또는 유전적인 위험 요소를 가지고 있을 수도 있습니다.

16:22

And to argue, then, that the chance of a second death is as if you didn't know

317

982000

3000

그렇다면 두 번째 사망의 확률이 그 정보를 모를 때와 마찬가지라고 주장하는 것은

16:25

that information is really silly.

318

985000

3000

정말 어리석은 짓입니다.

16:28

It's worse than silly -- it's really bad science.

319

988000

4000

그건 어리석은 거보다 더 나쁩니다. 그건 정말 잘못된 과학입니다.

16:32

Nonetheless, that's how it was presented, and at trial nobody even argued it.

320

992000

5000

그럼에도, 상황은 그런 식으로 흘러갔고, 재판에서 아무도 논쟁하지 않았습니다.

16:37

That's the first problem.

321

997000

2000

이것이 첫번째 문제입니다.
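
Why hidden risk factors break the independence assumption can be shown with a toy model. This is a sketch with made-up numbers: the share of high-risk families and both per-child risks are hypothetical, chosen only to illustrate the effect, and none of them come from the talk or from any study.

```python
from fractions import Fraction

# Hypothetical numbers, for illustration only.
share_high_risk = Fraction(1, 100)  # families with unknown risk factors
p_high = Fraction(1, 200)           # per-child risk in a high-risk family
p_low = Fraction(1, 20_000)         # per-child risk in every other family

# Chance of one cot death for a family picked at random.
p_one = share_high_risk * p_high + (1 - share_high_risk) * p_low

# Chance of two cot deaths: the deaths are independent only once
# the family's risk group is fixed, so average within each group.
p_two = share_high_risk * p_high**2 + (1 - share_high_risk) * p_low**2

p_two_naive = p_one**2                # what the independence assumption gives
p_second_given_first = p_two / p_one  # risk for a second child after one death

print(float(p_one), float(p_second_given_first))
print(float(p_two / p_two_naive))     # factor by which the naive product is off
```

In this toy model a first death makes it much more likely that the family is in the high-risk group, so the risk for the second child rises well above the population rate, and squaring the population rate understates the chance of two deaths.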

16:39

The second problem is, what does the number of one in 73 million mean?

322

999000

4000

두번째 문제는 도대체 7천3백만분의 일이라는 숫자가 의미하는 게 뭐냐는 겁니다.

16:43

So after Sally Clark was convicted --

323

1003000

2000

샐리 클라크가 유죄선고를 받고 나서 --

16:45

you can imagine, it made rather a splash in the press --

324

1005000

4000

예상하실 수 있겠지만, 언론에서 꽤 큰 화제가 되었는데 --

16:49

one of the journalists from one of Britain's more reputable newspapers wrote that

325

1009000

7000

영국의 유명한 신문의 한 기자는 이렇게 썼습니다.

16:56

what the expert had said was,

326

1016000

2000

전문가가 말한 바는

16:58

"The chance that she was innocent was one in 73 million."

327

1018000

5000

"그녀가 무죄일 확률이 7천3백만 분의 1이라는 것이다."

17:03

Now, that's a logical error.

328

1023000

2000

자, 이건 논리적인 오류입니다.

17:05

It's exactly the same logical error as the logical error of thinking that

329

1025000

3000

이건 99 퍼센트 정확한 질병 테스트를 하고 나서

17:08

after the disease test, which is 99 percent accurate,

330

1028000

2000

질병에 걸렸을 가능성이 99 퍼센트라고 생각하는

17:10

the chance of having the disease is 99 percent.

331

1030000

4000

것과 정확히 똑같은 논리적인 오류입니다.

17:14

In the disease example, we had to bear in mind two things,

332

1034000

4000

질병에 대한 예에서, 우린 두가지 경우를 명심해야 했습니다.

17:18

one of which was the possibility that the test got it right or not.

333

1038000

4000

하나는 테스트가 맞는지 틀리는지의 가능성에 대한 것이고요.

17:22

And the other one was the chance, a priori, that the person had the disease or not.

334

1042000

4000

다른 하나는, 테스트 이전에 그 사람이 질병을 가지고 있는지 아닌지에 대한 가능성입니다.

17:26

It's exactly the same in this context.

335

1046000

3000

이 맥락에 따르면 정확히 같은 것입니다.

17:29

There are two things involved -- two parts to the explanation.

336

1049000

4000

두 가지 것이 전체 설명의 두 부분에 관련되어 있습니다.

17:33

We want to know how likely, or relatively how likely, two different explanations are.

337

1053000

4000

두가지 가능한 설명에 대해, 우린 얼마나 가능성이 있는지, 상대적으로 얼마나 가능성이 있는지 알고 싶어합니다.

17:37

One of them is that Sally Clark was innocent --

338

1057000

3000

그 중 하나는 샐리 클라크가 무죄다라는 거고요.

17:40

which is, a priori, overwhelmingly likely --

339

1060000

2000

그건 원래부터 매우 가능성이 있는 겁니다.

17:42

most mothers don't kill their children.

340

1062000

3000

대부분의 어머니들은 자기 자식을 살해하지 않습니다.

17:45

And the second part of the explanation

341

1065000

2000

그리고 그 설명의 두번째 부분은

17:47

is that she suffered an incredibly unlikely event.

342

1067000

3000

그녀가 믿기 어려울 만큼 가능성이 낮은 사건을 겪었다는 겁니다.

17:50

Not as unlikely as one in 73 million, but nonetheless rather unlikely.

343

1070000

4000

7천3백만 분의 1만큼 희박하지는 않지만, 그럼에도 상당히 가능성이 낮은 사건입니다.

17:54

The other explanation is that she was guilty.

344

1074000

2000

다른 설명을 보자면, 그녀는 유죄입니다.

17:56

Now, we probably think a priori that's unlikely.

345

1076000

2000

이건 원래부터 가능성이 희박합니다.

17:58

And we certainly should think in the context of a criminal trial

346

1078000

3000

그리고, 형사 재판이라는 맥락에서는 분명히

18:01

that that's unlikely, because of the presumption of innocence.

347

1081000

3000

그 가능성이 거의 없다고 생각해야 합니다. 무죄추정의 원칙 때문이죠.

18:04

And then if she were trying to kill the children, she succeeded.

348

1084000

4000

그리고, 만일 그녀가 자식들을 죽이려 했었다면, 그녀는 성공한 겁니다.

18:08

So the chance that she's innocent isn't one in 73 million.

349

1088000

4000

그리고, 그녀가 무죄일 가능성은 7천3백만 분의 1이 아닙니다.

18:12

We don't know what it is.

350

1092000

2000

우린 그 가능성이 얼마인지 모릅니다.

18:14

It has to do with weighing up the strength of the other evidence against her

351

1094000

4000

그 가능성은 그녀가 유죄라는 다른 증거들의 중요성과 통계적인 증거를

18:18

and the statistical evidence.

352

1098000

2000

같이 저울질한 결과와 관계가 있는 것입니다.

18:20

We know the children died.

353

1100000

2000

아이들이 죽었다는 것을 우린 압니다.

18:22

What matters is how likely or unlikely, relative to each other,

354

1102000

4000

중요한 것은 두 가지 가능한 설명이 서로 상대적으로 얼마나 가능성이

18:26

the two explanations are.

355

1106000

2000

있는지 그렇지 않은지입니다.

18:28

And they're both implausible.

356

1108000

2000

그리고 둘 다 믿기지 않습니다.
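
The comparison of the two explanations can be written down directly. This is a sketch only: it ignores all the other evidence in the case, and the double-murder probabilities below are placeholders, not estimates for the real case. The function name is illustrative.

```python
from fractions import Fraction

def prob_innocent(p_two_natural_deaths, p_double_murder):
    """P(innocent | two babies died), comparing only these two explanations.

    Both arguments are prior probabilities that a family like this one
    would experience that outcome.
    """
    return p_two_natural_deaths / (p_two_natural_deaths + p_double_murder)

# Even taking the disputed 1-in-73-million figure at face value...
p_natural = Fraction(1, 73_000_000)

# ...the answer depends on a second number the jury never heard.
# Placeholder values, not estimates for the actual case.
for p_murder in (Fraction(1, 10_000_000),
                 Fraction(1, 73_000_000),
                 Fraction(1, 1_000_000_000)):
    print(p_murder, float(prob_innocent(p_natural, p_murder)))
```

The point is that "one in 73 million" on its own says nothing about innocence. Depending on how rare the competing explanation is, the same figure is compatible with a probability of innocence that is low, even, or high.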

18:31

There's a situation where errors in statistics had really profound

357

1111000

4000

통계적인 오류가 진정으로 심오하고, 진정으로 불행한 결과를

18:35

and really unfortunate consequences.

358

1115000

3000

낳은 상황이 되었습니다.

18:38

In fact, there are two other women who were convicted on the basis of the

359

1118000

2000

사실, 이 소아과 의사가 제시한 증거를 기반으로 다른 두 여성이 유죄 선고를 받았고

18:40

evidence of this pediatrician, who have subsequently been released on appeal.

360

1120000

4000

나중에, 항소심을 통해 풀려났습니다.

18:44

Many cases were reviewed.

361

1124000

2000

많은 사건들이 재조사되었습니다.

18:46

And it's particularly topical because he's currently facing a disrepute charge

362

1126000

4000

이건 특히 시사적인데, 그 의사는 현재 영국 일반의료위원회(General Medical Council)에서

18:50

at Britain's General Medical Council.

363

1130000

3000

품위 손상 혐의로 심의를 받고 있기 때문입니다.

18:53

So just to conclude -- what are the take-home messages from this?

364

1133000

4000

이제 결론을 내자면 -- 이 발표에서 집에 가져갈 메시지가 뭘까요?

18:57

Well, we know that randomness and uncertainty and chance

365

1137000

4000

자, 우린 무작위성, 불확실성, 그리고 가능성이 우리 매일매일의 생활의

19:01

are very much a part of our everyday life.

366

1141000

3000

많은 부분임을 압니다.

19:04

It's also true -- and, although, you, as a collective, are very special in many ways,

367

1144000

5000

또한 -- 비록 여러분들은 여러 가지 방향으로 매우 특별한 분들이지만,

19:09

you're completely typical in not getting the examples I gave right.

368

1149000

4000

제가 제시한 예들에 제대로 대답하지 못한 전형적인 사람들입니다.

19:13

It's very well documented that people get things wrong.

369

1153000

3000

사람들이 이러한 질문들에 제대로 대답하지 못한다는 건, 관련 논문들에도 잘 나와 있습니다.

19:16

They make errors of logic in reasoning with uncertainty.

370

1156000

3000

사람들은 불확실성 하에서 논리적인 추론을 할 때 오류를 저지릅니다.

19:20

We can cope with the subtleties of language brilliantly --

371

1160000

2000

우린 언어의 미묘함에는 훌륭하게 대처할 수 있는데 --

19:22

and there are interesting evolutionary questions about how we got here.

372

1162000

3000

우리가 어떻게 이렇게 되었는지에 대해 흥미로운 진화적인 질문들이 있습니다.

19:25

We are not good at reasoning with uncertainty.

373

1165000

3000

우린 불확실성 하에서의 추론을 잘 못합니다.

19:28

That's an issue in our everyday lives.

374

1168000

2000

그건 우리가 매일 생활하는 바에 있어 문제가 됩니다.

19:30

As you've heard from many of the talks, statistics underpins an enormous amount

375

1170000

3000

이러한 많은 발표들에서 들으셨듯이, 통계는 과학 연구 -- 특히 사회과학이나 의학에서

19:33

of research in science -- in social science, in medicine

376

1173000

3000

많은 것들에 대해 뒷받침하는 근거를 제공합니다.

19:36

and indeed, quite a lot of industry.

377

1176000

2000

그리고 산업계의 많은 부분에서도 실제로 그러합니다.

19:38

All of quality control, which has had a major impact on industrial processing,

378

1178000

4000

산업 처리 과정에 주요한 영향을 주는 품질 관리의 모든 것이

19:42

is underpinned by statistics.

379

1182000

2000

통계에 의해 근거를 얻습니다.

19:44

It's something we're bad at doing.

380

1184000

2000

통계는 우리가 제대로 못해내는 것입니다.

19:46

At the very least, we should recognize that, and we tend not to.

381

1186000

3000

최소한 적어도, 우린 그 사실을 인식해야 하는 데, 그러지 못합니다.

19:49

To go back to the legal context, at the Sally Clark trial

382

1189000

4000

샐리 클라크의 재판에 대한 법률적인 상황으로 돌아가 보면,

19:53

all of the lawyers just accepted what the expert said.

383

1193000

4000

모든 변호사들이 전문가가 말한 것을 그냥 받아들였습니다.

19:57

So if a pediatrician had come out and said to a jury,

384

1197000

2000

만일 소아과 의사가 와서 배심원에게 말하길,

19:59

"I know how to build bridges. I've built one down the road.

385

1199000

3000

"난 다리를 어떻게 건설하는지 압니다. 저 길 아래 다리 하나를 지었습니다."

20:02

Please drive your car home over it,"

386

1202000

2000

"그 위로 차를 타고 지나서 집으로 가시지요."라고 한다면,

20:04

they would have said, "Well, pediatricians don't know how to build bridges.

387

1204000

2000

배심원들은 "허, 소아과 의사는 다리를 건설할 줄 몰라."

20:06

That's what engineers do."

388

1206000

2000

"그건 엔지니어가 할 일이지"라고 하겠죠.

20:08

On the other hand, he came out and effectively said, or implied,

389

1208000

3000

반면, 그는 나와서 사실상 이렇게 말했거나, 최소한 암시하기를,

20:11

"I know how to reason with uncertainty. I know how to do statistics."

390

1211000

3000

"나는 불확실성 하에서도 추론을 할 줄 압니다. 나는 통계를 할 줄 알거든요."라고 했고,

20:14

And everyone said, "Well, that's fine. He's an expert."

391

1214000

3000

모든 사람들이 "음, 그거 괜찮네. 그는 전문가니까."라고 말했습니다.

20:17

So we need to understand where our competence is and isn't.

392

1217000

3000

따라서, 우린 무엇이 우리가 잘하는 건지, 아닌지를 이해할 필요가 있습니다.

20:20

Exactly the same kinds of issues arose in the early days of DNA profiling,

393

1220000

4000

정확히 똑같은 문제들이 DNA 프로파일링의 초기에 발생했는 데,

20:24

when scientists, and lawyers and in some cases judges,

394

1224000

4000

과학자들, 법률가들 그리고 어떤 경우에는 판사들이,

20:28

routinely misrepresented evidence.

395

1228000

3000

일상적으로 증거를 잘못 제시했습니다.

20:32

Usually -- one hopes -- innocently, but misrepresented evidence.

396

1232000

3000

일반적으로 -- 사람들이 믿고 싶기를 -- 순전히 실수로 증거를 잘못 제시한 것입니다.

20:35

Forensic scientists said, "The chance that this guy's innocent is one in three million."

397

1235000

5000

법의학 과학자들이 "그 사람이 무고일 가능성은 3백만분의 1이다."라고 말했습니다.

20:40

Even if you believe the number, just like the 73 million to one,

398

1240000

2000

7천3백만분의 1처럼, 그 숫자를 여러분이 믿는다고 해도,

20:42

that's not what it meant.

399

1242000

2000

그건 그런 뜻이 아닙니다.
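
The DNA version of the same error can be sketched as follows. A match probability of one in three million is the chance that a random innocent person would match, not the chance that the matching defendant is innocent. The pool of 60 million alternative sources is a hypothetical number, and the equal prior across that pool is a simplifying assumption.

```python
from fractions import Fraction

def prob_source_given_match(match_probability, other_possible_sources):
    """P(defendant is the source | profile matches).

    Assumes that, before the DNA evidence, the defendant and each of the
    other possible sources were equally likely to have left the sample.
    """
    expected_innocent_matches = other_possible_sources * match_probability
    return 1 / (1 + expected_innocent_matches)

match_probability = Fraction(1, 3_000_000)  # figure quoted in the talk
others = 60_000_000                         # hypothetical pool of alternative sources
print(prob_source_given_match(match_probability, others))  # 1/21
```

Under these assumptions about 20 innocent people would be expected to match as well, so the match alone leaves the defendant as the source with probability 1 in 21. Other evidence narrows the pool; the match probability by itself does not.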

20:44

And there have been celebrated appeal cases

400

1244000

2000

그리고 그러한 것 때문에 영국과 다른 나라들에서는

20:46

in Britain and elsewhere because of that.

401

1246000

2000

몇 건의 유명한 항소심들이 있었습니다.

20:48

And just to finish in the context of the legal system.

402

1248000

3000

법률 시스템의 맥락에서 이 발표를 마치자면...

20:51

It's all very well to say, "Let's do our best to present the evidence."

403

1251000

4000

"증거를 제시하기 위해 노력하자"고 말하는 건 매우 좋습니다.

20:55

But more and more, in cases of DNA profiling -- this is another one --

404

1255000

3000

그러나 더더욱, DNA 프로파일링의 경우 -- 이건 다른 경우입니다.

20:58

we expect juries, who are ordinary people --

405

1258000

3000

우린 배심원들이 평범한 사람들로 --

21:01

and it's documented they're very bad at this --

406

1261000

2000

그들이 불확실성 하에서의 추론에 약하다는 건 논문에 잘 나와있으므로 --

21:03

we expect juries to be able to cope with the sorts of reasoning that goes on.

407

1263000

4000

우린 배심원들이 이러한 종류의 추론에 대응할 수 있기를 기대합니다.

21:07

In other spheres of life, if people argued -- well, except possibly for politics --

408

1267000

5000

인생의 다른 부분, 만일 사람들이 주장한다면 -- 아, 아마도 정치는 빼고요.

21:12

but in other spheres of life, if people argued illogically,

409

1272000

2000

그러나, 인생의 다른 부분에서, 만일 사람들이 비논리적으로 주장한다면,

21:14

we'd say that's not a good thing.

410

1274000

2000

우린 그건 좋은 게 아니라고 말할 겁니다.

21:16

We sort of expect it of politicians and don't hope for much more.

411

1276000

4000

우린 그런 건 정치인들에게나 기대하고, 그 이상은 기대도 안합니다.

21:20

In the case of uncertainty, we get it wrong all the time --

412

1280000

3000

불확실성의 경우, 우린 언제나 제대로 못해냅니다.

21:23

and at the very least, we should be aware of that,

413

1283000

2000

그리고 적어도 최소한, 우린 그걸 알고 있어야 합니다.

21:25

and ideally, we might try and do something about it.

414

1285000

2000

그리고 이상적으로, 우린 이것에 대해 뭔가 하려고 노력해야 할 것입니다.

21:27

Thanks very much.

415

1287000

1000

매우 감사합니다.
