How statistics can be misleading - Mark Liddell

Kako statistika može da bude varljiva - Mark Lidel (Mark Liddell)

1,427,995 views

2016-01-14 ・ TED-Ed



Translator: Milenka Okuka; Reviewer: Mile Živković
00:06
Statistics are persuasive.
Statistika je uverljiva,
00:09
So much so that people, organizations, and whole countries
toliko da ljudi, organizacije i čitave države zasnivaju
00:12
base some of their most important decisions on organized data.
neke od svojih najvažnijih odluka na organizovanim podacima.
00:17
But there's a problem with that.
Međutim, tu imamo problem.
00:19
Any set of statistics might have something lurking inside it,
Svaki statistički skup može da ima nešto skriveno u sebi,
00:23
something that can turn the results completely upside down.
nešto što može u potpunosti da preokrene rezultate.
00:27
For example, imagine you need to choose between two hospitals
Na primer, zamislite da morate da izaberete između dve bolnice
00:30
for an elderly relative's surgery.
zbog operacije starijeg rođaka.
00:33
Out of each hospital's last 1000 patients,
Od poslednjih 1000 pacijenata iz svake bolnice,
00:36
900 survived at Hospital A,
u bolnici A je preživelo 900,
00:39
while only 800 survived at Hospital B.
dok je u bolnici B preživelo svega 800.
00:43
So it looks like Hospital A is the better choice.
Pa se čini da je bolnica A bolji izbor.
00:46
But before you make your decision,
No, pre nego što se odlučite,
00:47
remember that not all patients arrive at the hospital
zapamtite da svi pacijenti ne stižu u bolnicu
00:51
with the same level of health.
istog zdravstvenog stanja.
00:53
And if we divide each hospital's last 1000 patients
A ako podelimo poslednjih 1000 pacijenata iz svake bolnice
00:56
into those who arrived in good health and those who arrived in poor health,
na one koji su stigli dobrog zdravlja i one koji su stigli lošeg zdravlja,
01:01
the picture starts to look very different.
slika počinje da izgleda veoma drugačije.
01:03
Hospital A had only 100 patients who arrived in poor health,
Bolnica A je imala samo 100 pacijenata koji su stigli lošeg zdravlja,
01:07
of which 30 survived.
od kojih je 30 preživelo.
01:10
But Hospital B had 400, and they were able to save 210.
Međutim, bolnica B je imala 400 takvih i uspeli su da spase 210.
01:14
So Hospital B is the better choice
Pa je bolnica B bolji izbor
01:17
for patients who arrive at hospital in poor health,
za pacijente koji stižu u bolnicu lošeg zdravlja,
01:20
with a survival rate of 52.5%.
sa stopom preživelih od 52,5%.
01:24
And what if your relative's health is good when she arrives at the hospital?
A šta ako je zdravlje vašeg rođaka dobro kad stigne u bolnicu?
01:28
Strangely enough, Hospital B is still the better choice,
Zvuči čudno, ali bolnica B je i dalje bolji izbor,
01:32
with a survival rate of over 98%.
sa stopom preživelih preko 98%.
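The figures quoted so far can be checked with a few lines of Python. This is only a sketch: the overall and poor-health counts come from the transcript, while the good-health counts are not stated directly and are derived here by subtraction. The name `survival_rates` and the layout of the `hospitals` dictionary are illustrative choices, not part of the talk.

```python
# Counts quoted in the transcript, as (survived, patients).
# Good-health counts are NOT quoted; they are derived by subtraction (assumption).
hospitals = {
    "A": {"total": (900, 1000), "poor": (30, 100)},
    "B": {"total": (800, 1000), "poor": (210, 400)},
}


def survival_rates(total, poor):
    """Return (overall, good-health, poor-health) survival rates."""
    total_survived, total_patients = total
    poor_survived, poor_patients = poor
    # Everyone who did not arrive in poor health is counted as good health.
    good_survived = total_survived - poor_survived
    good_patients = total_patients - poor_patients
    return (
        total_survived / total_patients,
        good_survived / good_patients,
        poor_survived / poor_patients,
    )


rates = {name: survival_rates(**counts) for name, counts in hospitals.items()}

for name, (overall, good, poor) in rates.items():
    print(
        f"Hospital {name}: overall {overall:.1%}, "
        f"good health {good:.1%}, poor health {poor:.1%}"
    )
```

Under these assumptions, Hospital B comes out ahead in both health groups even though Hospital A has the higher overall rate.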
01:35
So how can Hospital A have a better overall survival rate
Pa, kako može bolnica A da ima bolju ukupnu stopu preživelih,
01:38
if Hospital B has better survival rates for patients in each of the two groups?
ako bolnica B ima bolje stope preživelih u obe grupe pacijenata?
01:44
What we've stumbled upon is a case of Simpson's paradox,
Ono na šta smo nabasali je slučaj Simpsonovog paradoksa,
01:48
where the same set of data can appear to show opposite trends
gde ista grupa podataka može da pokaže suprotne trendove,
01:51
depending on how it's grouped.
u zavisnosti od toga kako su grupisani.
01:54
This often occurs when aggregated data hides a conditional variable,
Ovo se često dešava kad skup podataka skriva uslovnu varijablu,
01:58
sometimes known as a lurking variable,
koju ponekad zovu skrivenom varijablom,
02:01
which is a hidden additional factor that significantly influences results.
a to je skriveni dodatni faktor koji značajno utiče na rezultate.
02:06
Here, the hidden factor is the relative proportion of patients
Ovde je skriveni faktor, relativna srazmera pacijenata
02:10
who arrive in good or poor health.
koji stižu dobrog ili lošeg zdravlja.
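One way to see why the proportions matter is to write each hospital's overall survival rate as a weighted average of its two group rates. The symbols below are notation introduced for this sketch, not from the talk: p is the share of a hospital's patients in a group and r is that group's survival rate. The good-health counts (870 of 900 and 590 of 600) are derived by subtraction from the quoted totals.

```latex
\[
r_{\text{overall}} = p_{\text{good}}\, r_{\text{good}} + p_{\text{poor}}\, r_{\text{poor}}
\]
\[
\text{Hospital A: } \frac{900}{1000}\cdot\frac{870}{900} + \frac{100}{1000}\cdot\frac{30}{100} = 0.87 + 0.03 = 0.90
\]
\[
\text{Hospital B: } \frac{600}{1000}\cdot\frac{590}{600} + \frac{400}{1000}\cdot\frac{210}{400} = 0.59 + 0.21 = 0.80
\]
```

Hospital B has the higher rate in each group, but 40% of its patients fall in the low-survival group compared with 10% at Hospital A, and that heavier weight pulls its overall rate below A's.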
02:13
Simpson's paradox isn't just a hypothetical scenario.
Simpsonov paradoks nije prosto hipotetičan scenario.
02:16
It pops up from time to time in the real world,
S vremena na vreme se pojavljuje u stvarnom svetu,
02:18
sometimes in important contexts.
ponekad u bitnim kontekstima.
02:22
One study in the UK appeared to show
Jedno istraživanje u Britaniji je pokazalo
02:24
that smokers had a higher survival rate than nonsmokers
da pušači imaju veću stopu preživelih od nepušača
02:27
over a twenty-year time period.
tokom perioda od 20 godina.
02:29
That is, until dividing the participants by age group
Sve dok učesnici u istraživanju nisu podeljeni po starosnim grupama,
02:33
showed that the nonsmokers were significantly older on average,
tada se pokazalo da su nepušači u proseku značajno stariji
02:37
and thus, more likely to die during the trial period,
i stoga je bila veća verovatnoća da će da umru tokom istraživanja,
02:40
precisely because they were living longer in general.
baš zbog toga što su inače živeli duže.
02:44
Here, the age groups are the lurking variable,
Ovde su starosne grupe skrivena varijabla
02:47
and are vital to correctly interpret the data.
i od suštinskog su značaja za pravilno tumačenje podataka.
02:50
In another example,
U drugom primeru,
02:51
an analysis of Florida's death penalty cases
analiza slučajeva smrtne kazne u Floridi
02:54
seemed to reveal no racial disparity in sentencing
nije se činilo da otkriva rasnu nejednakost kod presuda
02:58
between black and white defendants convicted of murder.
između crnih i belih prestupnika osuđenih za ubistvo.
03:01
But dividing the cases by the race of the victim told a different story.
Međutim, podela slučajeva prema rasi žrtve, govorila je nešto drugo.
03:06
In either situation,
U oba slučaja,
03:07
black defendants were more likely to be sentenced to death.
crni prestupnici su češće osuđivani na smrt.
03:11
The slightly higher overall sentencing rate for white defendants
Sveukupno nešto veća stopa osuđenih belih prestupnika
03:15
was due to the fact that cases with white victims
je bila posledica činjenice da slučajevi sa belim žrtvama
03:18
were more likely to elicit a death sentence
češće uzrokuju smrtnu kaznu
03:21
than cases where the victim was black,
od slučajeva gde je žrtva crnac,
03:24
and most murders occurred between people of the same race.
a većina ubistava se dešavala među ljudima iste rase.
03:28
So how do we avoid falling for the paradox?
Pa, kako da izbegnemo podleganje ovom paradoksu?
03:31
Unfortunately, there's no one-size-fits-all answer.
Nažalost, ne postoji univerzalno rešenje.
03:34
Data can be grouped and divided in any number of ways,
Podaci se mogu grupisati i podeliti na bezbroj načina,
03:38
and overall numbers may sometimes give a more accurate picture
a sveukupne cifre mogu ponekad da daju tačniju sliku
03:42
than data divided into misleading or arbitrary categories.
od podataka podeljenih u varljive ili proizvoljne kategorije.
03:46
All we can do is carefully study the actual situations the statistics describe
Sve što možemo da učinimo je da izučavamo stvarne situacije koje statistika opisuje
03:52
and consider whether lurking variables may be present.
i da pazimo na prisustvo skrivenih varijabli.
03:55
Otherwise, we leave ourselves vulnerable to those who would use data
U suprotnom, podložni smo uticaju onih koji će da iskoriste podatke
03:59
to manipulate others and promote their own agendas.
kako bi manipulisali drugima i promovisali sopstvene ciljeve.