class: center, middle, inverse, title-slide

# APPLIED STATISTICS

## Lecture 3: Descriptive statistics

### Luka Sikic, PhD

### Fakultet hrvatskih studija |
Github PS
---
class: inverse, middle

# LECTURE OVERVIEW

---
layout: true

# LECTURE OVERVIEW

---

## Goals

<br>
<br>
<br>
<br>

- Getting to know the data
- Measures of central tendency
- Measures of variability
- Measures of skewness and kurtosis
- Overview of variables and data frames
- Standardized scores
- Correlation

---
layout: true

# DATA

---

.hi[Load the data]

```r
library(lsr)    # Load the lsr package: who(), modeOf(), maxFreq()
library(psych)  # Load the psych package: skew(), kurtosi(), describe()
# setwd("path/to/working/directory")  # Set the working directory if needed
getwd()                               # Check the working directory
load("../Podatci/aflsmall.Rdata")     # Load the data into the workspace
# who()                               # Inspect the loaded objects
str(afl.finalists)                    # Data structure
```

```
#>  Factor w/ 17 levels "Adelaide","Brisbane",..: 9 10 3 10 9 3 10 3 9 10 ...
```

```r
str(afl.margins)  # Data structure
```

```
#>  num [1:176] 56 31 56 8 32 14 36 56 19 1 ...
```

---
layout: true

# DATA

---

.hi[Inspect the data]

<br>
<br>
<br>

```r
# Inspect the data
print(afl.margins[1:11])
```

```
#>  [1] 56 31 56  8 32 14 36 56 19  1  3
```

--

```r
# Inspect the data
print(afl.finalists[1:5])
```

```
#> [1] Hawthorn  Melbourne Carlton   Melbourne Hawthorn
#> 17 Levels: Adelaide Brisbane Carlton Collingwood Essendon Fitzroy ... Western Bulldogs
```

---

.hi[Visualization]

<img src="03_DESKRIPTIVNA_STATISTIKA_xar_files/figure-html/histogram1-1.svg" style="display: block; margin: auto;" />

.footnote[[*] Histogram of the winning margins (`afl.margins`) from the 2010 AFL (Australian Football League) season. The plot shows that wins by larger margins occur less often.]
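---

.hi[Sketch: drawing a histogram]

The histogram above is shown without its code. A minimal sketch of how such a plot could be drawn with base R's `hist()` follows. It assumes nothing about how the original figure was made, and as stand-in data it uses only the 11 values printed by `print(afl.margins[1:11])`, so its shape will differ from the full-data figure. The name `margins_sample` and the 10-point bin width are illustrative choices.

```r
# Stand-in data: the first 11 winning margins printed earlier
margins_sample <- c(56, 31, 56, 8, 32, 14, 36, 56, 19, 1, 3)

# Draw the histogram with 10-point bins; hist() also returns the bin counts
h <- hist(margins_sample,
          breaks = seq(0, 60, by = 10),   # Assumed bin edges: 0, 10, ..., 60
          main   = "Winning margins (first 11 games)",
          xlab   = "Winning margin")

h$counts  # Number of games falling into each bin
```

With the full data loaded, the same call on `afl.margins` (with wider `breaks`, since the maximum margin is 116) should give a picture comparable to the slide.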
---
layout: false
class: middle, inverse

# MEASURES OF CENTRAL TENDENCY

---
layout: true

# MEASURES OF CENTRAL TENDENCY

---

<br>
<br>
<br>
<br>

- Arithmetic mean
- Median
- Mode

---
layout: true

# ARITHMETIC MEAN

---

.hi[Definition]

$$
\bar{X} = \frac{X_1+X_2 +...+ X_{N-1}+X_N} {N}
$$

--

.hi[Summation]

$$
\sum_{i=1}^5 X_i
$$

--

.hi[Shorthand notation]

$$
\bar{X} = \frac{1}{N} \sum_{i=1}^N X_i
$$

--

.hi[Calculation by hand]

$$
\frac{56 + 31 + 56 + 8 + 32}{5} = \frac{183}{5} = 36.60
$$

---

.hi[Calculator]

```r
(56 + 31 + 56 + 8 + 32) / 5
```

```
#> [1] 36.6
```

--

.hi[Function]

```r
sum( afl.margins[1:5]) / 5
```

```
#> [1] 36.6
```

---
layout: true

# MEDIAN

---

.hi[For an odd number of values: the middle value]

$$
8, 31, \mathbf{32}, 56, 56
$$

--

.hi[For an even number of values: the mean of the two middle values, here 31.5]

$$
8, 14, \mathbf{31}, \mathbf{32}, 56, 56
$$

--

.hi[Function]

```r
# Compute the median with a function
median( x = afl.margins )  # Entire data set
```

```
#> [1] 30.5
```

---
layout: true

# EXTREME VALUES

---

```r
# Define a vector of 10 numbers
vektor_10 <- c( -15,2,3,4,5,6,7,8,9,12 )
```

--

```r
mean( x = vektor_10 )  # Compute the arithmetic mean
```

```
#> [1] 4.1
```

--

```r
median( x = vektor_10 )  # Compute the median
```

```
#> [1] 5.5
```

--

.hi[Correction]

```r
# Trim 10% of the values from each end before averaging
mean( x = vektor_10, trim = .1)
```

```
#> [1] 5.5
```

--

```r
# Trim 5% of the values from each end before averaging
mean( x = afl.margins, trim = .05)
```

```
#> [1] 33.75
```

---
layout: true

# MODE

---

```r
# Look at the frequency of each value
table(afl.finalists)
```

```
#> afl.finalists
#>         Adelaide         Brisbane          Carlton      Collingwood
#>               26               25               26               28
#>         Essendon          Fitzroy        Fremantle          Geelong
#>               32                0                6               39
#>         Hawthorn        Melbourne  North Melbourne    Port Adelaide
#>               27               28               28               17
#>         Richmond         St Kilda           Sydney       West Coast
#>                6               24               26               38
#> Western Bulldogs
#>               24
```

---

```r
# Compute the modal value
modeOf( x = afl.finalists )
```

```
#> [1] "Geelong"
```

--

```r
# Compute the modal frequency
maxFreq(x = afl.finalists)
```

```
#> [1] 39
```

--

```r
# Calculation for the afl.margins data
modeOf(afl.margins)  # Mode
```

```
#> [1] 3
```

--

```r
maxFreq(afl.margins)  # Modal frequency
```

```
#> [1] 8
```

---
layout: false
class: middle, inverse

# MEASURES OF VARIABILITY

---
layout: true

# MEASURES OF VARIABILITY

---

<br>
<br>
<br>
<br>

- Range/Min-Max
- Quartiles
- Mean absolute deviation
- Variance
- Standard deviation
- Median absolute deviation

---
layout: true

# RANGE/MIN-MAX

---

```r
# Maximum value
max(afl.margins)
```

```
#> [1] 116
```

--

```r
# Minimum value
min(afl.margins)
```

```
#> [1] 0
```

--

```r
# Range of the data
range(afl.margins)
```

```
#> [1]   0 116
```

---
layout: true

# QUARTILES

---

```r
# Compute the 50th percentile (the second quartile, i.e. the median)
quantile(x = afl.margins, probs = .5)
```

```
#>  50%
#> 30.5
```

--

```r
# Compute the 25th and 75th percentiles (first and third quartiles)
quantile(afl.margins, probs = c(.25,.75))
```

```
#>   25%   75%
#> 12.75 50.50
```

--

```r
# Compute the interquartile range
IQR(x = afl.margins)
```

```
#> [1] 37.75
```

---
layout: true

# MEAN ABSOLUTE DEVIATION

---

.hi[Formula]

$$
\mbox{AAD}(X) = \frac{1}{N} \sum_{i = 1}^N |X_i - \bar{X}|
$$

--

.hi[Table for computing the mean absolute deviation by hand]

<table>
<caption></caption>
<thead>
<tr> <th style="text-align:right;"> `\(i\)` </th> <th style="text-align:right;"> `\(X_i\)` </th> <th style="text-align:right;"> `\(X_i - \bar{X}\)` </th> <th style="text-align:right;"> `\(|X_i - \bar{X}|\)` </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 56 </td> <td style="text-align:right;"> 19.4 </td> <td style="text-align:right;"> 19.4 </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> -5.6 </td> <td style="text-align:right;"> 5.6 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 56 </td> <td style="text-align:right;"> 19.4 </td> <td style="text-align:right;"> 19.4 </td> </tr>
<tr> <td
style="text-align:right;"> 4 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:right;"> -28.6 </td> <td style="text-align:right;"> 28.6 </td> </tr>
<tr> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 32 </td> <td style="text-align:right;"> -4.6 </td> <td style="text-align:right;"> 4.6 </td> </tr>
</tbody>
</table>

---

.hi[Calculation by hand]

$$
\frac{19.4 + 5.6 + 19.4 + 28.6 + 4.6}{5} = 15.52
$$

--

.hi[Calculation with functions]

```r
X <- c(56, 31,56,8,32)   # Create the vector
X.bar <- mean( X )       # Step 1. Compute the arithmetic mean
AD <- abs( X - X.bar )   # Step 2. Take the absolute deviations from the mean
AAD <- mean( AD )        # Step 3. Average the absolute deviations
```

--

```r
print( AAD )  # Look at the result
```

```
#> [1] 15.52
```

---
layout: true

# VARIANCE

---

<br>
<br>

.hi[Formula 1]

$$
\mbox{Var}(X) = \frac{1}{N} \sum_{i=1}^N \left( X_i - \bar{X} \right)^2
$$

<br>
<br>

.hi[Formula 2]

`$$\mbox{Var}(X) = \frac{\sum_{i=1}^N \left( X_i - \bar{X} \right)^2}{N}$$`

---

.hi[Computing the variance by hand]

<table>
<caption></caption>
<thead>
<tr> <th style="text-align:right;"> `\(i\)` </th> <th style="text-align:right;"> `\(X_i\)` </th> <th style="text-align:right;"> `\(X_i - \bar{X}\)` </th> <th style="text-align:right;"> `\((X_i - \bar{X})^2\)` </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 56 </td> <td style="text-align:right;"> 19.4 </td> <td style="text-align:right;"> 376.36 </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 31 </td> <td style="text-align:right;"> -5.6 </td> <td style="text-align:right;"> 31.36 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 56 </td> <td style="text-align:right;"> 19.4 </td> <td style="text-align:right;"> 376.36 </td> </tr>
<tr> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:right;"> -28.6 </td> <td style="text-align:right;"> 817.96 </td> </tr>
<tr> <td
style="text-align:right;"> 5 </td> <td style="text-align:right;"> 32 </td> <td style="text-align:right;"> -4.6 </td> <td style="text-align:right;"> 21.16 </td> </tr>
</tbody>
</table>

--

.hi[Calculation with a calculator]

```r
# Calculator
(376.36 + 31.36 + 376.36 + 817.96 + 21.16 ) / 5
```

```
#> [1] 324.64
```

---

.hi[Calculation with functions]

```r
# Compute the variance with functions (divides by N)
mean( (X - mean(X) )^2)
```

```
#> [1] 324.64
```

--

```r
var(X)  # Built-in function; it divides by N - 1, so the result differs
```

```
#> [1] 405.8
```

--

```r
## Same example with the full data
# Compute the variance with functions (divides by N)
mean( (afl.margins - mean(afl.margins) )^2)
```

```
#> [1] 675.9718
```

--

```r
var( afl.margins )  # Built-in function (divides by N - 1)
```

```
#> [1] 679.8345
```

---
layout: true

# STANDARD DEVIATION

---

<br>
<br>

.hi[Formula 1]

$$
s = \sqrt{ \frac{1}{N} \sum_{i=1}^N \left( X_i - \bar{X} \right)^2 }
$$

<br>
<br>

.hi[Formula 2]

$$
\hat\sigma = \sqrt{ \frac{1}{N-1} \sum_{i=1}^N \left( X_i - \bar{X} \right)^2 }
$$

--

```r
# Compute with the built-in function (Formula 2, divides by N - 1)
sd( afl.margins )
```

```
#> [1] 26.07364
```

---
layout: true

# ABSOLUTE DEVIATION FROM THE MEDIAN

---

```r
# Mean absolute deviation from the mean
mean( abs(afl.margins - mean(afl.margins)) )
```

```
#> [1] 21.10124
```

--

```r
# *Median* absolute deviation from the *median*:
median( abs(afl.margins - median(afl.margins)) )
```

```
#> [1] 19.5
```

--

```r
# Compute with the built-in function
mad( x = afl.margins, constant = 1 )
```

```
#> [1] 19.5
```

---
layout: true

# SKEWNESS

---

<br>
<br>

<img src="03_DESKRIPTIVNA_STATISTIKA_xar_files/figure-html/skewness-1.svg" style="display: block; margin: auto;" />

---

.hi[Formula]

$$
\mbox{skewness}(X) = \frac{1}{N \hat{\sigma}^3} \sum_{i=1}^N (X_i - \bar{X})^3
$$

--

.hi[Calculation with a function]

```r
# Compute on the real data
skew( x = afl.margins )
```

```
#> [1] 0.7671555
```

---
layout: true

# KURTOSIS

---

<br>
<br>

<img src="03_DESKRIPTIVNA_STATISTIKA_xar_files/figure-html/kurtosis-1.svg" style="display:
block; margin: auto;" />

---

.hi[Formula]

<br>
<br>

$$
\mbox{kurtosis}(X) = \frac{1}{N \hat\sigma^4} \sum_{i=1}^N \left( X_i - \bar{X} \right)^4 - 3
$$

--

<br>
<br>

.hi[Calculation with a function]

```r
# Compute on the real data
kurtosi( x = afl.margins )
```

```
#> [1] 0.02962633
```

---
layout: true

# DESCRIPTIVE STATISTICS FOR A VARIABLE

---

.hi[Numeric variable]

```r
# Summary of a numeric variable
summary( object = afl.margins )  # Descriptive statistics
```

```
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#>    0.00   12.75   30.50   35.30   50.50  116.00
```

--

.hi[Logical variable]

```r
# Summary of a logical variable
ekstremi <- afl.margins > 50  # Create a logical variable
```

--

```r
head(ekstremi,5)  # Look at the data
```

```
#> [1]  TRUE FALSE  TRUE FALSE FALSE
```

```r
summary(ekstremi)  # Descriptive statistics
```

```
#>    Mode   FALSE    TRUE
#> logical     132      44
```

---

.hi[Factor variable]

```r
# Summary of a factor variable
summary(object = afl.finalists)  # Descriptive statistics
```

```
#>         Adelaide         Brisbane          Carlton      Collingwood
#>               26               25               26               28
#>         Essendon          Fitzroy        Fremantle          Geelong
#>               32                0                6               39
#>         Hawthorn        Melbourne  North Melbourne    Port Adelaide
#>               27               28               28               17
#>         Richmond         St Kilda           Sydney       West Coast
#>                6               24               26               38
#> Western Bulldogs
#>               24
```

--

```r
# Summary of a character variable
txt <- as.character( afl.finalists )  # Create a character variable
summary( object = txt )               # Descriptive statistics
```

```
#>    Length     Class      Mode
#>       400 character character
```

---
layout: true

# NEW DATA SET

---

.hi[Data]

```r
rm(list = ls())                        # Clear the workspace
load("../Podatci/clinicaltrial.Rdata") # Load the data
who(TRUE)                              # Inspect the data
```

```
#>    -- Name --    -- Class --   -- Size --
#>    clin.trial    data.frame    18 x 3
#>     $drug        factor        18
#>     $therapy     factor        18
#>     $mood.gain   numeric       18
```

---
layout: true

# DESCRIPTIVE STATISTICS FOR A DATA FRAME

---

.hi[Basic summary]

```r
# Descriptive statistics for a data frame
summary(clin.trial)  # Descriptive statistics
```

```
#>        drug         therapy    mood.gain
#>  placebo :6   no.therapy:9   Min.
:0.1000
#>  anxifree:6   CBT       :9   1st Qu.:0.4250
#>  joyzepam:6                  Median :0.8500
#>                              Mean   :0.8833
#>                              3rd Qu.:1.3000
#>                              Max.   :1.8000
```

---

.hi[Alternative function]

```r
# Descriptive statistics for a data frame
describe(clin.trial)  # Descriptive statistics via describe() from psych
```

```
#>           vars  n mean   sd median trimmed  mad min max range skew kurtosis
#> drug*        1 18 2.00 0.84   2.00    2.00 1.48 1.0 3.0   2.0 0.00    -1.66
#> therapy*     2 18 1.50 0.51   1.50    1.50 0.74 1.0 2.0   1.0 0.00    -2.11
#> mood.gain    3 18 0.88 0.53   0.85    0.88 0.67 0.1 1.8   1.7 0.13    -1.44
#>             se
#> drug*     0.20
#> therapy*  0.12
#> mood.gain 0.13
```

---

.hi[Grouped summary]

<br>

```r
# Summarize the data separately for each therapy group
by(data = clin.trial,                # Data source
   INDICES = clin.trial$therapy,     # Grouping variable
   FUN = summary)                    # Function to apply
```

```
#> clin.trial$therapy: no.therapy
#>        drug         therapy    mood.gain
#>  placebo :3   no.therapy:9   Min.   :0.1000
#>  anxifree:3   CBT       :0   1st Qu.:0.3000
#>  joyzepam:3                  Median :0.5000
#>                              Mean   :0.7222
#>                              3rd Qu.:1.3000
#>                              Max.   :1.7000
#> ------------------------------------------------------------
#> clin.trial$therapy: CBT
#>        drug         therapy    mood.gain
#>  placebo :3   no.therapy:0   Min.   :0.300
#>  anxifree:3   CBT       :9   1st Qu.:0.800
#>  joyzepam:3                  Median :1.100
#>                              Mean   :1.044
#>                              3rd Qu.:1.300
#>                              Max.
:1.800
```

---

.hi[Grouped summary]

<br>
<br>

```r
# Mean mood gain for each combination of drug and therapy
aggregate(mood.gain ~ drug + therapy,  # Formula: outcome ~ grouping variables
          data = clin.trial,           # Data
          FUN = mean)                  # Arithmetic mean
```

```
#>       drug    therapy mood.gain
#> 1  placebo no.therapy  0.300000
#> 2 anxifree no.therapy  0.400000
#> 3 joyzepam no.therapy  1.466667
#> 4  placebo        CBT  0.600000
#> 5 anxifree        CBT  1.033333
#> 6 joyzepam        CBT  1.500000
```

---

.hi[Grouped summary]

```r
# Standard deviation of mood gain for each combination of drug and therapy
aggregate(mood.gain ~ drug + therapy,  # Formula: outcome ~ grouping variables
          clin.trial,                  # Data
          sd)                          # Standard deviation
```

```
#>       drug    therapy mood.gain
#> 1  placebo no.therapy 0.2000000
#> 2 anxifree no.therapy 0.2000000
#> 3 joyzepam no.therapy 0.2081666
#> 4  placebo        CBT 0.3000000
#> 5 anxifree        CBT 0.2081666
#> 6 joyzepam        CBT 0.2645751
```

---
layout: true

# STANDARD SCORES

---

.hi[Formula]

$$
\mbox{standard score} = \frac{\mbox{observed value} - \mbox{mean}}{\mbox{standard deviation}}
$$

--

.hi[Z-score]

$$
z_i = \frac{X_i - \bar{X}}{\hat\sigma}
$$

--

.hi[Calculation by hand]

$$
z = \frac{35 - 17}{5} = 3.6
$$

--

.hi[Distribution]

```r
# Proportion of the standard normal distribution that lies below z = 3.6
pnorm( 3.6 )
```

```
#> [1] 0.9998409
```

---
layout: true

# NEW DATA SET

---

```r
rm(list = ls())  # Clear the workspace
# Load the data
load("../Podatci/parenthood.Rdata")
who(TRUE)        # Inspect the data
```

```
#>    -- Name --     -- Class --   -- Size --
#>    parenthood     data.frame    100 x 4
#>     $dan.sleep    numeric       100
#>     $baby.sleep   numeric       100
#>     $dan.grump    numeric       100
#>     $day          integer       100
```

--

```r
# Inspect the data
head(parenthood, 7)  # First 7 rows
```

```
#>   dan.sleep baby.sleep dan.grump day
#> 1      7.59      10.18        56   1
#> 2      7.91      11.66        60   2
#> 3      5.14       7.92        82   3
#> 4      7.71       9.61        55   4
#> 5      6.68       9.75        67   5
#> 6      5.99       5.04        72   6
#> 7      8.19      10.45        53   7
```

---

<br>
<br>

```r
# Look at the descriptive statistics
describe(parenthood)
```

```
#>            vars   n  mean    sd median trimmed   mad   min    max range  skew
#> dan.sleep     1 100  6.97  1.02   7.03
7.00  1.09  4.84   9.00  4.16 -0.29
#> baby.sleep    2 100  8.05  2.07   7.95    8.05  2.33  3.25  12.07  8.82 -0.02
#> dan.grump     3 100 63.71 10.05  62.00   63.16  9.64 41.00  91.00 50.00  0.43
#> day           4 100 50.50 29.01  50.50   50.50 37.06  1.00 100.00 99.00  0.00
#>            kurtosis   se
#> dan.sleep     -0.72 0.10
#> baby.sleep    -0.69 0.21
#> dan.grump     -0.16 1.00
#> day           -1.24 2.90
```

---

.hi[Visualization]

<img src="03_DESKRIPTIVNA_STATISTIKA_xar_files/figure-html/parenthood-1.svg" style="display: block; margin: auto;" />

.footnote[[*] Plots of the variables in the `parenthood` data set.]

---
layout: true

# CORRELATION

---

.hi[Correlation shown graphically]

<img src="03_DESKRIPTIVNA_STATISTIKA_xar_files/figure-html/scatterparent1a-1.svg" style="display: block; margin: auto;" />

.footnote[[*] Scatter plot of the variables `Sati spavanja/roditelj` (parent's hours of sleep) and `Raspoloženje` (grumpiness).]

---

.hi[Correlation shown graphically]

<img src="03_DESKRIPTIVNA_STATISTIKA_xar_files/figure-html/scatterparent2-1.svg" style="display: block; margin: auto;" />

.footnote[[*] Scatter plot of the variables `Sati spavanja/dijete` (baby's hours of sleep) and `Sati spavanja/roditelj` (parent's hours of sleep).]
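---

.hi[Sketch: drawing a scatter plot]

The scatter plots above are shown without code. A minimal sketch with base R's `plot()` follows. It uses only the seven rows printed by `head(parenthood, 7)` as stand-in data, so it illustrates the idea rather than reproducing the figures, and the axis labels and the name `r_subset` are illustrative choices.

```r
# Stand-in data: the first seven rows of parenthood, copied from head()
dan.sleep <- c(7.59, 7.91, 5.14, 7.71, 6.68, 5.99, 8.19)
dan.grump <- c(56, 60, 82, 55, 67, 72, 53)

# Scatter plot: one point per day
plot(x = dan.sleep, y = dan.grump,
     xlab = "Parent's sleep (hours)",
     ylab = "Parent's grumpiness",
     pch  = 19)

# Correlation on this seven-row subset only (not the full data)
r_subset <- cor(dan.sleep, dan.grump)
r_subset
```

Even on seven rows the points fall from the upper left to the lower right, which matches the negative relationship between sleep and grumpiness visible in the full-data plot.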
---

<br>
<br>

.hi[Covariance]

$$
\mbox{Cov}(X,Y) = \frac{1}{N-1} \sum_{i=1}^N \left( X_i - \bar{X} \right) \left( Y_i - \bar{Y} \right)
$$

--

<br>
<br>

.hi[Pearson correlation coefficient; standardized covariance]

$$
r_{XY} = \frac{\mbox{Cov}(X,Y)}{ \hat{\sigma}_X \ \hat{\sigma}_Y}
$$

---
layout: true

# DIRECTION AND STRENGTH OF CORRELATION

---

<img src="03_DESKRIPTIVNA_STATISTIKA_xar_files/figure-html/corr-1.svg" style="display: block; margin: auto;" />

---
layout: true

# COMPUTING CORRELATION IN R

---

.hi[Function call; a single pair of variables]

```r
# Compute the correlation between sleep and grumpiness
cor(x = parenthood$dan.sleep, y = parenthood$dan.grump)
```

```
#> [1] -0.903384
```

--

.hi[Function call; entire data frame]

```r
# Compute the correlation matrix
cor(x = parenthood)
```

```
#>              dan.sleep  baby.sleep   dan.grump         day
#> dan.sleep   1.00000000  0.62794934 -0.90338404 -0.09840768
#> baby.sleep  0.62794934  1.00000000 -0.56596373 -0.01043394
#> dan.grump  -0.90338404 -0.56596373  1.00000000  0.07647926
#> day        -0.09840768 -0.01043394  0.07647926  1.00000000
```

---
layout: true

# INTERPRETING CORRELATION

---

.hi[Rough guidelines for interpreting correlations]

<table>
<caption></caption>
<thead>
<tr> <th style="text-align:left;"> Correlation </th> <th style="text-align:left;"> Strength </th> <th style="text-align:left;"> Direction </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> -1.0 to -0.9 </td> <td style="text-align:left;"> Very strong </td> <td style="text-align:left;"> Negative </td> </tr>
<tr> <td style="text-align:left;"> -0.9 to -0.7 </td> <td style="text-align:left;"> Strong </td> <td style="text-align:left;"> Negative </td> </tr>
<tr> <td style="text-align:left;"> -0.7 to -0.4 </td> <td style="text-align:left;"> Moderate </td> <td style="text-align:left;"> Negative </td> </tr>
<tr> <td style="text-align:left;"> -0.4 to -0.2 </td> <td style="text-align:left;"> Weak </td> <td style="text-align:left;"> Negative </td> </tr>
<tr> <td style="text-align:left;"> -0.2 to 0
</td> <td style="text-align:left;"> Negligible </td> <td style="text-align:left;"> Negative </td> </tr>
<tr> <td style="text-align:left;"> 0 to 0.2 </td> <td style="text-align:left;"> Negligible </td> <td style="text-align:left;"> Positive </td> </tr>
<tr> <td style="text-align:left;"> 0.2 to 0.4 </td> <td style="text-align:left;"> Weak </td> <td style="text-align:left;"> Positive </td> </tr>
<tr> <td style="text-align:left;"> 0.4 to 0.7 </td> <td style="text-align:left;"> Moderate </td> <td style="text-align:left;"> Positive </td> </tr>
<tr> <td style="text-align:left;"> 0.7 to 0.9 </td> <td style="text-align:left;"> Strong </td> <td style="text-align:left;"> Positive </td> </tr>
<tr> <td style="text-align:left;"> 0.9 to 1.0 </td> <td style="text-align:left;"> Very strong </td> <td style="text-align:left;"> Positive </td> </tr>
</tbody>
</table>

---
layout: true

# NEW DATA SET

---

```r
rm(list=ls())                     # Clear the workspace
load("../Podatci/effort.Rdata")   # Load the data
who(TRUE)                         # Inspect the data
```

```
#>    -- Name --   -- Class --   -- Size --
#>    effort       data.frame    10 x 2
#>     $hours      numeric       10
#>     $grade      numeric       10
```

--

.hi[Data overview]

```r
head(effort, 3)  # Inspect the data
```

```
#>   hours grade
#> 1     2    13
#> 2    76    91
#> 3    40    79
```

--

```r
cor(effort$hours, effort$grade)  # Compute the (Pearson) correlation
```

```
#> [1] 0.909402
```

---

.hi[Visualization]

<img src="03_DESKRIPTIVNA_STATISTIKA_xar_files/figure-html/rankcorrpic-1.svg" style="display: block; margin: auto;" />

.footnote[[*] The relationship between hours studied and grade (each dot represents one student). The dashed line shows the linear relationship. The correlation between the two variables is high, `\(r = .91\)`. Note that more hours of study always yields a higher grade, which is reflected in a Spearman correlation coefficient of `\(\rho = 1\)`.]
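---

.hi[Sketch: Pearson correlation from the covariance formula]

The covariance and Pearson formulas from the correlation slides can be checked numerically against `cov()` and `cor()`. The sketch below uses only the three `effort` rows printed by `head(effort, 3)` as stand-in data, so the resulting value is not the full-data correlation of 0.909402. The names `hours`, `grade`, `cov_xy` and `r_manual` are illustrative.

```r
# Stand-in data: the three rows of effort printed by head(effort, 3)
hours <- c(2, 76, 40)
grade <- c(13, 91, 79)
n <- length(hours)

# Covariance: sum of products of deviations, divided by N - 1
cov_xy <- sum((hours - mean(hours)) * (grade - mean(grade))) / (n - 1)

# Pearson r: covariance divided by the product of the standard deviations
r_manual <- cov_xy / (sd(hours) * sd(grade))

# Both manual results should agree with the built-in functions
all.equal(cov_xy, cov(hours, grade))
all.equal(r_manual, cor(hours, grade))
```

Because `sd()` also divides by N - 1, the N - 1 terms cancel in the ratio; this is why the manual value and `cor()` agree.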
---
layout: true

# SPEARMAN CORRELATION

---

```r
sati_studiranja <- rank( effort$hours )  # Ranks of hours studied
ocjena <- rank( effort$grade )           # Ranks of grades
```

|            | Rank of hours studied | Rank of grade |
|------------|-----------------------|---------------|
| student 1  | 1                     | 1             |
| student 2  | 10                    | 10            |
| student 3  | 6                     | 6             |
| student 4  | 2                     | 2             |
| student 5  | 3                     | 3             |
| student 6  | 5                     | 5             |
| student 7  | 4                     | 4             |
| student 8  | 8                     | 8             |
| student 9  | 7                     | 7             |
| student 10 | 9                     | 9             |

---

.hi[Calculation with a function]

```r
cor(sati_studiranja,ocjena)  # Pearson correlation of the ranks
```

```
#> [1] 1
```

--

```r
# Same result directly: set the argument method = "spearman"
cor(effort$hours, effort$grade, method = "spearman")
```

```
#> [1] 1
```
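---

.hi[Sketch: Pearson versus Spearman on a monotone relationship]

The difference between the two coefficients can be illustrated on data where one variable always increases with the other, but not along a straight line. The data below are hypothetical (they are not the `effort` data), and the names `x`, `y`, `pearson`, `spearman` and `spearman_by_ranks` are illustrative.

```r
# Hypothetical data: y always increases with x, but not linearly
x <- 1:10
y <- x^3

pearson  <- cor(x, y)                       # Measures linear association
spearman <- cor(x, y, method = "spearman")  # Measures monotone association

# Spearman is the Pearson correlation computed on the ranks
spearman_by_ranks <- cor(rank(x), rank(y))

pearson            # Below 1, because the relationship is curved
spearman           # Equal to 1, because the ordering is preserved
spearman_by_ranks  # Same value as spearman
```

This mirrors the `effort` example: a perfectly ordered relationship gives a Spearman coefficient of 1 even when the Pearson coefficient is below 1.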