class: center, middle, inverse, title-slide

.title[
# 04 Data-Analysis
]
.subtitle[
## Elements of R and R Markdown
]
.author[
### Claudio Zandonella
]

---
class: size-small

# Working Session

.pull-left-50[

### Environment

- `ls()`, list all objects in the environment
- `rm(<name_object>)`, remove an object from the environment
- `rm(list = ls())`, remove all objects from the environment

> Use RStudio Projects!

### Libraries

- `install.packages("<package-name>")`, install a package
- `library("<package-name>")`, load a package
- `<package_name>::<function>()`, use a function without loading the package

]

.pull-right-50[

### Working Directory

- `getwd()`, get the absolute path of the working directory
- `setwd()`, set the working directory for the session

### Paths

- Favour **relative paths** over **absolute paths**
- Use `"/"` (forward slash) to separate path components
- `"./"` current working directory
- `"../"` parent directory

### Help!....I need somebody, Help!

- `?<name_function>()`, get function documentation
- `??<name>`, search for matches in documentation

]

---
class: size-small

# Managing Data

<br>

<table class="table table-striped table-hover" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> Format </th>
   <th style="text-align:left;"> Read </th>
   <th style="text-align:left;"> Write </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> <bold><code class="remark-inline-code">.rda()</code></bold> </td>
   <td style="text-align:left;width: 12cm; "> <code class="remark-inline-code">load("<data>.rda")</code> </td>
   <td style="text-align:left;width: 13cm; "> <code class="remark-inline-code">save(<object>, file = "<data>.rda")</code> </td>
  </tr>
  <tr>
   <td style="text-align:left;"> <bold><code class="remark-inline-code">.rds()</code></bold> </td>
   <td style="text-align:left;width: 12cm; "> <code class="remark-inline-code">readRDS("<data>.rds")</code> </td>
   <td style="text-align:left;width: 13cm; "> <code class="remark-inline-code">saveRDS(<object>, file = 
"<data>.rds")</code> </td> </tr> <tr> <td style="text-align:left;"> <bold><code class="remark-inline-code">.csv()</code></bold> </td> <td style="text-align:left;width: 12cm; "> <code class="remark-inline-code">read.csv("<data>.csv")</code> for <code class="remark-inline-code">,</code> as separator and <code class="remark-inline-code">.</code> as decimal separator </td> <td style="text-align:left;width: 13cm; "> <code class="remark-inline-code">write.csv(<object>, file = "<data>.csv")</code> </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;width: 12cm; "> <code class="remark-inline-code">read.csv2()</code> for <code class="remark-inline-code">;</code> as separator and <code class="remark-inline-code">,</code> as decimal separator </td> <td style="text-align:left;width: 13cm; "> <code class="remark-inline-code">write.csv2(<object>, file = "<data>.csv")</code> </td> </tr> <tr> <td style="text-align:left;"> <bold><code class="remark-inline-code">.txt()</code></bold> </td> <td style="text-align:left;width: 12cm; "> <code class="remark-inline-code">read.table("<data>.txt", sep = )</code> </td> <td style="text-align:left;width: 13cm; "> <code class="remark-inline-code">write.table(<object>, file = "<data>.txt")</code> </td> </tr> <tr> <td style="text-align:left;"> <bold><code class="remark-inline-code">.xlsx()</code></bold> </td> <td style="text-align:left;width: 12cm; "> <code class="remark-inline-code">readxl::read_xlsx("<data>.xlsx")</code> </td> <td style="text-align:left;width: 13cm; "> <code class="remark-inline-code">really??</code> </td> </tr> </tbody> </table> --- class: size-small # Descriptive Analysis <br> .pull-left-50[ - `summary()`, summary information - `mean()`, arithmetic mean - `median()`, median value - `quantile()`, sample quantiles - `max()`, maximum value - `min()`, minimum value - `range()`, range of values - `sd()`, standard deviation - `var()`, variance - `cor()`, correlation - `cov()`, covariance - `table()`, 
frequency table

]

.pull-right-50[

### Utility Functions

- `paste()`, concatenate strings (separated by a space by default)
- `paste0()`, concatenate strings without a separator
- `head()`, return first rows
- `cumsum()`, cumulative sum
- `cumprod()`, cumulative product
- `round()`, rounding numbers
- `scale()`, scaling and centering values
- `unique()`, get unique elements
- `duplicated()`, determine duplicated elements
- `complete.cases()`, find complete cases

]

---
class: size-small

# Plots

.pull-left-50[

- `plot(x, y, ...)`, scatter plot
- `plot(x, y, type = "l", ...)`, line plot
- `barplot(height, ...)`, barplot
- `hist(x, ...)`, histogram
- `boxplot(formula, ...)`, boxplot
- `plot(density(x, ...))`, kernel density
- `pairs()`, scatter plot matrix

### Plot Options

- `col`, color of lines and points
- `main`, main plot title
- `xlab` / `ylab`, label for x/y axis
- `xlim` / `ylim`, limits for x/y axis
- `legend()`, add legend
- `par(mfrow = c(<n-rows>, <n-cols>))`, <br>define plot grid

]

.pull-right-50[

<img src="04-R-data-analysis_files/figure-html/example-plot-1.png" style="display: block; margin: auto;" />

]

---
class: size-small

# Statistics

.pull-left-50[

### Tests

- `cor.test()`, test correlation
- `t.test()`, Student's t-test
- `wilcox.test()`, Wilcoxon rank sum and signed rank tests
- `fisher.test()`, Fisher's exact test
- `chisq.test()`, Pearson's chi-squared test
- `shapiro.test()`, Shapiro-Wilk normality test

### Models

- `lm()`, fit linear models
- `glm()`, fit generalized linear models
- `anova()`, compute analysis of variance

]

.pull-right-50[

### Formula ~

`<outcome> ~ <predictors>`

- `1` add intercept term
- `a + b` additive effect
- `a:b` interaction effect
- `a*b` interaction and lower-order terms
- `(a + b + ...)^n` include interactions up to order `n`
- `- a` remove term

### Contrasts

- `contr.treatment()`, dummy coding
- `contr.sum()`, sum-to-zero contrasts
- `contr.poly()`, orthogonal polynomial contrasts
- `contr.helmert()`, Helmert contrasts

]

---
class: size-small

# Top Packages

.pull-left-50[

#### Models

- `lme4`, fit linear and
generalized linear mixed-effects models
- `brms`, fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan' for full Bayesian inference
- `metafor`, a comprehensive collection of functions for conducting meta-analyses in R
- `lavaan`, fit a variety of latent variable models, including confirmatory factor analysis, structural equation modeling and latent growth curve models

]

.pull-right-50[

#### Utilities

- `car`, variety of tests and diagnostics for regression analysis
- `performance`, utilities for computing measures to assess model quality
- `emmeans`, obtain estimated marginal means (EMMs) and contrasts for many linear, generalized linear, and mixed models

]

---
class: end, middle, center

# Thanks!
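---
class: size-small

# Worked Example

A minimal sketch of how some of the functions listed in these slides might fit together, not a prescribed workflow. The built-in `mtcars` data set, the temporary file name `cars.csv`, and the model `mpg ~ wt * hp` are assumptions chosen only for illustration.

```r
# Illustrative assumptions: mtcars (ships with R) and a temporary CSV file
csv_path <- file.path(tempdir(), "cars.csv")

# Managing data: write a CSV file, then read it back
write.csv(mtcars, file = csv_path, row.names = FALSE)
cars <- read.csv(csv_path)

# Descriptive analysis
summary(cars$mpg)
sd(cars$mpg)
table(cars$cyl)

# Model: wt * hp expands to wt + hp + wt:hp
fit <- lm(mpg ~ wt * hp, data = cars)
names(coef(fit))
```

`row.names = FALSE` keeps `write.csv()` from adding a column of row names, so the file read back has the same columns as `mtcars`.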