class: title-slide

# Case study - gapminder

## tidy models with `broom`

.center[<img src="https://raw.githubusercontent.com/tidymodels/broom/master/man/figures/logo.png" width="100px"/>]

### A. Ginolhac | rworkshop | 2021-09-10
---

# Learning objectives

.flex[
.w-70.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt3.ml1[
.large[.gbox[You will learn]]

.float-img[]

.Large[
- Use `dplyr` / `purrr` for efficient data manipulation
- Tidy multiple linear models using `broom`
- Manage related things together in **one** `tibble`
- Summarise findings in one `ggplot` using relevant aesthetics
]]
.w-30.bg-washed-yellow.b--yellow.ba.bw2.br3.shadow-5.ph3.mt3.ml1[
.large[.gbox[Guided practical]]

.center.Large[Interactive session]
]]

---

# Managing multiple models

#### Tutorial based on the great talk by [Hadley Wickham][1]

.center[]

[1]:https://www.youtube.com/watch?v=rz3_FDVt9eg

---

# List-column cheatsheet, reminder

.flex[
.w-50.b--green.ba.bw1.br3.shadow-5.ph3.mt2.mr1[
.large[.ybox[`list` in a `data.frame/tibble`]]

.center[]
]
.w-50.b--green.ba.bw1.br3.shadow-5.ph3.mt2.mr1[
.large[.ybox[`list of tibbles` in a `data.frame/tibble`]]

.center[]
]]

.center.footnote[[Functional Programming by S. Altman, B. Behrman, H. Wickham](https://dcl-prog.stanford.edu/list-columns.html) (CC4 Licence)]

---

# Keep all analyses together

### Steps: cols, per group: rows

#### Workflow from last time

.pull-left[
```r
palmerpenguins::penguins %>%
  nest_by(island, species) %>%
  mutate(model = list(lm(bill_depth_mm ~ bill_length_mm, data = data)),
         summary = list(summary(model)),
         r_squared = pluck(summary, "r.squared"))
```

#### Remember `nest_by()` is roughly equivalent to:

```r
penguins %>%
  group_by(island, species) %>%
  summarise(data = list(cur_data())) %>%
* rowwise()
```

- `rowwise()` ensures that operations are performed on each row
]

.pull-right[
```
# A tibble: 5 × 6
# Rowwise:  island, species
  island    species                 data model  summary    r_squared
  <fct>     <fct>     <list<tibble[,6]>> <list> <list>         <dbl>
1 Biscoe    Adelie              [44 × 6] <lm>   <smmry.lm>    0.219
2 Biscoe    Gentoo             [124 × 6] <lm>   <smmry.lm>    0.414
3 Dream     Adelie              [56 × 6] <lm>   <smmry.lm>    0.258
4 Dream     Chinstrap           [68 × 6] <lm>   <smmry.lm>    0.427
5 Torgersen Adelie              [52 × 6] <lm>   <smmry.lm>    0.0620
```
]

---

# [Gapminder](http://www.gapminder.org/) is a fact tank

.pull-left[
## Dataset

.float-img[]

- From an independent Swedish foundation
- [R package](https://github.com/jennybc/gapminder) by [Jenny Bryan!](https://github.com/jennybc)
- Install from CRAN: `gapminder`
]

.pull-right[
## Hans Rosling

.float-img[]

- Fundamentally optimistic
- Great [talk](https://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen)
]

---

class: slide-practical

# Guided practical, explore gapminder
.flex[
.w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.gbox[Questions]]

.large[
- Install the `gapminder` package
- Load the `gapminder` and `tidyverse` packages
- Use the pipe `%>%` to pass `gapminder` to `ggplot()`
- **Plot** the life expectancy (`lifeExp` on `y`) ~ `year` (on `x`)
- Use `geom_line()`
]
]
.w-40.bg-washed-red.b--red.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.rbox[Warning!]]

.float-img[]

.center.huge[mind the grouping!]
]]

---

# Gapminder

.pull-left[
```r
# install.packages("gapminder")
library(tidyverse)
library(gapminder)
gapminder %>%
  ggplot(aes(x = year, y = lifeExp,
*            group = country)) +
  geom_line()
```
]

.pull-right[
<img src="lecture09_lm_gapminder_files/figure-html/unnamed-chunk-4-1.png" width="504" />
]

---

class: slide-practical

# Keep related things together using list-column
.flex[
.w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt3.ml6[
.large[.gbox[Questions]]

.large[
- Add a column using `mutate()` named `year1950` which is: `year` - 1950
- Nest the tibble by `country` and `continent` with `nest_by()`
- How many rows will you get? .bold[Expectations help catch mistakes]
- Save the object as `by_country`
]
]
]

---

# Keep related things together

.pull-left[
### Nest _per_ country

```r
by_country <- gapminder %>%
  mutate(year1950 = year - 1950) %>%
  nest_by(continent, country)
by_country
```

### Helpers

- By default the list column is named `data`
- `year1950` will help to get meaningful intercepts
- Add `continent` to keep it along with `country`
- Note that `nest_by()` returns a `rowwise()` tibble
]

.pull-right[
```
# A tibble: 142 × 3
# Rowwise:  continent, country
   continent country                                data
   <fct>     <fct>                    <list<tibble[,5]>>
 1 Africa    Algeria                            [12 × 5]
 2 Africa    Angola                             [12 × 5]
 3 Africa    Benin                              [12 × 5]
 4 Africa    Botswana                           [12 × 5]
 5 Africa    Burkina Faso                       [12 × 5]
 6 Africa    Burundi                            [12 × 5]
 7 Africa    Cameroon                           [12 × 5]
 8 Africa    Central African Republic           [12 × 5]
 9 Africa    Chad                               [12 × 5]
10 Africa    Comoros                            [12 × 5]
# … with 132 more rows
```
]

---

class: slide-compact

# One country example: Germany

.pull-left[
### From original tibble

```r
gapminder %>%
  filter(country == "Germany") %>%
  select(-country, -continent)
```

```
# A tibble: 12 × 4
    year lifeExp      pop gdpPercap
   <int>   <dbl>    <int>     <dbl>
 1  1952    67.5 69145952     7144.
 2  1957    69.1 71019069    10188.
 3  1962    70.3 73739117    12902.
 4  1967    70.8 76368453    14746.
 5  1972    71   78717088    18016.
 6  1977    72.5 78160773    20513.
 7  1982    73.8 78335266    22032.
 8  1987    74.8 77718298    24639.
 9  1992    76.1 80597764    26505.
10  1997    77.3 82011073    27789.
11  2002    78.7 82350671    30036.
12  2007    79.4 82400996    32170.
```
]

.pull-right[
### Nested tibble

```r
by_country %>%
  filter(country == "Germany")
```

```
# A tibble: 1 × 3
# Rowwise:  continent, country
  continent country               data
  <fct>     <fct>   <list<tibble[,5]>>
1 Europe    Germany           [12 × 5]
```

```r
by_country %>%
  filter(country == "Germany") %>%
  unnest(data) # also summarise(data)
```

```
# A tibble: 12 × 7
# Groups:   continent, country [1]
   continent country  year lifeExp      pop gdpPercap year1950
   <fct>     <fct>   <int>   <dbl>    <int>     <dbl>    <dbl>
 1 Europe    Germany  1952    67.5 69145952     7144.        2
 2 Europe    Germany  1957    69.1 71019069    10188.        7
 3 Europe    Germany  1962    70.3 73739117    12902.       12
 4 Europe    Germany  1967    70.8 76368453    14746.       17
 5 Europe    Germany  1972    71   78717088    18016.       22
 6 Europe    Germany  1977    72.5 78160773    20513.       27
 7 Europe    Germany  1982    73.8 78335266    22032.       32
 8 Europe    Germany  1987    74.8 77718298    24639.       37
 9 Europe    Germany  1992    76.1 80597764    26505.       42
10 Europe    Germany  1997    77.3 82011073    27789.       47
11 Europe    Germany  2002    78.7 82350671    30036.       52
12 Europe    Germany  2007    79.4 82400996    32170.       57
```
]

---

# What happens in the `data.frame`, STAYS in the `data.frame`

.vembedr[
]

---

class: slide-practical

# Las Vegas principle, add linear models
.flex[
.w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.gbox[Questions]]

.large[
- Using `by_country`
- Add a new column `model` with linear regressions of `lifeExp` on `year1950`
- Save as `by_country_lm`
]
]
.w-40.bg-washed-red.b--red.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.rbox[Ask yourself]]

.float-img[]

.center.Large[If you see **add column**, do you use `mutate` or `summarise`?]
]]

---

# Linear models

### Linear model _per_ country

.pull-left[
```r
by_country_lm <- by_country %>%
  mutate(model = list(lm(lifeExp ~ year1950, data = data)))
by_country_lm
```
]

.pull-right[
```
# A tibble: 142 × 4
# Rowwise:  continent, country
   continent country                                data model
   <fct>     <fct>                    <list<tibble[,5]>> <list>
 1 Africa    Algeria                            [12 × 5] <lm>
 2 Africa    Angola                             [12 × 5] <lm>
 3 Africa    Benin                              [12 × 5] <lm>
 4 Africa    Botswana                           [12 × 5] <lm>
 5 Africa    Burkina Faso                       [12 × 5] <lm>
 6 Africa    Burundi                            [12 × 5] <lm>
 7 Africa    Cameroon                           [12 × 5] <lm>
 8 Africa    Central African Republic           [12 × 5] <lm>
 9 Africa    Chad                               [12 × 5] <lm>
10 Africa    Comoros                            [12 × 5] <lm>
# … with 132 more rows
```
]

---

class: slide-practical

# Explore a list column
.flex[
.w-50.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.gbox[Questions]]

.large[
- Count the number of rows per country using the `data` column
- Does any country have fewer observations than the others?
]
]
.w-50.bg-washed-red.b--red.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.rbox[Reminder]]

.large[
- A **list column** is a list, so you would normally need to iterate through its elements
- .bold[But] remember the `tibble` is still `rowwise` grouped (good): each operation already runs row by row.
- `distinct()` will help to find unique values
- .bold[But] remember the `tibble` is still `rowwise` grouped (.bold[bad]): `ungroup()` before looking for distinct values.
]]]

---

# Explore a list column

.flex[
.w-50.mr2[
```r
by_country_lm %>%
* mutate(n = nrow(data)) %>%
  select(continent, country, n)
```
]
.w-50.ml2[
```
# A tibble: 142 × 3
# Rowwise:  continent, country
   continent country                      n
   <fct>     <fct>                    <int>
 1 Africa    Algeria                     12
 2 Africa    Angola                      12
 3 Africa    Benin                       12
 4 Africa    Botswana                    12
 5 Africa    Burkina Faso                12
 6 Africa    Burundi                     12
 7 Africa    Cameroon                    12
 8 Africa    Central African Republic    12
 9 Africa    Chad                        12
10 Africa    Comoros                     12
# … with 132 more rows
```
]
]

--

.flex[
.w-50.mr2[
```r
by_country_lm %>%
  mutate(n = nrow(data)) %>%
  ungroup() %>%
* distinct(n)
```
]
.w-50.ml2[
```
# A tibble: 1 × 1
      n
  <int>
1    12
```
]
]

---

class: slide-practical

# Explore a list column, plotting
.flex[
.w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.gbox[Questions]]

.large[
- Plot `lifeExp` ~ `year1950` for **Bulgaria** by .bold[unnesting] `data`
]
]
.w-40.bg-washed-red.b--red.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.rbox[Reminder]]

.large[
- `filter()` for the desired country
- `unnest()` raw `data`
- Pipe to `ggplot()`]
]]

---

# Explore a list column, plotting

.pull-left[
```r
by_country_lm %>%
  filter(country == "Bulgaria") %>%
  unnest(data) %>%
  ggplot(aes(x = year1950, y = lifeExp)) +
  geom_line()
```
]

.pull-right[
<img src="lecture09_lm_gapminder_files/figure-html/unnamed-chunk-16-1.png" width="504" />
]

---

class: slide-practical

# Explore nested tibble
.flex[
.w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.gbox[Questions]]

.large[
- Display the `summary` for the linear model of **Rwanda**
- How do you interpret the `\(r^2\)` for this particular model?
]
]
.w-40.bg-washed-red.b--red.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.rbox[Reminder]]

.large[
- `filter()` for the desired country
- Wrap `summary()` in `list()` to store the model summary in a list column
- To extract the named element `"r.squared"`, use the `purrr` syntax `pluck(summary, "r.squared")`]
]]

---

# Linear model for Rwanda

.pull-left[
```r
by_country_lm %>%
  filter(country == "Rwanda") %>%
  mutate(summary = list(summary(model)),
         r2 = pluck(summary, "r.squared")) %>%
  select(country, r2)
```

```
Adding missing grouping variables: `continent`
```

```
# A tibble: 1 × 3
# Rowwise:  continent, country
  continent country     r2
  <fct>     <fct>    <dbl>
1 Africa    Rwanda  0.0172
```
]

.pull-right[
.large[
- `\(r^2\)` is close to 0: a straight line is a poor fit for Rwanda
- `broom` will clean up linear model elements into tibbles
]
]

---

# Cleanup using `broom`
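
A minimal sketch of what the three main `broom` verbs return, refitting the Rwanda model on its own. The object names `rwanda` and `fit` are illustrative assumptions, not part of the practical.

```r
library(dplyr)
library(broom)
library(gapminder)

# Refit the Rwanda model outside the nested tibble (illustrative names)
rwanda <- gapminder %>%
  filter(country == "Rwanda") %>%
  mutate(year1950 = year - 1950)
fit <- lm(lifeExp ~ year1950, data = rwanda)

tidy(fit)     # one row per coefficient: intercept and slope
glance(fit)   # one row per model: r.squared, p.value, ...
augment(fit)  # one row per observation: fitted values, residuals
```

Each call returns a tibble, which is why the results can be stored in list columns next to the model.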
.center[]

---

class: slide-practical

# Tidying models
.flex[
.w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.gbox[Questions]]

.large[
- Install `broom` from .bold[CRAN]
- Using `by_country_lm`, add 4 new columns:
  + `glance`, using the broom function on the `model` column
  + `tidy`, using the broom function on the `model` column
  + `augment`, using the broom function on the `model` column
  + `rsq` from the `glance` column
- Save as `models`
- Why is extracting the `\(r^2\)` in the main tibble useful?
]
]
.w-40.bg-washed-red.b--red.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.rbox[Reminder]]

.large[
- Wrap results in `list()` when creating a list column in a `rowwise` tibble
]]]

---

# Tidying models

.pull-left[
### Useful info

- Coefficient estimates:
  + **slope**
  + **intercept**
- `\(r^2\)`
- Residuals

```r
library(broom)
models <- by_country_lm %>%
  mutate(glance = list(glance(model)),
         tidy = list(tidy(model)),
         augment = list(augment(model)),
         rsq = pluck(glance, "r.squared"))
```
]

--

.pull-right[
### Extracting `\(r^2\)` in main tibble

.large[Why? No need to `unnest` for sorting / filtering.]

| continent | country | data | model | glance | tidy | augment | rsq |
|:----------|:--------|-----:|------:|-------:|-----:|--------:|----:|
| Africa | Algeria | `<list<tibble[,5]>>` | `<S3: lm>` | `<tibble[,12]>` | `<tibble[,5]>` | `<tibble[,8]>` | 0.98511721 |
| Africa | Angola | `<list<tibble[,5]>>` | `<S3: lm>` | `<tibble[,12]>` | `<tibble[,5]>` | `<tibble[,8]>` | 0.88781463 |
| Africa | Benin | `<list<tibble[,5]>>` | `<S3: lm>` | `<tibble[,12]>` | `<tibble[,5]>` | `<tibble[,8]>` | 0.96660199 |
| Africa | Botswana | `<list<tibble[,5]>>` | `<S3: lm>` | `<tibble[,12]>` | `<tibble[,5]>` | `<tibble[,8]>` | 0.03402340 |
| Africa | Burkina Faso | `<list<tibble[,5]>>` | `<S3: lm>` | `<tibble[,12]>` | `<tibble[,5]>` | `<tibble[,8]>` | 0.91871050 |
| Africa | Burundi | `<list<tibble[,5]>>` | `<S3: lm>` | `<tibble[,12]>` | `<tibble[,5]>` | `<tibble[,8]>` | 0.76599597 |
| Africa | Cameroon | `<list<tibble[,5]>>` | `<S3: lm>` | `<tibble[,12]>` | `<tibble[,5]>` | `<tibble[,8]>` | 0.68017839 |
| Africa | Central African Republic | `<list<tibble[,5]>>` | `<S3: lm>` | `<tibble[,12]>` | `<tibble[,5]>` | `<tibble[,8]>` | 0.49324448 |
| Africa | Chad | `<list<tibble[,5]>>` | `<S3: lm>` | `<tibble[,12]>` | `<tibble[,5]>` | `<tibble[,8]>` | 0.87237550 |
| Africa | Comoros | `<list<tibble[,5]>>` | `<S3: lm>` | `<tibble[,12]>` | `<tibble[,5]>` | `<tibble[,8]>` | 0.99685076 |

… with 132 more rows
]

---

class: center, middle, inverse

# Exploratory plots

---
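
# Sorting and filtering on `rsq`

Because `rsq` is a plain numeric column of the main tibble, it can be filtered and sorted without `unnest()`. A minimal sketch, assuming an arbitrary cut-off of 0.1 chosen only for illustration; `low_fit` is a hypothetical name and the models are rebuilt so the chunk runs on its own.

```r
library(dplyr)
library(purrr)
library(broom)
library(gapminder)

# Rebuild the per-country models so the chunk is self-contained
models <- gapminder %>%
  mutate(year1950 = year - 1950) %>%
  nest_by(continent, country) %>%
  mutate(model = list(lm(lifeExp ~ year1950, data = data)),
         glance = list(glance(model)),
         rsq = pluck(glance, "r.squared"))

# rsq is a regular column: no unnest() needed to filter or sort
# (0.1 is an arbitrary threshold, an assumption for this sketch)
low_fit <- models %>%
  ungroup() %>%
  filter(rsq < 0.1) %>%
  arrange(rsq) %>%
  select(continent, country, rsq)
low_fit
```

---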
class: slide-practical # Plotting `\(r^2\)` for countries
.flex[ .w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt3.ml2[ .large[.gbox[Questions
]]
.large[
- Plot `country` ~ `rsq`
- Color points per continent
- **Reorder** country levels by `\(r^2\)` (`rsq`): _snake plot_
- Which continent shows most of the low `\(r^2\)` values?
]
]
.w-40.bg-washed-red.b--red.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.rbox[Reminder]]
.large[
To reorder the discrete values of `country`:

- Use the `forcats` package
- `fct_reorder(country, rsq)` to reorder based on the `rsq` continuous variable
]]]

---

# Do linear models fit all countries?

### Snake plot

.pull-left[
```r
library(forcats)
models %>%
  ggplot(aes(x = rsq,
             y = fct_reorder(country, rsq))) +
  geom_point(aes(colour = continent),
             alpha = 0.7, size = 2) +
  theme_classic(18) +
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        legend.position = c(0.25, 0.75)) +
  guides(color = guide_legend(
    override.aes = list(alpha = 1))) +
  labs(x = "r square",
       y = "Country")
```
]

.pull-right[
<img src="lecture09_lm_gapminder_files/figure-html/unnamed-chunk-25-1.png" width="504" />
]

---
class: slide-practical

# Display the real data for countries with a low `\(r^2\)`
.flex[ .w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt3.ml2[ .large[.gbox[Questions
]]
.large[
- Focus on non-linear trends
- Filter the 20 countries with the lowest `\(r^2\)`
- `unnest` column `data`
- Plot `lifeExp` ~ `year` with lines
- Colour per continent
- Facet per country
- Same questions for the **top 20** `\(r^2\)`
]
]
.w-40.bg-washed-red.b--red.ba.bw2.br3.shadow-5.ph3.mt3.ml2[
.large[.rbox[Reminder]]
.large[
- You .bold[must] `ungroup()` first, as we are currently working .bold[by row]
- `slice_min(col, n = 5)` returns the 5 rows with the smallest values of `col`
]]]

---

# Focus on non-linear trends

.pull-left[
```r
models %>%
  ungroup() %>%
  slice_min(rsq, n = 20) %>%
  unnest(data) %>%
  ggplot(aes(x = year, y = lifeExp)) +
  geom_line(aes(colour = continent)) +
  facet_wrap(~ country) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = "bottom")
```
]

.pull-right[
<img src="lecture09_lm_gapminder_files/figure-html/unnamed-chunk-27-1.png" width="504" />
]

---

# Focus on best linear trends

.pull-left[
```r
models %>%
  ungroup() %>%
  slice_max(rsq, n = 20) %>%
  unnest(data) %>%
  ggplot(aes(x = year, y = lifeExp)) +
  geom_line(aes(colour = continent)) +
  facet_wrap(~ country) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = "bottom")
```
]

.pull-right[
<img src="lecture09_lm_gapminder_files/figure-html/unnamed-chunk-28-1.png" width="504" />
]

---

# Interpreting the linear model

.pull-left[
### Regression

- What does the **intercept** represent?
    + Using `year1950`?
    + Using `year`?
    + Justify Hadley's choice
- What does the **slope** represent?
<img src="lecture09_lm_gapminder_files/figure-html/unnamed-chunk-29-1.png" width="432" />
]

--

.pull-right[
- Coefficients with predictor `year - 1950`

```r
filter(models, country == "Germany") %>%
  unnest(tidy) %>%
  select(continent, country, estimate)
```

```
# A tibble: 2 × 3
# Groups:   continent, country [1]
  continent country estimate
  <fct>     <fct>      <dbl>
1 Europe    Germany   67.1
2 Europe    Germany    0.214
```

- Compare with the model fitted on the original years

```r
gapminder %>%
  filter(country == "Germany") %>%
  lm(lifeExp ~ year, data = .) %>%
  tidy() %>%
  select(term, estimate)
```

```
# A tibble: 2 × 2
  term        estimate
  <chr>          <dbl>
1 (Intercept) -350.
2 year           0.214
```
]

---
class: slide-practical

# Summarise on one plot

### by Hadley Wickham
.flex[ .w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt3.ml6[ .large[.gbox[Questions
]]
- Unnest coefficients (`tidy` column)
    + Remember to keep the `continent`, `country` and `rsq` columns
- Put **intercept** and **slope** in their own columns
    + In **wide** format, only one value column can be used.
    + Discard unused columns.
- Plot `slope ~ intercept` (watch out for the `(Intercept)` name, which must be wrapped in backticks '.large[`]')
- Colour per continent
- Size per `\(r^2\)` (use `scale_size_area()` for legibility)
- Add a trend line with `geom_smooth(method = "loess")`
]
]

---

---
class: nvs1
count: false

### Full pipeline

.panel1-final_plot-auto[
```r
*gapminder
```
]

.panel2-final_plot-auto[
```
# A tibble: 1,704 × 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# … with 1,694 more rows
```
]

---
count: false

### Full pipeline

.panel1-final_plot-auto[
```r
gapminder %>%
*  mutate(year1950 = year - 1950)
```
]

.panel2-final_plot-auto[ ``` # A tibble: 1,704 × 7 country continent year lifeExp pop gdpPercap year1950 <fct> <fct> <int> <dbl> <int> <dbl> <dbl> 1 Afghanistan Asia 1952 28.8 8425333 779. 2 2 Afghanistan Asia 1957 30.3 9240934 821. 7 3 Afghanistan Asia 1962 32.0 10267083 853. 12 4 Afghanistan Asia 1967 34.0 11537966 836. 17 5 Afghanistan Asia 1972 36.1 13079460 740. 22 6 Afghanistan Asia 1977 38.4 14880372 786. 27 7 Afghanistan Asia 1982 39.9 12881816 978. 32 8 Afghanistan Asia 1987 40.8 13867957 852. 37 9 Afghanistan Asia 1992 41.7 16317921 649. 42 10 Afghanistan Asia 1997 41.8 22227415 635. 
47 # … with 1,694 more rows ``` ] --- count: false ### Full pipeline .panel1-final_plot-auto[ ```r gapminder %>% mutate(year1950 = year - 1950) %>% * nest_by(continent, country) ``` ] .panel2-final_plot-auto[ ``` # A tibble: 142 × 3 # Rowwise: continent, country continent country data <fct> <fct> <list<tibble[,5]>> 1 Africa Algeria [12 × 5] 2 Africa Angola [12 × 5] 3 Africa Benin [12 × 5] 4 Africa Botswana [12 × 5] 5 Africa Burkina Faso [12 × 5] 6 Africa Burundi [12 × 5] 7 Africa Cameroon [12 × 5] 8 Africa Central African Republic [12 × 5] 9 Africa Chad [12 × 5] 10 Africa Comoros [12 × 5] # … with 132 more rows ``` ] --- count: false ### Full pipeline .panel1-final_plot-auto[ ```r gapminder %>% mutate(year1950 = year - 1950) %>% nest_by(continent, country) %>% * mutate(model = list(lm(lifeExp ~ year1950, * data = data)), * glance = list(glance(model)), * tidy = list(tidy(model)), * rsq = pluck(glance, "r.squared")) ``` ] .panel2-final_plot-auto[ ``` # A tibble: 142 × 7 # Rowwise: continent, country continent country data model glance tidy rsq <fct> <fct> <list<tibb> <list> <list> <list> <dbl> 1 Africa Algeria [12 × 5] <lm> <tibble… <tibbl… 0.985 2 Africa Angola [12 × 5] <lm> <tibble… <tibbl… 0.888 3 Africa Benin [12 × 5] <lm> <tibble… <tibbl… 0.967 4 Africa Botswana [12 × 5] <lm> <tibble… <tibbl… 0.0340 5 Africa Burkina Faso [12 × 5] <lm> <tibble… <tibbl… 0.919 6 Africa Burundi [12 × 5] <lm> <tibble… <tibbl… 0.766 7 Africa Cameroon [12 × 5] <lm> <tibble… <tibbl… 0.680 8 Africa Central African Republic [12 × 5] <lm> <tibble… <tibbl… 0.493 9 Africa Chad [12 × 5] <lm> <tibble… <tibbl… 0.872 10 Africa Comoros [12 × 5] <lm> <tibble… <tibbl… 0.997 # … with 132 more rows ``` ] --- count: false ### Full pipeline .panel1-final_plot-auto[ ```r gapminder %>% mutate(year1950 = year - 1950) %>% nest_by(continent, country) %>% mutate(model = list(lm(lifeExp ~ year1950, data = data)), glance = list(glance(model)), tidy = list(tidy(model)), rsq = pluck(glance, "r.squared")) %>% * 
unnest(tidy) ``` ] .panel2-final_plot-auto[ ``` # A tibble: 284 × 11 # Groups: continent, country [142] continent country data model glance term estimate std.error statistic <fct> <fct> <list<t> <lis> <list> <chr> <dbl> <dbl> <dbl> 1 Africa Algeria [12 × 5] <lm> <tibb… (Int… 42.2 0.756 55.8 2 Africa Algeria [12 × 5] <lm> <tibb… year… 0.569 0.0221 25.7 3 Africa Angola [12 × 5] <lm> <tibb… (Int… 31.7 0.804 39.4 4 Africa Angola [12 × 5] <lm> <tibb… year… 0.209 0.0235 8.90 5 Africa Benin [12 × 5] <lm> <tibb… (Int… 38.9 0.671 58.0 6 Africa Benin [12 × 5] <lm> <tibb… year… 0.334 0.0196 17.0 7 Africa Botswana [12 × 5] <lm> <tibb… (Int… 52.8 3.49 15.1 8 Africa Botswana [12 × 5] <lm> <tibb… year… 0.0607 0.102 0.593 9 Africa Burkina Faso [12 × 5] <lm> <tibb… (Int… 34.0 1.17 29.0 10 Africa Burkina Faso [12 × 5] <lm> <tibb… year… 0.364 0.0342 10.6 # … with 274 more rows, and 2 more variables: p.value <dbl>, rsq <dbl> ``` ] --- count: false ### Full pipeline .panel1-final_plot-auto[ ```r gapminder %>% mutate(year1950 = year - 1950) %>% nest_by(continent, country) %>% mutate(model = list(lm(lifeExp ~ year1950, data = data)), glance = list(glance(model)), tidy = list(tidy(model)), rsq = pluck(glance, "r.squared")) %>% unnest(tidy) %>% * select(continent, country, rsq, term, estimate) ``` ] .panel2-final_plot-auto[ ``` # A tibble: 284 × 5 # Groups: continent, country [142] continent country rsq term estimate <fct> <fct> <dbl> <chr> <dbl> 1 Africa Algeria 0.985 (Intercept) 42.2 2 Africa Algeria 0.985 year1950 0.569 3 Africa Angola 0.888 (Intercept) 31.7 4 Africa Angola 0.888 year1950 0.209 5 Africa Benin 0.967 (Intercept) 38.9 6 Africa Benin 0.967 year1950 0.334 7 Africa Botswana 0.0340 (Intercept) 52.8 8 Africa Botswana 0.0340 year1950 0.0607 9 Africa Burkina Faso 0.919 (Intercept) 34.0 10 Africa Burkina Faso 0.919 year1950 0.364 # … with 274 more rows ``` ] --- count: false ### Full pipeline .panel1-final_plot-auto[ ```r gapminder %>% mutate(year1950 = year - 1950) %>% 
nest_by(continent, country) %>% mutate(model = list(lm(lifeExp ~ year1950, data = data)), glance = list(glance(model)), tidy = list(tidy(model)), rsq = pluck(glance, "r.squared")) %>% unnest(tidy) %>% select(continent, country, rsq, term, estimate) %>% * pivot_wider(names_from = term, * values_from = estimate) ``` ] .panel2-final_plot-auto[ ``` # A tibble: 142 × 5 # Groups: continent, country [142] continent country rsq `(Intercept)` year1950 <fct> <fct> <dbl> <dbl> <dbl> 1 Africa Algeria 0.985 42.2 0.569 2 Africa Angola 0.888 31.7 0.209 3 Africa Benin 0.967 38.9 0.334 4 Africa Botswana 0.0340 52.8 0.0607 5 Africa Burkina Faso 0.919 34.0 0.364 6 Africa Burundi 0.766 40.3 0.154 7 Africa Cameroon 0.680 40.7 0.250 8 Africa Central African Republic 0.493 38.4 0.184 9 Africa Chad 0.872 39.3 0.253 10 Africa Comoros 0.997 39.1 0.450 # … with 132 more rows ``` ] --- count: false ### Full pipeline .panel1-final_plot-auto[ ```r gapminder %>% mutate(year1950 = year - 1950) %>% nest_by(continent, country) %>% mutate(model = list(lm(lifeExp ~ year1950, data = data)), glance = list(glance(model)), tidy = list(tidy(model)), rsq = pluck(glance, "r.squared")) %>% unnest(tidy) %>% select(continent, country, rsq, term, estimate) %>% pivot_wider(names_from = term, values_from = estimate) %>% * ggplot(aes(x = `(Intercept)`, y = year1950)) ``` ] .panel2-final_plot-auto[ <img src="lecture09_lm_gapminder_files/figure-html/final_plot_auto_08_output-1.png" width="720" /> ] --- count: false ### Full pipeline .panel1-final_plot-auto[ ```r gapminder %>% mutate(year1950 = year - 1950) %>% nest_by(continent, country) %>% mutate(model = list(lm(lifeExp ~ year1950, data = data)), glance = list(glance(model)), tidy = list(tidy(model)), rsq = pluck(glance, "r.squared")) %>% unnest(tidy) %>% select(continent, country, rsq, term, estimate) %>% pivot_wider(names_from = term, values_from = estimate) %>% ggplot(aes(x = `(Intercept)`, y = year1950)) + * geom_point(aes(colour = continent, * size = rsq)) 
``` ] .panel2-final_plot-auto[ <img src="lecture09_lm_gapminder_files/figure-html/final_plot_auto_09_output-1.png" width="720" /> ] --- count: false ### Full pipeline .panel1-final_plot-auto[ ```r gapminder %>% mutate(year1950 = year - 1950) %>% nest_by(continent, country) %>% mutate(model = list(lm(lifeExp ~ year1950, data = data)), glance = list(glance(model)), tidy = list(tidy(model)), rsq = pluck(glance, "r.squared")) %>% unnest(tidy) %>% select(continent, country, rsq, term, estimate) %>% pivot_wider(names_from = term, values_from = estimate) %>% ggplot(aes(x = `(Intercept)`, y = year1950)) + geom_point(aes(colour = continent, size = rsq)) + * geom_smooth(se = FALSE, * method = "loess", * formula = "y ~ x") ``` ] .panel2-final_plot-auto[ <img src="lecture09_lm_gapminder_files/figure-html/final_plot_auto_10_output-1.png" width="720" /> ] --- count: false ### Full pipeline .panel1-final_plot-auto[ ```r gapminder %>% mutate(year1950 = year - 1950) %>% nest_by(continent, country) %>% mutate(model = list(lm(lifeExp ~ year1950, data = data)), glance = list(glance(model)), tidy = list(tidy(model)), rsq = pluck(glance, "r.squared")) %>% unnest(tidy) %>% select(continent, country, rsq, term, estimate) %>% pivot_wider(names_from = term, values_from = estimate) %>% ggplot(aes(x = `(Intercept)`, y = year1950)) + geom_point(aes(colour = continent, size = rsq)) + geom_smooth(se = FALSE, method = "loess", formula = "y ~ x") + * scale_size_area() ``` ] .panel2-final_plot-auto[ <img src="lecture09_lm_gapminder_files/figure-html/final_plot_auto_11_output-1.png" width="720" /> ] --- count: false ### Full pipeline .panel1-final_plot-auto[ ```r gapminder %>% mutate(year1950 = year - 1950) %>% nest_by(continent, country) %>% mutate(model = list(lm(lifeExp ~ year1950, data = data)), glance = list(glance(model)), tidy = list(tidy(model)), rsq = pluck(glance, "r.squared")) %>% unnest(tidy) %>% select(continent, country, rsq, term, estimate) %>% pivot_wider(names_from = term, 
values_from = estimate) %>% ggplot(aes(x = `(Intercept)`, y = year1950)) + geom_point(aes(colour = continent, size = rsq)) + geom_smooth(se = FALSE, method = "loess", formula = "y ~ x") + scale_size_area() + * labs(x = "Life expectancy (1950)", * y = "Yearly improvement") ``` ] .panel2-final_plot-auto[ <img src="lecture09_lm_gapminder_files/figure-html/final_plot_auto_12_output-1.png" width="720" /> ] --- count: false ### Full pipeline .panel1-final_plot-auto[ ```r gapminder %>% mutate(year1950 = year - 1950) %>% nest_by(continent, country) %>% mutate(model = list(lm(lifeExp ~ year1950, data = data)), glance = list(glance(model)), tidy = list(tidy(model)), rsq = pluck(glance, "r.squared")) %>% unnest(tidy) %>% select(continent, country, rsq, term, estimate) %>% pivot_wider(names_from = term, values_from = estimate) %>% ggplot(aes(x = `(Intercept)`, y = year1950)) + geom_point(aes(colour = continent, size = rsq)) + geom_smooth(se = FALSE, method = "loess", formula = "y ~ x") + scale_size_area() + labs(x = "Life expectancy (1950)", y = "Yearly improvement") + * theme_minimal(18) ``` ] .panel2-final_plot-auto[ <img src="lecture09_lm_gapminder_files/figure-html/final_plot_auto_13_output-1.png" width="720" /> ] <style> .panel1-final_plot-auto { color: black; width: 44.1%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-final_plot-auto { color: black; width: 53.9%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-final_plot-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- # Animation with `gganimate` #### Takes ~ 5 minutes due to easing (.green[linear] since time in years) .pull-left[ ```r library(gganimate) gapminder %>% ggplot(aes(x = gdpPercap, y = lifeExp, size = pop, color = continent)) + transition_time(year) + ease_aes("linear") + scale_size(range = c(2, 12)) + geom_point() + theme_bw(16) + labs(title = "Year: {frame_time}", x = "GDP per capita", y = "life expectancy") 
+ scale_x_log10() -> p animate(p) anim_save("gapminder2.gif") ``` ] .pull-right[  ] ---

# Before we stop

.flex[
.w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt2.ml1[
.large[.gbox[You learned to:]
.float-img[]

- Keep related things together:
    + Input data
    + Meaningful grouping ids
    + Perform modelling
    + Extract relevant model components
    + Explore your findings visually
]
]
.w-40.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt2.ml2[
.large[.bbox[Acknowledgments 🙏 👏]
* Hadley Wickham
* Jennifer Bryan
* David Robinson
* Thomas Pedersen
* Eric Koncina
* Roland Krause
]]
]
.w-60.pv2.ph3.mt1.ml6[
.huge[.bbox[Thank you for your attention!]]
]
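
---

# Appendix: shifting `year` moves the intercept, not the slope

The slide on interpreting the linear model asks what the intercept means with `year1950` versus `year`. The sketch below is one way to check this numerically for Germany. It assumes the `gapminder` package is installed. The object names (`germany`, `fit_raw`, `fit_shifted`) are illustrative and do not come from the slides, and it uses base `lm()` with `I(year - 1950)` instead of the `year1950` column built in the pipeline.

```r
library(gapminder)

# Data for one country (Germany, as on the interpretation slide)
germany <- subset(gapminder, country == "Germany")

# Model on the original years: the intercept is the prediction for year 0
fit_raw <- lm(lifeExp ~ year, data = germany)

# Model on years since 1950: the intercept is the prediction for 1950
fit_shifted <- lm(lifeExp ~ I(year - 1950), data = germany)

# The slope (yearly improvement) is the same in both fits
slope_raw     <- unname(coef(fit_raw)["year"])
slope_shifted <- unname(coef(fit_shifted)[2])
all.equal(slope_raw, slope_shifted)

# The shifted intercept is the raw model evaluated at 1950:
# intercept_raw + 1950 * slope
intercept_from_raw <- unname(coef(fit_raw)["(Intercept)"]) + 1950 * slope_raw
all.equal(intercept_from_raw, unname(coef(fit_shifted)["(Intercept)"]))
```

This matches the coefficients shown for Germany: the slope is 0.214 in both models, while the intercept changes from about -350 (life expectancy extrapolated to year 0) to 67.1 (fitted life expectancy in 1950). That is why the x-axis of the final plot can be labelled "Life expectancy (1950)".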