class: title-slide
# Evaluation in programming

## `tidyeval`

.center[<img src="https://rlang.r-lib.org/reference/figures/rlang.png" width="100px"/>]

### A. Ginolhac | rworkshop | 2021-09-10

## 🪄 .bold[Advanced topic]
---

# Learning objectives

.flex[
.w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt3.mr1.ml6[
.large[.gbox[You will learn to:]

.right[<img src="img/00/kt.png" class="float-img"/>]

- Understand how quoting functions work
- Understand why we need **tidyeval**
- Apply it in your own programming
- Embrace the curly-curly operator and forget the points above
]]]

---

# Environments and promises

.pull-left[

### Functions enclose their variables

```r
x <- 1
plus_one <- function(x) {
  x <- x + 1
  x
}
plus_one(x)
```

```
[1] 2
```

```r
plus_one(x)
```

```
[1] 2
```

.Large[The `x` object in .bold[Global.Env] wasn't modified]
]

--

.pull-right[

### Separate assignment and evaluation

```r
msg <- "old"
delayedAssign("promise", msg)
msg <- "new!"
promise # new!
```

```
[1] "new!"
```

#### The <u>promise</u> was created when `msg` was .green[`"old"`]
.large[R is fine with letting users create variables (`promises`) that are evaluated only later.]
]

.footnote[Source: help page of `delayedAssign()`]

---

# Standard vs Non-standard evaluation

.pull-left[

#### .large[base] R must refer to known objects

- But quoting (i.e. no evaluation) is used for axis labels

```r
plot(swiss$Education, swiss$Examination)
```

<img src="lecture11_tidyeval_files/figure-html/unnamed-chunk-3-1.png" width="360" />
]

--

.pull-right[

#### `ggplot2`, and widely in the tidyverse

- Columns are evaluated in the **data** context.

```r
ggplot(swiss, aes(x = Education, y = Examination)) +
  geom_point()
```

<img src="lecture11_tidyeval_files/figure-html/unnamed-chunk-4-1.png" width="360" />
]

---

# Quoting

.pull-left[
.rbox[
Does not mean adding quotes!]

#### But capturing an expression

```r
quote(Education)
```

```
Education
```

```r
quote(swiss$Education)
```

```
swiss$Education
```

#### Without quoting, we cannot evaluate

```r
eval(Education, envir = swiss)
```

```
Error in eval(Education, envir = swiss): object 'Education' not found
```

`envir` tells `eval()` where to find the vector `Education`
]

--

.pull-right[
.bbox[Quoting allows evaluation when needed]

```r
eval(quote(Education), envir = swiss)
```

```
 [1] 12  9  5  7 15  7  7  8  7 13  6 12  7 12  5  2  8 28 20  9 10  3 12  6  1
[26]  8  3 10 19  8  2  6  2  6  3  9  3 13 12 11 13 32  7  7 53 29 29
```
]

---

class: center, middle

### Non-standard evaluation (NSE)
### Unquoted variable names

.footnote[[technically WRONG, but useful approximation. - .bold[Jenny Bryan], rstudio::conf2019](https://www.youtube.com/watch?v=2BXPLnLMTYo)]

---

# Quotation also works for expressions

.pull-left[

### Exists in .bold[base] R

```r
subset(swiss, (Education + Examination) > 60)[, 2:4]
```

```
             Agriculture Examination Education
Neuchatel           17.6          35        32
V. De Geneve         1.2          37        53
```
]

--

.pull-right[

### `dplyr`

```r
filter(swiss, (Education + Examination) > 60) %>%
  select(2:4)
```

```
             Agriculture Examination Education
Neuchatel           17.6          35        32
V. De Geneve         1.2          37        53
```
]

---

# Evaluating expressions

.flex[
.w-70.bg-washed-yellow.b--gold.ba.bw2.br3.shadow-5.ph3.mt3.mr1[
.large.gbox[Standard evaluation of an expression]

.large[To evaluate an .red[expression], you search .red[environments] for name-value bindings and perform the evaluation immediately.
]]
.w-70.bg-washed-yellow.b--gold.ba.bw2.br3.shadow-5.ph3.mt3.mr1[
.large[
.bbox[Non-standard evaluation] means you might

* Modify the .red[expression] or
* Modify the chain of searched .red[environments]

before evaluation.
]
]
]

--

.flex[
.w-70.bg-washed-yellow.b--gold.ba.bw2.br3.shadow-5.ph3.mt3.ml6[
.center.Large[Why is `swiss` found although it is absent from `Global_Env`?]
]]

.footnote[[Jenny Bryan, rstudio::conf2019](https://www.youtube.com/watch?v=2BXPLnLMTYo)]

---

class: nvs1

# Values in environments are searched in a precise order

.pull-left[

```r
# callr runs the function in a clean R session
callr::r(function() rlang::search_envs())
```

```
 [[1]] $ <env: global>
 [[2]] $ <env: package:stats>
 [[3]] $ <env: package:graphics>
 [[4]] $ <env: package:grDevices>
 [[5]] $ <env: package:utils>
 [[6]] $ <env: package:datasets>
 [[7]] $ <env: package:methods>
 [[8]] $ <env: Autoloads>
 [[9]] $ <env: tools:callr>
[[10]] $ <env: package:base>
```

- `swiss` lies in `datasets`, so it is found after 5 failed lookups
]

--

.pull-right[

```r
callr::r(function() {
* library(forcats)
  rlang::search_envs()
})
```

```
 [[1]] $ <env: global>
 [[2]] $ <env: package:forcats>
 [[3]] $ <env: package:stats>
 [[4]] $ <env: package:graphics>
 [[5]] $ <env: package:grDevices>
 [[6]] $ <env: package:utils>
 [[7]] $ <env: package:datasets>
 [[8]] $ <env: package:methods>
 [[9]] $ <env: Autoloads>
[[10]] $ <env: tools:callr>
[[11]] $ <env: package:base>
```
]

.footnote[[Advanced R](https://adv-r.hadley.nz/environments.html) by Hadley Wickham]

---

# Quoting prevents evaluation, nice, but how does it work?

.pull-left[
.large[
`Education` is unknown in the .bold[Global Environment]

```r
Education
```

```
Error in eval(expr, envir, enclos): object 'Education' not found
```

Quoting .bold[prevents] evaluation
]

```r
quote(Education)
```

```
Education
```

.large[But how to .bold[force] evaluation then?]
]

--

.pull-right[

### Use `eval()` in the _data_ context with `envir`

```r
Education <- "tic"
# the Education in the GlobalEnv won't clash
eval(expr = quote(Education), envir = swiss)
```

```
 [1] 12  9  5  7 15  7  7  8  7 13  6 12  7 12  5  2  8 28 20  9 10  3 12  6  1
[26]  8  3 10 19  8  2  6  2  6  3  9  3 13 12 11 13 32  7  7 53 29 29
```
]

---

# Evaluation in the tidyverse

.pull-left[
.large[
Works even if a column name collides with an object in the .bold[Global Environment]

- `Education` is evaluated in the `swiss` context .bold[only]
]

```r
Education <- 20
filter(swiss, Education > 40)
```

```
             Fertility Agriculture Examination Education Catholic
V. De Geneve        35         1.2          37        53    42.34
             Infant.Mortality
V. De Geneve               18
```
]

--

.pull-right[
.large[
- `.data` and `.env` pronouns exist if you need to specify .bold[who] is .bold[where]

```r
filter(swiss, .data$Education > .env$Education)
```

```
             Fertility Agriculture Examination Education Catholic
Lausanne          55.7        19.4          26        28    12.11
Neuchatel         64.4        17.6          35        32    16.92
V. De Geneve      35.0         1.2          37        53    42.34
Rive Droite       44.7        46.6          16        29    50.43
Rive Gauche       42.8        27.7          22        29    58.33
             Infant.Mortality
Lausanne                 20.2
Neuchatel                23.0
V. De Geneve             18.0
Rive Droite              18.2
Rive Gauche              19.3
```
]

#### But you had better avoid name collisions
]

---

# Quotations of expressions

.pull-left[

### Doing it by hand

```r
expr <- quote((Education + Examination) > 60)
expr
```

```
(Education + Examination) > 60
```

```r
swiss[eval(expr, envir = swiss), 1:2]
```

```
             Fertility Agriculture
Neuchatel         64.4        17.6
V. De Geneve      35.0         1.2
```
]

--

.pull-right[

### In a function it .red[fails!]
```r
quoted_filter <- function(data, expr) {
  q_expr <- quote(expr)
  data[eval(q_expr, envir = data), 1:2]
}
quoted_filter(swiss, (Education + Examination) > 60)
```

```
Error in eval(q_expr, envir = data): object 'Examination' not found
```
]

---

# Expressions in functions

#### `substitute()` is needed

.pull-left[

```r
quoted_filter_sub <- function(data, expr) {
  q_expr <- substitute(expr)
  # returns both the results and the substituted expression
  list(data[eval(q_expr, envir = data), 1:2],
       q_expr
  )
}
quoted_filter_sub(swiss, (Education + Examination) > 60)
```

```
[[1]]
             Fertility Agriculture
Neuchatel         64.4        17.6
V. De Geneve      35.0         1.2

[[2]]
(Education + Examination) > 60
```
]

.pull-right[

### _Substituting and Quoting Expressions_ help page

.large[
> `substitute` returns the parse tree for the (unevaluated) expression expr, substituting any variables bound in env. `quote` simply returns its argument. The argument is not evaluated and can be any R expression.

- Complex but it works, so .bold[why] do we need `tidyeval`?
]
]

---

# Why do we need `tidyeval`, _i.e._ quasiquotation?

.pull-left[

### Add one variable in GlobalEnv

```r
threshold <- 60
quoted_filter_sub(swiss, (Education + Examination) > threshold)
```

```
[[1]]
             Fertility Agriculture
Neuchatel         64.4        17.6
V. De Geneve      35.0         1.2

[[2]]
(Education + Examination) > threshold
```
]

--

.pull-right[

### .red[Error], names clash
.Large[
`Fertility` is .bold[also] a data column
]

```r
Fertility <- 60
quoted_filter_sub(swiss, (Education + Examination) > Fertility)
```

```
[[1]]
             Fertility Agriculture
Neuchatel         64.4        17.6
V. De Geneve      35.0         1.2
Rive Droite       44.7        46.6
Rive Gauche       42.8        27.7

[[2]]
(Education + Examination) > Fertility
```
]

---

# Quasiquotation, bang bang ‼️ operator

.large[
- When we need to unquote **part** of the expression: .green.large[!!]
- `rlang::qq_show()` is a helper to check the result
]

.flex[
.w-60.b--blue.ba.bw2.br3.shadow-5.ph4.mt5.ml6[
.bbox[Demonstration]

```r
quo((Education + Examination) > Fertility)
```

```
<quosure>
expr: ^(Education + Examination) > Fertility
env:  0x5641f6598958
```

```r
rlang::qq_show(quo((Education + Examination) > !! Fertility))
```

```
quo((Education + Examination) > 60)
```
]
]

---

# The bang bang operator `!!`

.pull-left[

### In `dplyr`, unquote only `Fertility`

- Keep the rest of the expression as before

```r
filter(swiss, (Education + Examination) > !! Fertility)
```

```
             Fertility Agriculture Examination Education Catholic
Neuchatel         64.4        17.6          35        32    16.92
V. De Geneve      35.0         1.2          37        53    42.34
             Infant.Mortality
Neuchatel                  23
V. De Geneve               18
```
]

--

.pull-right[

### .green[New], the `curly-curly` operator

```r
filter(swiss, (Education + Examination) > {{Fertility}})
```

```
             Fertility Agriculture Examination Education Catholic
Neuchatel         64.4        17.6          35        32    16.92
V. De Geneve      35.0         1.2          37        53    42.34
             Infant.Mortality
Neuchatel                  23
V. De Geneve               18
```
]

---

class: nvs2, hide_logo

.flex[
.w-50.mr1[
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">New in-depth blogpost on NSE with <a href="https://twitter.com/hashtag/rlang?src=hash&ref_src=twsrc%5Etfw">#rlang</a>'s quosures:<a href="https://t.co/yJQ81IdhYp">https://t.co/yJQ81IdhYp</a>Jump down the rabbit hole with me to see how they work, and of course, how to implement them in base <a href="https://twitter.com/hashtag/rstats?src=hash&ref_src=twsrc%5Etfw">#rstats</a>.
<a href="https://t.co/wo4GebqONt">pic.twitter.com/wo4GebqONt</a></p>&mdash; BrodieG (@BrodieGaslam) <a href="https://twitter.com/BrodieGaslam/status/1293166631741521922?ref_src=twsrc%5Etfw">August 11, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
]
.w-50.bg-washed-blue.b--blue.ba.bw2.br3.shadow-5.ph3.ml1[
.large[.ybox[`rlang` provides the toolkit]

- Really advanced
- Most of the time, you don't need it
- Already exposed in the main tidyverse packages
- Animation done in R
with [`rayshader`](https://github.com/tylermorganwall/rayshader) by .bold[Brodie Gaslam]
]]
]

---

# `tidyeval`, what to remember

.pull-left[

### Use unquoted names in your own functions

- `Fertility` is passed as a .bold[promise] (it is unknown in the **Global Env**)
- Evaluation is delayed by `enquo()`
- The user decides when to evaluate with `!!`
- Both can be abstracted with `{{}}`

```r
select_head <- function(.data, column, n = 5) {
  .data %>%
    select({{column}}) %>%
    slice_head(n = n)
}
select_head(swiss, column = Fertility, n = 2)
```

```
           Fertility
Courtelary      80.2
Delemont        83.1
```
]

--

.pull-right[

```r
select_head(swiss, column = Fertility, n = 2)
```

```
           Fertility
Courtelary      80.2
Delemont        83.1
```

```r
select_head(swiss, column = c(Fertility, Catholic), n = 2)
```

```
           Fertility Catholic
Courtelary      80.2     9.96
Delemont        83.1    84.84
```

```r
select_head(swiss, column = Fertility:Examination, n = 3)
```

```
             Fertility Agriculture Examination
Courtelary        80.2        17.0          15
Delemont          83.1        45.1           6
Franches-Mnt      92.5        39.7           5
```
]

--

.center.Large[All you need to remember for .bold[unquoted] arguments in functions: use `{{arg}}`]

---

# New column name, the operator `:=`

.large[
- To turn a **quosure** into a **name** that could be pasted
- Must use a specific assignment `:=` (.bold[walrus], from `data.table`)
]

.pull-left[

```r
my_summarise <- function(.data, out_name, expr) {
  summarise(.data,
            {{out_name}} := mean({{expr}})
  )
}
```
]

--

.pull-right[

```r
my_summarise(swiss, m_fer, Fertility)
```

```
     m_fer
1 70.14255
```

```r
my_summarise(swiss, mean_exam, Examination)
```

```
  mean_exam
1  16.48936
```
]

---

class: hide_logo

# Before we stop

.flex[
.w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt1.ml1[
.large[.gbox[You learned to:]]

- Grasp non-standard evaluation
- Use it with `tidyeval`
- Pass arguments as promises, not strings

> I can categorically say if you're pasting strings to program with dplyr, there is always better way.
.tr[[&mdash; _Hadley Wickham_](https://github.com/tidyverse/dplyr/issues/2663#issuecomment-294206288)]
]
.w-40.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt1.ml2[
.large[.bbox[Acknowledgments 🙏 👏]

* Lionel Henry
* Jenny Bryan
* Claus O. Wilke
* Hadley Wickham
]]]

.flex[
.w-60.bg-washed-green.b--green.ba.bw2.br3.shadow-5.ph3.mt2.ml1[
.large[.ybox[Further reading
]]

- [5min talk](https://www.youtube.com/watch?v=nERXS3ssntw) by Hadley Wickham
- [RStudio::conf 2019, Lazy evaluation](https://www.youtube.com/watch?v=2BXPLnLMTYo) by Jenny Bryan
- [1h RStudio webinar](https://www.rstudio.com/resources/webinars/?wvideo=zt9kl921rh) by Lionel Henry
- [RStudio::conf 2017 slides](https://www.r-project.org/dsc/2017/slides/tidyeval-hygienic-fexprs.pdf) by Lionel Henry and Hadley Wickham
- [quasiquotation](https://adv-r.hadley.nz/quasiquotation.html) in Advanced R by Hadley Wickham
- [reduce, map to generate code, lm example](https://adv-r.hadley.nz/quasiquotation.html#map-reduce-to-generate-code) in Advanced R by Hadley Wickham
- [tidyeval resources roundup](http://maraaverick.rbind.io/2017/08/tidyeval-resource-roundup/) by Mara Averick
- [do it by quasiquotation](http://blog.jalsalam.com/posts/2017/quasi-quotation-applications/) by Jameel Alsalam
]
.w-40.pv2.ph3.mt1.ml1[
.huge[.bbox[Thank you for your attention!]]
]
]
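
---

# Appendix: unquoting in base R

.large[
The name clash seen with `quoted_filter_sub()` can also be avoided without `rlang`. What follows is only a minimal sketch, assuming base R alone: `bquote()` with `.()` inlines a value into a quoted expression before it is evaluated, which is roughly the job `!!` does in tidyeval. The helper `filter_above()` and its `threshold` argument are hypothetical names introduced for this illustration.
]

```r
# Base-R sketch (no rlang): bquote() with .() inlines a value into a quoted
# expression, much like `!!`. filter_above() is a hypothetical helper.
filter_above <- function(data, threshold) {
  # .(threshold) is replaced by the value of `threshold` right here,
  # before the expression is evaluated in the data context
  expr <- bquote((Education + Examination) > .(threshold))
  list(rows = data[eval(expr, envir = data), 1:2], expr = expr)
}

Fertility <- 60  # same name as a column of `swiss`
res <- filter_above(swiss, Fertility)
res$expr  # the threshold appears as a value, not as the name Fertility
res$rows  # Neuchatel and V. De Geneve, as with the literal threshold 60
```

Because the value is inlined before `eval()` runs, the `Fertility` column of `swiss` can no longer mask the `Fertility` object of the Global Env. Unlike a quosure, the expression does not carry its environment along, so this only covers the simple case of unquoting a value.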