After struggling to learn how to program with dplyr, I decided it would be best to write a blog post about it. These notes are drawn from this cheatsheet.
Fundamental terminology:
- Symbol: a name that represents a value or object stored in R
- For example, take `a = 3` and `num_missing = sum(is.na(x))`. Here `a` and `num_missing` are symbols.
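To make the idea of a symbol concrete, here is a minimal sketch that builds symbols and checks them with base R's `is.symbol()`. The variable names `s` and `a_sym` are only illustrative.

```r
# Build a symbol from a string with rlang::sym()
s <- rlang::sym("num_missing")

# A bare name captured by rlang::expr() is also a symbol
a_sym <- rlang::expr(a)

is.symbol(s)
#> [1] TRUE
is.symbol(a_sym)
#> [1] TRUE

# The number 3 is a constant, not a symbol
is.symbol(3)
#> [1] FALSE

# A symbol can be turned back into its name
as.character(s)
#> [1] "num_missing"
```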
- Expression: an object that stores quoted code without evaluating it
- Think about it in terms of mathematical equations. In proofs, we often work by manipulating expressions alone, without plugging in values. For example, take the derivation of the least-squares parameter estimates.
In the example below, we capture the expression `a + b`. It doesn’t matter whether the symbols `a` and `b` have been defined beforehand.
c <- rlang::expr(a + b)
c
> a + b
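Because the expression is only stored, it can be evaluated later once the symbols exist. The sketch below assumes we use base R's `eval()` for that step; the name `e` is illustrative (it avoids masking the base function `c`).

```r
# Quote first: a and b do not need to exist yet
e <- rlang::expr(a + b)

# Define the symbols afterwards
a <- 1
b <- 2

# Evaluate the stored expression in the current environment
result <- eval(e)
result
#> [1] 3
```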
- Constant: a bare value
- Bare objects have no class attribute
- The examples below show constants: numeric, character, and logical vectors, as well as lists.
- They also show that a data frame is not bare, because it has a class attribute.
# no attributes
attributes(1)
> NULL
attributes("1")
> NULL
attributes(TRUE)
> NULL
attributes(list(1, 2, 3))
> NULL
# data frame is not bare
# it has attributes
attributes(data.frame(x = c(1,3)))
> $names
> [1] "x"
>
> $class
> [1] "data.frame"
>
> $row.names
> [1] 1 2
- Environment: a context that stores bindings between symbols and objects.
Below, we introduce two new bindings in the global environment. We bind the symbol `a` to the constant `1` and the symbol `b` to the constant `"hello"`. The symbols `a` and `b` now refer to objects in memory within the global environment.
a <- 1
b <- "hello"
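Bindings do not have to live in the global environment. As a rough illustration using only base R functions, the sketch below creates a separate environment (the name `env` is illustrative), adds the same two bindings to it, and evaluates an expression there.

```r
# Create a fresh environment, separate from the global environment
env <- new.env()

# Bind symbols to objects inside it
assign("a", 1, envir = env)
assign("b", "hello", envir = env)

# List the symbols bound in env
ls(env)
#> [1] "a" "b"

# Look up the object bound to b
get("b", envir = env)
#> [1] "hello"

# Evaluate a quoted expression using the bindings in env
in_env <- eval(rlang::expr(a + 1), envir = env)
in_env
#> [1] 2
```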
- Call object: a vector of symbols, constants, and calls that begins with a function name, possibly followed by arguments.
- `sum(1, 2)` is a call object
- `min(c(1, 3, 4))` is a call object
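A call object can be inspected like a list: the first element is the function name and the rest are the arguments. The following is a small sketch, assuming `rlang::expr()` and `rlang::call2()` as two ways to build the same call; `cl` and `cl2` are illustrative names.

```r
# Capture a call without running it
cl <- rlang::expr(sum(1, 2))

is.call(cl)
#> [1] TRUE

# The first element is the function name, stored as a symbol
is.symbol(cl[[1]])
#> [1] TRUE
as.character(cl[[1]])
#> [1] "sum"

# The remaining elements are the arguments
length(as.list(cl)[-1])
#> [1] 2

# Build the same kind of call programmatically, then evaluate both
cl2 <- rlang::call2("sum", 1, 2)
eval(cl)
#> [1] 3
eval(cl2)
#> [1] 3
```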
- Code: a sequence of symbols, constants, and calls that will return a result if evaluated
- `sum(1 + 2)` is code. Notice how it is made up of symbols, constants, and calls
- `3` is the result.
sum(1 + 2)
> [1] 3
- Standard Evaluation: code evaluation is immediate
- Non-Standard Evaluation: code evaluation is delayed
- The code is quoted with intent to evaluate later
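The difference is easiest to see with two small functions. This is a sketch only: `double_se` and `capture_nse` are hypothetical helpers written for this post, and `rlang::enexpr()` is used here as one way to capture the code a user passed as an argument.

```r
# Standard evaluation: the function works with the value of its argument
double_se <- function(x) {
  x * 2
}

# Non-standard evaluation: the function captures the argument's code instead
capture_nse <- function(x) {
  rlang::enexpr(x)
}

a <- 5

double_se(a + 1)
#> [1] 12

# The code a + 1 is stored, not run
quoted <- capture_nse(a + 1)
quoted
#> a + 1

# It can be evaluated later
eval(quoted)
#> [1] 6
```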
- Quotation: the act of storing an expression without evaluating it
rlang::expr(lm(y ~ x))
- `rlang::expr` stores the expression but doesn’t evaluate it
- Unquotation: the act of evaluating an expression
- In rlang, you can use `!!`, `!!!`, or `:=` to unquote.
- `!!` unquotes the symbol or call that follows.
- `!!!` is used with vectors and lists. It unpacks its results as arguments to a call.
- `:=` allows for unquoting of symbols (names) on the left-hand side of an `=` assignment.
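Here is a short sketch of the three operators in action. The variable names (`x`, `args`, `nm`) are illustrative, and `rlang::list2()` is used for the `:=` case simply because it is a small function that accepts unquoted names.

```r
# !! inserts the value of x (itself a quoted call) into a larger expression
x <- rlang::expr(a + b)
with_bang <- rlang::expr(log(!!x))
with_bang
#> log(a + b)

# !!! splices each element of a list in as a separate argument
args <- list(1, 2, 3)
with_splice <- rlang::expr(sum(!!!args))
with_splice
#> sum(1, 2, 3)

# := lets the name on the left-hand side be unquoted
nm <- "total"
named <- rlang::list2(!!nm := 10)
names(named)
#> [1] "total"
```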
- Quasiquotation: the act of quoting some parts of an expression while evaluating other parts and inserting their results into the expression
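As a minimal sketch of quasiquotation, the example below keeps `a` quoted while the value of `b` is evaluated and inserted right away. The names are illustrative.

```r
b <- 10

# a stays quoted; !!b is evaluated now and its value is inserted
e <- rlang::expr(a + !!b)
e
#> a + 10

# Changing b afterwards has no effect: the value 10 is already in the expression
b <- 99
a <- 1
eval(e)
#> [1] 11
```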