Environments and Scoping

(Section 3.5)

Environments and the Search Path

Finding Out What Names “Mean”

You use an identifier in a program, e.g.:

a + 23

Or maybe:

as.numeric(c("3.2", "5"))
[1] 3.2 5.0

How does R figure out:

  • what value goes with a?
  • what as.numeric() does?
  • what c() does?

Scoping

Scoping

The process by which the computer looks up the object associated with a name in an expression.

Scoping is managed by R’s many environments.

Environment

An object stored in the computer’s memory that keeps track of name-value pairs.

Environment

An environment is like a bag of names.

Actually, it’s a collection of names along with the objects to which they are bound.

In addition, each environment also comes with a link to another environment that is its parent. (But we’ll get to that later.)
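As a minimal sketch of the "bag of names" idea, the following creates a throwaway environment and binds two names in it. The environment `bag` and the names stored in it are made up for illustration; `new.env()`, `assign()`, `get()`, `ls()` and `parent.env()` are base R functions.

```r
# A hypothetical environment, created only for illustration.
bag <- new.env()

# Two ways to bind a name to an object inside that environment.
assign("quadling", "red", envir = bag)
bag$munchkin <- "blue"

ls(bag)                        # the names in the bag: "munchkin" "quadling"
get("quadling", envir = bag)   # the object bound to one name: "red"

# Every environment carries a link to its parent.
parent_of_bag <- parent.env(bag)
```

When this is run at the console, the parent of `bag` is the environment you were working in when you called `new.env()`.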

The Active Environment

Active Environment

The environment that R will consult first in order to find the value of any name in an expression.

At any given moment in an R session, some environment plays the role of the active environment.

The Global Environment

Global Environment

The environment that is active when one is using R from the console.

This is the environment that you are most often “in”, at this early stage in your use of R.
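A quick way to inspect the Global Environment as an object is sketched below. The variable `here` is an illustrative name; `environment()`, `globalenv()` and `environmentName()` are base R functions.

```r
# environment() with no arguments returns the environment in which
# the call is evaluated; typed at the console, that is the Global Environment.
here <- environment()

# The Global Environment can always be retrieved directly.
environmentName(globalenv())          # "R_GlobalEnv"
identical(globalenv(), .GlobalEnv)    # TRUE: two ways to refer to the same object
```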

More About the Global Environment

You can get a list of all the names in the Global Environment:

ls()

You can also remove all of the names from your Global Environment:

rm(list = ls())
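Since rm(list = ls()) wipes out your whole workspace, it may be safer to try the same pattern on a scratch environment first. The sketch below assumes a made-up environment named `scratch`; both `ls()` and `rm()` accept an environment to work on.

```r
# A hypothetical scratch environment, so nothing in the workspace is lost.
scratch <- new.env()
assign("x", 1, envir = scratch)
assign("y", 2, envir = scratch)

ls(scratch)                               # "x" "y"

# Same pattern as rm(list = ls()), but aimed at scratch only.
rm(list = ls(scratch), envir = scratch)

length(ls(scratch))                       # 0: the bag is now empty
```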

Parent Environment

Parent Environment

The second environment (after the active environment) that R will search when it needs to look up a name.

When R needs to look up a name, it will first search the active environment.

If it doesn’t find the name there, then it follows the link to the parent environment.

But that environment has a parent, too, where R can search.

The Search Path

Search Path

The sequence of environments that the computer will consult in order to find an object associated with a name in an expression.

The sequence begins with the active environment, followed by its parent environment, followed by the parent of the parent environment, and so on.
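The chain of parents described above can be followed by hand. The loop below is a sketch that starts at the Global Environment and keeps asking for the parent until it reaches the empty environment, which is the one environment with no parent. The variable names `env` and `path` are illustrative.

```r
env  <- globalenv()
path <- character(0)

repeat {
  # Record the name of the current environment.
  path <- c(path, environmentName(env))
  # The empty environment has no parent, so stop there.
  if (identical(env, emptyenv())) break
  # Otherwise follow the link to the parent.
  env <- parent.env(env)
}

path
```

The result lists the same environments that search() reports, with slightly different labels (for example "R_GlobalEnv" and "base"), followed by "R_EmptyEnv" at the very end.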

Showing Your Search Path

search()
 [1] ".GlobalEnv"        "tools:quarto"      "tools:quarto"     
 [4] "package:stats"     "package:graphics"  "package:grDevices"
 [7] "package:utils"     "package:datasets"  "package:methods"  
[10] "Autoloads"         "package:base"     

An Experiment

Define a variable:

quadlingColor <- "red"

Then use it in some code:

cat(quadlingColor, ", white and blue\n", sep = "")
red, white and blue

Looking Up Stuff

In order to execute the previous command:

cat(quadlingColor, ", white and blue\n", sep = "")

R had to look up two names:

  • quadlingColor
  • cat

Where Did R Find Them?

The find() function can tell us!

find("quadlingColor")
[1] ".GlobalEnv"
find("cat")
[1] "package:base"

  • It found quadlingColor in the first place it looked: the Global Environment.
  • To find cat, R had to go all the way to the end of the Search Path, to package base.
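To see how such a lookup might work step by step, here is a sketch of a hypothetical helper, `where_is()`, that walks the chain of parents and reports the first environment containing a given name. It is not how find() is implemented, and unlike find() it stops at the first match; it only illustrates the search idea using the base R functions `exists()`, `parent.env()` and `emptyenv()`.

```r
# Hypothetical helper: name of the first environment on the chain
# (starting at env) in which `name` is bound, or NA if there is none.
where_is <- function(name, env = globalenv()) {
  while (!identical(env, emptyenv())) {
    # inherits = FALSE: look in this environment only, not in its parents.
    if (exists(name, envir = env, inherits = FALSE)) {
      return(environmentName(env))
    }
    env <- parent.env(env)
  }
  NA_character_
}

where_is("cat")                 # "base", as long as nothing masks cat
where_is("no_such_name_here")   # NA
```

Note that environmentName() labels the base package "base", where find() and search() say "package:base".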

Let’s Experiment

Run this code:

cat <- "Pippin"

Now R knows about two things named cat:

  • the string “Pippin” in the Global Environment
  • the cat() function in package base

What do you think will happen if you run the following command?

cat(cat, "is a cat!\n")

Further Experiments

Execute this code:

cat <- function(...) {
  "Meow!"
}

Now what do you think will happen when you run the following?

cat(cat, "is a cat!\n")

Further Experiments

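We can still get to the old cat() function, but we have to give R a hint where to look. Since cat now names our own function, and base::cat() cannot print a function object, we call our cat() to produce the text to print. Run this code:

base::cat(cat(), "is a cat!\n")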

Best not to hide base::cat(), so let’s remove our cat():

rm(cat)

Function Environments

Review

  • An environment is a collection of names associated with objects.
  • The Global Environment is the environment that is active when we are working from the console.
  • When R needs to look up a name, it consults a search path.
  • When we are in the Global Environment the search path starts there, and continues to:
    • the most recently attached package (the parent environment),
    • the package attached before that (the “grandparent environment”),
    • and so on …
    • … up to package base.
  • The first object of the right type with the given name that is found along the search path is the object that R associates with the name. (When a name is used as a function call, R skips over objects that are not functions.)
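The "right type" rule in the last bullet can be checked with a small sketch. It temporarily binds the name `sum` to a string (a choice made just for this example) and then uses the name in call position; `get()` with `mode = "function"` performs the same type-aware lookup explicitly.

```r
sum <- "just a string"    # masks the name, but not with a function

# In call position R wants a function, so it skips the string
# and keeps searching until it reaches base::sum.
total <- sum(1, 2, 3)
total                     # 6

# The same type-aware lookup, done explicitly.
found <- get("sum", mode = "function")

rm(sum)                   # tidy up: remove our masking binding
```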

Another Function

Try this:

a <- 10
b <- 4
f <- function(x, y) {
  a <- 5
  print(ls())
  cat("a is ", a, "\n",
      "b is ", b, "\n",
      "x is ", x, "\n",
      "y is ", y, "\n", sep = "")
}

Now a, b and f are all names in the Global Environment:

  • a is bound to 10, b is bound to 4
  • f is bound to the function we defined

Calling f()

Now call f():

f(x = 2, y = 3)

What did you get for a, b, x and y? Why?

Run-time Environments

Run-time Environment (also called the “Evaluation Environment”)

A special environment that is created each time a function is called and that is normally discarded when the function finishes executing.

  • It contains the function's arguments as well as any names that are bound inside the body of the function.
  • Its parent is the environment in which the function was defined. (The Global Environment, in our example.)
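The two points above can be observed directly. In this sketch the function `show_env()` and the variables `first` and `second` are made up for illustration; `environment()` called inside a function returns that call's run-time environment, and `environment(show_env)` returns the environment in which the function was defined.

```r
# Hypothetical function that reports on its own run-time environment.
show_env <- function() {
  run_env <- environment()               # this call's run-time environment
  list(run = run_env, parent = parent.env(run_env))
}

first  <- show_env()
second <- show_env()

# The parent is the environment where show_env was defined
# (the Global Environment, if you typed this at the console).
identical(first$parent, environment(show_env))   # TRUE

# Each call gets a fresh run-time environment of its own.
identical(first$run, second$run)                 # FALSE
```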

Important Fact

The parent environment of the run-time environment is the environment in which the function was defined.

When you have defined a function in your Global Environment, then the Global Environment is the parent environment when the function is called!
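One consequence of this fact, sketched below with made-up functions `show_b()` and `caller()`: what matters is where a function was defined, not where it is called from. Even when `show_b()` is called from inside another function that has its own `b`, it still looks in the environment where it was defined.

```r
b <- 4

# Defined at the top level, so its run-time environment's parent
# is the top-level environment, where b is 4.
show_b <- function() b

caller <- function() {
  b <- 100     # local to caller(); not on show_b()'s search path
  show_b()
}

caller()       # 4, not 100
```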

So …

a <- 10
b <- 4
f <- function(x, y) {
  a <- 5
  print(ls())
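  cat("a is ", a, "\n",
      "b is ", b, "\n",
      "x is ", x, "\n",
      "y is ", y, "\n", sep = "")
}
f(x = 2, y = 3)

  • That’s why f() could find b and learn that it was 4: b is not in the run-time environment, so R followed the link to the parent, the Global Environment, and found it there.
  • On the other hand, f() bound a to 5 in its run-time environment, so when R looked up the name a it found that binding first and never reached the Global Environment.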

Practice

Now that you have run f(), what should a be: 10 or 5? Why?

Now check:

a