More on Lists

(Section 9.3)

Application: Splitting

Sub-setting by seat:

Here’s a slow, annoying way:

front <- subset(m111survey, seat == "1_front")
middle <- subset(m111survey, seat == "2_middle")
back <- subset(m111survey, seat == "3_back")

Sub-setting with split()

Here’s a cool way:

bySeat <- split(m111survey, f = m111survey$seat)
str(bySeat)
List of 3
 $ 1_front :'data.frame':   27 obs. of  12 variables:
  ..$ height         : num [1:27] 76 62 70.8 70 65 ...
  ..$ ideal_ht       : num [1:27] 78 65 NA 72 69 62 62 70 68 70 ...
  ..$ sleep          : num [1:27] 9.5 7 10 4 6 7 5 8 7 7 ...
  ..$ fastest        : int [1:27] 119 100 100 85 100 60 80 120 75 90 ...
  ..$ weight_feel    : Factor w/ 3 levels "1_underweight",..: 1 1 3 2 2 3 2 3 3 3 ...
  ..$ love_first     : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 2 2 1 1 ...
  ..$ extra_life     : Factor w/ 2 levels "no","yes": 2 1 1 2 1 1 2 2 1 1 ...
  ..$ seat           : Factor w/ 3 levels "1_front","2_middle",..: 1 1 1 1 1 1 1 1 1 1 ...
  ..$ GPA            : num [1:27] 3.56 3.5 3.1 3.68 2.1 2.5 3.89 3.75 3.4 2.77 ...
  ..$ enough_Sleep   : Factor w/ 2 levels "no","yes": 1 1 2 1 2 2 1 2 2 1 ...
  ..$ sex            : Factor w/ 2 levels "female","male": 2 1 2 2 1 1 1 1 1 2 ...
  ..$ diff.ideal.act.: num [1:27] 2 3 NA 2 4 0 3 5 2 2.25 ...
 $ 2_middle:'data.frame':   32 obs. of  12 variables:
  ..$ height         : num [1:32] 74 64 67 78 69 68 73 68 70 75 ...
  ..$ ideal_ht       : num [1:32] 76 NA 67 75 72 68 75 68 75 78 ...
  ..$ sleep          : num [1:32] 7 9 7 7 7 4.5 8 4 7.5 7 ...
  ..$ fastest        : int [1:32] 110 85 90 80 125 100 120 90 90 143 ...
  ..$ weight_feel    : Factor w/ 3 levels "1_underweight",..: 2 2 3 3 1 1 2 3 2 3 ...
  ..$ love_first     : Factor w/ 2 levels "no","yes": 1 1 1 1 1 2 1 2 2 2 ...
  ..$ extra_life     : Factor w/ 2 levels "no","yes": 2 1 1 1 1 2 2 1 2 1 ...
  ..$ seat           : Factor w/ 3 levels "1_front","2_middle",..: 2 2 2 2 2 2 2 2 2 2 ...
  ..$ GPA            : num [1:32] 2.5 3.8 NA 3.2 3.2 2.2 3.55 3.2 2.8 3.1 ...
  ..$ enough_Sleep   : Factor w/ 2 levels "no","yes": 1 1 2 2 1 1 2 1 2 2 ...
  ..$ sex            : Factor w/ 2 levels "female","male": 2 1 1 1 2 2 2 1 2 2 ...
  ..$ diff.ideal.act.: num [1:32] 2 NA 0 -3 3 0 2 0 5 3 ...
 $ 3_back  :'data.frame':   12 obs. of  12 variables:
  ..$ height         : num [1:12] 72 79 59 73 65 69 72 70.5 69 74 ...
  ..$ ideal_ht       : num [1:12] 72 76 61 77 68 67 90 73 65 76 ...
  ..$ sleep          : num [1:12] 8 6 7 6 7 6 9 7 8 7 ...
  ..$ fastest        : int [1:12] 95 160 90 110 125 145 125 190 110 95 ...
  ..$ weight_feel    : Factor w/ 3 levels "1_underweight",..: 1 2 2 2 3 3 3 2 3 2 ...
  ..$ love_first     : Factor w/ 2 levels "no","yes": 1 1 1 2 2 1 1 1 2 1 ...
  ..$ extra_life     : Factor w/ 2 levels "no","yes": 2 2 2 2 1 1 2 2 2 2 ...
  ..$ seat           : Factor w/ 3 levels "1_front","2_middle",..: 3 3 3 3 3 3 3 3 3 3 ...
  ..$ GPA            : num [1:12] 3.2 2.7 2.8 3.5 3.5 ...
  ..$ enough_Sleep   : Factor w/ 2 levels "no","yes": 1 2 1 1 1 1 1 1 1 2 ...
  ..$ sex            : Factor w/ 2 levels "female","male": 2 2 1 2 1 1 2 2 1 2 ...
  ..$ diff.ideal.act.: num [1:12] 0 -3 2 4 3 -2 18 2.5 -4 2 ...
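
A quick way to check a split without reading the whole str() output is to count the rows in each piece. The following is a minimal sketch, assuming the built-in mtcars data frame as a stand-in for m111survey so that it runs without any extra package; byCyl and rowCounts are names made up for this illustration.

```r
# mtcars ships with R; cyl (number of cylinders) plays the role of seat.
byCyl <- split(mtcars, f = mtcars$cyl)

# One data frame per distinct value of the splitting variable:
names(byCyl)

# Number of rows in each piece:
rowCounts <- sapply(byCyl, nrow)
rowCounts

# The pieces together account for every row of the original frame:
sum(rowCounts) == nrow(mtcars)
```

The same check on bySeat would be sapply(bySeat, nrow).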

One Hitch

This won’t work to get at one of the frames:

bySeat$1_front
## Error: unexpected numeric constant in "bySeat$1"

1_front is not a legal name for a variable (names cannot begin with a digit), so R cannot parse it after $.

No Big Deal

You can do this:

bySeat[["1_front"]]

Or this:

bySeat[[1]]
   height ideal_ht sleep fastest   weight_feel love_first extra_life    seat
1   76.00       78   9.5     119 1_underweight         no        yes 1_front
4   62.00       65   7.0     100 1_underweight         no         no 1_front
6   70.80       NA  10.0     100  3_overweight         no         no 1_front
7   70.00       72   4.0      85 2_about_right         no        yes 1_front
11  65.00       69   6.0     100 2_about_right         no         no 1_front
12  62.00       62   7.0      60  3_overweight         no         no 1_front
13  59.00       62   5.0      80 2_about_right        yes        yes 1_front
19  65.00       70   8.0     120  3_overweight        yes        yes 1_front
21  66.00       68   7.0      75  3_overweight         no         no 1_front
22  67.75       70   7.0      90  3_overweight         no         no 1_front
23  63.00       67   8.5      90  3_overweight         no        yes 1_front
24  66.00       66   7.0     120  3_overweight        yes         no 1_front
26  54.00       54   4.0     130  3_overweight        yes        yes 1_front
27  74.00       75   5.0     119 2_about_right        yes        yes 1_front
28  68.00       66   4.5     112 2_about_right        yes         no 1_front
29  68.00       68   6.0      93 2_about_right        yes        yes 1_front
37  74.00       76   5.0     115  3_overweight        yes         no 1_front
38  63.00       67   7.5     105  3_overweight        yes         no 1_front
43  66.00       68   9.0      91  3_overweight        yes         no 1_front
45  51.00       54   7.0     130 2_about_right         no         no 1_front
55  65.00       75   6.0     130 2_about_right         no        yes 1_front
57  64.00       66   6.0      95  3_overweight        yes         no 1_front
61  63.00       68   7.5      75  3_overweight         no         no 1_front
62  64.00       68   7.5     102  3_overweight         no         no 1_front
67  69.00       67   2.0      85  3_overweight        yes         no 1_front
69  61.00       68   5.0     130 2_about_right         no         no 1_front
71  70.00       73   5.0     110 1_underweight         no         no 1_front
     GPA enough_Sleep    sex diff.ideal.act.
1  3.560           no   male            2.00
4  3.500           no female            3.00
6  3.100          yes   male              NA
7  3.680           no   male            2.00
11 2.100          yes female            4.00
12 2.500          yes female            0.00
13 3.890           no female            3.00
19 3.750          yes female            5.00
21 3.400          yes female            2.00
22 2.770           no   male            2.25
23 3.000          yes female            4.00
24 3.167          yes female            0.00
26 3.413           no female            0.00
27 3.700           no   male            1.00
28 3.500           no female           -2.00
29 3.750           no female            0.00
37 3.900           no   male            2.00
38 3.787          yes female            4.00
43 3.500           no female            2.00
45 2.550           no female            3.00
55 3.000           no   male           10.00
57 3.300           no female            2.00
61 4.000          yes female            5.00
62 3.400           no female            4.00
67 3.500           no female           -2.00
69 3.700           no female            7.00
71 2.700           no   male            3.00
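
A third option, not shown on the slides above, is to wrap the name in backticks, which lets $ accept a name that is not syntactically valid. A minimal sketch with a small made-up list (seats is hypothetical and stands in for bySeat):

```r
# A hypothetical list whose element names begin with a digit:
seats <- list("1_front" = c(76, 62), "2_middle" = c(74, 64))

byName     <- seats[["1_front"]]   # double brackets with the name as a string
byPosition <- seats[[1]]           # double brackets with the position
byBacktick <- seats$`1_front`      # dollar sign with the name in backticks

# All three routes reach the same element:
identical(byName, byPosition)
identical(byName, byBacktick)
```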

Application: Storing Results in a List

Back to the Flowery Meadow

Recall the flowers in the meadow:

flower_colors <- c("blue", "red", "pink", "crimson", "orange")

Our goal: store results of all walks in a list.

Helper Function:

walk_meadow_vec <- function(color, wanted) {
  picking <- TRUE
  ## the following will be extended to hold the flowers picked:
  flowers_picked <- character()
  desired_count <- 0
  while (picking) {
    picked <- sample(flower_colors, size = 1)
    flowers_picked <- c(flowers_picked, picked)
    if (picked == color) desired_count <- desired_count + 1
    if (desired_count == wanted) picking <- FALSE
  }
  ## return the vector of flowers picked:
  flowers_picked
}

Try It Out

walk_meadow_vec("blue", 5)
 [1] "blue"    "crimson" "blue"    "orange"  "red"     "pink"    "crimson"
 [8] "blue"    "blue"    "red"     "orange"  "blue"   

Function to Make the List

all_walk_list <- function(people, favs, numbers) {
  ## initialize a list of the required length:
  lst <- vector(mode = "list", length = length(people))
  for (i in seq_along(people)) {
    fav <- favs[i]
    number <- numbers[i]
    lst[[i]] <- walk_meadow_vec(
      color = fav,
      wanted = number
    )
  }
  ## give names:
  names(lst) <- people
  ## return the list
  lst
}
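
The pattern used inside all_walk_list (create an empty list of the required length, then fill each slot with double brackets) can be tried on its own. A minimal sketch; slots is a made-up name and the values stored are placeholders:

```r
# A list with three slots, each of which starts out as NULL:
slots <- vector(mode = "list", length = 3)
sapply(slots, is.null)

# Fill slot i with a character vector of length i.
# Double brackets assign to the slot itself, whatever the length of the value.
for (i in seq_along(slots)) {
  slots[[i]] <- rep("flower", times = i)
}

# The slots may hold vectors of different lengths:
lengths(slots)
```

This is why a list suits the walks: each person's walk can have a different length.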

Try It Out

set.seed(2020)
all_walk_list(
  people = c("Dorothy", "Toto"),
  favs = c("blue", "orange"),
  numbers = c(4, 2)
)
$Dorothy
 [1] "crimson" "crimson" "blue"    "blue"    "crimson" "red"     "blue"   
 [8] "orange"  "red"     "red"     "orange"  "red"     "pink"    "red"    
[15] "orange"  "crimson" "red"     "crimson" "crimson" "red"     "crimson"
[22] "orange"  "crimson" "crimson" "pink"    "red"     "red"     "pink"   
[29] "orange"  "crimson" "orange"  "orange"  "red"     "orange"  "blue"   

$Toto
[1] "pink"   "orange" "blue"   "orange"
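
Once the walks are stored in a named list, each one can be pulled out by name or summarized. A minimal sketch, assuming a small hand-made list (walks holds made-up flowers, not the output of an actual simulation):

```r
# Hypothetical results, in the same shape that all_walk_list() returns:
walks <- list(
  Dorothy = c("crimson", "blue", "blue"),
  Toto    = c("pink", "orange", "blue", "orange")
)

# One person's walk, by name:
walks$Toto

# How many flowers each person picked:
walkLengths <- lengths(walks)
walkLengths

# How many blue flowers each person picked:
blueCounts <- sapply(walks, function(w) sum(w == "blue"))
blueCounts
```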

Application: Returning Multiple Items

Some Cool Functions

nchar() returns the number of characters in a string:

nchar("Dorothy")
[1] 7

toupper() makes all the letters of a string upper-case:

toupper("Dorothy")
[1] "DOROTHY"

Simple Function # 1

This function takes two strings and returns the number of characters in each:

f1 <- function(str1, str2) {
  numbers <- c(nchar(str1), nchar(str2))
  names(numbers) <- c(str1, str2)
  numbers
}
f1("Dorothy", "Toto")
Dorothy    Toto 
      7       4 

Simple Function # 2

This function takes two strings and returns the upper-case versions of each:

f2 <- function(str1, str2) {
  loudly <- c(toupper(str1), toupper(str2))
  names(loudly) <- c(str1, str2)
  loudly
}
f2("Dorothy", "Toto")
  Dorothy      Toto 
"DOROTHY"    "TOTO" 

Simple Function # 3

We would like a function that takes a string and returns:

  • the number of characters in it, and
  • the upper-case version of it.

Problem

f3 <- function(str) {
  number <- nchar(str)
  loudly <- toupper(str)
  c(number, loudly)
}
f3("Dorothy")
[1] "7"       "DOROTHY"
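
Both elements print in quotes because c() builds an atomic vector, and all elements of an atomic vector must share one type. When a number is combined with a string, the number is converted to a string. A minimal sketch of that coercion (mixed is a made-up name):

```r
# Combining a number and a string yields a character vector:
mixed <- c(7, "DOROTHY")
class(mixed)

# The first element is now the string "7":
mixed[1] == "7"

# It can be converted back, but the caller has to remember to do so:
recovered <- as.numeric(mixed[1])
recovered + 1
```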

"7" is a string, not a number!

Problem

f3 <- function(str) {
  number <- nchar(str)
  loudly <- toupper(str)
  number
  loudly
}
f3("Dorothy")
[1] "DOROTHY"

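We only get the string! A function returns only the value of its last expression, so number is computed and then discarded.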

Problem

f3 <- function(str) {
  number <- nchar(str)
  loudly <- toupper(str)
  loudly
  number
}
f3("Dorothy")
[1] 7

We only get the number!
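
Both attempts fail for the same reason: a function hands back a single value, namely the value of the last expression it evaluates (or the argument of return(), if that is reached first). A minimal sketch with two made-up functions:

```r
lastOnly <- function() {
  "first"    # evaluated, then discarded
  "second"   # last expression: this is what the caller receives
}
resultLast <- lastOnly()
resultLast

earlyExit <- function() {
  return("first")  # return() ends the function immediately
  "second"         # never reached
}
resultEarly <- earlyExit()
resultEarly
```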

Solution

Return a list!

f3 <- function(str) {
  number <- nchar(str)
  loudly <- toupper(str)
  list(numberOfChars = number,
       uppercase = loudly)
}
f3("Dorothy")
$numberOfChars
[1] 7

$uppercase
[1] "DOROTHY"
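
Because a list keeps each element's own type, the caller can pick out the pieces by name and use them directly. A minimal sketch; describe is a hypothetical function with the same shape as f3, defined here so the example runs on its own:

```r
# Hypothetical stand-in for f3: returns two items of different types in a list.
describe <- function(str) {
  list(numberOfChars = nchar(str), uppercase = toupper(str))
}

res <- describe("Toto")

# The count is still a number, so arithmetic works:
res$numberOfChars + 1

# The string is still a string:
res[["uppercase"]]
```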

Understanding Ellipses

What is ...?

The Ellipsis Argument

... stands for “any other arguments you want to add”:

ellipsisDemo <- function(...) {
  cat("I got the following arguments:\n\n")
  print(list(...))
}
ellipsisDemo(x = 3, y = "cat", z = FALSE)
I got the following arguments:

$x
[1] 3

$y
[1] "cat"

$z
[1] FALSE
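
Since list(...) captures whatever was passed, a function can also inspect those arguments instead of printing them. A minimal sketch with two made-up helper functions (argNames and countArgs are hypothetical):

```r
# Return the names under which the extra arguments were supplied:
argNames <- function(...) {
  args <- list(...)
  names(args)
}
suppliedNames <- argNames(x = 3, y = "cat")
suppliedNames

# Count the extra arguments, named or not:
countArgs <- function(...) {
  length(list(...))
}
howMany <- countArgs(1, "a", TRUE)
howMany
```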

Application

Remember the na.rm parameter of functions like mean():

vec <- c(2,5,4,6,NA)
mean(vec)
[1] NA
mean(vec, na.rm = TRUE)
[1] 4.25

A Function to Compute Several Means

manyMeans <- function(vecs = list()) {
  n <- length(vecs)
  if ( n == 0 ) {
    return(cat("Need some vectors to work with!"))
  }
  results <- numeric(n)
  for ( i in 1:n ) {
    results[i] <- mean(vecs[[i]])
  }
  results
}

Try It Out

manyMeans(list(1:5, 6:10))
[1] 3 8
manyMeans(list(1:5, c(6,7,8,9,NA)))
[1]  3 NA

Use na.rm = TRUE

manyMeans <- function(vecs = list()) {
  n <- length(vecs)
  if ( n == 0 ) {
    return(cat("Need some vectors to work with!"))
  }
  results <- numeric(n)
  for ( i in 1:n ) {
    results[i] <- mean(vecs[[i]], na.rm = TRUE)
  }
  results
}
manyMeans(list(1:5, c(6,7,8,9,NA)))
[1] 3.0 7.5

But this decides the option for the user! Can we let the user decide?

Solution: Use Ellipsis Argument

manyMeans <- function(vecs = list(), ...) {
  n <- length(vecs)
  if ( n == 0 ) {
    return(cat("Need some vectors to work with!"))
  }
  results <- numeric(n)
  for ( i in 1:n ) {
    results[i] <- mean(vecs[[i]], ...)
  }
  results
}

Try It Out

manyMeans(list(1:5, c(6,7,8,9,NA)))
[1]  3 NA
manyMeans(list(1:5, c(6,7,8,9,NA)), na.rm = TRUE)
[1] 3.0 7.5
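
The ellipsis forwards any argument that mean() understands, not only na.rm. The sketch below illustrates this with the trim argument of mean(). It is a loop-free variant written with vapply(), which is a swapped-in technique and not the version from the slides; manyMeansSketch is a made-up name.

```r
# vapply() calls mean() on each vector and passes the extra arguments along.
manyMeansSketch <- function(vecs = list(), ...) {
  vapply(vecs, mean, FUN.VALUE = numeric(1), ...)
}

# Forwarding na.rm, as in the example above:
withoutNA <- manyMeansSketch(list(1:5, c(6, 7, 8, 9, NA)), na.rm = TRUE)
withoutNA

# Forwarding trim: drop the lowest and highest 20% of values before averaging,
# so the outlier 100 is ignored and the mean of 2, 3, 4 is taken.
trimmed <- manyMeansSketch(list(c(1, 2, 3, 4, 100)), trim = 0.2)
trimmed
```

One design difference: given an empty list, this variant returns an empty numeric vector and prints no message.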