Introduction to hand R (learning notes)

Posted by Double_M on Mon, 24 Jan 2022 13:41:57 +0100

0618 15:00

Story: eight green brothers of Slot Machine

https://rstudio-education.github.io/hopr/programs.html

People who have not experienced Vegas will have a little trouble understanding the story. Like myself.
Three different icon combinations will be generated when a machine plays once. If the icon combination meets the winning rules, you will win money. Compared with 21 o'clock and big turntable, baqingge's odds are relatively low, so it is more popular with the villa owner.

9 Programs

Write play ()

– 1 randomly generate three patterns
– 2 calculate bonus

  • function of randomly generating three patterns
get_symbols <- function() {
  wheel <- c("DD", "7", "BBB", "BB", "B", "C", "0")
  sample(wheel, size = 3, replace = TRUE, 
    prob = c(0.03, 0.03, 0.06, 0.1, 0.25, 0.01, 0.52))
}
  • Manitoba slot machines return mechanism
    – winning conditions
    – 1 three identical patterns (except 0)
    – 2 any combination with B
    – 3 at least one C

The details are as follows

9.1 strategy

Split the whole process as far as possible until it is very simple and can be solved with R's own function.

  • 9.1.1 continuous command
play <- function() {

  # step 1: generate symbols
  symbols <- get_symbols()

  # step 2: display the symbols
  print(symbols)

  # step 3: score the symbols
  score(symbols)
}

Get pattern 👉 Display pattern 👉 Rate pattern combinations

  • 9.1.2 parallel instructions
    For example, score () should consider three situations at the same time, namely
    1. Three same patterns, 2 Combined pattern with B, 3 Except for 1 and 2.
    Then Diamonds' situation is the most complex and can be ignored for the time being.

  • Complete process

if ( # Case 1: all the same <1>) {
  prize <- # look up the prize <3>
} else if ( # Case 2: all bars <2> ) {
  prize <- # assign $5 <4>
} else {
  # count cherries <5>
  prize <- # calculate a prize <7>
}

# count diamonds <6>
# double the prize if necessary <8>

Translation of adult words is
1.test are the three patterns the same
2.test do all patterns contain B
3. Consider the reward for each condition
4. Give 5 ¥ including B
5. Calculate the quantity of C
6. Calculate the number of Diamonds (DD)
7. Calculate the reward according to the number of C
8. Calculate Diamonds' bonus

Here comes a piece of honey code

length(unique(symbols)==1)

Break down the composite code
unique() return s the unique item that appears in the vector.
If the symbol contains the same thing, unique(symbols) will become a vector with a length of 1.

That's wrong. Shouldn't it become length(unique(symbols))== 1

> symbols<-c("7",'8','7')
> unique(symbols)
[1] "7" "8"

Or you can write your own functions

same <- symbols[1] == symbols[2] && symbols[2] == symbols[3]

if (same) {
  prize <- # look up the prize
} else if ( # Case 2: all bars ) {
  prize <- # assign $5
} else {
  # count cherries
  prize <- # calculate a prize
}
# count diamonds
# double the prize if necessary
  • all() function
    check every component in the vector. check whether each component of the vector is true or false.
> symbols<-c('B',"B","BBB")
> all(symbols %in% c("B", "BB", "BBB"))
[1] TRUE
  • Set award amount for each condition
if (same) {
  symbol <- symbols[1]
  if (symbol == "DD") {
    prize <- 800
  } else if (symbol == "7") {
    prize <- 80
  } else if (symbol == "BBB") {
    prize <- 40
  } else if (symbol == "BB") {
    prize <- 5
  } else if (symbol == "B") {
    prize <- 10
  } else if (symbol == "C") {
    prize <- 10
  } else if (symbol == "0") {
    prize <- 0
  }
}

Lookup tables

> payouts <- c("DD" = 100, "7" = 80, "BBB" = 40, "BB" = 25, 
+              "B" = 10, "C" = 10, "0" = 0)
> payouts
 DD   7 BBB  BB   B   C   0 
100  80  40  25  10  10   0 
> payouts["DD"]
 DD 
100 
> payouts["B"]
 B 
10 
unname(payouts["DD"])
 100 

Here, payout is a type of lookup table
0619 22:59

same <- symbols[1] == symbols[2] && symbols[2] == symbols[3]
bars <- symbols %in% c("B", "BB", "BBB")

if (same) {
  payouts <- c("DD" = 100, "7" = 80, "BBB" = 40, "BB" = 25, 
    "B" = 10, "C" = 10, "0" = 0)
  prize <- unname(payouts[symbols[1]])
} else if (all(bars)) {
  prize <- # assign $5
} else {
  # count cherries
  prize <- # calculate a prize
}

What does symbols specify?

  • How to calculate how many "C S" there are
> symbols<-c("C","DD","C")
> sum(symbols=="C")
[1] 2
> sum(symbols=="DD")
[1] 1

Just keep writing

same <- symbols[1] == symbols[2] && symbols[2] == symbols[3]
bars <- symbols %in% c("B", "BB", "BBB")

if (same) {
  payouts <- c("DD" = 100, "7" = 80, "BBB" = 40, "BB" = 25, 
    "B" = 10, "C" = 10, "0" = 0)
  prize <- unname(payouts[symbols[1]])
} else if (all(bars)) {
  prize <- 5
} else {
  cherries <- sum(symbols == "C")
  prize <- # calculate a prize
}
diamonds <- sum(symbols == "DD")
# double the prize if necessary
  • Calculate C's reward
    if loops can be used, but they are less efficient
    For example,
if (cherries == 2) {
  prize <- 5
} else if (cherries == 1) {
  prize <- 2
} else {}
  prize <- 0
}

You might as well turn your head

same <- symbols[1] == symbols[2] && symbols[2] == symbols[3]
bars <- symbols %in% c("B", "BB", "BBB")

if (same) {
  payouts <- c("DD" = 100, "7" = 80, "BBB" = 40, "BB" = 25, 
    "B" = 10, "C" = 10, "0" = 0)
  prize <- unname(payouts[symbols[1]])
} else if (all(bars)) {
  prize <- 5
} else {
  cherries <- sum(symbols == "C")
  prize <- c(0, 2, 5)[cherries + 1]
}

diamonds <- sum(symbols == "DD")
# double the prize if necessary
  • The last step is to double the reward when diamond appears, which is consistent with the number of diamond
    that is
prize * 2 ^ diamonds
same <- symbols[1] == symbols[2] && symbols[2] == symbols[3]
bars <- symbols %in% c("B", "BB", "BBB")

if (same) {
  payouts <- c("DD" = 100, "7" = 80, "BBB" = 40, "BB" = 25, 
    "B" = 10, "C" = 10, "0" = 0)
  prize <- unname(payouts[symbols[1]])
} else if (all(bars)) {
  prize <- 5
} else {
  cherries <- sum(symbols == "C")
  prize <- c(0, 2, 5)[cherries + 1]
}

diamonds <- sum(symbols == "DD")
prize * 2 ^ diamonds
  • Sort out the score function
score <- function (symbols) {
  # identify case
  same <- symbols[1] == symbols[2] && symbols[2] == symbols[3]
  bars <- symbols %in% c("B", "BB", "BBB")
  
  # get prize
  if (same) {
    payouts <- c("DD" = 100, "7" = 80, "BBB" = 40, "BB" = 25, 
      "B" = 10, "C" = 10, "0" = 0)
    prize <- unname(payouts[symbols[1]])
  } else if (all(bars)) {
    prize <- 5
  } else {
    cherries <- sum(symbols == "C")
    prize <- c(0, 2, 5)[cherries + 1]
  }
  
  # adjust for diamonds
  diamonds <- sum(symbols == "DD")
  prize * 2 ^ diamonds
}
  • After writing the score function, you can write the complete play function.
play <- function() {
  symbols <- get_symbols()
  print(symbols)
  score(symbols)
}
  • Then you can keep playing...

It should be the problem of Japanese translation. In fact, it's not difficult to lose the original English version. Continue S3 tomorrow

10. S3

0623 18:38

10.2 Attributes

For example, the data frame saves row names and column names as attributes.

You can view attributes through attributes().

row.names(deck)
##  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13"
## [14] "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26"
## [27] "27" "28" "29" "30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
## [40] "40" "41" "42" "43" "44" "45" "46" "47" "48" "49" "50" "51" "52"
or to change an attribute's value:

row.names(deck) <- 101:152

Or give new attributes

levels(deck) <- c("level 1", "level 2", "level 3")

attributes(deck)
## $names
## [1] "face"  "suit"  "value"
## 
## $class
## [1] "data.frame"
## 
## $row.names
##  [1] 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117
## [18] 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134
## [35] 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151
## [52] 152
## 
## $levels
## [1] "level 1" "level 2" "level 3"
  • Exercises
play <- function() {
  symbols <- get_symbols()
  print(symbols)
  score(symbols)
}

Modify play() to return to prize and include related symbols, and delete redundant print(symbols)

  • Solution
    Create a new play function and add a new attribute symbols to prize
play <- function() {
  symbols <- get_symbols()
  prize <- score(symbols)
  attr(prize, "symbols") <- symbols
  prize
}
  • You can also use structure() in one step
play <- function() {
  symbols <- get_symbols()
  structure(score(symbols), symbols = symbols)
}

three_play <- play()
three_play
##  0
##  attr(,"symbols")
##  "0"  "BB" "B"
  • What's the point
    You can output attributes directly
slot_display <- function(prize){

  # extract symbols
  symbols <- attr(prize, "symbols")

  # collapse symbols into single string
  symbols <- paste(symbols, collapse = " ")

  # combine symbol with prize as a character string
  # \n is special escape sequence for a new line (i.e. return or enter)
  string <- paste(symbols, prize, sep = "\n$")

  # display character string in console without quotes
  cat(string)
}

slot_display(one_play)
## B 0 B
## $0

10.3 exclusive (?) function

0624 11:45
For example, print()
print() is not an ordinary function, but an exclusive function. Because you can make print() do different things in different situations.
as follows

num <- 1000000000
print(num)
1000000000
## and a different thing when we gave num a class:

class(num) <- c("POSIXct", "POSIXt")
print(num)
"2001-09-08 19:46:40 CST"

print() will output in different formats according to different attributes of data.

10.4 Methods

> print
function (x, ...) 
UseMethod("print")
<bytecode: 0x103594e68>
<environment: namespace:base>

UseMethod will check the classification of data and determine the output format according to the classification of data. For another example, when you give the data of posixct attribute to print, UseMethod will convert all variables in print into print POSIXct. R will run print POSIXct

print.POSIXct
## function (x, ...) 
## {
##     max.print <- getOption("max.print", 9999L)
##     if (max.print < length(x)) {
##         print(format(x[seq_len(max.print)], usetz = TRUE), ...)
##         cat(" [ reached getOption(\"max.print\") -- omitted", 
##             length(x) - max.print, "entries ]\n")
##     }
##     else print(format(x, usetz = TRUE), ...)
##     invisible(x)
## }
## <bytecode: 0x7fa948f3d008>
## <environment: namespace:base>

Take the example of factor

print.factor
 function (x, quote = FALSE, max.levels = NULL, width = getOption("width"), 
     ...) 
 {
     ord <- is.ordered(x)
     if (length(x) == 0L) 
         cat(if (ord) 
             "ordered"
 ...
        drop <- n > maxl
         cat(if (drop) 
             paste(format(n), ""), T0, paste(if (drop) 
             c(lev[1L:max(1, maxl - 1)], "...", if (maxl > 1) lev[n])
         else lev, collapse = colsep), "\n", sep = "")
     }
     invisible(x)
 }
 <bytecode: 0x7fa94a64d470>
 <environment: namespace:base>

print.POSIXct and print Factor is called print methods. R allocates methods according to the class of the data. print() itself has almost 200 methods.

methods(print)
##   [1] print.acf*                                   
##   [2] print.anova                                  
##   [3] print.aov*                                   
##  ...                      
## [176] print.xgettext*                              
## [177] print.xngettext*                             
## [178] print.xtabs*
##
##   Nonvisible functions are asterisked

To sum up, the combination of generic functions, methods, and class based is called S3 system. Because the system originated from the third edition of s language. Many generic functions of R language are S3, such as summary and head.

Therefore, the output format can be adapted according to the characteristics of S3 system. Just specify a class for the data.

10.4.1 Method Dispatch

The name of each S3 method consists of two parts. The first part is the function to which the method belongs, such as print, summary and head. The second part is the attribute class. For example, print function ,summary.matrix.

  • Exercise 1
    You can assign any class to the data,
> class(one_play) <- "slots"
> print.slots <- function(x, ...) {
+   cat("I'm using the print.slots method")
+ }
> print(one_play)
I'm using the print.slots method
  • Exercise 2
    Modify play(), add slots attribute to it, and display it in the output.
play <- function() {
  symbols <- get_symbols()
  structure(score(symbols), symbols = symbols)
}

After modification

play <- function() {
  symbols <- get_symbols()
  structure(score(symbols), symbols = symbols, class = "slots")
}

10.5 Classes

There are three steps to create a class

  1. Select the name of the class
  2. Assign an attribute to each class
  3. Write the methods corresponding to the new class for generic methods

11. Loop

Story: each slot machine seems to have a return of 42 cents per dollar, but the manufacturer sets the return rate of 92 cents per knife. Let's calculate the return rate of each machine.

  • Calculate baqingge probability
wheel <- c("DD", "7", "BBB", "BB", "B", "C", "0")
combos <- expand.grid(wheel, wheel, wheel, stringsAsFactors = FALSE)
prob <- c("DD" = 0.03, "7" = 0.03, "BBB" = 0.06, "BB" = 0.1, "B" = 0.25, "C" = 0.01, "0" = 0.52)
combos$prob1 <- prob[combos$Var1]
combos$prob2 <- prob[combos$Var2]
combos$prob3 <- prob[combos$Var3]
combos$prob <- combos$prob1 * combos$prob2 * combos$prob3
head(combos,3)
sum(combos$prob)
> head(combos,3)
  Var1 Var2 Var3 prob1 prob2 prob3    prob
1   DD   DD   DD  0.03  0.03  0.03 2.7e-05
2    7   DD   DD  0.03  0.03  0.03 2.7e-05
3  BBB   DD   DD  0.06  0.03  0.03 5.4e-05
> sum(combos$prob)
[1] 1

0625 11:30

for Loop

for( i in c("my", "first","for","loop")){
  print(i)
}
> for( i in c("my", "first","for","loop")){
+   print(i)
+ }
[1] "my"
[1] "first"
[1] "for"
[1] "loop"

In fact, the calculation results in for loop must be saved. Otherwise, the calculation is meaningless. You can write and arrange a short vector or list in advance, and then put the calculation results in it.

chars<-vector(length=4)
words<-c("my","first","for","loop")
for(i in 1:4){
  chars[i]<-words[i]
}
  • Add and calculate prize
for (i in 1:nrow(combos)) {
  symbols <- c(combos[i, 1], combos[i, 2], combos[i, 3])
  combos$prize[i] <- score(symbols)
}
score <- function(symbols) {
  
  diamonds <- sum(symbols == "DD")
  cherries <- sum(symbols == "C")
  
  # identify case
  # since diamonds are wild, only nondiamonds 
  # matter for three of a kind and all bars
  slots <- symbols[symbols != "DD"]
  same <- length(unique(slots)) == 1
  bars <- slots %in% c("B", "BB", "BBB")

  # assign prize
  if (diamonds == 3) {
    prize <- 100
  } else if (same) {
    payouts <- c("7" = 80, "BBB" = 40, "BB" = 25,
      "B" = 10, "C" = 10, "0" = 0)
    prize <- unname(payouts[slots[1]])
  } else if (all(bars)) {
    prize <- 5
  } else if (cherries > 0) {
    # diamonds count as cherries
    # so long as there is one real cherry
    prize <- c(0, 2, 5)[cherries + diamonds + 1]
  } else {
    prize <- 0
  }
  
  # double for each diamond
  prize * 2^diamonds
}

while loop

In practice, while uses less than for.
while will not return the results, so you need to set it manually to save it.

  • example
    Count the number of times you play to bankruptcy
plays_till_broke <- function(start_with) {
  cash <- start_with
  n <- 0
  while (cash > 0) {
    cash <- cash - 1 + play()
    n <- n + 1
  }
  n
}

plays_till_broke(100)
 260

repeat loop

plays_till_broke <- function(start_with) {
  cash <- start_with
  n <- 0
  repeat {
    cash <- cash - 1 + play()
    n <- n + 1
    if (cash <= 0) {
      break
    }
  }
  n
}

plays_till_broke(100)
 237

12 Speed high speed

  • 12.1 vector coding
    Compare the following two pieces of code
abs_loop <- function(vec){
  for (i in 1:length(vec)) {
    if (vec[i] < 0) {
      vec[i] <- -vec[i]
    }
  }
  vec
}
abs_sets <- function(vec){
  negs <- vec < 0
  vec[negs] <- vec[negs] * -1
  vec
}

The second segment is vector coding

  • Create vector code in advance
winnings <- vector(length = 1000000)
for (i in 1:1000000) {
  winnings[i] <- play()
}

mean(winnings)
 0.9366984

The last data is the three skill trees of science

  1. Logistics issues: data storage, operational skills
  2. Tactical questions: the skills of mining information inside data
  3. Strategic issues: skills to summarize conclusions at a larger level

190625 15:09 sleepy

190629 16:26 2 weeks

Need to remember, like reciting the text

Chapter 9

Key syntax function(),sample(),unique(),if loop,unname(),all(),sum()

wheel<-c("DD","7","BBB","BB","B","C","0")
prob=c(0.03, 0.03, 0.06, 0.1, 0.25, 0.01, 0.52))

1. Randomly select three characters, and the occurrence probability of each character is as above
2. If the three characters are the same, give the following to prize

c("DD"=100, "7" = 80, "BBB" = 40, "BB" = 25,
"B" = 10, "C" = 10, "0" = 0)

3. If the three characters are different, but they are one of "B", "BB" and "BBB", give it to me
prize<-5
4. If the three characters are different, but "C" appears, allocate prize according to the number of "C"
1 C=2, 2 C=5
5. If "DD" occurs, it shall be distributed according to the number of "DD"
prize*2^"DD" number

get_symbols<-function(){
  wheel<-c("DD","7","BBB","BB","B","C","0")
  sample(wheel,size=3,replace=TRUE,
         prob=c(0.03, 0.03, 0.06, 0.1, 0.25, 0.01, 0.52))
}

score<-function(symbols){
  bars<- symbols %in% c("B","BB","BBB")
 # %in% logical judgment function
  
  if(length(unique(symbols))==1){
   payouts<-c("DD"=100, "7" = 80, "BBB" = 40, "BB" = 25, 
   "B" = 10, "C" = 10, "0" = 0)
   prize<-unname(payouts[symbols[1]])
# unnamed() does not display the name
  } else if(all(bars)){
# Whether all() is a logical judgment function of TRUE
    prize<-5
  } else {
    cherries<-sum(symbols=="C")
    prize<- c(0,2,5)[cherries+1]
  }
   diamonds<- sum(symbols=="DD")
   prize*2^diamonds
}
   
play<-function(){
  symbols<-get_symbols()
  print(symbols)
  score(symbols)
}

0714

  • Test rate of return
  1. score() function
    – define the number of occurrences of "DD" in diamonds
    – define the number of occurrences of "C" in cherries
    – define characters other than slots "DD"
    – define the same. All characters are the same
    – define whether the bars character is a combination of "B", "BB", "BBB"
    – diamonds: 100, same: payouts < - C ("7" = 80, "BBB" = 40, "BB" = 25, "B" = 10, "C" = 10, "0" = 0), bars: 5, cherries + diamonds: 1 = 2, 2 = 5, others: 0. When diamonds appears, prize* 2^diamonds
    – wheel <- c("DD", "7", "BBB", "BB", "B", "C", "0")
    – prob <- c("DD" = 0.03, "7" = 0.03, "BBB" = 0.06, "BB" = 0.1, "B" = 0.25, "C" = 0.01, "0" = 0.52)
    – calculate the probability of each combination
    – calculate the overall return based on the probability and return of each portfolio
score <- function(symbols){
  diamonds <- sum(symbols == "DD")
  cherries <- sum(symbols == "C")

  slots <- symbols[symbols !="DD"]
  same <- length(unique(slots))==1
  bars <- slots %in% c("B","BB","BBB")
  
  if (diamonds==3){
    prize <- 100
  } else if (same) {
    payouts<- c("7" = 80, "BBB" = 40, "BB" = 25,
                "B" = 10, "C" = 10, "0" = 0)
    prize <- unname(payouts[slots[1]])
  } else if (all(bars)){
    prize <- 5
  } else if (cherries>0) {
    prize<-c(2,5)[cherries+diamonds]
  } else {
    prize <-0
  }
  prize* 2^diamonds
}


wheel <- c("DD", "7", "BBB", "BB", "B", "C", "0")
combos <- expand.grid(wheel, wheel, wheel, stringsAsFactors = FALSE)
prob <- c("DD" = 0.03, "7" = 0.03, "BBB" = 0.06, "BB" = 0.1, "B" = 0.25, "C" = 0.01, "0" = 0.52)
combos$prob1 <- prob[combos$Var1]
combos$prob2 <- prob[combos$Var2]
combos$prob3 <- prob[combos$Var3]
combos$prob  <- combos$prob1 * combos$prob2 * combos$prob3

# for loop 
for( i in 1:nrow(combos)){
  symbols <- c(combos[i,1], combos[i,2], combos[i,3])
  combos$prize[i] <- score(symbols)
}
sum(combos$prize * combos$prob)

Topics: R Language