Contents
- 一、Function basics
- 1、Creating your own function
- 2、Function structure
- 3、Using your created function
- 4、Multiple inputs
- 5、Default inputs
- 二、Return values and side effects
- 1、Returning more than one thing
- 2、Side effects
- Example of side effect: plot
- 三、Environments and design
- 1、Environment: what the function can see and do
- 2、Environment examples
一、Function basics
1、Creating your own function
Call function() to create your own function. Document your function with comments:
# get.wordtab.king: get a word table from King's "I Have A Dream" speech
# Input: none
# Output: word table, i.e., vector with counts as entries and associated
# words as names
get.wordtab.king = function() {
  lines = readLines(
    "https://raw.githubusercontent.com/mxcai/BIOS5801/main/data/king.txt")
  text = paste(lines, collapse=" ")
  words = strsplit(text, split="[[:space:]]|[[:punct:]]")[[1]]
  words = words[words != ""]
  wordtab = table(words)
  return(wordtab)
}
- Input: none
- Output: word table, i.e., vector with counts as entries and associated words as names
Much better: create a word table function that takes the URL of a web page as its input
# get.wordtab.from.url: get a word table from text on the web
# Input:
# - str.url: string, specifying URL of a web page
# Output: word table, i.e., vector with counts as entries and associated
# words as names
get.wordtab.from.url = function(str.url) {
  lines = readLines(str.url)
  text = paste(lines, collapse=" ")
  words = strsplit(text, split="[[:space:]]|[[:punct:]]")[[1]]
  words = words[words != ""]
  wordtab = table(words)
  return(wordtab)
}
- Input:
  - str.url: string, specifying URL of a web page
- Output: word table, i.e., vector with counts as entries and associated words as names
2、Function structure
The structure of a function has three basic parts:
- Inputs (or arguments): within the parentheses of function()
- Body (code that is executed): within the braces {}
- Output (or return value): obtained with the function return()
- (Optional) Comments: a description of the function, written as comments
# get.wordtab.from.url: get a word table from text on the web
# Input:
# - str.url: string, specifying URL of a web page
# Output: word table, i.e., vector with counts as entries and associated
# words as names
get.wordtab.from.url = function(str.url) {
  lines = readLines(str.url)
  text = paste(lines, collapse=" ")
  words = strsplit(text, split="[[:space:]]|[[:punct:]]")[[1]]
  words = words[words != ""]
  wordtab = table(words)
  return(wordtab)
}
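The three parts can also be inspected from within R. The following is a minimal sketch, assuming a hypothetical function add.scaled that is not part of the original example; formals() and body() are base R functions.

```r
# add.scaled: hypothetical helper, used only to illustrate function structure
# Inputs:
# - x, y: numeric vectors
# - w: numeric weight applied to y. Default is 1
# Output: numeric vector x + w*y
add.scaled = function(x, y, w=1) {
  out = x + w*y
  return(out)
}

arg.names = names(formals(add.scaled))  # the inputs (argument names)
fun.body = body(add.scaled)             # the code between the braces
result = add.scaled(1:3, 4:6, w=2)      # the output (return value)

print(arg.names)
print(fun.body)
print(result)
```

Here arg.names holds the argument names, fun.body holds the unevaluated body, and result holds what return() handed back.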
3、Using your created function
Our created functions can be used just like the built-in ones
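# Using our function
king.wordtab.new = get.wordtab.from.url(
  "https://raw.githubusercontent.com/mxcai/BIOS5801/main/data/king.txt")
# Compare against the table from the no-input version defined earlier
king.wordtab = get.wordtab.king()
all(king.wordtab.new == king.wordtab)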
## [1] TRUE
# Revealing our function's definition
get.wordtab.from.url
## function(str.url) {
##   lines = readLines(str.url)
##   text = paste(lines, collapse=" ")
##   words = strsplit(text, split="[[:space:]]|[[:punct:]]")[[1]]
##   words = words[words != ""]
##   wordtab = table(words)
##   return(wordtab)
## }
4、Multiple inputs
Our function can take more than one input
# get.wordtab.from.url: get a word table from text on the web
# Inputs:
# - str.url: string, specifying URL of a web page
# - split: string, specifying what to split on
# Output: word table, i.e., vector with counts as entries and associated
# words as names
get.wordtab.from.url = function(str.url, split) {
  lines = readLines(str.url)
  text = paste(lines, collapse=" ")
  words = strsplit(text, split=split)[[1]]
  words = words[words != ""]
  table(words)
}
- Inputs:
  - str.url: string, specifying URL of a web page
  - split: string, specifying what to split on
- Output: word table, i.e., vector with counts as entries and associated words as names
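With more than one input, arguments can be matched by position or by name. The sketch below assumes a hypothetical helper text2words that works on a string rather than a URL, so that it runs without network access. Like the function above, it has no explicit return(): the value of the last evaluated expression is returned.

```r
# text2words: hypothetical helper, splits a string into words
# Inputs:
# - text: string to be split into words
# - split: string, specifying what to split on
# Output: character vector of non-empty words
text2words = function(text, split) {
  words = strsplit(text, split=split)[[1]]
  words[words != ""]   # last expression, so this is the return value
}

txt = "free at last, free at last"

# Matching by position: first argument is text, second is split
by.position = text2words(txt, "[[:space:]]|[[:punct:]]")

# Matching by name: the order no longer matters
by.name = text2words(split="[[:space:]]|[[:punct:]]", text=txt)

print(identical(by.position, by.name))

# A different split pattern gives different pieces
by.comma = text2words(txt, split=",")
print(by.comma)
```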
5、Default inputs
Our function can also specify default values for the inputs (if the user doesn’t specify an input in the function call, then the default value is used)
# get.wordtab.from.url: get a word table from text on the web
# Inputs:
# - str.url: string, specifying URL of a web page
# - split: string, specifying what to split on. Default is the regex pattern
# "[[:space:]]|[[:punct:]]"
# - convert2lower: Boolean, TRUE if words should be converted to lower case before
# the word table is computed. Default is TRUE
# Output: word table, i.e., vector with counts as entries and associated
# words as names
get.wordtab.from.url = function(str.url, split="[[:space:]]|[[:punct:]]",
                                convert2lower=TRUE) {
  lines = readLines(str.url)
  text = paste(lines, collapse=" ")
  words = strsplit(text, split=split)[[1]]
  words = words[words != ""]
  # Convert to lower case, if we're asked to
  if (convert2lower) words = tolower(words)
  table(words)
}
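To see how defaults behave at the call site, here is a small sketch. It assumes a hypothetical helper wordtab.from.text with the same two defaults as above, but taking a string instead of a URL so that it can run offline.

```r
# wordtab.from.text: hypothetical offline variant of the function above
# Inputs:
# - text: string containing the text
# - split: string, specifying what to split on. Default is the regex pattern
#   "[[:space:]]|[[:punct:]]"
# - convert2lower: Boolean, TRUE if words should be converted to lower case.
#   Default is TRUE
# Output: word table
wordtab.from.text = function(text, split="[[:space:]]|[[:punct:]]",
                             convert2lower=TRUE) {
  words = strsplit(text, split=split)[[1]]
  words = words[words != ""]
  if (convert2lower) words = tolower(words)
  table(words)
}

txt = "Free at last! free at last!"

tab.default = wordtab.from.text(txt)                    # both defaults used
tab.case = wordtab.from.text(txt, convert2lower=FALSE)  # override one default
tab.space = wordtab.from.text(txt, split=" ")           # override the other

print(tab.default)
print(tab.case)
print(tab.space)
```

Naming the overridden argument (convert2lower=FALSE) lets the call skip split and keep its default.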
二、Return values and side effects
1、Returning more than one thing
R doesn't let a function have multiple outputs, but the function can return a list, which (by definition) can contain an arbitrary number of arbitrary objects
- Inputs:
  - str.url: string, specifying URL of a web page
  - split: string, specifying what to split on. Default is the regex pattern "[[:space:]]|[[:punct:]]"
  - convert2lower: Boolean, TRUE if words should be converted to lower case before the word table is computed. Default is TRUE
  - keep.nums: Boolean, TRUE if words containing numbers should be kept in the word table. Default is FALSE
- Output: list, containing word table, and then some basic numeric summaries
get.wordtab.from.url = function(str.url,
                                split="[[:space:]]|[[:punct:]]",
                                convert2lower=TRUE, keep.nums=FALSE) {
  lines = readLines(str.url)
  text = paste(lines, collapse=" ")
  words = strsplit(text, split=split)[[1]]
  words = words[words != ""]
  # Convert to lower case, if we're asked to
  if (convert2lower) {
    words = tolower(words)
  }
  # Get rid of words with numbers, if we're asked to
  if (!keep.nums) {
    words = grep("[0-9]", words, invert=TRUE, value=TRUE)
  }
  # Compute the word table
  wordtab = table(words)
  return(list(wordtab=wordtab,
              number.unique.words=length(wordtab),
              number.total.words=sum(wordtab),
              longest.word=words[which.max(nchar(words))]))
}
# King's "I Have A Dream" speech
king.wordtab = get.wordtab.from.url(
  "https://raw.githubusercontent.com/mxcai/BIOS5801/main/data/king.txt")
lapply(king.wordtab, head)
## $wordtab
## words
##       a    able   again     ago   ahead alabama
##      37       8       2       1       1       3
##
## $number.unique.words
## [1] 528
##
## $number.total.words
## [1] 1631
##
## $longest.word
## [1] "discrimination"
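Once a function returns a list, the caller pulls out individual pieces by name. A minimal sketch, assuming a hypothetical helper word.summary that takes a character vector directly (so no URL is needed):

```r
# word.summary: hypothetical helper returning several results in one list
# Input:
# - words: character vector of words
# Output: list, containing word table and two numeric summaries
word.summary = function(words) {
  wordtab = table(words)
  return(list(wordtab=wordtab,
              number.unique.words=length(wordtab),
              number.total.words=sum(wordtab)))
}

res = word.summary(c("free", "at", "last", "free", "at", "last"))

print(names(res))                    # what the list contains
print(res$number.unique.words)       # extract one element by name with $
print(res[["number.total.words"]])   # or equivalently with [[ ]]
print(res$wordtab["free"])           # the word table is still a table
```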
2、Side effects
A side effect of a function is something that happens as a result of the function’s body, but is not returned. Examples:
- Printing something out to the console
- Plotting something on the display
- Saving an R data file, or a PDF, etc.
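The first kind of side effect in this list, printing, can be sketched with a hypothetical helper report.mean. It prints a message to the console and, separately, returns a value.

```r
# report.mean: hypothetical helper with a printing side effect
# Input:
# - x: numeric vector
# Output: the mean of x (a message is also printed as a side effect)
report.mean = function(x) {
  m = mean(x)
  cat("Mean of", length(x), "values:", m, "\n")  # side effect: printing
  return(m)                                       # return value
}

m = report.mean(c(2, 4, 6))  # the message appears, even though we assign
print(m)                     # the returned value is stored in m
```

The message is not part of the returned value: assigning the result to m captures only the mean.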
Example of side effect: plot
get.wordtab.from.url: get a word table from text on the web
- Inputs:
  - str.url: string, specifying URL of a web page
  - split: string, specifying what to split on. Default is the regex pattern "[[:space:]]|[[:punct:]]"
  - convert2lower: Boolean, TRUE if words should be converted to lower case before the word table is computed. Default is TRUE
  - keep.nums: Boolean, TRUE if words containing numbers should be kept in the word table. Default is FALSE
  - plot.hist: Boolean, TRUE if a histogram of word lengths should be plotted as a side effect. Default is FALSE
- Output: list, containing word table, and then some basic numeric summaries
get.wordtab.from.url = function(str.url, split="[[:space:]]|[[:punct:]]",
                                convert2lower=TRUE, keep.nums=FALSE,
                                plot.hist=FALSE) {
  lines = readLines(str.url)
  text = paste(lines, collapse=" ")
  words = strsplit(text, split=split)[[1]]
  words = words[words != ""]
  # Convert to lower case, if we're asked to
  if (convert2lower) words = tolower(words)
  # Get rid of words with numbers, if we're asked to
  if (!keep.nums)
    words = grep("[0-9]", words, invert=TRUE, value=TRUE)
  # Plot the histogram of the word lengths, if we're asked to
  if (plot.hist)
    hist(nchar(words), col="lightblue", breaks=0:max(nchar(words)),
         xlab="Word length")
  # Compute the word table
  wordtab = table(words)
  return(list(wordtab=wordtab,
              number.unique.words=length(wordtab),
              number.total.words=sum(wordtab),
              longest.word=words[which.max(nchar(words))]))
}
# King's speech
king.wordtab = get.wordtab.from.url(
  str.url="https://raw.githubusercontent.com/mxcai/BIOS5801/main/data/king.txt",
  plot.hist=TRUE)
三、Environments and design
1、Environment: what the function can see and do
- Each function generates its own environment
- Variable names in the function environment override names in the global environment
- The internal environment starts with the named arguments
- Assignments inside the function only change the internal environment
- Variable names undefined in the function are looked for in the environment where the function was defined (the global environment, for functions defined at the top level)
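Two of these points (the environment starts with the named arguments, and assignments are added to it) can be checked with the base R function ls(), which lists the names in the current environment when called without arguments. This is a sketch with a hypothetical function list.locals:

```r
# list.locals: hypothetical helper that records what its own environment holds
# Inputs:
# - a: number
# - b: number. Default is 2
# Output: list with the local names before and after an assignment
list.locals = function(a, b=2) {
  before = ls()      # at the start: only the named arguments
  total = a + b      # a local assignment
  after = ls()       # now the locally assigned names are there too
  return(list(before=before, after=after))
}

res = list.locals(1)
print(res$before)
print(res$after)
```

The names in res$after exist only inside the call; they are discarded when the function returns.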
2、Environment examples
- Variable names here override names in the global environment
  - y is 2 in the global environment
  - y is 10 in the function environment, and only exists while the function is executing
- Variable assignments inside the function environment (generally) do not change the variable in the global environment
  - x remains 1 in the global environment
x <- 1
y <- 2
addone = function(y) {
  x = 1+y
  x
}
addone(10)
## [1] 11
y
## [1] 2
x
## [1] 1
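The "(generally)" qualifier is there because R also has a super-assignment operator, <<-, which assigns to a variable in an enclosing environment rather than in the function's own. A brief sketch contrasting the two, using hypothetical function names:

```r
x = 1

# Ordinary assignment: only the function's own environment changes
set.local = function() {
  x = 100
  x
}

# Super-assignment: modifies x in the enclosing environment instead
set.enclosing = function() {
  x <<- 100
  x
}

set.local()
x.after.local = x        # x is unchanged by the ordinary assignment
print(x.after.local)

set.enclosing()
x.after.enclosing = x    # x has been overwritten by <<-
print(x.after.enclosing)
```

Because it changes state outside the function, <<- is itself a side effect and is best used sparingly.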
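- Variable names undefined in the function are looked for in the environment where the function was defined (here the global environment, and then the attached packages, which is where the built-in pi lives)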
circle.area = function(r) { pi*r^2 }
circle.area(1:3)
## [1] 3.141593 12.566371 28.274334
true.pi = pi # Back up the true value of pi
pi = 3
circle.area(1:3)
## [1] 3 12 27
pi = true.pi # Restore sanity
circle.area(1:3)
## [1] 3.141593 12.566371 28.274334