Chapter 2 Functions
In R, functions are chunks of code encapsulated in a separate environment. There are predefined functions, which are included in R's packages, and self-written functions. Once a package is loaded or a self-written function is passed to the console, the functions are available in the workspace. Self-written functions appear in the workspace overview, while functions from packages are hidden.
2.1 Structure
Functions are written as
<name> <- function(<argument1>,<argument2>,...) {<code>}
For naming, use verbs, as functions do something. This is helpful because in R everything is an object, data as well as functions. Thus, there should be some convention that makes it easier to tell which object is a function and which is a data object (or something else). This can also be achieved by a prefix or suffix like “_fun” (see code style guidelines).
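As a small illustration of the template and the naming convention, the following sketch defines a function with a verb-like name next to a data object with a noun-like name. Both names (compute_range and heights) and the data values are hypothetical and chosen only for this example.

```r
# data object: noun-like name (hypothetical example data)
heights <- c(1.62, 1.75, 1.80, 1.68)

# function object: verb-like name, following
# <name> <- function(<argument1>, ...) {<code>}
compute_range <- function(x) {
  max(x) - min(x)
}

compute_range(heights)      # difference between largest and smallest value
is.function(compute_range)  # TRUE: this object is a function
is.function(heights)        # FALSE: this object is data
```

With such a convention, the kind of object can usually be guessed from its name alone, without calling is.function().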
2.2 Help & search
Each function contained in an R package has a help page, which is accessible via help(<function name>) or ?<function name>, e.g.
help(mean)
?mean
If you don’t know the name of a function, you can search the help files with ??<search phrase>, e.g.
??regression
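Besides the help pages, two base R utilities can be handy when the exact name or the argument list of a function is unclear. The sketch below is only a complement to the commands above; the search pattern and the choice of sd are arbitrary examples.

```r
# list all visible objects whose name starts with "mean"
apropos("^mean")

# show the argument list of a function without opening its help page
args(sd)

# note: help.search("regression") is the long form of ??regression
```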
2.3 Using
Functions are used by passing values to the corresponding arguments. The help pages list the data types required for each argument. E.g.
mean(x = 5) # there are 3 arguments of mean (see help page); x is the argument of which the mean is calculated
mean(y = 5) # error: an argument called y is not defined for mean
mean(x = "test") # the mean of strings cannot be calculated (returns NA with a warning)
mean(5) # if the argument's name is not given, the order of arguments matters
mean(y <- 5) # you can also define an object from within a function call
y # here is y
res <- sqrt(x = y) # results are objects
res2 <- res^2
res2
res3 <- sqrt # functions are objects too
res3(x = 5)
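To make the difference between matching by name and matching by position more tangible, here is a minimal sketch using seq(), whose first arguments are from, to and by. The object names s1 to s4 are hypothetical.

```r
s1 <- seq(from = 1, to = 10, by = 3)  # all arguments named
s2 <- seq(1, 10, 3)                   # same call, matched by position
s3 <- seq(by = 3, to = 10, from = 1)  # named arguments may come in any order
s4 <- seq(3, 10, 1)                   # positions swapped: from = 3, to = 10, by = 1

s1
identical(s1, s2)  # TRUE
identical(s1, s3)  # TRUE
s4                 # a different sequence, because the order changed
```

Naming the arguments makes a call longer, but it protects against silently passing a value to the wrong argument.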
2.4 Writing
When writing functions, the code can use all objects defined as arguments in the function definition and all objects in the workspace. Good programming practice calls for using only objects defined as function arguments. Nonetheless, one can create as many auxiliary objects within the function as needed without cluttering the workspace. A function terminates if
- the last statement of the function body has been evaluated; its value is returned (direct evaluation)
- one of the following functions is called:
  - return(<object>): exits the function and returns <object>
  - stop(<message>): exits the function with an error and prints <message>
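The three ways of leaving a function can be sketched in one small example. The function safe_sqrt is hypothetical and serves only as an illustration; returning NA for negative input is an arbitrary design choice made for this sketch.

```r
# hypothetical helper: square root that refuses invalid input
safe_sqrt <- function(x) {
  if (!is.numeric(x)) {
    stop("x must be numeric")  # aborts the function with an error message
  }
  if (any(x < 0)) {
    return(NA)                 # exits early and returns NA
  }
  sqrt(x)                      # last statement: returned by direct evaluation
}

safe_sqrt(16)
safe_sqrt(-4)
# safe_sqrt("a") would stop with the error message "x must be numeric"
```

Code placed after a return() or stop() call that has been reached is never executed, which is why the checks come first.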
foo <- function(x) {x^2}
foo(5)
foo(x = 5)
# define normal density function
foo2 <- function(x, mu, sigma) {
  # direct evaluation
  1/sqrt(2*pi*sigma^2) * exp(-(x-mu)^2/(2*sigma^2))}
foo2(0,0,1) # value of standard normal density at 0
dnorm(0,0,1) # check by comparing with implemented function
# a bit more complex function
foo3 <- function(x, mu, sigma) {
  # you can define auxiliary objects to be used only within the function
  tmp <- 1/sqrt(2*pi*sigma^2) * exp(-(x-mu)^2/(2*sigma^2))
  return(tmp) # only one object can be returned
}
foo3(0,0,1) # does the same as foo2
foo4 <- function(x, mu, sigma) { # a more elaborate function
  # the operator '<<-' creates an object in the workspace (outside the function environment)
  tmp <<- 1/sqrt(2*pi*sigma^2) * exp(-(x-mu)^2/(2*sigma^2))
}
foo4(0,0,1) # no apparent result, but check your workspace
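The difference between local auxiliary objects and objects created with '<<-' can be checked directly with exists(). The following is a minimal sketch; the function and object names are hypothetical, and the results noted in the comments assume a session in which aux_local and aux_global have not been defined before.

```r
# local assignment: the auxiliary object lives only inside the function
square_locally <- function(x) {
  aux_local <- x^2
  aux_local
}

# '<<-' assignment: the object is created outside the function environment
square_globally <- function(x) {
  aux_global <<- x^2
}

square_locally(3)
exists("aux_local")   # FALSE: the auxiliary object did not reach the workspace
square_globally(3)    # nothing is printed
exists("aux_global")  # TRUE: '<<-' left an object behind
aux_global
```

Because '<<-' changes objects outside the function as a side effect, it is usually preferable to return the result and assign it explicitly in the calling code.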
When you have written many functions, your script might quickly become very large and messy. A workaround is to move functions into separate script files (e.g., “fun_1.R”) and load them into the workspace with source("fun_1.R").
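The mechanism can be tried out without creating a permanent file. In the sketch below, a temporary file stands in for a script such as “fun_1.R”; the function name square and the use of tempfile() are assumptions made only for this illustration.

```r
# write a small function definition to a temporary script file
script_file <- tempfile(fileext = ".R")
writeLines("square <- function(x) {x^2}", script_file)

# load the function into the workspace and use it
source(script_file)
square(4)
```

In a real project, the file would be written by hand in an editor, and the script that uses the functions would start with the corresponding source() calls.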
2.5 Exercises
- Formulate the EOQ formula in R (\(\frac{1}{2} \cdot c_l \cdot q + \frac{d}{q} \cdot c_o\)).
- Write a function that calculates the weighted Euclidean distance between two points.
- Alter your EOQ function so that it checks whether all arguments are supplied and, if not, stops execution while displaying an error message.
- Formulate a function for the density of the Geometric Poisson distribution \(f(n|\lambda,\theta)=\sum_{k=1}^n e^{-\lambda} \cdot \frac{\lambda^k }{k!} \cdot (1-\theta)^{n-k} \cdot \theta^k \cdot \binom{n-1}{k-1}\).
Hint: You can create an integer vector by 1:4.
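To expand on the hint: most arithmetic operations and many functions in R work element-wise on vectors, so a finite sum can often be written without a loop. The following sketch uses an arbitrary toy sum and is not a solution to any of the exercises.

```r
k <- 1:4               # integer vector 1, 2, 3, 4
k^2                    # operations are applied element-wise
sum(k^2)               # 1 + 4 + 9 + 16 = 30
sum(1 / factorial(k))  # finite sum of 1/k! over k = 1, ..., 4
choose(4, 2)           # binomial coefficient "4 choose 2" = 6
```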