Chapter 7 Optimization in R
There are a lot of optimization routines implemented in R. Most multi-purpose optimization routines in R are dedicated to continuous optimization as this is most often encountered in statistics. Additionally, also constrained optimization problems appear which are notoriously difficult to solve in their most general form. The following table shows an overview over some popular optimization routines in R. A comprehensive overview can be found here.
type of objective func. | variables | constraints | package | function |
---|---|---|---|---|
general | 1, cont. | none | built-in | optimize() |
general | \(n\), cont. | boxed | built-in | optim() |
general | \(n\), cont. | boxed | optimx |
optimx() |
general | \(n\), cont. | linear | built-in | constrOptim() |
linear | \(n\), cont. | linear | lpSolve |
lp() |
linear | \(n\), cont., int., bin. | linear | Rglpk |
Rglpk_solve_LP() |
linear | \(n\), cont., int., bin. | linear | Rsymphony |
Rsymphony_solve_LP() |
quad. | \(n\), cont. | linear | quadprog |
solve.QP() |
general | \(n\), cont. | general | nloptr |
nloptr() |
general | \(n\), cont. | general | Rsolnp |
solnp() |
7.1 Continuous optimization with optim
For unconstrained (or at most box-constraint) general prupose optimization, R offers the built-in function optim()
which is extended by the optimx()
function. The syntax of both functions is identical: optim(par = <initial parameter>, fn = <obj. function>, method = <opt. routine>)
. The first argument of the function to be optimized must be the vector (or scalar) to be optimized over and should return a scalar (i.e. the objective value). Some optimization routines also allow Inf
or NA
as returned values, but some require finite values always. Additionally, upper and lower values for the parameters can be set by option lower = <vector of lower bounds>
and upper = <vector of upper bounds>
. In this case, method = "L-BFGS-B"
must be selected. By default, optim()
and optimx()
minimize the objective function.
# one-dimensional optimization
<- function(x) 0.01 * x^3 + 2* x^2 + 1 * x + 4 # define a polynomial function
fn.poly optim(par = 1, fn = fn.poly, method = "BFGS") # one result list
optimx(par = 1, fn = fn.poly, method = c("Brent", "CG", "BFGS", "bobyqa", "nlm")) # and comparison of alternative methods
As an example for 2-dimensional optimization, the continuous location planning problem with Manhattan metric is considered:
\[\min\limits_{(u,v) \in \mathbb R^2} \rightarrow \sum_{i \in \mathcal{C} } a_i \cdot (|x_i - u| + |y_i - v|)\]
The weighted Manhattan distance function introduced before also handles vectors. So, we need to adapt this function only slightly:
<- function(loc, a, x, y) sum(a * (abs(x - loc[1]) + abs(y - loc[2]) ) )
loc.mh <- 100
n <- sample(1:100, size = n) # sample weights for each point/customer
a.vec <- rnorm(n) # sample x coordinates
x.vec <- rnorm(n) # sample y coordinates
y.vec <- optim(par = c(0,0), fn = loc.mh, method = "BFGS", a = a.vec, x = x.vec , y= y.vec)
res # optimal location should be close to (0,0)
The corresponding solution can de plotted as follows (sizes of customer points are proportional to their weights \(a_i\)).
library(grDevices) # package for generating color gradients
# generate grid of potential locations
<- expand.grid(seq(-3,3, length.out = 100), seq(-3,3, length.out = 100))
emat # list with grid point coordinates
<- list(x = seq(-3,3, length.out = 100), y = seq(-3,3, length.out = 100))
plist # evaluate objective function at each potential location
<- apply(emat, 1, function(x) loc.mh(x, a = a.vec, x = x.vec, y = y.vec))
z.vec # generate color vector --> assigns a color between green and red depending on the objective value
<- colorRamp(c("green","orange","red"))( (z.vec-min(z.vec))/(max(z.vec)) )
col.vec # recode color vector to RGB code
<- rgb(col.vec[,1],col.vec[,2],col.vec[,3], maxColorValue = 255)
col.vec # create plot
par(mar =c(4,4,0.3,0.3))
plot(emat, xlim=c(-3,3), ylim=c(-3,3), xlab="x", ylab="y", col = col.vec, pch = 15, asp = 1, xaxs = "i", yaxs = "i", cex = 1.5)
points(x.vec, y.vec, pch =20 , cex = log(a.vec)/2)
points(rbind(res$par,res$par), pch =17 , cex = 2, col="blue")
contour(plist, z= matrix(z.vec, ncol=100) , add= TRUE)
legend("topleft", bty="n", legend=c("customer","center"), col=c("black","blue"), pch = c(20,17), bg = "yellow", cex = 1.5)
Exercise 1: Formulate the Steiner-Weber model (i.e., the continuous location planning problem with Euclidean distance metric) and solve it with
optim()
.
Exercise 2: Formulate the Newsvendor model \[Z(q) = \int_0^q{c_o \cdot (q-x) \cdot f(x, \mu, \sigma) \mathrm{d} x } + \int_q^\infty {c_u\cdot (x-q) \cdot f(x, \mu, \sigma) \mathrm{d} x }\] and use
optim()
to find the optimal order quantity. Compare the optimal solution found byoptim()
with the theoretical optimum.
7.2 Mixed-integer linear optimization with GLPK
7.2.1 Generic formulation of MILP models
Mixed-integer linear optimization problems (MILP) are characterized by linear objective functions and constraints w.r.t. the decision variables. However, some or all decision variables are integer and/or binary variables. In general, the canonical form of an MILP can be written as
\[ \min\limits_{ \mathbf{x}, \mathbf{y}, \mathbf{z}} \rightarrow \mathbf{c}_x^T \cdot \mathbf{x} + \mathbf{c}_y^T \cdot \mathbf{y} + \mathbf{c}_z^T \cdot \mathbf{z}\]
\[\begin{align} \mathbf{A}_x \cdot \mathbf{x} + \mathbf{A}_y \cdot \mathbf{y} + \mathbf{A}_z \cdot \mathbf{z} & \leq \mathbf{b} \\ \mathbf{x} & \in \mathbb{R} \\ \mathbf{y} & \in \mathbb{Z} \\ \mathbf{z} & \in \{0,1\} \\ \end{align}\]
Hence, a MILP basically consists of four parts:
- coefficient vector \(c = (c_x, c_y, c_z)\)
- constraint matrix \(A = (A_x,A_y,A_z)\)
- right-hand side vector \(\mathbf{b}\)
- direction of the constraints
- the domain declarations of the decision variables.
These five components are to be specified when a MILP is solved with R. In case, there are only continuous variables (i.e., a linear program), no domain declarations are necessary.
As an example, the dynamic lot-sizing problem or Wagner-Whitin model is considered where \(c_o\) is the ordering cost rate, \(c_h\) is the stock holding cost rate, and \(d_t\) denotes the demand in period \(t\). It is to decide whether to place an order in period \(t\) (i.e., \(y_t\) is 1 or 0), the corresponding order quantity (\(q_t\)), and the stock level \(l_t\). The MILP can be formulated compactly as: \[ \min\limits_{ l_t, q_t, y_t} \rightarrow \sum_{t = 1}^{T} \left(c_o \cdot y_t + c_h \cdot l_t \right)\]
\[\begin{align} l_{t-1} - l_t + q_t & = d_t & t = 1,...,T \\ q_t - M \cdot y_t & \leq 0 & t = 1,...,T \\ l_t & = 0 & t \in \{0,T\}\\ q_t, l_t & \geq 0 & t = 1,...,T\\ y_t & \in \{0,1\} & t = 1,...,T \\ \end{align}\]
7.2.2 Extensive model formulation
For \(T = 3\) periods, the canonical form of the Wagner-Whitin model consists of a vector of decision variables with 10 elements \(\mathbf x = (l_0,...,l_3,q_1,....,q_3,y_1,...,y_3)\). The vector of objective parameters is \(\mathbf c = (\underbrace{c_h,...,c_h}_{4},\underbrace{0,...,0}_{3},\underbrace{c_o,...,c_o}_{3})\). There are \(2 \cdot T + 2 = 8\) constraints. Thus, the right-hand side vector \(\mathbf b = (\underbrace{d_1,...,d_3}_{3},\underbrace{0,...,0}_{5})\). Correspondingly, the constraint matrix \(\mathbf A\) has dimension \(8 \times 10\) and looks as follows:
\[ \mathbf A = \left[ \begin{array}{*{10}{r}} 1 & -1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & -M & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & -M & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & -M \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ \end{array} \right] \] Thus, in R a dynamic lot-sizing problem can be formulated as follows
library(Rglpk) # load solver package
<- 6 # number of periods
n <- 50 # ordering cost rate
co <- 0.1 # holding cost rate
ch <- round(rnorm(n, mean = 100, sd = 20)) # sample demand from normal distribution with mean 100 and sd 20
d.vec <- sum(d.vec) # set big M
bigM <- numeric(3*n + 1) # initialize decision variables
x.vec <- c(d.vec, rep(0, n + 2)) # right-hand side vector
b.vec <- c(rep(ch, n+1), rep(0, n), rep(co, n)) # objective coefficient vector
c.vec <- c(rep("==", n), rep("<=", n) , rep("==", 2)) # vector with constraint directions
const.vec <- c(rep("C",n + 1), rep("C",n), rep("B",n)) # vector with variable types ("C" continuous,"I" integer, "B" binary )
vtype.vec
<- matrix(0, ncol = 3*n + 1, nrow = 2*n+2) # initialize constraint matrix
A.mat # write coefficient in first n rows (inventory constraints)
for(i in 1:n){
<- 1 # coefficint for l_{t-1}
A.mat[i,i] +1] <- -1 # coefficint for l_{t}
A.mat[i,i+ 1 + i] <- 1 # coefficint for q_{t}
A.mat[i,n
}# write coefficient in rows (n+1):(2*n) (ordering constraints)
for(i in (n+1):(2*n)){
+1] <- 1 # coefficint for q_{t}
A.mat[i,i+ 1 + i] <- -bigM # coefficint for y_{t}
A.mat[i,n
}# write coefficient in last two rows (inventory inititialization)
nrow(A.mat)-1,1] <- 1 # coefficint for i_{0}
A.mat[nrow(A.mat),n+1] <- 1 # coefficint for i_{T}
A.mat[
A.mat
# solve MILP
<- Rglpk_solve_LP(obj = c.vec, mat = A.mat, dir = const.vec, rhs = b.vec, types = vtype.vec)
sol # list with solution and demand
list( l = sol$solution[1:(n+1)], # inventory levels
q = sol$solution[(n+2):(2*n+1)], # order quantities
y = tail(sol$solution, n), # order indicators
d = d.vec ) # demand
By default, most solvers assume that decision variables are non-negative. Thus, in the example above,it is not necessary to add \(l_t \geq 0\) and \(q_t \geq 0\) to the constraint matrix \(\mathbf A\). Nonetheless, this should be checked by taking a look at the solver function’s help page.
Partiularly for larger MILPs the constraint matrices quickly become quite large. However, most entries are 0. Therefore, it is convenient to use sparse matrices. Sparse matrices are matrices where only non-zero entries are explicitly defined by there “coordinates” in the matrix (i.e. row and column index) and their value. In R one can use sparseMatrix()
from the Matrix
package to create such a matrix. The syntax to create a sparse matrix is sparseMatrix(i = <vector of row indices>, j = <vector of col. indices>, x = <vector of entries>, dims = <vector matrix dimensions>)
# assign names to decision vector
names(x.vec) <- c(paste("l", 0:n, sep=""), paste("q", 1:n, sep=""), paste("y", 1:n, sep=""))
<- 1 # initialize number of constraints
nb.const <- NULL # initialize constraint data frame
const.mat
# write inventory balance constraints
for(i in 1:n){
<- rep(nb.const, 3) # row index --> in each row, three non-zero entries
tmp.ind.row <- c(i, # l_{i-1}
tmp.ind.col which(names(x.vec) %in% paste("q",i, sep="")), # find index of q_i
+1 # l_i
i
)<- c(1,1,-1) # assign coefficients
tmp.const <- rbind(const.mat, cbind(tmp.ind.row, tmp.ind.col, tmp.const)) # update constraint data frame
const.mat <- 1 + nb.const # update constraint index
nb.const
}
# write purchasing constraints
for(i in 1:n){
<- rep(nb.const, 2) # row index --> in each row, two non-zero entries
tmp.ind.row <- c(which(names(x.vec) %in% paste("q",i, sep="")), # find index of q_i
tmp.ind.col which(names(x.vec) %in% paste("y",i, sep="")) # find index of y_i
)<- c(1,-bigM) # assign coefficients
tmp.const <- rbind(const.mat, cbind(tmp.ind.row, tmp.ind.col, tmp.const)) # update constraint data frame
const.mat <- 1 + nb.const # update constraint index
nb.const
}
# add initialization constraints
<- rbind(const.mat,
const.mat cbind(nb.const , 1 , 1), # l_0 = 0
cbind(nb.const+1, n+1, 1)) # l_T = 0
# write as sparse matrix
<- sparseMatrix(i = const.mat[,1], # vector of row indices
A.mat j = const.mat[,2], # vector of column indices
x = const.mat[,3], # vector of coefficients
dims = c(max(const.mat[,1]) , length(x.vec))) # matrix dimensions
# solve MILP
<- Rglpk_solve_LP(obj = c.vec, mat = A.mat, dir = const.vec, rhs = b.vec, types = vtype.vec)
sol # list with solution and demand --> same result as before
list( l = sol$solution[1:(n+1)], # inventory levels
q = sol$solution[(n+2):(2*n+1)], # order quantities
y = tail(sol$solution, n), # order indicators
d = d.vec ) # demand
Exercise 3: Add an inventory capacity constrint to the Wagner-Whitin model and resolve it with
Rglpk
.
Exercise 4: Solve the Transshipment Problem with
Rglpk
: \[ \min\limits_{x_{ij}} \rightarrow \sum_{i \in \mathcal{V}_a }\sum_{j \in \mathcal{V}_t } c_{ij} \cdot x_{ij} + \sum_{j \in \mathcal{V}_t }\sum_{k \in \mathcal{V}_b } c_{jk} \cdot x_{jk} \] \[\begin{align} \sum_{j \in \mathcal{V}_t} x_{ij} & \leq a_i & \forall i \in \mathcal{V}_a \\ \sum_{j \in \mathcal{V}_t} x_{jk} & \geq b_k & \forall k \in \mathcal{V}_b \\ \sum_{i \in \mathcal{V}_a} x_{ij} & = \sum_{k \in \mathcal{V}_b} x_{jk} & \forall j \in \mathcal{V}_t \\ x_{ij} & \geq 0 \\ \end{align}\] where \(a_i\) is the maximum supply capacity of node \(i \in \mathcal{V}_a\), \(b_k\) denotes the demand of node \(j \in \mathcal{V}_b\), \(c_{ij/jk}\) are transport cost rates and \(x_{ij/jk}\) are shipment quantities.
How would you cope with transshipment capacities at nodes \(j \in \mathcal{V}_t\)?
How would you cope with fixed transshipment costs?
7.3 ROI: R Optimization Infrastructure
There are quite a lot of different optimization problems and associated R function where tailor-made solution algorithms are implemented. Although any optimization problem consists of objective function, variables, and constraints, there are numerous ways to formalize these components for submission to an optimization function in R. The package ROI
attempts to provide a unified framework for seting up and solving generic optimization problems. An extensive description of the package can be found here.
library(ROI) # load package
ROI_available_solvers() # check available solvers
ROI_installed_solvers() # check installed solvers
# install some new solvers
install.packages("ROI.plugin.quadprog")
install.packages("ROI.plugin.symphony")
ROI_installed_solvers() # check again
Commercial solvers for linear optimization problems are typically the most powerful solvers available but sometimes rather difficult to integrate into open-source project as they are proprieary software. Nonetheless, there are some workarounds for integrating Gurobi and CPLEX. Prerequisite is always that solver is already installed on your machine and the corresponding R-API is installed. For Gurobi the necessary API is bundled in an R package which is distributed with the software an zip-archive. Once Gurobi and the corresponding R package is installed, it can be registered and used as an ROI
solver. Note that it might be necessary to install the ROI Gurobi plugin from the sources.
library(gurobi) # load Gurobi package (to be found in the installation directory of Gurobi)
install.packages("ROI.plugin.gurobi", repos="http://R-Forge.R-project.org")
ROI_installed_solvers() # check again
To formulate an optimization problem, the function OP(objective, constraints, types, bounds)
is used whereby the objective and constraint components are generated by creator functions. The types of variables are defined by a character vector of length \(n\) consisting of the characters “C”, “I” or “B” indicating continuous, integer or binary variables, respectively. Corresponsindgly, upper and lower bounds of the variables can be given as a named list consisting of two vectors of length \(n\) and named “lower” and “upper”.
For the objective function these creator functions are defined as follows:
type | creator | description |
---|---|---|
function | F_objective(F,n,G,H,names) |
\(F(\mathbf x)\)…function to optimize, \(\mathbf x\)…vector of decision variables, \(G(\mathbf x)\)…gradient function (\(F'(\mathbf x)\), optional), \(H(\mathbf x)\)…Hessian function (\(F''(\mathbf x)\), optional), names…naming of variables |
quadratic | Q_objective(Q,L,names) |
\(F(\mathbf x)= \frac{1}{2}\cdot \mathbf x^T \cdot \mathbf Q\cdot \mathbf x+ \mathbf x^T\cdot \mathbf l\), \(\mathbf Q\)…matrix (\(n\times n\)) with coefficients of quadratic terms, \(\mathbf l\)… vector of coefficients for linear terms, names…naming of variables |
linear | L_objective(L,names) |
\(F(\mathbf x)= \mathbf x^T\cdot \mathbf l\), \(\mathbf l\)… vector of coefficients for linear terms, names…naming of variables |
The second component of optimization problems are the constraints. In general, the constraints consist of the left-hand side which depends on the decision variables \(x\), the right-hand side vector (\(\mathbf{b}\), rhs
), and the direction vector (dir
), e.g. \(lhs(\mathbf{x})\leq \mathbf{b}\). The left-hand side is created in a similar way as the objective function using the following creators
type | creator | description |
---|---|---|
function | F_constraint(F,dir,rhs,names) |
\(F\)…list of functions with \(F=\left\{f_i(\mathbf x) \right\}\) such that \(f_i(\mathbf x) \leq b_i\) |
quadratic | Q_constraint(Q,L,dir,rhs,names) |
left-hand side: \(\frac{1}{2}\cdot \mathbf x^T \cdot \mathbf Q\cdot \mathbf x+ \mathbf x^T\cdot \mathbf L\) with \(\mathbf Q\)…list of \(n\times n\)-matrices with coefficients of quadratic terms, \(\mathbf L\)…matrix of coefficients for linear terms |
linear | L_constraint(L,dir,rhs,names) |
left-hand side: \(\mathbf x^T\cdot \mathbf L\), \(\mathbf L\)…matrix of linear coefficients |
The following example solves the continuous location planning problem via ROI
:
<- 100
n <- sample(1:100, size = n) # sample weights for each point/customer
a.vec <- rnorm(n) # sample x coordinates
x.vec <- rnorm(n) # sample y coordinates
y.vec
<- OP(
copt objective = F_objective(F = function(loc) sum(a.vec * (abs(x.vec - loc[1]) + abs(y.vec - loc[2]) ) ), n = 2),
types = rep("C", 2),
bounds = V_bound(ub = rep(Inf, 2), lb = rep(-Inf, 2))
)
<- ROI_solve(copt, start = c(0,0)) # solve the problem
copt_sol $solution # obtain solution
copt_sol$objval # objective value copt_sol
To change the problem, an already existing optimization problem can be updated componentwise:
# change objective function - inverted weights 1/a
objective(copt) <- F_objective(F = function(loc) sum( 1/a.vec * (abs(x.vec - loc[1]) + abs(y.vec - loc[2]) ) ), n = 2)
<- ROI_solve(copt, start = c(0,0)) # solve the problem
copt_sol $solution # obtain solution
copt_sol$objval # objective value copt_sol
Exercise 5: Change the problem by adding constraints that prohibit the center to be closer than 0.5 distance units to any point.
Exercise 6: Formulate the Newsvendor model with \(\beta\) service level restriction with the
ROI
framework.
Similarly, the ROI
can be used to solve mixed-integer problems. Consider the following linear assignment problem assigning \(n\) elements to \(n\) rooms in a cost-minimal way:
\[min \rightarrow \sum_{i,j =1}^n x_{ij} \cdot c_{ij}\]
subject to
\[\begin{align}
\sum_{i=1}^n x_{ij} &= 1 & \forall j = 1,...,n\\
\sum_{j=1}^n x_{ij} &= 1 & \forall i = 1,...,n\\
x_{ij} & \in \left\{0,1\right\}
\end{align}\]
In the extensive ILP formulation, the decision vector is defined as \(\mathbf{x} = (x_{11},x_{12},...,x_{1n},x_{21},...,x_{2n},...,x_{nn})\). For a linear assignment problem with 3 rooms/elements, the constraint matrix is constructs as
\[
\mathbf A = \left[
\begin{array}{*{10}{r}}
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
\end{array}
\right]
\]
Thus, a linear assignment problem can be formulated via ROI
as follows:
# create problem dimension
<- 10
n # create cost vector
<- rpois(n^2, 100) # sample costs from Poisson distribution with mean 100
c.vec # create empty constraint matrix
<- matrix(0, ncol = n^2, nrow = 2*n)
L.mat # first n rows
1:n,] <- do.call(cbind, lapply(1:n, function(x) diag(n)))
L.mat[# last n rows
+1):(2*n),] <- t(sapply(1:n, function(x) c(rep(0,(x-1)*n), rep(1,n),rep(0,(n-x)*n) )))
L.mat[(n
# create optimization problem
<- OP(
copt objective = L_objective(L = c.vec),
constraints = L_constraint(L = L.mat, dir = rep("==", 2*n), rhs = rep(1, 2*n)),
types = rep("B", n^2)
)
<- ROI_solve(copt) # solve the problem
copt_sol $solution # obtain solution
copt_sol$objval # objective value
copt_sol
# display solution in matrix form
<- matrix(copt_sol$solution, ncol = n)
sol.mat # check first set of constraints
colSums(sol.mat) # all column sums equal to 1?
# check second set of constraints
rowSums(sol.mat) # all row sums equal to 1?
# check objective
sum(sol.mat * matrix(c.vec, ncol = n))
Exercise 7: Formulate the constraint matrix
L
as a sparse matrix.
Exercise 8: Implement the quadratic assignment problem with
ROI
. Can you reuse parts of the linear assignment problem?
7.4 Algebraic model formulation with ompr
Algebraic model formulations allow one to pass MILPs in math-like way to a solver. Algebraic modelling languages are usually used to formulate large and/or complex optimization problems. Thus, commercial solvers often provide a tailored algebraic programming language like ILOG OPL or A/GMPL. The package ompr
provides a similar access for modelling MILPs in a algebraic way. The package uses the ROI
framework for accessing MILP solvers. The application of the package is illustrated by modelling the well-known travelling salesman problem (TSP).
Given a set of nodes \(\mathcal{N}\) with \(n = |\mathcal{N}|\) and pairwise distances \(d_{ij}\) between any pair of nodes (\(i,j \in \mathcal{N}\)), the TSP is to find the shortest tour which visits all nodes and returns to the origin. The following code snippet generates a set f nodes and displays the nodes
library(ggplot2)
# set number of nodes
<- 10
n # sample node coordinates
<- data.frame(id = 1:n, x = runif(n), y = runif(n))
cities # plot nodes
ggplot(cities, aes(x, y)) + geom_point()
To formulate the TSP as a ILP, various formulations are available. Here, a simple formulation to avoid subtours is proposed: the Miller-Tucker-Zemlin formulation.
\[\min \rightarrow \sum_{i=1}^n \sum_{j=1}^n d_{ij} \cdot x_{ij}\]
\[\begin{align} \sum_{i \in \mathcal{N}: i\neq j} x_{ij} & = 1 & \forall j \in \mathcal{N} \\ \sum_{j \in \mathcal{N}: j\neq i} x_{ij} & = 1 & \forall i \in \mathcal{N} \\ u_j & \geq u_i + 1 - n \cdot (1-x_{ij}) & \forall i,j \in \mathcal{N}, i \neq j, i,j>1 \\ u_1 & = 1 & \\ u_i & \in \left[2,n\right] & \forall i \in \mathcal{N}, i>1\\ x_{ii} & = 0 & \forall i \in \mathcal{N} \\ x_{ij} & \in \left\{0,1\right\} & \forall i,j \in \mathcal{N} \end{align}\]
Setting up the ILP with ompr
uses so-called pipes. Piping is an instrument to simplify code in R. Typically, more complex analyses in R often call multiple function subsequently whereby each function produces intermediate results which are passed (sometimes after some alterations) to the next function, i.e. code of the form \(x \rightarrow f(x) \rightarrow g(f(x))\). In case, there is such a sequence of functions instead of writing g(f(x))
the equivalent pipe is x %>% f %>% g
, e.g. log(sqrt(x=2))
is equivalent to x %>% sqrt %>% log
. The piping operaor is part of dplyr
package, see here for an introduciton and some more advanced piping methods.
For setting up a MILP, the MIPModel()
function is fed by piping constraints, variables, and objective functions using helper functions of the form add_<object>()
and set_<object>()
. The objects containing the data which are referred to in the helper functions should exist in the workspace before the MIPModel()
function is called as this functions creates the extensive model formulation. The helper expressions like add_variable(<expr>, <quant>)
consist of two parts: the expression and a quntifier. Thereby, the expression must be vectorized. I.e., the expression must accept vector inputs and return a vector of the same length. In the TSP case, the access to the distance matrix must be vectorized as follows.
library(ompr)
library(dplyr)
library(ompr.roi)
library(ROI.plugin.glpk)
<- as.matrix(dist(cities[,c("x","y")])) # calculate distances
distance # create vectorized function for accessing distance values
<- function(i, j) {
dist_fun vapply(seq_along(i), function(k) distance[i[k], j[k]], numeric(1L))
}# compare results: retain distance of arcs (1,2), (2,3), and (3,4)
c(1,2,3), c(2,3,4)]
distance[dist_fun(c(1,2,3), c(2,3,4))
Exercise 9: The function for accessing the distances is obtained from here. Try to formulate a more compact
dist_fun()
.
Now, the TSP can be modelled as follows:
<- MIPModel() %>%
model # create decision variable that is 1 iff we travel from node i to j
add_variable(x[i, j], i = 1:n, j = 1:n, type = "integer", lb = 0, ub = 1) %>%
# a helper variable for the MTZ formulation of the tsp
add_variable(u[i], i = 1:n, lb = 1, ub = n) %>%
# minimize travel distance
set_objective(sum_expr(dist_fun(i, j) * x[i, j], i = 1:n, j = 1:n), "min") %>%
# leave each node
add_constraint(sum_expr(x[i, j], j = 1:n) == 1, i = 1:n) %>%
# visit each node
add_constraint(sum_expr(x[i, j], i = 1:n) == 1, j = 1:n) %>%
# ensure no subtours (arc constraints)
add_constraint(u[1] == 1) %>%
add_constraint(u[j] >= u[i] + 1 - n * (1 - x[i, j]), i = 1:n, j = 2:n) %>%
# exclude self-cycles
set_bounds(x[i, i], ub = 0, i = 1:n)
# solve model with GLPK
<- solve_model(model, with_ROI(solver = "glpk", verbose = TRUE))
result
# obtain solution
<- get_solution(result, x[i, j]) %>%
solution filter(value > 0)
Note that the model variables (in the TSPexample x
and u
) should not exist in the workspace before setting up the MILP model. Otherwise, a warning will be produced.
The solution can be displayed as a graph with the corresponding tour.
# use auxiliary u variables to reorder the data frame of nodes
ggplot(cities[c(order(get_solution(result, u[i])$value),1),], aes(x, y)) +
geom_point() +
geom_text(aes(label=id),hjust=0, vjust=0) +
geom_path() +
ggtitle(paste0("Total cost: ", round(objective_value(result), 2)))
Exercise 10: Implement and solve the Wagner-Whithin modell introduced above with
ompr
.