Three phase search with dynamic population size heuristic — three_phase_search

This function implements the three phase search algorithm TPSPD for anticlustering by Yang et al. (2022; <doi.org/10.1016/j.ejor.2022.02.003>). The description of their algorithm is given in Section 2 of their paper (in particular, see the Pseudocode in Algorithm 1).

three_phase_search_anticlustering(
  x,
  K,
  N,
  objective = "diversity",
  number_iterations = 50,
  clusters = NULL,
  upper_bound = NULL,
  lower_bound = NULL,
  beta_max = 15,
  theta_max = NULL,
  theta_min = NULL,
  beta_min = NULL,
  eta_max = 3,
  alpha = 0.05
)

Arguments

x: The data input, as in anticlustering.
K: Number of anticlusters to be formed.
N: Number of elements.
objective: The anticlustering objective, can be "diversity" or "dispersion".
number_iterations: A number that defines how many times the steps in the search algorithm are repeated.
clusters: A vector of length K that specifies the number of elements each cluster can contain. If this vector is not NULL, the lower and upper bounds will be disregarded.
upper_bound: Maximum number of elements in each anticluster. By default, anticlusters are of equal size, calculated as the total number of items divided by the number of clusters.
lower_bound: Minimum number of elements in each anticluster. By default, anticlusters are of equal size, calculated as the total number of items divided by the number of clusters.
beta_max: The algorithm begins with a pool of random initial solutions of size beta_max. Over time, the size of the solution pool decreases linearly until it reaches beta_min.
theta_max: Parameter for the strength of undirected perturbation, which decreases linearly over time from theta_max to theta_min.
theta_min: Parameter for the strength of undirected perturbation, which decreases linearly over time from theta_max to theta_min.
beta_min: The minimum solution pool size the algorithm should reach before making a determination.
eta_max: Parameter that specifies how many times the steps in the direct perturbation are executed.
alpha: Parameter for weighing the discrimination of a slightly worse local optimal child solution.

Value

A vector of length N that assigns a group (i.e, a number between 1 and K) to each input element

Details

Details of the implementation of the algorithm can be found in the pseudocode of the paper Yang et al. (2022). However, we performed one change as compared to the original description of the algorithm: Instead of setting a time limit, we define the number of iterations the algorithm performs before terminating (via argument number_iterations).

References

Xiao Yang et al. “A three-phase search approach with dynamic population size for solving the maximally diverse grouping problem”. In: European Journal of Operational Research 302.3 (2022) <doi:10.1016/j.ejor.2022.02.003>

Author

Hannah Hengelbrock Hannah.Hengelbrock@hhu.de, Martin Papenberg martin.papenberg@hhu.de

Examples


# Generate some random data
N <- 120
M <- 5
K <- 4
dat <- matrix(rnorm(N * M), ncol = M)
distances <- dist(dat)

# Perform three hase serach algorithm
result1 <- three_phase_search_anticlustering(dat, K, N)

# Compute objectives function
diversity_objective(distances, result1)
#> [1] 5357.225

# Standard algorithm:
result2 <- anticlustering(distances, K=K, method="local-maximum", repetitions = 10)
diversity_objective(distances, result2)
#> [1] 5356.094