
Application of honey-bee mating optimization algorithm on clustering

  • Mohammad Fathian
  • Babak Amiri
  • Ali Maroosi
http://dx.doi.org/10.1016/j.amc.2007.02.029

Abstract

Cluster analysis is one of the most attractive data mining techniques and is used in many fields. One popular class of data clustering algorithms is the center-based clustering algorithms. K-means is a popular clustering method because of its simplicity and high speed when clustering large datasets. However, K-means has two shortcomings: it depends on the initial state and converges to local optima, and global solutions of large problems cannot be found with a reasonable amount of computational effort. Many clustering studies have tried to overcome this local-optima problem. Over the last decade, modeling the behavior of social insects, such as ants and bees, for the purpose of search and problem solving has been the context of the emerging area of swarm intelligence. Honey-bees are among the most closely studied social insects. Honey-bee mating may also be considered a typical swarm-based approach to optimization, in which the search algorithm is inspired by the process of marriage in real honey-bees. Honey-bees have been used to model agent-based systems. In this paper, we propose an application of honey-bee mating optimization to clustering (HBMK-means). We compare HBMK-means with other heuristic clustering algorithms, such as GA, SA, TS, and ACO, by implementing them on several well-known datasets. Our findings show that the proposed algorithm performs better than the other algorithms.

Keywords

  • Clustering;
  • Meta-heuristic;
  • K-means;
  • Honey-bee

1. Introduction

Data mining, the extraction of interesting and valuable information such as trends, features, or patterns hidden in large data sets, is a multidisciplinary field at the intersection of statistics, machine learning, database management, and data visualization that provides a new perspective on data analysis.

Clustering analysis, which is the subject of active research in several fields such as statistics, pattern recognition, machine learning, and data mining, aims to partition a given set of data or objects into clusters (also called groups or classes). It is applied in a large variety of applications, for example image segmentation, object and character recognition, and document retrieval [17]. There are many methods used in clustering analysis, such as hierarchical clustering, partition-based clustering, density-based clustering, and artificial intelligence-based clustering.

One popular class of data clustering algorithms is the center-based clustering algorithms. K-means is a popular clustering method because of its simplicity and high speed when clustering large datasets [6]. However, K-means has two shortcomings: it depends on the initial state and converges to local optima [20], and global solutions of large problems cannot be found with a reasonable amount of computational effort [7]. Many clustering studies have tried to overcome this local-optima problem.

Maulik and Bandyopadhyay [22] proposed a genetic algorithm-based method to solve the clustering problem and experimented on synthetic and real-life datasets to evaluate its performance. The results showed that the GA-based method may improve the final output of K-means.

Krishna and Murty [11] proposed a novel approach called the genetic K-means algorithm (GKA) for clustering analysis. It defines a basic mutation operator specific to clustering, called distance-based mutation. Using finite Markov chain theory, they proved that GKA converges to the best-known optimum.

In [21] Selim and Al-Sultan discussed the clustering problem, which is usually solved by the K-means algorithm. The problem is known to have local minimum solutions, which are usually what the K-means algorithm obtains. A simulated annealing approach for solving optimization problems is described and proposed for solving the clustering problem. The parameters of the algorithm are discussed in detail, and it is shown that the algorithm converges to a global solution of the clustering problem.

According to [5], the researchers considered a clustering problem in which a given data set is partitioned into a certain number of natural and homogeneous subsets such that each subset is composed of elements similar to one another but different from those of any other subset. For this clustering problem, a heuristic algorithm was developed by combining the tabu search heuristic with two complementary functional procedures, called the packing and releasing procedures. The algorithm was numerically tested for its effectiveness in comparison with reference works, including the tabu search algorithm, the K-means algorithm, and the simulated annealing algorithm.

Over the last decade, modeling the behavior of social insects, such as ants and bees, for the purpose of search and problem solving has been the context of the emerging area of swarm intelligence. Ant colony optimization is a typical successful swarm-based optimization approach, in which the search algorithm is inspired by the behavior of real ants.

Kuo et al. [17] proposed a novel clustering method, the ant K-means (AK) algorithm. The AK algorithm modifies K-means by locating the objects in a cluster with a probability that is updated by the pheromone, while the rule for updating the pheromone is based on the total within-cluster variance (TWCV).

Reference [16] presents an ant colony optimization methodology for optimally clustering N objects into K clusters. The algorithm employs distributed agents that mimic the way real ants find the shortest path from their nest to a food source and back. The authors compared their results with other heuristic clustering algorithms, namely GA, tabu search, and SA, and showed that their algorithm outperforms the others in both solution quality and running time.

Honey-bees are among the most closely studied social insects. Honey-bee mating may also be considered a typical swarm-based approach to optimization, in which the search algorithm is inspired by the process of marriage in real honey-bees. Honey-bees have been used to model agent-based systems [1]. In recent work, Abbass [8] and [9] developed an optimization algorithm based on the honey-bee marriage process.

Bozorg Haddad and Afshar [15] presented an optimization algorithm based on honey-bee mating that was successfully applied to a single-reservoir optimization problem with discrete decision variables. Later, Bozorg Haddad et al. [14] applied the same algorithm to three benchmark mathematical problems.

In [2] the honey-bee mating optimization algorithm is presented and tested on a nonlinear, continuous, constrained problem with continuous decision and state variables, to demonstrate the efficiency of the algorithm in handling single-reservoir operation optimization problems. The authors showed that the performance of the model is quite comparable with the results of well-developed traditional linear programming solvers.

This paper presents an application of the honey-bee mating optimization (HBMO) algorithm to clustering. The paper is organized as follows: Section 2 discusses the cluster analysis problem; Section 3 introduces honey-bee mating in nature and the application of the honey-bee mating algorithm to clustering; Section 4 presents the experimental results of the proposed clustering algorithm in comparison with other heuristic clustering algorithms.

2. Clustering

Data clustering, which is an NP-complete problem of finding groups in heterogeneous data by minimizing some measure of dissimilarity, is one of the fundamental tools in data mining, machine learning, and pattern classification solutions [13]. Clustering in N-dimensional Euclidean space R^N is the process of partitioning a given set of n points into a number, say K, of groups (or clusters) based on some similarity (distance) metric. The most commonly used distance metric in the clustering procedure is the Euclidean distance, which is derived from the Minkowski metric (Eqs. (1) and (2)).

Equation (1):
\[ d(x,y)=\Big(\sum_{i=1}^{m}|x_i-y_i|^{r}\Big)^{1/r} \]

Equation (2):
\[ d(x,y)=\sqrt{\sum_{i=1}^{m}(x_i-y_i)^{2}} \]
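As an illustration only (not part of the original paper), the two metrics can be written as a short Python/NumPy sketch; the function names are ours.

```python
import numpy as np

def minkowski_distance(x, y, r=2):
    """Minkowski distance of order r between two m-dimensional points, Eq. (1)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.sum(np.abs(x - y) ** r) ** (1.0 / r)

def euclidean_distance(x, y):
    """Euclidean distance, i.e. the Minkowski metric with r = 2, Eq. (2)."""
    return minkowski_distance(x, y, r=2)

# Example: distance between two 4-dimensional points (e.g. two iris samples)
print(euclidean_distance([5.1, 3.5, 1.4, 0.2], [6.3, 3.3, 6.0, 2.5]))
```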

In this study, we also use the Euclidean metric as the distance metric. Existing clustering algorithms can be simply classified into two categories: hierarchical clustering and partitional clustering. The most popular class of partitional clustering methods is the center-based clustering algorithms [23].

The K-means algorithm is one of the most widely used center-based clustering algorithms [6].

To find the K centers, the problem is defined as an optimization (minimization) of a performance function, Perf(X, C), defined on both the data items and the center locations. A popular performance function for measuring the goodness of a K-clustering is the total within-cluster variance, or the total mean-square quantization error (MSE), Eq. (3) [23]:

Equation (3):
\[ \mathrm{Perf}(X,C)=\sum_{i=1}^{N}\min\big\{\|x_i-c_l\|^{2}\;\big|\;l=1,\dots,K\big\} \]
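For concreteness, a small Python/NumPy sketch of Eq. (3) is given below; the function name and the array shapes are our assumptions.

```python
import numpy as np

def clustering_performance(X, C):
    """Total within-cluster variance (MSE) of Eq. (3): every point contributes
    the squared Euclidean distance to its nearest cluster center."""
    X = np.asarray(X, dtype=float)   # data points, shape (N, m)
    C = np.asarray(C, dtype=float)   # cluster centers, shape (K, m)
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)  # (N, K) squared distances
    return d2.min(axis=1).sum()
```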

The steps of the K-means algorithm are as follow [22]:

Step 1:

Choose K cluster centers randomly from n points.

Step 2:

Assign each point to its closest cluster center.

Step 3:

Compute new cluster centers.

Step 4:

If the termination criteria are satisfied, stop; otherwise continue from Step 2.

Note that in case the process does not terminate normally at Step 4, it is executed for a maximum fixed number of iterations.
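The four steps above can be summarized in a short Python/NumPy sketch (ours, not the paper's code); the convergence test and the iteration cap are illustrative choices.

```python
import numpy as np

def k_means(X, k, max_iter=100, seed=None):
    """Plain K-means following Steps 1-4 above."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    centers = X[rng.choice(len(X), size=k, replace=False)]          # Step 1: random centers
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):                                       # Step 4: iteration cap
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)                                  # Step 2: assign points
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)]) # Step 3: recompute centers
        if np.allclose(new_centers, centers):                       # Step 4: convergence check
            break
        centers = new_centers
    return centers, labels
```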

3. Application of honey-bee mating optimization algorithm in clustering

3.1. Honey-bee modeling

A honey-bee colony typically consists of a single egg-laying long-lived queen, anywhere from zero to several thousand drones (depending on the season), and usually 10,000–60,000 workers [9]. Queens are specialized in egg laying. A colony may contain one queen or more during its life cycle, in which case it is named a monogynous or a polygynous colony, respectively. Only the queen bee is fed “royal jelly,” a milky-white, jelly-like substance. “Nurse bees” secrete this nourishing food from their glands and feed it to their queen. The diet of royal jelly makes the queen bee bigger than any other bee in the hive. A queen bee may live up to 5 or 6 years, whereas worker bees and drones never live more than 6 months. There are usually several hundred drones that live with the queen and worker bees. Nature has given the drones just one task, which is to provide the queen with sperm. After the mating process, the drones die [10].

Drones are the fathers of the colony. They are haploid and act to amplify their mothers' genome without altering their genetic composition, except through mutation. Therefore, drones are considered agents that propagate one of their mother's gametes and function to enable females to act genetically as males. Workers specialize in brood care and sometimes lay eggs. Broods arise either from fertilized or unfertilized eggs. The former represent potential queens or workers, whereas the latter represent prospective drones [2].

In the marriage process, the queen(s) mate during their mating flights far from the nest. A mating flight starts with a dance performed by the queen, who then flies while the drones follow her and mate with her in the air. In each mating, sperm reaches the spermatheca and accumulates there to form the genetic pool of the colony [19].

Each time a queen lays fertilized eggs, she randomly retrieves a mixture of the sperm accumulated in the spermatheca to fertilize the egg [18].

The queen is pursued by a large swarm of drones (a drone comet) when copulation occurs. Insemination ends with the eventual death of the drone, and the queen receives the “mating sign.” The queen mates multiple times, but the drone, inevitably, mates only once. These features make bee mating the most spectacular mating among insects [2].

The mating flight may be considered as a set of transitions in a state space (the environment), where the queen moves between the different states at some speed and mates with the drone encountered at each state probabilistically. At the start of the flight, the queen is initialized with some energy content and returns to her nest when her energy falls to within some threshold of zero or when her spermatheca is full [2].

In developing the algorithm, the functionality of workers is restricted to brood care; therefore, each worker may be represented as a heuristic that acts to improve and/or take care of a set of broods (i.e., feeding the future queen with royal jelly). A drone mates with a queen probabilistically using an annealing function as follows [9]:

Equation (4):
\[ \mathrm{Prob}(Q,D)=\exp\!\big[-\Delta(f)/S(t)\big], \]
where Prob(Q, D) is the probability of adding the sperm of drone D to the spermatheca of queen Q (that is, the probability of a successful mating); Δ(f) is the absolute difference between the fitness of D (i.e., f(D)) and the fitness of Q (i.e., f(Q)); and S(t) is the speed of the queen at time t. It is apparent that this function acts as an annealing function, where the probability of mating is high when the queen is still at the beginning of her mating flight, and therefore her speed is high, or when the fitness of the drone is as good as the queen's. After each transition in space, the queen's speed and energy decay; the speed is reduced according to the following equation:
Equation (5):
\[ S(t+1)=\alpha(t)\times S(t), \]
where α ∈ [0, 1] is the amount of speed reduction after each transition.
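A minimal sketch of the mating rule of Eq. (4) and the speed decay of Eq. (5) in Python; the decay factor 0.98 is an illustrative value, not one taken from the paper.

```python
import math, random

def mating_probability(f_queen, f_drone, speed):
    """Annealing-style probability of a successful mating, Eq. (4)."""
    return math.exp(-abs(f_queen - f_drone) / speed)

def next_speed(speed, alpha=0.98):
    """Speed decay after each transition of the mating flight, Eq. (5)."""
    return alpha * speed

def drone_accepted(f_queen, f_drone, speed):
    """The drone's sperm is added to the spermatheca if it passes the probabilistic rule."""
    return random.random() < mating_probability(f_queen, f_drone, speed)
```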

Workers, used to improve the brood's genotype, may represent a set of different heuristics. The rate of improvement in the brood's genotype, as a result of applying a heuristic to that brood, defines the heuristic's fitness value.

The fitness of the resulting genotype is determined by evaluating the value of the objective function of the brood genotype and/or its normalized value. It is important to note that a brood has only one genotype.

Thus, an HBMO algorithm may be constructed with the following five main stages:

  • The algorithm starts with the mating flight, where a queen (best solution) selects drones probabilistically to form the spermatheca (list of drones). A drone is then selected from the list at random for the creation of broods.
  • Creation of new broods (cluster centers) by crossing over the drones' genotypes with the queen's.
  • Use of workers (heuristics) to conduct local search on broods (trial solutions).
  • Adaptation of the workers' fitness, based on the amount of improvement achieved on broods.
  • Replacement of weaker queens by fitter broods.

The algorithm starts with three user-defined parameters and one predefined parameter. The predefined parameter is the number of workers (W), representing the number of heuristics encoded in the program. However, the predefined parameter may be used to alter the number of active heuristics if required; that is, the user may choose the first W heuristics, where W is less than or equal to the total number of heuristics encoded in the program. The three user-defined parameters are the number of queens, the queen's spermatheca size representing the maximum number of matings per queen in a single mating flight, and the number of broods that will be born by all queens.

The speed of each queen at the start of each mating flight is initialized at random. A set of queens is then initialized at random. A randomly selected heuristic is then used to improve the genotype of each queen, on the assumption that a queen is usually a good bee. A number of mating flights are undertaken. In each mating flight, all queens fly based on their individual speeds, where the speed is generated at random for each queen before each mating flight commences. At the start of a mating flight, a drone is generated randomly and the queen is positioned over that drone. The transition made by the queen in space is based on her speed, which represents the probability of flipping each gene in the drone's genome. At the start of a mating flight the speed may be high and the queen may make very large steps in space. As the queen's energy decreases, her speed decreases, and as a result the neighborhood covered by the queen shrinks. At each step in the space, the queen mates with the drone encountered at that step using the probabilistic rule in Eq. (4). If the mating is successful (i.e., the drone passes the probabilistic decision rule), the drone's sperm is stored in the queen's spermatheca.

To sum up, the algorithm starts with a mating flight, where a queen selects drones with a predefined probabilistic rule. By crossing over the drones' genotypes with the queen's, new broods (trial solutions) are formed, which can later be improved by employing workers to conduct local search.

When all queens complete their mating flights, they start breeding. For the required number of broods, a queen is selected in proportion to her fitness and mated with a randomly selected sperm from her spermatheca. A worker is chosen in proportion to its fitness to improve the resultant brood. After all broods have been generated, they are sorted according to their fitness. The best brood replaces the worst queen until there is no brood that is better than any of the queens. The remaining broods are then killed, and a new mating flight begins, until all assigned mating flights are completed or the convergence criteria are met. The main steps of the HBMO algorithm are presented in Fig. 1. In addition, a full-scale computational flowchart is illustrated in Fig. 2.
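To make the flow above concrete, the following is a minimal Python sketch of a single-queen HBMO clustering loop. It is not the authors' MATLAB implementation; the parameter values (number of drones, spermatheca size, number of broods, number of flights, decay factor) are illustrative assumptions, and the worker step is reduced to a single mutation.

```python
import numpy as np

def hbmo_clustering(X, k, n_drones=50, spermatheca_size=20, n_broods=30,
                    n_flights=50, alpha=0.98, seed=None):
    """High-level sketch of the HBMO clustering loop (single queen)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)

    def random_solution():          # a chromosome = K cluster centers drawn from the data
        return X[rng.choice(len(X), size=k, replace=False)].copy()

    def fitness(C):                 # lower Perf(X, C) of Eq. (3) means a fitter solution
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        return d2.min(axis=1).sum()

    queen = min((random_solution() for _ in range(n_drones)), key=fitness)

    for _ in range(n_flights):      # one mating flight per iteration
        speed, spermatheca = rng.uniform(0.5, 1.0), []
        while speed > 1e-3 and len(spermatheca) < spermatheca_size:
            drone = random_solution()
            prob = np.exp(-abs(fitness(drone) - fitness(queen)) / speed)   # Eq. (4)
            if rng.random() < prob:
                spermatheca.append(drone)
            speed *= alpha                                                 # Eq. (5)
        for _ in range(n_broods):   # breeding: crossover plus a worker (mutation) step
            if not spermatheca:
                break
            drone = spermatheca[rng.integers(len(spermatheca))]
            brood = queen + rng.random() * (drone - queen)                 # intermediate crossover
            j = rng.integers(k)                                            # mutate one center
            brood[j] += 2 * rng.random() * rng.choice([-1.0, 1.0]) * brood[j]
            if fitness(brood) < fitness(queen):                            # fitter brood replaces queen
                queen = brood
    return queen                    # K cluster centers of the best solution found
```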

Fig. 1. The HBMO algorithm [9].

Fig. 2. HBM algorithm representation.

3.2. Application of HBM algorithm in clustering

The search capability of the HBM algorithm is used in this article to determine a fixed number K of cluster centers in R^N, thereby suitably clustering the set of n unlabelled points. The clustering metric adopted is the sum of the Euclidean distances of the points from their respective cluster centers. The steps of the proposed algorithm, shown in Fig. 2, are now described in detail.

Step 1:

String representation

A chromosome is used to represent a candidate solution to the problem, where each gene in the chromosome represents a parameter of the candidate solution. In this study, a chromosome is regarded as a set of K initial cluster centers, and each gene is a cluster-center dimension. Specifically, a chromosome can be represented as C = [c1 … cj … cK], where cj is the jth gene and K is the total number of genes. Fig. 3 illustrates a chromosome-encoding example with C1 = (2, 5, 1), C2 = (6, 3, 2), C3 = (5, 7, 4).

Fig. 3. A chromosome-encoding example.
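For illustration, the Fig. 3 encoding can be reproduced with a few lines of NumPy; the flatten/reshape convention below is our assumption about how the K centers map to genes.

```python
import numpy as np

# K = 3 centers in 3 dimensions, as in the Fig. 3 example
centers = np.array([[2, 5, 1],    # C1
                    [6, 3, 2],    # C2
                    [5, 7, 4]])   # C3
chromosome = centers.ravel()              # encode: [2 5 1 6 3 2 5 7 4]
decoded = chromosome.reshape(3, -1)       # decode back to the K cluster centers
assert np.array_equal(decoded, centers)
```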

Step 2:

Define the model input parameters

The algorithm starts with three user-defined parameters and one predefined parameter. The predefined parameter is the number of workers (W), representing the number of heuristics encoded in the program. However, the predefined parameter may be used to alter the number of active heuristics if required; that is, the user may choose the first W heuristics, where W is less than or equal to the total number of heuristics encoded in the program.

The user-defined parameters are the number of queens, the queen's spermatheca size, representing the maximum number of matings per queen in a single mating flight, and the number of broods that will be born by all queens. The speed of each queen at the start of each mating flight is initialized at random.

Step 3:

Random generation of a set of initial solutions

In this stage, a set of initial cluster centers is generated randomly from the dataset points. Each solution represents K cluster centers, as shown in Fig. 1.

Step 4:

Selection of queen

After generating the set of initial solutions described in the previous stage, rank the initial solutions based on the performance function and set the best one as the queen.

Step 5:

Mating flight

Use simulated annealing to select a set of solutions from the search space to form a mating pool for possible information exchange between the best present solution (the queen) and the selected trial solutions.

Step 6:

Breeding process

Generate a new set of solutions by employing predefined crossover operators and heuristic functions between the present solutions and the trial solutions, according to their fitness values. In this study, we adopt intermediate crossover, which creates children by taking a weighted average of the parents. The weights can be specified by a single parameter, Ratio, which can be a scalar or a row vector whose length equals the number of variables; the default is a vector of all 1's. The child is created from parent1 and parent2 using the following formula:

Equation (6):
\[ \mathrm{child}=\mathrm{parent}_1+\mathrm{rand}\times\mathrm{Ratio}\times(\mathrm{parent}_2-\mathrm{parent}_1). \]

If all the entries of Ratio lie in the range [0, 1], the children produced are within the hypercube defined by placing the parents at opposite vertices. If Ratio is not in that range, the children might lie outside the hypercube. If Ratio is a scalar, then all the children lie on the line between the parents.
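A hedged Python sketch of Eq. (6) follows; we draw a single random factor per crossover so that, with a scalar Ratio, the child lies on the line between the parents, as stated above. The function name and the per-call (rather than per-gene) random draw are our choices.

```python
import numpy as np

def intermediate_crossover(parent1, parent2, ratio=1.0, seed=None):
    """Intermediate crossover of Eq. (6):
    child = parent1 + rand * Ratio * (parent2 - parent1)."""
    rng = np.random.default_rng(seed)
    parent1 = np.asarray(parent1, dtype=float)
    parent2 = np.asarray(parent2, dtype=float)
    rand = rng.random()   # a single random factor in [0, 1]
    return parent1 + rand * ratio * (parent2 - parent1)

# With a scalar Ratio in [0, 1], the child lies on the segment between the parents
child = intermediate_crossover([2.0, 5.0, 1.0], [6.0, 3.0, 2.0], ratio=1.0, seed=0)
```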

Step 7:

Feeding the selected broods and the queen with royal jelly by workers

Improve the newly generated set of solutions by employing different heuristic functions and mutation operators according to their fitness values. For a binary representation of chromosomes, a bit position is mutated by simply flipping its value. Since we are considering a floating-point representation in this article, we use the following mutation. A number δ in the range [0, 1] is generated with uniform distribution. If the value at a gene position is v, after mutation it becomes:

\[ v' = \begin{cases} v \pm 2\delta v, & v \neq 0,\\ v \pm 2\delta, & v = 0. \end{cases} \]

The + or − sign occurs with equal probability. Note that we could have implemented the mutation as

\[ v \pm \delta v. \]

However, one problem with this form is that if the values at a particular position in all the chromosomes of a population become positive (or negative), then we would never be able to generate a new chromosome having a negative (or positive) value at that position. In order to overcome this limitation, we have incorporated a factor of 2 while implementing the mutation. Other forms like

\[ v \pm (\delta + \varepsilon)\, v, \]
where 0 < ε < 1, would also have satisfied our purpose.
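The mutation rule of Step 7 can be sketched in Python as follows; the function name is ours, and δ is drawn uniformly from [0, 1] as described above.

```python
import numpy as np

def mutate_gene(v, seed=None):
    """Floating-point mutation of Step 7: v -> v ± 2*delta*v (or v ± 2*delta if v == 0),
    with delta uniform in [0, 1] and the sign chosen with equal probability."""
    rng = np.random.default_rng(seed)
    delta = rng.random()
    sign = rng.choice([-1.0, 1.0])
    return v + sign * 2 * delta * (v if v != 0 else 1.0)
```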

Step 8:

If the new best solution is better than the queen, replace the queen with it.

Step 9:

Check the termination criteria

If the termination criteria are satisfied, finish the algorithm; otherwise, discard all previous trial solutions and generate new trial solutions, then go to Step 5, until all assigned iterations (mating flights) are completed or the convergence criteria are met.

The pseudo-code for the application of the HBM algorithm in clustering is illustrated in Fig. 4.

Fig. 4. Pseudo-code for the HBM clustering algorithm.

4. Experimental result

In this section, we present a set of experiments that shows the effectiveness of our algorithm. The experiments were run on a Pentium IV, 2.8 GHz, 512 MB RAM computer, and the algorithms were coded in Matlab 7.0. We ran all five algorithms on three different datasets. The datasets are the well-known iris, wine, and breast-cancer datasets taken from the UCI Machine Learning Repository [4].

DataSet1: This is the Iris data set, which is perhaps the best-known database to be found in the pattern recognition literature. Fisher's paper is a classic in the field and is referenced frequently to this day. The data set contains three classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other two; the latter are not linearly separable from each other. There are 150 instances with four numeric attributes in the iris data set. There are no missing attribute values. The attributes of the iris data set are: sepal length in cm, sepal width in cm, petal length in cm, and petal width in cm.

DataSet2: This is the wine data set, which is also taken from the UCI repository. These data are the results of a chemical analysis of wines grown in the same region of Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines. There are 178 instances with 13 numeric attributes in the wine data set. All attributes are continuous. There are no missing attribute values.

DataSet3: This is the Wisconsin Breast Cancer Database (January 8, 1991), which is also taken from the UCI repository. There are 699 instances with 10 numeric attributes in the data set. Attributes 2 through 10 have been used to represent instances. Each instance has one of two possible classes: benign or malignant. There are 16 missing attribute values in the data set.

To evaluate the performance of the HBM algorithm in clustering, we have compared it with several typical stochastic algorithms, including the ACO algorithm [16], the simulated annealing approach [18], the genetic algorithm [3], and the tabu search approach [12]. The effectiveness of stochastic algorithms is greatly dependent on the generation of initial solutions. Therefore, for every dataset, each algorithm was run 10 times, each time with randomly generated initial solutions.

The comparison of results for each dataset is based on the best solution found in 10 distinct runs of each algorithm, the average number of function evaluations required, and the processing time taken to converge to the best solution.

The solution quality is also given in terms of the average and worst values of the clustering metric (Faverage and Fworst, respectively) after 10 different runs for each of the five algorithms. Table 1, Table 2 and Table 3 show these results.

Table 1. Results obtained by the five algorithms for 10 different runs on dataset 1.

Method   Fbest         Faverage      Fworst        Function evaluations   CPU time (s)
HBM      96.752047     96.95316      97.757625     11214                  35.25
ACO      97.100777     97.171546     97.808466     10998                  33.72
GA       113.986503    125.197025    139.778272    38128                  105.53
TS       97.365977     97.868008     98.569485     20201                  72.86
SA       97.100777     97.134625     97.263845     29103                  95.92