Cluster analysis (in marketing)
Cluster analysis is a class of statistical techniques that can be applied to data that exhibit 'natural' groupings. It sorts through the raw data and groups the cases into clusters. A cluster is a group of relatively homogeneous cases or observations: objects within a cluster are similar to each other and dissimilar to objects outside it, particularly objects in other clusters.
The diagram below illustrates the results of a survey that studied drinkers' perceptions of spirits (alcohol). Each point represents the results from one respondent. The research indicates there are four clusters in this market.
Illustration of clusters
Another example is the vacation travel market. Recent research has identified three clusters, or market segments: 1) the demanders, who want exceptional service and expect to be pampered; 2) the escapists, who want to get away and just relax; 3) the educationalists, who want to see new things, go to museums, go on a safari, or experience new cultures.
Cluster analysis, like factor analysis and multidimensional scaling, is an interdependence technique: it makes no distinction between dependent and independent variables, and the entire set of interdependent relationships is examined. It is similar to multidimensional scaling in that both examine inter-object similarity by looking at the complete set of interdependent relationships. The difference is that multidimensional scaling identifies underlying dimensions, while cluster analysis identifies clusters. Cluster analysis is the obverse of factor analysis: whereas factor analysis reduces the number of variables by grouping them into a smaller set of factors, cluster analysis reduces the number of observations or cases by grouping them into a smaller set of clusters.
 In marketing, cluster analysis is used for
 Basic procedure
- Formulate the problem - select the variables to which you wish to apply the clustering technique
- Select a distance measure - various ways of computing distance:
- Select a clustering procedure (see below)
- Decide on the number of clusters
- Map and interpret clusters - draw conclusions - illustrative techniques like perceptual maps, icicle plots, and dendrograms are useful
- Assess reliability and validity - various methods:
  - repeat the analysis using a different distance measure
  - repeat the analysis using a different clustering technique
  - split the data randomly into two halves and analyze each half separately
  - repeat the analysis several times, deleting one variable each time
  - repeat the analysis several times, presenting the cases in a different order each time
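The middle steps of the procedure above (distance measure, clustering procedure, number of clusters) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the source does not say which distance measures it has in mind, so Euclidean and Manhattan distance are chosen here as two common ones; the `ratings` data, the function names, and the choice to seed the centers with the first `k` cases are all hypothetical.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two cases measured on the same variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block distance: sum of absolute differences on each variable."""
    return sum(abs(x - y) for x, y in zip(a, b))

def k_means(points, k, max_iter=100):
    """Basic k-means: assign each case to its nearest center (Euclidean),
    then move each center to the mean of its cases, until nothing changes."""
    centers = [tuple(p) for p in points[:k]]  # assumption: first k cases seed the centers
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's center where it is
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical survey: each respondent rates (service, relaxation) from 1 to 10.
ratings = [(9, 2), (8, 3), (9, 1), (2, 9), (1, 8), (3, 9)]
labels, centers = k_means(ratings, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
print(euclidean(ratings[0], ratings[3]), manhattan(ratings[0], ratings[3]))
```

Here the number of clusters `k` is supplied by the analyst, which corresponds to the "decide on the number of clusters" step. Because the result depends on the seed centers and therefore on the order of the cases, rerunning with the cases reordered is one way to carry out the last reliability check in the list.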
 Clustering procedures
There are several types of clustering methods:
- Non-hierarchical clustering (also called k-means clustering)
  - first determine a cluster center, then group all objects that are within a certain distance of it
  - Sequential Threshold method - first determine a cluster center, then group all objects that are within a predetermined threshold of the center - one cluster is created at a time
  - Parallel Threshold method - several cluster centers are determined simultaneously, then objects that are within a predetermined threshold of the centers are grouped
  - Optimizing Partitioning method - first a non-hierarchical procedure is run, then objects are reassigned so as to optimize an overall criterion
- Hierarchical clustering
  - objects are organized into a hierarchical structure as part of the procedure
  - Divisive clustering - start by treating all objects as if they are part of a single large cluster, then divide the cluster into smaller and smaller clusters
  - Agglomerative clustering - start by treating each object as a separate cluster, then group them into bigger and bigger clusters
    - Centroid methods - at each step, the two clusters whose centers are closest are merged (a centroid is the mean value for all the objects in the cluster)
    - Variance methods - clusters are generated that minimize the within-cluster variance
      - Ward's Procedure - at each step, the merge is chosen that gives the smallest increase in the total squared Euclidean distance of objects to their cluster means
    - Linkage methods - cluster objects based on the distance between them
      - Single Linkage method - cluster objects based on the minimum distance between them (also called the nearest neighbour rule)
      - Complete Linkage method - cluster objects based on the maximum distance between them (also called the furthest neighbour rule)
      - Average Linkage method - cluster objects based on the average distance between all pairs of objects, where the two members of each pair come from different clusters
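The three linkage rules differ only in how they turn the pairwise distances between two clusters into a single number. The following is a minimal sketch of agglomerative clustering under that reading, not a reference implementation: the function name `agglomerate`, the `LINKAGES` table, and the one-variable `scores` data are hypothetical, and Euclidean distance is assumed.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Straight-line distance between two cases."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Each rule summarizes all between-cluster pair distances in a different way.
LINKAGES = {
    "single": min,                              # nearest neighbour
    "complete": max,                            # furthest neighbour
    "average": lambda ds: sum(ds) / len(ds),    # mean over all cross-cluster pairs
}

def agglomerate(points, n_clusters, linkage="single"):
    """Start with one cluster per case and repeatedly merge the two closest
    clusters (as judged by the linkage rule) until n_clusters remain."""
    combine = LINKAGES[linkage]
    clusters = [[i] for i in range(len(points))]  # clusters hold case indices

    def gap(pair):
        a, b = pair
        return combine(
            [euclidean(points[i], points[j]) for i in clusters[a] for j in clusters[b]]
        )

    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2), key=gap)  # a < b
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical one-variable scores with steadily widening gaps between cases.
scores = [(0.0,), (1.0,), (2.1,), (3.3,), (4.6,), (9.0,)]
print(agglomerate(scores, 3, "single"))    # [[0, 1, 2, 3], [4], [5]]
print(agglomerate(scores, 3, "complete"))  # [[0, 1], [2, 3, 4], [5]]
```

On this small example, single linkage keeps extending one cluster along the chain of nearby cases, while complete linkage produces more compact groups. Comparing the two outputs is an instance of the "repeat analysis but use different clustering technique" reliability check.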
This article is based on one or more articles in Wikipedia, with modifications and
additional content by SOURCES editors. This article is covered by a Creative Commons
Attribution-Sharealike 3.0 License (CC-BY-SA) and the GNU Free Documentation License
(GFDL).