Understanding Fuzzy Clustering: The Soft K-Means You Should Know
Chapter 1: Introduction to Clustering Algorithms
In a previous discussion, I highlighted an improved variant of k-means clustering known as K-means++. Today, I will delve into a different clustering algorithm: fuzzy clustering.
What Are Clustering Algorithms?
Before diving into Fuzzy Clustering, it’s essential to understand what clustering algorithms are. Clustering is a form of unsupervised learning widely used in statistical data analysis across various domains. In Data Science, we utilize clustering techniques to extract meaningful insights from datasets by observing how data points group together when a clustering algorithm is applied.
Fuzzy Clustering Explained
Fuzzy clustering is a methodology in which a data point can belong to multiple clusters simultaneously. Instead of forcing every point into exactly one cluster, as hard clustering does, the algorithm assigns each point a degree of membership in every cluster, which is often a more faithful description of data whose groups overlap or have fuzzy boundaries.
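To make the contrast with hard clustering concrete, here is a tiny illustrative example of my own (not tied to any particular library): a point sitting almost halfway between two centers gets a single all-or-nothing label from k-means, but a pair of membership degrees from fuzzy clustering.
import numpy as np

# Two cluster centers and one borderline point (values chosen purely for illustration)
centers = np.array([[0.0, 0.0], [4.0, 0.0]])
point = np.array([1.9, 0.0])

d = np.linalg.norm(centers - point, axis=1)   # distance to each center
hard_label = np.argmin(d)                     # hard clustering: a single label
soft = (1 / d**2) / np.sum(1 / d**2)          # fuzzy memberships (fuzzifier m = 2), summing to 1

print(hard_label)  # 0 -> all-or-nothing assignment
print(soft)        # roughly [0.55, 0.45] -> the point partly belongs to both clusters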
Types of Fuzzy Clustering Algorithms
Fuzzy clustering methods can be categorized into two main types: classical fuzzy clustering and shape-based fuzzy clustering.
Classical Fuzzy Clustering Algorithms
Fuzzy C-Means (FCM): This widely used method works much like the K-Means algorithm, except that each data point belongs to every cluster with a membership degree between 0 (far from the cluster center) and 1 (at the center), and a point's memberships across all clusters sum to 1. A minimal sketch of one FCM update step follows the list of variants below. Variants include:
- Possibilistic C-Means (PCM)
- Fuzzy Possibilistic C-Means (FPCM)
- Possibilistic Fuzzy C-Means (PFCM)
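For intuition, here is a minimal NumPy sketch of a single FCM iteration, written as my own simplification rather than the skfuzzy internals: memberships are derived from relative distances via the fuzzifier m, and cluster centers are membership-weighted means.
import numpy as np

def fcm_step(X, centers, m=2.0, eps=1e-10):
    """One illustrative FCM update: memberships from distances, then new centers."""
    # Distances from every center to every point, shape (n_clusters, n_points)
    d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2) + eps
    # Membership update: u_ij = 1 / sum_k (d_ij / d_kj) ** (2 / (m - 1))
    u = 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0)), axis=1)
    # Center update: fuzzy-weighted mean of the points, with weights u ** m
    w = u ** m
    new_centers = (w @ X) / w.sum(axis=1, keepdims=True)
    return u, new_centers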
Gustafson-Kessel (GK) Algorithm: Unlike C-means, which assumes roughly spherical clusters, GK gives each cluster its own covariance-based distance metric, allowing elliptically shaped clusters.
Gath-Geva Algorithm: Also known as Gaussian Mixture Decomposition, this method is akin to FCM but supports clusters of varying shapes.
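What makes GK (and Gath-Geva) different in practice is the distance measure: each cluster carries its own covariance matrix, so distance is judged relative to the cluster's orientation and spread rather than as a plain Euclidean radius. A hedged sketch with an invented covariance:
import numpy as np

# Illustrative only: an elongated cluster described by a center and a covariance matrix
center = np.array([0.0, 0.0])
cov = np.array([[4.0, 0.0],    # wide along x ...
                [0.0, 0.25]])  # ... narrow along y
cov_inv = np.linalg.inv(cov)

def mahalanobis_sq(x):
    """Squared Mahalanobis distance from x to the cluster center."""
    diff = x - center
    return float(diff @ cov_inv @ diff)

# Two points at the same Euclidean distance (2.0) from the center:
print(mahalanobis_sq(np.array([2.0, 0.0])))  # 1.0  -> along the long axis, "close"
print(mahalanobis_sq(np.array([0.0, 2.0])))  # 16.0 -> across the short axis, "far"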
Shape-Based Fuzzy Clustering Algorithms
- Circular-Shaped Algorithms: These constrain clusters to circular shapes. When incorporated into FCM, the result is referred to as CS-FCM.
- Elliptical-Shaped Algorithms: These constrain points to elliptical shapes, used in the GK algorithm.
- Generic-Shaped Algorithms: Given that most real-world objects aren't strictly circular or elliptical, this algorithm accommodates clusters of any form.
Fuzzy Clustering Implementation in Python
To begin, we generate a synthetic dataset with NumPy and cluster it using the scikit-fuzzy (skfuzzy) library.
from __future__ import division, print_function
import numpy as np
import matplotlib.pyplot as plt
import skfuzzy as fuzz
colors = ['b', 'orange', 'g', 'r', 'c', 'm', 'y', 'k', 'Brown', 'ForestGreen']
# Define three cluster centers
centers = [[4, 2], [1, 7], [5, 6]]
# Define three cluster sigmas in x and y, respectively
sigmas = [[0.8, 0.3], [0.3, 0.5], [1.1, 0.7]]
# Generate test data
np.random.seed(42) # Set seed for reproducibility
xpts = np.zeros(0)   # start with empty arrays and append each cluster's points
ypts = np.zeros(0)
labels = np.zeros(0)
for i, ((xmu, ymu), (xsigma, ysigma)) in enumerate(zip(centers, sigmas)):
    xpts = np.hstack((xpts, np.random.standard_normal(200) * xsigma + xmu))
    ypts = np.hstack((ypts, np.random.standard_normal(200) * ysigma + ymu))
    labels = np.hstack((labels, np.ones(200) * i))
# Visualize the test data
fig0, ax0 = plt.subplots()
for label in range(3):
    ax0.plot(xpts[labels == label], ypts[labels == label], '.', color=colors[label])
ax0.set_title('Test Data: 200 Points per Cluster, 3 Clusters')
The video titled "Day 70 - Fuzzy C-Means Clustering Algorithm" provides a comprehensive overview of the Fuzzy C-Means algorithm, discussing its core principles and applications in clustering.
Clustering Visualization
fig1, axes1 = plt.subplots(3, 3, figsize=(8, 8))
alldata = np.vstack((xpts, ypts))
fpcs = []
for ncenters, ax in enumerate(axes1.reshape(-1), 2):
    # Fit the fuzzy c-means model for 2 through 10 cluster centers
    cntr, u, u0, d, jm, p, fpc = fuzz.cluster.cmeans(
        alldata, ncenters, 2, error=0.005, maxiter=1000, init=None)
    # Store FPC values for later
    fpcs.append(fpc)
    # Plot assigned clusters for each data point in the training set
    cluster_membership = np.argmax(u, axis=0)
    for j in range(ncenters):
        ax.plot(xpts[cluster_membership == j],
                ypts[cluster_membership == j], '.', color=colors[j])
    # Mark the center of each fuzzy cluster
    for pt in cntr:
        ax.plot(pt[0], pt[1], 'rs')
    ax.set_title('Centers = {0}; FPC = {1:.2f}'.format(ncenters, fpc))
    ax.axis('off')
fig1.tight_layout()
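Since the loop stores the fuzzy partition coefficient (FPC) for every cluster count, a short follow-up plot of those stored values (my own addition) makes the comparison explicit: the FPC lies between 0 and 1, and the cluster count whose FPC is closest to 1 partitions the data most cleanly.
# Plot the fuzzy partition coefficient against the number of cluster centers
fig_fpc, ax_fpc = plt.subplots()
ax_fpc.plot(np.r_[2:11], fpcs)
ax_fpc.set_xlabel('Number of centers')
ax_fpc.set_ylabel('Fuzzy partition coefficient')
ax_fpc.set_title('FPC vs. Number of Cluster Centers')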
Building the Model
# Regenerate fuzzy model with 3 cluster centers
cntr, u_orig, _, _, _, _, _ = fuzz.cluster.cmeans(
    alldata, 3, 2, error=0.005, maxiter=1000)
# Show 3-cluster model
fig2, ax2 = plt.subplots()
ax2.set_title('Trained Model')
for j in range(3):
    ax2.plot(alldata[0, u_orig.argmax(axis=0) == j],
             alldata[1, u_orig.argmax(axis=0) == j], 'o',
             label='series ' + str(j))
ax2.legend()
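As a quick sanity check (my addition, assuming the fit converged), the learned centers should land near the centers used to generate the data, and every point's memberships across the three clusters should sum to 1:
print(cntr)                                   # learned centers, roughly matching `centers` above
print(u_orig.shape)                           # (3, n_points): one membership row per cluster
print(np.allclose(u_orig.sum(axis=0), 1.0))   # each point's memberships sum to 1 -> True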
The subsequent video titled "Day 71 - Fuzzy C-Means Clustering Implementation" illustrates how to implement Fuzzy C-Means clustering in Python, showcasing practical coding techniques and results.
Predicting Cluster Membership
# Generate uniformly sampled data across the range [0, 10] in x and y
newdata = np.random.uniform(0, 1, (1100, 2)) * 10
# Predict new cluster membership using cmeans_predict
u, u0, d, jm, p, fpc = fuzz.cluster.cmeans_predict(
    newdata.T, cntr, 2, error=0.005, maxiter=1000)
# Visualize the classified uniform data
cluster_membership = np.argmax(u, axis=0) # Hardening for visualization
fig3, ax3 = plt.subplots()
ax3.set_title('Random Points Classified According to Known Centers')
for j in range(3):
    ax3.plot(newdata[cluster_membership == j, 0],
             newdata[cluster_membership == j, 1], 'o',
             label='series ' + str(j))
ax3.legend()
plt.show()
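Hardening with argmax discards the soft information; the membership matrix u returned by cmeans_predict keeps a degree for every cluster-point pair, so you can also inspect it directly (a brief illustration):
# Soft memberships for the first five new points: one row per cluster,
# one column per point, each column summing to 1
print(u[:, :5].round(3))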