Understanding Fuzzy Clustering: The Soft K-Means You Should Know

Chapter 1: Introduction to Clustering Algorithms

In a previous discussion, I highlighted the advanced version of k-means clustering known as K-means++. Today, I will delve into a different clustering algorithm: Fuzzy Clustering in Machine Learning.

What Are Clustering Algorithms?

Before diving into Fuzzy Clustering, it’s essential to understand what clustering algorithms are. Clustering is a form of unsupervised learning widely used in statistical data analysis across various domains. In Data Science, we utilize clustering techniques to extract meaningful insights from datasets by observing how data points group together when a clustering algorithm is applied.

Fuzzy Clustering Explained

Fuzzy clustering is an approach in which a data point can belong to several clusters at once, with a degree of membership in each. Drawing fuzzy boundaries is often more practical than forcing every point into exactly one cluster, particularly for points that sit between groups.
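To make the idea of partial membership concrete, here is a minimal sketch that computes the membership of a single point in two clusters. It assumes the standard Fuzzy C-Means weighting with fuzzifier m = 2; the function name `soft_memberships` and the toy centers are my own, chosen only for illustration.

```python
import numpy as np

def soft_memberships(point, centers, m=2.0):
    """FCM-style membership of one point in each cluster (values sum to 1)."""
    point = np.asarray(point, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Euclidean distance from the point to every center
    dist = np.linalg.norm(centers - point, axis=1)
    # Guard against division by zero when the point sits exactly on a center
    dist = np.fmax(dist, np.finfo(float).eps)
    # Closer centers receive a larger share of the membership
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum()

centers = [[0.0, 0.0], [4.0, 0.0]]
boundary = soft_memberships([2.0, 0.0], centers)    # halfway between the centers
near_first = soft_memberships([1.0, 0.0], centers)  # closer to the first center
hard_label = int(np.argmax(near_first))             # what a hard assignment would keep
```

The halfway point is split evenly between the two clusters, while a hard assignment would have to pick one of them and discard that information.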

Types of Fuzzy Clustering Algorithms

Fuzzy clustering methods can be categorized into two main types: classical fuzzy clustering and shape-based fuzzy clustering.

  1. Classical Fuzzy Clustering Algorithms

    • Fuzzy C-Means (FCM): This widely used method works much like the K-Means algorithm, except that every data point belongs to all clusters to some degree. Each membership value ranges from 0 (farthest from the cluster center) to 1 (closest to the center), and a point's memberships across all clusters sum to 1. Variants include:

      • Possibilistic C-Means (PCM)
      • Fuzzy Possibilistic C-Means (FPCM)
      • Possibilistic Fuzzy C-Means (PFCM)
    • Gustafson-Kessel (GK) Algorithm: Unlike C-means, which assumes spherical clusters, GK allows for elliptical-shaped clusters.

    • Gath-Geva Algorithm: Also known as Gaussian Mixture Decomposition, this method is akin to FCM but supports clusters of varying shapes.

  2. Shape-Based Fuzzy Clustering Algorithms

    • Circular-Shaped Algorithms: These restrict data points to circular clusters. When this restriction is built into FCM, the result is referred to as CS-FCM.
    • Elliptical-Shaped Algorithms: These constrain points to elliptical clusters, as in the GK algorithm.
    • Generic-Shaped Algorithms: Since most real-world objects are not strictly circular or elliptical, these algorithms accommodate clusters of any form.
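Since Fuzzy C-Means is the method used in the rest of this post, a from-scratch sketch may help show what it does internally. It alternates between two steps: recompute each center as a membership-weighted mean, then recompute memberships from the distances to those centers. This is a simplified illustration under my own assumptions (random initial memberships, Euclidean distance, data laid out as samples × features), not the scikit-fuzzy implementation; the name `fuzzy_c_means` and its parameters are hypothetical.

```python
import numpy as np

def fuzzy_c_means(data, n_clusters, m=2.0, tol=1e-5, max_iter=300, seed=0):
    """Minimal FCM sketch. data has shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Random initial memberships, normalized so each row sums to 1
    u = rng.random((data.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        um = u ** m
        # Step 1: centers are membership-weighted means of the data
        centers = (um.T @ data) / um.sum(axis=0)[:, None]
        # Step 2: memberships follow from distances to the centers
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, np.finfo(float).eps)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        # Stop once memberships barely change between iterations
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u

# Two well-separated toy blobs, generated only for this demonstration
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(50, 2))
centers, u = fuzzy_c_means(np.vstack((blob_a, blob_b)), n_clusters=2)
```

Note that this sketch stores memberships as samples × clusters, whereas scikit-fuzzy returns them as clusters × samples.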

Fuzzy Clustering Implementation in Python

To begin, we need to generate a dataset.

from __future__ import division, print_function

import numpy as np
import matplotlib.pyplot as plt
import skfuzzy as fuzz

colors = ['b', 'orange', 'g', 'r', 'c', 'm', 'y', 'k', 'Brown', 'ForestGreen']

# Define three cluster centers
centers = [[4, 2], [1, 7], [5, 6]]

# Define three cluster sigmas in x and y, respectively
sigmas = [[0.8, 0.3], [0.3, 0.5], [1.1, 0.7]]

# Generate test data
np.random.seed(42)  # Set seed for reproducibility

# Start from empty arrays so that no stray point at the origin is added
xpts = np.empty(0)
ypts = np.empty(0)
labels = np.empty(0)

for i, ((xmu, ymu), (xsigma, ysigma)) in enumerate(zip(centers, sigmas)):
    # Draw 200 normally distributed points around each center
    xpts = np.hstack((xpts, np.random.standard_normal(200) * xsigma + xmu))
    ypts = np.hstack((ypts, np.random.standard_normal(200) * ysigma + ymu))
    labels = np.hstack((labels, np.ones(200) * i))

# Visualize the test data
fig0, ax0 = plt.subplots()

for label in range(3):
    ax0.plot(xpts[labels == label], ypts[labels == label], '.', color=colors[label])

ax0.set_title('Test Data: 200 Points per Cluster, 3 Clusters')

Clustering Visualization

fig1, axes1 = plt.subplots(3, 3, figsize=(8, 8))

# skfuzzy expects data with shape (n_features, n_samples)
alldata = np.vstack((xpts, ypts))

fpcs = []

# Try every cluster count from 2 to 10, one subplot per count
for ncenters, ax in enumerate(axes1.reshape(-1), 2):
    cntr, u, u0, d, jm, p, fpc = fuzz.cluster.cmeans(
        alldata, ncenters, 2, error=0.005, maxiter=1000, init=None)

    # Store FPC values for later
    fpcs.append(fpc)

    # Plot assigned clusters for each data point in the training set
    cluster_membership = np.argmax(u, axis=0)

    for j in range(ncenters):
        ax.plot(xpts[cluster_membership == j],
                ypts[cluster_membership == j], '.', color=colors[j])

    # Mark the center of each fuzzy cluster
    for pt in cntr:
        ax.plot(pt[0], pt[1], 'rs')

    ax.set_title('Centers = {0}; FPC = {1:.2f}'.format(ncenters, fpc))
    ax.axis('off')

fig1.tight_layout()
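The loop above stores a fuzzy partition coefficient (FPC) for each number of centers. As a rough sketch of what that score measures, the FPC can be computed directly from a membership matrix as the average, over points, of the sum of squared memberships. It ranges from 1/c for completely vague memberships to 1 for completely crisp ones. The function name `partition_coefficient` is my own, and the FPC values in the selection step are made up purely to show the mechanics; they are not results from the data above.

```python
import numpy as np

def partition_coefficient(u):
    """FPC for memberships u of shape (n_clusters, n_points)."""
    u = np.asarray(u, dtype=float)
    # Mean over points of the sum of squared memberships
    return float((u ** 2).sum() / u.shape[1])

crisp = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])
vague = np.full((2, 3), 0.5)
fpc_crisp = partition_coefficient(crisp)  # every point fully in one cluster
fpc_vague = partition_coefficient(vague)  # every point split evenly

# Hypothetical FPC scores for 2, 3 and 4 centers (illustration only)
candidate_centers = [2, 3, 4]
example_fpcs = [0.81, 0.88, 0.72]
best = candidate_centers[int(np.argmax(example_fpcs))]
```

With the real `fpcs` list from the loop, the same `argmax` step would pick the cluster count with the highest score, keeping in mind that the list starts at 2 centers.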

Building the Model

# Regenerate fuzzy model with 3 cluster centers
cntr, u_orig, _, _, _, _, _ = fuzz.cluster.cmeans(
    alldata, 3, 2, error=0.005, maxiter=1000)

# Show 3-cluster model
fig2, ax2 = plt.subplots()
ax2.set_title('Trained Model')

for j in range(3):
    ax2.plot(alldata[0, u_orig.argmax(axis=0) == j],
             alldata[1, u_orig.argmax(axis=0) == j], 'o',
             label='series ' + str(j))

ax2.legend()

Predicting Cluster Membership

# Generate uniformly sampled data across the range [0, 10] in x and y
newdata = np.random.uniform(0, 1, (1100, 2)) * 10

# Predict new cluster membership using cmeans_predict
# (transpose because skfuzzy expects shape (n_features, n_samples))
u, u0, d, jm, p, fpc = fuzz.cluster.cmeans_predict(
    newdata.T, cntr, 2, error=0.005, maxiter=1000)

# Visualize the classified uniform data
cluster_membership = np.argmax(u, axis=0)  # Hardening for visualization

fig3, ax3 = plt.subplots()
ax3.set_title('Random Points Classified According to Known Centers')

for j in range(3):
    ax3.plot(newdata[cluster_membership == j, 0],
             newdata[cluster_membership == j, 1], 'o',
             label='series ' + str(j))

ax3.legend()

plt.show()
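Hardening with argmax throws away the soft memberships that make fuzzy clustering useful. One possible follow-up, sketched here under my own assumptions, is to flag points whose strongest membership falls below a cutoff so they can be reviewed separately. The helper `flag_ambiguous`, the 0.6 threshold, and the small demo matrix are all hypothetical; the input layout (clusters × points) matches the `u` array returned by cmeans_predict above.

```python
import numpy as np

def flag_ambiguous(u, threshold=0.6):
    """Return hard labels and a mask of points with a weak top membership.

    u has shape (n_clusters, n_points).
    """
    u = np.asarray(u, dtype=float)
    labels = np.argmax(u, axis=0)           # hardened cluster per point
    ambiguous = u.max(axis=0) < threshold   # True where no cluster dominates
    return labels, ambiguous

# Hand-written memberships for three points (each column sums to 1)
u_demo = np.array([[0.90, 0.40, 0.10],
                   [0.05, 0.35, 0.80],
                   [0.05, 0.25, 0.10]])
labels, ambiguous = flag_ambiguous(u_demo)
```

The right threshold depends on the data and on the number of clusters, so treat 0.6 as a starting point rather than a recommendation.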
