.. _doc.mixture: .. meta:: :description: Documentation of Skew-t Mixture Models in CassioPy :keywords: skew-t, clustering, mixture models, machine learning Mixture Model ============= .. _doc.mixture.SkewTMixture: Skew-t Mixture Models ---------------------- Skew-t mixture models are an unsupervised machine learning method used for clustering data. These models extend Gaussian Mixture Models (GMM) to accommodate non-symmetric distributions by employing skew-t distributions. A random variable X follows a skew-t distribution if it can be represented by: .. math:: X = \mu + \sigma \frac{U}{\sqrt{\tau}}, \qquad with \qquad U\sim\mathcal{SN}(\lambda), \qquad \tau\sim\Gamma\left(\frac{\nu}{2}, \frac{\nu}{2}\right) with :math:`\mu \in \mathbb{R}` : location parameter :math:`\sigma \in \mathbb{R^*_+}` : scale parameter, diagonal covariance matrix :math:`\nu \in \mathbb{R^*_+}` : degrees of freedom :math:`\lambda \in \mathbb{R}` : skewness parameter :math:`\Gamma(\alpha, \beta)` : gamma distribution with shape parameter :math:`\alpha` and an inverse scale parameter :math:`\beta` :math:`\mathcal{SN}` : standard normal distribution with parameter :math:`\lambda` :math:`\mathcal{SN}(x) = 2\phi(x)\Phi(\lambda x)` with :math:`\phi` the standard normal density and :math:`\Phi` the standard normal cumulative distribution function A skew-t mixture model assumes that the data is generated from a finite mixture of skew-t distributions, each characterized by unknown parameters. This approach is particularly useful for modeling data with skewed distributions, providing a more flexible and accurate representation than traditional GMMs. For sake of simplicity, we assume that variables are independent given the cluster. .. math:: p(\vec{x_i};\vec{\theta_{k}}) = \sum_{k=1}^{K} \alpha_k \prod_{j=1}^d \; p(x_{ij} \mid \vec{\theta}_k) Where :math:`\vec{\theta}` groups all the parameters, :math:`\alpha_k` is the proportion of the :math:`k`-th cluster, and :math:`\vec{\theta}_k` are parameters related to the cluster :math:`k`. .. math:: p(x_{ij} \mid \vec{\theta}_k) = \mathcal{ST}(x_{ij} \mid \mu_{kj}, \sigma_{kj}, \lambda_{kj}, \nu_{kj}) Where :math:`K` : number of clusters :math:`d` : number of features :math:`\mathcal{ST}` : skew-t probabibility density function **Examples:** .. code-block:: python >>> import numpy as np >>> from cassiopy.mixture import SkewTMixture >>> from cassiopy.mixture import SkewT >>> data, y_true = SkewT().random_cluster(n_samples=10000, n_dim=2, n_clusters=10, labels=True, random_state=4) >>> model = SkewTMixture(n_cluster=10, init='gmm', n_iter=100, n_init=4, verbose=0).fit(data) >>> y_pred = model.predict(data) >>> model.ARI(y_true, y_pred) 0.9828358328555337 >>> model.save('model.h5') >>> # Plotting >>> plt.scatter(data[:, 0], data[:, 1], c=y_pred, cmap='viridis', s=3) >>> plt.xlabel('dim 1') >>> plt.ylabel('dim 2') >>> plt.text(max(data[:, 0]), max(data[:, 1]), s = f'ARI: {MM.ARI(y_true, y_pred):.3f}', fontsize=12, color='black', ha='right', va='top') >>> plt.title('Clustering using SkewT Mixture Model') .. figure:: ../_static/Images/Skewt_clustering.png :alt: Clustering using SkewT Mixture Model :width: 500px :align: center .. dropdown:: References .. bibliography:: referencemixturemodel.bib :all: **See also** :func:`Skew-t Mixture ` .. _doc.mixture.SkewTUniformMixture: Skew-t Uniform Mixture Models ------------------------------ The Skew-t uniform mixture models is an unsupervised machine learning method used for clustering data into groups following skewed distributions (see :numref:`doc.mixture.SkewTMixture` above) together with an uniform background. These models extend Gaussian Mixture Models (GMM) to accommodate non-symmetric distributions by employing skew-t distributions with a uniform background. .. math:: p(\vec{x_i};\vec{\theta}) = \sum_{k=1}^{K} \alpha_k \; p(\vec{x_i}|\vec{\theta_{k}}) + \alpha_{K+1} \frac{1}{V} Where :math:`V` is the volume of the uniform background. .. math:: V = \prod_{j=1}^d \left( x_{\max,j} - x_{\min,j} \right) **Examples:** .. code-block:: python >>> import numpy as np >>> from cassiopy.mixture import SkewTUniformMixture >>> X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]]) >>> model = SkewTUniformMixture(n_cluster=2, n_iter=100, tol=1e-4, init='random') >>> model.fit(X) >>> model.mu array([[10., 2.], [ 1., 2.]]) >>> model.predict([[0, 0], [12, 3]]) array([0, 1]) >>> model.predict_proba([[0, 0], [12, 3]]) array([[0.99999999, 0. , 0. ], [0. , 0.90 , 0.10 ]]) >>> model.save('model.h5') import sklearn from cassiopy.stats import SkewT from cassiopy.mixture import SkewTUniformMixture seed_value = 3 n_uniforme = 2000 n_dim = 4 n_clusters=10 data, y_true = SkewT().random_cluster(n_samples=2000, n_dim=n_dim, n_clusters=n_clusters, labels=True, random_state=seed_value) x_uniform = np.random.uniform(low=-50, high=50, size=(n_uniforme, n_dim)) data = np.concatenate((data, x_uniform), axis=0) y_true = np.append(y_true, np.full(n_uniforme, max(y_true)+1)) data = sklearn.utils.shuffle(data, random_state=seed_value) y_true = sklearn.utils.shuffle(y_true, random_state=seed_value) **See also** :func:`Skew-t Mixture ` .. _doc.mixture.BIC: Bayesian Information Criterion (BIC) ------------------------------------ The Bayesian Information Criterion (BIC) is a criterion for model selection among a finite set of models. The model with the lowest BIC is preferred. The BIC is defined as: .. math:: BIC = -2 \log(L) + p \log(n) Where :math:`L` : likelihood of the model :math:`p` : number of parameters in the model :math:`n` : number of samples .. dropdown:: References .. bibliography:: referenceBIC.bib :all: .. _doc.mixture.ARI: Adjusted Rand Index (ARI) -------------------------- The Adjusted Rand Index (ARI) is a measure of the similarity between two data clusterings. It ensure that random clusterings receive a score close to zero, with a maximum score of 1 indicating perfect agreement between the clusterings. The Rand Index is defined as: .. math:: RI = \frac{a + b}{\binom{N}{2}} With :math:`a` : number of pairs of elements that are in the same cluster in both the true and predicted clusters :math:`b` : number of pairs of elements that are in different clusters in both the true and predicted clusters :math:`\binom{N}{2}` : number of possible pairs of elements Value attended for a random clustering : .. math:: E = \frac{\sum_i \binom{n_i}{2} \quad \sum_j \binom{n_j}{2}}{\binom{N}{2}} :math:`n_i` : number of elements in the :math:`i`-th cluster in the true clustering :math:`n_j` : number of elements in the :math:`j`-th cluster in the predicted clustering Adjusted Rand Index : .. math:: ARI = \frac{RI - E}{max(RI) - E} With :math:`max(RI) = \frac{1}{2} \left(\sum_i\binom{n_i}{2} + \sum_j\binom{n_j}{2} \right)` the maximum possible value of the Rand Index **Special Cases:** - When :math:`ARI=1` , the two clusterings are identical, perfect agreement - When :math:`ARI=0` , the two clusterings are random, no agreement - When :math:`ARI=-1` , the two clusterings are different, perfect disagreement .. dropdown:: References .. bibliography:: referenceARI.bib :all: