Handson ML

Soft Clustering with Gaussian Mixture Models (GMM)

GMM Theory. The Gaussian Mixture Model is a generative model that assumes the data are generated from multiple Gaussian distributions, each with its own mean and variance. A Gaussian Mixture Model is a convex combination of these component distributions. Unlike K-Means, with Gaussian Mixture Models we want to define a probability distribution over the data. In order to do that, we need to convert our clustering problem
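As a minimal sketch of that idea (using scikit-learn's `GaussianMixture` on synthetic data — an assumption for illustration, not the post's own code), soft clustering returns a probability distribution over components for each point instead of a hard label:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two synthetic Gaussian blobs with different means
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(100, 2)),
    rng.normal(loc=5.0, scale=1.0, size=(100, 2)),
])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Soft assignments: each row is a probability distribution over the components
resp = gmm.predict_proba(X)
print(resp.shape)  # (200, 2)
```

Each row of `resp` sums to 1, which is exactly the "probability distribution on the data" view that distinguishes GMM from the hard assignments of K-Means.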

Hard clustering with K-means

K-Means Theory. K-Means is the simplest and most fundamental clustering algorithm. Given: $x_1, x_2, \dots, x_n$, where $x_i \in \mathbb{R}^d$. Output: cluster assignments $c_1, c_2, \dots, c_n$, where $c_i \in \{1, 2, \dots, K\}$. Goal: partition the data into $K$ clusters (groups) so that each cluster contains similar data. The goal is pretty clear: you have a bunch of data whose generative distribution you may or may not know, and you want to learn the structure of the data in such
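That partitioning can be computed with Lloyd's algorithm, alternating an assignment step and a mean-update step. A minimal NumPy sketch (my own illustration under those definitions, not the post's code):

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Plain Lloyd's algorithm for K-Means."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as K distinct randomly chosen data points
    centroids = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its cluster
        new_centroids = np.array([X[labels == k].mean(axis=0) for k in range(K)])
        if np.allclose(new_centroids, centroids):
            break  # converged
        centroids = new_centroids
    return labels, centroids
```

On two well-separated blobs this recovers the blobs exactly; note this sketch does not guard against the (rare) empty-cluster case.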

Pills of ML & AI: Feature Selection

Did you know that there are essentially three ways of selecting the most important features before feeding your dataset into an ML model?
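The three families are commonly called filter, wrapper, and embedded methods. A quick scikit-learn sketch of one representative of each (my assumption of which three the post means; the dataset is synthetic):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif, RFE
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=3, random_state=0)

# 1. Filter: score each feature independently of any model (ANOVA F-test)
filt = SelectKBest(f_classif, k=3).fit(X, y)

# 2. Wrapper: search feature subsets using a model's performance
#    (recursive feature elimination around a logistic regression)
wrap = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3).fit(X, y)

# 3. Embedded: selection happens inside model training (tree importances)
emb = RandomForestClassifier(random_state=0).fit(X, y)

print(filt.get_support().sum())   # 3 features kept by the filter
print(wrap.get_support().sum())   # 3 features kept by the wrapper
print(emb.feature_importances_.argsort()[-3:])  # top-3 embedded features
```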

Predict house prices with dense neural networks and tensorflow

In a regression problem, we aim to predict a continuous output value, like a price or a probability. Contrast this with a classification problem, where we aim to select a class from a list of classes (for example, given a picture containing an apple or an orange, recognizing which fruit is in the picture).
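As a minimal sketch of the regression setup (using scikit-learn's `MLPRegressor` as a stand-in for the TensorFlow dense network the post builds; the "house" data here are synthetic, so names and coefficients are illustrative only):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic features (size, rooms, age); price is a noisy combination of them
X = rng.uniform(0, 1, size=(500, 3))
y = 300 * X[:, 0] + 50 * X[:, 1] - 20 * X[:, 2] + rng.normal(0, 5, size=500)

X_scaled = StandardScaler().fit_transform(X)

# Two hidden dense layers; the output is a single continuous value (the price)
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(X_scaled, y)

print(model.predict(X_scaled[:3]))  # continuous predictions, not class labels
```

The key structural point carries over to the TensorFlow version: the final layer has a single unit and no softmax, because we regress a value rather than pick a class.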

Multi Label Classification Of Texts With NLTK

In this tutorial, I will show you how to predict tags for a text. We will build a multi-label model capable of detecting different types of toxicity in a large number of Wikipedia comments that have been labeled by human raters for toxic behavior. The types of toxicity are: toxic, severe_toxic, obscene, threat, insult, identity_hate. The data set used can
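The core multi-label trick is binary relevance: train one binary classifier per tag. A toy sketch with scikit-learn (an assumption — the post itself works with NLTK and the real toxic-comments data; the corpus and tags below are made up for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.pipeline import make_pipeline

# Tiny toy corpus; each text can carry zero, one, or several tags
texts = [
    "you are a horrible person",
    "I will hurt you",
    "have a nice day",
    "you horrible thing, I will hurt you",
]
labels = [["insult"], ["threat"], [], ["insult", "threat"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)  # one binary column per tag

# Binary relevance: OneVsRestClassifier fits one classifier per label column
clf = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(LogisticRegression()))
clf.fit(texts, Y)

pred = clf.predict(["I will hurt you, horrible person"])
print(mlb.inverse_transform(pred))
```

The prediction is a binary vector with one slot per tag, so a single comment can be flagged as, say, both `insult` and `threat` at once.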

Discrete Probability Distributions

8 Discrete Probability Distributions. 8.2 Binomial Distribution. The following code plots the probability mass function (PMF) of $B_{p,n}$, the binomial distribution with parameters $p$ and $n$. It contains interactive sliders that you can use to vary $n$ over the interval $[0, 30]$ and $p$ over the interval $[0, 1]$. `%matplotlib inline` Let us now load the required code and analyze it part by part. `# %load plot_pmf.py import numpy`
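A static, non-interactive version of that plot can be sketched with SciPy and Matplotlib (the original notebook uses sliders, which are omitted here; the output filename is my own choice):

```python
import numpy as np
from scipy.stats import binom
import matplotlib
matplotlib.use("Agg")  # non-interactive backend instead of the notebook's sliders
import matplotlib.pyplot as plt

n, p = 30, 0.5
k = np.arange(n + 1)
pmf = binom.pmf(k, n, p)  # P(X = k) for k = 0, ..., n

plt.bar(k, pmf)
plt.xlabel("k")
plt.ylabel(r"$P(X = k)$")
plt.title(f"Binomial PMF, n={n}, p={p}")
plt.savefig("binomial_pmf.png")

print(pmf.sum())  # the PMF over k = 0..n sums to 1
```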