Dimensionality Reduction using Autoencoders — Easy Explanation — with source code
So in today’s very interesting blog, we will see how we can perform Dimensionality Reduction using Autoencoders in the simplest way possible using Tensorflow. So without any further due.
Read the full article with source code here — https://machinelearningprojects.net/dimensionality-reduction-using-autoencoders/
Let’s do it…
Step 1 — Importing all required libraries.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Sequential,Model
from sklearn.preprocessing import MinMaxScaler
import seaborn as sns
Step 2 — Reading our input data.
data = pd.read_csv('anonymized_data.csv')
- We have 30 feature columns and 1 Label column in our dataset.
- These feature columns are anonymous.
Step 3 — Checking info of our data.
- We can see from the stats below that we don’t have any null values in our data.
Step 4 — Scaling our data for Dimensionality Reduction using Autoencoders.
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(data.drop('Label',axis=1))
- We are using MinMaxScaler here to scale our data.
- Also here we are checking the shape of our data.
- While scaling we dropped the Label column as shown above.
Step 5 — Defining no. of nodes in layers.
num_inputs = 30
num_hidden = 2
num_outputs = num_inputs # Must be true for an autoencoder!
Step 6 — Building the model for Dimensionality Reduction using Autoencoders.
model = Sequential()
model.compile(optimizer=Adam(0.001), metrics=['accuracy'], loss='mae')
- We have a very simple model for this use case.
- We have 3 layers.
- The first layer is having 30 nodes, 2nd is having 2 nodes and the third is also having 30 nodes.
- Our input and output nodes should show the same type of data when building Autoencoders.
Step 7 — Let’s train the model for Dimensionality Reduction using Autoencoders.
model.fit(x=scaled_data, y=scaled_data, epochs=1000, batch_size=32)
Step 8 — Taking the output from the middle layer.
intermediate_layer_model = Model(inputs=model.input, outputs=model.get_layer(index=1).output)
intermediate_output = intermediate_layer_model.predict(scaled_data)
- We can’t just directly take the output from our middle layer.
- That’s why we need to create a model, with just 1 layer which will be our middle layer.
- model.get_layer(index=1) is extracting the middle layer from our original model and .output is used for taking its output.
- In the second line, we are simply using predict to take the results from the 2nd layer.
Step 9 — Checking output shape of our result.
- As we can see below that the shape of the intermediate output became 500X2 means 30 columns/features are now suppressed to only 2 columns/features.
Step 10 — Plotting our results for Dimensionality Reduction using Autoencoders.
- And BOOM here is the results.
- The amount of data our 30 features were showing, we are able to show that data precisely using just these 2 dimensions.
- Both the classes are linearly separable, which means our model did a good job in keeping the essence of data.
Do let me know if there’s any query regarding Dimensionality Reduction using Autoencoders by contacting me on email or LinkedIn.
To explore more Machine Learning, Deep Learning, Computer Vision, NLP, Flask Projects visit my blog — Machine Learning Projects
For further code explanation and source code visit here — https://machinelearningprojects.net/dimensionality-reduction-using-autoencoders/
So this is all for this blog folks, thanks for reading it and I hope you are taking something with you after reading this and till the next time 👋…
Read my previous post: MNIST HANDWRITTEN NUMBER RECOGNITION — USING DEEP NEURAL NETWORKS