
Deep Learning: Neural Network for Regression with TensorFlow

Naveed Ul Mustafa

--

In my previous article, I embarked on an introductory journey through the foundational concepts of TensorFlow. Today, I am delving deeper, shining a light on Regression problems within TensorFlow.

Regression in a Nutshell

At its core, regression revolves around predicting a specific number based on numerous features. Think of it like this: in object detection, our goal might be to determine the precise coordinates of bounding boxes, so training a neural network for detection amounts to predicting the location of the box encircling our target.

Key to any regression task is selecting suitable inputs for our model and understanding the desired output. A pivotal part of this involves discerning the relationship between features and labels. This may involve:

  • dropping some features that might not be useful to the model; too many features can add complexity and lengthen training times.
  • changing the data types of categorical features so the model can perform appropriately. Techniques such as one-hot encoding come in handy for transforming categorical data into numerical format (see the sketch after this list).
  • normalizing features to a standard scale. This ensures all features contribute equally to predictions, regardless of their original scale, leading to faster and more stable training. Additionally, normalized data aligns well with common weight initialization strategies and helps avoid activation function saturation.
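
One-hot encoding and min-max normalization can be done in a few lines. Here is a minimal sketch using pandas; the column names ("bedrooms", "bathrooms", "suburb") and values are hypothetical, purely for illustration.

# A minimal preprocessing sketch with pandas (column names are hypothetical)
import pandas as pd

df = pd.DataFrame({
    "bedrooms": [2, 3, 4],
    "bathrooms": [1, 2, 2],
    "suburb": ["north", "south", "east"],
})

# One-hot encode the categorical column into numerical indicator columns
df = pd.get_dummies(df, columns=["suburb"])

# Min-max normalize the numeric columns to the [0, 1] range
for col in ["bedrooms", "bathrooms"]:
    df[col] = (df[col] - df[col].min()) / (df[col].max() - df[col].min())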

The Anatomy of Neural Networks

A neural network is a mathematical operation in which multiple equations mimic the way the human brain processes information. Picture it as a vast web of tiny calculators (known as neurons) connected in layers. These calculators work together, processing data and adjusting their calculations based on patterns they identify. Over time, much like a student studying for an exam, the network improves its accuracy. It's this ability to learn from data that makes neural networks a cornerstone of modern artificial intelligence, allowing machines to recognize images, understand speech, and even make decisions.

Breaking down a neural network, we find three primary components:

  • Input layer: The entry point for the data (or features).
  • Hidden Layer: This segment can range from a singular layer to hundreds, each populated with neurons. The more complex our problem, the more layers we might need. It’s here that our network recognizes patterns within the data.
  • Output Layer: Our final stop, where the model’s predictions or classifications materialize.
(Diagram: the anatomy of a neural network; borrowed from Daniel)

Designing Neural Networks for Regression

Designing a neural network involves several steps, tailored to the specific problem one is trying to solve. It is important to understand the nature of the model architecture.

# Building a neural network for a regression problem - 3 features and 1 label
import tensorflow as tf

nn_model = tf.keras.Sequential([
    tf.keras.layers.Dense(3, input_shape=(3,), name="input_Layer"),  # one neuron per feature
    tf.keras.layers.Dense(100, activation="relu"),                   # hidden layer
    tf.keras.layers.Dense(100, activation="relu"),                   # hidden layer
    tf.keras.layers.Dense(1, activation=None, name="output_Layer")   # single numeric output
])

Each model consists of multiple layers so it can identify patterns more robustly. Each layer consists of units (neurons) and, optionally, an activation function. More about layers can be read in the TensorFlow documentation.

  1. Input layer: The number of neurons should match the number of input features; for clarity, it is good practice to specify the number of features and a layer name.
  2. Hidden layers: Start with 1 or 2 for a simpler model and grow as the complexity of the data demands. Each hidden layer consists of neurons and, optionally, an activation function. Whether an activation function helps depends on the nature of the data; a model may predict well without any activation function, but adding hidden layers and activation functions increases the model's capacity and complexity.
  3. Output layer: The number of neurons should match the number of outputs (a quick way to verify the resulting architecture is shown after this list).
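
Once the layers are stacked, Keras can print each layer's output shape and parameter count, which is a quick sanity check that the architecture matches these three rules:

# Inspect layer names, output shapes, and parameter counts
nn_model.summary()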

This can also be visualized with this online tool, which helps build intuition for the basics of neural networks.

Embarking on Modeling with TensorFlow

Model compilation is a fundamental step in the process of building a neural network, as compilation sets the stage for model training by defining the rules of the learning process and optimizing the computation. Without this step, the neural network would lack direction and efficiency in the learning process.

# Compile the model
nn_model.compile(loss=tf.keras.losses.mae,             # mean absolute error
                 optimizer=tf.keras.optimizers.SGD(),  # stochastic gradient descent
                 metrics=["mae"])

Training a neural network revolves around optimizing its performance by minimizing what’s called a loss function. This loss function measures the difference between the model’s predictions and the actual data. To learn and improve, neural networks use a process called backpropagation, which adjusts the model’s internal settings or “weights”. The compilation step in neural network training specifies the tools, known as optimizers (like SGD or Adam), that make these adjustments based on certain calculations called gradients.
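
As a sketch of what swapping optimizers looks like, here is the same compile call with Adam substituted in and an explicit learning rate; the 0.01 value is illustrative, not tuned for any particular dataset.

# Alternative: compile with Adam (the learning rate here is only illustrative)
nn_model.compile(loss=tf.keras.losses.mae,
                 optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
                 metrics=["mae"])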

Additionally, while the loss function steers the training, we also use evaluation metrics, such as accuracy, to provide an intuitive understanding of the model’s performance. Although these metrics don’t change how the model learns, they are vital for tracking its progress.
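
The fit call below assumes training data already exists. For a self-contained run, a minimal synthetic setup might look like the following; the shapes and values are hypothetical and only serve to make the example runnable.

# Hypothetical data for illustration: 3 features, 1 continuous label
import numpy as np

X = np.random.rand(1000, 3).astype("float32")
y = X @ np.array([3.0, -2.0, 0.5], dtype="float32") + 1.0  # linear target plus bias

# Simple 80/20 train/test split
X_train, y_train = X[:800], y[:800]
X_test, y_test = X[800:], y[800:]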

# Fit the model
history = nn_model.fit(X_train, y_train, epochs=50)

When we fit the model to training data, we allow it to discern underlying patterns. In this scenario, “training” refers to the modification of the model’s internal parameters using the data it encounters, enhancing its predictive capabilities. The term “epoch” denotes one complete cycle through the entire dataset. So, if we specify 50 epochs, it means the model will process and learn from the entire dataset a total of 50 times.

Assessing Model Performance

Depending on the problem at hand, overall model performance is assessed on the test dataset. This shows how effectively the model applies the patterns it learned from training data to unseen data. It is normal to get absolutely humongous loss values at first; that is the motivation to appropriately adjust hyperparameters such as the activation function, loss function, number of epochs, and/or learning rate.

# Evaluate the model on the test set (the loss was already set to MAE)
nn_model.evaluate(X_test, y_test)
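
Beyond the aggregate loss, comparing a few raw predictions against the true labels makes the error tangible. A small sketch, reusing the hypothetical test split from above:

# Compare a few predictions with the true labels
y_pred = nn_model.predict(X_test)
for pred, true in zip(y_pred[:5].squeeze(), y_test[:5]):
    print(f"predicted: {pred:.2f}  actual: {true:.2f}")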

There are different variations of loss functions, each effective in certain scenarios:

  1. MAE (Mean Absolute Error): on average, how wrong each of the model's predictions is. `tf.keras.losses.MAE` OR `tf.metrics.mean_absolute_error`; can be used in any regression problem.
  2. MSE (Mean Squared Error): the average of the squared errors. `tf.keras.losses.MSE` OR `tf.metrics.mean_squared_error`; most useful when larger errors are more significant than smaller ones.
  3. Huber: a combination of MAE & MSE. `tf.keras.losses.Huber()` (see the sketch after this list).
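
A quick sketch of how these three losses score the same predictions; the numbers are made up for illustration.

# Scoring identical predictions with MAE, MSE, and Huber (toy values)
import tensorflow as tf

y_true = tf.constant([3.0, 5.0, 7.0])
y_pred = tf.constant([2.5, 5.5, 8.5])

print(tf.keras.losses.MAE(y_true, y_pred).numpy())      # mean absolute error
print(tf.keras.losses.MSE(y_true, y_pred).numpy())      # mean squared error
print(tf.keras.losses.Huber()(y_true, y_pred).numpy())  # MAE/MSE hybrid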

One key trait of an ML practitioner is continuing the feedback loop until the desired results are reached. The feedback loop:

Build a model -> fit it -> evaluate it -> tweak it -> fit it -> evaluate it -> tweak it -> fit it -> evaluate it -> …

This tweaking starts with visualization. As I delve deeper into neural network models, I feel I am going in blind when I tweak the hyperparameters. The only way I feel comfortable doing hyperparameter tuning is by visualizing each step I take. This not only makes the process easier, but also makes me confident in applying a neural network model to a regression problem.

# Plot history (also known as the training curve or loss curve)
import pandas as pd
import matplotlib.pyplot as plt

pd.DataFrame(history.history).plot()
plt.ylabel("Loss")
plt.xlabel("Epochs")
plt.show()

This shows how the loss behaves as the model continues to train for larger epoch counts. A sudden decrease during the first 100 epochs indicates a well-performing model. Afterwards, a decreasing trend that seemed to plateau after 300 epochs hints at diminishing returns and possible overfitting.

Besides visualization, one can:

  • get more data,
  • make the model a bit larger (by adding a few more hidden layers),
  • train for longer, by increasing the number of epochs.

In conclusion, understanding regression within the realm of TensorFlow is akin to piecing together a puzzle. It requires selecting the right pieces (inputs), knowing the end picture (outputs), and carefully positioning the pieces to see the whole image (training and refining the model). Dive in, experiment, and let the world of neural networks unravel its mysteries to you.

I learned all of this with Daniel, through his course on TensorFlow.

Follow me on GitHub and on X.
