Understanding the Basics of Deep Learning
Background
Artificial Intelligence (AI) is a rapidly growing field, with significant developments across many sectors. Recent advances such as ChatGPT have made large language models (LLMs) part of everyday discussion. Similarly, applications like DALL-E, whose recent versions are based on diffusion models, demonstrate the progress in AI capabilities. The 21st century has seen remarkable evolution in AI, with intelligent software taking on tasks that were long out of reach for computers, including understanding speech, recognizing images, and supporting medical diagnoses.
Earlier AI systems relied on hard-coded rules to solve tasks that are demanding for humans but easy to state formally, such as performing monotonous work accurately, following complex mathematical rules without interruption, and minimizing errors. Modern AI systems, by contrast, are designed to solve tasks that come naturally to humans but are hard to formalize, such as recognizing spoken words and understanding images and videos.
Hierarchy of Concepts
Modern AI systems need to learn from experience and understand the world in terms of a “hierarchy of concepts.” Unlike earlier AI systems that required extensive human intervention to hard-code rules, today’s AI systems can gather knowledge on their own by extracting patterns from raw data. The “hierarchy of concepts” approach allows AI systems to learn complex ideas by building a conceptual map, where each idea is defined through its relation to simpler concepts. Drawn as a graph, this map has many layers stacked on top of one another, and that depth is why we refer to this approach as “deep learning.”
From Machine Learning to Deep Learning
Machine learning is the broader practice of learning patterns from data, and deep learning is a subset of it that stacks many layers of learned representations. Deep learning algorithms take in complex raw data and extract patterns from it layer by layer. For instance, simple machine learning algorithms such as regression models handle tabular data well, whereas less structured data, such as images, usually calls for deep learning approaches.
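To make the tabular case concrete, here is a minimal sketch of a regression model fitted with ordinary least squares in plain Python. The data (hours studied versus exam score) and the names fit_line, hours, and scores are made up for illustration and do not come from any real dataset or library.

```python
# Toy tabular data (hypothetical): hours studied -> exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, scores)

# Predict the score for a new row of the table: 6 hours of study.
prediction = slope * 6.0 + intercept
print(slope, intercept, prediction)  # 2.0 50.0 62.0
```

Each column already means something here, so a straight line is enough. Nothing comparable exists for the raw pixels of an image, which is where the deeper approaches come in.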
Challenges in Feature Engineering
One major challenge in AI is designing appropriate features for learning algorithms to aid in pattern recognition. Some AI tasks can be solved by creating precise features that help the algorithm make accurate predictions. For example, in speech recognition, defining features related to the speaker’s vocal tract can help the algorithm predict whether the speaker is a child, man, or woman. However, defining features for tasks like identifying a car in a picture is more complex, as the features (e.g., wheels, doors) can vary greatly in appearance.
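The vocal tract example can be sketched as a rule built on a single hand-designed feature. The function name classify_speaker and the thresholds of 13 cm and 16 cm are assumptions chosen only for illustration; they are not measured values and would need to be tuned on real data.

```python
def classify_speaker(vocal_tract_cm):
    """Map one hand-designed feature to a label.

    The thresholds below are illustrative assumptions, not measurements.
    """
    if vocal_tract_cm < 13.0:
        return "child"
    if vocal_tract_cm < 16.0:
        return "woman"
    return "man"


# Hypothetical feature values estimated from three recordings.
for length in (11.5, 14.5, 17.0):
    print(length, classify_speaker(length))
```

With a well-chosen feature, the prediction step is almost trivial. The hard part is knowing which feature to compute, and for a car in a photograph there is no equally simple measurement to reach for.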
Representation Learning
To improve prediction accuracy and avoid the hassle of hand-designing features, learning algorithms can be designed to learn the representation itself as well as the mapping from that representation to the output. This approach is known as “representation learning.” Unlike traditional learning algorithms, representation learning algorithms not only learn patterns from features but also discover a good set of features, which often gives better results than hand-designed features and lets a system adapt to new tasks with far less human effort.
Neural Networks
When designing a representation learning algorithm, the objective is to separate the factors of variation, the underlying sources of influence that explain the observed data. In speech recognition, factors of variation might include the speaker’s age, sex, and dialect, while in image recognition, they might include the position, color, or reflection intensity of an object.
Neural networks, of which the multilayer perceptron (MLP) is the classic example, overcome the limitations of simpler representation learning algorithms. An MLP is a mathematical function composed of many simpler functions, mirroring the “hierarchy of concepts.” Each layer takes the previous layer’s output and produces a new representation of the input. Greater depth lets the network apply more of these steps in sequence, so later layers can build on what earlier layers computed and capture more abstract patterns.
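The idea of a network as a function built from simpler functions can be sketched in a few lines of plain Python. This is a toy two-layer network that computes XOR. Its weights are set by hand rather than learned, and the names relu, affine, hidden_layer, output_layer, and network are my own choices for this illustration.

```python
def relu(values):
    """Element-wise rectified linear unit: max(0, v)."""
    return [max(0.0, v) for v in values]


def affine(weights, bias, inputs):
    """Compute weights @ inputs + bias, with weights given as a list of rows."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]


# Hand-set (not learned) parameters for a network that computes XOR.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[1.0, -2.0]]
OUTPUT_B = [0.0]


def hidden_layer(x):
    """First simple function: turn the raw input into a new representation."""
    return relu(affine(HIDDEN_W, HIDDEN_B, x))


def output_layer(h):
    """Second simple function: a linear readout of the hidden representation."""
    return affine(OUTPUT_W, OUTPUT_B, h)[0]


def network(x):
    """The whole network is just the composition of the two layers."""
    return output_layer(hidden_layer(x))


xor_outputs = [network([a, b]) for a in (0.0, 1.0) for b in (0.0, 1.0)]
print(xor_outputs)  # [0.0, 1.0, 1.0, 0.0]
```

No single linear function of the raw inputs can compute XOR. The hidden layer maps [0, 1] and [1, 0] to the same point, and in that new representation a plain linear readout is enough. In a real network the weights would be found by training rather than written by hand.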
Conclusion
Deep learning is an advanced approach in AI, drawing inspiration from the human brain, applied mathematics, and statistics. It enables AI systems to learn from experiences, build a hierarchy of concepts, and autonomously improve over time. As AI continues to evolve, deep learning will play a crucial role in developing intelligent systems capable of performing complex tasks with minimal human intervention.
This is the first article in my quest to learn more about deep learning. My primary reference is Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. You can follow me on X and GitHub.