Understanding Neural Networks

What is a neural network?

A neural network is a machine learning (ML) model designed to mimic the function and structure of the human brain. Neural networks are intricate networks of interconnected nodes, or neurons, that collaborate to tackle complicated problems.

Also referred to as artificial neural networks (ANNs) or deep neural networks, neural networks represent a type of deep learning technology that's classified under the broader field of artificial intelligence (AI).

Neural networks are widely used in a variety of applications, including image recognition, predictive modeling and natural language processing (NLP). Examples of significant commercial applications since 2000 include handwriting recognition for check processing, speech-to-text transcription, oil exploration data analysis, weather prediction and facial recognition.

How do neural networks work?

An artificial neural network usually involves many processors operating in parallel and arranged in tiers or layers. The first tier -- analogous to optic nerves in human visual processing -- receives the raw input information. Each successive tier receives the output from the tier preceding it rather than the raw input -- the same way neurons further from the optic nerve receive signals from those closer to it. The last tier produces the output of the system.

Think of each individual node as its own linear regression model, composed of input data, weights, a bias (or threshold), and an output.

A linear regression model describes the relationship between a dependent variable, y, and one or more independent variables, X. The dependent variable is also called the response variable. Independent variables are also called explanatory or predictor variables.

The formula would look something like this:

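$$\sum_i w_i x_i + \text{bias} = w_1 x_1 + w_2 x_2 + w_3 x_3 + \text{bias}$$

$$\text{output} = f(x) = \begin{cases} 1 & \text{if } \sum_i w_i x_i + b \geq 0 \\ 0 & \text{if } \sum_i w_i x_i + b < 0 \end{cases}$$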

Once an input layer is determined, weights are assigned. These weights help determine the importance of any given variable, with larger ones contributing more significantly to the output compared to other inputs. All inputs are then multiplied by their respective weights and summed, and the bias is added. The result is passed through an activation function, which determines the node's output. If that output exceeds a given threshold, it “fires” (or activates) the node, passing data to the next layer in the network. The output of one node thus becomes the input of the next node. This process of passing data from one layer to the next defines this neural network as a feedforward network.
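The flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (step, node_output, layer_output, feedforward) are invented for this example, and the weights and biases are hand-picked so that the tiny network computes exclusive-or.

```python
def step(z):
    """Threshold activation: the node fires (1) when z >= 0, otherwise 0."""
    return 1 if z >= 0 else 0


def node_output(inputs, weights, bias):
    """Multiply each input by its weight, sum, add the bias, then activate."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(z)


def layer_output(inputs, layer):
    """A layer is a list of (weights, bias) pairs, one pair per node."""
    return [node_output(inputs, weights, bias) for weights, bias in layer]


# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node.
hidden_layer = [([1, 1], -1), ([1, 1], -2)]  # "x1 OR x2" and "x1 AND x2"
output_layer = [([1, -2], -1)]               # fires when OR is on and AND is off


def feedforward(inputs):
    """Each layer's output becomes the next layer's input."""
    hidden = layer_output(inputs, hidden_layer)
    return layer_output(hidden, output_layer)[0]


for pair in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(pair, feedforward(pair))
```

Note that data only ever moves forward here: the hidden layer never sees the output layer's result, which is what makes the sketch a feedforward network.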

Let’s break down what one single node might look like using binary values. We can apply this concept to a more tangible example, like whether you should go surfing (Yes: 1, No: 0). The decision to go or not to go is our predicted outcome, or y-hat.

Let’s assume that there are three factors influencing your decision-making:

Are the waves good? (Yes: 1, No: 0)
Is the line-up empty? (Yes: 1, No: 0)
Has there been a recent shark attack? (Yes: 0, No: 1)

Then, let’s assume the following inputs:

X1 = 1, since the waves are pumping
X2 = 0, since the crowds are out
X3 = 1, since there hasn’t been a recent shark attack

Now, we need to assign some weights to determine importance. Larger weights signify that particular variables are of greater importance to the decision or outcome.

W1 = 5, since large swells don’t come around often
W2 = 2, since you’re used to the crowds
W3 = 4, since you have a fear of sharks

Finally, we’ll also assume a threshold value of 3, which would translate to a bias value of –3. With all the various inputs, we can start to plug values into the formula to get the desired output.

Y-hat = (1*5) + (0*2) + (1*4) – 3 = 6

If we use the activation function from the beginning of this section, we can determine that the output of this node would be 1, since 6 is greater than 0. In this instance, you would go surfing; but if we adjust the weights or the threshold, we can achieve different outcomes from the model. When we observe one decision, like in the above example, we can see how a neural network could make increasingly complex decisions depending on the output of previous decisions or layers.
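The surfing decision above can be reproduced directly in code. The sketch below uses the inputs, weights and bias from the example; the function name perceptron and the stricter threshold of 10 in the second call are assumptions added for illustration.

```python
def perceptron(inputs, weights, bias):
    """Return (weighted sum plus bias, binary decision) for one node."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return z, (1 if z >= 0 else 0)


inputs = [1, 0, 1]    # good waves, crowded line-up, no recent shark attack
weights = [5, 2, 4]   # importance assigned to each factor
bias = -3             # threshold of 3 expressed as a bias

z, decision = perceptron(inputs, weights, bias)
print(z, decision)    # 6 1

# Hypothetical change: raising the threshold to 10 (bias of -10) flips the decision.
z_strict, decision_strict = perceptron(inputs, weights, -10)
print(z_strict, decision_strict)  # -1 0
```

The second call shows how adjusting the threshold alone is enough to change the outcome, even though the inputs and weights are the same.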

In the example above, we used perceptrons to illustrate some of the mathematics at play here, but neural networks typically leverage sigmoid neurons, which are distinguished by having output values between 0 and 1. Since neural networks behave similarly to decision trees, cascading data from one node to another, having x values between 0 and 1 reduces the impact that a change in any single variable has on the output of a given node and, subsequently, on the output of the neural network.
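To see the difference between a perceptron and a sigmoid neuron, it can help to compare the two activations side by side. This is a small sketch using the standard logistic function; the sample values of z are arbitrary.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def step(z):
    """Perceptron-style threshold activation."""
    return 1 if z >= 0 else 0


# Nudging the weighted sum across zero flips the step output entirely,
# while the sigmoid output only moves slightly.
for z in (-0.1, 0.1):
    print(z, step(z), round(sigmoid(z), 3))
```

A small change in the weighted sum produces a small change in the sigmoid output, which is what keeps a single variable from swinging the whole network.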

As we start to think about more practical use cases for neural networks, like image recognition or classification, we’ll leverage supervised learning, or labeled datasets, to train the algorithm. As we train the model, we’ll want to evaluate its accuracy using a cost (or loss) function. A commonly used cost function is the mean squared error (MSE). In the equation below,

i represents the index of the sample,
y-hat is the predicted outcome,
y is the actual value, and
m is the number of samples.

$$\text{Cost Function} = \text{MSE} = \frac{1}{2m} \sum_{i=1}^{m} \left(\hat{y}^{(i)} - y^{(i)}\right)^2$$
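The cost function above translates almost directly into code. The sketch below keeps the 1/(2m) factor from the equation; the function name mse_cost and the sample predictions are made up for illustration.

```python
def mse_cost(predictions, targets):
    """Cost from the equation above: (1 / 2m) * sum((y_hat - y)^2)."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    m = len(targets)
    squared_errors = ((y_hat - y) ** 2 for y_hat, y in zip(predictions, targets))
    return sum(squared_errors) / (2 * m)


y_hat = [0.9, 0.2, 0.8]   # hypothetical model outputs
y = [1, 0, 1]             # actual labels
print(mse_cost(y_hat, y))  # approximately 0.015
```

A perfect model would score 0; the further the predictions drift from the labels, the larger the cost.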

Ultimately, the goal is to minimize our cost function to ensure correctness of fit for any given observation. As the model adjusts its weights and bias, it uses the cost function to reach the point of convergence, or a local minimum. The algorithm adjusts its weights through gradient descent, which lets the model determine the direction to take to reduce errors (or minimize the cost function). With each training example, the parameters of the model adjust to gradually converge at the minimum.
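Gradient descent is easiest to see on a model with just one weight and one bias. The sketch below fits a line to four made-up points by repeatedly stepping against the gradient of the MSE cost. It assumes batch updates (all samples per step), and the function name, learning rate, step count and data are illustrative choices.

```python
def gradient_descent(xs, ys, learning_rate=0.1, steps=2000):
    """Fit y_hat = w * x + b by minimizing (1 / 2m) * sum((y_hat - y)^2)."""
    w, b = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost with respect to w and b.
        grad_w = sum(e * x for e, x in zip(errors, xs)) / m
        grad_b = sum(errors) / m
        # Step in the direction that reduces the cost.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]   # generated from y = 2x + 1
w, b = gradient_descent(xs, ys)
print(round(w, 4), round(b, 4))
```

With these settings the parameters settle very close to w = 2 and b = 1. A learning rate that is too large would overshoot the minimum instead of converging.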

Most deep neural networks are feedforward, meaning they flow in one direction only, from input to output. However, you can also train your model through backpropagation; that is, move in the opposite direction from output to input. Backpropagation allows us to calculate and attribute the error associated with each neuron, allowing us to adjust and fit the parameters of the model(s) appropriately.
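To make backpropagation concrete, here is a sketch for a tiny network with two inputs, two sigmoid hidden nodes and one sigmoid output, using a squared-error cost on a single sample. The architecture, the starting weights and all function names are assumptions made for this example. The final lines compare one backpropagated gradient with a finite-difference estimate as a sanity check.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Forward pass: input -> hidden layer (sigmoid) -> single output (sigmoid)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    y_hat = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)
    return hidden, y_hat


def cost(y_hat, y):
    return 0.5 * (y_hat - y) ** 2


def backprop(x, y, W1, b1, W2, b2):
    """Return gradients of the cost with respect to every weight and bias."""
    hidden, y_hat = forward(x, W1, b1, W2, b2)
    # Error attributed to the output node (chain rule through its sigmoid).
    delta_out = (y_hat - y) * y_hat * (1.0 - y_hat)
    grad_W2 = [delta_out * h for h in hidden]
    grad_b2 = delta_out
    # Pass the error backward to each hidden node.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(W2, hidden)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hidden]
    grad_b1 = delta_hidden
    return grad_W1, grad_b1, grad_W2, grad_b2


# Hypothetical sample and starting parameters.
x, y = [1.0, 0.0], 1.0
W1 = [[0.5, -0.3], [0.8, 0.2]]
b1 = [0.1, -0.1]
W2 = [0.7, -0.4]
b2 = 0.05

grad_W1, grad_b1, grad_W2, grad_b2 = backprop(x, y, W1, b1, W2, b2)

# Sanity check: compare one gradient with a finite-difference estimate.
eps = 1e-6
cost_plus = cost(forward(x, W1, b1, [W2[0] + eps, W2[1]], b2)[1], y)
cost_minus = cost(forward(x, W1, b1, [W2[0] - eps, W2[1]], b2)[1], y)
numeric = (cost_plus - cost_minus) / (2 * eps)
print(abs(numeric - grad_W2[0]) < 1e-6)
```

The forward pass runs input to output, while the error terms are computed output to input. Those gradients are what gradient descent would then use to update each weight.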

Applications of artificial neural networks

Image recognition was one of the first areas in which neural networks were successfully applied, but the technology's uses have expanded to many more areas:

Chatbots.
NLP, translation and language generation.
Stock market predictions.
Delivery driver route planning and optimization.
Drug discovery and development.
Social media.
Personal assistants.

Prime uses involve any process that operates according to strict rules or patterns and has large amounts of data. If the data involved is too large for a human to make sense of in a reasonable amount of time, the process is likely a prime candidate for automation through artificial neural networks.

Types of neural networks

Neural networks can be classified into different types, which are used for different purposes. While this isn’t a comprehensive list, the types below are representative of the most common neural networks that you’ll come across and their common use cases:

The perceptron is the oldest neural network, created by Frank Rosenblatt in 1958.

Feedforward neural networks, or multi-layer perceptrons (MLPs), are what we’ve primarily been focusing on within this article. They are comprised of an input layer, a hidden layer or layers, and an output layer. While these neural networks are also commonly referred to as MLPs, it’s important to note that they are actually comprised of sigmoid neurons, not perceptrons, as most real-world problems are nonlinear. Data usually is fed into these models to train them, and they are the foundation for computer vision, natural language processing, and other neural networks.

Convolutional neural networks (CNNs) are similar to feedforward networks, but they’re usually utilized for image recognition, pattern recognition, and/or computer vision. These networks harness principles from linear algebra, particularly matrix multiplication, to identify patterns within an image.
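The core operation in a CNN can be sketched as a small kernel sliding across an image, taking a sum of element-wise products at each position. The example below is a plain-Python illustration with no padding and a stride of 1. The image and the kernel are hand-picked; in a real CNN the kernel values would be learned during training. Strictly speaking this computes a cross-correlation, which is what deep learning libraries usually call convolution.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return
    the sum of element-wise products at each position."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# Hypothetical 4x4 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]
print(convolve2d(image, kernel))  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The output is largest in the middle column, exactly where the image changes from dark to bright, which is how a kernel picks out a pattern.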

Recurrent neural networks (RNNs) are identified by their feedback loops. These learning algorithms are primarily leveraged when using time-series data to make predictions about future outcomes, such as stock market predictions or sales forecasting.
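The feedback loop that defines an RNN can be illustrated with a single recurrent unit whose state is fed back in at every time step. This is a minimal sketch with scalar inputs and a tanh activation. The function names and the weights (w_x, w_h, b) are arbitrary values chosen for the example, not trained parameters.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: the new state depends on the current input
    and on the previous state (the feedback loop)."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the unit one element at a time."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# The same input value (1.0) appears twice, yet the states differ,
# because the second state also carries memory of the first.
states = run_rnn([1.0, 1.0, 0.0])
print([round(h, 3) for h in states])
```

Even the final input of 0.0 yields a nonzero state, because the unit still remembers what came before. That memory is what makes RNNs suited to time-series data.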

Neural networks vs. deep learning

Deep learning and neural networks tend to be used interchangeably in conversation, which can be confusing. It’s worth noting that the “deep” in deep learning simply refers to the depth of layers in a neural network. A neural network that consists of more than three layers, including the input and output layers, can be considered a deep learning algorithm. A neural network that has only two or three layers is just a basic neural network.

Multi-Layered Perceptron

In a multi-layered perceptron (MLP), perceptrons are arranged in interconnected layers. The input layer collects input patterns. The output layer has classifications or output signals to which input patterns may map. For instance, the patterns may comprise a list of quantities for technical indicators about a security; potential outputs could be “buy,” “hold” or “sell.”
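As a rough sketch of how an output layer maps input patterns to classifications, the code below runs two made-up indicator values through a small MLP and picks the most probable of "buy", "hold" or "sell". Everything here is hypothetical: the weights are hand-set rather than trained, and the softmax output layer and tanh hidden activation are illustrative choices.

```python
import math

LABELS = ["buy", "hold", "sell"]


def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Weighted sums for one layer; weights has one row per node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def classify(indicators, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> hidden layer (tanh) -> output layer (softmax)."""
    hidden = [math.tanh(z) for z in dense(indicators, hidden_w, hidden_b)]
    probs = softmax(dense(hidden, out_w, out_b))
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best], probs


# Hypothetical indicator values and hand-set (untrained) parameters.
indicators = [0.6, -0.2]
hidden_w = [[1.0, 0.0], [0.0, 1.0]]
hidden_b = [0.0, 0.0]
out_w = [[2.0, 0.0], [0.0, 0.0], [-2.0, 0.0]]
out_b = [0.0, 0.0, 0.0]

label, probs = classify(indicators, hidden_w, hidden_b, out_w, out_b)
print(label, [round(p, 3) for p in probs])
```

In practice the weights would come from training on labeled data, and the output should be read as a model estimate rather than a recommendation.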

Hidden layers fine-tune the input weightings until the neural network’s margin of error is minimal. It is hypothesized that hidden layers extract salient features in the input data that have predictive power regarding the outputs. This describes feature extraction, which serves a purpose similar to that of statistical techniques such as principal component analysis.

History of Neural Networks

Though the concept of integrated machines that can think has existed for centuries, the largest strides in neural networks have come in the past 100 years. In 1943, Warren McCulloch and Walter Pitts from the University of Illinois and the University of Chicago published "A Logical Calculus of the Ideas Immanent in Nervous Activity". The research analyzed how the brain could produce complex patterns and how that activity could be simplified down to a binary logic structure with only true/false connections.

Frank Rosenblatt from the Cornell Aeronautical Laboratory is credited with the development of the perceptron in 1958. His research introduced weights to McCulloch's and Pitts's work, and Rosenblatt leveraged his work to demonstrate how a computer could use neural networks to detect images and make inferences.

Even though there was a dry spell of research (largely due to a dry spell in funding) during the 1970s, Paul Werbos is often credited with the primary contribution of this period: his PhD thesis, which described backpropagation.

Then, in 1982, John Hopfield published a paper on recurrent neural networks that introduced what is now known as the Hopfield network. In addition, the concept of backpropagation resurfaced, and many researchers began to understand its potential for neural nets.

Most recently, more specific neural network projects are being generated for direct purposes. For example, Deep Blue, developed by IBM, conquered the chess world by pushing the ability of computers to handle complex calculations.

Though publicly known for beating the world chess champion, these types of machines are also leveraged to discover new medicines, identify financial market trends, and perform massive scientific calculations.

Pros of using Neural Networks

Can often work more efficiently and for longer than humans
Can be programmed to learn from prior outcomes to strive to make smarter future calculations
Often leverage online services that reduce (but do not eliminate) systematic risk
Are continually being expanded in new fields with more difficult problems

Cons of using Neural Networks

Still rely on hardware that may require labor and expertise to maintain
May take long periods of time to develop the code and algorithms
May be difficult to assess errors or adaptations to the assumptions if the system is self-learning but lacks transparency
Usually report an estimated range or estimated amount that may not actualize

Neural networks represent a transformative tool in modern computing, adept at tackling intricate tasks with unparalleled efficiency. Their capacity to learn from data and discern intricate patterns has led to groundbreaking advancements across numerous fields. As we refine their design and optimize their functionality, neural networks promise to remain at the forefront of artificial intelligence, driving innovation and shaping the future of technology.

It wasn't until around 2010 that research in neural networks picked up great speed. The big data trend, where companies amass vast troves of data, and parallel computing gave data scientists the training data and computing resources needed to run complex artificial neural networks. In 2012, a neural network named AlexNet won the ImageNet Large Scale Visual Recognition Challenge, an image classification competition. Since then, interest in artificial neural networks has soared and the technology has continued to improve.

credits: IBM.com
