One of the most important distinctions between an agent and an intelligent agent is the ability to learn. Learning provides an agent with the ability to operate in unknown situations and environments. Many techniques have been developed to endow agents with learning capabilities and help evolve their behaviour. This article begins with a brief overview of learning techniques by discussing Artificial Neural Networks (ANNs), their architecture, and different learning paradigms.
The Importance of Machine Learning Paradigms
Over the past few decades, many Artificial Intelligence techniques have been proposed and developed to make smarter entities and applications that can perform human-like activities (as discussed in the Brief History of Artificial Intelligence). Early AI techniques evolved to create domain-specific Expert Systems, and included symbolic computation, search-based, and rule-based approaches. Although these techniques have enabled us to create reasonably smart entities capable of reasoning and decision-making, they have drawbacks and limitations.
Developing Expert Systems requires hard-coding expert knowledge and manually adding rules to the system, a task that is time-consuming, tedious, error-prone, and increasingly complex as the amount of data the system must deal with grows. Additionally, with traditional techniques it is hard to predict and anticipate all the scenarios for which rules will be required to create intelligent behaviours in agents. As the technology evolved, more modern AI techniques emerged to model cognitive aspects of human behaviour, including reasoning, planning, and learning.
Modern AI techniques allow agents to learn, adapt, and evolve, thereby creating a more realistic cognitive behaviour pattern. The main categories of these techniques are Evolutionary Algorithms and Artificial Neural Networks, which were inspired by the way biological life evolves and the way the human nervous system processes information, respectively. These techniques can be used to adapt solutions to new problems without the need for explicit human knowledge [1]. They have the ability to learn from examples, extract knowledge from data, and process imprecise information.
Multi-agent learning has attracted considerable attention in the research community. “Two major categories of cooperative multi-agent learning approaches are team learning and concurrent learning” [2]. Team learning involves training a single learner to improve the behaviour of the whole team. Concurrent learning uses multiple concurrent learning processes and allows each learner to improve its own behaviour via its own learning process. In other words, learners modify their behaviour independently of one another.
The main learning techniques investigated and implemented within this research follow the concurrent multi-agent learning approach.
What is an Artificial Neural Network?
An Artificial Neural Network is an information-processing concept, “composed of a large number of simple interconnected processing elements (neurons) working in parallel to solve specific problems” [3]. The computational model of the Neural Network mimics how neurons in the biological brain process information to learn new patterns and adapt to changes. Unlike conventional algorithms, a Neural Network cannot be programmed to perform a specific task; instead, it learns and adapts from experience. Just like biological neurons, artificial neurons receive incoming signals, process them, and derive an output. The input signal might be raw data or the output of another neuron. Generally, a multi-layer Neural Network consists of one input layer and one output layer, and may have a number of intermediate (or hidden) layers. Each layer may have one or more neurons. Figure 1 shows a three-layer ANN with one hidden layer.
The Structure of the Neuron Model in Neural Networks
According to a simple mathematical model of the neuron proposed by McCulloch and Pitts [4, 5], the inputs of an ANN are weighted, and the effect of each input on decision-making depends on the weight of that particular input. Neurons are connected by weighted links, and the weights determine the strength of these connections (as shown in Figure 2) [6].
In fixed networks the weights are fixed, while in adaptive networks they can be changed. The activation of a neuron is the weighted sum of its inputs, given by the equation below:

a = \sum_{i=1}^{n} w_i x_i

where x_i is an input of the neuron and w_i is its weight.
The output of a Neural Network depends on whether the activation exceeds a pre-set threshold value T. Should the activation exceed T, the neuron fires, as represented by the following equation:

y = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} w_i x_i \geq T \\ 0 & \text{otherwise} \end{cases}

This equation may also be written as follows:

y = \begin{cases} 1 & \text{if } \sum_{i=0}^{n} w_i x_i \geq 0 \\ 0 & \text{otherwise} \end{cases}

In the second form, T is represented as a fixed input w_0 x_0 by setting x_0 to -1 and w_0 to T. The w_0 weight is referred to as the bias weight. By including the bias weight in the equation, the threshold value can be adjusted in the same way as the other weights.
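As a concrete illustration, here is a minimal sketch of such a threshold neuron in Python with NumPy, in both the explicit-threshold and bias-weight forms. The function names and the AND-gate example are illustrative choices, not from the original text.

```python
import numpy as np

def threshold_neuron(inputs, weights, threshold):
    """Fire (output 1) if the weighted sum of inputs reaches
    the threshold T, otherwise output 0."""
    activation = np.dot(weights, inputs)  # a = sum over i of w_i * x_i
    return 1 if activation >= threshold else 0

def threshold_neuron_with_bias(inputs, weights, threshold):
    """Equivalent form: fold the threshold in as a bias weight
    by prepending x_0 = -1 and w_0 = T, then compare against zero."""
    x = np.concatenate(([-1.0], inputs))
    w = np.concatenate(([threshold], weights))
    return 1 if np.dot(w, x) >= 0 else 0

# Example: a 2-input neuron acting as a logical AND
# (weights of 1.0 each; a threshold of 1.5 fires only when both inputs are 1).
print(threshold_neuron(np.array([1.0, 1.0]), np.array([1.0, 1.0]), 1.5))            # 1
print(threshold_neuron_with_bias(np.array([1.0, 0.0]), np.array([1.0, 1.0]), 1.5))  # 0
```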
Activation Functions of Neurons in Artificial Neural Networks
The output and behaviour of an ANN depend on both the weights and the transfer function (also referred to as the activation function) that is specified for the units [7]. This function typically falls into one of three categories: threshold, linear, or sigmoid. The Figure below shows three commonly used transfer functions, one from each of these categories.
The threshold transfer function (or step function) limits the output to one of two levels. This function was used by the original perceptron learning algorithm [8]. In the case of a hard-limit function, as shown in the Figure above, the output is either 0 or 1 depending on whether the total input is less than or greater than some threshold value, respectively.
The linear transfer function varies the output activity linearly: the output is proportional to the weighted sum of the inputs plus a bias.
The sigmoid transfer function (also known as the logistic function) varies the output continuously as the input changes. There are different types of sigmoid functions, including log-sigmoid and tan-sigmoid. The log-sigmoid function squashes the output into the range 0 to 1 by the following equation:

f(a) = \frac{1}{1 + e^{-a/p}}

where e is a mathematical constant approximately equal to 2.7183 and a is the activation of the neuron, i.e. the sum of all its input values multiplied by the corresponding connection weights, as shown in the first equation. The parameter p is the activation response, a positive coefficient representing the gain of the sigmoid function, which controls the shape of the sigmoid curve [13]. As p increases the curve becomes flatter, and as it decreases the curve becomes steeper (shown in the Figure below). Setting p to a very low value creates an effect much like a step function, which makes the Neural Network more sensitive to change. In this research, the log-sigmoid activation function is used, and the activation response is tweaked along with other parameters of the Neural Network to evolve its structure.
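The three categories of transfer function described above are easy to express in code. The following minimal Python sketch (the function names and default parameters are illustrative) shows a hard-limit step, a linear function, and the log-sigmoid with its activation response p:

```python
import numpy as np

def step(a, threshold=0.0):
    """Threshold (hard-limit) transfer function: output is 0 or 1."""
    return np.where(a >= threshold, 1.0, 0.0)

def linear(a, slope=1.0, bias=0.0):
    """Linear transfer function: output proportional to the activation."""
    return slope * a + bias

def log_sigmoid(a, p=1.0):
    """Log-sigmoid with activation response p: squashes output into (0, 1).
    Larger p flattens the curve; a very small p approximates a step."""
    return 1.0 / (1.0 + np.exp(-a / p))

a = np.linspace(-5.0, 5.0, 5)      # activations [-5, -2.5, 0, 2.5, 5]
print(step(a))                     # [0. 0. 1. 1. 1.]
print(log_sigmoid(a, p=1.0))       # smooth S-curve in (0, 1)
print(log_sigmoid(a, p=0.1))       # nearly a hard step
```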
Architectures of Neural Networks
The architecture of a Neural Network is defined by its attributes, including weights, transfer function, topology, and learning algorithm [6]. Neural Network topology refers to the number of neurons and the structure of their interconnections. The structure of a Neural Network falls into two main categories: the feed-forward network and the feedback network. Feed-forward Neural Networks allow a signal to travel in only one direction, namely from input to output.
Figure 1 is an example of a feed-forward Neural Network. The feedback (or recurrent) networks enable neurons in one layer to connect with other neurons in the same and previous layers. In other words, the signal in a feedback network can travel in both directions. This type of Neural Network can get very complicated but is more powerful. It can change dynamically based on input changes until it reaches an equilibrium state. It can also support short-term memory as its response may depend on previous inputs. Figure 5 shows an example of a feedback network.
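As a rough sketch of how a feed-forward network propagates a signal, the following Python fragment pushes an input vector through successive weight matrices. Biases are omitted for brevity, and the 3-2-1 layer sizes merely echo the three-layer example of Figure 1; none of this comes from the original text.

```python
import numpy as np

def log_sigmoid(a, p=1.0):
    return 1.0 / (1.0 + np.exp(-a / p))

def feed_forward(x, layer_weights):
    """Propagate an input vector through successive layers.
    Each element of layer_weights is one layer's weight matrix;
    the signal travels in one direction only, from input to output."""
    signal = x
    for w in layer_weights:
        signal = log_sigmoid(w @ signal)
    return signal

# A 3-2-1 network: 3 inputs, one hidden layer of 2 neurons, 1 output.
rng = np.random.default_rng(0)
layer_weights = [rng.normal(size=(2, 3)),   # input -> hidden
                 rng.normal(size=(1, 2))]   # hidden -> output
print(feed_forward(np.array([0.5, -0.2, 0.8]), layer_weights))
```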
The learning algorithm is the method used to train the Neural Network. The concept of incorporating learning with Neural Networks, and adapting the connections between nodes, was introduced by Hebb, a pioneer of Neural Networks [7]. The training process involves tweaking the weights of the connections in order to map a set of inputs to a set of outputs. This process can be done in online or offline mode [9, 10]. In the online learning process, the agent learns while it is performing its tasks. In the offline learning mode, the learning phase is distinct from the operation phase.
Over the past few decades, researchers have proposed a variety of different Neural Network architectures and learning paradigms. Some examples of these include Perceptron [8], Hopfield network [11, 12], Multi-Layer Perceptron [14], Adaptive Resonance Theory (ART) networks [15, 16], and Self-Organising Map (SOM) [17]. Each of these techniques has pros and cons and is suitable for specific applications. For example, SOM provides a nice visualisation of output results by putting them into a 2D map.
Minsky and Papert showed the limitations of the perceptron in learning linearly inseparable problems [18]. They proved this point by demonstrating that a perceptron network consisting of only an input and an output layer cannot learn to solve the XOR function. This problem held up Neural Network research for a while, but in the early eighties researchers realised that the problems with the perceptron could be solved using the Multi-Layer Perceptron with the backpropagation learning method [12, 14, 19].
The backpropagation learning technique creates a network with one or more hidden layers and randomises the weights of the neuron connections. An input pattern is then presented to the network, and the output is calculated and compared with the target output to find the difference between them, referred to as the ‘error value’. The weights of the preceding layers are adjusted so that the error value would be smaller if the same input pattern were presented to the network again. This process is repeated with different input patterns until the error value falls below an acceptable level, at which point the network is trained.
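Below is a minimal sketch of this procedure, assuming a small multi-layer perceptron with log-sigmoid units trained on the XOR problem discussed above. The layer sizes, learning rate, and epoch count are illustrative assumptions, not values from the original text.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# XOR training patterns: inputs X and desired target outputs T.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden weights
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output weights
lr = 0.5                                        # learning rate

for epoch in range(10000):
    # Forward pass: compute the network output for all patterns.
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    error = y - T                               # the 'error value'

    # Backward pass: propagate the error and adjust earlier weights.
    delta_out = error * y * (1 - y)
    delta_hid = (delta_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ delta_out
    b2 -= lr * delta_out.sum(axis=0)
    W1 -= lr * X.T @ delta_hid
    b1 -= lr * delta_hid.sum(axis=0)

# After training, the outputs should approach the targets [0, 1, 1, 0].
print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2).ravel())
```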
In general, the Neural Network learning paradigms fall into three major categories: Supervised, Unsupervised, and Reinforcement Learning.
What is Supervised Learning in Neural Networks?
In Supervised Learning, the Neural Network is presented with a set of input data and corresponding desired target values, and training consists of finding the mapping function between the inputs and their correct (desired) outputs [6]. Each output unit is told what its desired response to input signals should be, and the connection weights are adjusted based on feedback about its performance. The training process is repeated, adjusting the weights of the connections between neurons to move the actual outputs of the network closer to the target outputs, until the difference between them falls below a certain predetermined value.
Supervised Learning is mostly suited to fully observable environments where the effects of actions are well known beforehand and a series of input patterns exist that can be mapped to corresponding output patterns.
Some examples of supervised learning algorithms are:
- Linear Regression
- Logistic Regression
- Support Vector Machines
- Decision Trees
- Random Forests
- k-Nearest Neighbors
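As a concrete instance of the first item in this list, the sketch below fits a linear regression to labelled examples using a closed-form least-squares solution. The synthetic data and the true coefficients (3 and 1) are invented for illustration.

```python
import numpy as np

# Labelled examples: noisy samples of the mapping y = 3x + 1.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=50)
t = 3.0 * x + 1.0 + rng.normal(scale=0.5, size=50)   # desired targets

# Supervised learning in closed form: fit weight and bias by least squares.
A = np.column_stack([x, np.ones_like(x)])
coeffs, *_ = np.linalg.lstsq(A, t, rcond=None)
w, b = coeffs
print(f"learned mapping: y ~ {w:.2f} * x + {b:.2f}")  # close to 3 and 1
```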
What is Unsupervised Learning in Neural Networks?
In Unsupervised Learning, no specific target outputs are available, and the Neural Network finds patterns in the data without receiving any help or feedback from the environment [20]. The goal of unsupervised learning is to find hidden structures in data; it is often used to find clusters of data points. The weights and biases are modified in response to the network inputs only. Applications of this learning technique include pattern classification, clustering, and data mining.
Some examples of unsupervised learning are:
- Clustering algorithms, such as k-means
- Dimensionality reduction algorithms, such as principal component analysis
- Anomaly detection algorithms
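To make the clustering case concrete, here is a minimal k-means sketch. The blob data, the fixed iteration count, and the assumption that no cluster ever becomes empty are all illustrative simplifications.

```python
import numpy as np

def k_means(points, k, iterations=100, seed=0):
    """Basic k-means: alternately assign points to the nearest
    centroid, then move each centroid to the mean of its cluster.
    No labels are used; structure is found from the data alone."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iterations):
        # Distance from every point to every centroid.
        distances = np.linalg.norm(points[:, None] - centroids, axis=2)
        labels = distances.argmin(axis=1)
        centroids = np.array([points[labels == i].mean(axis=0)
                              for i in range(k)])
    return labels, centroids

# Two well-separated blobs of 2D points.
rng = np.random.default_rng(3)
data = np.vstack([rng.normal(0.0, 0.5, (20, 2)),
                  rng.normal(5.0, 0.5, (20, 2))])
labels, centres = k_means(data, k=2)
print(centres)  # roughly (0, 0) and (5, 5)
```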
What is Reinforcement Learning?
Reinforcement Learning techniques enable agents to learn by receiving positive or negative reinforcement from the environment [21]. The learner agent is not told what actions to take but is given feedback (or reinforcement) based on the results of the actions it takes. The reinforcement takes the form of reward or punishment based on the desirability of the agent's actions: if an action is desirable the agent is rewarded, and if not it is punished. The agent's aim is to maximise its cumulative reward.
Reinforcement learning is useful in dynamic environments and unpredictable situations. Some examples of the Reinforcement Learning paradigm include Temporal Difference Learning [22], SARSA, and Q-Learning [23].
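The following minimal Q-Learning sketch shows the reward-driven update on an invented five-state corridor environment. The environment, rewards, and hyperparameters are illustrative assumptions, not from the original text.

```python
import numpy as np

# A tiny 1-D corridor: states 0..4, actions 0 = left, 1 = right.
# Reaching state 4 yields a reward of +1; every other step yields 0.
N_STATES, N_ACTIONS, GOAL = 5, 2, 4
alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount, exploration

def env_step(state, action):
    next_state = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

Q = np.zeros((N_STATES, N_ACTIONS))
rng = np.random.default_rng(0)

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the best known action,
        # occasionally explore a random one.
        if rng.random() < epsilon:
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(Q[state].argmax())
        next_state, reward, done = env_step(state, action)
        # Q-learning update: nudge Q towards the received reward
        # plus the discounted best future value.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max()
                                     - Q[state, action])
        state = next_state

print(Q.argmax(axis=1))  # learned policy: 'right' (1) in states 0..3
```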
Conclusion
In this article, we discussed the importance of machine learning paradigms and, in particular, Artificial Neural Networks. We then looked into the structure of a neural network neuron and how it works, explaining its architecture and some of the activation functions that trigger the neuron's output.
A Neural Network is a network of simple neurons capable of processing the information that is input to them and adjusting their parameters and structure to learn a pattern of behaviour through experience. Various techniques have been developed to enable this learning process. They fall into three categories: Supervised, Unsupervised, and Reinforcement Learning. Each technique may be best suited to specific applications or situations depending on different factors, including the type of environment and the way feedback is provided to agents.
The next article gives you an introduction to Evolutionary Algorithms, Genetic Algorithms, and NeuroEvolution of Augmenting Topologies (NEAT).
References:
- D. Fogel. Review of computational intelligence: Imitating life. IEEE Trans. on Neural Networks, 6:1562–1565, 1995.
- L. Panait and S. Luke. Cooperative multi-agent learning: The state of the art. Autonomous Agents and Multi-Agent Systems, 11(3):387–434, 2005.
- R. P. Lippmann. An introduction to computing with neural nets. IEEE ASSP Magazine, 4(2):4–22, April 1987.
- W. S. McCulloch and W. Pitts. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5:115–133, 1943.
- W. Pitts and W. S. McCulloch. How we know universals: The perception of auditory and visual forms. Bulletin of Mathematical Biophysics, 9:127–147, 1947.
- S. J. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice-Hall, Pearson Education, Inc., Upper Saddle River, NJ, USA, 2nd edition, 2003.
- D. O. Hebb. The Organization of Behavior. Wiley, New York, 1949.
- F. Rosenblatt. The perceptron, a probabilistic model for information storage and organisation in the brain. Psychological Review, 65:386–408, 1958.
- S. Ben-David, E. Kushilevitz, and Y. Mansour. Online learning versus offline learning. Machine Learning, 29(1):45–63, October 1997.
- N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. In Machine Learning, pages 285–318, 1988.
- J. J. Hopfield. Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Science, 79:2554–2558, 1982.
- J. J. Hopfield. Neurons with graded responses have collective computational properties like those of two-state neurons. In Proceedings of the National Academy of Sciences (USA), volume 81, pages 3088–3092, 1984.
- M. Buckland. AI Techniques for Game Programming. Premier Press, Cincinnati, OH, USA, 2002.
- D.E. Rumelhart, G.E. Hinton, and R.J. Williams. Learning internal representations by error propagation. In D.E. Rumelhart and J.L. McClelland, editors, Parallel distributed processing: Explorations in the Microstructure of Cognition, volume 1. MIT Press, Cambridge, MA, 1986.
- G. A. Carpenter and S. Grossberg. ART 2: Self-organization of stable category recognition codes for analog input patterns. Applied Optics, 26(23):4919–4930, 1987.
- G. A. Carpenter and S. Grossberg. Neural networks for vision and image processing. MIT Press, Cambridge, Mass, 1992.
- T. Kohonen. Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43:59–69, 1982.
- M. L. Minsky and S. A. Papert. Perceptrons. Cambridge, MA: MIT Press, 1969.
- D. H. Ackley, G. E. Hinton, and T. J. Sejnowski. A learning algorithm for Boltzmann machines. Cognitive Science, 9:147–169, 1985.
- G. E. Hinton and T. J. Sejnowski, editors. Unsupervised Learning: Foundations of Neural Computation. MIT Press, 1999.
- R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998.
- R. S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3:9–44, 1988.
- C. J. C. H. Watkins and Peter Dayan. Technical note: Q-learning. Machine Learning, 8(3):279–292, May 1992.