This work proposes a general Reservoir Computing (RC) learning framework which can be used to learn navigation behaviors for mobile robots in simple and complex unknown, partially observable environments. RC provides an efficient way to train recurrent neural networks by letting the recurrent part of the network (called reservoir) fixed while only a linear readout output layer is trained. The proposed RC framework builds upon the notion of navigation attractor or behavior which can be embedded in the high-dimensional space of the reservoir after learning. The learning of multiple behaviors is possible because the dynamic robot behavior, consisting of a sensory-motor sequence, can be linearly discriminated in the high-dimensional nonlinear space of the dynamic reservoir. Three learning approaches for navigation behaviors are shown in this paper. The first approach learns multiple behaviors based on examples of navigation behaviors generated by a supervisor, while the second approach learns goal-directed navigation behaviors based only on rewards. The third approach learns complex goal-directed behaviors, in a supervised way, using an hierarchical architecture whose internal predictions of contextual switches guide the sequence of basic navigation behaviors towards the goal.
During my PhD, I've worked mainly on Reservoir Computing (RC) architectures with application to modeling cognitive capabilities for mobile robots from sensor data and sometimes through interaction with the environment.
Reservoir Computing (RC) is an efficient method for trainning recurrent neural networks, which can handle spatio-temporal processing tasks, such as speech recognition. These networks are also biological plausible, as recently argued in the literature.
In my case, I used these RC networks for modeling a wide range of capabilities for mobile robots, such as:
My publications are listed and can be downloaded in Google Scholar or here.
Some simulated and real robots employed in the experiments:
Environment used for localization experiments using the real e-puck robot:
After using unsupervised learning methods for self-localization, the plots below show the mean activation of place cells as a function of the robot position in the environment.
Red denotes a high response whereas blue denotes a low response.
It is possible to perform map generation through sensory prediction given the robot position as input. Black points represent the sensory readings whereas gray points are the robot trajectory.
In the context of energy distribution networks, frauds are non-technical losses (NTL) that may account for up to 40% of the total distributed energy in some developing countries. The fraudster alters the eletricity meter in order to pay less than the right amount. In this context, the discovery or detection of frauds is necessary in order to decrease the non-technical losses of the energy distribution networks, consequently enhancing the stability and reliability of the network.
This project proposes the use of Recurrent Neural Networks (RNNs) for projecting a times series into a spatial dimension such that it can be used as a universal temporal feature for fraud detection predictive models. The particular problem tackled here jointly with the partner company is to predict whether a given time series of monthly energy consumption data is likely to indicate a fraud (NTL) or not.
Two main approaches are planned to be used with RNNs: supervised learning with bias correction techniques, and self-organized models for unsupervised learning of new fraud (anomaly) patterns.Finally, a last step is to integrate both of the previously developed models into an unified architecture that learns the responsibilities of each model in an online way by feedback from the environment using the results of the inspections of the fraudsters - the ground truth for some of the predictions.
This project has potential not only for generating significant technological and commercial value for the industrial partner, but also outstanding scientific output, being applicable in the long-term to other fields such as monitoring, prognosis/diagnostics in robotics, medical systems and security applications.
With Prof. Thorsteinn Rögnvaldsson, from Halmstad University, Sweden, we are looking at how Reservoir Computing can help in deviation detection in a fleet of Swedish city buses using a signal from the air tank pressure from the buses in order to predict when a bus is going to break well in advance.
Process measurements are of vital importance for monitoring and control of industrial plants. When we consider offshore oil production platforms, wells that require gas-lift technology to yield oil production from low pressure oil reservoirs can become unstable under some conditions. This undesirable phenomenon is usually called slugging flow, and can be identified by an oscillatory behavior of the downhole pressure measurement.
Given the importance of this measurement and the unreliability of the related sensor, this work aims at designing data-driven soft-sensors for downhole pressure estimation in two contexts: one for speeding up first-principled model simulation of a vertical riser model; and another for estimating the downhole pressure using real-world data from an oil well from Petrobras based only on topside platform measurements. Both tasks are tackled by employing Echo State Networks (ESNs) as an efficient technique for training Recurrent Neural Networks.
We show that a single ESN is capable of robustly modeling both the slugging flow behavior and a steady state based only on a square wave input signal representing the production choke opening in the vertical riser. Besides, we compare the performance of a standard network to the performance of a multiple timescale hierarchical architecture in the second task and show that for some periods the latter architecture performs better.
This work proposes a hierarchical biologically-inspired architecture for learning sensor-based spatial representations of a robot environment in an unsupervised way. The first layer is comprised of a fixed randomly generated recurrent neural network, the reservoir, which projects the input into a high-dimensional, dynamic space. The second layer learns instantaneous slowly-varying signals from the reservoir states using Slow Feature Analysis (SFA), whereas the third layer learns a sparse coding on the SFA layer using Independent Component Analysis (ICA). While the SFA layer generates non-localized activations in space, the ICA layer presents high place selectivity, forming a localized spatial activation, characteristic of place cells found in the hippocampus area of the rodent’s brain. We show that, using a limited number of noisy short-range distance sensors as input, the proposed system learns a spatial representation of the environment which can be used to predict the actual location of simulated and real robots, without the use of odometry. The results confirm that the reservoir layer is essential for learning spatial representations from low-dimensional input such as distance sensors. The main reason is that the reservoir state reflects the recent history of the input stream. Thus, this fading memory is essential for detecting locations, mainly when locations are ambiguous and characterized by similar sensor readings.
Video for data generation:
Publications
Eric Antonelo and Benjamin SchrauwenLearning slow features with reservoir computing for biologically-inspired robot localization NEURAL NETWORKS, pp. 178-190 (2011)
Eric Antonelo and Benjamin SchrauwenTowards autonomous self-localization of small mobile robots using reservoir computing and slow feature analysis IEEE International conference on Systems, Man, and Cybernetics, Conference digest, Vol. 2, pp. (2009)
Eric Antonelo and Benjamin SchrauwenUnsupervised learning in reservoir computing : modeling hippocampal place cells for small mobile robots LECTURE NOTES IN COMPUTER SCIENCE, Vol. 5768, pp. 747-756 (2009)
In this work we propose a hierarchical architecture which constructs internal models of a robot environment for goal-oriented navigation by an imitation learning process. The proposed architecture is based on the Reservoir Computing paradigm for training Recurrent Neural Networks (RNN). It is composed of two randomly generated RNNs (called reservoirs), one for modeling the localization capability and one for learning the navigation skill. The localization module is trained to detect the current and previously visited robot rooms based only on 8 noisy infra-red distance sensors. These predictions together with distance sensors and the desired goal location are used by the navigation network to actually steer the robot through the environment in a goal-oriented manner. The training of this architecture is performed in a supervised way (with examples of trajectories created by a supervisor) using linear regression on the reservoir states. So, the reservoir acts as a temporal kernel projecting the inputs to a rich feature space, whose states are linearly combined to generate the desired outputs. Experimental results on a simulated robot show that the trained system can localize itself within both simple and large unknown environments and navigate successfully to desired goals.
Eric Antonelo and Benjamin SchrauwenOn Learning Navigation Behaviors for Small Mobile Robots with Reservoir Computing Architectures IEEE Transactions on Neural Networks and Learning Systems, Vol. 26 pp. 763-780 (2014). DOI: 10.1109/TNNLS.2014.2323247.
Eric Antonelo and Benjamin SchrauwenSupervised learning of internal models for autonomous goal-oriented robot navigation using Reservoir Computing IEEE International conference on Robotics and Automation, Proceedings, pp. 6 (2010)
In this work we tackle the road sign problem with Reservoir Computing (RC) networks. The T-maze task (a particular form of the road sign problem) consists of a robot in a T-shaped environment that must reach the correct goal (left or right arm of the T-maze) depending on a previously received input sign. It is a control task in which the delay period between the sign received and the required response (e.g., turn right or left) is a crucial factor. Delayed response tasks like this one form a temporal problem that can be handled very well by RC networks. Reservoir Computing is a biologically plausible technique which overcomes the problems of previous algorithms such as Backpropagation Through Time - which exhibits slow (or non-) convergence on training. RC is a new concept that includes a fast and efficient training algorithm. We show that this simple approach can solve the T-maze task efficiently.
Video showing trained RC network controlling the robot:
Publications
Eric Antonelo, Benjamin Schrauwen and Dirk StroobandtMobile Robot Control in the Road Sign Problem using Reservoir Computing Networks Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 911-916 (2008)
Eric Antonelo, Benjamin Schrauwen and Jan Van CampenhoutGenerative Modeling of Autonomous Robots and their Environments using Reservoir Computing Neural Processing Letters, Vol. 26(3), pp. 233-249 (2007)
Reservoir Computing (RC) techniques use a fixed (usually randomly created) recurrent neural network, or more generally any dynamic system, which operates at the edge of stability, where only a linear static readout output layer is trained by standard linear regression methods. In this work, RC is used for detecting complex events in autonomous robot navigation. This can be extended to robot localization tasks which are solely based on a few low-range, high-noise sensory data. The robot thus builds an implicit map of the environment (after learning) that is used for efficient localization by simply processing the stream of distance sensors. These techniques are demonstrated in both a simple simulation environment and in the physically realistic Webots simulation of the commercially available e-puck robot, using several complex and even dynamic environments.
Videos showing data generation for event detection and localization:
Title of Master thesis: A Neural Reinforcement Learning Approach for Intelligent Autonomous Navigation Systems
Classical reinforcement learning mechanisms and a modular neural network are unified to conceive an intelligent autonomous system for mobile robot navigation. The conception aims at inhibiting two common navigation deficiencies: generation of unsuitable cyclic trajectories and ineffectiveness in risky configurations. Different design apparatuses are considered to compose a system to tackle with these navigation difficulties, for instance: 1) neuron parameter to simultaneously memorize neuron activities and function as a learning factor, 2) reinforcement learning mechanisms to adjust neuron parameters (not only synapse weights), and 3) a inner-triggered reinforcement. Simulation results show that the proposed system circumvents difficulties caused by specific environment configurations, improving the relation between collisions and captures.
Video (inhibiting unsuitable cyclic trajectories through reinforcement learning):
The robot starts not knowing what it should do in the environment, but as times passes, we can see that it interacts with the environment by colliding against obstacles and capturing targets (yellow boxes). Each collision elicits an appropriate innate response, i.e., aversion. As more collisions take place, its neural network learns to associate obstacles (and its blue color) with aversion behaviors such that it can deviate from obstacles (emergent behavior). The same process occurs for target capture being associated with attraction behavior through learning. In the end, the robot can navigate the environment efficiently, capturing targets, effectively suppressing cyclic trajectories common to such reactive systems.
Video (robot cooperation; each robot trained with previous neural network architecture)
The intelligent autonomous system corresponds to a neural network arranged in three layers (Fig. 4). In the first layer there are two neural repertoires: Proximity Identifier repertoire (PI) and Color Identifier repertoire (CI). Distance sensors stimulate PI repertoire whereas color sensors feed CI repertoire. Both repertoires receive stimuli from contact sensors. The second layer is composed by two neural repertoires: Attraction repertoire (AR) and Repulsion repertoire (RR). Each one establishes connections with both networks in the first layer as well as with contact sensors. The actuator network, connected to AR and RR repertoires, outputs the adjustment on direction of the robot.