Meta Learning Via Learned Loss

Sarah Bechtle, Artem Molchanov, Yevgen Chebotar, Edward Thomas Grefenstette, Ludovic Righetti, Gaurav Sukhatme, Franziska Meier

Responsive image

Auto-TLDR; meta-learning for learning parametric loss functions that generalize across different tasks and model architectures

Slides

Typically, loss functions, regularization mechanisms and other important aspects of training parametric models are chosen heuristically from a limited set of options. In this paper, we take the first step towards automating this process, with the view of producing models which train faster and more robustly. Concretely, we present a meta-learning method for learning parametric loss functions that can generalize across different tasks and model architectures. We develop a pipeline for “meta-training” such loss functions, targeted at maximizing the performance of the model trained under them. The loss landscape produced by our learned losses significantly improves upon the original task-specific losses in both supervised and reinforcement learning tasks. Furthermore, we show that our meta-learning framework is flexible enough to incorporate additional information at meta-train time. This information shapes the learned loss function such that the environment does not need to provide this information during meta-test time.

Similar papers

Low Dimensional State Representation Learning with Reward-Shaped Priors

Nicolò Botteghi, Ruben Obbink, Daan Geijs, Mannes Poel, Beril Sirmacek, Christoph Brune, Abeje Mersha, Stefano Stramigioli

Responsive image

Auto-TLDR; Unsupervised Learning for Unsupervised Reinforcement Learning in Robotics

Poster Similar

Reinforcement Learning has been able to solve many complicated robotics tasks without any need of feature engineering in an end-to-end fashion. However, learning the optimal policy directly from the sensory inputs, i.e the observations, often requires processing and storage of huge amount of data. In the context of robotics, the cost of data from real robotics hardware is usually very high, thus solutions that achieves high sample-efficiency are needed. We propose a method that aims at learning a mapping from the observations into a lower dimensional state space. This mapping is learned with unsupervised learning using loss functions shaped to incorporate prior knowledge of the environment and the task. Using the samples from the state space, the optimal policy is quickly and efficiently learned. We test the method on several mobile robot navigation tasks in simulation environment and also on a real robot.

The Effect of Multi-Step Methods on Overestimation in Deep Reinforcement Learning

Lingheng Meng, Rob Gorbet, Dana Kulić

Responsive image

Auto-TLDR; Multi-Step DDPG for Deep Reinforcement Learning

Slides Poster Similar

Multi-step (also called n-step) methods in reinforcement learning (RL) have been shown to be more efficient than the 1-step method due to faster propagation of the reward signal, both theoretically and empirically, in tasks exploiting tabular representation of the value-function. Recently, research in Deep Reinforcement Learning (DRL) also shows that multi-step methods improve learning speed and final performance in applications where the value-function and policy are represented with deep neural networks. However, there is a lack of understanding about what is actually contributing to the boost of performance. In this work, we analyze the effect of multi-step methods on alleviating the overestimation problem in DRL, where multi-step experiences are sampled from a replay buffer. Specifically building on top of Deep Deterministic Policy Gradient (DDPG), we experiment with Multi-step DDPG (MDDPG), where different step sizes are manually set, and with a variant called Mixed Multi-step DDPG (MMDDPG) where an average over different multi-step backups is used as target Q-value. Empirically, we show that both MDDPG and MMDDPG are significantly less affected by the overestimation problem than DDPG with 1-step backup, which consequently results in better final performance and learning speed. We also discuss the advantages and disadvantages of different ways to do multi-step expansion in order to reduce approximation error, and expose the tradeoff between overestimation and underestimation that underlies offline multi-step methods. Finally, we compare the computational resource needs of TD3 and our proposed methods, since they show comparable final performance and learning speed.

Deep Reinforcement Learning on a Budget: 3D Control and Reasoning without a Supercomputer

Edward Beeching, Jilles Steeve Dibangoye, Olivier Simonin, Christian Wolf

Responsive image

Auto-TLDR; Deep Reinforcement Learning in Mobile Robots Using 3D Environment Scenarios

Slides Poster Similar

An important goal of research in Deep Reinforcement Learning in mobile robotics is to train agents capableof solving complex tasks, which require a high level of scene understanding and reasoning from an egocentric perspective.When trained from simulations, optimal environments should satisfy a currently unobtainable combination of high-fidelity photographic observations, massive amounts of different environment configurations and fast simulation speeds. In this paper we argue that research on training agents capable of complex reasoning can be simplified by decoupling from the requirement of high fidelity photographic observations. We present a suite of tasks requiring complex reasoning and exploration in continuous,partially observable 3D environments. The objective is to provide challenging scenarios and a robust baseline agent architecture that can be trained on mid-range consumer hardware in under 24h. Our scenarios combine two key advantages: (i) they are based on a simple but highly efficient 3D environment (ViZDoom)which allows high speed simulation (12000fps); (ii) the scenarios provide the user with a range of difficulty settings, in order to identify the limitations of current state of the art algorithms and network architectures. We aim to increase accessibility to the field of Deep-RL by providing baselines for challenging scenarios where new ideas can be iterated on quickly. We argue that the community should be able to address challenging problems in reasoning of mobile agents without the need for a large compute infrastructure.

Is the Meta-Learning Idea Able to Improve the Generalization of Deep Neural Networks on the Standard Supervised Learning?

Xiang Deng, Zhongfei Zhang

Responsive image

Auto-TLDR; Meta-learning Based Training of Deep Neural Networks for Few-Shot Learning

Slides Poster Similar

Substantial efforts have been made on improving the generalization abilities of deep neural networks (DNNs) in order to obtain better performances without introducing more parameters. On the other hand, meta-learning approaches exhibit powerful generalization on new tasks in few-shot learning. Intuitively, few-shot learning is more challenging than the standard supervised learning as each target class only has a very few or no training samples. The natural question that arises is whether the meta-learning idea can be used for improving the generalization of DNNs on the standard supervised learning. In this paper, we propose a novel meta-learning based training procedure (MLTP) for DNNs and demonstrate that the meta-learning idea can indeed improve the generalization abilities of DNNs. MLTP simulates the meta-training process by considering a batch of training samples as a task. The key idea is that the gradient descent step for improving the current task performance should also improve a new task performance, which is ignored by the current standard procedure for training neural networks. MLTP also benefits from all the existing training techniques such as dropout, weight decay, and batch normalization. We evaluate MLTP by training a variety of small and large neural networks on three benchmark datasets, i.e., CIFAR-10, CIFAR-100, and Tiny ImageNet. The experimental results show a consistently improved generalization performance on all the DNNs with different sizes, which verifies the promise of MLTP and demonstrates that the meta-learning idea is indeed able to improve the generalization of DNNs on the standard supervised learning.

A Bayesian Approach to Reinforcement Learning of Vision-Based Vehicular Control

Zahra Gharaee, Karl Holmquist, Linbo He, Michael Felsberg

Responsive image

Auto-TLDR; Bayesian Reinforcement Learning for Autonomous Driving

Slides Poster Similar

In this paper, we present a state-of-the-art reinforcement learning method for autonomous driving. Our approach employs temporal difference learning in a Bayesian framework to learn vehicle control signals from sensor data. The agent has access to images from a forward facing camera, which are pre-processed to generate semantic segmentation maps. We trained our system using both ground truth and estimated semantic segmentation input. Based on our observations from a large set of experiments, we conclude that training the system on ground truth input data leads to better performance than training the system on estimated input even if estimated input is used for evaluation. The system is trained and evaluated in a realistic simulated urban environment using the CARLA simulator. The simulator also contains a benchmark that allows for comparing to other systems and methods. The required training time of the system is shown to be lower and the performance on the benchmark superior to competing approaches.

Learning from Learners: Adapting Reinforcement Learning Agents to Be Competitive in a Card Game

Pablo Vinicius Alves De Barros, Ana Tanevska, Alessandra Sciutti

Responsive image

Auto-TLDR; Adaptive Reinforcement Learning for Competitive Card Games

Slides Poster Similar

Learning how to adapt to complex and dynamic environments is one of the most important factors that contribute to our intelligence. Endowing artificial agents with this ability is not a simple task, particularly in competitive scenarios. In this paper, we present a broad study on how popular reinforcement learning algorithms can be adapted and implemented to learn and to play a real-world implementation of a competitive multiplayer card game. We propose specific training and validation routines for the learning agents, in order to evaluate how the agents learn to be competitive and explain how they adapt to each others' playing style. Finally, we pinpoint how the behavior of each agent derives from their learning style and create a baseline for future research on this scenario.

MetaMix: Improved Meta-Learning with Interpolation-based Consistency Regularization

Yangbin Chen, Yun Ma, Tom Ko, Jianping Wang, Qing Li

Responsive image

Auto-TLDR; MetaMix: A Meta-Agnostic Meta-Learning Algorithm for Few-Shot Classification

Slides Poster Similar

Model-Agnostic Meta-Learning (MAML) and its variants are popular few-shot classification methods. They train an initializer across a variety of sampled learning tasks (also known as episodes) such that the initialized model can adapt quickly to new tasks. However, within each episode, current MAML-based algorithms have limitations in forming generalizable decision boundaries using only a few training examples. In this paper, we propose an approach called MetaMix. It generates virtual examples within each episode to regularize the backbone models. MetaMix can be applied in any of the MAML-based algorithms and learn the decision boundaries which are more generalizable to new tasks. Experiments on the mini-ImageNet, CUB, and FC100 datasets show that MetaMix improves the performance of MAML-based algorithms and achieves the state-of-the-art result when applied in Meta-Transfer Learning.

Object-Oriented Map Exploration and Construction Based on Auxiliary Task Aided DRL

Junzhe Xu, Jianhua Zhang, Shengyong Chen, Honghai Liu

Responsive image

Auto-TLDR; Auxiliary Task Aided Deep Reinforcement Learning for Environment Exploration by Autonomous Robots

Similar

Environment exploration by autonomous robots through deep reinforcement learning (DRL) based methods has attracted more and more attention. However, existing methods usually focus on robot navigation to single or multiple fixed goals, while ignoring the perception and construction of external environments. In this paper, we propose a novel environment exploration task based on DRL, which requires a robot fast and completely perceives all objects of interest, and reconstructs their poses in a global environment map, as much as the robot can do. To this end, we design an auxiliary task aided DRL model, which is integrated with the auxiliary object detection and 6-DoF pose estimation components. The outcome of auxiliary tasks can improve the learning speed and robustness of DRL, as well as the accuracy of object pose estimation. Comprehensive experimental results on the indoor simulation platform AI2-THOR have shown the effectiveness and robustness of our method.

Trajectory Representation Learning for Multi-Task NMRDP Planning

Firas Jarboui, Vianney Perchet

Responsive image

Auto-TLDR; Exploring Non Markovian Reward Decision Processes for Reinforcement Learning

Slides Poster Similar

Expanding Non Markovian Reward Decision Processes (NMRDP) into Markov Decision Processes (MDP) enables the use of state of the art Reinforcement Learning (RL) techniques to identify optimal policies. In this paper an approach to exploring NMRDPs and expanding them into MDPs, without the prior knowledge of the reward structure, is proposed. The non Markovianity of the reward function is disentangled under the assumption that sets of similar and dissimilar trajectory batches can be sampled. More precisely, within the same batch, measuring the similarity between any couple of trajectories is permitted, although comparing trajectories from different batches is not possible. A modified version of the triplet loss is optimised to construct a representation of the trajectories under which rewards become Markovian.

Detecting and Adapting to Crisis Pattern with Context Based Deep Reinforcement Learning

Eric Benhamou, David Saltiel Saltiel, Jean-Jacques Ohana Ohana, Jamal Atif Atif

Responsive image

Auto-TLDR; Deep Reinforcement Learning for Financial Crisis Detection and Dis-Investment

Slides Poster Similar

Deep reinforcement learning (DRL) has reached super human levels in complexes tasks like game solving (Go, StarCraft II), and autonomous driving. However, it remains an open question whether DRL can reach human level in applications to financial problems and in particular in detecting pattern crisis and consequently dis-investing. In this paper, we present an innovative DRL framework consisting in two sub-networks fed respectively with portfolio strategies past performances and standard deviation as well as additional contextual features. The second sub network plays an important role as it captures dependencies with common financial indicators features like risk aversion, economic surprise index and correlations between assets that allows taking into account context based information. We compare different network architectures either using layers of convolutions to reduce network's complexity or LSTM block to capture time dependency and whether previous allocations is important in the modeling. We also use adversarial training to make the final model more robust. Results on test set show this approach substantially over-performs traditional portfolio optimization methods like Markovitz and is able to detect and anticipate crisis like the current Covid one.

AOAM: Automatic Optimization of Adjacency Matrix for Graph Convolutional Network

Yuhang Zhang, Hongshuai Ren, Jiexia Ye, Xitong Gao, Yang Wang, Kejiang Ye, Cheng-Zhong Xu

Responsive image

Auto-TLDR; Adjacency Matrix for Graph Convolutional Network in Non-Euclidean Space

Slides Poster Similar

Graph Convolutional Network (GCN) is adopted to tackle the problem of the convolution operation in non-Euclidean space. Although previous works on GCN have made some progress, one of their limitations is that their input Adjacency Matrix (AM) is designed manually and requires domain knowledge, which is cumbersome, tedious and error-prone. In addition, entries of this fixed Adjacency Matrix are generally designed as binary values (i.e., ones and zeros) which can not reflect more complex relationship between nodes. However, many applications require a weighted and dynamic Adjacency Matrix instead of an unweighted and fixed Adjacency Matrix. To this end, there are few works focusing on designing a more flexible Adjacency Matrix. In this paper, we propose an end-to-end algorithm to improve the GCN performance by focusing on the Adjacency Matrix. We first provide a calculation method that called node information entropy to update the matrix. Then, we analyze the search strategy in a continuous space and introduce the Deep Deterministic Policy Gradient (DDPG) method to overcome the demerit of the discrete space search. Finally, we integrate the GCN and reinforcement learning into an end-to-end framework. Our method can automatically define the adjacency matrix without artificial knowledge. At the same time, the proposed approach can deal with any size of the matrix and provide a better value for the network. Four popular datasets are selected to evaluate the capability of our algorithm. The method in this paper achieves the state-of-the-art performance on Cora and Pubmed datasets, respectively, with the accuracy of 84.6% and 81.6%.

AVD-Net: Attention Value Decomposition Network for Deep Multi-Agent Reinforcement Learning

Zhang Yuanxin, Huimin Ma, Yu Wang

Responsive image

Auto-TLDR; Attention Value Decomposition Network for Cooperative Multi-agent Reinforcement Learning

Slides Poster Similar

Multi-agent reinforcement learning (MARL) is of importance for variable real-world applications but remains more challenges like stationarity and scalability. While recently value function factorization methods have obtained empirical good results in cooperative multi-agent environment, these works mostly focus on the decomposable learning structures. Inspired by the application of attention mechanism in machine translation and other related domains, we propose an attention based approach called attention value decomposition network (AVD-Net), which capitalizes on the coordination relations between agents. AVD-Net employs centralized training with decentralized execution (CTDE) paradigm, which factorizes the joint action-value functions with only local observations and actions of agents. Our method is evaluated on multi-agent particle environment (MPE) and StarCraft micromanagement environment (SMAC). The experiment results show the strength of our approach compared to existing methods with state-of-the-art performance in cooperative scenarios.

Can Reinforcement Learning Lead to Healthy Life?: Simulation Study Based on User Activity Logs

Masami Takahashi, Masahiro Kohjima, Takeshi Kurashima, Hiroyuki Toda

Responsive image

Auto-TLDR; Reinforcement Learning for Healthy Daily Life

Slides Poster Similar

The importance of developing an application based on intervention technology that leads to a healthier life is widely recognized. A challenging part of realizing the application is the need for planning, i.e., considering a user's health goal (e.g., sleep at 10:00 p.m. to get enough sleep), providing intervention at the appropriate timing to help the user achieve the goal. The reinforcement learning (RL) approach is well suited to this type of problem since it is a methodology for planning; RL finds the optimal strategy as that which maximizes future expected profit. The purpose of this study is to clarify the effects of intervention based on RL to support healthy daily life. Therefore, we (i) collect real daily activity data from participants, (ii) generate a user model that imitates the user's response to system interventions, (iii) examine valuable goals and design them as rewards in RL and (iv) obtain optimal intervention strategies by RL via simulations given a user model and goals. We evaluate a generated user model and verify by simulations whether our method could successfully achieve the goal. In addition, we analyze the cases that demonstrated higher probability of achieving the goal and report the features.

Vacant Parking Space Detection Based on Task Consistency and Reinforcement Learning

Manh Hung Nguyen, Tzu-Yin Chao, Ching-Chun Huang

Responsive image

Auto-TLDR; Vacant Space Detection via Semantic Consistency Learning

Slides Poster Similar

In this paper, we proposed a novel task-consistency learning method that allows training a vacant space detection network (target task) based on the logistic consistency with the semantic outcomes from a naive flow-based motion behavior classifier (source task) in a parking lot. By well designing the reward mechanism upon semantic consistency, we show the possibility to train the target network in a reinforcement learning setting. Compared with conventional supervised detection methods, the major contribution of this work is to learn a vacant space detector via semantic consistency rather than supervised labels. The dynamic learning property may make the proposed detector been deployed in different lots easily without heavy training loads. The experiments show that based on the task consistency rewards from the motion behavior classifier, the vacant space detector can be trained successfully.

RNN Training along Locally Optimal Trajectories via Frank-Wolfe Algorithm

Yun Yue, Ming Li, Venkatesh Saligrama, Ziming Zhang

Responsive image

Auto-TLDR; Frank-Wolfe Algorithm for Efficient Training of RNNs

Slides Poster Similar

We propose a novel and efficient training method for RNNs by iteratively seeking a local minima on the loss surface within a small region, and leverage this directional vector for the update, in an outer-loop. We propose to utilize the Frank-Wolfe (FW) algorithm in this context. Although, FW implicitly involves normalized gradients, which can lead to a slow convergence rate, we develop a novel RNN training method that, surprisingly, even with the additional cost, the overall training cost is empirically observed to be lower than back-propagation. Our method leads to a new Frank-Wolfe method, that is in essence an SGD algorithm with a restart scheme. We prove that under certain conditions our algorithm has a sublinear convergence rate of $O(1/\epsilon)$ for $\epsilon$ error. We then conduct empirical experiments on several benchmark datasets including those that exhibit long-term dependencies, and show significant performance improvement. We also experiment with deep RNN architectures and show efficient training performance. Finally, we demonstrate that our training method is robust to noisy data.

Explore and Explain: Self-Supervised Navigation and Recounting

Roberto Bigazzi, Federico Landi, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara

Responsive image

Auto-TLDR; Exploring a Photorealistic Environment for Explanation and Navigation

Slides Similar

Embodied AI has been recently gaining attention as it aims to foster the development of autonomous and intelligent agents. In this paper, we devise a novel embodied setting in which an agent needs to explore a previously unknown environment while recounting what it sees during the path. In this context, the agent needs to navigate the environment driven by an exploration goal, select proper moments for description, and output natural language descriptions of relevant objects and scenes. Our model integrates a novel self-supervised exploration module with penalty, and a fully-attentive captioning model for explanation. Also, we investigate different policies for selecting proper moments for explanation, driven by information coming from both the environment and the navigation. Experiments are conducted on photorealistic environments from the Matterport3D dataset and investigate the navigation and explanation capabilities of the agent as well as the role of their interactions.

Complementing Representation Deficiency in Few-Shot Image Classification: A Meta-Learning Approach

Xian Zhong, Cheng Gu, Wenxin Huang, Lin Li, Shuqin Chen, Chia-Wen Lin

Responsive image

Auto-TLDR; Meta-learning with Complementary Representations Network for Few-Shot Learning

Slides Poster Similar

Few-shot learning is a challenging problem that has attracted more and more attention recently since abundant training samples are difficult to obtain in practical applications. Meta-learning has been proposed to address this issue, which focuses on quickly adapting a predictor as a base-learner to new tasks, given limited labeled samples. However, a critical challenge for meta-learning is the representation deficiency since it is hard to discover common information from a small number of training samples or even one, as is the representation of key features from such little information. As a result, a meta-learner cannot be trained well in a high-dimensional parameter space to generalize to new tasks. Existing methods mostly resort to extracting less expressive features so as to avoid the representation deficiency. Aiming at learning better representations, we propose a meta-learning approach with complemented representations network (MCRNet) for few-shot image classification. In particular, we embed a latent space, where latent codes are reconstructed with extra representation information to complement the representation deficiency. Furthermore, the latent space is established with variational inference, collaborating well with different base-learners, and can be extended to other models. Finally, our end-to-end framework achieves the state-of-the-art performance in image classification on three standard few-shot learning datasets.

Deep Reinforcement Learning for Autonomous Driving by Transferring Visual Features

Hongli Zhou, Guanwen Zhang, Wei Zhou

Responsive image

Auto-TLDR; Deep Reinforcement Learning for Autonomous Driving by Transferring Visual Features

Slides Poster Similar

Deep reinforcement learning (DRL) has achieved great success in processing vision-based driving tasks. However, the end-to-end training manner makes DRL agents suffer from overfitting training scenes. The agents easily fail to generalize to unseen environments. In this paper, we propose a deep reinforcement learning for autonomous driving by transferring visual features. We formulate the DRL training as a perception and control module and introduce adversarial training mechanism for autonomous driving. The perception module is able to extract invariant features between different domains through adversarial training. While the DRL agent can then be trained on the basis of low dimensional states. In this manner, the proposed approach enables trained agents to adapt to unseen environments by learning robust features invariant across various scenes. We evaluate the proposed approach by transferring visual features between different simulators. The experimental results demonstrate the driving policy trained in the source domain can be directly applied in the target domain, and achieve great efficient and effective performance for autonomous driving.

Self-Play or Group Practice: Learning to Play Alternating Markov Game in Multi-Agent System

Chin-Wing Leung, Shuyue Hu, Ho-Fung Leung

Responsive image

Auto-TLDR; Group Practice for Deep Reinforcement Learning

Slides Poster Similar

The research in reinforcement learning has achieved great success in strategic game playing. These successes are thanks to the incorporation of deep reinforcement learning (DRL) and Monte Carlo Tree Search (MCTS) to the agent trained under the self-play (SP) environment. By self-play, agents are provided with an incrementally more difficult curriculum which in turn facilitate learning. However, recent research suggests that agents trained via self-play may easily lead to getting stuck in local equilibria. In this paper, we consider a population of agents each independently learns to play an alternating Markov game (AMG). We propose a new training framework---group practice---for a population of decentralized RL agents. By group practice (GP), agents are assigned into multiple learning groups during training, for every episode of games, an agent is randomly paired up and practices with another agent in the learning group. The convergence result to the optimal value function and the Nash equilibrium are proved under the GP framework. Experimental study is conducted by applying GP to Q-learning algorithm and the deep Q-learning with Monte-Carlo tree search on the game of Connect Four and the game of Hex. We verify that GP is the more efficient training scheme than SP given the same amount of training. We also show that the learning effectiveness can even be improved when applying local grouping to agents.

ActionSpotter: Deep Reinforcement Learning Framework for Temporal Action Spotting in Videos

Guillaume Vaudaux-Ruth, Adrien Chan-Hon-Tong, Catherine Achard

Responsive image

Auto-TLDR; ActionSpotter: A Reinforcement Learning Algorithm for Action Spotting in Video

Slides Poster Similar

Action spotting has recently been proposed as an alternative to action detection and key frame extraction. However, the current state-of-the-art method of action spotting requires an expensive ground truth composed of the search sequences employed by human annotators spotting actions - a critical limitation. In this article, we propose to use a reinforcement learning algorithm to perform efficient action spotting using only the temporal segments from the action detection annotations, thus opening an interesting solution for video understanding. Experiments performed on THUMOS14 and ActivityNet datasets show that the proposed method, named ActionSpotter, leads to good results and outperforms state-of-the-art detection outputs redrawn for this application. In particular, the spotting mean Average Precision on THUMOS14 is significantly improved from 59.7% to 65.6% while skipping 23% of video.

Meta Generalized Network for Few-Shot Classification

Wei Wu, Shanmin Pang, Zhiqiang Tian, Yaochen Li

Responsive image

Auto-TLDR; Meta Generalized Network for Few-Shot Classification

Similar

Few-shot classification aims to learn a well performance model with very limited labeled examples. There are mainly two directions for this aim, namely, meta- and metric-learning. Meta learning trains models in a particular way to fast adapt to new tasks, but it neglects variational features of images. Metric learning considers relationships among same or different classes, however on the downside, it usually fails to achieve competitive performance on unseen boundary examples. In this paper, we propose a Meta Generalized Network (MGNet) that aims to combine advantages of both meta- and metric-learning. There are two novel components in MGNet. Specifically, we first develop a meta backbone training method that both learns a flexible feature extractor and a classifier initializer efficiently, delightedly leading to fast adaption to unseen few-shot tasks without overfitting. Second, we design a trainable adaptive interval model to improve the cosine classifier, which increases the recognition accuracy of hard examples. We train the meta backbone in the training stage by all classes, and fine-tune the meta-backbone as well as train the adaptive classifier in the testing stage.

A Novel Actor Dual-Critic Model for Remote Sensing Image Captioning

Ruchika Chavhan, Biplab Banerjee, Xiao Xiang Zhu, Subhasis Chaudhuri

Responsive image

Auto-TLDR; Actor Dual-Critic Training for Remote Sensing Image Captioning Using Deep Reinforcement Learning

Slides Poster Similar

We deal with the problem of generating textual captions from optical remote sensing (RS) images using the notion of deep reinforcement learning. Due to the high inter-class similarity in reference sentences describing remote sensing data, jointly encoding the sentences and images encourages prediction of captions that are semantically more precise than the ground truth in many cases. To this end, we introduce an Actor Dual-Critic training strategy where a second critic model is deployed in the form of an encoder-decoder RNN to encode the latent information corresponding to the original and generated captions. While all actor-critic methods use an actor to predict sentences for an image and a critic to provide rewards, our proposed encoder-decoder RNN guarantees high-level comprehension of images by sentence-to-image translation. We observe that the proposed model generates sentences on the test data highly similar to the ground truth and is successful in generating even better captions in many critical cases. Extensive experiments on the benchmark Remote Sensing Image Captioning Dataset (RSICD) and the UCM-captions dataset confirm the superiority of the proposed approach in comparison to the previous state-of-the-art where we obtain a gain of sharp increments in both the ROUGE-L and CIDEr measures.

Visual Object Tracking in Drone Images with Deep Reinforcement Learning

Derya Gözen, Sedat Ozer

Responsive image

Auto-TLDR; A Deep Reinforcement Learning based Single Object Tracker for Drone Applications

Slides Poster Similar

There is an increasing demand on utilizing camera equipped drones and their applications in many domains varying from agriculture to entertainment and from sports events to surveillance. In such drone applications, an essential and a common task is tracking an object of interest visually. Drone (or UAV) images have different properties when compared to the ground taken (natural) images and those differences introduce additional complexities to the existing object trackers to be directly applied on drone applications. Some important differences among those complexities include (i) smaller object sizes to be tracked and (ii) different orientations and viewing angles yielding different texture and features to be observed. Therefore, new algorithms trained on drone images are needed for the drone-based applications. In this paper, we introduce a deep reinforcement learning (RL) based single object tracker that tracks an object of interest in drone images by estimating a series of actions to find the location of the object in the next frame. This is the first work introducing a single object tracker using a deep RL-based technique for drone images. Our proposed solution introduces a novel reward function that aims to reduce the total number of actions taken to estimate the object's location in the next frame and also introduces a different backbone network to be used on low resolution images. Additionally, we introduce a set of new actions into the action library to better deal with the above-mentioned complexities. We compare our proposed solutions to a state of the art tracking algorithm from the recent literature and demonstrate up to 3.87\% improvement in precision and 3.6\% improvement in IoU values on the VisDrone2019 dataset. We also provide additional results on OTB-100 dataset and show up to 3.15\% improvement in precision on the OTB-100 dataset when compared to the same previous state of the art algorithm. Lastly, we analyze the ability to handle some of the challenges faced during tracking, including but not limited to occlusion, deformation, and scale variation for our proposed solutions.

Meta Soft Label Generation for Noisy Labels

Görkem Algan, Ilkay Ulusoy

Responsive image

Auto-TLDR; MSLG: Meta-Learning for Noisy Label Generation

Slides Poster Similar

The existence of noisy labels in the dataset causes significant performance degradation for deep neural networks (DNNs). To address this problem, we propose a Meta Soft Label Generation algorithm called MSLG, which can jointly generate soft labels using meta-learning techniques and learn DNN parameters in an end-to-end fashion. Our approach adapts the meta-learning paradigm to estimate optimal label distribution by checking gradient directions on both noisy training data and noise-free meta-data. In order to iteratively update soft labels, meta-gradient descent step is performed on estimated labels, which would minimize the loss of noise-free meta samples. In each iteration, the base classifier is trained on estimated meta labels. MSLG is model-agnostic and can be added on top of any existing model at hand with ease. We performed extensive experiments on CIFAR10, Clothing1M and Food101N datasets. Results show that our approach outperforms other state-of-the-art methods by a large margin. Our code is available at \url{https://github.com/gorkemalgan/MSLG_noisy_label}.

ARCADe: A Rapid Continual Anomaly Detector

Ahmed Frikha, Denis Krompass, Volker Tresp

Responsive image

Auto-TLDR; ARCADe: A Meta-Learning Approach for Continuous Anomaly Detection

Slides Poster Similar

Although continual learning and anomaly detection have separately been well-studied in previous works, their intersection remains rather unexplored. The present work addresses a learning scenario where a model has to incrementally learn a sequence of anomaly detection tasks, i.e. tasks from which only examples from the normal (majority) class are available for training. We define this novel learning problem of continual anomaly detection (CAD) and formulate it as a meta-learning problem. Moreover, we propose \emph{A Rapid Continual Anomaly Detector (ARCADe)}, an approach to train neural networks to be robust against the major challenges of this new learning problem, namely catastrophic forgetting and overfitting to the majority class. The results of our experiments on three datasets show that, in the CAD problem setting, ARCADe substantially outperforms baselines from the continual learning and anomaly detection literature. Finally, we provide deeper insights into the learning strategy yielded by the proposed meta-learning algorithm.

Rethinking Experience Replay: A Bag of Tricks for Continual Learning

Pietro Buzzega, Matteo Boschini, Angelo Porrello, Simone Calderara

Responsive image

Auto-TLDR; Experience Replay for Continual Learning: A Practical Approach

Slides Poster Similar

In Continual Learning, a Neural Network is trained on a stream of data whose distribution shifts over time. Under these assumptions, it is especially challenging to improve on classes appearing later in the stream while remaining accurate on previous ones. This is due to the infamous problem of catastrophic forgetting, which causes a quick performance degradation when the classifier focuses on learning new categories. Recent literature proposed various approaches to tackle this issue, often resorting to very sophisticated techniques. In this work, we show that naive rehearsal can be patched to achieve similar performance. We point out some shortcomings that restrain Experience Replay (ER) and propose five tricks to mitigate them. Experiments show that ER, thus enhanced, displays an accuracy gain of 51.2 and 26.9 percentage points on the CIFAR-10 and CIFAR-100 datasets respectively (memory buffer size 1000). As a result, it surpasses current state-of-the-art rehearsal-based methods.

Improving Robotic Grasping on Monocular Images Via Multi-Task Learning and Positional Loss

William Prew, Toby Breckon, Magnus Bordewich, Ulrik Beierholm

Responsive image

Auto-TLDR; Improving grasping performance from monocularcolour images in an end-to-end CNN architecture with multi-task learning

Slides Poster Similar

In this paper we introduce two methods of improv-ing real-time objecting grasping performance from monocularcolour images in an end-to-end CNN architecture. The first isthe addition of an auxiliary task during model training (multi-task learning). Our multi-task CNN model improves graspingperformance from a baseline average of 72.04% to 78.14% onthe large Jacquard grasping dataset when performing a supple-mentary depth reconstruction task. The second is introducinga positional loss function that emphasises loss per pixel forsecondary parameters (gripper angle and width) only on points ofan object where a successful grasp can take place. This increasesperformance from a baseline average of 72.04% to 78.92% aswell as reducing the number of training epochs required. Thesemethods can be also performed in tandem resulting in a furtherperformance increase to 79.12%, while maintaining sufficientinference speed to enable processing at 50FPS

E-DNAS: Differentiable Neural Architecture Search for Embedded Systems

Javier García López, Antonio Agudo, Francesc Moreno-Noguer

Responsive image

Auto-TLDR; E-DNAS: Differentiable Architecture Search for Light-Weight Networks for Image Classification

Slides Poster Similar

Designing optimal and light weight networks to fit in resource-limited platforms like mobiles, DSPs or GPUs is a challenging problem with a wide range of interesting applications, {\em e.g.} in embedded systems for autonomous driving. While most approaches are based on manual hyperparameter tuning, there exist a new line of research, the so-called NAS (Neural Architecture Search) methods, that aim to optimize several metrics during the design process, including memory requirements of the network, number of FLOPs, number of MACs (Multiply-ACcumulate operations) or inference latency. However, while NAS methods have shown very promising results, they are still significantly time and cost consuming. In this work we introduce E-DNAS, a differentiable architecture search method, which improves the efficiency of NAS methods in designing light-weight networks for the task of image classification. Concretely, E-DNAS computes, in a differentiable manner, the optimal size of a number of meta-kernels that capture patterns of the input data at different resolutions. We also leverage on the additive property of convolution operations to merge several kernels with different compatible sizes into a single one, reducing thus the number of operations and the time required to estimate the optimal configuration. We evaluate our approach on several datasets to perform classification. We report results in terms of the SoC (System on Chips) metric, typically used in the Texas Instruments TDA2x families for autonomous driving applications. The results show that our approach allows designing low latency architectures significantly faster than state-of-the-art.

On Embodied Visual Navigation in Real Environments through Habitat

Marco Rosano, Antonino Furnari, Luigi Gulino, Giovanni Maria Farinella

Responsive image

Auto-TLDR; Learning Navigation Policies on Real World Observations using Real World Images and Sensor and Actuation Noise

Slides Poster Similar

Visual navigation models based on deep learning can learn effective policies when trained on large amounts of visual observations through reinforcement learning. Unfortunately, collecting the required experience deploying a robotic platform in the real world is expensive and time-consuming. To deal with this limitation, several simulation platforms have been proposed in order to train visual navigation policies on virtual environments efficiently. Despite the advantages they offer, simulators present a limited realism in terms of appearance and physical dynamics, leading to navigation policies that do not generalize in the real world. In this paper, we propose a tool based on the Habitat simulator which exploits real world images of the environment, together with sensor and actuator noise models, to produce more realistic navigation episodes. We perform a range of experiments using virtual, real and images transformed with a simple domain adaptation approach. We also assess the impact of sensor and actuation noise on the navigation performance and investigate whether they allow to learn more robust navigation policies. We show that our tool can effectively help to train and evaluate navigation policies on real world observations without running navigation episodes in the real world.

Learning with Delayed Feedback

Pranavan Theivendiram, Terence Sim

Responsive image

Auto-TLDR; Unsupervised Machine Learning with Delayed Feedback

Slides Poster Similar

We propose a novel supervised machine learning strategy, inspired by human learning, that enables an Agent to learn continually over its lifetime. A natural consequence is that the Agent must be able to handle an input whose label is delayed until a later time, or may not arrive at all. Our Agent learns in two steps: a short Seeding phase, in which the Agent's model is initialized with labelled inputs, and an indefinitely long Growing phase, in which the Agent refines and assesses its model if the label is given for an input, but stores the input in a finite-length queue if the label is missing. Queued items are matched against future input-label pairs that arrive, and the model is then updated. Our strategy also allows for the delayed feedback to take a different form. For example, in an image captioning task, the feedback could be a semantic segmentation rather than a textual caption. We show with many experiments that our strategy enables an Agent to learn flexibly and efficiently.

Multiple Future Prediction Leveraging Synthetic Trajectories

Lorenzo Berlincioni, Federico Becattini, Lorenzo Seidenari, Alberto Del Bimbo

Responsive image

Auto-TLDR; Synthetic Trajectory Prediction using Markov Chains

Slides Poster Similar

Trajectory prediction is an important task, especially in autonomous driving. The ability to forecast the position of other moving agents can yield to an effective planning, ensuring safety for the autonomous vehicle as well for the observed entities. In this work we propose a data driven approach based on Markov Chains to generate synthetic trajectories, which are useful for training a multiple future trajectory predictor. The advantages are twofold: on the one hand synthetic samples can be used to augment existing datasets and train more effective predictors; on the other hand, it allows to generate samples with multiple ground truths, corresponding to diverse equally likely outcomes of the observed trajectory. We define a trajectory prediction model and a loss that explicitly address the multimodality of the problem and we show that combining synthetic and real data leads to prediction improvements, obtaining state of the art results.

Generalization Comparison of Deep Neural Networks Via Output Sensitivity

Mahsa Forouzesh, Farnood Salehi, Patrick Thiran

Responsive image

Auto-TLDR; Generalization of Deep Neural Networks using Sensitivity

Slides Similar

Although recent works have brought some insights into the performance improvement of techniques used in state-of-the-art deep-learning models, more work is needed to understand their generalization properties. We shed light on this matter by linking the loss function to the output's sensitivity to its input. We find a rather strong empirical relation between the output sensitivity and the variance in the bias-variance decomposition of the loss function, which hints on using sensitivity as a metric for comparing the generalization performance of networks, without requiring labeled data. We find that sensitivity is decreased by applying popular methods which improve the generalization performance of the model, such as (1) using a deep network rather than a wide one, (2) adding convolutional layers to baseline classifiers instead of adding fully-connected layers, (3) using batch normalization, dropout and max-pooling, and (4) applying parameter initialization techniques.

Graph-Based Interpolation of Feature Vectors for Accurate Few-Shot Classification

Yuqing Hu, Vincent Gripon, Stéphane Pateux

Responsive image

Auto-TLDR; Transductive Learning for Few-Shot Classification using Graph Neural Networks

Slides Poster Similar

In few-shot classification, the aim is to learn models able to discriminate classes using only a small number of labeled examples. In this context, works have proposed to introduce Graph Neural Networks (GNNs) aiming at exploiting the information contained in other samples treated concurrently, what is commonly referred to as the transductive setting in the literature. These GNNs are trained all together with a backbone feature extractor. In this paper, we propose a new method that relies on graphs only to interpolate feature vectors instead, resulting in a transductive learning setting with no additional parameters to train. Our proposed method thus exploits two levels of information: a) transfer features obtained on generic datasets, b) transductive information obtained from other samples to be classified. Using standard few-shot vision classification datasets, we demonstrate its ability to bring significant gains compared to other works.

Transferable Model for Shape Optimization subject to Physical Constraints

Lukas Harsch, Johannes Burgbacher, Stefan Riedelbauch

Responsive image

Auto-TLDR; U-Net with Spatial Transformer Network for Flow Simulations

Slides Poster Similar

The interaction of neural networks with physical equations offers a wide range of applications. We provide a method which enables a neural network to transform objects subject to given physical constraints. Therefore an U-Net architecture is used to learn the underlying physical behaviour of fluid flows. The network is used to infer the solution of flow simulations which will be shown for a wide range of generic channel flow simulations. Physical meaningful quantities can be computed on the obtained solution, e.g. the total pressure difference or the forces on the objects. A Spatial Transformer Network with thin-plate-splines is used for the interaction between the physical constraints and the geometric representation of the objects. Thus, a transformation from an initial to a target geometry is performed such that the object is fulfilling the given constraints. This method is fully differentiable i.e., gradient informations can be used for the transformation. This can be seen as an inverse design process. The advantage of this method over many other proposed methods is, that the physical constraints are based on the inferred flow field solution. Thus, we can apply a transferable model to varying problem setups, which is not limited to a given set of geometry parameters or physical quantities.

Rethinking Domain Generalization Baselines

Francesco Cappio Borlino, Antonio D'Innocente, Tatiana Tommasi

Responsive image

Auto-TLDR; Style Transfer Data Augmentation for Domain Generalization

Slides Poster Similar

Despite being very powerful in standard learning settings, deep learning models can be extremely brittle when deployed in scenarios different from those on which they were trained. Domain generalization methods investigate this problem and data augmentation strategies have shown to be helpful tools to increase data variability, supporting model robustness across domains. In our work we focus on style transfer data augmentation and we present how it can be implemented with a simple and inexpensive strategy to improve generalization. Moreover, we analyze the behavior of current state of the art domain generalization methods when integrated with this augmentation solution: our thorough experimental evaluation shows that their original effect almost always disappears with respect to the augmented baseline. This issue open new scenarios for domain generalization research, highlighting the need of novel methods properly able to take advantage of the introduced data variability.

TAAN: Task-Aware Attention Network for Few-Shot Classification

Zhe Wang, Li Liu, Fanzhang Li

Responsive image

Auto-TLDR; TAAN: Task-Aware Attention Network for Few-Shot Classification

Slides Poster Similar

Few-shot classification aims to recognize unlabeled samples from unseen classes given only a few labeled samples.Current approaches of few-shot learning usually employ a metriclearning framework to learn a feature similarity comparison between a query (test) example and the few support (training) examples. However, these approaches all extract features from samples independently without looking at the entire task as a whole, and so fail to provide an enough discrimination to features. Moreover, the existing approaches lack the ability to select the most relevant features for the task at hand. In this work, we propose a novel algorithm called Task-Aware Attention Network (TAAN) to address the above problems in few-shot classification. By inserting a Task-Relevant Channel Attention Module into metric-based few-shot learners, TAAN generates channel attentions for each sample by aggregating the context of the entire support set and identifies the most relevant features for similarity comparison. The experiment demonstrates that TAAN is competitive in overall performance comparing to the recent state-of-the-art systems and improves the performance considerably over baseline systems on both mini-ImageNet and tiered-ImageNet benchmarks.

VOWEL: A Local Online Learning Rule for Recurrent Networks of Probabilistic Spiking Winner-Take-All Circuits

Hyeryung Jang, Nicolas Skatchkovsky, Osvaldo Simeone

Responsive image

Auto-TLDR; VOWEL: A Variational Online Local Training Rule for Winner-Take-All Spiking Neural Networks

Slides Similar

Networks of spiking neurons and Winner-Take-All spiking circuits (WTA-SNNs) can detect information encoded in spatio-temporal multi-valued events. These are described by the timing of events of interest, e.g., clicks, as well as by categorical numerical values assigned to each event, e.g., like or dislike. Other use cases include object recognition from data collected by neuromorphic cameras, which produce, for each pixel, signed bits at the times of sufficiently large brightness variations. Existing schemes for training WTA-SNNs are limited to rate-encoding solutions, and are hence able to detect only spatial patterns. Developing more general training algorithms for arbitrary WTA-SNNs inherits the challenges of training (binary) Spiking Neural Networks (SNNs). These amount, most notably, to the non-differentiability of threshold functions, to the recurrent behavior of spiking neural models, and to the difficulty of implementing backpropagation in neuromorphic hardware. In this paper, we develop a variational online local training rule for WTA-SNNs, referred to as VOWEL, that leverages only local pre- and post-synaptic information for visible circuits, and an additional common reward signal for hidden circuits. The method is based on probabilistic generalized linear neural models, control variates, and variational regularization. Experimental results on real-world neuromorphic datasets with multi-valued events demonstrate the advantages of WTA-SNNs over conventional binary SNNs trained with state-of-the-art methods, especially in the presence of limited computing resources.

Switching Dynamical Systems with Deep Neural Networks

Cesar Ali Ojeda Marin, Kostadin Cvejoski, Bogdan Georgiev, Ramses J. Sanchez

Responsive image

Auto-TLDR; Variational RNN for Switching Dynamics

Slides Poster Similar

The problem of uncovering different dynamicalregimes is of pivotal importance in time series analysis. Switchingdynamical systems provide a solution for modeling physical phe-nomena whose time series data exhibit different dynamical modes.In this work we propose a novel variational RNN model forswitching dynamics allowing for both non-Markovian and non-linear dynamical behavior between and within dynamic modes.Attention mechanisms are provided to inform the switchingdistribution. We evaluate our model on synthetic and empiricaldatasets of diverse nature and successfully uncover differentdynamical regimes and predict the switching dynamics.

Task-based Focal Loss for Adversarially Robust Meta-Learning

Yufan Hou, Lixin Zou, Weidong Liu

Responsive image

Auto-TLDR; Task-based Adversarial Focal Loss for Few-shot Meta-Learner

Slides Poster Similar

Adversarial robustness of machine learning has been widely studied in recent years, and a series of effective methods are proposed to resist adversarial attacks. However, less attention is paid to few-shot meta-learners which are much more vulnerable due to the lack of training samples. In this paper, we propose Task-based Adversarial Focal Loss (TAFL) to handle this tough challenge on a typical meta-learner called MAML. More concretely, we regard few-shot classification tasks as normal samples in learning models and apply focal loss mechanism on them. Our proposed method focuses more on adversarially fragile tasks, leading to improvement on overall model robustness. Results of extensive experiments on several benchmarks demonstrate that TAFL can effectively promote the performance of the meta-learner on adversarial examples with elaborately designed perturbations.

Separation of Aleatoric and Epistemic Uncertainty in Deterministic Deep Neural Networks

Denis Huseljic, Bernhard Sick, Marek Herde, Daniel Kottke

Responsive image

Auto-TLDR; AE-DNN: Modeling Uncertainty in Deep Neural Networks

Slides Poster Similar

Despite the success of deep neural networks (DNN) in many applications, their ability to model uncertainty is still significantly limited. For example, in safety-critical applications such as autonomous driving, it is crucial to obtain a prediction that reflects different types of uncertainty to address life-threatening situations appropriately. In such cases, it is essential to be aware of the risk (i.e., aleatoric uncertainty) and the reliability (i.e., epistemic uncertainty) that comes with a prediction. We present AE-DNN, a model allowing the separation of aleatoric and epistemic uncertainty while maintaining a proper generalization capability. AE-DNN is based on deterministic DNN, which can determine the respective uncertainty measures in a single forward pass. In analyses with synthetic and image data, we show that our method improves the modeling of epistemic uncertainty while providing an intuitively understandable separation of risk and reliability.

Joint Supervised and Self-Supervised Learning for 3D Real World Challenges

Antonio Alliegro, Davide Boscaini, Tatiana Tommasi

Responsive image

Auto-TLDR; Self-supervision for 3D Shape Classification and Segmentation in Point Clouds

Slides Similar

Point cloud processing and 3D shape understanding are very challenging tasks for which deep learning techniques have demonstrated great potentials. Still further progresses are essential to allow artificial intelligent agents to interact with the real world. In many practical conditions the amount of annotated data may be limited and integrating new sources of knowledge becomes crucial to support autonomous learning. Here we consider several scenarios involving synthetic and real world point clouds where supervised learning fails due to data scarcity and large domain gaps. We propose to enrich standard feature representations by leveraging self-supervision through a multi-task model that can solve a 3D puzzle while learning the main task of shape classification or part segmentation. An extensive analysis investigating few-shot, transfer learning and cross-domain settings shows the effectiveness of our approach with state-of-the-art results for 3D shape classification and part segmentation.

RLST: A Reinforcement Learning Approach to Scene Text Detection Refinement

Xuan Peng, Zheng Huang, Kai Chen, Jie Guo, Weidong Qiu

Responsive image

Auto-TLDR; Saccadic Eye Movements and Peripheral Vision for Scene Text Detection using Reinforcement Learning

Slides Poster Similar

Within the research of scene text detection, some previous work has already achieved significant accuracy and efficiency. However, most of the work was generally done without considering about the implicit relationship between detection and eye movements. In this paper, we propose a new method for scene text detection especially for its refinement based on reinforcement learning. The idea of this method is inspired by Saccadic Eye Movements and Peripheral Vision. A saccade makes it possible for humans to orient the gaze to the location where a visual object has appeared. Peripheral vision gathers visual information of surroundings which provides supplement to foveal vision during gazing. We propose a simple pipeline, imitating the way human eyes do a saccade and collect peripheral information, to locate scene text roughly and to refine multi-scale vision field iteratively using reinforcement learning. For both training and evaluation, we use ICDAR2015 Challenge 4 dataset as a base and design several criteria to measure the feasibility of our work.

Learning Stable Deep Predictive Coding Networks with Weight Norm Supervision

Guo Ruohao

Responsive image

Auto-TLDR; Stability of Predictive Coding Network with Weight Norm Supervision

Slides Poster Similar

Predictive Coding Network (PCN) is an important neural network inspired by visual processing models in neuroscience. It combines the feedforward and feedback processing and has the architecture of recurrent neural networks (RNNs). This type of network is usually trained with backpropagation through time (BPTT). With infinite recurrent steps, PCN is a dynamic system. However, as one of the most important properties, stability is rarely studied in this type of network. Inspired by reservoir computing, we investigate the stability of hierarchical RNNs from the perspective of dynamic systems, and propose a sufficient condition for their echo state property (ESP). Our study shows the global stability is determined by stability of the local layers and the feedback between neighboring layers. Based on it, we further propose Weight Norm Supervision, a new algorithm that controls the stability of PCN dynamics by imposing different weight norm constraints on different parts of the network. We compare our approach with other training methods in terms of stability and prediction capability. The experiments show that our algorithm learns stable PCNs with a reliable prediction precision in the most effective and controllable way.

Adaptive Noise Injection for Training Stochastic Student Networks from Deterministic Teachers

Yi Xiang Marcus Tan, Yuval Elovici, Alexander Binder

Responsive image

Auto-TLDR; Adaptive Stochastic Networks for Adversarial Attacks

Slides Similar

Adversarial attacks have been a prevalent problem causing misclassification in machine learning models, with stochasticity being a promising direction towards greater robustness. However, stochastic networks frequently underperform compared to deterministic deep networks. In this work, we present a conceptually clear adaptive noise injection mechanism in combination with teacher-initialisation, which adjusts its degree of randomness dynamically through the computation of mini-batch statistics. This mechanism is embedded within a simple framework to obtain stochastic networks from existing deterministic networks. Our experiments show that our method is able to outperform prior baselines under white-box settings, exemplified through CIFAR-10 and CIFAR-100. Following which, we perform in-depth analysis on varying different components of training with our approach on the effects of robustness and accuracy, through the study of the evolution of decision boundary and trend curves of clean accuracy/attack success over differing degrees of stochasticity. We also shed light on the effects of adversarial training on a pre-trained network, through the lens of decision boundaries.

PROPEL: Probabilistic Parametric Regression Loss for Convolutional Neural Networks

Muhammad Asad, Rilwan Basaru, S M Masudur Rahman Al Arif, Greg Slabaugh

Responsive image

Auto-TLDR; PRObabilistic Parametric rEgression Loss for Probabilistic Regression Using Convolutional Neural Networks

Slides Similar

In recent years, Convolutional Neural Networks (CNNs) have enabled significant advancements to the state-of-the-art in computer vision. For classification tasks, CNNs have widely employed probabilistic output and have shown the significance of providing additional confidence for predictions. However, such probabilistic methodologies are not widely applicable for addressing regression problems using CNNs, as regression involves learning unconstrained continuous and, in many cases, multi-variate target variables. We propose a PRObabilistic Parametric rEgression Loss (PROPEL) that facilitates CNNs to learn parameters of probability distributions for addressing probabilistic regression problems. PROPEL is fully differentiable and, hence, can be easily incorporated for end-to-end training of existing CNN regression architectures using existing optimization algorithms. The proposed method is flexible as it enables learning complex unconstrained probabilities while being generalizable to higher dimensional multi-variate regression problems. We utilize a PROPEL-based CNN to address the problem of learning hand and head orientation from uncalibrated color images. Our experimental validation and comparison with existing CNN regression loss functions show that PROPEL improves the accuracy of a CNN by enabling probabilistic regression, while significantly reducing required model parameters by 10x, resulting in improved generalization as compared to the existing state-of-the-art.

Deep Next-Best-View Planner for Cross-Season Visual Route Classification

Kurauchi Kanya, Kanji Tanaka

Responsive image

Auto-TLDR; Active Visual Place Recognition using Deep Convolutional Neural Network

Slides Poster Similar

This paper addresses the problem of active visual place recognition (VPR) from a novel perspective of long-term autonomy. In our approach, a next-best-view (NBV) planner plans an optimal action-observation-sequence to maximize the expected cost-performance for a visual route classification task. A difficulty arises from the fact that the NBV planner is trained and tested in different domains (times of day, weather conditions, and seasons). Existing NBV methods may be confused and deteriorated by the domain-shifts, and require significant efforts for adapting them to a new domain. We address this issue by a novel deep convolutional neural network (DNN) -based NBV planner that does not require the adaptation. Our main contributions in this paper are summarized as follows: (1) We present a novel domain-invariant NBV planner that is specifically tailored for DNN-based VPR. (2) We formulate the active VPR as a POMDP problem and present a feasible solution to address the inherent intractability. Specifically, the probability distribution vector (PDV) output by the available DNN is used as a domain-invariant observation model without the need to retrain it. (3) We verify efficacy of the proposed approach through challenging cross-season VPR experiments, where it is confirmed that the proposed approach clearly outperforms the previous single-view-based or multi-view-based VPR in terms of VPR accuracy and/or action-observation-cost.

A Close Look at Deep Learning with Small Data

Lorenzo Brigato, Luca Iocchi

Responsive image

Auto-TLDR; Low-Complex Neural Networks for Small Data Conditions

Slides Poster Similar

In this work, we perform a wide variety of experiments with different Deep Learning architectures in small data conditions. We show that model complexity is a critical factor when only a few samples per class are available. Differently from the literature, we improve the state of the art using low complexity models. We show that standard convolutional neural networks with relatively few parameters are effective in this scenario. In many of our experiments, low complexity models outperform state-of-the-art architectures. Moreover, we propose a novel network that uses an unsupervised loss to regularize its training. Such architecture either improves the results either performs comparably well to low capacity networks. Surprisingly, experiments show that the dynamic data augmentation pipeline is not beneficial in this particular domain. Statically augmenting the dataset might be a promising research direction while dropout maintains its role as a good regularizer.

Algorithm Recommendation for Data Streams

Jáder Martins Camboim De Sá, Andre Luis Debiaso Rossi, Gustavo Enrique De Almeida Prado Alves Batista, Luís Paulo Faina Garcia

Responsive image

Auto-TLDR; Meta-Learning for Algorithm Selection in Time-Changing Data Streams

Slides Poster Similar

In the last decades, many companies are taking advantage of massive data generation at high frequencies through knowledge discovery to identify valuable information. Machine learning techniques can be employed for knowledge discovery, since they are able to extract patterns from data and induce models to predict future events. However, dynamic and evolving environments generate streams of data that usually are non-stationary. Models induced in these scenarios may perish over time due to seasonality or concept drift. The periodic retraining could help but the fixed algorithm's hypothesis space could no longer be appropriate. An alternative solution is to use meta-learning for periodic algorithm selection in time-changing environments, choosing the bias that best suits the current data. In this paper, we present an enhanced framework for data streams algorithm selection based on MetaStream. Our approach uses meta-learning and incremental learning to actively select the best algorithm for the current concept in a time-changing. Different from previous works, a set of cutting edge meta-features and an incremental learning approach in the meta-level based on LightGBM are used. The results show that this new strategy can improve the recommendation of the best algorithm more accurately in time-changing data.