Overcoming Noisy and Irrelevant Data in Federated Learning

Tiffany Tuor, Shiqiang Wang, Bong Jun Ko, Changchang Liu, Kin K Leung

Auto-TLDR; Distributedly Selecting Relevant Data for Federated Learning

Many image and vision applications require a large amount of data for model training. Collecting all such data at a central location can be challenging due to data privacy and communication bandwidth restrictions. Federated learning is an effective way of training a machine learning model in a distributed manner from local data collected by client devices, without exchanging the raw data among clients. A challenge is that, among the large variety of data collected at each client, it is likely that only a subset is relevant for a learning task while the rest of the data has a negative impact on model training. Therefore, before starting the learning process, it is important to select the subset of data that is relevant to the given federated learning task. In this paper, we propose a method for distributedly selecting relevant data, where we use a benchmark model, trained on a small task-specific benchmark dataset, to evaluate the relevance of individual data samples at each client and select the data with sufficiently high relevance. Then, each client only uses the selected subset of its data in the federated learning process. The effectiveness of our proposed approach is evaluated on multiple real-world image datasets in a simulated system with a large number of clients, showing up to 25% improvement in model accuracy compared to training with all data.
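
For concreteness, here is a minimal sketch of the per-client selection step, assuming a trained task-specific benchmark model and a fixed per-sample cross-entropy threshold; the names and the thresholding rule are illustrative choices, not the authors' exact procedure.

```python
import torch
import torch.nn.functional as F

def select_relevant_indices(benchmark_model, dataset, loss_threshold, device="cpu"):
    """Return indices of local samples whose benchmark-model loss is below a threshold."""
    benchmark_model.eval().to(device)
    keep = []
    with torch.no_grad():
        for idx in range(len(dataset)):
            x, y = dataset[idx]
            logits = benchmark_model(x.unsqueeze(0).to(device))
            loss = F.cross_entropy(logits, torch.tensor([y], device=device))
            if loss.item() < loss_threshold:   # low loss -> likely relevant to the task
                keep.append(idx)
    return keep

# Each client would then run federated training only on
# torch.utils.data.Subset(dataset, keep).
```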

Similar papers

Adaptive Distillation for Decentralized Learning from Heterogeneous Clients

Jiaxin Ma, Ryo Yonetani, Zahid Iqbal

Auto-TLDR; Decentralized Learning via Adaptive Distillation

This paper addresses the problem of decentralized learning to achieve a high-performance global model by asking a group of clients to share local models pre-trained with their own data resources. We are particularly interested in a specific case where both the client model architectures and data distributions are diverse, which makes it nontrivial to adopt conventional approaches such as Federated Learning and network co-distillation. To this end, we propose a new decentralized learning method called Decentralized Learning via Adaptive Distillation (DLAD). Given a collection of client models and a large number of unlabeled distillation samples, the proposed DLAD 1) aggregates the outputs of the client models while adaptively emphasizing those with higher confidence in given distillation samples and 2) trains the global model to imitate the aggregated outputs. Our extensive experimental evaluation on multiple public datasets (MNIST, CIFAR-10, and CINIC-10) demonstrates the effectiveness of the proposed method.
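
A hedged sketch of the adaptive aggregation idea follows: client outputs on an unlabeled distillation sample are combined with confidence-based weights, and the global model is trained to imitate the aggregate. Weighting by the maximum softmax probability is an illustrative stand-in for DLAD's actual gating.

```python
import torch
import torch.nn.functional as F

def aggregated_soft_target(client_models, x):
    probs = torch.stack([F.softmax(m(x), dim=1) for m in client_models])  # (clients, B, K)
    conf = probs.max(dim=2).values                                        # per-client confidence
    weights = conf / conf.sum(dim=0, keepdim=True)                        # normalize over clients
    return (weights.unsqueeze(2) * probs).sum(dim=0)                      # (B, K) soft targets

def distillation_loss(global_model, x, client_models):
    target = aggregated_soft_target(client_models, x).detach()
    log_p = F.log_softmax(global_model(x), dim=1)
    return F.kl_div(log_p, target, reduction="batchmean")
```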

Meta Soft Label Generation for Noisy Labels

Görkem Algan, Ilkay Ulusoy

Auto-TLDR; MSLG: Meta-Learning for Noisy Label Generation

The existence of noisy labels in a dataset causes significant performance degradation for deep neural networks (DNNs). To address this problem, we propose a Meta Soft Label Generation algorithm, called MSLG, which can jointly generate soft labels using meta-learning techniques and learn DNN parameters in an end-to-end fashion. Our approach adapts the meta-learning paradigm to estimate the optimal label distribution by checking gradient directions on both noisy training data and noise-free meta-data. In order to iteratively update the soft labels, a meta-gradient descent step is performed on the estimated labels, which minimizes the loss on the noise-free meta samples. In each iteration, the base classifier is trained on the estimated meta labels. MSLG is model-agnostic and can easily be added on top of any existing model at hand. We performed extensive experiments on the CIFAR10, Clothing1M and Food101N datasets. Results show that our approach outperforms other state-of-the-art methods by a large margin. Our code is available at \url{https://github.com/gorkemalgan/MSLG_noisy_label}.

P-DIFF: Learning Classifier with Noisy Labels Based on Probability Difference Distributions

Wei Hu, Qihao Zhao, Yangyu Huang, Fan Zhang

Auto-TLDR; P-DIFF: A Simple and Effective Training Paradigm for Deep Neural Network Classifier with Noisy Labels

Learning a deep neural network (DNN) classifier with noisy labels is a challenging task because the DNN can easily overfit these noisy labels due to its high capacity. In this paper, we present a very simple but effective training paradigm called P-DIFF, which can train DNN classifiers while clearly alleviating the adverse impact of noisy labels. Our proposed probability difference distribution implicitly reflects the probability of a training sample being clean, and this probability is then employed to re-weight the corresponding sample during the training process. P-DIFF can also achieve good performance even without prior knowledge of the noise rate of the training samples. Experiments on benchmark datasets also demonstrate that P-DIFF is superior to state-of-the-art sample selection methods.
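
The re-weighting idea can be sketched roughly as follows: for each sample, compute the gap between the predicted probability of its given label and the largest probability among the other classes, then weight the sample by how large that gap is relative to the rest of the batch. Using the in-batch empirical CDF as the weight is a simplification of the paper's probability difference distribution.

```python
import torch
import torch.nn.functional as F

def pdiff_weighted_loss(logits, labels):
    probs = F.softmax(logits, dim=1)
    p_label = probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    masked = probs.scatter(1, labels.unsqueeze(1), float("-inf"))   # hide the labeled class
    p_other = masked.max(dim=1).values
    delta = p_label - p_other                    # high for likely-clean samples, low for noisy ones
    # empirical CDF of delta within the batch: weight_i = fraction of deltas <= delta_i
    weights = (delta.unsqueeze(1) >= delta.unsqueeze(0)).float().mean(dim=1)
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    return (weights.detach() * per_sample).mean()
```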

Iterative Label Improvement: Robust Training by Confidence Based Filtering and Dataset Partitioning

Christian Haase-Schütz, Rainer Stal, Heinz Hertlein, Bernhard Sick

Auto-TLDR; Meta Training and Labelling for Unlabelled Data

State-of-the-art, high-capacity deep neural networks not only require large amounts of labelled training data, they are also highly susceptible to labelling errors in this data, typically resulting in large efforts and costs and therefore limiting the applicability of deep learning. To alleviate this issue, we propose a novel meta training and labelling scheme that is able to use inexpensive unlabelled data by taking advantage of the generalization power of deep neural networks. We show experimentally that by solely relying on one network architecture and our proposed scheme of combining self-training with pseudo-labels, both label quality and resulting model accuracy can be improved significantly. Our method achieves state-of-the-art results while being architecture-agnostic and therefore broadly applicable. Compared to other methods dealing with erroneous labels, our approach neither requires another network to be trained, nor does it necessarily need an additional, highly accurate reference label set. Instead of removing samples from a labelled set, our technique uses additional sensor data without the need for manual labelling. Furthermore, our approach can be used for semi-supervised learning.

Towards Robust Learning with Different Label Noise Distributions

Diego Ortego, Eric Arazo, Paul Albert, Noel E O'Connor, Kevin Mcguinness

Auto-TLDR; Distribution Robust Pseudo-Labeling with Semi-supervised Learning

Noisy labels are an unavoidable consequence of labeling processes and detecting them is an important step towards preventing performance degradations in Convolutional Neural Networks. Discarding noisy labels avoids a harmful memorization, while the associated image content can still be exploited in a semi-supervised learning (SSL) setup. Clean samples are usually identified using the small loss trick, i.e. they exhibit a low loss. However, we show that different noise distributions make the application of this trick less straightforward and propose to continuously relabel all images to reveal a discriminative loss against multiple distributions. SSL is then applied twice, once to improve the clean-noisy detection and again for training the final model. We design an experimental setup based on ImageNet32/64 for better understanding the consequences of representation learning with differing label noise distributions and find that non-uniform out-of-distribution noise better resembles real-world noise and that in most cases intermediate features are not affected by label noise corruption. Experiments in CIFAR-10/100, ImageNet32/64 and WebVision (real-world noise) demonstrate that the proposed label noise Distribution Robust Pseudo-Labeling (DRPL) approach gives substantial improvements over recent state-of-the-art. Code will be made available.

NeuralFP: Out-Of-Distribution Detection Using Fingerprints of Neural Networks

Wei-Han Lee, Steve Millman, Nirmit Desai, Mudhakar Srivatsa, Changchang Liu

Auto-TLDR; NeuralFP: Detecting Out-of-Distribution Records Using Neural Network Models

Edge devices use neural network models learnt on the cloud to predict labels of their data records, which may lead to incorrect predictions, especially for records that are different from the data involved in the training process, i.e., out-of-distribution (OOD) records. However, recent efforts in OOD detection either require retraining of the model or assume the existence of a certain amount of OOD records, thus limiting their application in practice. In this work, we propose a novel OOD detection method (named NeuralFP) that does not require any access to OOD records and constructs non-linear fingerprints of neural network models memorizing the information of data observed during training. The key idea of NeuralFP is to exploit the difference in how the neural network model responds to data records in its training set versus data records that are anomalous. Specifically, NeuralFP builds autoencoders for each layer of the neural network model and then carefully analyzes the error distribution of the autoencoders in reconstructing the training set to identify OOD records. Through extensive experiments on multiple real-world datasets, we show the effectiveness of NeuralFP in detecting OOD records as well as its advantages over previous approaches. Furthermore, we provide useful guidelines for parameter selection in the practical adoption of NeuralFP.
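
A simplified, single-layer sketch of the fingerprinting idea: fit a small autoencoder on one layer's activations over the training data and flag records whose reconstruction error exceeds a high quantile of the training errors. The architecture and the quantile threshold are assumptions for brevity, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ActivationAE(nn.Module):
    def __init__(self, dim, hidden=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, dim)

    def forward(self, a):
        return self.decoder(self.encoder(a))

def fit_threshold(ae, train_activations, quantile=0.99, epochs=20, lr=1e-3):
    """Train the autoencoder on in-distribution activations and return an error threshold."""
    opt = torch.optim.Adam(ae.parameters(), lr=lr)
    for _ in range(epochs):
        recon = ae(train_activations)
        loss = ((recon - train_activations) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        errors = ((ae(train_activations) - train_activations) ** 2).mean(dim=1)
    return torch.quantile(errors, quantile)

def is_ood(ae, activation, threshold):
    with torch.no_grad():
        err = ((ae(activation) - activation) ** 2).mean(dim=-1)
    return err > threshold   # large reconstruction error -> likely out-of-distribution
```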

Rethinking Deep Active Learning: Using Unlabeled Data at Model Training

Oriane Siméoni, Mateusz Budnik, Yannis Avrithis, Guillaume Gravier

Auto-TLDR; Unlabeled Data for Active Learning

Active learning typically focuses on training a model on a few labeled examples alone, while unlabeled ones are only used for acquisition. In this work we depart from this setting by using both labeled and unlabeled data during model training across active learning cycles. We do so by using unsupervised feature learning at the beginning of the active learning pipeline and semi-supervised learning at every active learning cycle, on all available data. The former has not been investigated before in active learning, while the study of the latter in the context of deep learning is scarce and recent findings are not conclusive with respect to its benefit. Our idea is orthogonal to acquisition strategies by using more data, much like ensemble methods use more models. By systematically evaluating on a number of popular acquisition strategies and datasets, we find that the use of unlabeled data during model training brings a spectacular accuracy improvement in image classification, compared to the differences between acquisition strategies. We thus explore smaller label budgets, even one label per class.

Making Every Label Count: Handling Semantic Imprecision by Integrating Domain Knowledge

Clemens-Alexander Brust, Björn Barz, Joachim Denzler

Auto-TLDR; Class Hierarchies for Imprecise Label Learning and Annotation eXtrapolation

Noisy data, crawled from the web or supplied by volunteers such as Mechanical Turkers or citizen scientists, is considered an alternative to professionally labeled data. There has been research focused on mitigating the effects of label noise. It is typically modeled as inaccuracy, where the correct label is replaced by an incorrect label from the same set. We consider an additional dimension of label noise: imprecision. For example, a non-breeding snow bunting is labeled as a bird. This label is correct, but not as precise as the task requires. Standard softmax classifiers cannot learn from such a weak label because they consider all classes mutually exclusive, which non-breeding snow bunting and bird are not. We propose CHILLAX (Class Hierarchies for Imprecise Label Learning and Annotation eXtrapolation), a method based on hierarchical classification, to fully utilize labels of any precision. Experiments on noisy variants of NABirds and ILSVRC2012 show that our method outperforms strong baselines by as much as 16.4 percentage points, and the current state of the art by up to 3.9 percentage points.

Beyond Cross-Entropy: Learning Highly Separable Feature Distributions for Robust and Accurate Classification

Arslan Ali, Andrea Migliorati, Tiziano Bianchi, Enrico Magli

Auto-TLDR; Gaussian class-conditional simplex loss for adversarial robust multiclass classifiers

Deep learning has shown outstanding performance in several applications including image classification. However, deep classifiers are known to be highly vulnerable to adversarial attacks, in that a minor perturbation of the input can easily lead to an error. Providing robustness to adversarial attacks is a very challenging task especially in problems involving a large number of classes, as it typically comes at the expense of an accuracy decrease. In this work, we propose the Gaussian class-conditional simplex (GCCS) loss: a novel approach for training deep robust multiclass classifiers that provides adversarial robustness while at the same time achieving or even surpassing the classification accuracy of state-of-the-art methods. Differently from other frameworks, the proposed method learns a mapping of the input classes onto target distributions in a latent space such that the classes are linearly separable. Instead of maximizing the likelihood of target labels for individual samples, our objective function pushes the network to produce feature distributions yielding high inter-class separation. The mean values of the distributions are centered on the vertices of a simplex such that each class is at the same distance from every other class. We show that the regularization of the latent space based on our approach yields excellent classification accuracy and inherently provides robustness to multiple adversarial attacks, both targeted and untargeted, outperforming state-of-the-art approaches over challenging datasets.

Local Clustering with Mean Teacher for Semi-Supervised Learning

Zexi Chen, Benjamin Dutton, Bharathkumar Ramachandra, Tianfu Wu, Ranga Raju Vatsavai

Auto-TLDR; Local Clustering for Semi-supervised Learning

The Mean Teacher (MT) model of Tarvainen and Valpola has shown favorable performance on several semi-supervised benchmark datasets. MT maintains a teacher model's weights as the exponential moving average of a student model's weights and minimizes the divergence between their probability predictions under diverse perturbations of the inputs. However, MT is known to suffer from confirmation bias, that is, reinforcing incorrect teacher model predictions. In this work, we propose a simple yet effective method called Local Clustering (LC) to mitigate the effect of confirmation bias. In MT, each data point is considered independent of other points during training; however, data points are likely to be close to each other in feature space if they share similar features. Motivated by this, we cluster data points locally by minimizing the pairwise distance between neighboring data points in feature space. Combined with a standard classification cross-entropy objective on labeled data points, the misclassified unlabeled data points are pulled towards high-density regions of their correct class with the help of their neighbors, thus improving model performance. We demonstrate on semi-supervised benchmark datasets SVHN and CIFAR-10 that adding our LC loss to MT yields significant improvements compared to MT and performance comparable to the state of the art in semi-supervised learning.
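
A minimal sketch of such a local clustering term, assuming a batch of feature vectors and a fixed neighbourhood size; the plain squared-distance penalty is an illustrative choice and omits the Mean Teacher consistency and supervised terms it is combined with.

```python
import torch

def local_clustering_loss(features, k=5):
    """Pull each feature towards its k nearest neighbours within the batch."""
    n = features.size(0)
    # pairwise squared Euclidean distances, (N, N)
    sq = (features.unsqueeze(1) - features.unsqueeze(0)).pow(2).sum(dim=2)
    sq = sq + torch.eye(n, device=features.device) * 1e9   # exclude self-pairs
    knn_sq, _ = sq.topk(k, dim=1, largest=False)           # k smallest distances per point
    return knn_sq.mean()
```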

Adaptive Noise Injection for Training Stochastic Student Networks from Deterministic Teachers

Yi Xiang Marcus Tan, Yuval Elovici, Alexander Binder

Auto-TLDR; Adaptive Stochastic Networks for Adversarial Attacks

Adversarial attacks have been a prevalent problem causing misclassification in machine learning models, with stochasticity being a promising direction towards greater robustness. However, stochastic networks frequently underperform compared to deterministic deep networks. In this work, we present a conceptually clear adaptive noise injection mechanism in combination with teacher initialisation, which adjusts its degree of randomness dynamically through the computation of mini-batch statistics. This mechanism is embedded within a simple framework to obtain stochastic networks from existing deterministic networks. Our experiments show that our method is able to outperform prior baselines under white-box settings, exemplified on CIFAR-10 and CIFAR-100. We then perform an in-depth analysis of how varying different components of training with our approach affects robustness and accuracy, by studying the evolution of the decision boundary and the trend curves of clean accuracy and attack success over differing degrees of stochasticity. We also shed light on the effects of adversarial training on a pre-trained network, through the lens of decision boundaries.

A Close Look at Deep Learning with Small Data

Lorenzo Brigato, Luca Iocchi

Auto-TLDR; Low-Complex Neural Networks for Small Data Conditions

In this work, we perform a wide variety of experiments with different deep learning architectures in small-data conditions. We show that model complexity is a critical factor when only a few samples per class are available. Differently from the literature, we improve the state of the art using low-complexity models. We show that standard convolutional neural networks with relatively few parameters are effective in this scenario. In many of our experiments, low-complexity models outperform state-of-the-art architectures. Moreover, we propose a novel network that uses an unsupervised loss to regularize its training. Such an architecture either improves the results or performs comparably to low-capacity networks. Surprisingly, experiments show that a dynamic data augmentation pipeline is not beneficial in this particular domain. Statically augmenting the dataset might be a promising research direction, while dropout maintains its role as a good regularizer.

Generative Latent Implicit Conditional Optimization When Learning from Small Sample

Idan Azuri, Daphna Weinshall

Auto-TLDR; GLICO: Generative Latent Implicit Conditional Optimization for Small Sample Learning

We revisit the long-standing problem of learning from a small sample. The generation of new samples from a small training set of labeled points has attracted increased attention in recent years. In this paper, we propose a novel such method called GLICO (Generative Latent Implicit Conditional Optimization). GLICO learns a mapping from the training examples to a latent space and a generator that generates images from vectors in the latent space. Unlike most recent work, which relies on access to large amounts of unlabeled data, GLICO does not require access to any additional data other than the small set of labeled points. In fact, GLICO learns to synthesize completely new samples for every class using as little as 5 or 10 examples per class, with as few as 10 such classes and no data from unknown classes. GLICO is then used to augment the small training set while training a classifier on the small sample. To this end, our proposed method samples the learned latent space using spherical interpolation (slerp) and generates new examples using the trained generator. Empirical results show that the new sampled set is diverse enough, leading to improvements in image classification in comparison with the state of the art when trained on small samples obtained from CIFAR-10, CIFAR-100, and CUB-200.
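
The latent-space sampling step can be sketched with a standard spherical interpolation (slerp) routine; the trained generator that turns the interpolated code into an image is assumed to exist and is not shown.

```python
import numpy as np

def slerp(z1, z2, t):
    """Spherically interpolate between latent vectors z1 and z2 at fraction t in [0, 1]."""
    z1_n, z2_n = z1 / np.linalg.norm(z1), z2 / np.linalg.norm(z2)
    omega = np.arccos(np.clip(np.dot(z1_n, z2_n), -1.0, 1.0))   # angle between the codes
    if np.isclose(omega, 0.0):
        return (1.0 - t) * z1 + t * z2                           # fall back to linear interpolation
    return (np.sin((1.0 - t) * omega) * z1 + np.sin(t * omega) * z2) / np.sin(omega)

# new_image = generator(slerp(z_a, z_b, t=0.4))  # z_a, z_b: latent codes of same-class samples
```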

Learning Embeddings for Image Clustering: An Empirical Study of Triplet Loss Approaches

Kalun Ho, Janis Keuper, Franz-Josef Pfreundt, Margret Keuper

Auto-TLDR; Clustering Objectives for K-means and Correlation Clustering Using Triplet Loss

In this work, we evaluate two different image clustering objectives, k-means clustering and correlation clustering, in the context of Triplet Loss induced feature space embeddings. Specifically, we train a convolutional neural network to learn discriminative features by optimizing two popular versions of the Triplet Loss in order to study their clustering properties under the assumption of noisy labels. Additionally, we propose a new, simple Triplet Loss formulation, which shows desirable properties with respect to formal clustering objectives and outperforms the existing methods. We evaluate all three Triplet loss formulations for K-means and correlation clustering on the CIFAR-10 image classification dataset.
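
For reference, the common margin-based Triplet Loss that such embeddings are typically trained with; the paper compares two popular variants and proposes a third, and only the baseline formulation is sketched here.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    d_pos = F.pairwise_distance(anchor, positive)   # distance to a same-class sample
    d_neg = F.pairwise_distance(anchor, negative)   # distance to a different-class sample
    return F.relu(d_pos - d_neg + margin).mean()    # push positives closer than negatives by a margin
```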

Boundary Optimised Samples Training for Detecting Out-Of-Distribution Images

Luca Marson, Vladimir Li, Atsuto Maki

Auto-TLDR; Boundary Optimised Samples for Out-of-Distribution Input Detection in Deep Convolutional Networks

This paper presents a new approach to the problem of detecting out-of-distribution (OOD) inputs in image classification with deep convolutional networks. We leverage so-called boundary samples to enforce low confidence (maximum softmax probabilities) for inputs far away from the training data. In particular, we propose the boundary optimised samples (named BoS) training algorithm for generating them. Unlike existing approaches, it does not require an extra generative adversarial network, but achieves the goal by simply back-propagating the gradient of an appropriately designed loss function to the input samples. At the end of the BoS training, all the boundary samples are in principle located on a specific level hypersurface with respect to the designed loss. Our contributions are i) the BoS training as an efficient alternative to generate boundary samples, ii) a robust algorithm therewith to enforce low confidence for OOD samples, and iii) experiments demonstrating improved OOD detection over the baseline. We show the performance using standard datasets for training and different test sets including Fashion MNIST, EMNIST, SVHN, and CIFAR-100, preceded by evaluations with a synthetic 2-dimensional dataset that provide insight into the new procedure.
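
A hedged sketch of the input-optimization mechanism: starting from training images, gradients of a confidence-based loss are back-propagated to the inputs to push them towards the decision boundary. The specific loss and schedule used by BoS differ; this only illustrates optimizing samples rather than weights.

```python
import torch
import torch.nn.functional as F

def generate_boundary_samples(model, x_init, steps=50, lr=0.05):
    """Move copies of training inputs towards low maximum softmax probability."""
    model.eval()
    x = x_init.clone().detach().requires_grad_(True)
    opt = torch.optim.SGD([x], lr=lr)     # note: only the inputs are optimized, not the weights
    for _ in range(steps):
        probs = F.softmax(model(x), dim=1)
        loss = probs.max(dim=1).values.mean()   # minimize confidence in the predicted class
        opt.zero_grad()
        loss.backward()
        opt.step()
    return x.detach()
```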

Evaluation of Anomaly Detection Algorithms for the Real-World Applications

Marija Ivanovska, Domen Tabernik, Danijel Skocaj, Janez Pers

Auto-TLDR; Evaluating Anomaly Detection Algorithms for Practical Applications

Anomaly detection in complex data structures is one of the most challenging problems in computer vision. In many real-world problems, for example in quality control in modern manufacturing, the anomalous samples are usually rare, resulting in (highly) imbalanced datasets. However, in current research practice, these scenarios are rarely modeled, and as a consequence, evaluations of anomaly detection algorithms often do not reproduce results that are useful for practical applications. First, even in the case of highly unbalanced input data, anomaly detection algorithms are expected to significantly reduce the proportion of anomalous samples, detecting "almost all" anomalous samples (with exact specifications depending on the target customer). This places high importance on only a small part of the ROC curve, possibly rendering standard metrics such as AUC (Area Under Curve) and AP (Average Precision) useless. Second, the target of automatic anomaly detection in practical applications is a significant reduction in the manual work required, and standard metrics are poor predictors of this feature. Finally, the evaluation may produce erratic results for different randomly initialized training runs of the neural network, producing evaluation results that may not reproduce well in practice. In this paper, we present an evaluation methodology that avoids these pitfalls.

Low-Cost Lipschitz-Independent Adaptive Importance Sampling of Stochastic Gradients

Huikang Liu, Xiaolu Wang, Jiajin Li, Man-Cho Anthony So

Auto-TLDR; Adaptive Importance Sampling for Stochastic Gradient Descent

Stochastic gradient descent (SGD) usually samples training data based on the uniform distribution, which may not be a good choice because of the high variance of its stochastic gradient. Thus, importance sampling methods are considered in the literature to improve the performance. Most previous work on SGD-based methods with importance sampling requires the knowledge of Lipschitz constants of all component gradients, which are in general difficult to estimate. In this paper, we study an adaptive importance sampling method for common SGD-based methods by exploiting the local first-order information without knowing any Lipschitz constants. In particular, we periodically change the sampling distribution by only utilizing the gradient norms in the past few iterations. We prove that our adaptive importance sampling non-asymptotically reduces the variance of the stochastic gradients in SGD, and thus better convergence bounds than those for vanilla SGD can be obtained. We extend this sampling method to several other widely used stochastic gradient algorithms including SGD with momentum and ADAM. Experiments on common convex learning problems and deep neural networks illustrate notably enhanced performance using the adaptive sampling strategy.
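
A rough sketch of the sampling machinery, assuming per-sample gradient norms are recorded as they are observed; the smoothing towards the uniform distribution and the unbiasedness weights are standard ingredients of importance sampling, not necessarily the paper's exact update.

```python
import numpy as np

class GradNormSampler:
    def __init__(self, n, smoothing=0.1):
        self.p = np.full(n, 1.0 / n)   # start from the uniform sampling distribution
        self.norms = np.ones(n)        # last observed gradient norm per sample
        self.smoothing = smoothing

    def sample(self, batch_size):
        idx = np.random.choice(len(self.p), size=batch_size, p=self.p)
        weights = 1.0 / (len(self.p) * self.p[idx])   # keep the stochastic gradient unbiased
        return idx, weights

    def update(self, idx, grad_norms):
        self.norms[idx] = grad_norms                  # record the latest per-sample norms
        q = self.norms / self.norms.sum()
        self.p = (1 - self.smoothing) * q + self.smoothing / len(self.p)
```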

Image Representation Learning by Transformation Regression

Xifeng Guo, Jiyuan Liu, Sihang Zhou, En Zhu, Shihao Dong

Auto-TLDR; Self-supervised Image Representation Learning using Continuous Parameter Prediction

Self-supervised learning is a thriving research direction since it can relieve the burden of human labeling for machine learning by seeking supervision from data instead of human annotation. Although demonstrating promising performance in various applications, we observe that the existing methods usually model the auxiliary learning tasks as classification tasks with finite discrete labels, leading to insufficient supervisory signals, which in turn restricts the representation quality. In this paper, to solve the above problem and make full use of the supervision from data, we design a regression model to predict the continuous parameters of a group of transformations, i.e., image rotation, translation, and scaling. Surprisingly, this naive modification stimulates tremendous potential from data and the resulting supervisory signal has largely improved the performance of image representation learning. Extensive experiments on four image datasets, including CIFAR10, CIFAR100, STL10, and SVHN, indicate that our proposed algorithm outperforms the state-of-the-art unsupervised learning methods by a large margin in terms of classification accuracy. Crucially, we find that with our proposed training mechanism as an initialization, the performance of existing state-of-the-art classification deep architectures can be further improved.

Separation of Aleatoric and Epistemic Uncertainty in Deterministic Deep Neural Networks

Denis Huseljic, Bernhard Sick, Marek Herde, Daniel Kottke

Auto-TLDR; AE-DNN: Modeling Uncertainty in Deep Neural Networks

Despite the success of deep neural networks (DNN) in many applications, their ability to model uncertainty is still significantly limited. For example, in safety-critical applications such as autonomous driving, it is crucial to obtain a prediction that reflects different types of uncertainty to address life-threatening situations appropriately. In such cases, it is essential to be aware of the risk (i.e., aleatoric uncertainty) and the reliability (i.e., epistemic uncertainty) that comes with a prediction. We present AE-DNN, a model allowing the separation of aleatoric and epistemic uncertainty while maintaining a proper generalization capability. AE-DNN is based on deterministic DNN, which can determine the respective uncertainty measures in a single forward pass. In analyses with synthetic and image data, we show that our method improves the modeling of epistemic uncertainty while providing an intuitively understandable separation of risk and reliability.

Attack Agnostic Adversarial Defense via Visual Imperceptible Bound

Saheb Chhabra, Akshay Agarwal, Richa Singh, Mayank Vatsa

Auto-TLDR; Robust Adversarial Defense with Visual Imperceptible Bound

High susceptibility of deep learning algorithms against structured and unstructured perturbations has motivated the development of efficient adversarial defense algorithms. However, the lack of generalizability of existing defense algorithms and the high variability in the performance of attack algorithms for different databases raise several questions on the effectiveness of the defense algorithms. In this research, we aim to design a defense model that is robust within a certain bound against both seen and unseen adversarial attacks. This bound is related to the visual appearance of an image, and we term it the \textit{Visual Imperceptible Bound (VIB)}. To compute this bound, we propose a novel method that uses the database characteristics. The VIB is further used to compute the effectiveness of attack algorithms. In order to design a defense model, we propose a defense algorithm which makes the model robust within the VIB against both seen and unseen attacks. The performance of the proposed defense algorithm and the method to compute VIB are evaluated on the MNIST, CIFAR-10, and Tiny ImageNet databases on multiple attacks including C\&W ($l_2$) and DeepFool. The proposed defense algorithm is not only able to increase the robustness against several attacks but also retain or improve the classification accuracy on the original clean test set. Experimentally, it is demonstrated that the proposed defense is better than existing strong defense algorithms based on adversarial retraining. We have additionally performed the PGD attack in white-box settings and compared the results with the existing algorithms. The proposed defense is independent of the target model and adversarial attacks, and therefore can be utilized against any attack.

Multi-Modal Deep Clustering: Unsupervised Partitioning of Images

Guy Shiran, Daphna Weinshall

Auto-TLDR; Multi-Modal Deep Clustering for Unlabeled Images

The clustering of unlabeled raw images is a daunting task, which has recently been approached with some success by deep learning methods. Here we propose an unsupervised clustering framework, which learns a deep neural network in an end-to-end fashion, providing direct cluster assignments of images without additional processing. Multi-Modal Deep Clustering (MMDC), trains a deep network to align its image embeddings with target points sampled from a Gaussian Mixture Model distribution. The cluster assignments are then determined by mixture component association of image embeddings. Simultaneously, the same deep network is trained to solve an additional self-supervised task. This pushes the network to learn more meaningful image representations and stabilizes the training. Experimental results show that MMDC achieves or exceeds state-of-the-art performance on four challenging benchmarks. On natural image datasets we improve on previous results with significant margins of up to 11% absolute accuracy points, yielding an accuracy of 70% on CIFAR-10 and 61% on STL-10.

Removing Backdoor-Based Watermarks in Neural Networks with Limited Data

Xuankai Liu, Fengting Li, Bihan Wen, Qi Li

Auto-TLDR; WILD: A backdoor-based watermark removal framework using limited data

Deep neural networks have been widely applied and achieved great success in various fields. As training deep models usually consumes massive data and computational resources, trading trained deep models is in high demand and lucrative nowadays. Unfortunately, naive trading schemes typically involve potential risks related to copyright and trustworthiness issues, e.g., a sold model can be illegally resold to others without further authorization to reap huge profits. To tackle this problem, various watermarking techniques have been proposed to protect the model intellectual property, amongst which backdoor-based watermarking is the most commonly used one. However, the robustness of these watermarking approaches is not well evaluated under realistic settings, such as limited in-distribution data availability and agnosticism of watermarking patterns. In this paper, we benchmark the robustness of watermarking, and propose a novel backdoor-based watermark removal framework using limited data, dubbed WILD. The proposed WILD removes the watermarks of deep models with only a small portion of training data, and the output model can perform the same as models trained from scratch without watermarks injected. In particular, a novel data augmentation method is utilized to mimic the behavior of watermark triggers. Combined with the distribution alignment between the normal and perturbed (e.g., occluded) data in the feature space, our approach generalizes well on all typical types of trigger contents. The experimental results demonstrate that our approach can effectively remove the watermarks without compromising the deep model performance for the original task with only limited access to the training data.

Rethinking of Deep Models Parameters with Respect to Data Distribution

Shitala Prasad, Dongyun Lin, Yiqun Li, Sheng Dong, Zaw Min Oo

Auto-TLDR; A progressive stepwise training strategy for deep neural networks

The performance of deep learning models is driven by various parameters, but tuning all of them every time, for every dataset, is a heuristic practice. In this paper, unlike the common practice of decaying the learning rate, we propose a step-wise training strategy where the learning rate and the batch size are tuned based on the dataset size. Here, the given dataset size is progressively increased during training to boost the network performance without saturating the learning curve after certain epochs. We conducted extensive experiments on multiple networks and datasets to validate the proposed training strategy. The experimental results support our hypothesis that the learning rate, the batch size and the data size are interrelated and can improve the network accuracy if an optimal progressive stepwise training strategy is applied. The proposed strategy also reduces the overall training computational cost.

Rethinking Experience Replay: A Bag of Tricks for Continual Learning

Pietro Buzzega, Matteo Boschini, Angelo Porrello, Simone Calderara

Auto-TLDR; Experience Replay for Continual Learning: A Practical Approach

In Continual Learning, a Neural Network is trained on a stream of data whose distribution shifts over time. Under these assumptions, it is especially challenging to improve on classes appearing later in the stream while remaining accurate on previous ones. This is due to the infamous problem of catastrophic forgetting, which causes a quick performance degradation when the classifier focuses on learning new categories. Recent literature proposed various approaches to tackle this issue, often resorting to very sophisticated techniques. In this work, we show that naive rehearsal can be patched to achieve similar performance. We point out some shortcomings that restrain Experience Replay (ER) and propose five tricks to mitigate them. Experiments show that ER, thus enhanced, displays an accuracy gain of 51.2 and 26.9 percentage points on the CIFAR-10 and CIFAR-100 datasets respectively (memory buffer size 1000). As a result, it surpasses current state-of-the-art rehearsal-based methods.
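
For context, a minimal reservoir-sampled replay buffer of the kind ER builds on; none of the five proposed tricks are reproduced here.

```python
import random

class ReservoirBuffer:
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.data = []
        self.seen = 0

    def add(self, example):
        """Reservoir sampling keeps a uniform sample of the stream within a fixed memory."""
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, batch_size):
        """Draw a rehearsal batch to interleave with the current stream batch."""
        return random.sample(self.data, min(batch_size, len(self.data)))
```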

Enlarging Discriminative Power by Adding an Extra Class in Unsupervised Domain Adaptation

Hai Tran, Sumyeong Ahn, Taeyoung Lee, Yung Yi

Auto-TLDR; Unsupervised Domain Adaptation using Artificial Classes

We study the problem of unsupervised domain adaptation that aims at obtaining a prediction model for the target domain using labeled data from the source domain and unlabeled data from the target domain. There exists an array of recent research based on the idea of extracting features that are not only invariant for both domains but also provide high discriminative power for the target domain. In this paper, we propose an idea of improving the discriminativeness: Adding an extra artificial class and training the model on the given data together with the GAN-generated samples of the new class. The trained model based on the new class samples is capable of extracting the features that are more discriminative by repositioning data of current classes in the target domain and therefore increasing the distances among the target clusters in the feature space. Our idea is highly generic so that it is compatible with many existing methods such as DANN, VADA, and DIRT-T. We conduct various experiments for the standard data commonly used for the evaluation of unsupervised domain adaptations and demonstrate that our algorithm achieves the SOTA performance for many scenarios.

Optimal Transport As a Defense against Adversarial Attacks

Quentin Bouniot, Romaric Audigier, Angélique Loesch

Auto-TLDR; Sinkhorn Adversarial Training with Optimal Transport Theory

Deep learning classifiers are now known to have flaws in the representations of their classes. Adversarial attacks can find a human-imperceptible perturbation for a given image that will mislead a trained model. The most effective methods to defend against such attacks train on generated adversarial examples to learn their distribution. Previous work aimed to align original and adversarial image representations in the same way as domain adaptation to improve robustness. Yet, they partially align the representations using approaches that do not reflect the geometry of the space and distribution. In addition, it is difficult to accurately compare robustness between defended models. Until now, they have been evaluated using a fixed perturbation size. However, defended models may react differently to variations of this perturbation size. In this paper, the analogy of domain adaptation is taken a step further by exploiting optimal transport theory. We propose to use a loss between distributions that faithfully reflects the ground distance. This leads to SAT (Sinkhorn Adversarial Training), a more robust defense against adversarial attacks. Then, we propose to quantify more precisely the robustness of a model to adversarial attacks over a wide range of perturbation sizes using a different metric, the Area Under the Accuracy Curve (AUAC). We perform extensive experiments on both the CIFAR-10 and CIFAR-100 datasets and show that our defense is globally more robust than the state of the art.
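
The AUAC metric can be sketched directly: evaluate accuracy over a grid of perturbation sizes and integrate, e.g. with the trapezoid rule. The perturbation grid and the normalization by its range are illustrative assumptions.

```python
import numpy as np

def auac(perturbation_sizes, accuracies):
    """Area under the accuracy-vs-perturbation-size curve, normalized to the perturbation range."""
    eps = np.asarray(perturbation_sizes, dtype=float)
    acc = np.asarray(accuracies, dtype=float)
    return np.trapz(acc, eps) / (eps[-1] - eps[0])

# Example: auac([0.0, 2/255, 4/255, 8/255], [0.93, 0.71, 0.55, 0.32])
```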

Initialization Using Perlin Noise for Training Networks with a Limited Amount of Data

Nakamasa Inoue, Eisuke Yamagata, Hirokatsu Kataoka

Auto-TLDR; Network Initialization Using Perlin Noise for Image Classification

We propose a novel network initialization method using Perlin noise for training image classification networks with a limited amount of data. Our main idea is to initialize the network parameters by solving an artificial noise classification problem, where the aim is to classify Perlin noise samples into their noise categories. Specifically, the proposed method consists of two steps. First, it generates Perlin noise samples with category labels defined based on noise complexity. Second, it solves a classification problem, in which network parameters are optimized to classify the generated noise samples. This method produces a reasonable set of initial weights (filters) for image classification. To the best of our knowledge, this is the first work to initialize networks by solving an artificial optimization problem without using any real-world images. Our experiments show that the proposed method outperforms conventional initialization methods on four image classification datasets.

Efficient Online Subclass Knowledge Distillation for Image Classification

Maria Tzelepi, Nikolaos Passalis, Anastasios Tefas

Auto-TLDR; OSKD: Online Subclass Knowledge Distillation

Deploying state-of-the-art deep learning models on embedded systems dictates certain storage and computation limitations. During the recent few years Knowledge Distillation (KD) has been recognized as a prominent approach to address this issue. That is, KD has been effectively proposed for training fast and compact deep learning models by transferring knowledge from more complex and powerful models. However, knowledge distillation, in its conventional form, involves multiple stages of training, rendering it a computationally and memory demanding procedure. In this paper, a novel single-stage self knowledge distillation method is proposed, namely Online Subclass Knowledge Distillation (OSKD), that aims at revealing the similarities inside classes, improving the performance of any deep neural model in an online manner. Hence, as opposed to existing online distillation methods, we are able to acquire further knowledge from the model itself, without building multiple identical models or using multiple models to teach each other, rendering the OSKD approach more efficient. The experimental evaluation on two datasets validates that the proposed method improves the classification performance.

Class-Incremental Learning with Pre-Allocated Fixed Classifiers

Federico Pernici, Matteo Bruni, Claudio Baecchi, Francesco Turchini, Alberto Del Bimbo

Auto-TLDR; Class-Incremental Learning with Pre-allocated Output Nodes for Fixed Classifier

In class-incremental learning, a learning agent faces a stream of data with the goal of learning new classes while not forgetting previous ones. Neural networks are known to suffer under this setting, as they forget previously acquired knowledge. To address this problem, effective methods exploit past data stored in an episodic memory while expanding the final classifier nodes to accommodate the new classes. In this work, we substitute the expanding classifier with a novel fixed classifier in which a number of pre-allocated output nodes are subject to the classification loss right from the beginning of the learning phase. Contrary to the standard expanding classifier, this allows: (a) the output nodes of future unseen classes to see negative samples from the beginning of learning, together with the positive samples that incrementally arrive; (b) to learn features that do not change their geometric configuration as novel classes are incorporated in the learning model. Experiments with public datasets show that the proposed approach is as effective as the expanding classifier while exhibiting intriguing properties of the internal feature representation that are otherwise non-existent. Our ablation study on pre-allocating a large number of classes further validates the approach.

Verifying the Causes of Adversarial Examples

Honglin Li, Yifei Fan, Frieder Ganz, Tony Yezzi, Payam Barnaghi

Auto-TLDR; Exploring the Causes of Adversarial Examples in Neural Networks

The robustness of neural networks is challenged by adversarial examples that contain almost imperceptible perturbations to inputs which mislead a classifier to incorrect outputs in high confidence. Limited by the extreme difficulty in examining a high-dimensional image space thoroughly, research on explaining and justifying the causes of adversarial examples falls behind studies on attacks and defenses. In this paper, we present a collection of potential causes of adversarial examples and verify (or partially verify) them through carefully-designed controlled experiments. The major causes of adversarial examples include model linearity, one-sum constraint, and geometry of the categories. To control the effect of those causes, multiple techniques are applied such as $L_2$ normalization, replacement of loss functions, construction of reference datasets, and novel models using multi-layer perceptron probabilistic neural networks (MLP-PNN) and density estimation (DE). Our experiment results show that geometric factors tend to be more direct causes and statistical factors magnify the phenomenon, especially for assigning high prediction confidence. We hope this paper will inspire more studies to rigorously investigate the root causes of adversarial examples, which in turn provide useful guidance on designing more robust models.

On-Manifold Adversarial Data Augmentation Improves Uncertainty Calibration

Kanil Patel, William Beluch, Dan Zhang, Michael Pfeiffer, Bin Yang

Auto-TLDR; On-Manifold Adversarial Data Augmentation for Uncertainty Estimation

Uncertainty estimates help to identify ambiguous, novel, or anomalous inputs, but the reliable quantification of uncertainty has proven to be challenging for modern deep networks. To improve uncertainty estimation, we propose On-Manifold Adversarial Data Augmentation or OMADA, which specifically attempts to generate challenging examples by following an on-manifold adversarial attack path in the latent space of an autoencoder that closely approximates the decision boundaries between classes. On a variety of datasets and for multiple network architectures, OMADA consistently yields more accurate and better calibrated classifiers than baseline models, and outperforms competing approaches such as Mixup, as well as achieving similar performance to (at times better than) post-processing calibration methods such as temperature scaling. Variants of OMADA can employ different sampling schemes for ambiguous on-manifold examples based on the entropy of their estimated soft labels, which exhibit specific strengths for generalization, calibration of predicted uncertainty, or detection of out-of-distribution inputs.

Joint Supervised and Self-Supervised Learning for 3D Real World Challenges

Antonio Alliegro, Davide Boscaini, Tatiana Tommasi

Auto-TLDR; Self-supervision for 3D Shape Classification and Segmentation in Point Clouds

Point cloud processing and 3D shape understanding are very challenging tasks for which deep learning techniques have demonstrated great potential. Still, further progress is essential to allow artificially intelligent agents to interact with the real world. In many practical conditions the amount of annotated data may be limited, and integrating new sources of knowledge becomes crucial to support autonomous learning. Here we consider several scenarios involving synthetic and real-world point clouds where supervised learning fails due to data scarcity and large domain gaps. We propose to enrich standard feature representations by leveraging self-supervision through a multi-task model that can solve a 3D puzzle while learning the main task of shape classification or part segmentation. An extensive analysis investigating few-shot, transfer learning and cross-domain settings shows the effectiveness of our approach with state-of-the-art results for 3D shape classification and part segmentation.

RNN Training along Locally Optimal Trajectories via Frank-Wolfe Algorithm

Yun Yue, Ming Li, Venkatesh Saligrama, Ziming Zhang

Auto-TLDR; Frank-Wolfe Algorithm for Efficient Training of RNNs

We propose a novel and efficient training method for RNNs that iteratively seeks a local minimum of the loss surface within a small region and leverages this directional vector for the update in an outer loop. We propose to utilize the Frank-Wolfe (FW) algorithm in this context. Although FW implicitly involves normalized gradients, which can lead to a slow convergence rate, we develop a novel RNN training method for which, even with the additional cost, the overall training cost is empirically observed to be lower than that of back-propagation. Our method leads to a new Frank-Wolfe variant that is, in essence, an SGD algorithm with a restart scheme. We prove that under certain conditions our algorithm has a sublinear convergence rate of $O(1/\epsilon)$ for $\epsilon$ error. We then conduct empirical experiments on several benchmark datasets, including those that exhibit long-term dependencies, and show significant performance improvement. We also experiment with deep RNN architectures and show efficient training performance. Finally, we demonstrate that our training method is robust to noisy data.

Improving Batch Normalization with Skewness Reduction for Deep Neural Networks

Pak Lun Kevin Ding, Martin Sarah, Baoxin Li

Auto-TLDR; Batch Normalization with Skewness Reduction

Batch Normalization (BN) is a well-known technique used in training deep neural networks. The main idea behind batch normalization is to normalize the features of the layers ($i.e.$, transforming them to have a mean equal to zero and a variance equal to one). Such a procedure encourages the optimization landscape of the loss function to be smoother and improves the learning of the network in terms of both speed and performance. In this paper, we demonstrate that the performance of the network can be improved if the distributions of the features of the outputs in the same layer are similar. As normalizing based on mean and variance does not necessarily make the features have the same distribution, we propose a new normalization scheme: Batch Normalization with Skewness Reduction (BNSR). Compared with other normalization approaches, BNSR transforms not only the mean and variance but also the skewness of the data. By tackling this property of a distribution, we are able to make the output distributions of the layers more similar. The nonlinearity of BNSR may further improve the expressiveness of the underlying network. Comparisons with other normalization schemes are tested on the CIFAR-100 and ImageNet datasets. Experimental results show that the proposed approach can outperform other state-of-the-art approaches that are not equipped with BNSR.

Knowledge Distillation with a Precise Teacher and Prediction with Abstention

Xu Yi, Jian Pu, Hui Zhao

Auto-TLDR; Knowledge Distillation using Deep gambler loss and selective classification framework

Knowledge distillation, which aims to train a model (the student model) under the supervision of another, larger model (the teacher model), has achieved remarkable results in supervised learning. However, there are two major problems with existing knowledge distillation methods. One is that the teacher's supervision is sometimes misleading, and the other is that the student's prediction is not accurate enough. To address the first issue, instead of learning a combination of both teachers and ground truth, we apply knowledge adjustment to correct the teacher's supervision using ground truth. For the second problem, we use the selective classification framework to train the student model. In particular, the deep gambler loss is adopted to predict with reservation by explicitly introducing the $(m+1)$-th class. We consider two settings of knowledge distillation: (1) distillation across different network structures ({\it AlexNet, ResNet}), and (2) distillation across networks with different depths ({\it ResNet18, ResNet50}) to evaluate the effectiveness of our method. The experimental results on benchmark datasets (i.e., {\it Fashion-MNIST, SVHN, CIFAR10, CIFAR100}) are reported with higher prediction accuracies and lower coverage errors.
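
A sketch of a gambler-style loss with an explicit abstention output, assuming the commonly used form -log(p_y + p_abstain / o) with payoff o > 1; whether this matches the paper's exact variant is an assumption.

```python
import torch

def gambler_loss(logits, targets, payoff=2.6):
    """logits: (B, m+1), where the last column is the abstention/reservation class."""
    probs = torch.softmax(logits, dim=1)
    p_correct = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    p_abstain = probs[:, -1]
    # abstaining still earns partial credit, discounted by the payoff
    return -torch.log(p_correct + p_abstain / payoff + 1e-12).mean()
```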

Discriminative Multi-Level Reconstruction under Compact Latent Space for One-Class Novelty Detection

Jaewoo Park, Yoon Gyo Jung, Andrew Teoh

Auto-TLDR; Discriminative Compact AE for One-Class novelty detection and Adversarial Example Detection

In one-class novelty detection, a model learns solely on the in-class data to single out out-class instances. Autoencoder (AE) variants aim to compactly model the in-class data to reconstruct it exclusively, thus differentiating the in-class from out-class by the reconstruction error. However, compact modeling in an improper way might collapse the latent representations of the in-class data and thus their reconstruction, which would lead to performance deterioration. Moreover, to properly measure the reconstruction error of high-dimensional data, a metric is required that captures high-level semantics of the data. To this end, we propose Discriminative Compact AE (DCAE) that learns both compact and collapse-free latent representations of the in-class data, thereby reconstructing them both finely and exclusively. In DCAE, (a) we force a compact latent space to bijectively represent the in-class data by reconstructing them through internal discriminative layers of generative adversarial nets. (b) Based on the deep encoder's vulnerability to open set risk, out-class instances are encoded into the same compact latent space and reconstructed poorly without sacrificing the quality of in-class data reconstruction. (c) In inference, the reconstruction error is measured by a novel metric that computes the dissimilarity between a query and its reconstruction based on the class semantics captured by the internal discriminator. Extensive experiments on public image datasets validate the effectiveness of our proposed model on both novelty and adversarial example detection, delivering state-of-the-art performance.

Cross-People Mobile-Phone Based Airwriting Character Recognition

Yunzhe Li, Hui Zheng, He Zhu, Haojun Ai, Xiaowei Dong

Auto-TLDR; Cross-People Airwriting Recognition via Motion Sensor Signal via Deep Neural Network

Airwriting using mobile phones has many applications in human-computer interaction. However, the recognition of airwriting characters needs a lot of training data from the user, which brings great difficulties to practical application. A model learnt from a specific person often cannot yield satisfactory results when used on another person. The data gap between people is mainly caused by the following factors: personal writing styles, mobile phone sensors, and ways of holding mobile phones. To address the cross-people problem, we propose a deep neural network (DNN) that combines a convolutional neural network (CNN) and bidirectional long short-term memory (BLSTM). In each layer of the network, we also add an AdaBN layer, which is able to increase the generalization ability of the DNN. Different from the original AdaBN method, we explore the feasibility of semi-supervised learning. We implement it in our design and conduct comprehensive experiments. The evaluation results show that our system can achieve an accuracy of 99% for recognition and an improvement of 10% on average for transfer learning between various factors such as people, devices and postures. To the best of our knowledge, our work is the first to implement cross-people airwriting recognition via motion sensor signals, which is a fundamental step towards ubiquitous sensing.

A Joint Representation Learning and Feature Modeling Approach for One-Class Recognition

Pramuditha Perera, Vishal Patel

Auto-TLDR; Combining Generative Features and One-Class Classification for Effective One-class Recognition

One-class recognition is traditionally approached either as a representation learning problem or a feature modelling problem. In this work, we argue that both of these approaches have their own limitations; and a more effective solution can be obtained by combining the two. The proposed approach is based on the combination of a generative framework and a one-class classification method. First, we learn generative features using the one-class data with a generative framework. We augment the learned features with the corresponding reconstruction errors to obtain augmented features. Then, we qualitatively identify a suitable feature distribution that reduces the redundancy in the chosen classifier space. Finally, we force the augmented features to take the form of this distribution using an adversarial framework. We test the effectiveness of the proposed method on three one-class classification tasks and obtain state-of-the-art results.

Semi-Supervised Domain Adaptation Via Selective Pseudo Labeling and Progressive Self-Training

Yoonhyung Kim, Changick Kim

Auto-TLDR; Semi-supervised Domain Adaptation with Pseudo Labels

Domain adaptation (DA) is a representation learning methodology that transfers knowledge from a label-sufficient source domain to a label-scarce target domain. While most early methods focus on unsupervised DA (UDA), several studies on semi-supervised DA (SSDA) have recently been proposed. In SSDA, a small number of labeled target images are given for training, and previous studies have demonstrated the effectiveness of these data. However, previous SSDA approaches use these data only in ordinary supervised loss terms, overlooking the potential usefulness of these few yet informative clues. Based on this observation, in this paper, we propose a novel method that further exploits the labeled target images for SSDA. Specifically, we utilize labeled target images to selectively generate pseudo labels for unlabeled target images. In addition, based on the observation that pseudo labels are inevitably noisy, we apply a label noise-robust learning scheme, which alternately updates the network and the set of pseudo labels. Extensive experimental results show that our proposed method outperforms previous state-of-the-art SSDA methods.
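
One plausible reading of the selective pseudo-labeling step is sketched below: class prototypes are built from the few labeled target samples, and only unlabeled target samples that match a prototype with high confidence receive pseudo labels. The prototype construction and the threshold are assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def select_pseudo_labels(feat_unl, feat_lab, y_lab, num_classes, thresh=0.8):
    """Sketch: assign pseudo labels to unlabeled target features by
    similarity to prototypes built from labeled target features."""
    protos = torch.stack([feat_lab[y_lab == c].mean(dim=0)      # C x D
                          for c in range(num_classes)])
    sim = F.cosine_similarity(feat_unl.unsqueeze(1),            # N x C
                              protos.unsqueeze(0), dim=2)
    conf, pseudo = sim.max(dim=1)
    keep = conf > thresh              # keep only confident pseudo labels
    return pseudo[keep], keep
```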

The eXPose Approach to Crosslier Detection

Antonio Barata, Frank Takes, Hendrik Van Den Herik, Cor Veenman

Auto-TLDR; EXPose: Crosslier Detection Based on Supervised Category Modeling

Transit of waste materials within the European Union is highly regulated through a system of permits. Waste processing costs vary greatly depending on the waste category of a permit. Therefore, companies may have a financial incentive to declare transported waste under an erroneous category. Our goal is to assist inspectors in selecting potentially manipulated permits for further investigation, making their task more effective and efficient. Due to data limitations, a supervised learning approach based on historical cases is not possible. Standard unsupervised approaches, such as outlier detection and data quality-assurance techniques, are not suited either, since we are interested in targeting non-random modifications in both the category and category-correlated features. For this purpose we (1) introduce the concept of a crosslier: an anomalous instance of a category which lies across other categories; (2) propose eXPose: a novel approach to crosslier detection based on supervised category modelling; and (3) present the crosslier diagram: a visualisation tool specifically designed for domain experts to easily assess crossliers. We compare eXPose against traditional outlier detection methods on various benchmark datasets with synthetic crossliers and show the superior performance of our method in targeting these instances.
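
The abstract does not spell out the scoring rule, but the core idea of supervised category modelling can be illustrated as follows: fit a classifier over the declared categories and flag permits whose declared category is poorly supported relative to the model's preferred category. The classifier choice and the score below are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def crosslier_scores(X, declared, X_train, y_train):
    """Sketch: score each permit by how strongly a supervised category
    model prefers some other category over the declared one."""
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X_train, y_train)
    proba = clf.predict_proba(X)                    # N x C class probabilities
    idx = np.searchsorted(clf.classes_, declared)   # column of the declared class
    p_declared = proba[np.arange(len(X)), idx]
    return proba.max(axis=1) - p_declared           # high -> likely crosslier
```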

Pseudo Rehearsal Using Non Photo-Realistic Images

Bhasker Sri Harsha Suri, Kalidas Yeturu

Auto-TLDR; Pseudo-Rehearsing for Catastrophic Forgetting

Deep neural networks forget previously learnt tasks when faced with learning new tasks, a phenomenon called catastrophic forgetting. Rehearsing the network on the training data of the previous task can protect it from catastrophic forgetting. Since rehearsal requires storing the entire previous dataset, pseudo rehearsal was proposed, in which samples resembling the previous data are generated synthetically for rehearsal. In an image classification setting, while current techniques try to generate synthetic data that is photo-realistic, we demonstrate that neural networks can be rehearsed on data that is not photo-realistic and still achieve good retention of the previous task. We also demonstrate that forgoing the constraint of photo-realism in the generated data can significantly reduce the computational and memory resources consumed by pseudo rehearsal.
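
A minimal sketch of one pseudo-rehearsal training step is given below, assuming a frozen copy of the previously trained model and any generator of synthetic (not necessarily photo-realistic) images; the actual generation procedure used in the paper is not reproduced.

```python
import torch
import torch.nn.functional as F

def rehearsal_step(model, old_model, generator, x_new, y_new, optimizer, n_pseudo=64):
    """Sketch of pseudo rehearsal: mix the new-task batch with synthetic
    samples labeled by the frozen copy of the previous model."""
    with torch.no_grad():
        x_pseudo = generator(n_pseudo)                 # any callable returning n_pseudo images
        y_pseudo = old_model(x_pseudo).argmax(dim=1)   # labels from the old model
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_new), y_new) + \
           F.cross_entropy(model(x_pseudo), y_pseudo)  # new task + rehearsal terms
    loss.backward()
    optimizer.step()
    return loss.item()
```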

GuCNet: A Guided Clustering-Based Network for Improved Classification

Ushasi Chaudhuri, Syomantak Chaudhuri, Subhasis Chaudhuri

Auto-TLDR; Semantic Classification of Challenging Dataset Using Guide Datasets

We deal with the problem of semantic classification of challenging and highly cluttered datasets. We present a novel yet very simple classification technique that leverages the ease of classifiability of an existing, well-separable dataset for guidance. Since the guide dataset, which may or may not have any semantic relationship with the experimental dataset, forms well-separable clusters in the feature space, the proposed network tries to embed class-wise features of the challenging dataset into those distinct clusters of the guide set, making them more separable. Depending on availability, we propose two types of guide sets: one using texture (image) guides and another using prototype vectors representing cluster centers. Experimental results obtained on the challenging benchmark RSSCN, LSUN, and TU-Berlin datasets establish the efficacy of the proposed method, as we outperform existing state-of-the-art techniques by a considerable margin.
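
The guidance mechanism with prototype-vector guides can be sketched as a simple embedding loss that pulls each sample's feature toward the guide cluster center assigned to its class; how guides are assigned and combined with the classification loss is an assumption here, typically an additional cross-entropy term.

```python
import torch

def guided_cluster_loss(features, labels, guide_prototypes):
    """Sketch: embed features of the challenging dataset close to the
    well-separated guide cluster center assigned to their class."""
    targets = guide_prototypes[labels]             # one guide prototype per sample
    return ((features - targets) ** 2).sum(dim=1).mean()
```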

Can Data Placement Be Effective for Neural Networks Classification Tasks? Introducing the Orthogonal Loss

Brais Cancela, Veronica Bolon-Canedo, Amparo Alonso-Betanzos

Auto-TLDR; Spatial Placement for Neural Network Training Loss Functions

Traditionally, a neural network classification training loss follows the same principle: minimize the distance between samples that belong to the same class while maximizing the distance to the other classes, with no restrictions on the spatial placement of the deep features (the input to the last layer). This paper addresses this issue for neural networks, providing a set of loss functions that train a classifier by forcing the deep features to be projected onto a predefined orthogonal basis. Experimental results show that these `data placement' functions can surpass the training accuracy provided by the classic cross-entropy loss function.
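
A minimal sketch of one possible `data placement' loss is shown below, assuming the predefined orthogonal basis is simply the canonical basis of the feature space and each class is assigned one basis vector; the paper's exact loss may differ.

```python
import torch
import torch.nn.functional as F

class OrthogonalPlacementLoss(torch.nn.Module):
    """Sketch: pull each sample's (normalized) deep feature toward the
    orthogonal basis vector assigned to its class."""
    def __init__(self, num_classes, feat_dim):
        super().__init__()
        # one orthogonal vector per class; assumes feat_dim >= num_classes
        self.register_buffer("basis", torch.eye(num_classes, feat_dim))

    def forward(self, features, labels):
        target = self.basis[labels]                    # assigned basis vector
        return F.mse_loss(F.normalize(features, dim=1), target)
```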

Deep Learning on Active Sonar Data Using Bayesian Optimization for Hyperparameter Tuning

Henrik Berg, Karl Thomas Hjelmervik

Auto-TLDR; Bayesian Optimization for Sonar Operations in Littoral Environments

Sonar operations in littoral environments may be challenging due to an increased probability of false alarms. Machine learning can be used to train classifiers that automatically filter out most of the false alarms; however, this is a time-consuming process, with many hyperparameters that need to be tuned in order to yield useful results. In this paper, Bayesian optimization is used to search for good values of some of these hyperparameters, such as network topology and training parameters, resulting in performance superior to earlier trial-and-error based training. Additionally, we analyze some of the parameters involved in the Bayesian optimization, as well as the resulting hyperparameter values.
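
A hedged sketch of such a hyperparameter search is shown below, using scikit-optimize as one possible tool; the search space and the train_and_validate helper are hypothetical placeholders, not the authors' setup.

```python
from skopt import gp_minimize
from skopt.space import Integer, Real

def train_and_validate(hidden_units, num_layers, lr):
    """Placeholder: train the sonar classifier with the given
    hyperparameters and return the validation error (lower is better)."""
    raise NotImplementedError  # hypothetical hook into the training pipeline

# Hypothetical search space over topology and training parameters.
space = [Integer(32, 512, name="hidden_units"),
         Integer(1, 4, name="num_layers"),
         Real(1e-4, 1e-1, prior="log-uniform", name="lr")]

def objective(params):
    hidden_units, num_layers, lr = params
    return train_and_validate(hidden_units, num_layers, lr)

result = gp_minimize(objective, space, n_calls=30, random_state=0)
print("best hyperparameters:", result.x, "best validation error:", result.fun)
```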

MetaMix: Improved Meta-Learning with Interpolation-based Consistency Regularization

Yangbin Chen, Yun Ma, Tom Ko, Jianping Wang, Qing Li

Auto-TLDR; MetaMix: A Meta-Agnostic Meta-Learning Algorithm for Few-Shot Classification

Model-Agnostic Meta-Learning (MAML) and its variants are popular few-shot classification methods. They train an initializer across a variety of sampled learning tasks (also known as episodes) such that the initialized model can adapt quickly to new tasks. However, within each episode, current MAML-based algorithms have limitations in forming generalizable decision boundaries from only a few training examples. In this paper, we propose an approach called MetaMix. It generates virtual examples within each episode to regularize the backbone models. MetaMix can be applied to any MAML-based algorithm and learns decision boundaries that generalize better to new tasks. Experiments on the mini-ImageNet, CUB, and FC100 datasets show that MetaMix improves the performance of MAML-based algorithms and achieves state-of-the-art results when applied within Meta-Transfer Learning.
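
The interpolation-based generation of virtual examples inside an episode can be sketched with a mixup-style step; the mixing coefficient distribution and where in the episode the mixing is applied are assumptions here.

```python
import numpy as np
import torch

def metamix_batch(x, y_onehot, alpha=0.4):
    """Sketch: create virtual examples inside an episode by linearly
    interpolating pairs of examples and their one-hot labels."""
    lam = np.random.beta(alpha, alpha)        # mixing coefficient
    perm = torch.randperm(x.size(0))          # random pairing within the episode
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix
```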

Revisiting ImprovedGAN with Metric Learning for Semi-Supervised Learning

Jaewoo Park, Yoon Gyo Jung, Andrew Teoh

Auto-TLDR; Improving ImprovedGAN with Metric Learning for Semi-supervised Learning

Semi-supervised learning (SSL) is a classical problem where a model must solve a classification task while being trained on partially labeled training data. After the introduction of the generative adversarial network (GAN) and its success, the model was adapted to SSL. ImprovedGAN, a representative model for GAN-based SSL, showed promising performance on the SSL problem. However, the inner mechanism of this model has been only partially revealed. In this work, we revisit ImprovedGAN with a fresh perspective based on metric learning. In particular, we interpret ImprovedGAN through general pair weighting, a recent framework in metric learning. Based on this interpretation, we derive two theoretical properties of ImprovedGAN: (i) its discriminator learns to make confident predictions over real samples; (ii) the adversarial interaction in ImprovedGAN, along with semi-supervision, results in cluster separation by reducing the intra-class variance and increasing the inter-class variance, thereby improving model generalization. These theoretical implications are supported experimentally. Motivated by these findings, we propose a variant of ImprovedGAN, called Intensified ImprovedGAN (I2GAN), in which the cluster-separation characteristic is enhanced by two proposed techniques: (a) the unsupervised discriminator loss is scaled up and (b) the generated batch size is enlarged. As a result, I2GAN produces better class-wise cluster separation and, hence, generalization. Extensive experiments on widely known benchmark datasets verify the effectiveness of our proposed method, showing that its performance is better than or comparable to other GAN-based SSL models.
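
The two I2GAN modifications (a) and (b) can be sketched on top of an ImprovedGAN-style discriminator update; the scale factor, batch factor, and the split of the loss into supervised and unsupervised terms are assumptions, not the paper's exact values.

```python
import torch

LAMBDA_UNSUP = 2.0       # (a) scale up the unsupervised discriminator loss (assumed value)
GEN_BATCH_FACTOR = 2     # (b) enlarge the generated batch (assumed value)

def i2gan_disc_loss(sup_loss, unsup_real_loss, unsup_fake_loss):
    """Combine ImprovedGAN-style discriminator terms with the
    intensified (scaled-up) unsupervised part."""
    return sup_loss + LAMBDA_UNSUP * (unsup_real_loss + unsup_fake_loss)

def sample_generated_batch(generator, batch_size, latent_dim):
    """Draw a generated batch that is larger than the real batch."""
    z = torch.randn(GEN_BATCH_FACTOR * batch_size, latent_dim)
    return generator(z)
```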

Contextual Classification Using Self-Supervised Auxiliary Models for Deep Neural Networks

Sebastian Palacio, Philipp Engler, Jörn Hees, Andreas Dengel

Auto-TLDR; Self-Supervised Autogenous Learning for Deep Neural Networks

Classification problems solved with deep neural networks (DNNs) typically rely on a closed-world paradigm and optimize over a single objective (e.g., minimization of the cross-entropy loss). This setup dismisses all kinds of supporting signals that could be used to reinforce the existence or absence of particular patterns. The increasing need for models that are interpretable by design makes the inclusion of such contextual signals a crucial necessity. To this end, we introduce the notion of Self-Supervised Autogenous Learning (SSAL). An SSAL objective is realized through one or more additional targets that are derived from the original supervised classification task, following architectural principles found in multi-task learning. SSAL branches impose low-level priors into the optimization process (e.g., grouping). The ability to use SSAL branches during inference allows models to converge faster, focusing on a richer set of class-relevant features. We equip state-of-the-art DNNs with SSAL objectives and report consistent improvements for all of them on CIFAR100 and ImageNet. We show that SSAL models outperform similar state-of-the-art methods focused on contextual loss functions, auxiliary branches and hierarchical priors.
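
One way to read the autogenous objective is sketched below: an auxiliary branch predicts a coarse grouping that is derived directly from the original class labels, so no extra annotation is needed. The grouping map and the unweighted sum of the two losses are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def ssal_loss(fine_logits, aux_logits, fine_labels, group_of_class):
    """Sketch of a self-supervised autogenous objective: multi-task loss
    over the original classes and a coarse grouping derived from them."""
    coarse_labels = group_of_class[fine_labels]      # derived target, e.g. class -> superclass
    return F.cross_entropy(fine_logits, fine_labels) + \
           F.cross_entropy(aux_logits, coarse_labels)
```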

Rethinking Domain Generalization Baselines

Francesco Cappio Borlino, Antonio D'Innocente, Tatiana Tommasi

Auto-TLDR; Style Transfer Data Augmentation for Domain Generalization

Despite being very powerful in standard learning settings, deep learning models can be extremely brittle when deployed in scenarios different from those on which they were trained. Domain generalization methods investigate this problem, and data augmentation strategies have been shown to be helpful tools for increasing data variability, supporting model robustness across domains. In our work we focus on style transfer data augmentation and show how it can be implemented with a simple and inexpensive strategy to improve generalization. Moreover, we analyze the behavior of current state-of-the-art domain generalization methods when integrated with this augmentation solution: our thorough experimental evaluation shows that their original effect almost always disappears with respect to the augmented baseline. This issue opens new scenarios for domain generalization research, highlighting the need for novel methods that can properly take advantage of the introduced data variability.
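
Style transfer data augmentation can be illustrated with an AdaIN-style stylization step, one common lightweight way to realize such augmentation; this is an illustration under assumed NCHW feature maps, not necessarily the paper's exact pipeline.

```python
import torch

def adain_stylize(content_feat, style_feat, eps=1e-5):
    """Sketch: re-normalize content feature maps with the channel-wise
    statistics of a style image's feature maps (AdaIN-style transfer)."""
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content_feat - c_mean) / c_std + s_mean
```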