Zhiguo Cao

Papers from this author

Exploiting Distilled Learning for Deep Siamese Tracking

Chengxin Liu, Zhiguo Cao, Wei Li, Yang Xiao, Shuaiyuan Du, Angfan Zhu

Auto-TLDR; Distilled Learning Framework for Siamese Tracking

Existing deep siamese trackers are typically built on off-the-shelf CNN models for feature learning, which demand substantial power and memory. This prevents current deep siamese trackers from being deployed on resource-constrained devices such as mobile phones, where deployment must be cost-effective. In this work, we address this issue by presenting a novel Distilled Learning Framework (DLF) for siamese tracking, which aims to learn a tracking model that is both efficient and accurate. Specifically, we propose two simple yet effective knowledge distillation strategies, denoted point-wise distillation and pair-wise distillation, which are designed to transfer knowledge from a more discriminative teacher tracker to a compact student tracker. In this way, cost-effective, high-performance tracking can be achieved. Extensive experiments on several tracking benchmarks demonstrate the effectiveness of our proposed method.
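
The abstract names the two distillation strategies but does not define their losses. The following is a minimal sketch of one plausible reading, not the authors' formulation: point-wise distillation is assumed to match teacher and student outputs entry by entry, and pair-wise distillation is assumed to match the pairwise cosine similarities between feature vectors. The function names `pointwise_loss`, `cosine`, and `pairwise_loss` are hypothetical.

```python
import math

def pointwise_loss(teacher, student):
    """Mean squared error between matching entries of two equal-length score lists.
    Assumption: 'point-wise' means the student mimics each teacher output directly."""
    assert len(teacher) == len(student)
    return sum((t - s) ** 2 for t, s in zip(teacher, student)) / len(teacher)

def cosine(u, v):
    """Cosine similarity of two vectors; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def pairwise_loss(teacher_feats, student_feats):
    """Mean squared error between the teacher's and the student's pairwise
    cosine-similarity matrices.
    Assumption: 'pair-wise' means the student preserves relations between samples."""
    n = len(teacher_feats)
    assert n == len(student_feats)
    total = 0.0
    for i in range(n):
        for j in range(n):
            diff = (cosine(teacher_feats[i], teacher_feats[j])
                    - cosine(student_feats[i], student_feats[j]))
            total += diff ** 2
    return total / (n * n)
```

Under this reading, the pair-wise term only compares similarity structure, so the teacher and student feature vectors may have different dimensions, which suits a compact student.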

Multi-Direction Convolution for Semantic Segmentation

Dehui Li, Zhiguo Cao, Ke Xian, Xinyuan Qi, Chao Zhang, Hao Lu

Auto-TLDR; Multi-Direction Convolution for Contextual Segmentation

Context is known to be one of the crucial factors affecting the performance of semantic segmentation. However, state-of-the-art segmentation models built upon fully convolutional networks are inherently weak at encoding contextual information because they stack local operations such as convolution and pooling. Failing to capture context leads to inferior segmentation performance. Although many context modules have been proposed to relieve this problem, they still operate in a local manner or use the same contextual information at different positions (due to upsampling). In this paper, we introduce the idea of Multi-Direction Convolution (MDC), a novel operator capable of encoding rich contextual information. This operator is inspired by the observation that standard convolution only slides along the spatial dimensions (x, y directions) while the channel dimension (z direction) is fixed, which results in slow growth of the receptive field (RF). If the channel-fixed convolution is considered one-direction, MDC is multi-direction in the sense that it slides along both spatial and channel dimensions, i.e., it slides along x, y when z is fixed, along x, z when y is fixed, and along y, z when x is fixed. In this way, MDC is able to encode rich contextual information with a fast increase of the RF. Compared to existing context modules, the encoded context is position-sensitive because no upsampling is required. MDC is also efficient and easy to implement: it can be built from a few standard convolution layers combined with permutation. We show through extensive experiments that MDC effectively and selectively enlarges the RF and outperforms existing contextual modules on two standard benchmarks, Cityscapes and PASCAL VOC2012.
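
The permutation idea can be illustrated with a small pure-Python sketch. It is not the authors' operator: it assumes a fixed 3x3 box-sum filter in place of learned convolution kernels and filters each slice independently, purely to show how permuting axes lets the same 2-D sliding window grow the receptive field along x, y, and z. The helper names `permute`, `box_filter_slices`, `multi_direction`, and `count_nonzero` are hypothetical.

```python
def permute(t, order):
    """Reorder the axes of a 3-D nested list; new axis n is old axis order[n]."""
    shape = (len(t), len(t[0]), len(t[0][0]))
    new_shape = [shape[a] for a in order]
    out = [[[0.0] * new_shape[2] for _ in range(new_shape[1])]
           for _ in range(new_shape[0])]
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                idx = (i, j, k)
                out[idx[order[0]]][idx[order[1]]][idx[order[2]]] = t[i][j][k]
    return out

def box_filter_slices(t):
    """Zero-padded 3x3 box-sum over the last two axes, one leading-axis slice at a time."""
    out = []
    for plane in t:
        rows, cols = len(plane), len(plane[0])
        new_plane = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < rows and 0 <= cc < cols:
                            new_plane[r][c] += plane[rr][cc]
        out.append(new_plane)
    return out

def multi_direction(t):
    """Input layout is (z, y, x). Slide over (y, x) with z fixed, then over (z, x)
    with y fixed, then over (z, y) with x fixed, restoring the layout each time."""
    t = box_filter_slices(t)
    t = permute(box_filter_slices(permute(t, (1, 0, 2))), (1, 0, 2))
    t = permute(box_filter_slices(permute(t, (2, 0, 1))), (1, 2, 0))
    return t

def count_nonzero(t):
    """Number of non-zero entries, used here as a proxy for receptive-field size."""
    return sum(1 for plane in t for row in plane for v in row if v != 0)
```

Starting from a single non-zero entry at the centre of a 5x5x5 tensor, three passes of `box_filter_slices` alone never leave the original z slice, whereas `multi_direction` spreads the response along all three axes with the same number of filtering passes.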

Parallel Network to Learn Novelty from the Known

Shuaiyuan Du, Chaoyi Hong, Zhiyu Pan, Chen Feng, Zhiguo Cao

Auto-TLDR; Trainable Parallel Network for Pseudo-Novel Detection

For multi-class novelty detection, we propose an end-to-end trainable Parallel Network (PN) that uses no additional data, only the training set itself. Our key idea is to first divide the training set into successive subtasks of pseudo-novelty detection to simulate real scenarios. We then design a multi-branch PN to address this fine-grained division, which yields a compressed and more discriminative classification space and forms a natural ensemble. In practice, we divide the training data into subsets consisting of known and pseudo-novel classes. Each subset forms a subtask fed to one branch of PN. During training, both known and pseudo-novel classes are uniformly distributed over the branches for better data balance and model diversity. By distinguishing between the known and the diverse pseudo-novel classes, PN extracts the concept of novelty in a compressed classification space. This gives PN the ability to generalize to truly novel classes that are absent during training. During online inference, this ability is further strengthened by the ensemble of PN's multiple branches. Experiments on three public datasets show that our method outperforms mainstream methods.
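
The abstract does not spell out how classes are assigned to branches. As a hedged illustration only, the sketch below assumes a round-robin split in which each training class serves as pseudo-novel in exactly one branch and as known in all the others. The function name `make_subtasks` and the dictionary keys are hypothetical.

```python
def make_subtasks(classes, num_branches):
    """Split training classes into one subtask per branch.
    Assumption: classes are dealt round-robin, so branch b holds out every
    num_branches-th class starting at index b as its pseudo-novel set."""
    if not 1 <= num_branches <= len(classes):
        raise ValueError("num_branches must be between 1 and the number of classes")
    subtasks = []
    for b in range(num_branches):
        pseudo_novel = classes[b::num_branches]
        held_out = set(pseudo_novel)
        known = [c for c in classes if c not in held_out]
        subtasks.append({"known": known, "pseudo_novel": pseudo_novel})
    return subtasks
```

With six classes and three branches, for example, each branch would hold out two classes as pseudo-novel and train on the remaining four as known, which keeps the per-branch class counts balanced.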