Hi-Fi: Hierarchical Feature Integration for Skeleton Detection

Kai Zhao1, Wei Shen2, Shanghua Gao1, Dandan Li2, Ming-Ming Cheng1

1: College of Computer and Control Engineering, Nankai University.

2: Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Shanghai University.


In natural images, the scales (thickness) of object skeletons may dramatically vary among objects and object parts, making object skeleton detection a challenging problem. We present a new convolutional neural network (CNN) architecture by introducing a novel hierarchical feature integration mechanism, named Hi-Fi, to address the skeleton detection problem. The proposed CNN-based approach has a powerful multi-scale feature integration ability that intrinsically captures high-level semantics from deeper layers as well as low-level details from shallower layers. By hierarchically integrating different CNN feature levels with bidirectional guidance, our approach (1) enables mutual refinement across features of different levels, and (2) possesses the strong ability to capture both rich object context and high-resolution details. Experimental results show that our method significantly outperforms the state-of-the-art methods in terms of effectively fusing features from very different scales, as evidenced by a considerable performance improvement on several benchmarks.


Different multi-scale CNN feature fusion methods: (a) side-outputs as independent detectors at different scales; (b) deep-to-shallow refinement that brings high-level semantics to lower layers; (c) direct fusion of all feature levels at once; (d) our hierarchical integration architecture, which enables bidirectional mutual refinement across low- and high-level features by recursive feature integration.
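The recursive integration in (d) can be illustrated with a toy sketch. This is not the authors' implementation: feature maps are reduced to 1-D lists, and the learned fusion layers of the network are replaced by nearest-neighbour upsampling followed by averaging, purely as a stand-in. The function names (`upsample`, `fuse`, `integrate`, `hierarchical_integration`) are hypothetical. What the sketch does show is the structure: each step fuses every pair of adjacent levels, so after k steps each feature mixes information from k+1 of the original levels.

```python
def upsample(feat, size):
    """Nearest-neighbour upsampling of a 1-D feature list to length `size`."""
    n = len(feat)
    return [feat[i * n // size] for i in range(size)]


def fuse(shallow, deep):
    """Fuse two adjacent levels.

    Assumption: averaging stands in for the learned fusion used in a real
    network; only the wiring between levels is being illustrated.
    """
    up = upsample(deep, len(shallow))
    return [(s + d) / 2.0 for s, d in zip(shallow, up)]


def integrate(levels):
    """One integration step: fuse every pair of adjacent levels."""
    return [fuse(levels[i], levels[i + 1]) for i in range(len(levels) - 1)]


def hierarchical_integration(levels, steps):
    """Apply `steps` integration steps and return every level of the hierarchy."""
    hierarchy = [levels]
    for _ in range(steps):
        levels = integrate(levels)
        hierarchy.append(levels)
    return hierarchy


# Four toy feature levels, from shallow (high resolution) to deep (low resolution).
side_features = [[1.0] * 8, [2.0] * 4, [4.0] * 2, [8.0] * 1]
hierarchy = hierarchical_integration(side_features, steps=2)
```

With four input levels, the first step yields three fused features and the second yields two, which is the narrowing pattern that distinguishes hierarchical integration from fusing all levels at once as in (c).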

Performance Evaluation:

We evaluate the proposed method on four skeleton datasets in terms of F-measure and precision-recall (PR) curves. The datasets are SK-SMALL, SK-LARGE, SYM-PASCAL and WH-SYMMAX.
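For readers unfamiliar with the metric, the F-measure is the harmonic mean of precision and recall. The sketch below computes it and picks the best operating point along a PR curve. It assumes the reported score is the maximum F-measure over the curve, which is a common convention but is not stated on this page; the function names and the sample points are hypothetical and are not results from the paper.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def best_f_measure(pr_points):
    """Return (F, precision, recall) for the point with the highest F-measure.

    `pr_points` is an iterable of (precision, recall) pairs, for example one
    pair per detection threshold.
    """
    return max((f_measure(p, r), p, r) for p, r in pr_points)


# Illustrative PR points only; these are not measurements from any dataset.
example_curve = [(0.9, 0.1), (0.7, 0.6), (0.3, 0.9)]
best = best_f_measure(example_curve)
```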

Performance comparison between different methods on four popular skeleton datasets. Our proposed Hi-Fi network outperforms other methods by a clear margin.

PR-curve comparison of skeleton detectors on different datasets.


If our method is helpful to your research, please consider citing:
  title     = {Hi-{F}i: Hierarchical Feature Integration for Skeleton Detection},
  author    = {Kai Zhao and Wei Shen and Shanghua Gao and Dandan Li and Ming-Ming Cheng},
  booktitle = {Proceedings of the Twenty-Seventh International Joint Conference on
               Artificial Intelligence, {IJCAI-18}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  pages     = {1191--1197},
  year      = {2018},
  month     = {7},
  doi       = {10.24963/ijcai.2018/166},
  url       = {http://kaizhao.net/hifi},
[arXiv], [TeX source code] and [Slides].

Code and pretrained models:

Full training and testing code is available at https://github.com/zeakey/skeleton.

We provide pretrained models and pre-computed results, along with evaluation results, on several datasets:
  1. Hi-Fi2 on SK-LARGE dataset (model, detections and evaluation results): http://data.kaizhao.net/projects/hifi/hifi2-sklarge-results.tar
  2. Hi-Fi2 on SYM-PASCAL dataset (model, detections and evaluation results): http://data.kaizhao.net/projects/hifi/hifi2-pascal-results.tar
  3. Hi-Fi2 on WH-SYMMAX dataset (model, detections and evaluation results): http://data.kaizhao.net/projects/hifi/hifi2-horse-results.tar
  4. FSDS on SK-LARGE dataset (pretrained model only): http://data.kaizhao.net/projects/skeleton/fsds-pretrained-skl.zip
  5. FSDS on WH-SYMMAX dataset (pretrained model only): http://data.kaizhao.net/projects/skeleton/fsds-wh-symmax.caffemodel
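The files listed above can be fetched with a few lines of standard-library Python. This is a minimal sketch: the URLs are copied from the list, but the dictionary keys, the `downloads` directory and the helper names are hypothetical, and the links are assumed to still be reachable. Nothing is downloaded until `fetch` is called.

```python
import os
import urllib.request
from urllib.parse import urlparse

# URLs as listed above; the short keys are hypothetical labels.
RELEASES = {
    "hifi2-sklarge": "http://data.kaizhao.net/projects/hifi/hifi2-sklarge-results.tar",
    "hifi2-pascal": "http://data.kaizhao.net/projects/hifi/hifi2-pascal-results.tar",
    "hifi2-horse": "http://data.kaizhao.net/projects/hifi/hifi2-horse-results.tar",
    "fsds-sklarge": "http://data.kaizhao.net/projects/skeleton/fsds-pretrained-skl.zip",
    "fsds-whsymmax": "http://data.kaizhao.net/projects/skeleton/fsds-wh-symmax.caffemodel",
}


def local_name(url):
    """File name component of a URL, used as the local file name."""
    return os.path.basename(urlparse(url).path)


def fetch(key, dest_dir="downloads"):
    """Download one release into `dest_dir` (skipped if already present)."""
    url = RELEASES[key]
    os.makedirs(dest_dir, exist_ok=True)
    path = os.path.join(dest_dir, local_name(url))
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path


# Example (performs a network request):
#   fetch("hifi2-sklarge")
```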