Sunday, June 25, 2017


Image querying based on content-based image retrieval (CBIR) techniques has attracted considerable attention owing to its wide applicability. We are particularly interested in how the user's semantics can be integrated into a CBIR system. In CBIR, the user's semantics draw on two sources of information: the similarity relations entailed by the content-based features, and the relevance relations specified in the feedback. We also look into selecting a good feature set to improve retrieval performance.


Semantic Manifold Learning for Image Retrieval

  • Goal: address CBIR with relevance feedback through manifold learning
    • Two kinds of information are fused: Intrinsic Similarity Relations, and Query & Relevance Feedback
    • User-specific Semantic Manifold
  • A manifold-learning technique that effectively fuses these multiple modalities of information
    • Intrinsic Similarity Relations: abundant, and auxiliary
    • User Relevance Feedback: few, but crucial
  • User relevance feedback augments the intrinsic similarity relations
    • Augmented Relations Embedding (ARE)
    • Augmented features: global and local image features
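
The fusion outlined above can be illustrated with a small, generic graph-embedding sketch. It is not the ARE formulation from the paper: the objective, the weight `beta`, the regularizer `eps`, and every function name below are assumptions made for illustration. The sketch builds a k-nearest-neighbor graph for the abundant similarity relations, adds two small graphs for the relevant and irrelevant feedback, and looks for a linear projection that spreads irrelevant pairs apart while keeping similar and relevant pairs close.

```python
import numpy as np

def knn_graph(x, k):
    """Symmetric 0/1 adjacency linking each row of x to its k nearest rows."""
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)              # exclude self-matches
    w = np.zeros_like(d2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(len(x)), k)
    w[rows, nearest.ravel()] = 1.0
    return np.maximum(w, w.T)

def pair_graph(n, pairs):
    """Symmetric 0/1 adjacency for an explicit list of (i, j) relations."""
    w = np.zeros((n, n))
    for i, j in pairs:
        w[i, j] = w[j, i] = 1.0
    return w

def laplacian(w):
    """Unnormalized graph Laplacian D - W."""
    return np.diag(w.sum(axis=1)) - w

def fused_embedding(x, query, relevant, irrelevant,
                    k=5, dim=2, beta=10.0, eps=1e-6):
    """Hypothetical objective (an assumption, not the paper's):
    maximize the spread of irrelevant pairs relative to the spread
    of similar pairs plus beta times the spread of relevant pairs."""
    n = len(x)
    w_sim = knn_graph(x, k)                                  # abundant, auxiliary
    w_pos = pair_graph(n, [(query, i) for i in relevant])    # few, crucial
    w_neg = pair_graph(n, [(query, i) for i in irrelevant])
    xc = x - x.mean(axis=0)
    a = xc.T @ laplacian(w_neg) @ xc
    b = xc.T @ (laplacian(w_sim) + beta * laplacian(w_pos)) @ xc
    b += eps * np.eye(x.shape[1])             # keep b positive definite
    # Solve a v = lambda b v through a Cholesky factor of b.
    c_inv = np.linalg.inv(np.linalg.cholesky(b))
    m = c_inv @ a @ c_inv.T
    vals, vecs = np.linalg.eigh((m + m.T) / 2.0)
    top = vecs[:, np.argsort(vals)[::-1][:dim]]
    return xc @ (c_inv.T @ top)               # project onto the top directions

rng = np.random.default_rng(0)
x = rng.normal(size=(40, 8))                  # toy stand-in for image features
y = fused_embedding(x, query=0, relevant=[1, 2], irrelevant=[3, 4, 5])
```

Here `y` holds one low-dimensional point per image. In a feedback loop, the two feedback graphs would grow after each round while the similarity graph stays fixed.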



  • Global
    • Color
      • Quantized HSV color histogram: 64 dimensions
      • First three moments in each color channel: 9 dimensions
      • Color coherence vector: 128 dimensions
    • Texture
      • Tamura coarseness: 10 dimensions
      • Tamura directionality: 8 dimensions
    • Wavelet
      • 3-level DWT image decomposition
      • The first two moments in each high-frequency sub-band: 18 dimensions (9 sub-bands × 2 moments)
  • Local
    • Detecting salient regions (associated with interest points)
    • Describing the local properties of each salient region
    • Difference-of-Gaussian (DoG) for salient-region detection [Lowe 2004]
  • Augmented Features
    • The proposed image representation
    • Augmenting the local-feature scheme with global image properties so that vector quantization can be used to derive a k-dimensional vector
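
As a rough illustration of two of the global color features listed above, the sketch below computes a 64-bin quantized HSV histogram and nine color moments using only the Python standard library. The 4 × 4 × 4 split of the hue, saturation, and value axes is an assumption (the text gives only the total of 64 dimensions), as is the choice of RGB channels for the moments and the use of the signed cube root of the third central moment as the third moment. The color coherence vector, texture, and wavelet features are not covered.

```python
import colorsys
import math

def hsv_histogram(pixels, bins=(4, 4, 4)):
    """Normalized quantized HSV histogram; 4*4*4 = 64 bins by assumption."""
    hb, sb, vb = bins
    hist = [0.0] * (hb * sb * vb)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # min(...) keeps a value of exactly 1.0 inside the last bin
        hi = min(int(h * hb), hb - 1)
        si = min(int(s * sb), sb - 1)
        vi = min(int(v * vb), vb - 1)
        hist[(hi * sb + si) * vb + vi] += 1.0
    n = float(len(pixels))
    return [count / n for count in hist]

def color_moments(pixels):
    """Mean, standard deviation, and signed cube root of the third
    central moment for each of the three channels (9 values)."""
    feats = []
    n = float(len(pixels))
    for c in range(3):
        vals = [p[c] / 255.0 for p in pixels]
        mean = sum(vals) / n
        var = sum((x - mean) ** 2 for x in vals) / n
        third = sum((x - mean) ** 3 for x in vals) / n
        skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
        feats.extend([mean, math.sqrt(var), skew])
    return feats

# Toy "image": a flat list of 8-bit RGB pixels.
pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (128, 128, 128)]
hist = hsv_histogram(pixels)
moments = color_moments(pixels)
features = hist + moments      # 64 + 9 = 73 of the global dimensions
```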



Visualization of Embedding Space

  • The iterative procedure
    • Use ARE to compute a 30-D semantic manifold
    • Project data points onto a 2-D plane by multidimensional scaling (MDS) [Cox and Cox, 1994] for visualization
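
The projection step can be sketched with classical (Torgerson) MDS, which is one member of the MDS family; the text does not say which variant was used, so this choice is an assumption. The random 30-D points stand in for the output of ARE, and the function name is hypothetical.

```python
import numpy as np

def classical_mds(points, out_dim=2):
    """Classical MDS from Euclidean distances between the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)            # squared pairwise distances
    n = d2.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    b = -0.5 * j @ d2 @ j                      # double-centered Gram matrix
    w, v = np.linalg.eigh(b)                   # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:out_dim]        # keep the largest ones
    return v[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

rng = np.random.default_rng(0)
embedded = rng.normal(size=(50, 30))   # stand-in for a 30-D semantic manifold
coords_2d = classical_mds(embedded)    # one (x, y) point per image for plotting
```

Because the 2-D plot keeps only the two leading directions, distances in it approximate those in the 30-D space rather than reproduce them.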


Office Interiors





Semantic Manifold Learning for Image Retrieval

Yen-Yu Lin, Tyng-Luh Liu, and Hwann-Tzong Chen

ACM International Conference on Multimedia (ACM MM), December 2005 (Best Student Papers Session)