Revision as of 20:46, 19 May 2011

Summer 2011 (TA)

Research: Gram Matrix approximation

May 19

Spectral Hashing is Data dependent hashing and more powerful than the random projection, see this comparison examples in the 2D.
Random Projection Hashing
- Method:
  - Take random projections of data.
  - Quantize the projections to hash value’s bits.
- Disadvantage:
  - Data Independent Hashing.
- Advantage:
  - Strong theoretical guarantees.
  - Simple and fast to compute (linear time).
- Types:
  - Euclidean distance: two parameters to vary that have the opposite affect (number of random vectors k and quantization width w).
  - L1 distance: uncontrolled, as k increase error increase.
  - Cosine distance: more controlled than L1 as k increase.
Data-Dependent Hashing
- Advantage:
  - Data-distribution dependent; powerful than Random Projection.
- Disadvantage:
  - More complex to compute than Random Projection.
- Types:
  - Spectral hashing:
    - Compute PCA.
    - Calculate the k smallest single-dimension analytical EigenFunctions.
    - Threshold the analytical EigenFnction of zero to obtain hash value's bits.
  - KernelLSH:
    - The key idea is first embed the data into a lower space using the kernel G and then do LSH in the lower dimensional space.

May 5

Recap
- Problem: kernel methods' gram matrix is O(N^2), which is computationally and memory usage infeasible for large-scale datasets
- Previous Work:
  - Low rank matrix approximation:
    - using EigenDecomposition: Optimal solution
      - Problem: requires O(N^3) computation to construct the low rank matrix
      - Solution: use Nytsrom theorem to approximate the EigenDecomposition of the matrix which requires O(m^2N) where m << N.
      - Problem: In this methods in despite of reducing the computation complexity still we need O(N^2) to store the matrix, so the next methods reduce both computation and memory.
  - Efficient implementation for computing the kernel matrix:
    - Using Spatial Indexing:
      - using Z-order, H-order, kd-tress and other spatial indexing techniques to order the points in the space and then using sliding window where the kernel function computed only between points in that window.
      - Problems: requires O(NlogN) as preprocessing to order the points, fails for very high dimensional Data and hard to parallel
      - Solution: We proposed a method that works fine for large-scale and highly dimensional data that overcomes all pervious works drawbacks, easy to parallel, requires linear time preprocessing and reduce both computation and memory usage down form quadratic to subquadratic:
    - Using Spatial Hashing or Locality sensitive hashing: where the kernel function and the cluster methods works on the buckets individually.
      - LSH families :
        Random projection:
        Hamming distance, Euclidean Distance, Cosine Distance
        
        Problem for random projection, assume data uniformly distrubuted and it doesn’t take in account if data are in a kernel space, Solution:
        
        Spectral Hashing: overcomes previous families problems
        
        Kernelized LSH : Problem: more complex to compute
- In progress: Comparing different LSH families

Spring 2011 (TA/RA)

Courses:
- CMPT 885: Special Topics in Computer Architecture
- CMPT 886: Special Topics in Operating Systems

Fall 2010 (FAS-GF/RA)

Courses:
- CMPT 705: Design/Analysis Algorithms
- CMPT 820: Multimedia Systems

Worked on Gram matrix approximation, High Performance/large scale application case studies

Summer 2010 (TA)

Worked on large Scale Kernel methods

Spring 2010 (RA)

Courses:
- CMPT 768: Sound Synthesis

Worked on Locality sensitive hashing families and their applications and implementations

@@ Line 1: / Line 1: @@
 = Summer 2011 (TA) =
 * Research: Gram Matrix approximation
 === May 19 ===
-* Spectral Hashing is Data dependent hashing and more powerful than the random projection, see this comparison examples in the 2D.
+* Spectral Hashing is Data dependent hashing and more powerful than the random projection, see this [https://cs-nsl-svn.cs.surrey.sfu.ca/cssvn/nsl-members/dameh/summer11/doc/RandomVsSpectral.pdf comparison examples] in the 2D.
 * Random Projection Hashing
 ** Method:

Difference between revisions of "Private:progress-dameh"

Revision as of 20:46, 19 May 2011

Contents

Summer 2011 (TA)

May 19

May 5

Spring 2011 (TA/RA)

Fall 2010 (FAS-GF/RA)

Summer 2010 (TA)

Spring 2010 (RA)