Summer 2011 (TA)

Research: Gram Matrix approximation

Aug 8

working on comparing LSH methods with Space filling Curve methods.
- for the low level comparing (matrix level using average FN): we get almost same average FN for both spectral hashing and H-curve.
- comparing in the high level (clustering error):
  - I am picking the optimal value of k for spectral hashing and choosing a widow size for H-curve that gives us same approximated matrix size.
  - In the affinity propagation, using small value of self similarity (s(i,i)) we get small numbers of clusters and vise verse , so I am changing self similarity values between max and min to get same number of clusters for both methods, which will be the x-axis, on the other hand, the y axis will be the average distance between each point and its Exemplar.
PS: I was sick for a week after I got back form China and then busy in setting up and grading assignments for cmpt300.

July 10 - July 24

Attending workshop on High Performance Computing at National University of Defense Technology, China.

July 8

Comparing our method using the "Locality Sensitive Hashing" to the method using "Hilbert Space filling Curve"
Initial results can be found at this address students/dameh/reports/dameh/summer11/H-CurveVSlsh.pdf

Jun 17

Report should be available here

Jun 03

Proposed new metric to be used instead of the FN ratio (more powerful to compare different families and to pick the optimal value of k)
More analysis on the spectral hashing
Writing up the report with the results to be sent soon

May 19

Spectral Hashing is Data dependent hashing and more powerful than the random projection, see this comparison examples in the 2D.
Random Projection Hashing
- Method:
  - Take random projections of data.
  - Quantize the projections to hash value’s bits.
- Disadvantage:
  - Data Independent Hashing.
- Advantage:
  - Strong theoretical guarantees.
  - Simple and fast to compute (linear time).
- Types:
  - Euclidean distance: two parameters to vary that have the opposite affect (number of random vectors k and quantization width w).
  - L1 distance: uncontrolled, as k increase error increase.
  - Cosine distance: more controlled than L1 as k increase.
Data-Dependent Hashing
- Advantage:
  - Data-distribution dependent; powerful than Random Projection.
- Disadvantage:
  - More complex to compute than Random Projection.
- Types:
  - Spectral hashing:
    - Compute PCA.
    - Calculate the k smallest single-dimension analytical EigenFunctions.
    - Threshold the analytical EigenFnction of zero to obtain hash value's bits.
  - KernelLSH:
    - The key idea is first embed the data into a lower space using the kernel G and then do LSH in the lower dimensional space.

May 5

Recap
- Problem: kernel methods' gram matrix is O(N^2), which is computationally and memory usage infeasible for large-scale datasets
- Previous Work:
  - Low rank matrix approximation:
    - using EigenDecomposition: Optimal solution
      - Problem: requires O(N^3) computation to construct the low rank matrix
      - Solution: use Nytsrom theorem to approximate the EigenDecomposition of the matrix which requires O(m^2N) where m << N.
      - Problem: In this methods in despite of reducing the computation complexity still we need O(N^2) to store the matrix, so the next methods reduce both computation and memory.
  - Efficient implementation for computing the kernel matrix:
    - Using Spatial Indexing:
      - using Z-order, H-order, kd-tress and other spatial indexing techniques to order the points in the space and then using sliding window where the kernel function computed only between points in that window.
      - Problems: requires O(NlogN) as preprocessing to order the points, fails for very high dimensional Data and hard to parallel
      - Solution: We proposed a method that works fine for large-scale and highly dimensional data that overcomes all pervious works drawbacks, easy to parallel, requires linear time preprocessing and reduce both computation and memory usage down form quadratic to subquadratic:
    - Using Spatial Hashing or Locality sensitive hashing: where the kernel function and the cluster methods works on the buckets individually.
      - LSH families :
        Random projection:
        Hamming distance, Euclidean Distance, Cosine Distance
        
        Problem for random projection, assume data uniformly distrubuted and it doesn’t take in account if data are in a kernel space, Solution:
        
        Spectral Hashing: overcomes previous families problems
        
        Kernelized LSH : Problem: more complex to compute
- In progress: Comparing different LSH families

Spring 2011 (TA/RA)

Courses:
- CMPT 885: Special Topics in Computer Architecture
- CMPT 886: Special Topics in Operating Systems

Fall 2010 (FAS-GF/RA)

Courses:
- CMPT 705: Design/Analysis Algorithms
- CMPT 820: Multimedia Systems

Worked on Gram matrix approximation, High Performance/large scale application case studies

Summer 2010 (TA)

Worked on large Scale Kernel methods

Spring 2010 (RA)

Courses:
- CMPT 768: Sound Synthesis

Worked on Locality sensitive hashing families and their applications and implementations

Private:progress-dameh

Contents

Summer 2011 (TA)

Aug 8

July 10 - July 24

July 8

Jun 17

Jun 03

May 19

May 5

Spring 2011 (TA/RA)

Fall 2010 (FAS-GF/RA)

Summer 2010 (TA)

Spring 2010 (RA)