
Back to the HPC main page (http://nsl.cs.sfu.ca/wiki/index.php/hpc)

Implementing the kernel matrix approximation algorithm of [HA09] on a multi-core CPU

Advantages:

  • Faster kernel matrix computation than brute-force on the CPU
  • Handling larger data sets by using the sparse representation of [HA09]

Possible improvements:

  • Using Hilbert curves instead of the Z-curve (the sketch below uses the plain Z-curve)
  • Using Locality Sensitive Hashing [LSH]
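
To make the band approximation concrete, here is a minimal Python sketch (our illustration, not the [HA09] implementation; all function names are ours): points are sorted along the Z-curve so that nearby points receive nearby indices, and an RBF kernel matrix is then filled only inside a band around the diagonal. Each row of the band is independent, so the loop can be farmed out across CPU cores (e.g., with multiprocessing.Pool).

 import numpy as np
 
 def morton_code_2d(x, y, bits=16):
     """Interleave the bits of quantized x/y coordinates (Z-order / Morton code)."""
     code = 0
     for i in range(bits):
         code |= ((x >> i) & 1) << (2 * i)
         code |= ((y >> i) & 1) << (2 * i + 1)
     return code
 
 def z_order_permutation(points, bits=16):
     """Sort 2-D points along the Z-curve so nearby points get nearby indices."""
     lo, hi = points.min(axis=0), points.max(axis=0)
     q = ((points - lo) / (hi - lo + 1e-12) * ((1 << bits) - 1)).astype(np.int64)
     codes = [morton_code_2d(int(px), int(py), bits) for px, py in q]
     return np.argsort(codes)
 
 def band_kernel(points, perm, half_bandwidth, gamma=1.0):
     """RBF kernel evaluated only inside a band around the diagonal.
     Stored dense here for clarity; [HA09] keeps only the band (sparse)."""
     X = points[perm]
     n = len(X)
     K = np.zeros((n, n))
     for i in range(n):            # rows are independent -> easy to parallelize
         j0, j1 = max(0, i - half_bandwidth), min(n, i + half_bandwidth + 1)
         d2 = ((X[j0:j1] - X[i]) ** 2).sum(axis=1)
         K[i, j0:j1] = np.exp(-gamma * d2)
     return K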


Fast support vector machines

  • Use the data reordering principle as in [HA09]
  • Use locality sensitive hashing [LSH] to reorder the data in the input space (a sketch follows this list)
  • Compute the kernel matrix using the new data ordering; the kernel matrix can be computed efficiently in parallel on multiple cores
  • Note: a sub-optimal data reordering is sufficient because the ordering is only used to create the band kernel matrix
  • Use QR decomposition to compute the eigendecomposition and regularize the kernel matrix (see the second sketch below)
  • Use the standard sequential minimal optimization (SMO) algorithm to compute the support vectors
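
The reordering step could look like the following hedged sketch, which uses sign-random-projection hashing (one simple LSH family; [LSH] describes p-stable alternatives better suited to Euclidean distance). The function name and parameters are ours:

 import numpy as np
 
 def lsh_order(X, n_bits=8, seed=0):
     """Order points by a sign-random-projection hash code: points that fall
     on the same side of many random hyperplanes are likely close, so sorting
     by the packed code makes probable neighbors contiguous."""
     rng = np.random.default_rng(seed)
     planes = rng.standard_normal((n_bits, X.shape[1]))  # random hyperplanes
     bits = (X @ planes.T) > 0                           # one hash bit per plane
     codes = bits @ (1 << np.arange(n_bits))             # pack bits into an integer
     return np.argsort(codes, kind="stable")

Sorting by the raw hash code is only a heuristic ordering, which is acceptable here: as noted above, a sub-optimal reordering suffices.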

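One reading of the regularization bullet, as a minimal sketch under that assumption: run a symmetric eigensolver (numpy's eigh, which delegates to LAPACK; the QR decomposition mentioned above is one standard route to this eigenproblem) and clip negative eigenvalues so the band-approximated matrix is positive semi-definite again.

 import numpy as np
 
 def psd_regularize(K, eps=0.0):
     """Clip negative eigenvalues so the band-approximated kernel matrix
     becomes symmetric positive semi-definite again."""
     w, V = np.linalg.eigh((K + K.T) / 2)  # symmetrize, then eigendecompose
     w = np.clip(w, eps, None)             # drop negative eigenvalues
     return (V * w) @ V.T                  # reassemble V diag(w) V^T
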
Advantages:

  • Faster SVMs
  • Handling larger data sets

Possible improvements:

  • If we use kernel functions with compact support, can we relate the kernel bandwidth to the neighborhood size of LSH? If we can do that, we might end up with a symmetric positive semi-definite kernel matrix that does not need further regularization.
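
To make the question concrete: a compactly supported kernel is exactly zero for pairs farther apart than its support radius, so if the LSH neighborhood radius is matched to that radius, the band matrix discards nothing. Below is a sketch using a Wendland kernel, one standard compactly supported positive-definite family; the specific choice is ours, not from the discussion above.

 import numpy as np
 
 def wendland_kernel(r, radius):
     """Wendland C^2 kernel: positive definite in up to three dimensions and
     exactly zero for r >= radius, so far-apart pairs contribute nothing."""
     t = np.clip(1.0 - r / radius, 0.0, None)
     return t**4 * (4.0 * r / radius + 1.0)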


Exa-scale support vector machines

  • The algorithm to be developed in (2) (the fast SVM scheme above) lends itself to implementation on high-performance clusters
  • Compute the new data reordering on a master node
  • Divide the kernel matrix into blocks and compute each block on a separate node, based on the number of nodes, the size of the data, and the memory available on each node
  • Run SMO on the master node and use message passing to communicate the required values (a sketch follows this list)
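
A hedged mpi4py sketch of that block layout (an assumed structure, not a finished design): rank 0 holds the reordered data, each rank fills its own slab of kernel rows, and the slabs are gathered back only to demonstrate the layout; a real implementation would keep the blocks distributed and serve SMO's kernel requests over messages.

 import numpy as np
 from mpi4py import MPI
 
 # run with e.g.: mpiexec -n 4 python this_script.py
 comm = MPI.COMM_WORLD
 rank, size = comm.Get_rank(), comm.Get_size()
 
 # The master holds the (already reordered) data; everyone gets a copy.
 X = np.random.default_rng(0).random((1000, 8)) if rank == 0 else None
 X = comm.bcast(X, root=0)
 
 # Each rank owns a contiguous slab of kernel rows, sized by the node count.
 n = X.shape[0]
 lo, hi = rank * n // size, (rank + 1) * n // size
 gamma = 1.0
 
 # Fill this node's rows of the (here: dense, full) RBF kernel matrix.
 d2 = ((X[lo:hi, None, :] - X[None, :, :]) ** 2).sum(axis=2)
 K_block = np.exp(-gamma * d2)
 
 # SMO on the master would request kernel values by message passing; the
 # gather below only demonstrates the block layout.
 blocks = comm.gather(K_block, root=0)
 if rank == 0:
     print("assembled kernel matrix:", np.vstack(blocks).shape)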

Advantages:

  • Natural scalability to increasing data sizes


References

[HA09] M. Hussein and W. Abd-Almageed, "Efficient Band Approximation of Gram Matrices for Large Scale Kernel Methods on GPUs," Supercomputing, 2009. http://www.cs.umd.edu/~mhussein/papers/sc09_BAG_ha.pdf

[LSH] http://www.mit.edu/~andoni/LSH/