Back to the HPC main page (http://nsl.cs.sfu.ca/wiki/index.php/hpc)
Implementing the kernel matrix approximation algorithm of [HA09] (http://www.cs.umd.edu/~mhussein/papers/sc09_BAG_ha.pdf) on a multi-core CPU
Advantages:
- Faster kernel matrix computation than brute-force on the CPU
- Handling larger data sets by using the sparse representation of [HA09]
Possible improvements:
- Using Hilbert curves instead of the Z-curve (a Z-order sketch follows this list)
- Using Locality Sensitive Hashing (http://www.mit.edu/~andoni/LSH/)
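As a concrete reference point for these curve-based reorderings, here is a minimal Python sketch of Z-order (Morton) reordering; the grid resolution, helper names, and random test data are illustrative assumptions rather than details from [HA09].

```python
import numpy as np

def morton_key_2d(x, y, bits=16):
    """Interleave the bits of two integer grid coordinates into one Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return key

def z_order_reorder(points, bits=16):
    """Quantize 2-D points to a 2^bits grid and sort them along the Z-curve."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    grid = ((points - lo) / (hi - lo + 1e-12) * ((1 << bits) - 1)).astype(np.uint64)
    keys = [morton_key_2d(int(gx), int(gy), bits) for gx, gy in grid]
    order = np.argsort(keys)
    return points[order], order

# Nearby points in the plane tend to end up nearby in the ordering,
# which is what concentrates large kernel values near the diagonal.
pts = np.random.rand(1000, 2)
reordered, perm = z_order_reorder(pts)
```

A Hilbert-curve variant would change only the key function; Hilbert curves avoid the long jumps the Z-curve makes at quadrant boundaries, which is the usual argument for a tighter band.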
Fast support vector machines
- Use the data reordering principle as in [HA09]
- Use locality sensitive hashing to reorder data in the input space
- Compute the kernel matrix using the new data reordering; the kernel matrix can be efficiently computed in parallel on multiple cores (see the sketch after this list)
- Note: It is sufficient to obtain sub-optimal data reordering because it is only used to create the band kernel matrix
- Use QR decomposition to compute the eigendecomposition and regularize the kernel matrix
- Use the standard sequential minimal optimization (SMO) to compute the support vectors
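The reorder-then-band part of this pipeline can be sketched on a multi-core CPU with Python's process pool; the Gaussian RBF kernel, the band half-width, and the worker count below are illustrative assumptions, not choices fixed by [HA09].

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def rbf_row(args):
    """Kernel values between point i and its in-band neighbours (Gaussian RBF)."""
    X, i, half_band, gamma = args
    lo = max(0, i - half_band)
    hi = min(len(X), i + half_band + 1)
    d2 = np.sum((X[lo:hi] - X[i]) ** 2, axis=1)
    return i, lo, np.exp(-gamma * d2)

def banded_kernel(X, half_band=40, gamma=1.0, workers=4):
    """Split rows across processes; only entries within `half_band`
    of the diagonal are ever computed, the rest stay zero."""
    n = len(X)
    K = np.zeros((n, n))
    tasks = [(X, i, half_band, gamma) for i in range(n)]
    with ProcessPoolExecutor(max_workers=workers) as ex:
        for i, lo, vals in ex.map(rbf_row, tasks, chunksize=64):
            K[i, lo:lo + len(vals)] = vals
    return K

if __name__ == "__main__":
    X = np.random.rand(500, 8)   # assume rows were already reordered (Z-order / LSH)
    K = banded_kernel(X)
```

Because rows are independent, the loop parallelizes with no communication; a sub-optimal reordering only shows up as kernel mass falling outside the band, not as an error in the computed entries, consistent with the note above.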
Advantages:
- Faster SVMs
- Handling larger data sets
Possible improvements:
- If we use kernel functions with compact support, can we relate the kernel bandwidth to the neighborhood size of LSH? If we can, we might end up with a symmetric, positive semi-definite kernel matrix that needs no further regularization (see the sketch below).
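To make the question concrete, here is a hedged sketch using the Wendland C2 kernel, a standard compactly supported kernel that is positive definite in up to three dimensions; tying its `support` radius to the LSH neighborhood size is an assumption of this illustration, not an established result.

```python
import numpy as np

def wendland_c2(r, support=1.0):
    """Wendland C2 kernel: (1 - q)^4 (4q + 1) for q = r/support < 1, else 0.
    Compact support makes every entry beyond `support` exactly zero."""
    q = r / support
    return np.where(q < 1.0, (1.0 - q) ** 4 * (4.0 * q + 1.0), 0.0)

# If the reordering places every pair closer than `support` inside the band,
# the banded matrix equals the full kernel matrix: symmetric, positive
# semi-definite, and in need of no further regularization.
X = np.random.rand(200, 2)
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
K = wendland_c2(D, support=0.3)
```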
Exa-scale support vector machines
- The algorithm to be developed in (2) lends itself to implementation on high-performance clusters: compute the new data reordering on a master node
- Divide the kernel matrix into blocks and compute each block on a separate node, based on the number of nodes, the size of the data, and the memory available on each node
- Run SMO on the master node and use message passing to communicate the required values (a message-passing sketch follows this list)
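A minimal message-passing sketch of this layout, assuming mpi4py is available; the problem sizes, the Gaussian RBF kernel, and the final reduction standing in for SMO's kernel-value requests are all illustrative.

```python
# Run as: mpiexec -n 4 python kernel_blocks.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n, d = 2000, 8
X = np.random.rand(n, d) if rank == 0 else np.empty((n, d))
comm.Bcast(X, root=0)   # at exa-scale, each node would receive only its slice

# Each node computes one horizontal block of the kernel matrix.
rows = np.array_split(np.arange(n), size)[rank]
gamma = 1.0
block = np.exp(-gamma * ((X[rows][:, None, :] - X[None, :, :]) ** 2).sum(-1))

# Stand-in for SMO on the master requesting values from the workers.
total = comm.reduce(block.sum(), op=MPI.SUM, root=0)
if rank == 0:
    print("sum of all kernel entries:", total)
```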
Advantages:
- Natural scalability to increasing data sizes