Private:hpc-discussion

Implementing kernel matrix approximation algorithm of [1] on a multi-core CPU

Advantages:

Faster kernel matrix computation than brute-force on the CPU

Handling larger data sets by using the sparse representation of [1]

Possible improvements:

Using Hilbert curves instead of the Z-curve

Using Locality Sensitive Hash Functions

Fast support vector machines

Use the data reordering principal as in [1]

Use locality sensitive hashing [2] to reorder data in the input space

Compute the kernel matrix using the new data reordering The kernel matrix can be efficiently computed in parallel on multiple cores

Note: It is sufficient to obtain sub-optimal data reordering because it is only used to create the band kernel matrix

Use QR decomposition to compute the eigen decomposition and regularize the kernel matrix

Use the standard sequential minimal optimization (SMO) to compute the support vectors

Advantages:

Faster SVMs

Handling larger data sets

Possible improvements:

If we use kernel functions with compact support, can we relate the kernel bandwidth to the neighborhood size of LSH? If we can do that, we might end up with a symmetric, positive, semi-definite kernel matrix that does not need further regularization

Exa-scale support vector machines

The algorithm to be developed in (2) lends itself to implementation on high performance clusters Compute the new data reordering on a master node

Divide the kernel matrix into blocks and compute each block on a separate node, based on number of nodes

size of data memory available on each node Run SMO on the master node and use message passing to communicate required values

Advantages

natural scalability to increasing data sizes

References [1] M. Hussein and W. Abd-Almageed, "Efficient Band Approximation of Gram Matrices for Large Scale Kernel Methods on GPUs," Supercomputing, 2009 [2] http://www.mit.edu/~andoni/LSH/