Kernel Trick
(Deep Learning with Python - François Chollet pg. 15)
If data is mapped to a sufficiently high-dimensional space, the decision boundary can be a hyperplane. This works well in theory, but explicitly mapping the input into that high-dimensional space is computationally expensive. This is where the kernel trick comes in:
A kernel function takes any two points in your initial space and returns their similarity (formally, their inner product) in the target representation space, without ever explicitly mapping the inputs into that space.
E.g.
- Polynomial Kernel: \(k(x_i, x_j) = (x_i \cdot x_j)^d\)
- Gaussian Kernel (Radial Basis Function): \(k(x_i, x_j) = \exp(-\gamma\|x_i - x_j\|^2)\) for \(\gamma > 0\)
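
A quick numerical check of the trick for the degree-2 polynomial kernel above (a minimal NumPy sketch, not from the book): for a 2-D point the explicit feature map \(\phi(x_1, x_2) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)\) satisfies \(\phi(x) \cdot \phi(z) = (x \cdot z)^2\), so the kernel returns the inner product in the 3-D space without ever constructing it:

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2-D point (x1, x2)."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def poly_kernel(x, z, d=2):
    """Polynomial kernel k(x, z) = (x . z)^d."""
    return np.dot(x, z) ** d

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

# Same value, but poly_kernel never builds the 3-D representation.
print(np.dot(phi(x), phi(z)))  # explicit mapping: 16.0
print(poly_kernel(x, z))       # kernel trick:     16.0
```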
A Support Vector Machine (SVM) in conjunction with the kernel trick gives good results on small datasets.
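
A minimal sketch of a kernelized SVM using scikit-learn (the library, dataset, and hyperparameter values are my choices, not from the book); `gamma` here is the same \(\gamma\) as in the Gaussian kernel above:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Small, nonlinearly separable 2-D dataset.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# RBF-kernel SVM: the kernel trick lets it find a hyperplane
# in the implicit high-dimensional space.
clf = SVC(kernel="rbf", gamma=1.0, C=1.0)
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
```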