
7. Kernel Functions: Connection with Basis Functions, General Properties, Examples

Kernel Functions

Kernel functions play a crucial role in machine learning, enabling simpler computations in high-dimensional data spaces. In more precise terms, these functions are used to calculate the inner product of two points in an appropriate feature space, hence sometimes referred to as "generalized dot products."

A kernel function is typically denoted as:

K(x, y) = \langle \phi(x), \phi(y) \rangle

Here:

  • x, y represent the input data points.

  • \phi is the feature map, responsible for mapping input data into a high-dimensional feature space.

  • \langle \phi(x), \phi(y) \rangle is the dot product in the feature space.

This definition highlights the essential property of kernel functions: they evaluate the dot product of data points after these have been mapped into a higher-dimensional space by the feature map, while working only with the original, lower-dimensional representations.
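As a minimal sketch of this identity, consider the homogeneous degree-2 polynomial kernel K(x, y) = \langle x, y \rangle^2 on 2D inputs, whose feature map \phi(x) = (x_1^2, \sqrt{2} x_1 x_2, x_2^2) is known in closed form (the kernel choice and variable names here are illustrative):

```python
import numpy as np

def phi(x):
    # Explicit feature map for the homogeneous degree-2 polynomial kernel in 2D:
    # phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

def poly2_kernel(x, y):
    # Kernel form: K(x, y) = <x, y>^2, computed without ever building phi
    return np.dot(x, y) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

# Both routes give the same value: <phi(x), phi(y)> == <x, y>^2
explicit = np.dot(phi(x), phi(y))
implicit = poly2_kernel(x, y)
print(explicit, implicit)  # both 121.0
```

The kernel form needs only a 2D dot product, yet it agrees exactly with the dot product taken in the 3D feature space.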

Connection Between Kernel Functions and Basis Functions

Basis functions are building blocks used to represent other functions. Like a feature map, they transform data into a new space.

A kernel function can usually be viewed as being constructed from a set of basis functions, with each dimension of the transformed feature space corresponding to one basis function. Unlike working with the basis functions directly, however, the kernel trick allows us to compute dot products in this feature space without explicitly calculating the coordinates of the data in it, even though that space can be infinite-dimensional.
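The savings can be substantial. As an illustration (the numbers n and d are arbitrary choices), the explicit feature space of the inhomogeneous polynomial kernel ( \langle x, y \rangle + 1 )^d on n-dimensional inputs has \binom{n+d}{d} coordinates, while the kernel itself needs only one n-term dot product:

```python
from math import comb

# Dimensionality of the explicit feature space for the polynomial kernel
# (<x, y> + 1)^d on n-dimensional inputs: C(n + d, d) monomial features.
n, d = 100, 5
explicit_dims = comb(n + d, d)
print(explicit_dims)  # 96560646 coordinates to compute explicitly
print(n)              # vs. a 100-term dot product via the kernel trick
```

This is why kernelized methods remain tractable even when the implicit feature space is enormous.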

General Characteristics of Kernel Functions

Kernel functions possess several significant properties, including:

  1. Symmetry: Kernel functions are symmetric, implying that for any two points x and y, K(x, y) = K(y, x).

  2. Positive semi-definiteness: Kernel functions are positive semi-definite. In simpler terms, for any finite set of points, the kernel matrix (or Gram matrix), obtained by evaluating the kernel at all pairs of these points, is positive semi-definite.

A function K(x, y) is defined as a kernel function if and only if for any set {x_1, ..., x_m}, the related kernel matrix K = [K(x_i, x_j)]_{i,j=1}^m is positive semi-definite.
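This characterization can be checked numerically for a concrete kernel. The sketch below (using the Gaussian kernel and randomly generated points as an example) builds a Gram matrix and verifies that it is symmetric with non-negative eigenvalues:

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel: K(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))  # 20 random points in R^3

# Gram matrix: K_ij = K(x_i, x_j)
K = np.array([[rbf_kernel(xi, xj) for xj in X] for xi in X])

# Symmetry and positive semi-definiteness (eigenvalues >= 0 up to rounding)
assert np.allclose(K, K.T)
assert np.linalg.eigvalsh(K).min() > -1e-10
```

A numerical check like this cannot prove a function is a kernel (that requires the condition to hold for every point set), but it is a quick way to catch a function that is not one.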

Examples of Kernel Functions

Various kernel functions are commonly used in real-world applications:

  • Linear Kernel: This is the simplest type of kernel, used when the data can be linearly separated. It's defined as: K(x, y) = \langle x, y \rangle.

  • Polynomial Kernel: The polynomial kernel introduces non-linearity and is often used when the data isn't linearly separable. It's defined as: K(x, y) = ( \langle x, y \rangle + c )^d, where c \geq 0 is a free parameter trading off the influence of higher-order versus lower-order terms in the polynomial, and d is the degree of the polynomial.

  • Radial Basis Function (RBF) or Gaussian Kernel: The RBF kernel is a popular kernel for support vector machines and is also used in other kernelized models. It's defined as: K(x, y) = e^{-\dfrac{||x-y||^2}{2\sigma^2}}, where ||x-y||^2 is the squared Euclidean distance between the two points and \sigma > 0 is a bandwidth parameter.
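The three kernels above can be sketched as plain functions (parameter defaults c = 1, d = 3, \sigma = 1 are illustrative choices, not canonical values):

```python
import numpy as np

def linear_kernel(x, y):
    # K(x, y) = <x, y>
    return np.dot(x, y)

def polynomial_kernel(x, y, c=1.0, d=3):
    # K(x, y) = (<x, y> + c)^d
    return (np.dot(x, y) + c) ** d

def rbf_kernel(x, y, sigma=1.0):
    # K(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2 * sigma ** 2))

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])
print(linear_kernel(x, y))      # 0.0
print(polynomial_kernel(x, y))  # 1.0
print(rbf_kernel(x, y))         # exp(-1) ~ 0.3679
```

Note that the RBF kernel depends on the points only through their distance, so K(x, x) = 1 for every x.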

Constructing New Kernels

New kernel functions can be designed by merging existing kernel functions. For instance, if K_1 and K_2 are kernels, then the following are also kernels:

  • K(x, y) = aK_1(x, y) + bK_2(x, y) for any a, b \geq 0.

  • K(x, y) = K_1(x, y) K_2(x, y).
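Both closure rules can be observed numerically. The sketch below (with an arbitrary random point set, a = 0.5, b = 2.0) builds Gram matrices for a linear kernel and an RBF kernel and checks that their non-negative combination and elementwise product are still positive semi-definite:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(15, 2))  # 15 random points in R^2

# Two base Gram matrices on the same points: linear and Gaussian (sigma = 1)
K1 = X @ X.T
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K2 = np.exp(-sq_dists / 2.0)

def is_psd(K, tol=1e-10):
    # A symmetric matrix is PSD iff its smallest eigenvalue is >= 0 (up to rounding)
    return np.linalg.eigvalsh(K).min() > -tol

# Non-negative linear combination stays PSD
a, b = 0.5, 2.0
assert is_psd(a * K1 + b * K2)

# Elementwise product of the Gram matrices corresponds to the product kernel;
# it is PSD by the Schur product theorem
assert is_psd(K1 * K2)
```

The product rule rests on the Schur product theorem: the elementwise product of two positive semi-definite matrices is again positive semi-definite.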