# VC-Dimension

@VC-Dimension

## Labeling patterns/Dichomotmies

Let $\mathcal{X}$ be a set of points. Let $\mathcal{H}$ be a set of hypotheses on these points. Let $x_1,\dots,x_n \in \mathcal{X}$. The set of dichotomies (or labeling patterns) generated by $\mathcal{H}$ on $x_1 , \dots , x_n$ is defined by $$\mathcal{H}(x_1, \dots x_n) = \{ h(x_1), \dots , h(x_n) | h \in \mathcal{H} \}$$ For example let $x_1,x_2,x_3$ be three points on a line spaced with $1$. Let $\mathcal{H}$ be circles with radius $0.1$. The circles are defined so that points inside the circle has label +, points outside has label -. Now we can place a circle with center as one of the three points. Hence we can produce the labeling patterns $$[+,-,-],[-,+,-],[-,-,+]$$ But we can't cover more than one of the points by one circle, hence we cannot generate for example the pattern $$[+,+,-]$$

## The growth function

The growth function on the hypotheses space $\mathcal{H}$ is the maximum number of labeling patterns $\mathcal{H}$ can generate on $n$ points. That is $$m_{\mathcal{H}}(n) = max | \mathcal{H}(x_1 , \dots x_n) |$$ In the above example we have a maximum of 3.

## Shattered points

A set of points $x_1, \dots , x_{n}$ is shattered by $\mathcal{H}$ if functions from $\mathcal{H}$ can produce all binary labels of $x_1 , \dots , x_{n}$. That is if $$| \mathcal{H}(x_1 , \dots , x_n) | = 2^n$$ In the above example we can shatter any one point. That is for any two of the three points we can't produce the pattern $$[+,+]$$

## The Vapnik-Chervonenkis (or VC) Dimension

The VC dimension of the hypotheses space $\mathcal{H}$ is denoted by $d_{VC}(\mathcal{H})$. It is the maximum number of points that can be shattered by $\mathcal{H}$, that is $$d_{VC}(\mathcal{H}) = max\{ n | m_{\mathcal{H}} = 2^n \}$$ If $m_{\mathcal{H}}(n) = 2^n$ for all $n$, then the VC dimension for $\mathcal{H}$ is infinite. As stated for the above example: we have a VC-dimension of 1, since we cannot produce the pattern $$[+,+]$$