Data Structures | Typedefs | Enumerations | Functions

src/som.h File Reference

STATUS: experimental Self-Organizing Maps More...

#include "definitions.h"
#include "helper.h"
#include "distances.h"
This graph shows which files directly or indirectly include this file:

Go to the source code of this file.

Data Structures

struct  Som
 1-dimensional SOM with trivial topology (neighbouring nodes). More...

Typedefs

typedef double(* NeighbourhoodFunction )(int, int, struct som_struct *, int t)
typedef double(* TimeDecayFunction )(int, int, int)
 alpha = f( t, nruns, nruns in initial_phase )

Enumerations

enum  SOMConnectivityType { ONED_LINEAR, TWOD_GRID, TWOD_HEXAGONAL, CUSTOM }
 

defines the dimension/structure of the SOM.

More...

Functions

void som_free (Som *s)
double ** som_generate_connectivity_matrix (SOMConnectivityType type, double **m, int n)
Somsom_init (int dimension, int n, int nruns, SOMConnectivityType connectivity)
void som_initialize_random (Som *s, double min, double max)
void som_initialize_random_samples (Som *s, double **X, int dim, int nsamples)
double som_neighbourhood_gaussian (int x, int bmu, struct som_struct *s, int t)
void som_print (FILE *f, Som *s)
double som_time_decay_linear (int t, int nruns, int initial_runs)
void som_train_from_data (Som *s, double **X, int dim, int nsamples)

Detailed Description

STATUS: experimental Self-Organizing Maps

Warning:
For now this is only experimental but plans are to complete this file.

SOM Algorithm

Self-Organizing Feature Maps are extensively described in

Kohonen, T. 1995. Self-Organizing Maps. Springer.

We have a set of n neurons$ W = \{ \vec{w}_1,\ldots, \vec{w}_n\} $ that are of dimension m:$ \vec{w}_i \in \mathcal{R}^m $. The neurons are arranged on a bidirectional, connected graph, such that there exists a distance between two neurons$ d(\vec{w}_i,\vec{w}_j) $.

Using input data$ \vec{x}_1,\ldots, \vec{x}_n$, of dimension m, we find the best-matching unit (BMU)

\[ w_{\mbox{bmu}} = \mbox{argmin}_{w\in W} || x - w ||^2. \]

(we can use any other metric as well - many metrices are implemented in LibEEGTools).

Then, we update each neuron by the learning rule

\[ w_i(t+1) = w_i(t) + \alpha(t)h(t)[ x - m_i(t) ] \]

where$\alpha$ is a time-decaying function, h(t) is a neighbourhood function centered on the BMU (e.g. gaussian) which narrows down with increasing time.

In this implementation, we divide the training phase into an initial and a fine-tuning phase. During the initial phase, $$ takes on rather large values and the neighbourhood is wide such that an initial arrangement may take place. Later, both parameters are narrowed down to enable fine-tuning.

All of the parameters can be conveniently manipulated in the implementation.

SOM Implementation

The implementation given here is quite general but not very efficient. Topology of the network of neurons is implemented in terms of distance matrices between two neurons (in terms of their ''physical'' distance D on the graph)$ \mathbf{M} = m_{ij} = D(i,j)$.

The user can either supply this matrix directly or generate such a matrix using our convenience function som_generate_connectivity_matrix().

Usage

First, you need to initialize a Som-struct using som_init(). You need to supply a value of SOMConnectivityType. If it is not CUSTOM, the function will allocate a connectivity matrix.

Som *S;
S = som_init( 2, 1000, 1e6, ONED_LINEAR );

Python Usage

Todo:
generalize metric (need to solve Eq. 3.40 (from Kohonen, 1995) for each metric)

Definition in file som.h.


Typedef Documentation

typedef double(* NeighbourhoodFunction)(int, int, struct som_struct *, int t)

Definition at line 105 of file som.h.

typedef double(* TimeDecayFunction)(int, int, int)

alpha = f( t, nruns, nruns in initial_phase )

Definition at line 109 of file som.h.


Enumeration Type Documentation

defines the dimension/structure of the SOM.

  • 1D_LINEAR - 1-dimensional line, like 1<->2<->3<->...<->n
  • 2D_GRID - grid-like structure such that each node has 8 neighbours
  • 2D_HEXAGONAL
  • CUSTOM - you can define a custom connectivit matrix, where each entry [i,j] gives the distance between node i and j
Enumerator:
ONED_LINEAR 
TWOD_GRID 
TWOD_HEXAGONAL 
CUSTOM 

Definition at line 119 of file som.h.


Function Documentation

void som_free ( Som s  ) 

Definition at line 133 of file som.c.

double** som_generate_connectivity_matrix ( SOMConnectivityType  type,
double **  m,
int  n 
)

generate a connectivity matrix of type type. The (i,j)th entry gives the distance between nodes i and j in number of nodes.

type:

  • ONED_LINEAR:$ m_{i,j} = | i-j | $
  • TWOD_GRID: n should be a square number and$ m_{i,j} = $
  • TWOD_HEXAGONAL:
Parameters:
type 
m if NULL or ALLOC_IN_FCT, allocate own memory.
n number of nodes

Definition at line 40 of file som.c.

Som* som_init ( int  dimension,
int  n,
int  nruns,
SOMConnectivityType  connectivity_type 
)

allocate the Som-struct

Definition at line 103 of file som.c.

void som_initialize_random ( Som s,
double  min,
double  max 
)

initialize the codebook vectors with random values between min and max (uniformely drawn);

Definition at line 237 of file som.c.

void som_initialize_random_samples ( Som s,
double **  X,
int  dim,
int  nsamples 
)

initialize the codebook vectors with randomly drawn samples from the data X;

Parameters:
s the model
X data of dimension [dim, nsamples]

Definition at line 254 of file som.c.

double som_neighbourhood_gaussian ( int  x,
int  bmu,
struct som_struct *  s,
int  t 
)

Gaussian neighbourhood (std shrinks exponentially with t).

Definition at line 166 of file som.c.

void som_print ( FILE *  f,
Som s 
)

Definition at line 269 of file som.c.

double som_time_decay_linear ( int  t,
int  nruns,
int  initial_runs 
)

Linear time-decay function between 0 and 1.

Definition at line 151 of file som.c.

void som_train_from_data ( Som s,
double **  X,
int  dim,
int  nsamples 
)

apply SOM-training rule

\[ m_i(t+1) = m_i(t) + h_{ci}(t)[ x(t) - m_i(t) ] \]

Definition at line 184 of file som.c.