Generally, DTW is used to compare or transform two curves such that the difference in their temporal characteristics is minimized. It was developed for cases in which the curves share a general shape but are aligned differently on the $x$-axis. First, a pointwise dissimilarity measure between two signals $s_1$ and $s_2$ is defined, such as
\[
d\bigl(s_1(t_1), s_2(t_2)\bigr) := \bigl|\tilde{s}_1(t_1) - \tilde{s}_2(t_2)\bigr| + \bigl|\tilde{s}_1'(t_1) - \tilde{s}_2'(t_2)\bigr|,
\]
where $\tilde{s}$ denotes the normalized signal and $s'$ is the first derivative of $s$. The data are normalized to find the best match regardless of absolute amplitude, since only the temporal difference between the signals is of interest. The distance used here is referred to as derivative DTW because it considers both the amplitude and the slope of the signals.
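As an illustration of such an amplitude-plus-slope dissimilarity, the following is a minimal sketch in plain Python. Several details are assumptions made for the example and are not taken from the text: the normalization (zero mean, unit root-mean-square), the finite-difference derivative, the use of absolute differences, and the function names `normalize`, `derivative`, and `dissimilarity_matrix`.

```python
import math


def normalize(signal):
    """Center to zero mean and scale to unit RMS (an assumed normalization)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    rms = math.sqrt(sum(x * x for x in centered) / n)
    if rms == 0.0:
        return centered  # constant signal: nothing to scale
    return [x / rms for x in centered]


def derivative(signal):
    """First derivative by finite differences (central inside, one-sided at the ends)."""
    n = len(signal)
    if n < 2:
        return [0.0] * n
    d = [0.0] * n
    d[0] = signal[1] - signal[0]
    d[-1] = signal[-1] - signal[-2]
    for t in range(1, n - 1):
        d[t] = (signal[t + 1] - signal[t - 1]) / 2.0
    return d


def dissimilarity_matrix(s1, s2):
    """Entry [j][k] is |amplitude difference| + |slope difference| of the normalized signals."""
    a, b = normalize(s1), normalize(s2)
    da, db = derivative(a), derivative(b)
    return [
        [abs(a[j] - b[k]) + abs(da[j] - db[k]) for k in range(len(b))]
        for j in range(len(a))
    ]
```

Because both signals are normalized first, a curve and a shifted, rescaled copy of it (for example `s` and `2 * s + 5`) give zero dissimilarity along the diagonal in this sketch, which reflects the stated goal of matching regardless of absolute amplitude.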
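This measure constitutes the dissimilarity matrix $d_{jk}$, whose entry $(j,k)$ is the dissimilarity between sample $j$ of the first signal and sample $k$ of the second. An optimal time-mapping according to the metric chosen above is produced by finding a path $p_i$ that is described recursively by
\[
p_i = (j,k) \;\Rightarrow\; p_{i+1} \in \bigl\{(j+1,k),\ (j,k+1),\ (j+1,k+1)\bigr\}
\]
through $d_{jk}$ from the top left to the bottom right corner and that minimizes the sum of the $d_{jk}$ along the path.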
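This path can be found by a dynamic programming strategy, computing the cumulated cost matrix
\[
D_{jk} = d_{jk} + \min\bigl\{D_{j,k-1},\ D_{j-1,k},\ D_{j-1,k-1}\bigr\}
\]
and backtracking via the minimum of the three neighboring entries (corresponding to the down, down-right, and right steps of the path) from $D_{JK}$ to $D_{11}$, where $J$ and $K$ are the numbers of samples in the two signals. The final element $D_{JK}$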
constitutes a measure for the (dis-)similarity of the two curves based on their overall shape. Once this path is available, it is easy to average the curves (called averaging dynamic time-warping, ADTW) to reduce both temporal and scale variance by setting
\[
\mathrm{ADTW}\{s_1, s_2\}(t) = \frac{s_1(j) + s_2(k)}{2},
\]
where $(j,k) = p_i$ is an element of the warping path introduced above and $t = (j+k)/2$.
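The dynamic-programming, backtracking, and averaging steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names (`cumulated_cost`, `backtrack`, `adtw_pair`) are hypothetical, ties between equally cheap predecessors are broken in favor of the diagonal step, and the averaged curve is placed at the mean of the two matched sample indices, which is one reading of reducing temporal variance.

```python
def cumulated_cost(d):
    """Cumulated cost D[j][k] = d[j][k] + min over the three predecessor entries."""
    rows, cols = len(d), len(d[0])
    inf = float("inf")
    D = [[inf] * cols for _ in range(rows)]
    for j in range(rows):
        for k in range(cols):
            if j == 0 and k == 0:
                best = 0.0  # the path starts in the top-left corner
            else:
                best = min(
                    D[j - 1][k] if j > 0 else inf,
                    D[j][k - 1] if k > 0 else inf,
                    D[j - 1][k - 1] if j > 0 and k > 0 else inf,
                )
            D[j][k] = d[j][k] + best
    return D


def backtrack(D):
    """Recover the path by walking from the bottom-right entry to the top-left one."""
    j, k = len(D) - 1, len(D[0]) - 1
    path = [(j, k)]
    while (j, k) != (0, 0):
        candidates = []
        if j > 0 and k > 0:
            candidates.append((D[j - 1][k - 1], j - 1, k - 1))
        if j > 0:
            candidates.append((D[j - 1][k], j - 1, k))
        if k > 0:
            candidates.append((D[j][k - 1], j, k - 1))
        _, j, k = min(candidates)  # cheapest predecessor; ties prefer the diagonal
        path.append((j, k))
    path.reverse()
    return path


def adtw_pair(s1, s2, d):
    """Average two curves along the optimal path through the dissimilarity matrix d.

    Returns the averaged time indices, the averaged amplitudes, and the total
    path cost (the final element of the cumulated cost matrix).
    """
    D = cumulated_cost(d)
    path = backtrack(D)
    times = [(j + k) / 2.0 for j, k in path]
    values = [(s1[j] + s2[k]) / 2.0 for j, k in path]
    return times, values, D[-1][-1]
```

For two curves with the same single peak at different positions, for example `[0, 0, 1, 0]` and `[0, 1, 0, 0]` with an absolute-difference dissimilarity, this sketch yields a total cost of zero and an averaged peak located halfway between the two original peaks. The result has one sample per path element, so it can be longer than either input.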
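For $N$ trials, a straightforward solution proposed in Picton (1988) is to simply combine pairs of single-trial ERPs using ADTW. In a next step, pairs of the results from this combination can be averaged again, and the entire process is iterated until only one average is left. Recursively,
\[
\mathrm{ADTW}\{s_1, \dots, s_N\} = \mathrm{ADTW}\bigl\{\mathrm{ADTW}\{s_1, \dots, s_{N/2}\},\ \mathrm{ADTW}\{s_{N/2+1}, \dots, s_N\}\bigr\},
\]
with the pairwise average $\mathrm{ADTW}\{s_1, s_2\}$ defined above as the base case.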
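The recursive pairing scheme can be sketched independently of the pairwise averaging step. In the minimal example below, `hierarchical_average` is a hypothetical name, and the pairwise operation is passed in as a function `combine`. A plain sample-by-sample mean (`pointwise_mean`) stands in for the pairwise ADTW purely so that the example is self-contained; it is not the method described in the text.

```python
def hierarchical_average(trials, combine):
    """Average a list of trials by recursively halving it and combining the halves."""
    if not trials:
        raise ValueError("need at least one trial")
    if len(trials) == 1:
        return trials[0]  # a single trial is its own average
    mid = len(trials) // 2
    left = hierarchical_average(trials[:mid], combine)
    right = hierarchical_average(trials[mid:], combine)
    return combine(left, right)


def pointwise_mean(a, b):
    """Stand-in for a pairwise average: sample-by-sample mean of equal-length curves."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


trials = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]
average = hierarchical_average(trials, pointwise_mean)
```

When the number of trials is a power of two, every trial enters the final result with equal weight. For other trial counts, the halves differ in size, so trials in the smaller half carry more weight in this sketch.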