Site maintained by Stéphane Derrode. Last modified: November 8, 2011 (corresponds to version 0.8 of the source code).

Introduction

This web site gives access to on-line demos of C++ programs that I have implemented over the last few years around the Hidden Markov Chain (HMC) model for time-series analysis. The source code can be downloaded here. The programs address unsupervised restoration with HMC models and their recent extensions, e.g. HMC with independent noise (the classical model as described by L. R. Rabiner), Pairwise Markov Chain (including HMC with correlated noise)...

Choose a demo in the left menu, set its arguments (number of classes, number of iterations, data-driven pdf...) and provide your data file; in return, you get a file with the classified data. For each demo program, an example data file is provided for testing purposes.

Most of the programs have been developed from the work of, and in collaboration with, Prof. Wojciech Pieczynski. Bibliographical references regarding the underlying models (and many more!) can be found on his web pages. You can also check for draft papers on my personal pages.



Stéphane Derrode

K-means classification algorithm

Short description This is the classical k-means classification algorithm. Vector-valued data are accepted. Example data files are provided below.

Form

Number of iterations (-e)
Number of classes (-K)
Noisy data files (-Y)

Help on parameters

-Y Names of the files containing observations. If several files are given, all files must have the same number of samples.
-e Number of iterations (>0); classical value: 30
-K Number of classes (>1); classical value: 2
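To make the roles of -K and -e concrete, here is a minimal sketch of k-means on scalar data. It is not taken from the site's source code: the function name kmeans1d and the initialisation (centres spread evenly between the minimum and the maximum of the data) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the site's sources): scalar k-means.
// K plays the role of -K and iterations the role of -e.
std::vector<int> kmeans1d(const std::vector<double>& y, int K, int iterations) {
    assert(K > 1 && iterations > 0 && !y.empty());

    // Assumed initialisation: centres spread evenly between min and max.
    double lo = y[0], hi = y[0];
    for (double v : y) {
        if (v < lo) lo = v;
        if (v > hi) hi = v;
    }
    std::vector<double> centre(K);
    for (int k = 0; k < K; ++k) centre[k] = lo + (hi - lo) * (k + 0.5) / K;

    std::vector<int> label(y.size(), 0);
    for (int it = 0; it < iterations; ++it) {
        // Assignment step: each sample goes to its nearest centre.
        for (std::size_t n = 0; n < y.size(); ++n) {
            int best = 0;
            for (int k = 1; k < K; ++k)
                if (std::fabs(y[n] - centre[k]) < std::fabs(y[n] - centre[best])) best = k;
            label[n] = best;
        }
        // Update step: each centre becomes the mean of its samples.
        std::vector<double> sum(K, 0.0);
        std::vector<int> count(K, 0);
        for (std::size_t n = 0; n < y.size(); ++n) {
            sum[label[n]] += y[n];
            ++count[label[n]];
        }
        for (int k = 0; k < K; ++k)
            if (count[k] > 0) centre[k] = sum[k] / count[k];
    }
    return label;
}
```

Calling kmeans1d(data, 2, 30) mirrors the classical values "-K 2 -e 30"; the returned vector holds one class index per sample.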

Data files for testing

Mixture model for iid data

Short description This is the classical mixture model for iid data. Vector-valued data are accepted. Example data files are provided below.

Form

Estimation method (-E)
Number of iterations (-e)
Number of classes (-K)
Law for pdf (-l)
Noisy data files (-Y)

Help on parameters

-Y Names of the files containing observations. If several files are given: (i) all files must have the same number of samples; (ii) Gaussian copulas are assumed for multi-dimensional data-driven densities.
-E Estimation method [EM/SEM/ICE]
-e Number of iterations (0 = no iteration, i.e. k-means only); classical value: 300
-K Number of classes (>1); classical value: 2
-l Data-driven density types:
  • 0:Gaussian
  • 1:Gamma
  • 2:Student
  • 3:Inverse gamma
  • 4:Beta of the first kind
  • 5:Beta of the second kind
  • 6:All of Pearson's system
Classical value for "-K 2" classes: 0:0 (two Gaussians)
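As an illustration of what one EM iteration does in the simplest setting ("-E EM", "-K 2", "-l 0:0", scalar data), here is a minimal sketch. It is not the site's implementation: the names Component, gaussianDensity and emIteration are hypothetical, and the variance floor of 1e-6 is an assumption added only to keep the sketch numerically safe.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types and names, for illustration only.
struct Component { double weight, mean, var; };

double gaussianDensity(double y, double mean, double var) {
    const double pi = 3.14159265358979323846;
    return std::exp(-0.5 * (y - mean) * (y - mean) / var) / std::sqrt(2.0 * pi * var);
}

// One EM iteration for a scalar Gaussian mixture.
void emIteration(const std::vector<double>& y, std::vector<Component>& mix) {
    assert(!y.empty() && mix.size() > 1);
    const std::size_t K = mix.size();
    std::vector<double> sumR(K, 0.0), sumY(K, 0.0), sumYY(K, 0.0), resp(K);

    // E step: posterior probability of each class for each sample.
    for (double v : y) {
        double total = 0.0;
        for (std::size_t k = 0; k < K; ++k) {
            resp[k] = mix[k].weight * gaussianDensity(v, mix[k].mean, mix[k].var);
            total += resp[k];
        }
        for (std::size_t k = 0; k < K; ++k) {
            const double r = resp[k] / total;
            sumR[k] += r;
            sumY[k] += r * v;
            sumYY[k] += r * v * v;
        }
    }
    // M step: re-estimate weights, means and variances.
    for (std::size_t k = 0; k < K; ++k) {
        mix[k].weight = sumR[k] / y.size();
        mix[k].mean = sumY[k] / sumR[k];
        // Assumed floor on the variance to avoid degenerate components.
        mix[k].var = std::max(sumYY[k] / sumR[k] - mix[k].mean * mix[k].mean, 1e-6);
    }
}
```

Repeating emIteration a fixed number of times corresponds to the -e argument. SEM and ICE replace the expectation step with simulations of the hidden classes and are not shown here.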

Data files for testing

Multiscale or Joint mixture model (JMM - PMM) for iid data

Short description This is the multiscale mixture model (also called mixture of mixtures, or joint mixture model) for iid data. Vector-valued data are accepted. Example data files are provided below.

Form

Estimation method (-E)
Number of iterations (-e)
Number of classes (-K)
Copulas (-c)
Law for pdf (-l)
Noisy data files (-Y)

Help on parameters

-Y Names of the files containing observations. If several files are given: (i) all files must have the same number of samples; (ii) Gaussian copulas are assumed for multi-dimensional data-driven densities.
-E Estimation method [EM/SEM/ICE]
-e Number of iterations (0 = no iteration, i.e. k-means only); classical value: 300
-K Number of classes at each scale, separated by ':'. For example, 2:3 means two scales, with two classes at the upper scale and three classes at the lower scale. The number of scales is limited to three.
-c Copula types:
  • 0:Product copula
  • 1:Gaussian copula
  • 2:Student copula
  • 3:Gumbel-Hougaard copula
  • 4:Farlie-Gumbel-Morgenstern copula
  • 5:Cubic section copula
  • 6:Clayton copula
  • 7:A12 copula
  • 8:A14 copula
Classical value for "-K 2:3" classes: 1:1:1:1:1:1 (2*3=6 Gaussian copulas).
-l Data-driven density types:
  • 0:Gaussian
  • 1:Gamma
  • 2:Student
  • 3:Inverse gamma
  • 4:Beta of the first kind
  • 5:Beta of the second kind
  • 6:All of Pearson's system
Classical value for "-K 2:2" classes and only one observation file: 0:0:0:0 (four Gaussians).
-p Model type:
  • 0:Independent MM
  • 1:Coupled PMM model
  • 2:Full PMM model
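The colon-separated arguments above follow a simple counting rule: with "-K 2:3" there are 2*3=6 joint classes, hence six entries in the -c list. The sketch below shows one way to parse such an argument and check the count. The function names parseColonList and jointClassCount are hypothetical and do not come from the site's source code.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split "2:3" into {2, 3}.
std::vector<int> parseColonList(const std::string& arg) {
    std::vector<int> values;
    std::stringstream stream(arg);
    std::string token;
    while (std::getline(stream, token, ':')) values.push_back(std::stoi(token));
    return values;
}

// Hypothetical helper: number of joint classes, i.e. the product of the
// class numbers over all scales (at most three scales, as stated above).
int jointClassCount(const std::vector<int>& classesPerScale) {
    assert(!classesPerScale.empty() && classesPerScale.size() <= 3);
    int product = 1;
    for (int k : classesPerScale) product *= k;
    return product;
}
```

For example, jointClassCount(parseColonList("2:3")) can be compared with parseColonList("1:1:1:1:1:1").size() to validate a -c argument before launching a run.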

Data files for testing

HMC with independent noise

Short description This is the classical HMC-IN (independent noise) model. Vector-valued data are accepted. Example data files are provided below. See also the local-estimation variant below.

Form

Estimation method (-E)
Number of iterations (-e)
Number of classes (-K)
Law for pdf (-l)
Noisy data files (-Y)

Help on parameters

-Y Names of the files containing observations. If several files are given: (i) all files must have the same number of samples; (ii) Gaussian copulas are assumed for multi-dimensional data-driven densities.
-E Estimation method [EM/SEM/ICE]
-e Number of iterations (0 = no iteration, i.e. k-means only); classical value: 50
-K Number of classes (>1); classical value: 2
-l Data-driven density types:
  • 0:Gaussian
  • 1:Gamma
  • 2:Student
  • 3:Inverse gamma
  • 4:Beta of the first kind
  • 5:Beta of the second kind
  • 6:All of Pearson's system
Classical value for "-K 2" classes: 0:0 (two Gaussians)
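Once the parameters of an HMC with independent noise are known, each sample can be classified from its posterior class probabilities, which are computed with the scaled forward-backward recursions described by Rabiner. The sketch below is a generic textbook version, not the site's code: the name posteriorMarginals and the argument layout (prior, transition matrix, and per-sample likelihoods already evaluated) are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper: posterior marginals p(x_n = k | y_1..y_N).
// prior[k]    : probability of class k at the first sample
// trans[i][j] : probability of moving from class i to class j
// lik[n][k]   : likelihood of observation n under class k
Matrix posteriorMarginals(const std::vector<double>& prior,
                          const Matrix& trans, const Matrix& lik) {
    const std::size_t N = lik.size(), K = prior.size();
    assert(N > 0 && K > 1);
    Matrix alpha(N, std::vector<double>(K)), beta(N, std::vector<double>(K, 1.0));
    std::vector<double> scale(N, 0.0);

    // Forward pass, normalised at each step to avoid underflow.
    for (std::size_t k = 0; k < K; ++k) {
        alpha[0][k] = prior[k] * lik[0][k];
        scale[0] += alpha[0][k];
    }
    for (std::size_t k = 0; k < K; ++k) alpha[0][k] /= scale[0];
    for (std::size_t n = 1; n < N; ++n) {
        for (std::size_t j = 0; j < K; ++j) {
            double s = 0.0;
            for (std::size_t i = 0; i < K; ++i) s += alpha[n - 1][i] * trans[i][j];
            alpha[n][j] = s * lik[n][j];
            scale[n] += alpha[n][j];
        }
        for (std::size_t j = 0; j < K; ++j) alpha[n][j] /= scale[n];
    }

    // Backward pass, reusing the forward scaling factors (n = N-2 down to 0).
    for (std::size_t n = N - 1; n-- > 0;) {
        for (std::size_t i = 0; i < K; ++i) {
            double s = 0.0;
            for (std::size_t j = 0; j < K; ++j)
                s += trans[i][j] * lik[n + 1][j] * beta[n + 1][j];
            beta[n][i] = s / scale[n + 1];
        }
    }

    // With this scaling, alpha * beta is directly the posterior marginal.
    Matrix post(N, std::vector<double>(K));
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t k = 0; k < K; ++k) post[n][k] = alpha[n][k] * beta[n][k];
    return post;
}
```

Taking, for each sample, the class with the largest posterior marginal gives a classified signal of the same length as the input.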

Data files for testing



HMC with independent noise (local estimation)

Short description This variant uses a bootstrap strategy to estimate the HMC parameters from a subsample. It is of special interest for very large data sets. However, due to server limitations, this demo only accepts data files smaller than 500000 bytes. The purpose of the demo is therefore to check that bootstrap estimation is robust and gives results very similar to those of the classical estimation algorithm.

Form

Estimation method (-E)
Number of iterations (-e)
Number of classes (-K)
Law for pdf (-l)
Local size support (-s)
Sub-sample type (-b)
Noisy data file (-Y)

Help on parameters

-Y Name of the file containing observations
-E Estimation method [EM/SEM/ICE]
-e Number of iterations (0 = no iteration, i.e. k-means only); classical value: 50
-K Number of classes (>1); classical value: 2
-l Data-driven density types:
  • 0:Gaussian
  • 1:Gamma
  • 2:Student
  • 3:Inverse gamma
  • 4:Beta of the first kind
  • 5:Beta of the second kind
  • 6:All of Pearson's system
Classical value for "-K 2" classes: 0:0 (two Gaussians)
-s Size of the local support (>0); classical value: 5
-b Sub-sample type (0:all or 3:bootstrap); classical value: 3

HMC with dependent noise

Short description This is the non-classical HMC-DN (dependent noise) model, a particular case of the PMC model in which the state process is also Markovian. Copulas are used to model the two-dimensional data-driven densities.

Form

Estimation method (-E)
Number of iterations (-e)
Number of classes (-K)
Law for pdf (-l)
List of 4 copulas (-c)
Noisy data file (-Y)

Help on parameters

-Y Name of the file containing observations
-E Estimation method [SEM/ICE]
-e Number of iterations (0 = no iteration, i.e. k-means only); classical value: 50
-K Number of classes (>1); classical value: 2
-l Data-driven density types:
  • 0:Gaussian
  • 1:Gamma
  • 2:Student
  • 3:Inverse gamma
  • 4:Beta of the first kind
  • 5:Beta of the second kind
  • 6:All of Pearson's system
Classical value for "-K 2" classes: 0:0 (two Gaussians)
-c List of 4 copulas, each chosen in [0,8] (0:Product; 1:Gaussian; 2:Student; 3:Gumbel; 4:Farlie-Gumbel-Morgenstern; 5:Cubic; 6:Clayton; 7:A12; 8:A14); classical value: 1:1:1:1
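To show how a copula and two margins combine into a two-dimensional density, the sketch below uses the Farlie-Gumbel-Morgenstern copula (choice 4), whose density has the closed form c(u,v) = 1 + theta*(1-2u)*(1-2v) with theta in [-1,1], together with two identical Gaussian margins. The joint density is then c(F(y1),F(y2))*f(y1)*f(y2). All function names are hypothetical, and sharing one mean and one standard deviation between both margins is a simplifying assumption of this sketch, not a feature of the programs.

```cpp
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Gaussian margin: density f and cumulative distribution function F.
double gaussianPdf(double y, double mean, double sigma) {
    const double z = (y - mean) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * kPi));
}

double gaussianCdf(double y, double mean, double sigma) {
    return 0.5 * std::erfc(-(y - mean) / (sigma * std::sqrt(2.0)));
}

// Farlie-Gumbel-Morgenstern copula density on the unit square.
double fgmCopulaDensity(double u, double v, double theta) {
    assert(theta >= -1.0 && theta <= 1.0);
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v);
}

// Joint density of (y1, y2): copula density times the two margin densities.
double jointDensity(double y1, double y2, double mean, double sigma, double theta) {
    const double u = gaussianCdf(y1, mean, sigma);
    const double v = gaussianCdf(y2, mean, sigma);
    return fgmCopulaDensity(u, v, theta) *
           gaussianPdf(y1, mean, sigma) * gaussianPdf(y2, mean, sigma);
}
```

With theta = 0 the copula density equals 1 everywhere, so the joint density reduces to the product of the margins, which is the behaviour of the product copula (choice 0).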

Data files for testing

Pairwise Markov Chain

Short description This is the general PMC model with copulas.

Form

Estimation method (-E)
Number of iterations (-e)
Number of classes (-K)
Shape of K*K margins (-l) (see help below)
Shape of K*K copulas (-c) (see help below)
Noisy data file (-Y)

Help on parameters

-E Estimation method [SEM/ICE]
-Y Name of the file containing observations
-e Number of iterations (0 = no iteration, i.e. k-means only); classical value: 50
-K Number of classes (>1); classical value: 2
-l Data-driven density types (0=Gaussian; 1=Gamma; 2=8_2_7 in Pearson system; 3=2+1; 4=all Pearson); classical value for "-K 2" classes: 0:0:0:0 (four Gaussians)
-c List of K*K copulas, separated by ':', each to choose in [0,8]
  • 0:Product
  • 1:Gaussian
  • 2:Student
  • 3:Gumbel
  • 4:Farlie-Gumbel-Morgenstern
  • 5:Cubic
  • 6:Clayton
  • 7:A12 (see Nelsen's book, "An introduction to copulas")
  • 8:A14 (see Nelsen's book)
The classical value is 1:1:1:1.
The choice of the K*K copulas is not completely free. For example, if K=3, a symmetry condition requires writing "c1:c2:c3:c2:c4:c5:c3:c5:c6", which corresponds to the matrix:
c1 c2 c3
c2 c4 c5
c3 c5 c6
where c1, ..., c6 denote six copula shapes. These copula shapes can be identical (e.g. c1=c4=3).
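The symmetry rule above means that only K*(K+1)/2 copula shapes are actually free. The sketch below builds the full K*K argument from the upper triangle of the matrix, listed row by row (c1, ..., c6 for K=3). The function name expandSymmetricCopulas is hypothetical and is not part of the programs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: expand the K*(K+1)/2 free copula codes (upper
// triangle, row by row) into the symmetric K*K list expected by -c.
std::string expandSymmetricCopulas(const std::vector<int>& upper, std::size_t K) {
    assert(upper.size() == K * (K + 1) / 2);
    std::string arg;
    for (std::size_t i = 0; i < K; ++i) {
        for (std::size_t j = 0; j < K; ++j) {
            // Entry (i, j) equals entry (j, i): look it up in the upper triangle.
            const std::size_t r = i < j ? i : j;
            const std::size_t c = i < j ? j : i;
            const std::size_t index = r * (2 * K - r + 1) / 2 + (c - r);
            if (!arg.empty()) arg += ':';
            arg += std::to_string(upper[index]);
        }
    }
    return arg;
}
```

For K=2, three free shapes all set to 1 expand to "1:1:1:1", which matches the classical value.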

Data files for testing

Direct Peano scan of a pgm image

Short description The Hilbert-Peano scan demo converts your image into a one-dimensional signal (i.e. 2D->1D). The inverse scan (see below) reconstructs an image from the classified signal (i.e. 1D->2D).

Form

Pgm image file (-Y)

Help on parameters

-Y Image filename (pgm format only, even dimensions, e.g. 308x204)
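The idea of the scan and of its inverse can be sketched with the classical Hilbert curve. The demo accepts rectangular images such as 308x204, so it presumably relies on a generalised scan; the sketch below is deliberately simpler and assumes a square image whose side n is a power of two, stored row by row. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Position d along the Hilbert curve -> pixel (x, y) in an n x n grid.
// Assumption of this sketch: n is a power of two.
std::pair<int, int> hilbertPoint(int n, int d) {
    int x = 0, y = 0;
    for (int s = 1; s < n; s *= 2) {
        const int rx = 1 & (d / 2);
        const int ry = 1 & (d ^ rx);
        if (ry == 0) {
            // Rotate or flip the current quadrant.
            if (rx == 1) {
                x = s - 1 - x;
                y = s - 1 - y;
            }
            std::swap(x, y);
        }
        x += s * rx;
        y += s * ry;
        d /= 4;
    }
    return {x, y};
}

// Direct scan (2D -> 1D): read the pixels along the curve.
std::vector<int> peanoScan(const std::vector<int>& image, int n) {
    assert(n > 0 && (n & (n - 1)) == 0);
    assert(image.size() == static_cast<std::size_t>(n) * n);
    std::vector<int> signal(image.size());
    for (int d = 0; d < n * n; ++d) {
        const std::pair<int, int> p = hilbertPoint(n, d);
        signal[d] = image[p.second * n + p.first];
    }
    return signal;
}

// Inverse scan (1D -> 2D): write the samples back along the same curve.
std::vector<int> inversePeanoScan(const std::vector<int>& signal, int n) {
    assert(n > 0 && (n & (n - 1)) == 0);
    assert(signal.size() == static_cast<std::size_t>(n) * n);
    std::vector<int> image(signal.size());
    for (int d = 0; d < n * n; ++d) {
        const std::pair<int, int> p = hilbertPoint(n, d);
        image[p.second * n + p.first] = signal[d];
    }
    return image;
}
```

Because consecutive positions along the curve are always neighbouring pixels, the one-dimensional signal keeps much of the spatial correlation of the image, which is what makes chain models usable on images.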

Image files for testing


Inverse Peano scan of a txt file


Txt image file (-J)
Dimensions of the image to be recomposed (#row:#column) (-D)

Image files for testing

Help on parameters

-J Name of the txt file containing the Peano scan from which the image is reconstructed
-D Dimensions of the image to be recomposed (#row,#column)

Perform all classifications from one data file

Short description This demo classifies a data file with all available demos.

Form

Estimation method (-E)
Number of classes (-K)
Noisy data file (-Y)

Help on parameters

-Y Name of the file containing observations
-E Estimation method [SEM/ICE]
-K Number of classes (>1); classical value: 2
