class TMultiLayerPerceptron: public TObject


 TMultiLayerPerceptron

 This class describes a neural network.
 There are facilities to train the network and use the output.

 The input layer is made of inactive neurons (returning the
 optionally normalized input) and output neurons are linear.
 The type of hidden neurons is free, the default being sigmoids.
 (One should still try to pass normalized inputs, e.g. in [0, 1].)

 The basic input is a TTree and two (training and test) TEventLists.
 Input and output neurons are assigned a value computed for each event
 with the same possibilities as for TTree::Draw().
 Events may be weighted individually or via TTree::SetWeight().
 Six learning methods are available: kStochastic, kBatch,
 kSteepestDescent, kRibierePolak, kFletcherReeves and kBFGS.

 This implementation, written by C. Delaere, is *inspired* by
 the mlpfit package from J. Schwindling et al., with some extensions:
   * the algorithms are globally the same
   * in TMultiLayerPerceptron, there is no limitation on the number of
     layers/neurons, while MLPFIT was limited to 2 hidden layers
   * TMultiLayerPerceptron allows you to save the network in a ROOT file, and
     provides more export functionalities
   * TMultiLayerPerceptron gives more flexibility regarding the normalization of
     inputs/outputs
   * TMultiLayerPerceptron provides, thanks to Andrea Bocci, the possibility to
     use cross-entropy errors, which makes it possible to train a network for
     pattern classification based on Bayesian posterior probability.

Introduction

Neural networks are increasingly used in various fields for data analysis and classification, in both research and commercial institutions. Some randomly chosen examples are:

image analysis
prediction and analysis of financial movements
sales forecasting and product shipping optimisation
in particle physics: mainly classification tasks (signal over background discrimination)

More than 50% of neural networks are multilayer perceptrons. This implementation of multilayer perceptrons is inspired by the MLPfit package originally written by Jerome Schwindling. MLPfit remains one of the fastest tools for neural network studies, and this ROOT add-on does not try to compete on speed. A clear and flexible object-oriented implementation has been chosen over faster but harder-to-maintain code. Nevertheless, the time penalty does not exceed a factor of 2.

The MLP

The multilayer perceptron is a simple feed-forward network. It is made of neurons, each characterized by a bias, and of weighted links between them (called synapses here). The input neurons receive the inputs, normalize them and forward them to the first hidden layer.

Each neuron in any subsequent layer first computes a linear combination of the outputs of the previous layer. The output of the neuron is then a function f of that combination, with f being linear for output neurons and a sigmoid for hidden neurons. This is useful because of two theorems:

A linear combination of sigmoids can approximate any continuous function.
Trained with output = 1 for the signal and 0 for the background, the approximated function of inputs X is the probability of signal, knowing X.
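The computation performed by a single neuron, as described above, can be sketched in plain C++ as follows. This is only an illustration: the function names are hypothetical and not part of TMultiLayerPerceptron, and the sigmoid is assumed to be the logistic function.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic sigmoid, assumed here as the activation of hidden neurons.
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Output of one neuron: the bias plus the weighted sum of the outputs of the
// previous layer, passed through f (identity for an output neuron, sigmoid
// for a hidden neuron).
double neuronOutput(const std::vector<double>& inputs,
                    const std::vector<double>& weights,
                    double bias, bool isOutputNeuron)
{
   double sum = bias;
   for (std::size_t i = 0; i < inputs.size() && i < weights.size(); ++i)
      sum += weights[i] * inputs[i];
   return isOutputNeuron ? sum : sigmoid(sum);
}
```

For instance, neuronOutput({1.0, 2.0}, {0.5, 0.25}, 1.0, true) evaluates the linear combination 1 + 0.5 + 0.5 for a linear output neuron.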

Learning methods

The aim of all learning methods is to minimize the total error on a set of weighted examples. For each example, the error is defined as half the sum of the squared errors on the individual output neurons.
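As an illustration of this definition, here is a minimal sketch in plain C++. The Example struct and both function names are hypothetical, and the way the example weight multiplies the per-example error is an assumption made for the sketch, not a statement about the internals of the class.

```cpp
#include <cstddef>
#include <vector>

// Error of one example: half the sum of the squared differences between the
// network outputs and the desired outputs.
double exampleError(const std::vector<double>& outputs,
                    const std::vector<double>& targets)
{
   double sum = 0.0;
   for (std::size_t k = 0; k < outputs.size() && k < targets.size(); ++k) {
      const double diff = outputs[k] - targets[k];
      sum += diff * diff;
   }
   return 0.5 * sum;
}

// Hypothetical container for one weighted example.
struct Example {
   std::vector<double> outputs;  // values produced by the network
   std::vector<double> targets;  // desired values
   double weight;                // event weight
};

// Total error: weighted sum of the per-example errors (assumed weighting).
double totalError(const std::vector<Example>& examples)
{
   double total = 0.0;
   for (const Example& ex : examples)
      total += ex.weight * exampleError(ex.outputs, ex.targets);
   return total;
}
```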

In all methods implemented, one needs to compute the first derivative of that error with respect to the weights. Using the chain rule for the derivative of compound functions, one can write:

for a neuron: the product of the local derivative with the weighted sum, over the neurons it feeds, of their derivatives.
for a synapse: product of the input with the local derivative of the output neuron.

This computation is called back-propagation of the errors. A loop over all examples is called an epoch.
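The two rules above can be written out explicitly. The notation is an assumption introduced for this sketch: $s_j$ is the linear combination computed by neuron $j$, $y_j = f(s_j)$ its output, $t_k$ the desired value of output neuron $k$, $w_{ij}$ the weight of the synapse from neuron $i$ to neuron $j$, and $e_p$ the error on example $p$.

For an output neuron $k$:

$\partial e_p / \partial s_k = f'(s_k) (y_k - t_k)$

For a hidden neuron $j$ feeding the neurons $k$ of the next layer:

$\partial e_p / \partial s_j = f'(s_j) \sum_k w_{jk} \, \partial e_p / \partial s_k$

For the synapse from neuron $i$ to neuron $j$:

$\partial e_p / \partial w_{ij} = y_i \, \partial e_p / \partial s_j$

Since output neurons are linear, $f'(s_k) = 1$ in the first expression. The derivatives are evaluated from the output layer back towards the input layer, hence the name back-propagation.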

Six learning methods are implemented.

Stochastic minimization: This is the most trivial learning method. This is the Robbins-Monro stochastic approximation applied to multilayer perceptrons. The weights are updated after each example according to the formula:

$w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t)$

with

$\Delta w_{ij}(t) = - \eta \left( \partial e_p / \partial w_{ij} + \delta \right) + \epsilon \Delta w_{ij}(t-1)$

The parameters for this method are Eta, EtaDecay, Delta and Epsilon.
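The update rule can be sketched in plain C++ as shown below. The struct and function names are hypothetical and do not belong to TMultiLayerPerceptron. The per-epoch scaling of Eta by EtaDecay follows the description of the fEtaDecay data member on this page.

```cpp
// Parameters of the stochastic minimization (hypothetical container).
struct StochasticParams {
   double eta;       // learning rate
   double etaDecay;  // factor applied to eta at each epoch
   double delta;     // constant added to the derivative
   double epsilon;   // weight of the previous step (momentum-like term)
};

// Step applied to one weight after one example:
//   dw(t) = -eta * (de_p/dw + delta) + epsilon * dw(t-1)
double stochasticStep(double dEdw, double previousStep, const StochasticParams& p)
{
   return -p.eta * (dEdw + p.delta) + p.epsilon * previousStep;
}

// At the end of each epoch, eta is scaled by etaDecay.
void endOfEpoch(StochasticParams& p) { p.eta *= p.etaDecay; }
```

With the default values listed among the data members (Eta = 0.1, EtaDecay = 1, Delta = 0, Epsilon = 0), the step reduces to plain gradient descent with a constant learning rate.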

Steepest descent with fixed step size (batch learning): It is the same as the stochastic minimization, but the weights are updated after considering all the examples, with the total derivative dEdw. The parameters for this method are Eta, EtaDecay, Delta and Epsilon.

Steepest descent algorithm: Weights are set to the minimum along the line defined by the gradient. The only parameter for this method is Tau. Lower tau = higher precision = slower search. A value Tau = 3 seems reasonable.

Conjugate gradients with the Polak-Ribiere updating formula: Weights are set to the minimum along the line defined by the conjugate gradient. Parameters are Tau and Reset, the number of epochs between two resets of the search direction to the steepest descent.

Conjugate gradients with the Fletcher-Reeves updating formula: Weights are set to the minimum along the line defined by the conjugate gradient. Parameters are Tau and Reset, the number of epochs between two resets of the search direction to the steepest descent.

Broyden, Fletcher, Goldfarb, Shanno (BFGS) method: Requires the computation of an NxN matrix, but seems more powerful, at least for fewer than 300 weights. Parameters are Tau and Reset, the number of epochs between two resets of the search direction to the steepest descent.
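The two conjugate-gradient methods above differ only in how the new search direction is combined with the previous one. The sketch below uses the textbook Polak-Ribiere and Fletcher-Reeves formulas in plain C++. The function names are hypothetical, and it is an assumption that the class follows exactly these expressions.

```cpp
#include <cstddef>
#include <vector>

// Scalar product of two vectors of equal length.
double dot(const std::vector<double>& a, const std::vector<double>& b)
{
   double sum = 0.0;
   for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) sum += a[i] * b[i];
   return sum;
}

// Fletcher-Reeves: beta = (gNew . gNew) / (gOld . gOld)
double betaFletcherReeves(const std::vector<double>& gNew,
                          const std::vector<double>& gOld)
{
   const double denom = dot(gOld, gOld);
   return denom > 0.0 ? dot(gNew, gNew) / denom : 0.0;
}

// Polak-Ribiere: beta = (gNew . (gNew - gOld)) / (gOld . gOld)
double betaPolakRibiere(const std::vector<double>& gNew,
                        const std::vector<double>& gOld)
{
   const double denom = dot(gOld, gOld);
   return denom > 0.0 ? (dot(gNew, gNew) - dot(gNew, gOld)) / denom : 0.0;
}

// New search direction: d = -gNew + beta * dOld.
// With beta = 0 this is the steepest-descent direction, which is what a
// reset of the search direction amounts to.
std::vector<double> conjugateDirection(const std::vector<double>& gNew,
                                       const std::vector<double>& dOld,
                                       double beta)
{
   std::vector<double> d(gNew.size());
   for (std::size_t i = 0; i < gNew.size(); ++i)
      d[i] = -gNew[i] + (i < dOld.size() ? beta * dOld[i] : 0.0);
   return d;
}
```

Here gNew and gOld stand for the gradient of the total error with respect to the weights at the current and previous epochs. The line minimization along the resulting direction is not shown.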

How to use it...

TMLP is built from 3 classes: TNeuron, TSynapse and TMultiLayerPerceptron. Only TMultiLayerPerceptron should be used explicitly by the user.

TMultiLayerPerceptron will take examples from a TTree given in the constructor. The network is described by a simple string: the input/output layers are defined by giving the expression for each neuron, separated by commas. Hidden layers are just described by their number of neurons. The layers are separated by colons. In addition, input/output layer formulas can be preceded by '@' (e.g. "@out") if one also wants to normalize the data from the TTree. Inputs and outputs are taken from the TTree given as second argument. Expressions are evaluated as for TTree::Draw(); arrays are expanded into distinct neurons, one for each index. This can only be done for fixed-size arrays. If the formula ends with "!", softmax functions are used for the output layer. One defines the training and test datasets by TEventLists.

Example: TMultiLayerPerceptron("x,y:10:5:f",inputTree);
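To make the structure of the layout string concrete, the following plain C++ sketch splits it into its layers. It is not the parser used by the class: the Layout struct and both functions are hypothetical, the '@' and '!' markers are not interpreted, and expressions that themselves contain ',' or ':' are not supported.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a string on a separator character.
std::vector<std::string> split(const std::string& text, char sep)
{
   std::vector<std::string> parts;
   std::stringstream stream(text);
   std::string item;
   while (std::getline(stream, item, sep)) parts.push_back(item);
   return parts;
}

// Hypothetical description of a network layout.
struct Layout {
   std::vector<std::string> inputs;   // one expression per input neuron
   std::vector<int> hiddenSizes;      // number of neurons in each hidden layer
   std::vector<std::string> outputs;  // one expression per output neuron
};

// Layers are separated by ':'; the first and last layers list expressions
// separated by ','; the layers in between give a number of neurons.
Layout describeLayout(const std::string& layout)
{
   Layout result;
   const std::vector<std::string> layers = split(layout, ':');
   if (layers.size() < 2) return result;  // need at least input and output layers
   result.inputs = split(layers.front(), ',');
   result.outputs = split(layers.back(), ',');
   for (std::size_t i = 1; i + 1 < layers.size(); ++i)
      result.hiddenSizes.push_back(std::stoi(layers[i]));  // throws if not a number
   return result;
}
```

Applied to "x,y:10:5:f", this gives two input neurons (x and y), two hidden layers of 10 and 5 neurons, and one output neuron (f).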

Both the TTree and the TEventLists can be defined in the constructor, or later with the appropriate setter method. The lists used for training and test can be defined either explicitly, or via a string containing the formula to be used to define them, exactly as for a TCut.

The learning method is set with TMultiLayerPerceptron::SetLearningMethod(). The available learning methods are:

TMultiLayerPerceptron::kStochastic,
TMultiLayerPerceptron::kBatch,
TMultiLayerPerceptron::kSteepestDescent,
TMultiLayerPerceptron::kRibierePolak,
TMultiLayerPerceptron::kFletcherReeves,
TMultiLayerPerceptron::kBFGS

A weight can be assigned to events, either in the constructor or with TMultiLayerPerceptron::SetEventWeight(). In addition, the TTree weight is taken into account.

Finally, one starts the training with TMultiLayerPerceptron::Train(Int_t nEpoch, Option_t* option). The first argument is the number of epochs, while option is a string that can contain: "text" (simple text output), "graph" (graphical training curves that evolve during training), "update=X" (step for the text/graph output update) or "+" (skips the randomisation and starts from the previous values). All combinations are available.

Example: net.Train(100,"text, graph, update=10").

When the neural net is trained, it can be used directly (TMultiLayerPerceptron::Evaluate()) or exported to standalone C++ code (TMultiLayerPerceptron::Export()).

Finally, note that even though this implementation is inspired by the mlpfit code, the feature lists do not match exactly:

the mlpfit hybrid learning method is not implemented
output neurons can be normalized, which is not the case for mlpfit
the neural net can be exported to C++, FORTRAN or Python
the DrawResult() method allows a fast check of the learning procedure

In addition, the PAW version of mlpfit had additional limitations on the number of neurons, hidden layers and inputs/outputs that do not apply to TMultiLayerPerceptron.

Function Members (Methods)

public:

	TMultiLayerPerceptron()
	TMultiLayerPerceptron(const char* layout, TTree* data = 0, const char* training = "Entry$%2==0", const char* test = "", TNeuron::ENeuronType type = TNeuron::kSigmoid, const char* extF = "", const char* extD = "")
	TMultiLayerPerceptron(const char* layout, TTree* data, TEventList* training, TEventList* test, TNeuron::ENeuronType type = TNeuron::kSigmoid, const char* extF = "", const char* extD = "")
	TMultiLayerPerceptron(const char* layout, const char* weight, TTree* data = 0, const char* training = "Entry$%2==0", const char* test = "", TNeuron::ENeuronType type = TNeuron::kSigmoid, const char* extF = "", const char* extD = "")
	TMultiLayerPerceptron(const char* layout, const char* weight, TTree* data, TEventList* training, TEventList* test, TNeuron::ENeuronType type = TNeuron::kSigmoid, const char* extF = "", const char* extD = "")
virtual	~TMultiLayerPerceptron()
void	TObject::AbstractMethod(const char* method) const
virtual void	TObject::AppendPad(Option_t* option = "")
virtual void	TObject::Browse(TBrowser* b)
static TClass*	Class()
virtual const char*	TObject::ClassName() const
virtual void	TObject::Clear(Option_t* = "")
virtual TObject*	TObject::Clone(const char* newname = "") const
virtual Int_t	TObject::Compare(const TObject* obj) const
void	ComputeDEDw() const
virtual void	TObject::Copy(TObject& object) const
virtual void	TObject::Delete(Option_t* option = "")	// *MENU*
virtual Int_t	TObject::DistancetoPrimitive(Int_t px, Int_t py)
virtual void	Draw(Option_t* option = "")
virtual void	TObject::DrawClass() const	// *MENU*
virtual TObject*	TObject::DrawClone(Option_t* option = "") const	// *MENU*
void	DrawResult(Int_t index = 0, Option_t* option = "test") const
virtual void	TObject::Dump() const	// *MENU*
void	DumpWeights(Option_t* filename = "-") const
virtual void	TObject::Error(const char* method, const char* msgfmt) const
Double_t	Evaluate(Int_t index, Double_t* params) const
virtual void	TObject::Execute(const char* method, const char* params, Int_t* error = 0)
virtual void	TObject::Execute(TMethod* method, TObjArray* params, Int_t* error = 0)
virtual void	TObject::ExecuteEvent(Int_t event, Int_t px, Int_t py)
void	Export(Option_t* filename = "NNfunction", Option_t* language = "C++") const
virtual void	TObject::Fatal(const char* method, const char* msgfmt) const
virtual TObject*	TObject::FindObject(const char* name) const
virtual TObject*	TObject::FindObject(const TObject* obj) const
Double_t	GetDelta() const
virtual Option_t*	TObject::GetDrawOption() const
static Long_t	TObject::GetDtorOnly()
Double_t	GetEpsilon() const
Double_t	GetError(Int_t event) const
Double_t	GetError(TMultiLayerPerceptron::EDataSet set) const
Double_t	GetEta() const
Double_t	GetEtaDecay() const
virtual const char*	TObject::GetIconName() const
virtual const char*	TObject::GetName() const
virtual char*	TObject::GetObjectInfo(Int_t px, Int_t py) const
static Bool_t	TObject::GetObjectStat()
virtual Option_t*	TObject::GetOption() const
Int_t	GetReset() const
TString	GetStructure() const
Double_t	GetTau() const
virtual const char*	TObject::GetTitle() const
TNeuron::ENeuronType	GetType() const
virtual UInt_t	TObject::GetUniqueID() const
virtual Bool_t	TObject::HandleTimer(TTimer* timer)
virtual ULong_t	TObject::Hash() const
virtual void	TObject::Info(const char* method, const char* msgfmt) const
virtual Bool_t	TObject::InheritsFrom(const char* classname) const
virtual Bool_t	TObject::InheritsFrom(const TClass* cl) const
virtual void	TObject::Inspect() const	// *MENU*
void	TObject::InvertBit(UInt_t f)
virtual TClass*	IsA() const
virtual Bool_t	TObject::IsEqual(const TObject* obj) const
virtual Bool_t	TObject::IsFolder() const
Bool_t	TObject::IsOnHeap() const
virtual Bool_t	TObject::IsSortable() const
Bool_t	TObject::IsZombie() const
void	LoadWeights(Option_t* filename = "")
virtual void	TObject::ls(Option_t* option = "") const
void	TObject::MayNotUse(const char* method) const
virtual Bool_t	TObject::Notify()
static void	TObject::operator delete(void* ptr)
static void	TObject::operator delete(void* ptr, void* vp)
static void	TObject::operator delete[](void* ptr)
static void	TObject::operator delete[](void* ptr, void* vp)
void*	TObject::operator new(size_t sz)
void*	TObject::operator new(size_t sz, void* vp)
void*	TObject::operator new[](size_t sz)
void*	TObject::operator new[](size_t sz, void* vp)
virtual void	TObject::Paint(Option_t* option = "")
virtual void	TObject::Pop()
virtual void	TObject::Print(Option_t* option = "") const
void	Randomize() const
virtual Int_t	TObject::Read(const char* name)
virtual void	TObject::RecursiveRemove(TObject* obj)
void	TObject::ResetBit(UInt_t f)
Double_t	Result(Int_t event, Int_t index = 0) const
virtual void	TObject::SaveAs(const char* filename = "", Option_t* option = "") const	// *MENU*
virtual void	TObject::SavePrimitive(basic_ostream<char,char_traits<char> >& out, Option_t* option = "")
void	TObject::SetBit(UInt_t f)
void	TObject::SetBit(UInt_t f, Bool_t set)
void	SetData(TTree*)
void	SetDelta(Double_t delta)
virtual void	TObject::SetDrawOption(Option_t* option = "")	// *MENU*
static void	TObject::SetDtorOnly(void* obj)
void	SetEpsilon(Double_t eps)
void	SetEta(Double_t eta)
void	SetEtaDecay(Double_t ed)
void	SetEventWeight(const char*)
void	SetLearningMethod(TMultiLayerPerceptron::ELearningMethod method)
static void	TObject::SetObjectStat(Bool_t stat)
void	SetReset(Int_t reset)
void	SetTau(Double_t tau)
void	SetTestDataSet(TEventList* test)
void	SetTestDataSet(const char* test)
void	SetTrainingDataSet(TEventList* train)
void	SetTrainingDataSet(const char* train)
virtual void	TObject::SetUniqueID(UInt_t uid)
virtual void	ShowMembers(TMemberInspector& insp, char* parent)
virtual void	Streamer(TBuffer& b)
void	StreamerNVirtual(TBuffer& b)
virtual void	TObject::SysError(const char* method, const char* msgfmt) const
Bool_t	TObject::TestBit(UInt_t f) const
Int_t	TObject::TestBits(UInt_t f) const
void	Train(Int_t nEpoch, Option_t* option = "text")
virtual void	TObject::UseCurrentStyle()
virtual void	TObject::Warning(const char* method, const char* msgfmt) const
virtual Int_t	TObject::Write(const char* name = 0, Int_t option = 0, Int_t bufsize = 0)
virtual Int_t	TObject::Write(const char* name = 0, Int_t option = 0, Int_t bufsize = 0) const

protected:

void	AttachData()
void	BFGSDir(TMatrixD&, Double_t*)
void	BuildNetwork()
void	ConjugateGradientsDir(Double_t*, Double_t)
Double_t	DerivDir(Double_t*)
virtual void	TObject::DoError(int level, const char* location, const char* fmt, va_list va) const
bool	GetBFGSH(TMatrixD&, TMatrixD&, TMatrixD&)
Double_t	GetCrossEntropy() const
Double_t	GetCrossEntropyBinary() const
void	GetEntry(Int_t) const
Double_t	GetSumSquareError() const
Bool_t	LineSearch(Double_t, Double_t)
void	TObject::MakeZombie()
void	MLP_Batch(Double_t*)
void	MLP_Stochastic(Double_t*)
void	SetGammaDelta(TMatrixD&, TMatrixD&, Double_t*)
void	SteepestDir(Double_t*)

private:

	TMultiLayerPerceptron(const TMultiLayerPerceptron&)
void	BuildFirstLayer(TString&)
void	BuildHiddenLayers(TString&)
void	BuildLastLayer(TString&, Int_t)
void	ExpandStructure()
void	MLP_Line(Double_t, Double_t, Double_t)
TMultiLayerPerceptron&	operator=(const TMultiLayerPerceptron&)
void	Shuffle(Int_t*, Int_t) const

Data Members

enum ELearningMethod {	kStochastic,
	kBatch,
	kSteepestDescent,
	kRibierePolak,
	kFletcherReeves,
	kBFGS
};
enum EDataSet {	kTraining,
	kTest
};
enum TObject::EStatusBits {	kCanDelete,
	kMustCleanup,
	kObjInCanvas,
	kIsReferenced,
	kHasUUID,
	kCannotPick,
	kNoContextMenu,
	kInvalidObject
};
enum TObject::[unnamed] {	kIsOnHeap,
	kNotDeleted,
	kZombie,
	kBitMask,
	kSingleKey,
	kOverwrite,
	kWriteDelete
};

Int_t	fCurrentTree	! index of the current tree in a chain
Double_t	fCurrentTreeWeight	! weight of the current tree in a chain
TTree*	fData	! pointer to the tree used as datasource
Double_t	fDelta	! Delta - used in stochastic minimisation - Default=0.
Double_t	fEpsilon	! Epsilon - used in stochastic minimisation - Default=0.
Double_t	fEta	! Eta - used in stochastic minimisation - Default=0.1
Double_t	fEtaDecay	! EtaDecay - Eta *= EtaDecay at each epoch - Default=1.
TTreeFormula*	fEventWeight	! formula representing the event weight
TObjArray	fFirstLayer	Collection of the input neurons; subset of fNetwork
Double_t	fLastAlpha	! internal parameter used in line search
TObjArray	fLastLayer	Collection of the output neurons; subset of fNetwork
TMultiLayerPerceptron::ELearningMethod	fLearningMethod	! The Learning Method
TTreeFormulaManager*	fManager	! TTreeFormulaManager for the weight and neurons
TObjArray	fNetwork	Collection of all the neurons in the network
TNeuron::ENeuronType	fOutType	Type of output neurons
Int_t	fReset	! number of epochs between two resets of the search direction to the steepest descent - Default=50
TString	fStructure	String containing the network structure
TObjArray	fSynapses	Collection of all the synapses in the network
Double_t	fTau	! Tau - used in line search - Default=3.
TEventList*	fTest	! EventList defining the events in the test dataset
Bool_t	fTestOwner	! internal flag whether one has to delete fTest or not
TEventList*	fTraining	! EventList defining the events in the training dataset
Bool_t	fTrainingOwner	! internal flag whether one has to delete fTraining or not
TNeuron::ENeuronType	fType	Type of hidden neurons
TString	fWeight	String containing the event weight
TString	fextD	String containing the derivative name
TString	fextF	String containing the function name


Function documentation