
RandomForestOptions Class Reference

Options object for the random forest.

#include <vigra/random_forest/rf_common.hxx>

Public Member Functions

RandomForestOptions & features_per_node (RF_OptionTag in)
 Use a built-in mapping to calculate mtry.

RandomForestOptions & features_per_node (int in)
 Set mtry to a constant value.

RandomForestOptions & features_per_node (int(*in)(int))
 Use an external function to calculate mtry.

RandomForestOptions & min_split_node_size (int in)
 Number of examples required for a node to be split.

RandomForestOptions & predict_weighted ()
 Weight each tree's prediction by the number of samples in that node.

RandomForestOptions ()
 Create a RandomForestOptions object with default initialisation.

RandomForestOptions & sample_with_replacement (bool in)
 Sample from the training population with or without replacement.

RandomForestOptions & samples_per_tree (double in)
 Specify the fraction of the total number of samples used per tree for learning.

RandomForestOptions & samples_per_tree (int in)
 Directly specify the number of samples per tree.

RandomForestOptions & samples_per_tree (int(*in)(int))
 Use an external function to calculate the number of samples each tree should be learnt with.

RandomForestOptions & tree_count (unsigned int in)
 How many trees to create?

RandomForestOptions & use_stratification (RF_OptionTag in)
 Specify the stratification strategy.
 

Public Attributes

sampling options
double training_set_proportion_
 
int training_set_size_
 
int(* training_set_func_ )(int)
 
RF_OptionTag training_set_calc_switch_
 
bool sample_with_replacement_
 
RF_OptionTag stratification_method_
 
general random forest options

These are usually used by most split functors and stopping predicates.

RF_OptionTag mtry_switch_
 
int mtry_
 
int(* mtry_func_ )(int)
 
bool predict_weighted_
 
int tree_count_
 
int min_split_node_size_
 
bool prepare_online_learning_
 

Detailed Description

Options object for the random forest.

Usage:

    RandomForestOptions a = RandomForestOptions()
                                .param1(value1)
                                .param2(value2)
                                ...

This class only contains options/parameters that are not problem dependent. The ProblemSpec class contains methods to set class weights if necessary.

Note that all methods return *this, which makes chaining of options as shown above possible.
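As a sketch of such a chained configuration (only the setters documented on this page are used; handing the object to a random forest at construction time is an assumed usage, not documented here):

    #include <vigra/random_forest/rf_common.hxx>

    void configureForest()
    {
        using namespace vigra;

        // 100 trees, each learnt on 80% of the samples drawn without
        // replacement, with equal per-class stratification, sqrt(n_f)
        // features per split, and no further splits below 10 samples per node.
        RandomForestOptions options = RandomForestOptions()
                                          .tree_count(100)
                                          .samples_per_tree(0.8)
                                          .sample_with_replacement(false)
                                          .use_stratification(RF_EQUAL)
                                          .features_per_node(RF_SQRT)
                                          .min_split_node_size(10);

        // The configured object would then typically be passed to a
        // vigra::RandomForest at construction (assumed usage).
        (void)options;
    }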

Constructor & Destructor Documentation

RandomForestOptions ( )

Create a RandomForestOptions object with default initialisation.

See the other member functions for more information on the default values.

Member Function Documentation

RandomForestOptions& use_stratification ( RF_OptionTag  in)

Specify the stratification strategy.

Default: RF_NONE

Possible values:
  • RF_NONE (default): no stratification
  • RF_EQUAL: draw an equal number of samples from each class
  • RF_PROPORTIONAL: sample proportionally to the fraction of each class in the population
  • RF_EXTERNAL: the strata_weights_ field of the ProblemSpec_t object has been set externally (defunct)
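A minimal sketch of requesting equal stratification (only the setter itself is taken from this page; the surrounding function is illustrative):

    #include <vigra/random_forest/rf_common.hxx>

    void configureStratification()
    {
        vigra::RandomForestOptions options;

        // Draw the same number of samples from every class for each tree,
        // e.g. when the class frequencies in the training data are unbalanced.
        options.use_stratification(vigra::RF_EQUAL);
    }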

RandomForestOptions& sample_with_replacement ( bool  in)

Sample from the training population with or without replacement.


Default: true

RandomForestOptions& samples_per_tree ( double  in)

Specify the fraction of the total number of samples used per tree for learning.

This value should be in [0.0, 1.0] if sampling without replacement has been specified.

Default: 1.0

RandomForestOptions& samples_per_tree ( int  in)

Directly specify the number of samples per tree.

This value should not be higher than the total number of samples if sampling without replacement has been specified.

RandomForestOptions& samples_per_tree ( int(*)(int)  in)

Use an external function to calculate the number of samples each tree should be learnt with.

Parameters
  in   function pointer that takes the number of rows in the learning data and outputs the number of samples per tree.
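A minimal sketch of such a function (the name halfOfRows and the 50% choice are illustrative assumptions, not part of the library):

    #include <vigra/random_forest/rf_common.hxx>

    // Learn each tree on half of the available rows.
    int halfOfRows(int row_count)
    {
        return row_count / 2;
    }

    void configureSampleCount()
    {
        vigra::RandomForestOptions options;
        options.samples_per_tree(&halfOfRows);
    }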
RandomForestOptions& features_per_node ( RF_OptionTag  in)

Use a built-in mapping to calculate mtry.

Use one of the built-in mappings to calculate mtry from the number of columns in the input feature data.

Parameters
  in   possible values:
  • RF_LOG (the number of features considered for each split is $ 1+\lfloor \log(n_f)/\log(2) \rfloor $ as in Breiman's original paper),
  • RF_SQRT (default, the number of features considered for each split is $ \lfloor \sqrt{n_f} + 0.5 \rfloor $)
  • RF_ALL (all features are considered for each split)
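For example, with $ n_f = 100 $ features, RF_LOG selects $ 1+\lfloor \log(100)/\log(2) \rfloor = 7 $ features per split, while RF_SQRT selects $ \lfloor \sqrt{100}+0.5 \rfloor = 10 $.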
RandomForestOptions& features_per_node ( int  in)

Set mtry to a constant value.

mtry is the number of randomly chosen columns/variates/variables from which the best split is selected.

RandomForestOptions& features_per_node ( int(*)(int)  in)

Use an external function to calculate mtry.

Parameters
  in   function pointer that takes an int (the number of columns of the input feature data) and outputs an int (mtry).
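A minimal sketch of such a function (the name thirdOfColumns and the divisor are illustrative assumptions, not part of the library):

    #include <vigra/random_forest/rf_common.hxx>

    #include <algorithm>

    // Use one third of the feature columns as split candidates, but at least one.
    int thirdOfColumns(int column_count)
    {
        return std::max(1, column_count / 3);
    }

    void configureMtry()
    {
        vigra::RandomForestOptions options;
        options.features_per_node(&thirdOfColumns);
    }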
RandomForestOptions& tree_count ( unsigned int  in)

How many trees to create?


Default: 255.

RandomForestOptions& min_split_node_size ( int  in)

Number of examples required for a node to be split.

When the number of examples in a node is below this number, the node is not split even if class separation is not yet perfect. Instead, the node returns the proportion of each class (among the remaining examples) during the prediction phase.
Default: 1 (complete growing)


The documentation for this class was generated from the following file:

  • vigra/random_forest/rf_common.hxx

© Ullrich Köthe (ullrich.koethe@iwr.uni-heidelberg.de)
Heidelberg Collaboratory for Image Processing, University of Heidelberg, Germany

vigra 1.11.1 (Fri May 19 2017)