
RandomForestOptions Class Reference

Options object for the random forest.

#include <vigra/random_forest/rf_common.hxx>

Public Member Functions

RandomForestOptions & features_per_node (RF_OptionTag in)
 Use a built-in mapping to calculate mtry.

RandomForestOptions & features_per_node (int in)
 Set mtry to a constant value.

RandomForestOptions & features_per_node (int(*in)(int))
 Use an external function to calculate mtry.

RandomForestOptions & min_split_node_size (int in)
 Number of examples required for a node to be split.

RandomForestOptions & predict_weighted ()
 Weight each tree's prediction by the number of samples in that node.

RandomForestOptions ()
 Create a RandomForestOptions object with default initialisation.

RandomForestOptions & sample_with_replacement (bool in)
 Sample from the training population with or without replacement.

RandomForestOptions & samples_per_tree (double in)
 Specify the fraction of the total number of samples used per tree for learning.

RandomForestOptions & samples_per_tree (int in)
 Directly specify the number of samples per tree.

RandomForestOptions & samples_per_tree (int(*in)(int))
 Use an external function to calculate the number of samples each tree should be learnt with.

RandomForestOptions & tree_count (unsigned int in)
 How many trees to create?

RandomForestOptions & use_stratification (RF_OptionTag in)
 Specify the stratification strategy.
 

Public Attributes

sampling options
double training_set_proportion_
 
int training_set_size_
 
int(* training_set_func_ )(int)
 
RF_OptionTag training_set_calc_switch_
 
bool sample_with_replacement_
 
RF_OptionTag stratification_method_
 
general random forest options

These are usually used by most split functors and stopping predicates.

RF_OptionTag mtry_switch_
 
int mtry_
 
int(* mtry_func_ )(int)
 
bool predict_weighted_
 
int tree_count_
 
int min_split_node_size_
 
bool prepare_online_learning_
 

Detailed Description

Options object for the random forest.

Usage:

    RandomForestOptions a = RandomForestOptions()
                                .param1(value1)
                                .param2(value2)
                                ...

This class only contains options/parameters that are not problem dependent. The ProblemSpec class contains methods to set class weights if necessary.

Note that all methods return *this, which makes chaining of options as shown above possible.
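As a sketch of such a chained configuration (only the setters documented on this page are used; handing the object to a random forest at construction time is an assumed usage, not documented here):

    #include <vigra/random_forest/rf_common.hxx>

    void configureForest()
    {
        using namespace vigra;

        // 100 trees, each learnt on 80% of the samples drawn without
        // replacement, with equal per-class stratification, sqrt(n_f)
        // features per split, and no further splits below 10 samples per node.
        RandomForestOptions options = RandomForestOptions()
                                          .tree_count(100)
                                          .samples_per_tree(0.8)
                                          .sample_with_replacement(false)
                                          .use_stratification(RF_EQUAL)
                                          .features_per_node(RF_SQRT)
                                          .min_split_node_size(10);

        // The configured object would then typically be passed to a
        // vigra::RandomForest at construction (assumed usage).
        (void)options;
    }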

Constructor & Destructor Documentation

RandomForestOptions ( )

Create a RandomForestOptions object with default initialisation.

See the other member functions for more information on the default values.

Member Function Documentation

RandomForestOptions& use_stratification ( RF_OptionTag  in)

Specify the stratification strategy.

Default: RF_NONE

Possible values:
  • RF_NONE (default): no stratification
  • RF_EQUAL: draw an equal number of samples from each class
  • RF_PROPORTIONAL: sample proportionally to the fraction of each class in the population
  • RF_EXTERNAL: the strata_weights_ field of the ProblemSpec_t object has been set externally (defunct)
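A minimal sketch of requesting equal stratification (only the setter itself is taken from this page; the surrounding function is illustrative):

    #include <vigra/random_forest/rf_common.hxx>

    void configureStratification()
    {
        vigra::RandomForestOptions options;

        // Draw the same number of samples from every class for each tree,
        // e.g. when the class frequencies in the training data are unbalanced.
        options.use_stratification(vigra::RF_EQUAL);
    }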

RandomForestOptions& sample_with_replacement ( bool  in)

Sample from the training population with or without replacement.


Default: true

RandomForestOptions& samples_per_tree ( double  in)

Specify the fraction of the total number of samples used per tree for learning.

This value should be in [0.0, 1.0] if sampling without replacement has been specified.

Default: 1.0

RandomForestOptions& samples_per_tree ( int  in)

Directly specify the number of samples per tree.

This value should not be higher than the total number of samples if sampling without replacement has been specified.

RandomForestOptions& samples_per_tree ( int(*)(int)  in)

Use an external function to calculate the number of samples each tree should be learnt with.

Parameters
  in   function pointer that takes the number of rows in the learning data and outputs the number of samples per tree.
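A minimal sketch of such a function (the name halfOfRows and the 50% choice are illustrative assumptions, not part of the library):

    #include <vigra/random_forest/rf_common.hxx>

    // Learn each tree on half of the available rows.
    int halfOfRows(int row_count)
    {
        return row_count / 2;
    }

    void configureSampleCount()
    {
        vigra::RandomForestOptions options;
        options.samples_per_tree(&halfOfRows);
    }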
RandomForestOptions& features_per_node ( RF_OptionTag  in)

Use a built-in mapping to calculate mtry.

Use one of the built-in mappings to calculate mtry from the number of columns in the input feature data.

Parameters
  in   possible values:
  • RF_LOG (the number of features considered for each split is $ 1+\lfloor \log(n_f)/\log(2) \rfloor $ as in Breiman's original paper),
  • RF_SQRT (default, the number of features considered for each split is $ \lfloor \sqrt{n_f} + 0.5 \rfloor $)
  • RF_ALL (all features are considered for each split)
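For example, with $ n_f = 100 $ features, RF_LOG selects $ 1+\lfloor \log(100)/\log(2) \rfloor = 7 $ features per split, while RF_SQRT selects $ \lfloor \sqrt{100}+0.5 \rfloor = 10 $.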
RandomForestOptions& features_per_node ( int  in)

Set mtry to a constant value.

mtry is the number of randomly chosen columns/variates/variables from which the best split is selected.

RandomForestOptions& features_per_node ( int(*)(int)  in)

Use an external function to calculate mtry.

Parameters
  in   function pointer that takes an int (the number of columns of the input feature data) and outputs an int (mtry).
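A minimal sketch of such a function (the name thirdOfColumns and the divisor are illustrative assumptions, not part of the library):

    #include <vigra/random_forest/rf_common.hxx>

    #include <algorithm>

    // Use one third of the feature columns as split candidates, but at least one.
    int thirdOfColumns(int column_count)
    {
        return std::max(1, column_count / 3);
    }

    void configureMtry()
    {
        vigra::RandomForestOptions options;
        options.features_per_node(&thirdOfColumns);
    }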
RandomForestOptions& tree_count ( unsigned int  in)

How many trees to create?


Default: 255.

RandomForestOptions& min_split_node_size ( int  in)

Number of examples required for a node to be split.

When the number of examples in a node is below this number, the node is not split even if class separation is not yet perfect. Instead, the node returns the proportion of each class (among the remaining examples) during the prediction phase.
Default: 1 (complete growing)


The documentation for this class was generated from the following file:

  • vigra/random_forest/rf_common.hxx

© Ullrich Köthe (ullrich.koethe@iwr.uni-heidelberg.de)
Heidelberg Collaboratory for Image Processing, University of Heidelberg, Germany

vigra 1.11.1 (Fri May 19 2017)