
vigra::rf::algorithms Namespace Reference

Classes

class  ClusterImportanceVisitor
 
class  CorrectStatus
 
class  Draw
 
class  GetClusterVariables
 
struct  HC_Entry
 
class  HClustering
 
class  NormalizeStatus
 
class  PermuteCluster
 
class  RFErrorCallback
 
class  VariableSelectionResult
 

Functions

template<class FeatureT , class ResponseT , class ErrorRateCallBack >
void backward_elimination (FeatureT const &features, ResponseT const &response, VariableSelectionResult &result, ErrorRateCallBack errorcallback)
 
template<class FeatureT , class ResponseT >
void cluster_permutation_importance (FeatureT const &features, ResponseT const &response, HClustering &linkage, MultiArray< 2, double > &distance)
 
template<class FeatureT , class ResponseT , class ErrorRateCallBack >
void forward_selection (FeatureT const &features, ResponseT const &response, VariableSelectionResult &result, ErrorRateCallBack errorcallback)
 
template<class FeatureT , class ResponseT , class ErrorRateCallBack >
void rank_selection (FeatureT const &features, ResponseT const &response, VariableSelectionResult &result, ErrorRateCallBack errorcallback)
 

Detailed Description

This namespace contains the algorithms developed for feature selection.

Function Documentation

void vigra::rf::algorithms::forward_selection ( FeatureT const &  features,
ResponseT const &  response,
VariableSelectionResult &  result,
ErrorRateCallBack  errorcallback 
)

Perform forward selection

Parameters
features IN: n x p matrix containing n instances with p attributes/features used in the variable selection algorithm
response IN: n x 1 matrix containing the corresponding response
result IN/OUT: VariableSelectionResult struct which will contain the results of the algorithm. Features between result.selected.begin() and result.pivot will be left untouched.
errorcallback IN, OPTIONAL: Functor that returns the error rate given a set of features and labels. Default is the RandomForest OOB Error.

Forward selection successively adds the feature that decreases the error rate the most.
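The greedy step described above can be illustrated with a small standalone sketch. It does not use VIGRA and is not the library's implementation: ErrorFn, ForwardTrace and greedy_forward are hypothetical names, and the error callback is assumed to map a list of feature indices to an error rate. The fixed argument plays the role of the features before the pivot, which are never reordered.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical callback type: error rate for a given set of feature indices.
using ErrorFn = std::function<double(std::vector<int> const &)>;

struct ForwardTrace {
    std::vector<int> order;      // candidates in the order they were added
    std::vector<double> errors;  // error rate after each addition
};

// Greedy forward selection: in every round, add the candidate whose
// inclusion yields the lowest error. `fixed` is always part of the set.
ForwardTrace greedy_forward(std::vector<int> fixed,
                            std::vector<int> candidates,
                            ErrorFn const & error)
{
    assert(static_cast<bool>(error));
    ForwardTrace trace;
    std::vector<int> current = fixed;
    while (!candidates.empty()) {
        std::size_t best = 0;
        double best_err = std::numeric_limits<double>::infinity();
        for (std::size_t i = 0; i < candidates.size(); ++i) {
            current.push_back(candidates[i]);   // try this candidate
            double const e = error(current);
            current.pop_back();                 // undo the trial
            if (e < best_err) { best_err = e; best = i; }
        }
        current.push_back(candidates[best]);
        trace.order.push_back(candidates[best]);
        trace.errors.push_back(best_err);
        candidates.erase(candidates.begin() + static_cast<std::ptrdiff_t>(best));
    }
    return trace;
}
```

Each round evaluates the callback once per remaining candidate, so p candidates cost on the order of p*p/2 error evaluations in total.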

usage:

MultiArray<2, double> features = createSomeFeatures();
MultiArray<2, int> labels = createCorrespondingLabels();
VariableSelectionResult result;
forward_selection(features, labels, result);

To use forward selection while ensuring that a specific feature (e.g. feature 5) is always included, do the following:

VariableSelectionResult result;
result.init(features, labels);
std::swap(result.selected[0], result.selected[5]);
result.setPivot(1);
forward_selection(features, labels, result);
See Also
VariableSelectionResult
void vigra::rf::algorithms::backward_elimination ( FeatureT const &  features,
ResponseT const &  response,
VariableSelectionResult &  result,
ErrorRateCallBack  errorcallback 
)

Perform backward elimination

Parameters
features IN: n x p matrix containing n instances with p attributes/features used in the variable selection algorithm
response IN: n x 1 matrix containing the corresponding response
result IN/OUT: VariableSelectionResult struct which will contain the results of the algorithm. Features between result.pivot and result.selected.end() will be left untouched.
errorcallback IN, OPTIONAL: Functor that returns the error rate given a set of features and labels. Default is the RandomForest OOB Error.

Backward elimination successively removes the feature that has the least influence on the error rate.
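As a rough illustration of the elimination loop, the following standalone sketch removes, in each round, the feature whose removal leaves the lowest error. It is independent of VIGRA and makes several assumptions: ErrorFn, BackwardTrace and greedy_backward are hypothetical names, the callback maps feature indices to an error rate, and the loop stops once a single feature remains. The pivot handling of VariableSelectionResult is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical callback type: error rate for a given set of feature indices.
using ErrorFn = std::function<double(std::vector<int> const &)>;

struct BackwardTrace {
    std::vector<int> removed;    // features in the order they were eliminated
    std::vector<double> errors;  // error rate after each elimination
};

// Greedy backward elimination: in every round, drop the feature whose
// removal yields the lowest error on the remaining set.
BackwardTrace greedy_backward(std::vector<int> active, ErrorFn const & error)
{
    assert(static_cast<bool>(error));
    BackwardTrace trace;
    while (active.size() > 1) {
        std::size_t best = 0;
        double best_err = std::numeric_limits<double>::infinity();
        for (std::size_t i = 0; i < active.size(); ++i) {
            std::vector<int> trial = active;    // copy without feature i
            trial.erase(trial.begin() + static_cast<std::ptrdiff_t>(i));
            double const e = error(trial);
            if (e < best_err) { best_err = e; best = i; }
        }
        trace.removed.push_back(active[best]);
        trace.errors.push_back(best_err);
        active.erase(active.begin() + static_cast<std::ptrdiff_t>(best));
    }
    return trace;
}
```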

usage:

MultiArray<2, double> features = createSomeFeatures();
MultiArray<2, int> labels = createCorrespondingLabels();
VariableSelectionResult result;
backward_elimination(features, labels, result);

To use backward elimination while ensuring that a specific feature (e.g. feature 5) is always excluded, do the following:

VariableSelectionResult result;
result.init(features, labels);
std::swap(result.selected[result.selected.size()-1], result.selected[5]);
result.setPivot(result.selected.size()-1);
backward_elimination(features, labels, result);
See Also
VariableSelectionResult
void vigra::rf::algorithms::rank_selection ( FeatureT const &  features,
ResponseT const &  response,
VariableSelectionResult &  result,
ErrorRateCallBack  errorcallback 
)

Perform rank selection using a predefined ranking

Parameters
features IN: n x p matrix containing n instances with p attributes/features used in the variable selection algorithm
response IN: n x 1 matrix containing the corresponding response
result IN/OUT: VariableSelectionResult struct which will contain the results of the algorithm. The struct should be initialized with the predefined ranking.
errorcallback IN, OPTIONAL: Functor that returns the error rate given a set of features and labels. Default is the RandomForest OOB Error.

Often a variable importance or score measure is used to define the order in which variables are selected. This method takes such a ranking and calculates the corresponding error rates.
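A minimal sketch of the idea, assuming that "the corresponding error rates" means the error of each growing prefix of the ranking: the function below evaluates a hypothetical error callback on the first feature, the first two features, and so on. ErrorFn and errors_along_ranking are illustrative names that do not exist in VIGRA.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical callback type: error rate for a given set of feature indices.
using ErrorFn = std::function<double(std::vector<int> const &)>;

// Evaluate the error of every prefix of a predefined ranking.
// errors[k] is the error obtained with ranking[0..k].
std::vector<double> errors_along_ranking(std::vector<int> const & ranking,
                                         ErrorFn const & error)
{
    assert(static_cast<bool>(error));
    std::vector<double> errors;
    std::vector<int> prefix;
    for (int feature : ranking) {
        prefix.push_back(feature);        // grow the selected set by one
        errors.push_back(error(prefix));  // one evaluation per prefix
    }
    return errors;
}
```

Unlike greedy forward selection, this needs only one error evaluation per feature, because the order is fixed in advance.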

usage:

MultiArray<2, double> features = createSomeFeatures();
MultiArray<2, int> labels = createCorrespondingLabels();
std::vector<int> ranking = createRanking(features);
VariableSelectionResult result;
result.init(features, labels, ranking.begin(), ranking.end());
rank_selection(features, labels, result);
See Also
VariableSelectionResult
void vigra::rf::algorithms::cluster_permutation_importance ( FeatureT const &  features,
ResponseT const &  response,
HClustering &  linkage,
MultiArray< 2, double > &  distance 
)

Perform hierarchical clustering of variables and assess importance of clusters

Parameters
features IN: n x p matrix containing n instances with p attributes/features used in the variable selection algorithm
response IN: n x 1 matrix containing the corresponding response
linkage OUT: Hierarchical grouping of variables.
distance OUT: distance matrix used for creating the linkage

Performs hierarchical clustering of the variables and calculates the permutation importance of each cluster. The cluster permutation importance corresponds to the ordinary permutation importance measure, with all columns belonging to a cluster permuted. The importance measure for each cluster is stored in the status() field of the corresponding cluster node. Use the Draw functor to create human-readable output.
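To make the permutation step concrete, here is a standalone sketch that does not use VIGRA. It assumes that the columns of a cluster are shuffled with one shared row permutation, so that the values within the cluster stay together row by row, and that importance is measured as the increase in error after permutation. Both are assumptions for illustration, not statements about the library's internals. Matrix, permute_columns_jointly and cluster_importance are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <random>
#include <vector>

// Rows are instances, columns are features.
using Matrix = std::vector<std::vector<double>>;

// Return a copy of `data` in which all columns listed in `cluster`
// are permuted with the same random row permutation.
Matrix permute_columns_jointly(Matrix data,
                               std::vector<std::size_t> const & cluster,
                               std::mt19937 & rng)
{
    std::vector<std::size_t> perm(data.size());
    std::iota(perm.begin(), perm.end(), std::size_t{0});
    std::shuffle(perm.begin(), perm.end(), rng);

    Matrix const original = data;  // read from an unmodified copy
    for (std::size_t r = 0; r < data.size(); ++r) {
        for (std::size_t c : cluster) {
            assert(c < data[r].size());
            data[r][c] = original[perm[r]][c];
        }
    }
    return data;
}

// Importance of a cluster: error on the permuted data minus error on
// the original data. `error` is a hypothetical evaluation callback.
double cluster_importance(Matrix const & data,
                          std::vector<std::size_t> const & cluster,
                          std::function<double(Matrix const &)> const & error,
                          std::mt19937 & rng)
{
    return error(permute_columns_jointly(data, cluster, rng)) - error(data);
}
```

A cluster that the error callback does not depend on receives an importance of zero under this definition.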

See Also
HClustering

usage:

MultiArray<2, double> features = createSomeFeatures();
MultiArray<2, int> labels = createCorrespondingLabels();
HClustering linkage;
MultiArray<2, double> distance;
cluster_permutation_importance(features, labels, linkage, distance);
// create graphviz output
Draw<double, int> draw(features, labels, "linkagetree.graph");
linkage.breadth_first_traversal(draw);

© Ullrich Köthe (ullrich.koethe@iwr.uni-heidelberg.de)
Heidelberg Collaboratory for Image Processing, University of Heidelberg, Germany

html generated using doxygen and Python
vigra 1.11.1 (Fri May 19 2017)