
Sampler< Random > Class Template Reference

Create random samples from a sequence of indices.

#include <vigra/sampling.hxx>

Public Types

typedef ArrayVectorView< IndexType > IndexArrayViewType
 
typedef Int32 IndexType
 

Public Member Functions

IndexArrayViewType oobIndices () const
 
IndexType operator[] (int k) const
 
void sample ()
 
IndexArrayViewType sampledIndices () const
 
 Sampler (UInt32 totalCount, SamplerOptions const &opt=SamplerOptions(), Random const *rnd=0)
 
template<class Iterator >
 Sampler (Iterator strataBegin, Iterator strataEnd, SamplerOptions const &opt=SamplerOptions(), Random const *rnd=0)
 
int sampleSize () const
 
int size () const
 
int strataCount () const
 
bool stratifiedSampling () const
 
int totalCount () const
 
bool withReplacement () const
 

Detailed Description

template<class Random = MersenneTwister>
class vigra::Sampler< Random >

Create random samples from a sequence of indices.

Selecting data items at random is a basic task of machine learning, for example in bootstrapping, RandomForest training, and cross validation. This class implements various ways to select random samples via their indices. Indices are assumed to be consecutive in the range 0 <= index < total_sample_count.

The class always contains a current sample which can be accessed by the index operator or by the function sampledIndices(). The indices that are not in the current sample (out-of-bag indices) can be accessed via the function oobIndices().

The sampling method (with/without replacement, stratified or not) and the number of samples to draw are determined by the option object SamplerOptions.

Usage:

#include <vigra/sampling.hxx>
Namespace: vigra

Create a Sampler with default options, i.e. sample as many indices as there are data elements, with replacement. On average, the sample will contain 0.63*totalCount distinct indices.

int totalCount = 10000; // total number of data elements
int numberOfSamples = 20; // repeat experiment 20 times
Sampler<> sampler(totalCount);
for(int k=0; k<numberOfSamples; ++k)
{
    // process current sample
    for(int i=0; i<sampler.sampleSize(); ++i)
    {
        int currentIndex = sampler[i];
        processData(data[currentIndex]);
    }
    // create next sample
    sampler.sample();
}
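
The 0.63 figure follows from the probability that a given index is drawn at least once in n draws with replacement: 1 - (1 - 1/n)^n, which approaches 1 - 1/e ≈ 0.632 for large n. The following is a minimal sketch that checks this empirically. It uses only the C++ standard library rather than VIGRA, and distinctFraction is a hypothetical helper, not part of the Sampler API.

```cpp
#include <random>
#include <vector>

// Hypothetical helper (not VIGRA code): draw n indices from [0, n) with
// replacement and return the fraction of distinct indices in the sample.
double distinctFraction(int n, unsigned seed)
{
    std::mt19937 rng(seed);                           // standard Mersenne Twister
    std::uniform_int_distribution<int> pick(0, n - 1);
    std::vector<bool> seen(n, false);
    int distinct = 0;
    for (int i = 0; i < n; ++i)
    {
        int index = pick(rng);
        if (!seen[index])
        {
            seen[index] = true;                       // first time this index appears
            ++distinct;
        }
    }
    return static_cast<double>(distinct) / n;
}
```

For n = 10000, the returned fraction should lie close to 0.632, with small run-to-run variation depending on the seed.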

Create a Sampler for stratified sampling, without replacement.

// prepare the strata (i.e. specify which stratum each element belongs to)
int stratumSize1 = 2000, stratumSize2 = 8000,
    totalCount = stratumSize1 + stratumSize2;
ArrayVector<int> strata(totalCount);
for(int i=0; i<stratumSize1; ++i)
    strata[i] = 1;
for(int i=stratumSize1; i<totalCount; ++i)
    strata[i] = 2;
int sampleSize = 200; // i.e. sample 100 elements from each of the two strata
int numberOfSamples = 20; // repeat experiment 20 times
Sampler<> stratifiedSampler(strata.begin(), strata.end(),
    SamplerOptions().withoutReplacement().stratified().sampleSize(sampleSize));
// create first sample
stratifiedSampler.sample();
for(int k=0; k<numberOfSamples; ++k)
{
    // process current sample
    for(int i=0; i<stratifiedSampler.sampleSize(); ++i)
    {
        int currentIndex = stratifiedSampler[i];
        processData(data[currentIndex]);
    }
    // create next sample
    stratifiedSampler.sample();
}

Member Typedef Documentation

typedef Int32 IndexType

Internal type of the indices. Currently, 64-bit indices are not supported because this requires extension of the random number generator classes.

typedef ArrayVectorView< IndexType > IndexArrayViewType

Type of the array view object that is returned by sampledIndices() and oobIndices().

Constructor & Destructor Documentation

Sampler ( UInt32  totalCount,
SamplerOptions const &  opt = SamplerOptions(),
Random const *  rnd = 0 
)

Create a sampler for totalCount data objects.

In each invocation of sample() below, it will sample indices according to the options passed. If no options are given, totalCount indices will be drawn with replacement.

Sampler ( Iterator  strataBegin,
Iterator  strataEnd,
SamplerOptions const &  opt = SamplerOptions(),
Random const *  rnd = 0 
)

Create a sampler for stratified sampling.

strataBegin and strataEnd must refer to a sequence which specifies, for each data object, the stratum it belongs to. The total number of data objects will be set to strataEnd - strataBegin. Equally many samples (subject to rounding) will be drawn from each stratum, unless the option object explicitly requests unstratified sampling, in which case the strata are ignored.
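
To illustrate what drawing equally many samples from each stratum means, here is a minimal standalone sketch using only the C++ standard library. It is not VIGRA's implementation: stratifiedSample is a hypothetical helper, and it sidesteps the rounding question by assuming that the sample size divides evenly among the strata.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <random>
#include <vector>

// Hypothetical helper (not VIGRA code): draw sampleSize indices without
// replacement, taking the same number from every stratum.
// strata[i] is the stratum label of data object i.
std::vector<int> stratifiedSample(std::vector<int> const & strata,
                                  int sampleSize, std::mt19937 & rng)
{
    // Group the indices by stratum label.
    std::map<int, std::vector<int> > members;
    for (int i = 0; i < static_cast<int>(strata.size()); ++i)
        members[strata[i]].push_back(i);

    // Assumption of this sketch: sampleSize divides evenly among the strata.
    int strataCount = static_cast<int>(members.size());
    assert(strataCount > 0 && sampleSize % strataCount == 0);
    int perStratum = sampleSize / strataCount;

    std::vector<int> result;
    for (auto & entry : members)
    {
        std::vector<int> & indices = entry.second;
        assert(perStratum <= static_cast<int>(indices.size()));
        // Shuffle the stratum and keep its first perStratum indices.
        std::shuffle(indices.begin(), indices.end(), rng);
        result.insert(result.end(), indices.begin(), indices.begin() + perStratum);
    }
    return result;
}
```

With 2000 objects in one stratum, 8000 in another, and a sample size of 200, the sketch returns 100 distinct indices from each stratum, so the smaller stratum is sampled at a higher rate than the larger one.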

Member Function Documentation

IndexType operator[] ( int  k) const

Return the k-th index in the current sample.

void sample ( )

Create a new sample.

int totalCount ( ) const

The total number of data elements.

int sampleSize ( ) const

The number of data elements that have been sampled.

int size ( ) const

Same as sampleSize().

int strataCount ( ) const

The number of strata to be used. Will be 1 if no strata are given. Will be ignored when stratifiedSampling() is false.

bool stratifiedSampling ( ) const

Whether to use stratified sampling. (If this is false, strata will be ignored even if present.)

bool withReplacement ( ) const

Whether sampling should be performed with replacement.
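
The difference between the two modes can be sketched with the C++ standard library alone. This is an illustration under stated assumptions, not VIGRA's implementation: drawIndices is a hypothetical helper, and the shuffle-and-truncate approach for sampling without replacement is just one common technique.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not VIGRA code): draw sampleSize indices from
// [0, totalCount), either with or without replacement.
std::vector<int> drawIndices(int totalCount, int sampleSize,
                             bool withReplacement, std::mt19937 & rng)
{
    assert(totalCount > 0 && sampleSize >= 0);
    std::vector<int> sample;
    if (withReplacement)
    {
        // Every draw is independent, so an index may occur repeatedly.
        std::uniform_int_distribution<int> pick(0, totalCount - 1);
        for (int i = 0; i < sampleSize; ++i)
            sample.push_back(pick(rng));
    }
    else
    {
        // Shuffle all indices and keep the first sampleSize of them,
        // so that no index occurs twice.
        assert(sampleSize <= totalCount);
        std::vector<int> all(totalCount);
        std::iota(all.begin(), all.end(), 0);
        std::shuffle(all.begin(), all.end(), rng);
        sample.assign(all.begin(), all.begin() + sampleSize);
    }
    return sample;
}
```

Without replacement, the sample size cannot exceed the total count; with replacement, any sample size is possible.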

IndexArrayViewType sampledIndices ( ) const

Return an array view containing the indices in the current sample.

IndexArrayViewType oobIndices ( ) const

Return an array view containing the out-of-bag indices (i.e. the indices that are not in the current sample).
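
As an illustration of the out-of-bag concept, the following standalone sketch computes the complement of a sample within the index range. It uses only the C++ standard library; outOfBag is a hypothetical helper and does not reflect how Sampler stores or orders its out-of-bag indices.

```cpp
#include <vector>

// Hypothetical helper (not VIGRA code): return, in ascending order, the
// indices in [0, totalCount) that do not occur in sampled.
std::vector<int> outOfBag(int totalCount, std::vector<int> const & sampled)
{
    std::vector<bool> used(totalCount, false);
    for (int index : sampled)
        used[index] = true;          // mark indices that are in the sample

    std::vector<int> oob;
    for (int i = 0; i < totalCount; ++i)
        if (!used[i])
            oob.push_back(i);        // never drawn, hence out-of-bag
    return oob;
}
```

For example, with five data elements and the sample {0, 2, 2, 4}, the out-of-bag indices are 1 and 3. An index that is drawn several times still counts as in the sample only once.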


The documentation for this class was generated from the following file: vigra/sampling.hxx

© Ullrich Köthe (ullrich.koethe@iwr.uni-heidelberg.de)
Heidelberg Collaboratory for Image Processing, University of Heidelberg, Germany

html generated using doxygen and Python
vigra 1.11.1 (Fri May 19 2017)