HEBench
hebench::TestHarness::PartialDataLoader Class Reference

Base class for data loaders and data generators.

#include <hebench_idata_loader.h>


Public Types

typedef std::shared_ptr< PartialDataLoader > Ptr
 
- Public Types inherited from hebench::TestHarness::IDataLoader
template<typename T >
using unique_ptr_custom_deleter = hebench::TestHarness::unique_ptr_custom_deleter< T >
 
typedef std::shared_ptr< ResultData > ResultDataPtr
 
typedef std::shared_ptr< IDataLoader > Ptr
 

Public Member Functions

 ~PartialDataLoader () override
 
std::uint64_t getParameterCount () const override
 Number of parameter components (operands) for the represented operation.

const hebench::APIBridge::DataPack & getParameterData (std::uint64_t param_position) const override
 Data pack for specified operation parameter (operand).

std::uint64_t getResultCount () const override
 Number of components in a result for the represented operation.

const hebench::APIBridge::DataPack & getResultData (std::uint64_t param_position) const override
 Data pack corresponding to the specified component of the result.
 
ResultDataPtr getResultFor (const std::uint64_t *param_data_pack_indices) override
 Retrieves the ground-truth result for the specified parameter sample indices.
 
std::uint64_t getResultIndex (const std::uint64_t *param_data_pack_indices) const override
 Computes the index of the result NativeDataBuffer given the indices of the input data.

std::uint64_t getTotalDataLoaded () const override
 Total data loaded by this loader in bytes.

bool hasResults () const
 Retrieves whether buffers to contain output data have been allocated or not.
 
bool isInitialized () const
 
hebench::APIBridge::DataType getDataType () const
 
- Public Member Functions inherited from hebench::TestHarness::IDataLoader
virtual ~IDataLoader ()
 

Protected Member Functions

 PartialDataLoader ()
 
void init (hebench::APIBridge::DataType data_type, std::size_t input_dim, const std::size_t *input_sample_count_per_dim, const std::uint64_t *input_count_per_dim, std::size_t output_dim, const std::uint64_t *output_count_per_dim, bool allocate_output)
 Initializes dimensions of inputs and outputs. No allocation is performed.
 
void init (const std::string &filename, hebench::APIBridge::DataType data_type, std::size_t expected_input_dim, const std::size_t *max_input_sample_count_per_dim, const std::uint64_t *expected_input_count_per_dim, std::size_t expected_output_dim, const std::uint64_t *expected_output_count_per_dim)
 Loads a dataset from a file.
 
std::vector< std::shared_ptr< hebench::APIBridge::DataPack > > getResultTempDataPacks (std::uint64_t result_index) const
 Retrieves a pre-allocated result providing memory space to store a single operation result sample.
 
std::vector< std::shared_ptr< hebench::APIBridge::DataPack > > getResultTempDataPacks (const std::uint64_t *param_data_pack_indices) const
 Retrieves a pre-allocated result providing memory space to store a single operation result sample.
 
std::vector< std::shared_ptr< hebench::APIBridge::DataPack > > getResultTempDataPacks () const
 Retrieves a pre-allocated result providing memory space to store a single operation result sample.
 
- Protected Member Functions inherited from hebench::TestHarness::IDataLoader
 IDataLoader ()
 

Friends

template<typename >
class PartialDataLoaderHelper
 

Additional Inherited Members

- Static Public Member Functions inherited from hebench::TestHarness::IDataLoader
static std::size_t sizeOf (hebench::APIBridge::DataType data_type)
 
static unique_ptr_custom_deleter< hebench::APIBridge::DataPackCollection > createDataPackCollection (std::uint64_t data_pack_count)
 Creates shallow packed data that self cleans up.

static unique_ptr_custom_deleter< hebench::APIBridge::DataPack > createDataPack (std::uint64_t buffer_count, std::uint64_t param_position)
 Creates shallow data pack that self cleans up.

static unique_ptr_custom_deleter< hebench::APIBridge::NativeDataBuffer > createDataBuffer (std::uint64_t size, std::int64_t tag)
 

Detailed Description

Base class for data loaders and data generators.

During initialization, a derived class should call PartialDataLoader::init() to initialize the dimensions of the inputs and outputs. After initializing the dimensions, the derived class should call PartialDataLoader::allocate() to allocate the data for the inputs and outputs specified during PartialDataLoader::init().

After the data has been allocated, the methods PartialDataLoader::getParameterCount(), PartialDataLoader::getParameterData(), PartialDataLoader::getResultCount(), PartialDataLoader::getResultData() and the other data access methods become available, and the buffers point to the correct memory locations to contain the data specified. A derived class can then use these methods to generate the data or load it from storage into the allocated memory.

Definition at line 231 of file hebench_idata_loader.h.

Member Typedef Documentation

◆ Ptr

Definition at line 243 of file hebench_idata_loader.h.

Constructor & Destructor Documentation

◆ ~PartialDataLoader()

hebench::TestHarness::PartialDataLoader::~PartialDataLoader ( )
inline override

Definition at line 245 of file hebench_idata_loader.h.

◆ PartialDataLoader()

hebench::TestHarness::PartialDataLoader::PartialDataLoader ( )
protected

Definition at line 381 of file hebench_idata_loader.cpp.

Member Function Documentation

◆ getDataType()

hebench::APIBridge::DataType hebench::TestHarness::PartialDataLoader::getDataType ( ) const
inline

Definition at line 270 of file hebench_idata_loader.h.

◆ getParameterCount()

std::uint64_t hebench::TestHarness::PartialDataLoader::getParameterCount ( ) const
inline override virtual

Number of parameter components (operands) for the represented operation.

This is the number of operands for the operation. For example, a unary operation has 1 operand, binary has 2, n-ary has n.

Implements hebench::TestHarness::IDataLoader.

Definition at line 247 of file hebench_idata_loader.h.

◆ getParameterData()

const hebench::APIBridge::DataPack & hebench::TestHarness::PartialDataLoader::getParameterData ( std::uint64_t  param_position) const
override virtual

Data pack for specified operation parameter (operand).

Parameters
[in] param_position: Zero-based position of the parameter.
Returns
A data pack containing the collection of samples for the specified parameter.

The parameter position for an operation is zero-based, starting from the leftmost parameter. That is, for an operation such as:

R = op(A, B, C, ...)

R is the result, A is parameter 0, B is parameter 1, C is parameter 2, etc.

Implements hebench::TestHarness::IDataLoader.

Definition at line 643 of file hebench_idata_loader.cpp.

◆ getResultCount()

std::uint64_t hebench::TestHarness::PartialDataLoader::getResultCount ( ) const
inline override virtual

Number of components in a result for the represented operation.

Shape of result is always 2D: [getResultCount(), ?].

For example, if the represented operation has getParameterCount() = 3 input parameters and getResultCount() = 2 outputs, then the shape of the result is

[2, getParameterData(0).buffer_count * getParameterData(1).buffer_count * getParameterData(2).buffer_count]
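
The second dimension of the result shape is the number of result samples, which is the product of the sample counts of the input parameters. The following is a minimal sketch of that arithmetic. The helper name resultSampleCount and its plain-vector input are assumptions made for illustration; they are not part of the HEBench API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of HEBench): given the number of samples
// (buffer_count) held by each parameter data pack, returns the size of the
// second dimension of the result, that is, the number of result samples.
inline std::uint64_t resultSampleCount(const std::vector<std::uint64_t> &buffer_counts)
{
    assert(!buffer_counts.empty()); // an operation has at least one operand
    std::uint64_t total = 1;
    for (std::uint64_t count : buffer_counts)
        total *= count; // one result per combination of input samples
    return total;
}
```

With sample counts of 3 and 10 for a binary operation, this sketch yields 30 result samples per result component.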

Implements hebench::TestHarness::IDataLoader.

Definition at line 249 of file hebench_idata_loader.h.

◆ getResultData()

const hebench::APIBridge::DataPack & hebench::TestHarness::PartialDataLoader::getResultData ( std::uint64_t  param_position) const
override virtual

Data pack corresponding to the specified component of the result.

Parameters
[in] param_position: Zero-based position of the component to return.
Returns
The data pack containing the collection of samples for the specified component of the result.

Implements hebench::TestHarness::IDataLoader.

Definition at line 654 of file hebench_idata_loader.cpp.

◆ getResultFor()

IDataLoader::ResultDataPtr hebench::TestHarness::PartialDataLoader::getResultFor ( const std::uint64_t *  param_data_pack_indices)
override virtual

Retrieves the ground-truth result for the specified parameter sample indices.

Parameters
[in] param_data_pack_indices: Collection of indices for the data sample to use inside each parameter pack. The array pointed to must contain at least getParameterCount() elements.
Returns
A non-null pointer to a ResultData containing the ground-truth result corresponding to the specified parameter indices.
Exceptions
std::out_of_range if any index is out of range.
std::invalid_argument if param_data_pack_indices is null.
An instance of std::exception on any other error.

The shape of result is always 2D: [n = getResultCount(), ?], so, the result for an operation is

(result[0][r_i], result[1][r_i], ..., result[n-1][r_i])

where r_i is the index of the NativeDataBuffers for the result in the second dimension.
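
As a rough illustration of the tuple above, the sketch below collects one buffer per result component at a given r_i. The NativeDataBuffer and DataPack structs here are simplified stand-ins that model only the fields named on this page, and gatherResultComponents is a hypothetical helper, not a HEBench function.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-ins for the APIBridge structures; only the fields
// mentioned on this page are modeled.
struct NativeDataBuffer
{
    void *p;            // pointer to the sample data (may be null)
    std::uint64_t size; // size of the data in bytes
};

struct DataPack
{
    NativeDataBuffer *p_buffers; // array of data buffers
    std::uint64_t buffer_count;  // number of data buffers in p_buffers
};

// Hypothetical helper: collects (result[0][r_i], ..., result[n-1][r_i]),
// where result_packs[k] plays the role of getResultData(k).
inline std::vector<const NativeDataBuffer *>
gatherResultComponents(const std::vector<DataPack> &result_packs, std::uint64_t r_i)
{
    std::vector<const NativeDataBuffer *> components;
    components.reserve(result_packs.size());
    for (const DataPack &pack : result_packs)
    {
        assert(r_i < pack.buffer_count); // r_i indexes the second dimension
        components.push_back(&pack.p_buffers[r_i]);
    }
    return components;
}
```

The returned vector has one entry per result component, all taken at the same second-dimension index.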

See also
getResultIndex()

Implements hebench::TestHarness::IDataLoader.

Definition at line 665 of file hebench_idata_loader.cpp.

◆ getResultIndex()

std::uint64_t hebench::TestHarness::PartialDataLoader::getResultIndex ( const std::uint64_t *  param_data_pack_indices) const
override virtual

Computes the index of the result NativeDataBuffer given the indices of the input data.

Parameters
[in] param_data_pack_indices: Collection of indices for the data sample to use inside each parameter pack. The array pointed to must contain at least getParameterCount() elements.
Returns
The index for the ground-truth result corresponding to the specified parameter indices.
Exceptions
std::out_of_range if any index is out of range.
std::invalid_argument if param_data_pack_indices is null.

The shape of result is always 2D: [n = getResultCount(), ?], so, the result for an operation is

(result[0][r_i], result[1][r_i], ..., result[n-1][r_i])

where r_i is the index of the data buffer for the result in the second dimension.

By definition in the specification, the result index (r_i) is computed in row-major fashion, where the index of the last parameter varies fastest.

In general:

r_i = param_data_pack_indices[0]
for i in [1..getParameterCount() - 1]
    r_i = param_data_pack_indices[i] + getParameterData(i).buffer_count * r_i

For example, given the operation op() that returns a result of the shape [1, ?]:

param_count[0] = 3
param_count[1] = 10
Param[0][param_count[0]]
Param[1][param_count[1]]
param_data_pack_indices[0] = 2
param_data_pack_indices[1] = 3
Result = op(Param[0][param_data_pack_indices[0]], Param[1][param_data_pack_indices[1]])

Then, r_i, the index in the second dimension that corresponds to the NativeDataBuffer in getResultData(0) where the result of the operation using the specified indices will be placed, is computed in row-major order as:

r_i = param_data_pack_indices[0] * param_count[1] + param_data_pack_indices[1]
r_i = 2 * 10 + 3 = 23

then, Result is actually:

Result = [[ getResultData(0).p_buffers[r_i] ]]
or
Result = [[ getResultData(0).p_buffers[23] ]]

For complete details, see Ordering of Results Based on Input Batch Sizes.
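
The row-major computation described above can be sketched as a stand-alone function. This is an illustration under stated assumptions, not the library's implementation: computeResultIndex is a hypothetical name, and buffer_counts[i] stands in for getParameterData(i).buffer_count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-alone version of the index computation.
// buffer_counts[i] plays the role of getParameterData(i).buffer_count.
inline std::uint64_t computeResultIndex(const std::vector<std::uint64_t> &buffer_counts,
                                        const std::uint64_t *param_data_pack_indices)
{
    if (!param_data_pack_indices)
        throw std::invalid_argument("param_data_pack_indices is null");
    assert(!buffer_counts.empty()); // an operation has at least one operand

    std::uint64_t r_i = 0;
    for (std::size_t i = 0; i < buffer_counts.size(); ++i)
    {
        if (param_data_pack_indices[i] >= buffer_counts[i])
            throw std::out_of_range("sample index out of range");
        // Row-major accumulation: the last parameter varies fastest.
        r_i = param_data_pack_indices[i] + buffer_counts[i] * r_i;
    }
    return r_i;
}
```

Starting the accumulator at 0 and iterating from parameter 0 is equivalent to the pseudocode above, because the first iteration reduces to r_i = param_data_pack_indices[0]. For sample counts {3, 10} and indices {2, 3}, the function returns 2 * 10 + 3 = 23, matching the worked example.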

See also
hebench::APIBridge::Category::Offline

Implements hebench::TestHarness::IDataLoader.

Definition at line 693 of file hebench_idata_loader.cpp.

◆ getResultTempDataPacks() [1/3]

std::vector<std::shared_ptr<hebench::APIBridge::DataPack> > hebench::TestHarness::PartialDataLoader::getResultTempDataPacks ( ) const
inline protected

Retrieves a pre-allocated result providing memory space to store a single operation result sample.

Returns
Pre-allocated collection of DataPack. All fields in the Data packs and contained NativeDataBuffer are valid. Do not assign new values to any fields.
Exceptions
std::logic_error if this method is called before init() has been called.

This method must be called after init() has been called.

Clients can use the memory pointed to by the contained NativeDataBuffer to store the values of the corresponding result sample. The returned allocation is appropriate to store the first result sample of the operation based on the first input sample. Technically, buffers allocated for every possible result sample should have the same capacity, thus, requesting allocation for first result sample should work for all results.

Allocation occurs every time this method is called.

Returned Data packs and associated buffers will be automatically cleaned up when out of scope and no other copies are available.

Definition at line 411 of file hebench_idata_loader.h.

◆ getResultTempDataPacks() [2/3]

std::vector<std::shared_ptr<hebench::APIBridge::DataPack> > hebench::TestHarness::PartialDataLoader::getResultTempDataPacks ( const std::uint64_t *  param_data_pack_indices) const
inline protected

Retrieves a pre-allocated result providing memory space to store a single operation result sample.

Parameters
[in] param_data_pack_indices: For each operation parameter, this array indicates the sample index to use for the operation. Must contain at least getParameterCount() elements.
Returns
Pre-allocated collection of DataPack. All fields in the Data packs and contained NativeDataBuffer are valid. Do not assign new values to any fields.
Exceptions
std::logic_error if this method is called before init() has been called.
std::out_of_range if values in param_data_pack_indices fall out of range.

This method must be called after init() has been called.

Clients can use the memory pointed to by the contained NativeDataBuffer to store the values of the corresponding result sample. Technically, buffers allocated for every possible result sample should have the same capacity, thus, requesting allocation for first result sample should work for all results.

Allocation occurs every time this method is called.

Returned Data packs and associated buffers will be automatically cleaned up when out of scope and no other copies are available.

Definition at line 387 of file hebench_idata_loader.h.

◆ getResultTempDataPacks() [3/3]

std::vector< std::shared_ptr< hebench::APIBridge::DataPack > > hebench::TestHarness::PartialDataLoader::getResultTempDataPacks ( std::uint64_t  result_index) const
protected

Retrieves a pre-allocated result providing memory space to store a single operation result sample.

Parameters
[in] result_index: Result sample index on which to base the pre-allocated buffers.
Returns
Pre-allocated collection of DataPack. All fields in the Data packs and contained NativeDataBuffer are valid. Do not assign new values to any fields.
Exceptions
std::logic_error if this method is called before init() has been called.
std::out_of_range if result_index is out of range.

This method must be called after init() has been called.

Clients can use the memory pointed to by the contained NativeDataBuffer to store the values of the corresponding result sample. Technically, buffers allocated for every possible result_index should have the same capacity, thus, requesting allocation for result 0 should work for all results.

Allocation occurs every time this method is called.

Returned Data packs and associated buffers will be automatically cleaned up when out of scope and no other copies are available.

Definition at line 595 of file hebench_idata_loader.cpp.

◆ getTotalDataLoaded()

std::uint64_t hebench::TestHarness::PartialDataLoader::getTotalDataLoaded ( ) const
inline override virtual

Total data loaded by this loader in bytes.

Implements hebench::TestHarness::IDataLoader.

Definition at line 253 of file hebench_idata_loader.h.

◆ hasResults()

bool hebench::TestHarness::PartialDataLoader::hasResults ( ) const
inline

Retrieves whether buffers to contain output data have been allocated or not.

Returns
true if memory is allocated for outputs.
false otherwise.

If false, no memory was allocated to hold ground-truth results during initialization.

In that case, the NativeDataBuffer::p pointers of the buffers returned by getResultData() or getResultFor() are null. All other fields remain valid and describe the output dimensions, the size required to hold output data, etc. Clients may use the getResultTempDataPacks() method to obtain a temporary scratch memory allocation to hold a single result sample.
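
The fallback pattern can be sketched as follows. This is an assumption-laden illustration: NativeDataBuffer is a simplified stand-in modeling only the p and size fields, writableResultMemory is a hypothetical helper, and a caller-owned std::vector stands in for the scratch memory that real code would obtain from getResultTempDataPacks().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for the APIBridge structure.
struct NativeDataBuffer
{
    void *p;            // null when no output memory was allocated
    std::uint64_t size; // size required to hold the data, in bytes
};

// Hypothetical helper: returns memory where a result sample can be written.
// If the buffer has no backing memory (p is null), falls back to the
// caller-owned scratch storage, sized from the still-valid size field.
inline void *writableResultMemory(const NativeDataBuffer &buffer,
                                  std::vector<std::uint8_t> &scratch)
{
    if (buffer.p)
        return buffer.p;
    assert(buffer.size > 0); // size stays valid even when p is null
    scratch.resize(static_cast<std::size_t>(buffer.size));
    return scratch.data();
}
```

The point of the sketch is that the size field alone is enough to provision temporary storage when ground-truth buffers were not allocated.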

See also
init(), getResultTempDataPacks()

Definition at line 268 of file hebench_idata_loader.h.

◆ init() [1/2]

void hebench::TestHarness::PartialDataLoader::init ( const std::string &  filename,
hebench::APIBridge::DataType  data_type,
std::size_t  expected_input_dim,
const std::size_t *  max_input_sample_count_per_dim,
const std::uint64_t *  expected_input_count_per_dim,
std::size_t  expected_output_dim,
const std::uint64_t *  expected_output_count_per_dim 
)
protected

Loads a dataset from a file.

Parameters
[in] filename: File containing the dataset input and, optionally, expected outputs. The file must be one of the supported loader formats.
[in] data_type: Type of the data for the benchmark that will use the dataset.
[in] expected_input_dim: Dimension of the input (to become getParameterCount()).
[in] max_input_sample_count_per_dim: Maximum sample size requested for each input dimension. Must point to a buffer with enough space to contain at least expected_input_dim values.
[in] expected_input_count_per_dim: Expected number of elements of type data_type in an input sample vector for the corresponding dimension. Must point to a buffer with enough space to contain at least expected_input_dim values.
[in] expected_output_dim: Dimension of the output (to become getResultCount()).
[in] expected_output_count_per_dim: Expected number of elements of type data_type for each sample in the corresponding output dimension. Must point to a buffer with enough space to contain at least expected_output_dim values.
Exceptions
An instance of std::exception, or of a type derived from it, on error.

Note that max_input_sample_count_per_dim is an upper bound on the number of input samples per input component. Depending on the dataset file loaded, the actual sample size for a component may be less than the specified maximum value.

The number of output samples is the product of the number of samples per input component.

On error, this method throws an instance of std::exception or of a type derived from it.

Definition at line 431 of file hebench_idata_loader.cpp.

◆ init() [2/2]

void hebench::TestHarness::PartialDataLoader::init ( hebench::APIBridge::DataType  data_type,
std::size_t  input_dim,
const std::size_t *  input_sample_count_per_dim,
const std::uint64_t *  input_count_per_dim,
std::size_t  output_dim,
const std::uint64_t *  output_count_per_dim,
bool  allocate_output 
)
protected

Initializes dimensions of inputs and outputs. No allocation is performed.

Parameters
[in] data_type: Data type of elements contained in the samples for this dataset.
[in] input_dim: Dimension of the input (to become getParameterCount()).
[in] input_sample_count_per_dim: For each dimension in the input, the number of NativeDataBuffer samples that will be loaded, that is, the sample size for the input dimension. Must point to a buffer with enough space to contain at least input_dim values.
[in] input_count_per_dim: Array with the number of elements of type data_type for each sample in the corresponding input parameter. Must point to a buffer with enough space to contain at least input_dim values.
[in] output_dim: Dimension of the output (to become getResultCount()).
[in] output_count_per_dim: Array with the number of elements of type data_type for each sample in the corresponding output dimension. Must point to a buffer with enough space to contain at least output_dim values.
[in] allocate_output: Specifies whether buffers for output ground truth will be allocated.
Exceptions
An instance of std::exception, or of a type derived from it, on error.

After this call succeeds, all buffers for inputs and outputs have been allocated and point to the correct locations. It is the responsibility of the caller to initialize the values of the allocated buffers.

If allocate_output is false, no memory is allocated to hold ground truth results. When calling methods getResultData() or getResultFor(), the returned buffers in NativeDataBuffer::p will be null. However, all other fields will be valid in order to provide information regarding output dimensions, size required to hold output data, etc. Clients may use getResultTempDataPacks() method to obtain a temporary scratch memory allocation to hold a single result sample.

Definition at line 388 of file hebench_idata_loader.cpp.

◆ isInitialized()

bool hebench::TestHarness::PartialDataLoader::isInitialized ( ) const
inline

Definition at line 269 of file hebench_idata_loader.h.

Friends And Related Function Documentation

◆ PartialDataLoaderHelper

template<typename >
friend class PartialDataLoaderHelper
friend

Definition at line 240 of file hebench_idata_loader.h.


The documentation for this class was generated from the following files: