HEBench
This operation is defined as:

$$M = M_0 \times M_1$$

where $\times$ is the standard matrix multiplication operation.
Input: 2 parameters
Parameter | Description |
---|---|
0 | M0 is a matrix with m rows and n columns (m x n). |
1 | M1 is an n x p matrix. |
Output: 1 output
Output | Description |
---|---|
0 | M is an m x p matrix. It is the result of multiplying M0 and M1. |
If A[i][j] denotes the element at row i and column j in matrix A, then the standard matrix multiplication operation is defined as:

$$M[i][j] = \sum_{k=0}^{n-1} M_0[i][k] \cdot M_1[k][j], \quad 0 \le i < m, \; 0 \le j < p$$
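As a rough illustration of the definition above, the following sketch computes the product on cleartext values. The `Matrix` alias and the `matmul` function are hypothetical names chosen for this example and are not part of the HEBench API. The element type `double` is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical alias for this example: a matrix as a vector of rows.
using Matrix = std::vector<std::vector<double>>;

// Computes M = M0 x M1, where M0 is m x n and M1 is n x p.
// Each output element is M[i][j] = sum over k of M0[i][k] * M1[k][j].
Matrix matmul(const Matrix& m0, const Matrix& m1)
{
    const std::size_t m = m0.size();
    const std::size_t n = m1.size();
    const std::size_t p = (n == 0) ? 0 : m1[0].size();

    // Every row of M1 must have p columns.
    for (std::size_t k = 0; k < n; ++k)
        assert(m1[k].size() == p);

    Matrix result(m, std::vector<double>(p, 0.0));
    for (std::size_t i = 0; i < m; ++i)
    {
        // The number of columns in M0 must match the number of rows in M1.
        assert(m0[i].size() == n);
        for (std::size_t j = 0; j < p; ++j)
        {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                sum += m0[i][k] * m1[k][j];
            result[i][j] = sum;
        }
    }
    return result;
}
```

A backend could use a routine like this to check decoded results against cleartext inputs. It is not meant to reflect how any particular backend evaluates the operation.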
This document applies to the following workloads:
Required workload parameters: 3
Index | Name | Type | Description |
---|---|---|---|
0 | m | uint64_t | Number of rows in matrix M0. |
1 | n | uint64_t | Number of columns in matrix M0. |
2 | p | uint64_t | Number of columns in matrix M1. |
The number of rows in M1 is the same as the number of columns in M0 (n), as per the definition of the operation.
The parameters above are required for the workload, in the specified order. A backend must specify at least one set of default arguments for these parameters.
Backends can require extra parameters beyond the base requirements. If a backend requires extra parameters, these must have default values in every set of default arguments for the workload parameters.
This workload supports the following categories:
See Latency Category.
See Offline Category.
Value ranges for elements in CategoryParams::offline::data_count. The default value is used when the backend implementation sets the data_count for the corresponding operand to 0 and the user specifies 0 or no value at run-time.
Parameter | Lower bound | Upper bound | Default |
---|---|---|---|
0 | 1 | none | 100 |
1 | 1 | none | 100 |
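The rule for the default value can be sketched as a small helper. The function name `resolve_data_count` is hypothetical. The text above only states when the default of 100 applies, so the other two branches are assumptions: a non-zero backend value is taken as fixed, and a non-zero user value is honored when the backend leaves the count at 0. A user value of 0 stands for both "0" and "no value specified".

```cpp
#include <cstdint>

// Default data_count for both operands, taken from the table above.
constexpr std::uint64_t kDefaultDataCount = 100;

// Hypothetical helper, not part of the HEBench API.
// backend_count: value the backend set for this operand (0 = unspecified).
// user_count:    value given at run-time (0 = "0" or no value specified).
std::uint64_t resolve_data_count(std::uint64_t backend_count,
                                 std::uint64_t user_count)
{
    if (backend_count != 0)
        return backend_count; // assumption: backend value takes precedence
    if (user_count != 0)
        return user_count;    // assumption: user value is honored
    return kDefaultDataCount; // stated rule: both are 0, use the default
}
```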
This workload is defined for the following data types:
Data layout in memory for a matrix with elements of type T (where T is any of the supported types) follows row-major ordering. All scalar elements in a matrix are contiguous in memory, with consecutive elements of a row stored next to each other.
For example, given the following matrix A (2 x 3):

$$A = \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \end{bmatrix}$$

The elements will be stored in memory as:
Offset: | 0 | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|---|
A | a00 | a01 | a02 | a10 | a11 | a12 |
Backends should expect this layout for their raw, clear text inputs, and must generate this layout for their decoded outputs.
If a single pointer refers to several matrices, consecutive matrices follow each other in memory.
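The row-major layout can be expressed as an offset computation. The helper below is a minimal sketch with a hypothetical name, `element_offset`. It assumes that all matrices behind one pointer share the same dimensions and are packed without padding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the HEBench API.
// Returns the offset, in elements, of A[i][j] within a buffer that holds
// several rows x cols matrices back to back in row-major order.
std::uint64_t element_offset(std::uint64_t matrix_index,
                             std::uint64_t rows, std::uint64_t cols,
                             std::uint64_t i, std::uint64_t j)
{
    assert(i < rows && j < cols);
    // Skip whole matrices first, then whole rows, then columns in the row.
    return matrix_index * rows * cols + i * cols + j;
}
```

For the 2 x 3 example above, `element_offset(0, 2, 3, 1, 0)` yields 3, which matches the position of a10 in the table.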
Supported modes:
Generate | External |
---|---|
yes | yes |
Data generation for the matrices used as input for this workload occurs during workload initialization by the Test Harness. Ground truths are pre-computed during data generation. There is no standard dataset.
During data generation, all matrix elements are drawn from a pseudo-random normal distribution with mean 0 and standard deviation 10: N(0, 10).
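A minimal sketch of this kind of data generation, using the C++ standard library, might look as follows. The function name `generate_matrix`, the `std::mt19937_64` engine, the explicit seed parameter, and the `double` element type are all assumptions made for this example. The generator that the Test Harness actually uses is not specified here.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper, not part of the HEBench API.
// Fills a rows x cols matrix, stored as a flat row-major buffer, with
// samples from a normal distribution with mean 0 and standard deviation 10.
std::vector<double> generate_matrix(std::uint64_t rows, std::uint64_t cols,
                                    std::uint64_t seed)
{
    std::mt19937_64 engine(seed);                      // assumed engine
    std::normal_distribution<double> dist(0.0, 10.0);  // mean, std deviation

    std::vector<double> data(rows * cols);
    for (double& value : data)
        value = dist(engine);
    return data;
}
```

Passing the same seed reproduces the same matrix, which can be convenient when comparing results across runs.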