Computes xA^T + b, as per torch.nn.Linear. Applies a general matrix multiply: output(MxN) = input(MxK) * weights(KxN) + bias(N).
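
The formula above can be sketched as a scalar reference routine, useful for checking graph output on the host. This is a minimal sketch, not the kernel implementation: the name `gemm_relu_ref` is hypothetical, the row-major layouts are an assumption, and it assumes that `IS_RELU` clamps negative outputs to zero, which the class names suggest but this page does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference: output(MxN) = input(MxK) * weights(KxN) + bias(N).
// All matrices are assumed to be row-major. If is_relu is true, negative
// results are clamped to zero (assumed meaning of IS_RELU).
std::vector<float> gemm_relu_ref(const std::vector<float>& input,
                                 const std::vector<float>& weights,
                                 const std::vector<float>& bias,
                                 std::size_t M, std::size_t K, std::size_t N,
                                 bool is_relu) {
    assert(input.size() == M * K);
    assert(weights.size() == K * N);
    assert(bias.size() == N);

    std::vector<float> output(M * N);
    for (std::size_t m = 0; m < M; ++m) {
        for (std::size_t n = 0; n < N; ++n) {
            float acc = bias[n];  // start from the bias for this output column
            for (std::size_t k = 0; k < K; ++k) {
                acc += input[m * K + k] * weights[k * N + n];
            }
            if (is_relu && acc < 0.0f) {
                acc = 0.0f;
            }
            output[m * N + n] = acc;
        }
    }
    return output;
}
```

Calling `gemm_relu_ref(input, weights, bias, M, K, N, true)` returns the M x N result as a flat row-major vector that can be compared element by element with the graph output.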

Classes
class	GemmReluGraph< GEMM, M, K, N, IS_RELU >
	Single-instance graph that stores weights and biases. Maximum sizes are 16384 and 4096 bytes respectively.

class	GemmReluStreamGraph< GEMM, M, K, N, IS_RELU >
	Single-instance graph that streams weights and biases instead of storing them; significantly slower.

class	GemmReluMknkChunkGraph< GEMM, CONCAT, NCHUNK, M, K, N, IS_RELU >
	Multi-instance graph for MxK times NxK that stores weights and biases. Requires NxK weights, NCHUNK%8=0 and N%4=0. Chunks the NxK weights along the N dimension into chunks of size NCHUNK. Each instance stores at most 16384 bytes of weights and 4096 bytes of biases. Places at most 3x3 tiles: 8 conv tiles surrounding the concat tile (max AIE DMA inputs = 8). Padding is handled within the graph; the NPAD and KPAD parameters are not used.

class	GemmReluMkknChunkGraph< GEMM, CONCAT, NCHUNK, M, K, N, IS_RELU >
	Multi-instance graph for MxK times KxN that stores weights and biases. Requires KxN_RND weights, NCHUNK%8=0 and N%4=0. Chunks the KxN weights along the N dimension into chunks of size NCHUNK. Each instance stores at most 16384 bytes of weights and 4096 bytes of biases. Places at most 3x3 tiles: 8 conv tiles surrounding the concat tile (max AIE DMA inputs = 8).

class	GemmReluMkknChunkNStreamGraph< GEMM, CONCAT, NCHUNK, M, K, N, IS_RELU >
	Multi-instance graph for MxK times KxN that stores biases.
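
The chunking described for the multi-instance graphs can be sketched on the host as follows. This is an illustration under stated assumptions, not the graph code: it takes NCHUNK to be the chunk size along N (as the Template Parameters list states), assumes row-major KxN weights, and zero-fills columns past N in the last chunk. The names `chunk_count`, `slice_kxn_chunk` and `kMaxChunks` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Limit taken from the text: at most 8 compute tiles feed one concat tile.
constexpr std::size_t kMaxChunks = 8;

// Number of chunks needed to cover N columns with chunks of nchunk columns
// (ceiling division).
constexpr std::size_t chunk_count(std::size_t N, std::size_t nchunk) {
    return (N + nchunk - 1) / nchunk;
}

// Copy columns [n0, n0 + nchunk) of a row-major K x N weight matrix into a
// K x nchunk chunk. Columns that fall beyond N are left as zeros
// (assumed padding behaviour).
std::vector<float> slice_kxn_chunk(const std::vector<float>& weights,
                                   std::size_t K, std::size_t N,
                                   std::size_t n0, std::size_t nchunk) {
    assert(weights.size() == K * N);
    assert(chunk_count(N, nchunk) <= kMaxChunks);

    std::vector<float> chunk(K * nchunk, 0.0f);
    for (std::size_t k = 0; k < K; ++k) {
        for (std::size_t j = 0; j < nchunk && n0 + j < N; ++j) {
            chunk[k * nchunk + j] = weights[k * N + n0 + j];
        }
    }
    return chunk;
}
```

Each chunk would then be multiplied by the same MxK input on its own tile, and the per-chunk outputs concatenated along N to recover the MxN result.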

Detailed Description

Note: using std::conditional for kernel/graph typedefs results in an error in the graph hierarchy algorithm.

Template Parameters

GEMM	Gemm Kernel
CONCAT	Concat Kernel (if multi-instance)
NCHUNK	chunk size for N (if multi-instance)
M	number of rows of input matrix
K	number of cols of input matrix / number of rows of weight matrix
N	number of cols of weight matrix / size of bias vector
NPAD	number of cols of weight matrix / size of bias vector, padded to vector boundary, used if weights are KxN
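
The relationship between N and NPAD can be illustrated with a small host-side helper that rounds N up and zero-pads KxN weights. This is a sketch only: the vector width of 4 elements used in the usage note is an assumption (this page only states N%4=0 for the chunked graphs), zero-filling is an assumption, and `round_up` and `pad_kxn` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round n up to the next multiple of `multiple`.
constexpr std::size_t round_up(std::size_t n, std::size_t multiple) {
    return (n + multiple - 1) / multiple * multiple;
}

// Pad a row-major K x N weight matrix to K x NPAD by appending zero columns.
std::vector<float> pad_kxn(const std::vector<float>& weights,
                           std::size_t K, std::size_t N, std::size_t NPAD) {
    assert(weights.size() == K * N);
    assert(NPAD >= N);

    std::vector<float> padded(K * NPAD, 0.0f);
    for (std::size_t k = 0; k < K; ++k) {
        for (std::size_t n = 0; n < N; ++n) {
            padded[k * NPAD + n] = weights[k * N + n];
        }
    }
    return padded;
}
```

For example, with an assumed vector width of 4, `round_up(10, 4)` gives an NPAD of 12, and `pad_kxn(weights, K, 10, 12)` produces the padded weight buffer.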

Modules