Vector implementation for Hx4 QLinearConv, using a 32-bit scale for precision.

Requires data to be arranged as [a,b,c,d,e,f,g,h,i] -> [a,b,c,0, d,e,f,0, g,h,i,0, 0,0,0,0].
Requires bias to be shifted, i.e. tbias - tw.reshape(M,-1).sum(1) * X_zero_point.
Requires KW<=4, INP_W%16==0, OUT_W_PAD%16==0, STEP_H==1|2, STEP_W==1|2.

QLinearConvHx4StreamScale32bit<28,48,28,32,1,1,1,1,8,3> total = 8508
QLinearConvHx4StreamScale32bit<26,32,13,16,2,2,1,1,8,3> total = 4332
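The arrangement and bias requirements above can be sketched as host-side preprocessing. This is a minimal illustration, not code from qlinearconv.h: the helper names pad_kernel_hx4 and shift_bias are hypothetical, int8_t weights are assumed, and the arranged data is assumed to be the weights of one kernel. Padding the total up to a multiple of 16 values is a generalisation of the single 3x3 example in the description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pad one KH x KW kernel so that every row holds 4 values,
// then zero-fill up to a multiple of 16 values (assumption based on the 3x3 example).
std::vector<int8_t> pad_kernel_hx4(const std::vector<int8_t>& w,
                                   std::size_t kh, std::size_t kw) {
    assert(kw <= 4);               // documented requirement: KW<=4
    assert(w.size() == kh * kw);   // one kernel, row-major
    const std::size_t padded = ((kh * 4 + 15) / 16) * 16;
    std::vector<int8_t> out(padded, 0);
    for (std::size_t r = 0; r < kh; ++r) {
        for (std::size_t c = 0; c < kw; ++c) {
            out[r * 4 + c] = w[r * kw + c];
        }
    }
    return out;
}

// Hypothetical helper: bias[m] - sum(weights of output channel m) * x_zero_point,
// mirroring tbias - tw.reshape(M,-1).sum(1) * X_zero_point.
std::vector<int32_t> shift_bias(const std::vector<int32_t>& bias,
                                const std::vector<int8_t>& weights,
                                std::size_t m, int32_t x_zero_point) {
    assert(m > 0 && bias.size() == m && weights.size() % m == 0);
    const std::size_t per_channel = weights.size() / m;
    std::vector<int32_t> out(m);
    for (std::size_t i = 0; i < m; ++i) {
        int32_t sum = 0;
        for (std::size_t j = 0; j < per_channel; ++j) {
            sum += weights[i * per_channel + j];
        }
        out[i] = bias[i] - sum * x_zero_point;
    }
    return out;
}
```

Because the padding values are zero, the per-channel weight sum is the same whether shift_bias is given the original or the padded weights.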

#include <qlinearconv.h>

Public Member Functions
	QLinearConvHx4StreamScale32bit (int32_t(&b)[M], float x_scale, float w_scale, float y_scale, TT x_zero, TTPARAM w_zero, TT y_zero)

void	filter (input_window< TT > in, input_stream< TTPARAM > weights, output_stream< TT > *out)

Static Public Member Functions
static void	registerKernelClass ()
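
The constructor takes x_scale, w_scale and y_scale, and the class description mentions a 32-bit scale. One plausible way to combine them, following the ONNX QLinearConv definition y = saturate(round(acc * x_scale * w_scale / y_scale) + y_zero), is sketched below. The fixed-point representation, the names FixedScale, make_fixed_scale and requantize, the choice of shift, and the use of int8_t for TT are all assumptions made for illustration; the actual kernel may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical fixed-point multiplier: real_scale ~= scale / 2^shift.
struct FixedScale {
    int32_t scale;
    int shift;
};

// Combine the three float scales into one 32-bit integer multiplier (assumed scheme).
FixedScale make_fixed_scale(float x_scale, float w_scale, float y_scale, int shift) {
    assert(shift > 0 && shift < 31);
    const double real = static_cast<double>(x_scale) * w_scale / y_scale;
    const long long q = std::llround(real * std::ldexp(1.0, shift));
    assert(q >= INT32_MIN && q <= INT32_MAX);  // must fit the 32-bit scale
    return {static_cast<int32_t>(q), shift};
}

// Scale a 32-bit accumulator, round, add the output zero point and saturate to int8.
int8_t requantize(int32_t acc, FixedScale s, int8_t y_zero) {
    const int64_t prod = static_cast<int64_t>(acc) * s.scale;
    // Round half up; relies on arithmetic right shift for negative values,
    // which is guaranteed from C++20 and is the common behaviour before it.
    const int64_t rounded = (prod + (int64_t{1} << (s.shift - 1))) >> s.shift;
    const int64_t y = rounded + y_zero;
    return static_cast<int8_t>(std::max<int64_t>(-128, std::min<int64_t>(127, y)));
}
```

A larger shift keeps more fractional bits of the combined scale, at the cost of headroom in the 32-bit multiplier.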

The documentation for this class was generated from the following files:

design/aie_src/qlinearconv.h
design/aie_src/qlinearconv.cc