Linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point, and computes the full-precision tensor using the formula y = (x - x_zero_point) * x_scale. x_scale and x_zero_point must have the same shape: either a scalar for per-tensor (per-layer) quantization, or a 1-D tensor for per-axis quantization. x_zero_point and x must have the same type. x and y must have the same shape.
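The per-tensor case of the formula can be sketched as a plain scalar loop. This is only an illustrative reference, not the kernel's actual interface: the function name `dequantize_linear_ref` and the use of `std::vector` are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not part of the documented kernel API).
// Applies y = (x - x_zero_point) * x_scale with a scalar scale and zero point,
// i.e. per-tensor quantization. TT is expected to be int8_t or uint8_t.
template <typename TT>
std::vector<float> dequantize_linear_ref(const std::vector<TT>& x,
                                         float x_scale,
                                         TT x_zero_point) {
    std::vector<float> y(x.size());  // y has the same shape as x
    for (std::size_t i = 0; i < x.size(); ++i) {
        // Widen to int before subtracting so the difference cannot wrap in 8 bits.
        const int centered = static_cast<int>(x[i]) - static_cast<int>(x_zero_point);
        y[i] = static_cast<float>(centered) * x_scale;
    }
    return y;
}
```

For example, with `TT = uint8_t`, a zero point of 128, and a scale of 0.5, the input value 130 dequantizes to (130 - 128) * 0.5 = 1.0. Per-axis quantization would index `x_scale` and `x_zero_point` along the chosen axis instead of using scalars.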
Template Parameters

| Parameter | Description |
| --- | --- |
| DEQUANTIZE_LINEAR | DequantizeLinear Kernel |
| TT | int8_t or uint8_t |
| B | batch size |
| INP_W | input width |
| OUT_W | output width, expect OUT_W > INP_W |