Modulo polynomial multiplication is an essential operation in finite field arithmetic. Polynomial functions can be represented as tensors, which can serve as basic building blocks for various lattice-based post-quantum cryptography schemes. This paper presents a novel tensor-based modulo multiplication method for multivariate polynomials over GF(2^m) and its realization on an FPGA hardware platform. The proposed method consumes 6.5× less power and achieves more than 6× speedup compared to other contemporary single-variable polynomial multiplication implementations. The method is embarrassingly parallel and scales easily to multivariate polynomials. Polynomial functions of nine variables, where each variable is of degree 128, are tested with the proposed multiplier, and the corresponding area, power, and power-delay-area product (PDAP) are presented. The computational complexities of single-variable and multivariate polynomial multiplication are O(n) and O(n^p), respectively, where n is the maximum degree of a polynomial having p variables. Owing to its high speed, low latency, and scalability, the proposed modulo multiplier can be used in a wide range of applications.
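For readers unfamiliar with the underlying operation, the following is a minimal Python sketch of single-variable modulo polynomial multiplication over GF(2), which is multiplication in GF(2^m) when the modulus is an irreducible polynomial of degree m. It uses the conventional bit-serial shift-and-add technique, not the tensor-based method proposed here. The function name `gf2m_mul`, the integer bit-packing of coefficients, and the example modulus x^8 + x^4 + x^3 + x + 1 are illustrative assumptions rather than details taken from the paper.

```python
def gf2m_mul(a: int, b: int, modulus: int) -> int:
    """Multiply polynomials a and b over GF(2), reduced modulo `modulus`.

    Bit i of each integer holds the coefficient of x^i (an assumed encoding).
    `a` is expected to be already reduced, i.e. of degree less than m.
    """
    m = modulus.bit_length() - 1  # degree of the modulus polynomial
    result = 0
    while b:
        if b & 1:
            result ^= a       # addition of coefficients in GF(2) is XOR
        b >>= 1               # move to the next coefficient of b
        a <<= 1               # multiply a by x
        if (a >> m) & 1:
            a ^= modulus      # degree reached m: subtract (XOR) the modulus
    return result


# Example modulus (assumed for illustration): x^8 + x^4 + x^3 + x + 1
MODULUS = 0x11B
product = gf2m_mul(0x57, 0x83, MODULUS)
```

With this modulus, `gf2m_mul(0x57, 0x83, MODULUS)` evaluates to `0xC1`. The loop runs once per coefficient of `b`, so this serial form performs one conditional addition and one reduction step per bit; it is meant only to make the arithmetic concrete, not to reflect a parallel hardware design.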