The template will work for all integral types and for any class type that overloads operator%. Your functions only work for int and narrower types (no support for unsigned, long long, or custom types).
I'm not sure what you mean by "How is modulus implemented in C/C++?" Could you not just test it?
Also, I think the compiler would optimize your code and mine to the same machine code; yours is just more obfuscated.
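For anyone reading along, here is a minimal sketch of the kind of template being discussed. The name is_odd and the exact body are my assumptions, not necessarily the code posted earlier; the point is only that a single operator% expression covers every integral width as well as any class type that provides a compatible operator%.

```cpp
// Hypothetical reconstruction: the name is_odd and this body are assumptions,
// not the code posted earlier in the thread.
template <typename T>
constexpr bool is_odd(const T& x)
{
    // Only requires that T supports operator% with an int right-hand side
    // and that the result can be compared against 0.
    return x % 2 != 0;
}

// The same template covers signed, unsigned and wide integer types.
static_assert(is_odd(3), "int");
static_assert(is_odd(3u), "unsigned int");
static_assert(is_odd(-3LL), "negative long long");
static_assert(!is_odd(4ULL), "unsigned long long");
```

Comparing against 0 rather than 1 is deliberate: it keeps the result correct for negative odd values, where x % 2 is -1.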
For any reasonable numeric value of x in Z, both (x & 1) and (x % 2) agree on whether x is odd: each is nonzero exactly when x is odd. Otherwise the overloading type T cannot claim to be numeric.
[edit] Compilers should optimize both to the same code, but the first is a well-known idiom because older, not-so-smart compilers may handle the second poorly.
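A quick way to check the two expressions against each other is a handful of compile-time assertions. This is only a sketch: the helper names low_bit and remainder2 are made up for illustration, and the negative case assumes a two's complement int (guaranteed since C++20, and true of mainstream compilers before that).

```cpp
// Hypothetical helper names, introduced only for this comparison.
constexpr int low_bit(int x)    { return x & 1; }
constexpr int remainder2(int x) { return x % 2; }

// Non-negative values: both expressions give the same number.
static_assert(low_bit(5) == 1 && remainder2(5) == 1, "odd, positive");
static_assert(low_bit(4) == 0 && remainder2(4) == 0, "even, positive");

// Negative odd values: % truncates toward zero (since C++11), so the
// remainder is -1, while the low bit of a two's complement int is 1.
static_assert(remainder2(-3) == -1, "remainder keeps the sign of x");
static_assert(low_bit(-3) == 1, "assumes two's complement");

// Used as an oddness test (nonzero means odd), the two still agree.
static_assert((low_bit(-3) != 0) == (remainder2(-3) != 0), "same parity");
```

So the two are interchangeable as a truth test, but comparing the result against 1 is only safe with the bitwise form.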
What about (x % 2_Number), where there exists an operator% with a left-hand type of float/double/long double and a right-hand type of whatever 2_Number is? You can't do bitwise operations on floating-point types.
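To make that concrete, here is a rough sketch of such a type. Everything in it is hypothetical: the Number struct, the _Number literal, and the truncation-based remainder are assumptions chosen to keep the example short, and the remainder is only meant for small values that fit in a long long.

```cpp
// Hypothetical type: Number, the _Number literal and this operator% are
// illustrative assumptions, not code from the thread.
struct Number { unsigned long long value; };

// User-defined literal, so that 2_Number produces a Number holding 2.
constexpr Number operator""_Number(unsigned long long v) { return Number{v}; }

// Floating-point left-hand side, custom right-hand side.
// Truncation-based remainder; a sketch for small values only.
constexpr double operator%(double lhs, Number rhs)
{
    return lhs - static_cast<double>(rhs.value) *
                 static_cast<double>(
                     static_cast<long long>(lhs / static_cast<double>(rhs.value)));
}

static_assert(5.5 % 2_Number == 1.5, "5.5 = 2 * 2 + 1.5");
static_assert(4.0 % 2_Number == 0.0, "4.0 is an exact multiple of 2");

// By contrast, (5.5 & 1) does not compile: the built-in bitwise operators
// do not accept floating-point operands.
```

With something like this in scope, the % form keeps working for a double left-hand side while the & form is rejected by the compiler.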
_{And once again LB finds an insane way to prove a statement wrong.}