Faster squaring of a vector of floats

I wonder whether there is a faster way (faster than the obvious one, a=a*a) to square the contents of a vector of floats. I remember that bit shifting could be used somehow, but I'm not sure whether this works for floats. Any suggestions on how to achieve this are welcome (perhaps with code). If you are not sure about floats, you may assume unsigned int.
Thanks
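Bit shifting can be used to multiply/divide integers by powers of 2 (2, 4, 8, etc.). It does not work for floats, and it does not help with squaring, since a*a is not a multiplication by a fixed power of 2. Even for integers you're better off writing the multiplication/division for clarity, as the compiler may optimize the code and turn your multiplication into a bit shift on its own.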
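To illustrate the integer case, here is a minimal sketch showing that a left shift by k gives the same result as multiplying by 2^k for unsigned integers. The function names `times_pow2_shift` and `times_pow2_mul` are made up for this example, and it assumes k is smaller than 32.

```cpp
#include <cstdint>

// Multiply an unsigned 32-bit integer by 2^k using a left shift.
// Assumption: k < 32 (shifting by 32 or more is undefined behaviour).
constexpr std::uint32_t times_pow2_shift(std::uint32_t x, unsigned k) {
    return x << k;
}

// The same operation written as an ordinary multiplication.
// An optimizing compiler will typically emit a shift for this anyway.
constexpr std::uint32_t times_pow2_mul(std::uint32_t x, unsigned k) {
    return x * (std::uint32_t{1} << k);
}
```

Both functions return 40 for x = 5 and k = 3. Neither helps with squaring, because the multiplier in a*a is the value itself rather than a power of 2.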

Anyway... no. Just do a*a.
If you compile for an SSE target (i.e. if you compile for amd64/intel64 or if you specify -msse) with optimization enabled, the compiler can usually vectorize the loop with SSE automatically, which greatly speeds things up.
Verify this by checking that the generated assembly uses the mulps instruction, which multiplies four packed floats at once (mulss is the scalar version).
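A minimal sketch of the kind of loop a compiler can usually auto-vectorize is shown here. The function name `square_in_place` is made up for this example. Whether mulps actually appears depends on your compiler, its version and the optimization flags; with GCC, something like `g++ -O3 -S` writes out the assembly so you can check.

```cpp
#include <cstddef>
#include <vector>

// Square every element of v in place with a plain a = a * a loop.
// The loop body is simple and has no cross-iteration dependencies,
// which is what lets an optimizing compiler vectorize it.
void square_in_place(std::vector<float>& v) {
    float* p = v.data();
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n; ++i) {
        p[i] = p[i] * p[i];
    }
}
```

Call it as `square_in_place(a);` on a `std::vector<float> a`. The vector is modified in place, so no extra memory is allocated.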
This seems like a fairly simple operation, so I do not understand why it would be a good place to optimize. I think you should worry about other aspects of the program first.