
floatn fast_normalize(floatn p);
DESCRIPTION

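Returns a vector in the same direction as p but with a length of 1. fast_normalize is computed as:

p * half_rsqrt(p.x^2 + p.y^2 + ...)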
The result shall be within 8192 ulps error from the infinitely precise result of:

if (all(p == 0.0f)) result = p; else result = p / sqrt(p.x^2 + p.y^2 + ...);
with the following exceptions:
 1. If the sum of squares is greater than FLT_MAX, then the floating-point values in the result vector are undefined.
 2. If the sum of squares is less than FLT_MIN, then the implementation may return p.
 3. If the device is in 'denorms are flushed to zero' mode, individual operand elements with magnitude less than sqrt(FLT_MIN) may be flushed to zero before proceeding with the calculation.
NOTES
The built-in geometric functions operate component-wise. The description is per-component. floatn is float, float2, float3, or float4 and doublen is double, double2, double3, or double4. The built-in geometric functions are implemented using the round-to-nearest-even rounding mode.
The geometric functions can be implemented using contractions such as mad(3clc) or fma(3clc).
SPECIFICATION
OpenCL Specification[1]
AUTHORS
The Khronos Group
COPYRIGHT
Copyright © 2007-2011 The Khronos Group Inc.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and/or associated documentation files (the "Materials"), to deal in the Materials without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Materials, and to permit persons to whom the Materials are furnished to do so, subject to the condition that this copyright notice and permission notice shall be included in all copies or substantial portions of the Materials.
NOTES
 1. OpenCL Specification
    page 262, section 6.12.5 Geometric Functions