- floatn fast_normalize(floatn p);
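Returns a vector in the same direction as p but with a length of 1. fast_normalize is computed as:
p * half_rsqrt(p.x^2 + p.y^2 + ...)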
The result shall be within 8192 ulps error from the infinitely precise result of:
if (all(p == 0.0f)) result = p; else result = p / sqrt(p.x^2 + p.y^2 + ...);
with the following exceptions:
- 1. If the sum of squares is greater than FLT_MAX, then the floating-point values in the result vector are undefined.
- 2. If the sum of squares is less than FLT_MIN, then the implementation may return p.
- 3. If the device is in 'denorms are flushed to zero' mode, individual operand elements with magnitude less than sqrt(FLT_MIN) may be flushed to zero before proceeding with the calculation.
Built-in geometric functions operate component-wise. The description is per-component. floatn is float, float2, float3, or float4 and doublen is double, double2, double3, or double4. The built-in geometric functions are implemented using the round to nearest even rounding mode.
The geometric functions can be implemented using contractions such as mad or fma.
Copyright © 2007-2011 The Khronos Group Inc.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and/or associated documentation files (the "Materials"), to deal in the Materials without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Materials, and to permit persons to whom the Materials are furnished to do so, subject to the condition that this copyright notice and permission notice shall be included in all copies or substantial portions of the Materials.
- page 262, section 6.12.5 - Geometric Functions