Float operations SqrtLower, MulSubAdd, GetExponent etc. #2425
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
There are targets other than SVE that have instructions for the GetExponent op, including AVX3/PPC9/PPC10. GetExponent can be implemented on AVX3 using _mm_getexp_ps/_mm_getexp_pd/_mm_getexp_ph, and GetExponent can be implemented on PPC9/PPC10 using vec_extract_exp (which returns the biased exponent) followed by a subtraction by
Nice. @mazimkhan, would you like to add either of those, or insert a TODO comment mentioning them?
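For reference, the "extract the biased exponent, then subtract the bias" approach mentioned above can be sketched in scalar C++. This is only an illustration, not code from this PR: `GetExponentScalar` is a hypothetical helper name, the input is assumed to be a normal, finite IEEE 754 binary32 value (bias 127), and returning an `int` is an assumption about the result type.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical scalar helper (not part of Highway or this PR).
// Returns the unbiased exponent of a normal, finite binary32 value.
// Zero, subnormals, infinities and NaN are NOT handled here.
inline int GetExponentScalar(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));  // bit copy, avoids aliasing UB
  // Bits 23..30 hold the biased exponent; the sign bit is masked away.
  const int biased = static_cast<int>((bits >> 23) & 0xFFu);
  return biased - 127;  // 127 is the IEEE 754 binary32 exponent bias
}
```

With this sketch, `GetExponentScalar(8.0f)` yields 3 and `GetExponentScalar(0.5f)` yields -1. A vector implementation would do the same per lane, and would still have to decide what to return for the special inputs excluded above.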
@@ -657,6 +657,10 @@ from left to right, of the arguments passed to `Create{2-4}`.
* `V`: `{f}` \
<code>V **Sqrt**(V a)</code>: returns `sqrt(a[i])`.

* `V`: `{f}` \
<code>V **SqrtLower**(V a)</code>: returns `sqrt(a[0])` in lowest lane and
As mentioned in other PRs, I think we'd be better off with a new First1 op, and removing the other newly added *Lower ops.
to, and potentially more efficient than, `IfThenElseZero(m, Add(a, b));` etc.

* `V`: `{f}` \
<code>V **MaskedSqrtOrZero**(M m, V a)</code>: returns `sqrt(a[i])` where
Let's also drop the OrZero suffix here.
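To make the lane-wise behaviour under discussion concrete, here is a scalar model of a masked square root that zeroes unselected lanes, by analogy with the `IfThenElseZero(m, ...)` wording in the doc hunk. It is a sketch under assumptions, not the SVE or generic implementation from this PR: `MaskedSqrtOrZeroModel` is a hypothetical name, and the zero result for unselected lanes is inferred from the `OrZero` suffix rather than from the truncated doc text.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical scalar model (not the Highway implementation).
// Lane i receives sqrt(a[i]) where m[i] is true, and zero otherwise
// (the zero fallback is an assumption based on the OrZero suffix).
template <std::size_t N>
std::array<float, N> MaskedSqrtOrZeroModel(const std::array<bool, N>& m,
                                           const std::array<float, N>& a) {
  std::array<float, N> out{};  // value-initialized: all lanes start at 0
  for (std::size_t i = 0; i < N; ++i) {
    if (m[i]) out[i] = std::sqrt(a[i]);
  }
  return out;
}
```

A model like this can act as the expected-value generator in a test: compute it for a given mask and input, then compare lane by lane against the vector op's output.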
@@ -1234,6 +1243,29 @@ HWY_SVE_FOREACH_F(HWY_SVE_RETV_ARGV, ApproximateReciprocal, recpe)
// ------------------------------ Sqrt
HWY_SVE_FOREACH_F(HWY_SVE_RETV_ARGPV, Sqrt, sqrt)

// ------------------------------ MaskedSqrt
namespace detail {
HWY_SVE_FOREACH_F(HWY_SVE_RETV_ARGMV_M, MaskedSqrt, sqrt)
If we rely on First1 and remove the SqrtLower wrapper, then we can instead expose this op as MaskedSqrtOr(V no, V a).
Introduces:
Operations have been written for `arm_sve-inl.h` and `generic_ops-inl.h`, and all operations have tests written for them.