Re: [Patches] [PATCH] ARM: NEON detected memcpy.
- To: "Joseph S. Myers" <joseph@xxxxxxxxxxxxxxxx>
- Subject: Re: [Patches] [PATCH] ARM: NEON detected memcpy.
- From: Richard Earnshaw <rearnsha@xxxxxxx>
- Date: Tue, 09 Apr 2013 10:04:56 +0100
On 03/04/13 16:08, Joseph S. Myers wrote:
> I was previously told by people at ARM that NEON memcpy wasn't a good idea
> in practice because of raised power consumption, context switch costs etc.
> from using NEON in processes that otherwise didn't use it, even if it
> appeared superficially beneficial in benchmarks.
What really matters is system power increase vs performance gain and
what you might be able to save if you finish sooner. If a 10%
improvement to memcpy performance comes at a 12% increase in CPU power,
then that might seem like a net loss. But if the CPU is only 50% of the
system power, then the increase in system power is just half of
that (i.e. 6%), but the performance improvement will still be 10%. Note
that 50% is just an example to make the figures easier here; I've no
idea what the real numbers are, and they will be highly dependent on
the other components in the system: a back-lit display, in particular,
will use a significant amount of power.
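
As a rough illustration of the arithmetic above, here is a small Python sketch. It is not a measurement of any real system. The function names are hypothetical, and it assumes that energy is simply average power multiplied by run time, that a 10% performance improvement means 1.10x throughput, and that the rest of the system draws constant power.

```python
def system_power_increase(cpu_power_increase, cpu_share):
    """Fractional rise in system power when only the CPU's draw rises.

    cpu_power_increase: fractional rise in CPU power (0.12 means +12%)
    cpu_share: fraction of total system power drawn by the CPU (0.5 means 50%)
    """
    return cpu_power_increase * cpu_share


def energy_ratio(speedup, cpu_power_increase, cpu_share):
    """Energy used for the same work, relative to the baseline.

    Assumes energy = power * time and that a speedup of 0.10 means the
    work finishes in 1 / 1.10 of the original time.
    """
    power_ratio = 1.0 + system_power_increase(cpu_power_increase, cpu_share)
    time_ratio = 1.0 / (1.0 + speedup)
    return power_ratio * time_ratio


# Example figures from the text: +10% performance, +12% CPU power.
cpu_only = energy_ratio(0.10, 0.12, cpu_share=1.0)    # CPU is the whole system
half_share = energy_ratio(0.10, 0.12, cpu_share=0.5)  # CPU is 50% of the system

print(f"system power increase at 50% share: {system_power_increase(0.12, 0.5):.0%}")
print(f"energy ratio, CPU only:  {cpu_only:.3f}")
print(f"energy ratio, 50% share: {half_share:.3f}")
```

Under these assumptions a ratio above 1 is a net energy loss and a ratio below 1 is a net saving, so the same change looks like a loss when the CPU is considered alone and like a gain once the rest of the system is counted.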
It's also necessary to think about how the Neon unit in the processor is
managed. Is it power gated or simply clock gated? Power-gated regions
are likely to have long power-up times (relative to normal CPU
operations), but clock-gated regions are typically instantaneously
available.
Finally, you need to consider whether the unit is likely to be already
in use. With the increasing trend towards using the hard-float ABI, VFP (and
Neon) are generally much more widely used in code now than they were, so
the other potential cost of using Neon (lazy context switching) is also
likely to be much less of an issue than it would be if the unit were
almost never touched.