Re: [gtk-list] g_memmove() (another RAM quickie)
- From: Erik Mouw <J A K Mouw its tudelft nl>
- To: gtk-list redhat com
- Subject: Re: [gtk-list] g_memmove() (another RAM quickie)
- Date: Fri, 10 Mar 2000 13:54:05 +0100 (MET)
On Thu, 09 Mar 2000 14:03:41 -0800 (PST), Derek Simkowiak wrote:
>
> #define g_memmove(dest, src, num_bytes)
>
> Does the difference between dest & src affect the execution time
> of a memmove() (or bcopy() )? That is, will
>
> /* move ten bytes not very far */
> g_memmove(pointer, pointer + 5, 10 )
>
> ...go faster than...
>
> /* move ten bytes a little farther */
> g_memmove(pointer, pointer + 9999, 10 )
>
> This would be for an overlapping region of memory.
>
> Again, any references/docs/FAQs to read are greatly appreciated.
The default answer is: it depends. It depends on the CPU, the
implementation, and the OS.

Most modern CPUs access memory in chunks of 4, 8 or 16 bytes. memmove()
can be fast if the src and dst pointers both start at the preferred memory
alignment, and if the number of bytes to be moved is a multiple of that
alignment. Non-aligned memory accesses have to be handled either in
hardware by the CPU (as the IA32 architecture does) or in software by the
OS (some ARM cores, and MIPS, IIRC).
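
To make the alignment condition concrete, here is a minimal sketch in C.
The helper names is_aligned() and can_copy_by_words() are made up for
this mail and do not come from GLib or glibc, and I simply assume that
sizeof(unsigned long) is the CPU's preferred access size, which is not
true on every platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p is a multiple of align.
 * align is assumed to be a power of two. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical helper: nonzero if the whole copy could be done in
 * word-sized chunks, i.e. both pointers are word aligned and the
 * byte count is a multiple of the word size.
 * Assumption: sizeof(unsigned long) is the preferred access size. */
static int can_copy_by_words(const void *dst, const void *src, size_t n)
{
    const size_t w = sizeof(unsigned long);

    return is_aligned(dst, w) && is_aligned(src, w) && (n % w) == 0;
}
```

A copy for which can_copy_by_words() returns zero is not necessarily
slow; an implementation can still copy a few leading and trailing bytes
one at a time and use aligned accesses for the middle part.
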
If the implementation takes the alignment into account, it can minimize
the number of misaligned memory accesses and move a complete aligned block
at a time. The implementation also has to deal with overlapping src and
dst regions. I could explain all possible combinations, but I'm sure you
can work them out yourself by taking a piece of paper and drawing all
possible overlap combinations (hint: it is important to distinguish
between overlaps larger and smaller than the CPU alignment).
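
As an illustration of the overlap handling only, here is a deliberately
naive byte-at-a-time sketch. The name sketch_memmove() is hypothetical;
this is not the glibc code and it ignores alignment completely. It only
shows how the copy direction is chosen so that source bytes are read
before they are overwritten.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, unoptimized memmove() look-alike for illustration. */
static void *sketch_memmove(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    if (d == s || n == 0)
        return dest;

    /* Compare the addresses as integers; relational comparison of
     * pointers into different objects is undefined in ISO C. */
    if ((uintptr_t)d < (uintptr_t)s) {
        /* dest lies below src: copy forward, lowest byte first. */
        while (n--)
            *d++ = *s++;
    } else {
        /* dest lies above src: copy backward, highest byte first. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dest;
}
```

Note that in this sketch the loop runs exactly n times whether src is
5 or 9999 bytes away from dest; the distance only selects the direction.
Whether the distance matters in a real implementation is again a
question of alignment, caches and the CPU.
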
If you're really interested, have a look at the glibc source. It contains
a generic C implementation that is correct and works on every CPU. Next to
that, there are a couple of hand-optimized (C or assembly) versions for
specific CPUs. Which strategy is best for a given CPU can be found in
the CPU documentation. The documentation for almost all Intel CPUs can be
found at http://developer.intel.com/ .
Erik
--
LART. 250 MIPS under one Watt.
http://www-ict.its.tudelft.nl/~erik/open-source/LART/