Optimize byte-aligned copies in copy_bitwise()

The function copy_bitwise, which is used for copying DWARF pieces, may
be invoked for large chunks of data.  For instance, consider a large
struct with one member currently located in a register.  In this case
copy_bitwise still pushes all of the data through its shift-and-store
loop, one byte at a time, which is much slower than necessary.

This change uses memcpy for the middle part instead, if it is
byte-aligned.
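
As a rough illustration of why the two forms should be interchangeable
in the aligned case, the sketch below compares a simplified version of
the old loop with the memcpy variant.  The helpers middle_loop,
middle_memcpy and variants_agree are hypothetical names that exist only
in this example, not in dwarf2loc.c, and the loop is reduced to a plain
byte copy on the assumption that nothing is pending in the bit buffer
(avail == 0).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the old loop in the aligned case: with
   avail == 0 and an empty buffer, each iteration copies one byte.  */
static void
middle_loop (unsigned char **dest, const unsigned char **source,
             size_t len, int bits_big_endian)
{
  while (len--)
    {
      if (bits_big_endian)
        *(*dest)-- = *(*source)--;
      else
        *(*dest)++ = *(*source)++;
    }
}

/* Hypothetical stand-in for the memcpy variant in this patch.  */
static void
middle_memcpy (unsigned char **dest, const unsigned char **source,
               size_t len, int bits_big_endian)
{
  if (bits_big_endian)
    {
      /* Pointers walk downwards, so the block ends at the current
         byte and starts LEN - 1 bytes below it.  */
      *dest -= len;
      *source -= len;
      memcpy (*dest + 1, *source + 1, len);
    }
  else
    {
      memcpy (*dest, *source, len);
      *dest += len;
      *source += len;
    }
}

/* Return 1 if both variants write the same bytes and leave the
   pointers at the same position, 0 otherwise.  */
static int
variants_agree (int bits_big_endian)
{
  unsigned char src[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  unsigned char out_a[8] = { 0 }, out_b[8] = { 0 };
  size_t len = 5;
  size_t start = bits_big_endian ? 6 : 1;

  /* Keep every pointer inside the arrays.  */
  assert (bits_big_endian ? start >= len : start + len <= sizeof src);

  unsigned char *da = out_a + start, *db = out_b + start;
  const unsigned char *sa = src + start, *sb = src + start;

  middle_loop (&da, &sa, len, bits_big_endian);
  middle_memcpy (&db, &sb, len, bits_big_endian);

  return (memcmp (out_a, out_b, sizeof out_a) == 0
          && da - out_a == db - out_b
          && sa == sb);
}
```

Calling variants_agree (0) and variants_agree (1) exercises both
directions.  The big-endian case is the one worth checking, since the
pointers must end up LEN bytes below where they started, exactly as
they would after the loop.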

gdb/ChangeLog:

	* dwarf2loc.c (copy_bitwise): Use memcpy for the middle part, if
	it is byte-aligned.
diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index d667a37..51787ad 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,4 +1,9 @@
 2016-11-24  Andreas Arnez  <arnez@linux.vnet.ibm.com>
+
+	* dwarf2loc.c (copy_bitwise): Use memcpy for the middle part, if
+	it is byte-aligned.
+
+2016-11-24  Andreas Arnez  <arnez@linux.vnet.ibm.com>
 	    Pedro Alves  <palves@redhat.com>
 
 	* dwarf2loc.c (bits_to_str, check_copy_bitwise)
diff --git a/gdb/dwarf2loc.c b/gdb/dwarf2loc.c
index 61f197b..128f654 100644
--- a/gdb/dwarf2loc.c
+++ b/gdb/dwarf2loc.c
@@ -1548,11 +1548,30 @@
     {
       size_t len = nbits / 8;
 
-      while (len--)
+      /* Use a faster method for byte-aligned copies.  */
+      if (avail == 0)
 	{
-	  buf |= *(bits_big_endian ? source-- : source++) << avail;
-	  *(bits_big_endian ? dest-- : dest++) = buf;
-	  buf >>= 8;
+	  if (bits_big_endian)
+	    {
+	      dest -= len;
+	      source -= len;
+	      memcpy (dest + 1, source + 1, len);
+	    }
+	  else
+	    {
+	      memcpy (dest, source, len);
+	      dest += len;
+	      source += len;
+	    }
+	}
+      else
+	{
+	  while (len--)
+	    {
+	      buf |= *(bits_big_endian ? source-- : source++) << avail;
+	      *(bits_big_endian ? dest-- : dest++) = buf;
+	      buf >>= 8;
+	    }
 	}
       nbits %= 8;
     }