Endianness of padded scalar objects

Ray

2010-02-24 20:54:05 UTC

Hello,

Here is an issue I should know, but I just realized I'm not quite sure. Of
course, the endianness of a multibyte scalar object is defined by whether the
least significant byte occupies the lowest address (little endian) or the
highest address (big endian), (I'm not concerned about "middle endian" here).
And of course, taking the "sizeof" any object produces a count of the number
of bytes of storage used by that object. For most scalar types on most
implementations all of those storage bytes are used to actually represent the
object's value. That is, a 4-byte int actually occupies exactly 4 bytes of
storage, an 8-byte double actually occupies 8 bytes of storage. However, in
some cases more storage is used for an object than is actually used to
represent the object's value. For example, only 10 or 12 bytes may be needed
to represent the value of a long double, but on some implementations 6 or 4
additional bytes of padding may be used to enforce 16-byte memory alignment,
and when such a padded object is written to a file, all padding is included.
Assuming that my description is accurate, my concern is regarding the
appropriate way to reverse the endian of such an object. Obviously for an
object that uses all of its storage to represent its value, reversing endian
simply amounts to exchanging the lowest-addressed storage byte with the
highest-addressed storage byte and working your way toward the middle,
something like the code that follows. However, in the case of an object that
doesn't use all of its storage to represent its value, the padding byte(s)
will be swapped with some of the value bytes, which I believe is not what is
desired. Instead, the swap should begin with the highest-addressed byte that
is actually used for the object's value, totally ignoring the padding bytes
themselves. Am I correct in this assumption? Assuming I am correct, I'm at
a loss for a simple portable way to determine how many bytes are used for the
value and how many are used for padding. Of course this can be determined by
looking at the compiler's documentation, but then portability goes out the
window. I can also envision using some bit shifting scheme to determine
this, but this would only work for objects with integral types. What do you
think?

Thanks,
Sonny

#include <stddef.h>

/* Reverse all size bytes of the object at p, in place. */
void *ReverseEndian(void *p, size_t size)
{
    char *head = (char *)p;
    char *tail = head + size - 1;

    for (; tail > head; --tail, ++head)
    {
        char temp = *head;
        *head = *tail;
        *tail = temp;
    }
    return p;
}

int main(void)
{
    int x = 1;
    long double y = 1.0L;

    ReverseEndian((void *)&x, sizeof(x));
    ReverseEndian((void *)&y, sizeof(y));

    return 0;
}
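
For comparison, here is a minimal sketch of the alternative described in the post: reversing only the bytes that hold the value and leaving the rest alone. The function name reverse_value_bytes and its value_size parameter are hypothetical. The sketch assumes the value bytes come first (at the lowest addresses) and that the caller already knows how many there are, for example from the compiler's documentation; the language itself offers no way to ask.

```c
#include <stddef.h>

/* Reverse the first value_size bytes of the object at p, in place,
   leaving any bytes after them (assumed to be padding) untouched.
   value_size is a hypothetical, caller-supplied figure; it is an
   assumption about the implementation, not something C can report. */
void *reverse_value_bytes(void *p, size_t value_size)
{
    unsigned char *head = p;
    unsigned char *tail;

    if (value_size < 2)
        return p;               /* nothing to swap */

    tail = head + value_size - 1;
    for (; tail > head; --tail, ++head)
    {
        unsigned char temp = *head;
        *head = *tail;
        *tail = temp;
    }
    return p;
}
```

On an implementation where a long double holds 10 value bytes in 16 bytes of storage, the call would be reverse_value_bytes(&y, 10) rather than ReverseEndian(&y, sizeof(y)). Whether that is the right thing to do still depends on how the other machine lays out the same type.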

David Lowndes

2010-02-24 22:17:03 UTC

Post by Ray
For example, only 10 or 12 bytes may be needed
to represent the value of a long double, but on some implementations 6 or 4
additional bytes of padding may be used to enforce 16-byte memory alignment,
and when such a padded object is written to a file, all padding is included.

I'm pretty sure that shouldn't be the case - do you have a specific
example where it is?

Dave

Ray Mitchell

2010-02-25 01:18:04 UTC

Post by David Lowndes

Post by Ray
For example, only 10 or 12 bytes may be needed
to represent the value of a long double, but on some implementations 6 or 4
additional bytes of padding may be used to enforce 16-byte memory alignment,
and when such a padded object is written to a file, all padding is included.

I'm pretty sure that shouldn't be the case - do you have a specific
example where it is?
Dave

Here is an example of gcc running on a Mac with an Intel/AMD processor.
Specifically, see the -m96bit-long-double and -m128bit-long-double options:

http://developer.apple.com/mac/library/DOCUMENTATION/DeveloperTools/gcc-4.0.1/gcc/i386-and-x86_002d64-Options.html

Igor Tandetnik

2010-02-25 01:31:03 UTC

Post by Ray Mitchell

Post by David Lowndes

Post by Ray
For example, only 10 or 12 bytes may be needed
to represent the value of a long double, but on some
implementations 6 or 4 additional bytes of padding may be used to
enforce 16-byte memory alignment, and when such a padded object is
written to a file, all padding is included.

I'm pretty sure that shouldn't be the case - do you have a specific
example where it is?
Dave

Here is an example of gcc running on a Mac with an Intel/AMD
processor. Specifically, see the -m96bit-long-double and
-m128bit-long-double options:
http://developer.apple.com/mac/library/DOCUMENTATION/DeveloperTools/gcc-4.0.1/gcc/i386-and-x86_002d64-Options.html

Do these switches affect sizeof(long double), or just __alignof(long double) ?

--
With best wishes,
Igor Tandetnik

With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea. It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead. -- RFC 1925

Igor Tandetnik

2010-02-25 14:38:17 UTC

Post by Igor Tandetnik

Post by Ray Mitchell
Here is an example of gcc running on a Mac with an Intel/AMD
processor. Specifically, see the -m96bit-long-double and
-m128bit-long-double options:
http://developer.apple.com/mac/library/DOCUMENTATION/DeveloperTools/gcc-4.0.1/gcc/i386-and-x86_002d64-Options.html

Do these switches affect sizeof(long double), or just __alignof(long double) ?

Answering myself - it does look like it affects the size:

Warning: if you override the default value for your target ABI, the structures and *arrays* containing long double variables will change their size.

Emphasis mine. Since arrays can't contain padding, the only way an array may change size is when the size of the element itself changes.

If I had to guess, I'd say it's a pretty good bet that the padding goes at the end (at higher addresses).

David Lowndes

2010-02-25 10:16:22 UTC

Post by Ray Mitchell
Here is an example of gcc running on a Mac with an Intel/AMD processor.
http://developer.apple.com/mac/library/DOCUMENTATION/DeveloperTools/gcc-4.0.1/gcc/i386-and-x86_002d64-Options.html

I'd verify what Igor suggests. I wouldn't expect sizeof to include the
alignment padding - so the problem you're envisaging with the padding
shouldn't exist.

Dave

Ulrich Eckhardt

2010-02-25 14:00:09 UTC

I wouldn't expect sizeof to include the alignment padding - so the
problem you're envisaging with the padding shouldn't exist.

I would. Imagine this code:

T t0;
T t1[1];
T t2[2];
assert((sizeof t0) == (sizeof t1));
assert((2 * sizeof t0) == (sizeof t2));

The reported size must include the padding for alignment there.

Uli

--
C++ FAQ: http://parashift.com/c++-faq-lite

Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932

David Lowndes

2010-02-25 14:30:09 UTC

Post by Ulrich Eckhardt

I wouldn't expect sizeof to include the alignment padding - so the
problem you're envisaging with the padding shouldn't exist.

T t0;
T t1[1];
T t2[2];
assert((sizeof t0) == (sizeof t1));
assert((2 * sizeof t0) == (sizeof t2));
The reported size must include the padding for alignment there.

You've lost me there Uli - the results of that are the way I'd expect (no
padding issues). Show us a complete example that illustrates the
problem.

Dave

Ulrich Eckhardt

2010-02-25 15:41:57 UTC

Post by David Lowndes

Post by Ulrich Eckhardt

I wouldn't expect sizeof to include the alignment padding - so the
problem you're envisaging with the padding shouldn't exist.

T t0;
T t1[1];
T t2[2];
assert((sizeof t0) == (sizeof t1));
assert((2 * sizeof t0) == (sizeof t2));
The reported size must include the padding for alignment there.

You've lost me there Uli - the results of that are the way I'd expect (no
padding issues). Show us a complete example that illustrates the
problem.

Okay....

Let's assume a fictional long double type that is 12 bytes large. However,
it doesn't need a 12-byte alignment but actually a 16-byte alignment
because the FPU says so.

Now, you said "I wouldn't expect sizeof to include the alignment padding",
so for that type you would expect sizeof(long double) to yield 12, right?
However, I would expect it to include that padding and yield 16 instead.
The reason is simply that the layout of an object (i.e. the size it
occupies including padding) should be the same, regardless of whether it
was allocated alone or in an array. If it didn't, you couldn't e.g. compute
the number of elements in an array with (sizeof array)/(sizeof array[0])
any more.
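
The two properties this argument relies on can be checked directly. The following is a small sketch, not part of the original post; the function names are made up for illustration, and _Alignof assumes a C11 compiler.

```c
#include <stddef.h>

/* Element count of a two-element array, computed with the usual
   (sizeof array)/(sizeof array[0]) idiom. */
size_t long_double_pair_count(void)
{
    long double pair[2];
    return sizeof pair / sizeof pair[0];
}

/* Nonzero when sizeof(long double) is a whole multiple of its
   alignment, which is what lets array elements sit back to back
   with every element correctly aligned. */
int size_is_multiple_of_alignment(void)
{
    return sizeof(long double) % _Alignof(long double) == 0;
}
```

If sizeof reported only the 12 significant bytes of the fictional type while the alignment stayed at 16, the second function could not return nonzero and the second element of the array would be misaligned.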

Uli

David Lowndes

2010-02-25 16:25:38 UTC

Post by Ulrich Eckhardt
Let's assume a fictional long double type that is 12 bytes large. However,
it doesn't need a 12-byte alignment but actually a 16-byte alignment
because the FPU says so.

I could argue that in essence that makes the type really 16 bytes
though.

I'm still not convinced by fictional things. Does anyone have a real
example that would truly illustrate this?

Dave

Bo Persson

2010-02-25 17:45:53 UTC

Post by David Lowndes

Post by Ulrich Eckhardt
Let's assume a fictional long double type that is 12 bytes large.
However, it doesn't need a 12-byte alignment but actually a
16-byte alignment because the FPU says so.

I could argue that in essence that makes the type really 16 bytes
though.
I'm still not convinced by fictional things. Does anyone have a real
example that would truly illustrate this?

You had one previously .-)

T t0;
T t1[1];
T t2[2];
assert((sizeof t0) == (sizeof t1));
assert((2 * sizeof t0) == (sizeof t2));

The reported size must include the padding for alignment there.

Array indexing is based on sizeof(element), as is pointer arithmetic.
That's why the alignment cannot be different from the size of the
array elements.

Bo Persson

David Lowndes

2010-02-25 18:27:47 UTC

Post by Bo Persson
You had one previously .-)

What exactly is it supposed to show?

Are the asserts supposed to be true or false?

As I mentioned, the results from that were exactly what I'd assume
them to be.

Dave

Ray Mitchell

2010-02-25 19:21:01 UTC

Post by David Lowndes

Post by Ulrich Eckhardt
Let's assume a fictional long double type that is 12 bytes large. However,
it doesn't need a 12-byte alignment but actually a 16-byte alignment
because the FPU says so.

I could argue that in essence that makes the type really 16 bytes
though.
I'm still not convinced by fictional things. Does anyone have a real
example that would truly illustrate this?

sizeof must report all bytes including padding since sizeof is defined to
report the number of bytes of storage used for an object. Try gcc 4 on a Mac
OS X implementation. sizeof reports 16 bytes but 4 of those bytes are used
for padding (and the padding bytes are not necessarily 0s either). As an
item of curiosity, I've found that if you simply assign one type long double
variable to another, the padding bytes don't get copied in the process. As
an example of why sizeof must include padding bytes, consider this: The
typical way to read/write a file involves the fread/fwrite functions, and
sizeof is often used to determine argument values:

long double x[DIM] = {...};
FILE *fp = ...open some file...
fwrite((void *)x, sizeof(*x), sizeof(x)/sizeof(*x), fp);
fread((void *)x, sizeof(*x), sizeof(x)/sizeof(*x), fp);

Since fread/fwrite have no idea of the actual data types they are
transferring, they would have no idea of whether or not to insert/discard
padding from the objects in the array for some data types but not for others.
Thus, the padding must be included in the sizeof report so it can be written
to the file by fwrite for later retrieval and insertion into the array by
fread.
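
A self-contained version of that round trip might look like the sketch below. It is only an illustration: the function name and the sample values are invented, and tmpfile() stands in for the elided file-opening code. Whatever padding sizeof includes travels through the file along with the value bytes.

```c
#include <stdio.h>

#define DIM 3

/* Write an array of long double to a temporary file, read it back,
   and report whether every element survived the round trip.
   Returns 1 on success, 0 on any I/O failure or mismatch. */
int round_trip_long_doubles(void)
{
    long double out[DIM] = { 1.0L, -2.5L, 1024.0L };
    long double in[DIM];
    size_t count = sizeof out / sizeof out[0];
    size_t i;
    int ok;
    FILE *fp = tmpfile();

    if (fp == NULL)
        return 0;

    /* sizeof out[0] covers the whole element, padding included. */
    ok = fwrite(out, sizeof out[0], count, fp) == count;
    rewind(fp);
    ok = ok && fread(in, sizeof in[0], count, fp) == count;
    fclose(fp);

    if (!ok)
        return 0;
    for (i = 0; i < count; ++i)
        if (in[i] != out[i])
            return 0;
    return 1;
}
```

This only demonstrates reading the file back on the same implementation that wrote it. Reading it on a machine with a different size, padding, or byte order for long double is exactly the open question of this thread.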

Ray


Ulrich Eckhardt

2010-02-26 12:45:33 UTC

Post by David Lowndes

Post by Ulrich Eckhardt
Let's assume a fictional long double type that is 12 bytes large. However,
it doesn't need a 12-byte alignment but actually a 16-byte alignment
because the FPU says so.

I could argue that in essence that makes the type really 16 bytes
though.

True. 12 significant bytes plus 4 bytes padding.

Post by David Lowndes
I'm still not convinced by fictional things. Does anyone have a real
example that would truly illustrate this?

struct X
{
int i;
char c;
};

Typical layout would be four bytes for the int, one for the character and
then one or three bytes of padding.
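
The amount of trailing padding can be measured rather than guessed. This is a small sketch using offsetof; the helper name is hypothetical and the result is implementation-specific.

```c
#include <stddef.h>

struct X
{
    int i;
    char c;
};

/* Number of bytes between the end of member c and the end of the
   struct, i.e. the trailing padding counted by sizeof(struct X). */
size_t trailing_padding_of_X(void)
{
    return sizeof(struct X) - (offsetof(struct X, c) + sizeof(char));
}
```

On an implementation with a 4-byte int aligned to 4 bytes one would expect this to return 3, but nothing in the language guarantees that figure.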

Uli

Igor Tandetnik

2010-02-24 23:06:30 UTC

Post by Ray
Here is an issue I should know, but I just realized I'm not quite
sure. Of course, the endianness of a multibyte scalar object is
defined by whether the least significant byte occupies the lowest
address (little endian) or the highest address (big endian), (I'm not
concerned about "middle endian" here). And of course, taking the
"sizeof" any object produces a count of the number of bytes of
storage used by that object. For most scalar types on most
implementations all of those storage bytes are used to actually
represent the object's value. That is, a 4-byte int actually
occupies exactly 4 bytes of storage, an 8-byte double actually
occupies 8 bytes of storage. However, in some cases more storage is
used for an object than is actually used to represent the object's
value. For example, only 10 or 12 bytes may be needed to represent
the value of a long double, but on some implementations 6 or 4
additional bytes of padding may be used to enforce 16-byte memory
alignment, and when such a padded object is written to a file, all
padding is included. Assuming that my description is accurate, my
concern is regarding the appropriate way to reverse the endian of
such an object.

I don't know of any architecture where sizeof(long double) == 16, let alone two of them that differ in endianness. If you know of such machines, and you find yourself in an unenviable position of having to exchange binary data between them, you should consult their accompanying manuals, which hopefully would explain precisely how those long double values are laid out.

Post by Ray
Assuming I am correct, I'm at a loss for a simple portable way to
determine how many bytes are used for the value and how many are used
for padding.

There is no portable way to determine endianness of the machine to begin with, padding or no padding. Binary layout is necessarily machine-specific, and a program relying on any particular layout is non-portable. You keep talking about endianness: what about machines that use sign-magnitude or one's complement to represent signed integers (as opposed to two's complement used by most modern architectures)?

Ray Mitchell

2010-02-25 01:39:01 UTC

Post by Igor Tandetnik

Post by Ray
Here is an issue I should know, but I just realized I'm not quite
sure. Of course, the endianness of a multibyte scalar object is
defined by whether the least significant byte occupies the lowest
address (little endian) or the highest address (big endian), (I'm not
concerned about "middle endian" here). And of course, taking the
"sizeof" any object produces a count of the number of bytes of
storage used by that object. For most scalar types on most
implementations all of those storage bytes are used to actually
represent the object's value. That is, a 4-byte int actually
occupies exactly 4 bytes of storage, an 8-byte double actually
occupies 8 bytes of storage. However, in some cases more storage is
used for an object than is actually used to represent the object's
value. For example, only 10 or 12 bytes may be needed to represent
the value of a long double, but on some implementations 6 or 4
additional bytes of padding may be used to enforce 16-byte memory
alignment, and when such a padded object is written to a file, all
padding is included. Assuming that my description is accurate, my
concern is regarding the appropriate way to reverse the endian of
such an object.

I don't know of any architecture where sizeof(long double) == 16,

How about this link - Check out the -m96bit-long-double and
-m128bit-long-double options:

http://developer.apple.com/mac/library/DOCUMENTATION/DeveloperTools/gcc-4.0.1/gcc/i386-and-x86_002d64-Options.html

Post by Igor Tandetnik
let alone two of them that differ in endianness.

I'm not clear on this part of your statement. Of course endianness will be
the same for all objects on a given type of processor, but can differ between
different types of processors.

Post by Igor Tandetnik
If you know of such machines, and you find yourself in an unenviable position of having to exchange binary data between them, you should consult their accompanying manuals, which hopefully would explain precisely how those long double values are laid out.

Post by Ray
Assuming I am correct, I'm at a loss for a simple portable way to
determine how many bytes are used for the value and how many are used
for padding.

There is no portable way to determine endianness of the machine to begin with, padding or no padding.

Assuming that at least some integral scalar types are more than one byte
(which I don't believe is actually required by the language standards), I
always thought the following type of thing could be used portably to
determine endianness if no padding is present in the object:

void DetermineEndian()
{
    union
    {
        long obj;
        char bytes[sizeof(obj)];
    } test = { 1 };

    if (test.bytes[0] == 1)
        cout << "Addressing is right-to-left (little endian)\n";
    else if (test.bytes[sizeof(test.bytes) - 1] == 1)
        cout << "Addressing is left-to-right (big endian)\n";
    else
        cout << "Addressing is strange (weird endian?)\n";
}

Post by Igor Tandetnik
Binary layout is necessarily machine-specific, and a program relying on any particular layout is non-portable. You keep talking about endianness: what about machines that use sign-magnitude or one's complement to represent signed integers (as opposed to two's complement used by most modern architectures)?

Of course there are numerous portability considerations including those you
mention and several others. However, my concern here is only regarding
whether padding bytes in a scalar object should be involved in an endian byte
swap, which I believe they probably shouldn't.

Igor Tandetnik

2010-02-25 02:41:35 UTC

Post by Ray Mitchell

Post by Igor Tandetnik
I don't know of any architecture where sizeof(long double) == 16,

How about this link - Check out the -m96bit-long-double and
-m128bit-long-double options:
http://developer.apple.com/mac/library/DOCUMENTATION/DeveloperTools/gcc-4.0.1/gcc/i386-and-x86_002d64-Options.html

Do these switches affect sizeof(long double), or just __alignof(long double) ?

Post by Ray Mitchell

Post by Igor Tandetnik
let alone two of them that differ in endianness.

I'm not clear on this part of your statement. Of course endianness
will be the same for all objects on a given type of processor, but
can differ between different types of processors.

Do there exist two architectures that a) differ in endianness, and b) both have sizeof(long double) == 16 ?

If anything, I'd be more concerned about transferring data between two machines where sizeof(long double) itself is different, padding and endianness aside.

Post by Ray Mitchell

Post by Igor Tandetnik
There is no portable way to determine endianness of the machine to
begin with, padding or no padding.

Assuming that at least some integral scalar types are more than one
byte (which I don't believe is actually required by the language
standards), I always thought the following type of thing could be
used portably to determine endianness if no padding is present in the
object:
void DetermineEndian()
{
union
{
long obj;
char bytes[sizeof(obj)];
} test = { 1 };
if (test.bytes[0] == 1)

Assigning to one member of the union and then reading another exhibits undefined behavior.

Post by Ray Mitchell
Of course there are numerous portability considerations including
those you mention and several others. However, my concern here is
only regarding whether padding bytes in a scalar object should be
involved in an endian byte swap, which I believe they probably
shouldn't.

If you can find two architectures that are actually affected by the problem, you can study their documentation and find out (at least for this one case).

Ray

2010-02-25 06:57:01 UTC

Post by Igor Tandetnik
Do there exist two architectures that a) differ in endianness, and b) both have sizeof(long double) == 16 ?

I don't know, but my original question was intended to be theoretical.
Maybe there is no single answer, but only an implementation-dependent answer.

Post by Igor Tandetnik
If anything, I'd be more concerned about transferring data between two machines where sizeof(long double) itself is different, padding and endianness aside.

Yes, but my only concern at this point is regarding the endian swapping issue.

Post by Igor Tandetnik

Post by Ray Mitchell
void DetermineEndian()
{
union
{
long obj;
char bytes[sizeof(obj)];
} test = { 1 };
if (test.bytes[0] == 1)

Assigning to one member of the union and then reading another exhibits undefined behavior.

Where did you get this information? Could you please refer me to the
appropriate section of the C standard that states this is the case? I
searched through the C99 standard and could find nothing that either directly
stated or implied this undefined behavior. Logically, to me at least, since
all union members start at the same address, examining the bytes of only the
most recently written member via a character pointer should yield perfectly
valid results, and that is what I am doing. And even if what you state is
true I could simply set a separate character pointer equal to the address of
the entire union and examine the individual bytes that way, thereby not
reading using another member.

Igor Tandetnik

2010-02-25 14:30:46 UTC

Post by Igor Tandetnik
Assigning to one member of the union and then reading another exhibits undefined behavior.

Where did you get this information? Could you please refer me to the
appropriate section of the C standard that states this is the case?

6.7.2.1p14 The size of a union is sufficient to contain the largest of its members. The value of at most one of the members can be stored in a union object at any time.

6.2.4p2 The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined.

The above should be sufficient, but, as independent evidence of the intent:

6.5.2.3p5 One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the complete type of the union is visible.

By implication, if two members of a union are _not_ structures that share a common initial sequence, then you can't store one and then inspect the other, or the special guarantee wouldn't be needed.

Post by Ray
Logically, to me at least, since
all union members start at the same address, examining the bytes of only the
most recently written member via a character pointer should yield perfectly
valid results, and that is what I am doing.

It would yield unspecified results:

6.2.6.1p1 The representations of all types are unspecified except as stated in this subclause.

Post by Ray
And even if what you state is
true I could simply set a separate character pointer equal to the address of
the entire union and examine the individual bytes that way, thereby not
reading using another member.

That you can do, and you don't need a union:

long l = 1;
char* p = (char*)&l;

There's a special dispensation for this:

6.5p7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
...
- a character type.

6.3.2.3p7 A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type... When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

However, this doesn't give you much, in view of aforementioned 6.2.6.1p1 - in general, you have no idea what to expect when looking at individual bytes of an object.

I would concede the following: if you limit yourself to architectures with "unsurprising" representations (in particular, no padding bits as defined by 6.2.6.2p1), and the only uncertainty is whether the architecture is little- or big-endian, then you can detect this by inspecting an integer via a char* pointer as shown above. But you are relying on a lot of a-priori knowledge.
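
Under exactly those assumptions (no padding bits, and the only open question being little- versus big-endian), the check could be sketched as follows. The enum and function names are invented for illustration. The probe is read through unsigned char, which is the access 6.5p7 permits; anything other than the two expected patterns is reported as unknown instead of being guessed at.

```c
/* Possible answers; ORDER_UNKNOWN covers single-byte longs and any
   representation the two simple tests below do not recognise. */
enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_UNKNOWN };

/* Inspect the bytes of an unsigned long holding 1.
   Assumes no padding bits in unsigned long. */
enum byte_order detect_byte_order(void)
{
    unsigned long probe = 1UL;
    const unsigned char *bytes = (const unsigned char *)&probe;

    if (sizeof probe == 1)
        return ORDER_UNKNOWN;       /* byte order is meaningless here */
    if (bytes[0] == 1)
        return ORDER_LITTLE;        /* least significant byte first */
    if (bytes[sizeof probe - 1] == 1)
        return ORDER_BIG;           /* least significant byte last */
    return ORDER_UNKNOWN;
}
```

The result says nothing about floating-point types or about padding, so it does not settle the long double question by itself.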

However, you seem to be specifically concerned with machines that have "surprising" representation (such as padding bits within the value). I'm not even sure such beasts exist in nature. In any case, if you are so uncertain of the details of the architecture that you need to ask the question in the first place, I don't quite see how looking at individual bytes of the representation may enlighten you.

Ray Mitchell

2010-02-25 18:49:03 UTC

Hi Igor,

I appreciate all of the time you've taken to converse with me regarding this
issue. I have commented on some of your comments again below. Although I
don't necessarily agree with your interpretations, it's probably due to my
own lack of knowledge on some of the topics.

Thanks,
Ray

Post by Igor Tandetnik

Post by Igor Tandetnik
Assigning to one member of the union and then reading another exhibits undefined behavior.

Where did you get this information? Could you please refer me to the
appropriate section of the C standard that states this is the case?

6.7.2.1p14 The size of a union is sufficient to contain the largest of its members. The value of at most one of the members can be stored in a union object at any time.

I don't agree that this makes reading a member that was not most recently
written undefined as long as the member being read shares all of its bytes
with the recently written member. Concerning the "older" members of a union,
6.2.6.1p7 of the standard says, "When a value is stored in a member of an
object of union type, the bytes of the object representation that do not
correspond to that member but do correspond to other members take unspecified
values, but the value of the union object shall not thereby become a trap
representation." When the compiler generates code to access the various
union members, that code merely accesses the appropriate bytes in the common
object and interprets them in the way appropriate to that member's data type.
The code to do this is "permanent" and does not change just because another
member was recently written. Instead, the access is made without any memory
of what might have happened to the object previously. As a result the values
of the bytes being read are exactly the values that were written. Of course,
if a 4-byte type float member were written but 4-byte type int member were
then read, the value of the int would be implementation dependent, but not
because the individual bytes were not the same values that were written, but
merely because of the difference in the representations of a float and an int.

Post by Igor Tandetnik
6.2.4p2 The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined.

I agree totally, and the object in this case is the underlying memory common
to all members. But this is unrelated to the issue we're discussing since
the lifetime of the object does not end between the write and the read.

Post by Igor Tandetnik
6.5.2.3p5 One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the complete type of the union is visible.
By implication, if two members of a union are _not_ structures that share a common initial sequence, then you can't store one and then inspect the other, or the special guarantee wouldn't be needed.

I'm sorry but that's not how I interpret the implication of this.

Post by Igor Tandetnik

Post by Ray
Logically, to me at least, since
all union members start at the same address, examining the bytes of only the
most recently written member via a character pointer should yield perfectly
valid results, and that is what I am doing.

6.2.6.1p1 The representations of all types are unspecified except as stated in this subclause.

Post by Ray
And even if what you state is
true I could simply set a separate character pointer equal to the address of
the entire union and examine the individual bytes that way, thereby not
reading using another member.

long l = 1;
char* p = (char*)&l;

Agreed. The union example was arbitrary and self-contained.

Post by Igor Tandetnik
....
- a character type.
6.3.2.3p7 A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type... When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.
However, this doesn't give you much, in view of aforementioned 6.2.6.1p1 - in general, you have no idea what to expect when looking at individual bytes of an object.

But my original example was not the general case. It merely set the value
of an integral type to a value of 1, and I believe that guarantees that only
the least significant bit will be a 1.

Post by Igor Tandetnik
I would concede the following: if you limit yourself to architectures with "unsurprising" representations (in particular, no padding bits as defined by 6.2.6.2p1), and the only uncertainty is whether the architecture is little- or big-endian, then you can detect this by inspecting an integer via a char* pointer as shown above. But you are relying on a lot of a-priori knowledge.

Yes, but I believe that we've diverged from the original endian swapping
issue into the issue of inspecting bytes. So, the endian swapping issue
still remains unresolved in my mind. For the time being at least I'll just
consider it an implementation dependent issue.

Post by Igor Tandetnik
However, you seem to be specifically concerned with machines that have "surprising" representation (such as padding bits within the value). I'm not even sure such beasts exist in nature. In any case, if you are so uncertain of the details of the architecture that you need to ask the question in the first place, I don't quite see how looking at individual bytes of the representation may enlighten you.

Bo Persson

2010-02-25 19:46:34 UTC

Post by Ray Mitchell
Hi Igor,
I appreciate all of the time you've taken to converse with me
regarding this issue. I have commented on some of your comments
again below. Although I don't necessarily agree with your
interpretations, it's probably due to my own lack of knowledge on
some of the topics.
Thanks,
Ray

Post by Igor Tandetnik

Post by Igor Tandetnik
Assigning to one member of the union and then reading another
exhibits undefined behavior.

Where did you get this information? Could you please refer me to
the appropriate section of the C standard that states this is the
case?

6.7.2.1p14 The size of a union is sufficient to contain the
largest of its members. The value of at most one of the members
can be stored in a union object at any time.

I don't agree that this makes reading a member that was not most
recently written undefined as long as the member being read shares
all of its bytes with the recently written member.

But it does. There is only one member present in the union. You can't
read what is not there.

It is true that many (most?) compilers will allow you to do it, just
because it is common to do so. The language standard doesn't require
it though, so portability is not optimal.

Bo Persson

Igor Tandetnik

2010-02-25 19:51:36 UTC

Post by Ray Mitchell

Post by Igor Tandetnik
6.7.2.1p14 The size of a union is sufficient to contain the largest
of its members. The value of at most one of the members can be
stored in a union object at any time.

I don't agree that this makes reading a member that was not most
recently written undefined as long as the member being read shares
all of its bytes with the recently written member. Concerning the
"older" members of a union,
6.2.6.1p7 of the standard says, "When a value is stored in a member
of an object of union type, the bytes of the object representation
that do not correspond to that member but do correspond to other
members take unspecified values, but the value of the union object
shall not thereby become a trap representation."

This just says that the union shouldn't turn into something that the CPU would throw a hardware exception on (some architectures have bit patterns that cause the CPU to do so - known as "trap representations").

Post by Ray Mitchell
When the compiler
generates code to access the various union members, that code merely
accesses the appropriate bytes in the common object and interprets
them in the way appropriate to that member's data type. The code to
do this is "permanent" and does not change just because another
member was recently written.

Of course not. But the program that necessitates running this code exhibits undefined behavior. Consider:

int* p = malloc(sizeof(int));
*p = 1;
if (rand() % 2) {
    free(p);
}
*p = 42;

Code that assigns 42 to *p doesn't change just because memory is freed. Nevertheless, if it was indeed freed, that line exhibits undefined behavior - it accesses an object whose lifetime has ended.

Post by Ray Mitchell
Instead, the access is made without any
memory of what might have happened to the object previously.

That doesn't make such access any more valid.

Post by Ray Mitchell
As a
result the values of the bytes being read are exactly the values that
were written.

Not necessarily. The compiler can legally optimize away the assignment to one member of the union, seeing that the member is never read afterwards. See also

http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Optimize-Options.html#index-fstrict_002daliasing-542

(note that GCC doesn't perform this optimization in the simple case - only because there's too much invalid code in existence that would be broken by it). If the compiler does that, then no value is written, and the value read is random garbage.

Post by Ray Mitchell

Post by Igor Tandetnik
6.2.4p2 The lifetime of an object is the portion of program
execution during which storage is guaranteed to be reserved for it.
An object exists, has a constant address, and retains its
last-stored value throughout its lifetime. If an object is referred
to outside of its lifetime, the behavior is undefined.

I agree totally, and the object in this case is the underlying memory
common to all members.

Not quite. The union as a whole is an object, and each union member is itself an object:

6.2.5p20 A union type describes an overlapping nonempty set of member objects, each of which has an optionally specified name and possibly distinct type.

Remember also 6.7.2.1p14: "The value of at most one of the members can be stored in a union object at any time." Thus, one member object cannot possibly "retain its last-stored value" when another member is assigned to - the union can only hold one value at a time.

C++ standard states this more explicitly:

3.8p1 ...The lifetime of an object of type T ends when: ... the storage which the object occupies is reused...

Post by Ray Mitchell
But this is unrelated to the issue we're
discussing since the lifetime of the object does not end between the
write and the read.

Lifetime of the union doesn't, but lifetime of the member object whose storage has been hijacked does.

Post by Ray Mitchell

Post by Igor Tandetnik
However, this doesn't give you much, in view of aforementioned
6.2.6.1p1 - in general, you have no idea what to expect when looking
at individual bytes of an object.

But my original example was not the general case. It merely set the value
of an integral type to a value of 1, and I believe that guarantees
that only the least significant bit will be a 1.

What is the basis for this belief? It is my turn now to demand chapter and verse.

--
With best wishes,
Igor Tandetnik

With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea. It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead. -- RFC 1925

Ray Mitchell

2010-02-25 21:16:01 UTC

Post by Igor Tandetnik

Post by Ray Mitchell

Post by Igor Tandetnik
6.7.2.1p14 The size of a union is sufficient to contain the largest
of its members. The value of at most one of the members can be
stored in a union object at any time.

I don't agree that this makes reading a member that was not most
recently written undefined as long as the member being read shares
all of its bytes with the recently written member. Concerning the
"older" members of a union,
6.2.6.1p7 of the standard says, "When a value is stored in a member
of an object of union type, the bytes of the object representation
that do not correspond to that member but do correspond to other
members take unspecified values, but the value of the union object
shall not thereby become a trap representation."

This just says that the union shouldn't turn into something that the CPU would throw a hardware exception on (some architectures have bit patterns that cause the CPU to do so - known as "trap representations").

Post by Ray Mitchell
When the compiler
generates code to access the various union members, that code merely
accesses the appropriate bytes in the common object and interprets
them in the way appropriate to that member's data type. The code to
do this is "permanent" and does not change just because another
member was recently written.

int* p = malloc(sizeof(int));
*p = 1;
if (rand() % 2) {
    free(p);
}
*p = 42;
Code that assigns 42 to *p doesn't change just because memory is freed. Nevertheless, if it was indeed freed, that line exhibits undefined behavior - it accesses an object whose lifetime has ended.

Post by Ray Mitchell
Instead, the access is made without any
memory of what might have happened to the object previously.

That doesn't make such access any more valid.

Post by Ray Mitchell
As a
result the values of the bytes being read are exactly the values that
were written.

Not necessarily. The compiler can legally optimize away the assignment to one member of the union, seeing that the member is never read afterwards. See also
http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Optimize-Options.html#index-fstrict_002daliasing-542
(note that GCC doesn't perform this optimization in the simple case - only because there's too much invalid code in existence that would be broken by it). If the compiler does that, then no value is written, and the value read is random garbage.

Post by Ray Mitchell

Post by Igor Tandetnik
6.2.4p2 The lifetime of an object is the portion of program
execution during which storage is guaranteed to be reserved for it.
An object exists, has a constant address, and retains its
last-stored value throughout its lifetime. If an object is referred
to outside of its lifetime, the behavior is undefined.

I agree totally, and the object in this case is the underlying memory
common to all members.

6.2.5p20 A union type describes an overlapping nonempty set of member objects, each of which has an optionally specified name and possibly distinct type.
Remember also 6.7.2.1p14: "The value of at most one of the members can be stored in a union object at any time." Thus, one member object cannot possibly "retain its last-stored value" when another member is assigned to - the union can only hold one value at a time.
3.8p1 ...The lifetime of an object of type T ends when: ... the storage which the object occupies is reused...

Post by Ray Mitchell
But this is unrelated to the issue we're
discussing since the lifetime of the object does not end between the
write and the read.

Lifetime of the union doesn't, but lifetime of the member object whose storage has been hijacked does.

Post by Ray Mitchell

Post by Igor Tandetnik
However, this doesn't give you much, in view of aforementioned
6.2.6.1p1 - in general, you have no idea what to expect when looking
at individual bytes of an object.

But my original example was not the general case. It merely set the value
of an integral type to a value of 1, and I believe that guarantees
that only the least significant bit will be a 1.

What is the basis for this belief? It is my turn now to demand chapter and verse.

Thanks Igor,

I think I'm finally beginning to see the light on some of the things you
have been saying. I did not consider the various optimizations and
intermediate operations that might be performed that would render "old"
members invalid.

Post by Igor Tandetnik

Post by Ray Mitchell
of an integral type to a value of 1, and I believe that guarantees
that only the least significant bit will be a 1.

What is the basis for this belief? It is my turn now to demand chapter and verse.

I was basing my assertion on the fact that positive integral values must use
a pure binary representation for their values (6.2.6.2p1). Then, by
definition, doesn't the least significant bit have to be a 1 to represent a
value of 1? And if there happen to be padding bits, then I suppose that
such a bit could occupy the "farthest right" bit position. But that bit
would then not be called the least significant bit would it? Or I suppose
that in some screwed up implementation the value order of the value bits
would not necessarily be the physical order of the bits in the object, but
isn't the least significant bit still going to be a 1 no matter what physical
position it occupies? Is this what you are questioning? The concept of
padding bits does bring up another question that I thought I understood,
however: If an unsigned integral object is set to a value of 1, then the
value of the object is repeatedly shifted left by 1 until its value becomes
0, I always assumed that this was a portable way to determine the number of
value bits in the data type of that object. Now I'm beginning to wonder if
the padding bits might also get included in the count. If this is the case,
however, it seems to do away with the ability to do efficient
multiplications/divisions by powers of 2 by merely shifting instead.

Thanks for your detailed explanations,
Ray

Igor Tandetnik

2010-02-25 22:36:43 UTC

Post by Ray Mitchell

Post by Igor Tandetnik

Post by Ray Mitchell
of an integral type to a value of 1, and I believe that guarantees
that only the least significant bit will be a 1.

What is the basis for this belief? It is my turn now to demand chapter and verse.

I was basing my assertion on the fact that positive integral values
must use a pure binary representation for their values (6.2.6.2p1).

Only value bits participate in pure binary representation. There may be padding bits sprinkled around arbitrarily.

Post by Ray Mitchell
Then, by definition, doesn't the least significant bit have to be a 1
to represent a value of 1? And if there happen to be padding bits,
then I suppose that such a bit could occupy the "farthest right" bit
position. But that bit would then not be called the least
significant bit would it?

Ok, so you define "least significant bit" as "the value bit that would be set to 1 in the representation of the integer whose value is 1". Then you state that in an integer whose value is 1, the least significant bit is necessarily set to 1. Yes, this statement is trivially true, but I don't quite see how this circular definition helps you in your goal of inferring properties of the architecture by inspecting binary representation of select integers. The bit you appear to be interested in is

((char*)&number)[0] & 1 // [1]

whatever you want to call it.
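
As a rough illustration of the check under discussion, here is a minimal sketch that looks at the first and last bytes of an integer through an unsigned char pointer. It assumes the type has no padding bits and that the machine is either plain little-endian or plain big-endian. The names byte_order and detect_byte_order are made up for this example.

```c
#include <stddef.h>

/* Hypothetical names, used only for this illustration. */
enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_UNKNOWN };

/* Store the value 1 in an unsigned long and see which end of its
   object representation holds the byte with value 1.
   Assumption: no padding bits in unsigned long. */
static enum byte_order detect_byte_order(void)
{
    unsigned long number = 1;
    const unsigned char *bytes = (const unsigned char *)&number;
    size_t last = sizeof number - 1;

    if (last == 0)
        return ORDER_UNKNOWN;   /* one-byte type: there is no order to detect */
    if (bytes[0] == 1)
        return ORDER_LITTLE;    /* lowest address holds the low-order byte */
    if (bytes[last] == 1)
        return ORDER_BIG;       /* highest address holds the low-order byte */
    return ORDER_UNKNOWN;       /* something more exotic is going on */
}
```

A result of ORDER_UNKNOWN means the a-priori assumptions did not hold, which is about all this kind of test can report.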

Post by Ray Mitchell
Or I suppose that in some screwed up
implementation the value order of the value bits would not
necessarily be the physical order of the bits in the object

I don't think that is allowed. 6.2.6.1p3 Footnote 40:

A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral powers of 2, except perhaps the bit with the highest position.

Though I guess it's arguable what "successive bits" means, as it is never formally defined.

Post by Ray Mitchell
but
isn't the least significant bit still going to be a 1 no matter what
physical position it occupies?

Under your circular definition, yes.

Post by Ray Mitchell
Is this what you are questioning?

I was assuming that by "least significant bit" you mean the bit in physical position zero (as defined by [1] above), because that's the definition that is actually relevant to your original question.

Post by Ray Mitchell
The concept of padding bits does bring up another question that I
thought I understood, however: If an unsigned integral object is set
to a value of 1, then the value of the object is repeatedly shifted
left by 1 until its value becomes 0, I always assumed that this was a
portable way to determine the number of value bits in the data type
of that object.

That's correct.

Post by Ray Mitchell
Now I'm beginning to wonder if the padding bits
might also get included in the count.

No. Shifts are defined arithmetically, not in terms of physical representation:

6.5.7p4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1*2^E2, reduced modulo one more than the maximum value representable in the result type.

Bitwise operations (&, |, ^) are another matter.
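
To make the shift-counting idea concrete, here is a small sketch for unsigned int. Because the shift is defined on the value, the loop counts value bits only. The second function derives the same count from UINT_MAX as a cross-check. Both function names are made up for this example.

```c
#include <limits.h>

/* Count value bits by shifting 1 left until the arithmetic result,
   reduced modulo UINT_MAX + 1, becomes 0. */
static int value_bits_by_shift(void)
{
    unsigned int x = 1;
    int bits = 0;

    while (x != 0) {
        x <<= 1;    /* x * 2, reduced modulo UINT_MAX + 1 */
        ++bits;
    }
    return bits;
}

/* Cross-check: count how many times UINT_MAX can be halved. */
static int value_bits_from_max(void)
{
    unsigned int max = UINT_MAX;
    int bits = 0;

    while (max != 0) {
        max >>= 1;
        ++bits;
    }
    return bits;
}
```

On an implementation with padding bits, both counts would come out smaller than sizeof(unsigned int) * CHAR_BIT.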

--
With best wishes,
Igor Tandetnik

With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea. It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead. -- RFC 1925

Barry Schwarz

2010-02-26 02:39:22 UTC

On Wed, 24 Feb 2010 22:57:01 -0800, Ray wrote:
[snip]

Post by Igor Tandetnik
Assigning to one member of the union and then reading another exhibits undefined behavior.

Where did you get this information? Could you please refer me to the
appropriate section of the C standard that states this is the case? I
searched through the C99 standard and could find nothing that either directly
stated or implied this undefined behavior. Logically, to me at least, since

C89 called it implementation defined behavior. C99 "fixed" it with
footnote 82 to paragraph 6.5.2.3 (in n1256): "If the member used to
access the contents of a union object is not the same as the member
last used to store a value in the object, the appropriate part of the
object representation of the value is reinterpreted as an object
representation in the new type as described in 6.2.6 (a process
sometimes called "type punning"). This might be a trap
representation."

However, it apparently went through several iterations. Paragraph
6.5.2.2 of n869 says "With one exception, if the value of a member of
a union object is used when the most recent store to the object was to
a different member, the behavior is implementation-defined." The
exception is for structure members of the union. This is almost
identical to the C89 wording. n1124 doesn't seem to address it at
all.
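
A small sketch of what the footnote describes, next to a memcpy version that does not depend on how the union wording is read. The member names obj and bytes follow the union used earlier in the thread. The type and function names are made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Same shape as the union discussed in the thread. */
union int_bytes {
    unsigned int  obj;
    unsigned char bytes[sizeof(unsigned int)];
};

/* Store through one member, read through the other. Under footnote 82
   the bytes of obj are reinterpreted as unsigned char. */
static unsigned char byte_via_union(unsigned int value, size_t i)
{
    union int_bytes u;
    u.obj = value;
    return u.bytes[i];
}

/* Copy the object representation into a byte array instead; no union
   member other than the one written is ever read. */
static unsigned char byte_via_memcpy(unsigned int value, size_t i)
{
    unsigned char bytes[sizeof value];
    memcpy(bytes, &value, sizeof value);
    return bytes[i];
}
```

Both functions expect i to be less than sizeof(unsigned int). Which byte comes back for a given i still depends on the implementation's byte order.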

--
Remove del for email

Ray Mitchell

2010-02-25 01:46:01 UTC

Sorry - In my last posting the line

else if (test.bytes[sizeof(int) - 1] == 1)

should have been

else if (test.bytes[sizeof(test.obj) - 1] == 1)

Barry Schwarz

2010-02-25 02:28:37 UTC

On Wed, 24 Feb 2010 18:06:30 -0500, "Igor Tandetnik"

Post by Igor Tandetnik
I don't know of any architecture where sizeof(long double) == 16, let alone two of them that differ in endianness. If you know of such machines,
and you find yourself in an unenviable position of having to exchange binary data between them, you should consult their accompanying manuals,
which hopefully would explain precisely how those long double values are laid out.

Try the entire IBM zArchitecture family

--
Remove del for email

Igor Tandetnik

2010-02-25 02:57:02 UTC

Post by Barry Schwarz
On Wed, 24 Feb 2010 18:06:30 -0500, "Igor Tandetnik"

Post by Igor Tandetnik
I don't know of any architecture where sizeof(long double) == 16,
let alone two of them that differ in endianness. If you know of such
machines, and you find yourself in an unenviable position of having
to exchange binary data between them, you should consult their
accompanying manuals, which hopefully would explain precisely how
those long double values are laid out.

Try the entire IBM zArchitecture family

Well, according to

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/download/A2278325.pdf?DT=20070807125005&XKS=DZ9ZBK07

page 19-2, this architecture does indeed provide for 16-byte-large floating point numbers, but they don't have any padding inside - all bytes are significant. Also, do z/Architecture machines come in both little-endian and big-endian flavors?

--
With best wishes,
Igor Tandetnik

With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea. It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead. -- RFC 1925

Bo Persson

2010-02-25 17:53:52 UTC

Post by Igor Tandetnik

Post by Barry Schwarz
On Wed, 24 Feb 2010 18:06:30 -0500, "Igor Tandetnik"

Post by Igor Tandetnik
I don't know of any architecture where sizeof(long double) == 16,
let alone two of them that differ in endianness. If you know of
such machines, and you find yourself in an unenviable position of
having to exchange binary data between them, you should consult
their accompanying manuals, which hopefully would explain
precisely how those long double values are laid out.

Try the entire IBM zArchitecture family

Well, according to
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/download/A2278325.pdf?DT=20070807125005&XKS=DZ9ZBK07
page 19-2, this architecture does indeed provide for 16-byte-large
floating point numbers, but they don't have any padding inside -
all bytes are significant. Also, do z/Architecture machines come in
both little-endian and big-endian flavor?

No, and the floating point format is proprietary anyway, so there is
no need to transfer it anywhere else (in binary).

Isn't the answer to the correct question: "Define a common format, and
convert to and from that as needed at both ends."?

Bo Persson

Igor Tandetnik

2010-02-25 18:10:48 UTC

Post by Bo Persson
Isn't the answer to the correct question: "Define a common format, and
convert to and from that as needed at both ends."?

Quite. For binary data exchange, may I suggest XDR?

http://www.rfc-editor.org/rfc/rfc4506.txt

XDR happens to closely follow the "usual" representation on a big-endian machine for most types. In many cases, all that's needed to convert between native and XDR representation is a judicious application of hton{l,s} and ntoh{l,s}. Though those weird 16-byte doubles with internal padding will require special handling.
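
For the integer case, a minimal sketch of converting to and from a fixed big-endian wire order without relying on the host's byte order at all. It works on values with shifts rather than on the object representation. It assumes 8-bit bytes and that uint32_t is available. The function names put_u32_be and get_u32_be are made up for this example.

```c
#include <stdint.h>

/* Write v into out[0..3], most significant byte first. */
static void put_u32_be(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFFu);
    out[1] = (unsigned char)((v >> 16) & 0xFFu);
    out[2] = (unsigned char)((v >> 8) & 0xFFu);
    out[3] = (unsigned char)(v & 0xFFu);
}

/* Rebuild the value from four bytes stored most significant byte first. */
static uint32_t get_u32_be(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24)
         | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)
         |  (uint32_t)in[3];
}
```

Since only arithmetic on the value is involved, the same source gives the same bytes on little-endian and big-endian hosts. Floating-point types need a separately defined encoding.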

--
With best wishes,
Igor Tandetnik

With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea. It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead. -- RFC 1925
