Unions, aliasing and type-punning in practice: what works and what does not?
I have trouble understanding what can and cannot be done with unions in GCC. I read the questions about it (in particular here and here), but they focus on the C++ standard, and I feel there is a mismatch between the C++ standard and practice (the commonly used compilers).
In particular, I recently found confusing information in the GCC online documentation while reading about the compilation flag -fstrict-aliasing. It says:
-fstrict-aliasing
Allow the compiler to assume the strictest aliasing rules applicable to the language being compiled. For C (and C++), this activates optimizations based on the type of expressions. In particular, an object of one type is assumed never to reside at the same address as an object of a different type, unless the types are almost the same.
For example, an unsigned int can alias an int, but not a void* or a double. A character type may alias any other type.
Pay special attention to code like this:
    union a_union {
        int i;
        double d;
    };

    int f() {
        union a_union t;
        t.d = 3.0;
        return t.i;
    }
The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common.
Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above works as expected.
This is what I think I understood from this example, along with my doubts:
1) Aliasing only works between similar types, or char.
Consequence of 1): aliasing, as the word suggests, is when you have one value and two members to access it (i.e. the same bytes);
Doubt: are two types similar when they have the same size in bytes? If not, what are similar types?
Consequence of 1): for non-similar types (whatever this means), aliasing does not work;
2) Type punning is when we read a different member than the one we wrote to; it's common and it works as expected as long as the memory is accessed through the union type;
Doubt: is aliasing a specific case of type-punning where types are similar?
I get confused because it says unsigned int and double are not similar, so aliasing does not work; then the example aliases int and double and clearly says it works as expected, but calls it type-punning: not because the types are or are not similar, but because it reads from a member it did not write. But reading from a member it did not write is what I understood aliasing to be for (as the word suggests). I'm lost.
The questions:
Can someone clarify the difference between aliasing and type-punning, and which uses of the two techniques work as expected in GCC? And what does the compiler flag do?
c++ gcc strict-aliasing
"I feel there's a mismatch between the specs and the practice" Until you upgrade your compiler and everything wreaks havoc! (true story) – YSC, 8 hours ago (score: 5)
For when you really need type punning: stackoverflow.com/a/17790026/8120642 – hegel5000, 5 hours ago (score: 1)
Question asked 8 hours ago by L.C.; edited 1 hour ago by Justin
3 Answers
Aliasing can be taken literally for what it means: it is when two different expressions refer to the same object. Type-punning is to "pun" a type, i.e. to use an object of some type as if it were a different type.
Formally, type-punning is undefined behaviour with only a few exceptions. It commonly happens when you fiddle with bits carelessly:
    int mantissa(float f)
    {
        return (int&)f & 0x7FFFFF;    // Accessing a float as if it's an int
    }
The exceptions are (simplified):
- Accessing integers as their unsigned/signed counterparts
- Accessing anything as a char, unsigned char or std::byte
This is known as the strict-aliasing rule: the compiler can safely assume two expressions of different types never refer to the same object (except for the exceptions above) because they would otherwise have undefined behaviour. This facilitates optimizations such as:
    void transform(float* dst, const int* src, int n)
    {
        for(int i = 0; i < n; i++)
            dst[i] = src[i];    // Can be unrolled and use vector instructions
                                // If dst and src alias the results would be wrong
    }
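For contrast with the cast in mantissa above, here is a minimal sketch that stays within the rule by copying the bytes with std::memcpy instead of reading the float through an int lvalue. The name mantissa_bits is made up for illustration, and the sketch assumes a 32-bit IEEE-754 float.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the original answer).
// Assumes float is a 32-bit IEEE-754 single-precision value.
std::uint32_t mantissa_bits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    // Copy the object representation byte by byte; no object is
    // accessed through an lvalue of an unrelated type.
    std::memcpy(&bits, &f, sizeof bits);
    return bits & 0x7FFFFFu;    // low 23 bits hold the fraction field
}
```

Compilers commonly turn a fixed-size memcpy like this into a plain register move, so in practice the copy tends to cost nothing.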
What GCC says is that it relaxes the rules a bit and allows type-punning through unions, even though the standard doesn't require it to:
    union {
        int64_t num;
        struct {
            int32_t hi, lo;
        } parts;
    } u = {42};
    u.parts.hi = 420;
This is the type-pun GCC guarantees will work. Other cases may appear to work but may one day silently break.
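As a rough illustration of the union pun described above, the sketch below reads a float's bits through a union and, for comparison, through std::memcpy. The helper names are hypothetical, the union version relies on the behaviour GCC documents rather than on anything the C++ standard promises, and a 32-bit float is assumed.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: union-based pun.
// Works as documented by GCC; not guaranteed by the C++ standard.
std::uint32_t bits_via_union(float f)
{
    union { float f; std::uint32_t u; } pun;
    pun.f = f;
    return pun.u;    // reads a member other than the one last written
}

// Hypothetical helper: the standard-conforming alternative.
std::uint32_t bits_via_memcpy(float f)
{
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}
```

With GCC both functions are expected to return the same value; only the second one is portable to compilers that make no promise about union punning.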
I think your example fails in that the layout of the bit fields in that structure is itself implementation defined. The poor definition of bit fields in C is one of those really annoying things that it is probably way too late to fix. The type pun is ok (in GCC at least), but the bit field may or may not do what you expect. – Dan Mills, 5 hours ago (score: 1)
@DanMills Fair, but I couldn't think of a nice and easy pun off the top of my head. I reckoned if I wanted to show what practically works, might as well go all the way. – Passer By, 3 hours ago
@PasserBy one somewhat common example is something like union { long long x; struct { unsigned low, high; }; } (or the same, but with unsigned[2]; you get the idea). – Dan M., 2 hours ago (score: 1)
In contexts other than the gcc/clang interpretation of the "strict aliasing" rule, the term "aliasing" would not be used to describe situations in which one reference is used to derive another, and the new reference is used to access the object and then abandoned before the object is used in any other way. – supercat, 19 mins ago
Terminology is a great thing, I can use it however I want, and so can everyone else!
"are two types similar when they have the same size in bytes? If not, what are similar types?"
Roughly speaking, types are similar when they differ by constness or signedness. Size in bytes alone is definitely not sufficient.
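A small sketch of what "differ by constness or signedness" permits in practice: an int object accessed through an unsigned lvalue and through a const int lvalue. The function names are invented for this example.

```cpp
// Hypothetical helpers illustrating permitted aliasing.

// Reading an int through its unsigned counterpart is allowed.
unsigned read_as_unsigned(int& x)
{
    return *reinterpret_cast<unsigned*>(&x);
}

// Writing through the unsigned counterpart is allowed as well.
void set_via_unsigned(int& x, unsigned v)
{
    *reinterpret_cast<unsigned*>(&x) = v;
}

// Adding const only changes qualification, so this is fine too.
int read_as_const(int& x)
{
    const int* p = &x;
    return *p;
}
```

Doing the same with float* or double* in place of unsigned* would fall outside these exceptions, even on platforms where the sizes happen to match.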
"is aliasing a specific case of type-punning where types are similar?"
Type punning is any technique that circumvents the type system.
Aliasing is a specific case of that which involves placing objects of different types at the same address. Aliasing is generally allowed when types are similar, and forbidden otherwise. In addition, one may access an object of any type through a char (or similar to char) lvalue, but doing the opposite (i.e. accessing an object of type char through a dissimilar type lvalue) is not allowed. This is guaranteed by both the C and C++ standards; GCC simply implements what the standards mandate.
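The character-type exception can be sketched as follows: any object's bytes may be read through an unsigned char lvalue. The template name object_bytes is an assumption made for this example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: copies out the object representation of obj.
// Reading through unsigned char is permitted for any object type.
template <typename T>
std::vector<unsigned char> object_bytes(const T& obj)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&obj);
    return std::vector<unsigned char>(p, p + sizeof(T));
}
```

The order of the returned bytes depends on the platform's endianness, so callers should not assume a particular layout. The reverse direction, treating a char buffer as if it held some other type, is the case the paragraph above says is not allowed.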
GCC documentation seems to use "type punning" in a narrow sense of reading a union member other than the one last written to. This kind of type punning is allowed by the C standard even when types are not similar. On the other hand, the C++ standard does not allow this. GCC may or may not extend the permission to C++; the documentation is not clear on this.
Without -fstrict-aliasing, GCC apparently relaxes these requirements, but it isn't clear to what exact extent. Note that -fstrict-aliasing is the default when performing an optimised build.
Bottom line: just program to the standard. If GCC relaxes the requirements of the standard, it isn't significant and isn't worth the trouble.
In ANSI C (AKA C89) you have (section 3.3.2.3 Structure and union members):
if a member of a union object is accessed after a value has been stored in a different member of the object, the behavior is implementation-defined
In C99 you have (section 6.5.2.3 Structure and union members):
If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.
In other words, union-based type punning is allowed in C, although the actual semantics may differ depending on the language standard supported (note that the C99 semantics are narrower than C89's implementation-defined behaviour).
In C99 you also have (section 6.5 Expressions):
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
And there's a section (6.2.7 Compatible type and composite type) in C99 that describes compatible types:
Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are described in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers, and in 6.7.5 for declarators. ...
And then (6.7.5.1 Pointer declarators):
For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
Simplifying it a bit, this means that in C, by using a pointer, you can access signed ints as unsigned ints (and vice versa) and you can access individual chars in anything. Anything else would amount to an aliasing violation.
You can find similar language in the various versions of the C++ standard. However, as far as I can see, in C++03 and C++11 union-based type punning isn't explicitly allowed (unlike in C).
UV: this answer clarifies the "compatible types" concept (I suppose that's what they mean by "similar types"). I totally agree it's not explicitly allowed by the standard, but it works in some cases with GCC. It's one situation where "not explicitly allowed" does not mean forbidden. – L.C., 6 hours ago
@L.C. it doesn't mean that it won't suddenly break on a different compiler, arch, OS or even a new compiler version either. – Dan M., 2 hours ago (score: 1)
You are right, point taken... but writing code that is open source, flexible, portable etc. is not always the main goal. It's not elegant, it's not good practice, but sometimes one just wants a binary that runs on the current machine / OS... so if the compiler produces "valid" code that does what's expected... why not! – L.C., 2 hours ago
First answer (the one beginning "Aliasing can be taken literally"): answered 6 hours ago by Passer By; edited 2 hours ago
I think your example fails in that the layout of the bit fields in that structure is itself implementation defined. The poor definition of bit fields in C is one of those really annoying things that it is probably way too late to fix. The type pun is ok (in GCC at least), but the bit field may or may not do what you expect.
– Dan Mills
5 hours ago
@DanMills Fair, but I couldn't think of a nice and easy pun off the top of my head. I reckoned if I wanted to show what practically works, might as well go all the way.
– Passer By
3 hours ago
@PasserBy one somewhat common example is something like union { long long x; struct { unsigned low, high } } (or same, but with unsigned[2], you get the idea).
– Dan M.
2 hours ago
In contexts other than the gcc/clang interpretation of the "strict aliasing" rule, the term "aliasing" would not be used to describe situations in which one reference is used to derive another, and the new reference is used to access the object and then abandoned before the object is used in any other way.
– supercat
19 mins ago
Terminology is a great thing, I can use it however I want, and so can everyone else!
are two types similar when they have the same size in bytes? If not, what are similar types?
Roughly speaking, types are similar when they differ by constness or signedness. Size in bytes alone is definitely not sufficient.
is aliasing a specific case of type-punning where types are similar?
Type punning is any technique that circumvents the type system.
Aliasing is a specific case of that which involves placing objects of different types at the same address. Aliasing is generally allowed when types are similar, and forbidden otherwise. In addition, one may access an object of any type through a char (or similar to char) lvalue, but doing the opposite (i.e. accessing an object of type char through a dissimilar type lvalue) is not allowed. This is guaranteed by both the C and C++ standards; GCC simply implements what the standards mandate.
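As a small illustration of the char exception, here is a sketch in C that inspects the bytes of an arbitrary object through an `unsigned char` pointer. The helper name `count_nonzero_bytes` is made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper: counts the non-zero bytes in any object. */
size_t count_nonzero_bytes(const void *obj, size_t size)
{
    /* Reading any object through unsigned char is always permitted. */
    const unsigned char *p = (const unsigned char *)obj;
    size_t count = 0;
    for (size_t k = 0; k < size; k++)
        if (p[k] != 0)
            count++;
    return count;
}
```

It can be called on any object, for instance `count_nonzero_bytes(&some_int, sizeof some_int)`. The reverse direction, e.g. treating a `char` buffer as an `int` through a cast pointer, is the case that is not allowed.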
GCC documentation seems to use "type punning" in a narrow sense of reading a union member other than the one last written to. This kind of type punning is allowed by the C standard even when types are not similar. OTOH the C++ standard does not allow this. GCC may or may not extend the permission to C++, the documentation is not clear on this.
Without -fstrict-aliasing, GCC apparently relaxes these requirements, but it isn't clear to what exact extent. Note that -fstrict-aliasing is the default when performing an optimised build.
Bottom line, just program to the standard. If GCC relaxes the requirements of the standard, it isn't significant and isn't worth the trouble.
answered 5 hours ago by n.m.
In ANSI C (AKA C89) you have (section 3.3.2.3 Structure and union members):
if a member of a union object is accessed after a value has been stored in a different member of the object, the behavior is implementation-defined
In C99 you have (section 6.5.2.3 Structure and union members):
If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.
IOW, union-based type punning is allowed in C, although the actual semantics may be different, depending on the language standard supported (note that the C99 semantics is narrower than the C89's implementation-defined).
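A minimal sketch of such a union-based pun, valid as C under the C99 wording quoted above. It assumes `float` and `uint32_t` have the same size and that `float` uses the IEEE 754 binary32 format; the function and member names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical name; assumes sizeof(float) == sizeof(uint32_t)
   and the IEEE 754 binary32 format. */
uint32_t float_bits_via_union(float f)
{
    union {
        float    as_float;
        uint32_t as_bits;
    } pun;

    pun.as_float = f;    /* store through one member */
    return pun.as_bits;  /* read through the other: the bytes are
                            reinterpreted as described in 6.5.2.3 */
}
```

Under those assumptions `float_bits_via_union(1.0f)` gives `0x3F800000`. Since `uint32_t` has no padding bits, the trap-representation caveat from the quote does not arise in this direction.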
In C99 you also have (section 6.5 Expressions):
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
And there's a section (6.2.7 Compatible type and composite type) in C99 that describes compatible types:
Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are described in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers, and in 6.7.5 for declarators. ...
And then (6.7.5.1 Pointer declarators):
For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
Simplifying it a bit, this means that in C by using a pointer you can access signed ints as unsigned ints (and vice versa) and you can access individual chars in anything. Anything else would amount to an aliasing violation.
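For instance, the signed/unsigned case can be sketched like this in C (the helper name `read_as_unsigned` is made up for illustration):

```c
/* Hypothetical helper: reads an int object through an unsigned lvalue. */
unsigned read_as_unsigned(const int *p)
{
    /* unsigned int is the unsigned type corresponding to int, so this
       access is covered by the signed/unsigned bullet of 6.5.
       Casting to const float * here instead would be a violation. */
    return *(const unsigned *)p;
}
```

For a non-negative `int` the value read is the same number. For a negative one the result depends on the representation; on a two's complement machine, -1 reads back as `UINT_MAX`.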
You can find similar language in the various versions of the C++ standard. However, as far as I can see in C++03 and C++11 union-based type punning isn't explicitly allowed (unlike in C).
answered 6 hours ago by Alexey Frunze
UV: this answer clarifies the "compatible types" concept (I suppose that's what they mean by "similar types"). I totally agree it's not explicitly allowed by the standard, but it works in some cases with GCC. It's one situation where "not explicitly allowed" does not mean forbidden.
– L.C.
6 hours ago
@L.C. it doesn't mean that it won't suddenly break on a different compiler, arch, OS or even new compiler version either.
– Dan M.
2 hours ago
You are right, point taken... but writing code that is open source, flexible, portable etc. is not always the main goal. It's not elegant, it's not good practice, but sometimes one just wants a binary that runs on the current machine / OS... so if the compiler produces "valid" code that does what's expected... why not!
– L.C.
2 hours ago
"I feel there's a mismatch between the specs and the practice" Until you upgrade your compiler and everything wreaks havoc! (true story)
– YSC
8 hours ago
For when you really need type punning: stackoverflow.com/a/17790026/8120642
– hegel5000
5 hours ago