Unions
Unions are pretty much the definition of what Rust doesn't allow in safe code: an area of memory that can be interpreted to mean multiple things. A union is at least as large as its largest field, rounded up to satisfy its alignment. They are also really useful!
So let's add a C union to our C header file:
union MyUnion {
int integer;
char byte;
};
What does bindgen come up with?
#[repr(C)]
#[derive(Copy, Clone)]
pub union MyUnion {
    pub integer: ::std::os::raw::c_int,
    pub byte: ::std::os::raw::c_char,
}
Rust actually supports unions! It's just unsafe to do very much with them.
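To get a feel for how a union is laid out, here is a small sketch that checks size and alignment with `std::mem`. `MyUnion` is repeated from the bindgen output so the snippet stands alone; the `Padded` union and the `layout` helper are made up for illustration and are not part of the header or of bindgen's output.

```rust
use std::mem::{align_of, size_of};
use std::os::raw::{c_char, c_int};

// Same shape as the bindgen output above, repeated so this compiles on its own.
#[repr(C)]
#[derive(Copy, Clone)]
pub union MyUnion {
    pub integer: c_int,
    pub byte: c_char,
}

// Hypothetical union: the largest field is 5 bytes, but the u32 field
// raises the alignment, so the total size gets rounded up.
#[repr(C)]
#[derive(Copy, Clone)]
pub union Padded {
    pub bytes: [u8; 5],
    pub word: u32,
}

// Hypothetical helper: returns (size, alignment) in bytes for any type.
pub fn layout<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}
```

On common targets where `c_int` is 32 bits wide, `layout::<MyUnion>()` should report a size of 4, and `layout::<Padded>()` should report a size larger than the 5 bytes of its biggest field, because the size must be a multiple of the alignment.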
You can create a Rust union safely. Just set exactly ONE field when you initialize it, and your code is both safe and sound:
#[test]
fn testing_unions() {
    let u = MyUnion { integer: 12 };
    let u = MyUnion { byte: 1 };
}
Reading a union field is always unsafe, because the compiler cannot know which field was last written, so it cannot promise that the bytes are a valid value of the type you ask for. It works, though:
#[test]
fn testing_unions() {
    // This kinda makes sense
    let u = MyUnion { integer: 12 };
    assert_eq!(unsafe { u.integer }, 12);
    // Reads only the first byte of the integer. Rust allows this because any
    // initialized byte is a valid c_char, but the result assumes a little-endian target.
    assert_eq!(unsafe { u.byte }, 12);
}
And this kinda works:
// This is getting weird
let u = MyUnion { integer: 512 };
assert_eq!(unsafe { u.integer }, 512);
assert_eq!(unsafe { u.byte }, 0); // You're just reading the first byte!
And please don't do this:
// This is just wrong
let u = MyUnion { byte: 1 };
// Only a single byte was initialized, but this reads all the bytes of the integer.
// The remaining bytes are uninitialized, so this is undefined behavior and the assertion may fail.
assert_eq!(unsafe { u.integer }, 1);
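If you really do need to read the wide field after writing the narrow one, one possible approach is to initialize the whole union through its widest field first and then overwrite the byte. The sketch below assumes the `MyUnion` layout from the bindgen output (repeated here so it compiles on its own); the function names are hypothetical.

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_int};

// Same shape as the bindgen output, repeated so this compiles on its own.
#[repr(C)]
#[derive(Copy, Clone)]
pub union MyUnion {
    pub integer: c_int,
    pub byte: c_char,
}

// Hypothetical helper: write the byte field, then read the integer field.
pub fn byte_then_integer() -> c_int {
    // Initialize every byte of the union through the widest field.
    let mut u = MyUnion { integer: 0 };
    // Assigning to a Copy field of a union is safe; it overwrites one byte.
    u.byte = 1;
    // All bytes are initialized, so this read is defined. The value still
    // depends on the byte order of the target.
    unsafe { u.integer }
}

// Hypothetical helper: the value we expect, built from the same bytes.
pub fn expected_integer() -> c_int {
    let mut bytes = [0u8; size_of::<c_int>()];
    bytes[0] = 1; // the byte field lives at offset 0
    c_int::from_ne_bytes(bytes)
}
```

On a little-endian machine `byte_then_integer()` gives 1; on a big-endian machine the 1 lands in the most significant byte instead, which is why the expected value is built with `from_ne_bytes` rather than hard-coded.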
Unions are REALLY useful
Even though they are unsafe (even C++ tried to restrict them, but the userbase said NO), unions can be super useful. Here's the Linux definition of an IPv6 address:
struct in6_addr
{
union
{
__u8 u6_addr8[16];
__be16 u6_addr16[8];
__be32 u6_addr32[4];
} in6_u;
};
It's 128 bits of data, but you can access it as bytes, 16-bit numbers or 32-bit numbers. Likewise, it's common to use a union of 4 bytes and a 32-bit integer (big endian) for an IPv4 address: access the octets individually, or all of them as a single number.
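As a rough sketch of that IPv4 idea in Rust: the `Ipv4Bits` union and both helper functions below are hypothetical names invented for illustration, not taken from any header or library. Both fields cover the same 4 bytes, so whichever one you write, every byte is initialized and reading the other is well defined.

```rust
// Hypothetical union: four octets, or the same bytes as one 32-bit number
// stored in network (big-endian) byte order.
#[repr(C)]
#[derive(Copy, Clone)]
pub union Ipv4Bits {
    pub octets: [u8; 4],
    pub be_value: u32,
}

// Build the address from its dotted-quad parts. No unsafe needed to create it.
pub fn from_octets(a: u8, b: u8, c: u8, d: u8) -> Ipv4Bits {
    Ipv4Bits { octets: [a, b, c, d] }
}

// Read the address as a single number in host byte order.
pub fn as_host_u32(ip: Ipv4Bits) -> u32 {
    // Reading is unsafe, but sound here: all 4 bytes are initialized and
    // every bit pattern is a valid u32.
    let raw = unsafe { ip.be_value };
    // The bytes are stored big-endian, so convert to the host's byte order.
    u32::from_be(raw)
}
```

With this sketch, `as_host_u32(from_octets(192, 168, 0, 1))` yields `0xC0A8_0001` regardless of the machine's endianness, because `u32::from_be` does the byte swap only where it is needed.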