Swift and C interoperability

September 25, 2018 | Igor Ranieri

Bringing transparency to opaque pointers

I’ve been writing iOS apps and libraries for cryptography and FinTech for the last couple of years. Most existing libraries and frameworks in those fields are implemented in C, which shouldn’t be surprising, as C is still considered the most efficient and performant way of writing platform-agnostic code. If we want to use Swift, this leaves us with two options: wrapping everything in Objective-C wrappers that are exposed to Swift, or operating directly on those C APIs.

Of course, cryptography isn't the only realm of mostly C or C++ libraries. Amongst others, we'll see that image manipulation, computer vision and natural language processing also rely heavily on them.

Swift provides us with two main ways to deal with pointers: manual memory management, and the C-interoperability collection of classes, structs and methods. Faced with a C library, and wanting to avoid doing the same work twice, I opted for the latter.

The first gives us the ability to manipulate memory directly, and the second, some way of operating on basic C types.

At first glance, they can seem a bit daunting. When should I use an UnsafeRawPointer, or an OpaquePointer? And what's the difference again? And all those types with unsafe right there in the name – that can't be safe, right? (🥁)

What's what

As the name implies, unsafe pointer types are unsafe, meaning that you're responsible for everything they do with memory: allocating, accessing, deallocating. It’s all on you! 🙀

So let’s go over them.

OpaquePointer

Opaque pointers are used to represent C pointers to types that can’t be represented in Swift, for example incomplete (C) structs. But one of the main uses I've found for them is as an intermediary type when I need to cast a pointer to a different type of pointer.

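// `length` is defined elsewhere; it must be at least 32 for the indices below.
var privateKey = [UInt8](repeating: 0, count: length)
let randomData = Data.generateSecureRandomData(count: length)
// Copy the random bytes into the array. Passing `&randomData` to memcpy would
// hand over the Data value itself rather than the bytes it manages.
randomData.copyBytes(to: &privateKey, count: length)

// Clamp the key bytes.
privateKey[0] &= 248
privateKey[31] &= 127
privateKey[31] |= 64

// Caution: a pointer taken from an array this way is only guaranteed to be valid
// for the duration of the call, so don't hold on to it longer than the array.
let privateKeyPointer = UnsafeMutablePointer<ec_private_key>(OpaquePointer(privateKey))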

Here, we create an OpaquePointer to our array of UInt8 that would come to represent a specific data structure unknown to the compiler at the time – an ec_private_key. If a C operation changes the nature of a pointer, Swift doesn't really know about it, so we have to handle that manually.
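
The other use, standing in for a type Swift can't see, can be sketched without a C library at all. This is a minimal sketch, assuming a hypothetical Session struct that plays the role of a C struct the library keeps private; makeSession, sessionID and destroySession are made-up names standing in for the C functions that would normally create, read and free it:

```swift
// Hypothetical stand-in for a C struct whose layout the caller never sees.
struct Session {
    var id: Int32
    var isOpen: Bool
}

// Plays the role of a C constructor that returns an opaque handle.
func makeSession(id: Int32) -> OpaquePointer {
    let storage = UnsafeMutablePointer<Session>.allocate(capacity: 1)
    storage.initialize(to: Session(id: id, isOpen: true))
    return OpaquePointer(storage)
}

// Plays the role of a C accessor; only this side knows the real type.
func sessionID(_ handle: OpaquePointer) -> Int32 {
    return UnsafeMutablePointer<Session>(handle).pointee.id
}

// Plays the role of the C destructor.
func destroySession(_ handle: OpaquePointer) {
    let storage = UnsafeMutablePointer<Session>(handle)
    storage.deinitialize(count: 1)
    storage.deallocate()
}

let handle = makeSession(id: 42)
let id = sessionID(handle)
destroySession(handle)
print(id)
```

The caller only ever touches the OpaquePointer, which is roughly how an incomplete C struct shows up on the Swift side.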

UnsafeRawPointer

Well, that's all nice, but what happens when we don't really know, or don't need to know, what kind of data we're pointing to? Or what if it changes?

With the UnsafeRawPointer, we can access untyped data. It's also very useful when you need to send it to different C-backed methods that expect differently named yet equivalent data types (this happens way too often, due to the nature of C types).

let pointer = UnsafeMutableRawPointer(address)
a_c_function(pointer.assumingMemoryBound(to: some_type.self))
b_c_function(pointer.assumingMemoryBound(to: equivalent_type_in_diff_library.self))
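
To make the untyped part concrete, here is a small self-contained sketch using only the standard library. The function name rawPointerDemo and the values are made up for illustration; the idea is that bytes go in and out without a type, and the memory only gets a type when something typed is needed:

```swift
func rawPointerDemo() -> (first: UInt32, second: UInt32, sum: UInt32) {
    // Allocate 8 untyped bytes, aligned for a UInt32.
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: 8,
        alignment: MemoryLayout<UInt32>.alignment
    )
    defer { raw.deallocate() }

    // Write two values without committing the memory to a type.
    raw.storeBytes(of: 7, as: UInt32.self)
    raw.storeBytes(of: 9, toByteOffset: 4, as: UInt32.self)

    // Read them back, still through the raw pointer.
    let first = raw.load(as: UInt32.self)
    let second = raw.load(fromByteOffset: 4, as: UInt32.self)

    // Bind the memory once, when a typed pointer is actually needed...
    let typed = raw.bindMemory(to: UInt32.self, capacity: 2)
    // ...after which assumingMemoryBound(to:) is a legitimate promise.
    let sameMemory = raw.assumingMemoryBound(to: UInt32.self)

    return (first, second, typed[0] + sameMemory[1])
}

let result = rawPointerDemo()
print(result.sum)
```

Note that assumingMemoryBound(to:) doesn't check anything; it just takes your word for it.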

UnsafePointer and UnsafeBufferPointer

UnsafePointer, UnsafeBufferPointer and their mutable variants are responsible for holding a memory reference to a specific, known ahead of time, C type.

The first, to a specific data structure.

let private_key: UnsafePointer<ec_private_key> // holds a pointer to an ec_private_key C structure.

The latter is a non-owning collection interface to a buffer of elements stored contiguously in memory. In most cases that means a reference to the start of an array of a known type, together with its length.

/* Something like this in C */
int int_ary[10];
// Can come to us as… (C's `int` is imported as `CInt`, i.e. `Int32`, not `Int`)
let int_ary_buffer: UnsafeBufferPointer<CInt>

// Which we could use to create a Data instance, for example.
// Data(buffer:) copies the elements, so the C side keeps ownership of the original array.
let data = Data(buffer: int_ary_buffer)
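
Going the other way, from a Swift array to a buffer pointer, looks something like this. This is a small sketch with made-up values; the buffer is only valid inside the closure, so anything we want to keep has to be copied out:

```swift
import Foundation

let values: [CInt] = [1, 2, 3, 4]

// The buffer pointer is only valid inside the closure.
let (total, data) = values.withUnsafeBufferPointer { buffer -> (CInt, Data) in
    // UnsafeBufferPointer is a Collection, so reduce, map, count and friends work.
    let sum = buffer.reduce(0, +)
    // Data(buffer:) copies the bytes, so the Data can safely outlive the closure.
    return (sum, Data(buffer: buffer))
}

print(total, data.count)
```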

Usage

How do we actually make use of this? Here are a few examples:

Creating a new C struct pointer from a C function:

Sometimes we need to use values created through a C function, for instance when creating encryption key pairs. For that, UnsafeMutablePointer can be of great use:

var aPointer: UnsafeMutablePointer<some_c_struct>?

guard allocate_and_populate(&aPointer, self.sourceStuff) == 0,
  let pointer = aPointer
  else {
    throw InteropError.allocationFailed // `InteropError` is a placeholder for an error type declared elsewhere
  }

// now we can use pointer as a non-optional value that's been populated by a C function.
// we can access the struct properties through the `pointee` accessor.
pointer.pointee.c_struct_field
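
If you want to see both sides of that handshake in one place, here is a sketch in pure Swift. Point, PointError, makePoint and loadX are all hypothetical names; makePoint plays the role of a C function shaped like `int make_point(struct point **out, int x, int y)`:

```swift
// Hypothetical stand-in for a C struct.
struct Point {
    var x: Int32
    var y: Int32
}

enum PointError: Error {
    case allocationFailed
}

// Stand-in for the C function: allocates a Point, writes its address through
// the out-parameter and returns 0 on success.
func makePoint(_ out: UnsafeMutablePointer<UnsafeMutablePointer<Point>?>,
               _ x: Int32, _ y: Int32) -> Int32 {
    let point = UnsafeMutablePointer<Point>.allocate(capacity: 1)
    point.initialize(to: Point(x: x, y: y))
    out.pointee = point
    return 0
}

func loadX() throws -> Int32 {
    var maybePoint: UnsafeMutablePointer<Point>?

    // `&maybePoint` becomes the pointer-to-pointer the function expects.
    guard makePoint(&maybePoint, 3, 4) == 0, let point = maybePoint else {
        throw PointError.allocationFailed
    }
    // We own the allocation now, so we're also the ones who free it.
    defer {
        point.deinitialize(count: 1)
        point.deallocate()
    }
    return point.pointee.x
}

let x = (try? loadX()) ?? -1
print(x)
```

With a real C library the cleanup would go through whatever free function the library provides rather than deallocate().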

Applying transformations to a Swift struct through a C function, including copying Data bytes to an array of primitive data types (UInt8 in this case):

Sometimes we need to use a C function to sign a piece of data. Passing a Data structure back and forth between Swift and C can be cumbersome.

func sign(data: Data) throws -> Data {
  // create some random data
  let length = 64
  let randomData = Data.generateSecureRandomData(count: length)

  // create an "empty" array of bytes, copy the input data there.
  var message = [UInt8](repeating: 0, count: data.count)
  data.copyBytes(to: &message, count: data.count)

  // create another empty array of bytes, copy the random data there.
  var randomBytes = [UInt8](repeating: 0, count: randomData.count)
  randomData.copyBytes(to: &randomBytes, count: randomData.count)

  // create yet another empty array of bytes, this will hold our resulting signature
  var signatureBuffer = [UInt8](repeating: 0, count: length)
  // Cast our private key pointer to the type expected by the C function.
  let privateKey = UnsafeMutablePointer<UInt8>(OpaquePointer(self.keyPairPointer.pointee.private_key))

  // use Curve25519 to sign our data.
  guard curve25519_sign(&signatureBuffer, privateKey, message, UInt(data.count), randomBytes) >= 0
    else {
      throw InteropError.cantSignError // `InteropError` is a placeholder for an error type declared elsewhere
    }

  // returns our signature wrapped inside a Swift Data structure, abstracting all the C stuff away.
  return Data(bytes: signatureBuffer, count: length)
}
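
When the C function only reads the bytes, one alternative to copying into an array is to borrow Data's storage for the duration of the call. This is a sketch assuming Swift 5's Data.withUnsafeBytes; checksum is a made-up stand-in for a C function shaped like `uint32_t checksum(const uint8_t *bytes, size_t len)`:

```swift
import Foundation

// Stand-in for a C function that reads `length` bytes and sums them.
func checksum(_ bytes: UnsafePointer<UInt8>, _ length: Int) -> UInt32 {
    var sum: UInt32 = 0
    for index in 0..<length {
        sum &+= UInt32(bytes[index])
    }
    return sum
}

let message = Data([1, 2, 3, 250])

// Borrow the bytes without copying; the pointer must not escape the closure.
let result: UInt32 = message.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> UInt32 in
    // An empty Data has no base address, so handle that case explicitly.
    guard let base = raw.bindMemory(to: UInt8.self).baseAddress else {
        return 0
    }
    return checksum(base, raw.count)
}

print(result)
```

The trade-off is scope: everything that needs the pointer has to happen inside the closure.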

Allocating a C struct directly, and populating its properties manually:

And sometimes we want more control over how a C struct gets populated, or how its memory is managed on our end. In that case, allocating it and manually populating it can work great:

// Generate data from String
guard let data = name.data(using: .utf8) else {
  fatalError()
}

// Allocate a new instance. Capacity is usually 1, meaning we only want to allocate memory for a single instance.
let person = UnsafeMutablePointer<c_person>.allocate(capacity: 1)

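// assign values to its properties (C's `int` is imported as `Int32`, so convert explicitly)
person.pointee.name_len = Int32(data.count)

// Now we'll populate an array of `char` with the bytes of our name data.
// `char` is imported as `CChar` (`Int8` on Apple platforms), so we copy into `UInt8` storage and rebind it below.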
let bytes = UnsafeMutablePointer<UInt8>.allocate(capacity: data.count)
data.copyBytes(to: bytes, count: data.count)

bytes.withMemoryRebound(to: Int8.self, capacity: data.count) { pointer in
  person.pointee.name = UnsafePointer(pointer)
}

/*
Now we have a c_person!

c_person {
  char* name;
  int name_len;
}
*/
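
What the snippet above doesn't show is the other half of the deal: whatever we allocate, we also have to free. Here is a sketch of the full round trip, using a hypothetical Swift struct Person in place of c_person, with makePerson and destroyPerson as made-up helper names:

```swift
// Hypothetical stand-in for the C struct.
struct Person {
    var name: UnsafeMutablePointer<CChar>?
    var nameLength: Int32
}

func makePerson(name: String) -> UnsafeMutablePointer<Person> {
    let utf8 = Array(name.utf8)

    // One extra byte for the NUL terminator C strings expect.
    let chars = UnsafeMutablePointer<CChar>.allocate(capacity: utf8.count + 1)
    for (index, byte) in utf8.enumerated() {
        chars[index] = CChar(truncatingIfNeeded: byte)
    }
    chars[utf8.count] = 0

    // initialize(to:) puts the struct into a defined state before first use.
    let person = UnsafeMutablePointer<Person>.allocate(capacity: 1)
    person.initialize(to: Person(name: chars, nameLength: Int32(utf8.count)))
    return person
}

func destroyPerson(_ person: UnsafeMutablePointer<Person>) {
    // Free the inner allocation first, then the struct itself.
    person.pointee.name?.deallocate()
    person.deinitialize(count: 1)
    person.deallocate()
}

let person = makePerson(name: "Ada")
let roundTripped = person.pointee.name.map { String(cString: $0) } ?? ""
let length = person.pointee.nameLength
destroyPerson(person)

print(roundTripped, length)
```

Every allocate(capacity:) gets exactly one matching deallocate(), and nothing touches the pointers after destroyPerson has run.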

But here we find our first real head scratcher: all of this example code can silently fail. As soon as we exit the scope, the Swift values backing our C structures (arrays, Data and anything else Swift manages for us) can be deallocated. The pointers aren't nulled for us either; they're left dangling. (Memory we allocate by hand has the opposite problem: it stays around until we deallocate it ourselves.) So if you’re running an asynchronous process or abstracting your interop code inside a model, all of your C operations could be happening on dangling pointers. We need to ensure that our C structures remain in memory. But how?

You might have noticed that Swift does not give us any direct way of incrementing / decrementing the reference counter. No good old retain / release calls. We're faced with a new question: when do I want this to be released from memory? If the answer is "as soon as I've exited this scope" then you're in luck, as that's exactly what's going to happen by default.

Most of us will need this in memory for a bit longer than that, and we'll need to hold on to those values. An easy way to do that is by assigning them to a property. If we’re abstracting all our C interop code inside a model, that model will need to hold on to function arguments as well. That will bind our C stuff to the lifecycle of a particular class or struct, which is much easier to manage.

class SessionRecord: NSObject {
    // We store it as a property here to bind its lifecycle to this object's lifecycle.
    private var sessionRecordPointer: UnsafeMutablePointer<session_record>

    init(data: Data, context: Context) {
        let data = data as NSData

        var recordPointer: UnsafeMutablePointer<session_record>?

        guard session_record_deserialize(&recordPointer, data.bytes.assumingMemoryBound(to: UInt8.self), data.length, context) >= 0,
          let record = recordPointer else {
            fatalError("Could not deserialize session record")
        }

        self.sessionRecordPointer = record

        super.init()
    }
}
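
Tying the pointer to an object also gives us a natural place to free it: deinit. Here is a minimal sketch of that idea with a made-up OwnedRecord class that owns a single Int32; the onFree callback exists only so the example can show deinit actually running, and a real wrapper would call the library's own free function instead:

```swift
final class OwnedRecord {
    // The object owns this allocation for its whole lifetime.
    private let pointer: UnsafeMutablePointer<Int32>
    private let onFree: () -> Void

    init(value: Int32, onFree: @escaping () -> Void) {
        pointer = UnsafeMutablePointer<Int32>.allocate(capacity: 1)
        pointer.initialize(to: value)
        self.onFree = onFree
    }

    deinit {
        // Runs when the last reference goes away, so the memory
        // lives exactly as long as the object does.
        pointer.deinitialize(count: 1)
        pointer.deallocate()
        onFree()
    }

    var value: Int32 {
        return pointer.pointee
    }
}

func demo() -> (value: Int32, freed: Bool) {
    var freed = false
    var record: OwnedRecord? = OwnedRecord(value: 5, onFree: { freed = true })
    let value = record?.value ?? -1
    record = nil // last reference dropped, deinit runs here
    return (value, freed)
}

let outcome = demo()
print(outcome.value, outcome.freed)
```

This only works with a class; a struct has no deinit, so it can hold the pointer but can't clean up after it.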

All’s well that ends wel– EXC_BAD_ACCESS

Even for C veterans this stuff is hard. It’s hard to write, read and test, and it's difficult to ensure that memory is correctly managed. At the end of the day, the best we can do is to double-check everything and test it and test it again, until it works just right.