Getting Started with Twilio Voice for iOS: Accepting VoIP Calls

Accepting VoIP calls using CallKit and Twilio Voice for iOS


What is VoIP?
VoIP stands for "Voice over Internet Protocol", and in short, it means using the internet to make phone calls, instead of using traditional phone technology. It has numerous advantages over traditional analog phones, including lower cost and more flexible options for custom setups such as forwarding, waiting attendant, and call recording.

Prerequisites
- This guide is for the iOS portion of a Twilio VoIP app. You must also have a server component and Twilio account with API keys for Twilio Voice.

Getting Started
There are several things we'll need to build out the iOS side of the app:
1) the CallKit framework - this allows our app to use the native iOS Phone UI for controlling calls. We will receive callbacks for when each UI element is tapped.
2) PushKit framework - allows our app to receive push notifications for incoming calls. It is responsible for registering for push notifications, and gives us a unique token which we send to Twilio. We will need to create a VoIP certificate for this as well in our Apple Developer account and upload it to the Twilio console.
3) the Twilio Voice framework (cocoapod) - connects our app to Twilio. You'll need to add pod 'TwilioVoice' to your Podfile.

Understanding the call flow
It's important to understand what is actually occurring when someone calls you and it is redirected to your iOS app.
- First, the phone number that the external user is calling will need to be a number registered with Twilio. This can either be a phone number supplied by Twilio, or your own phone number that you have registered with Twilio.
- When the external user calls the number, Twilio will fetch info for that number and will perform actions based on content you have defined server-side using TwiML.
- The behavior defined in the TwiML will specify a client with a unique identifier to call. This unique identifier can be any string, but since calls are client-specific, usually it is generated iOS-side and then sent to your server and saved there.
- This identifier is associated with a unique device push token (as mentioned above). Twilio will know which iOS device to call based on this information.
- The call comes to iOS and first stops by PKPushRegistry's "didReceiveIncomingPushWith payload: PKPushPayload..." delegate method
- We pass the push and its payload to Twilio
- Twilio invokes its delegate, where the call is reported to CallKit
- CallKit lets the iOS system know about the incoming call, and then it appears on your phone's UI
- When you push the green phone button on your phone's UI to answer the call, the CXProviderDelegate is invoked for the CXAnswerCallAction. Here is where we finally use Twilio to actually answer the call and establish a voice connection.
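
The client identifier mentioned in the flow above can be generated once on the device and then reused. Here is a minimal sketch of one way to do that, assuming the identifier lives in UserDefaults. The function name and the "voipClientIdentity" key are hypothetical, and stripping the hyphens is just a conservative choice to keep the value alphanumeric:

```swift
import Foundation

/// Returns a stable client identity, creating and saving one on first use.
/// The default key name is a hypothetical choice for this sketch.
func clientIdentity(in defaults: UserDefaults = .standard,
                    key: String = "voipClientIdentity") -> String {
    // Reuse the saved identity so the server mapping stays valid across launches.
    if let existing = defaults.string(forKey: key) {
        return existing
    }
    // 32 hex characters derived from a random UUID.
    let identity = UUID().uuidString.replacingOccurrences(of: "-", with: "")
    defaults.set(identity, forKey: key)
    return identity
}
```

You would send the returned string to your server once, so that your TwiML can later address this client by it.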

Strategy
With the above in mind, let's carve out a strategy for how we can set things up on the iOS client:
- Add the Twilio Voice pod to our Podfile
- Next: we will need to register for VoIP notifications using PushKit
- Then, we will need to send our device token to Twilio
- Separately, we'll need to retrieve a Twilio access token, which should be provided by the server component of your app. These tokens expire after 1 to 24 hours, so you'll need a way to refresh them. Strictly speaking, you can still receive calls after your access token expires. However, you won't be able to manipulate the call from the client through Twilio (for example, hang up) until you have a valid token again, since that requires an internal API call to Twilio that uses your access token for authentication.
- We will need to set up CallKit by creating a CXProvider object and a CXCallController object.
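
Since access tokens expire, it helps to know when the current one is about to run out. Twilio access tokens are JWTs, so one option is to read the `exp` claim on the client. The sketch below assumes that approach; the function names and the 5-minute leeway are hypothetical, and it does not verify the signature (that is the server's job):

```swift
import Foundation

/// Reads the `exp` claim (seconds since 1970) from a JWT without verifying it.
func expirationDate(ofJWT token: String) -> Date? {
    let segments = token.split(separator: ".")
    guard segments.count == 3 else { return nil }

    // JWT segments are base64url without padding; convert to standard base64.
    var base64 = String(segments[1])
        .replacingOccurrences(of: "-", with: "+")
        .replacingOccurrences(of: "_", with: "/")
    while base64.count % 4 != 0 { base64.append("=") }

    guard let data = Data(base64Encoded: base64),
          let json = try? JSONSerialization.jsonObject(with: data) as? [String: Any],
          let exp = json["exp"] as? NSNumber else { return nil }
    return Date(timeIntervalSince1970: exp.doubleValue)
}

/// True when the token is unreadable, expired, or expires within `leeway` seconds.
func needsRefresh(token: String, now: Date = Date(), leeway: TimeInterval = 300) -> Bool {
    guard let expiry = expirationDate(ofJWT: token) else { return true }
    return expiry.timeIntervalSince(now) <= leeway
}
```

A check like `needsRefresh(token:)` could run when the app comes to the foreground, or right before any call action that needs a valid token, and trigger a request to your server for a new one.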

Setup
First, go to the Signing and Capabilities tab in your Xcode project and enable the Background Modes capability. For a VoIP app this typically means checking "Voice over IP" and "Audio, AirPlay, and Picture in Picture":

[Screenshot: Background Modes settings in Xcode]

You need to add the Push Notifications capability as well.

Next, create an entry in your Info.plist for the key "NSMicrophoneUsageDescription" with a description of why you need microphone access.

<key>NSMicrophoneUsageDescription</key>
<string>For answering VoIP calls</string>

Register for VoIP Notifications
There are 2 parts here--the code part, and the certificate part. First, for the code. Let's initialize our PKPushRegistry instance:

let voipRegistry = PKPushRegistry(queue: .main)
voipRegistry.delegate = self
voipRegistry.desiredPushTypes = [.voIP]

There are 3 methods we need to implement to conform to the PKPushRegistryDelegate. At this point in the application lifecycle, you should already have retrieved your access token from your server:

import TwilioVoice

func pushRegistry(_ registry: PKPushRegistry, didUpdate pushCredentials: PKPushCredentials, for type: PKPushType) {
    // Here call TwilioVoice.register(accessToken: accessToken, deviceToken: pushCredentials.token)
    // with the push credentials
}

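func pushRegistry(_ registry: PKPushRegistry, didInvalidatePushTokenFor type: PKPushType) {
    // No credentials are passed to this method, so keep the token you received in didUpdate
    // (savedDeviceToken is a placeholder name) and here call
    // TwilioVoice.unregister(accessToken: accessToken, deviceToken: savedDeviceToken)
}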

func pushRegistry(_ registry: PKPushRegistry, didReceiveIncomingPushWith payload: PKPushPayload, for type: PKPushType, completion: @escaping () -> Void) { }

This code is mostly self explanatory. When you receive the push notification credentials, you send them to Twilio, along with your access token. The last method, didReceiveIncomingPushWith, is what we will use to handle incoming calls. We will implement it later.
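
Note that didInvalidatePushTokenFor does not hand you the push credentials, so the device token has to be kept around from the earlier didUpdate callback if you want to unregister it. Below is a minimal sketch of one way to do that, assuming UserDefaults as the store. The type name, key name, and hex helper are all hypothetical:

```swift
import Foundation

/// Keeps the most recent VoIP device token so it is still available
/// when PushKit later reports that the token was invalidated.
struct DeviceTokenCache {
    let defaults: UserDefaults
    let key: String

    // "cachedVoIPDeviceToken" is a hypothetical key name for this sketch.
    init(defaults: UserDefaults = .standard, key: String = "cachedVoIPDeviceToken") {
        self.defaults = defaults
        self.key = key
    }

    func save(_ token: Data) { defaults.set(token, forKey: key) }
    func load() -> Data? { defaults.data(forKey: key) }
    func clear() { defaults.removeObject(forKey: key) }
}

/// Hex form of a device token, handy for logging while debugging registration.
func hexString(from token: Data) -> String {
    token.map { String(format: "%02x", $0) }.joined()
}
```

In didUpdate you would call `save(pushCredentials.token)` after registering, and in didInvalidatePushTokenFor you would `load()` the token, unregister it, and then `clear()` the cache.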

Now, for the certificate. This isn't a tutorial on what certificates are or how to create them necessarily, but briefly, it means you need to:
1) create a certificate signing request on your Mac
2) log into your Apple Developer account on developer.apple.com
3) navigate to Certificates, Identifiers, and Profiles
4) select "Certificates"
5) create a new certificate of type VoIP Services using your certificate signing request
6) download that certificate locally
7) open the certificate in your Keychain, and export it as a .p12 file.
8) Extract the certificate and private key from the .p12 file, as described here: https://github.com/twilio/voice-quickstart-ios
9) upload the certificate info to Twilio under Dashboard -> Settings -> Credentials -> Push Credentials. Take note of the SID value for the certificate, since your app's server component will need this when setting up Twilio.

CallKit
It's important to understand that CallKit is a UI-level framework. CallKit provides an iOS system UI for our call to interact with, and gives us callbacks for UI-related events, such as when the user mutes the call, presses the "Hold" button, or presses the red phone button to hang up. CallKit itself is not the framework that actually mutes the call. That would be the Call object from Twilio, where we can set call.isMuted = true, for example. VoIP calling in iOS apps was possible before CallKit was introduced in 2016; CallKit just makes it possible for your VoIP app to interact with the system Phone UI.

In CallKit we will be mainly working with 2 objects, one object of type CXProvider and another of type CXCallController:

let callKitCallController: CXCallController = CXCallController()
let configuration: CXProviderConfiguration = CXProviderConfiguration(localizedName: "VoIP Tutorial")
let callKitProvider: CXProvider = CXProvider(configuration: configuration)
callKitProvider.setDelegate(self, queue: nil)

Passing nil to the queue parameter will default to receiving delegate callbacks on the main queue. The CXProvider is the class used to let iOS know about external events that arrive at our app, for example incoming calls, or a call ending from the remote side. The CXCallController object is used for letting iOS know about local user actions from within the app, such as starting a call, answering a call, or ending a call by pressing a custom button in our app.

We will need to conform to the CXProviderDelegate as well:
func providerDidReset(_ provider: CXProvider) { }

func provider(_ provider: CXProvider, perform action: CXAnswerCallAction) {
    
}

func provider(_ provider: CXProvider, perform action: CXEndCallAction) { 

}

There are more methods as well, such as those for CXStartCallAction or CXSetHeldCallAction. You can check out the documentation for CXProviderDelegate to get a full list: https://developer.apple.com/documentation/callkit/cxproviderdelegate

Putting it all together
Let's revisit the func pushRegistry(_ registry: PKPushRegistry, didReceiveIncomingPushWith... method from earlier. This method is called by iOS when we receive a push for a new voice call. Here, we need to instruct our app to use Twilio to handle this call:

func pushRegistry(_ registry: PKPushRegistry, didReceiveIncomingPushWith payload: PKPushPayload, for type: PKPushType, completion: @escaping () -> Void) {    
    TwilioVoice.handleNotification(payload.dictionaryPayload, delegate: self, delegateQueue: nil)    
    completion()
}

Let's create a CallClient class to keep track of the call details. We will need to maintain references to several different pieces of data:

import TwilioVoice
import CallKit

class CallClient: NSObject {
    var activeCallInvites: [String: CallInvite] = [:] // CallInvite is a type from Twilio voice
    var activeCall: Call? // from Twilio Voice
    var activeCalls: [String: Call] = [:]
    var callKitProvider: CXProvider!
    var callKitCallController: CXCallController = CXCallController()
    let defaultAudioDevice = DefaultAudioDevice()
}

Next we'll conform to the NotificationDelegate for TwilioVoice:

extension CallClient: NotificationDelegate {
    
    func callInviteReceived(callInvite: CallInvite) {
        guard let from = callInvite.from else { return }
        let callHandle = CXHandle(type: .generic, value: from)
        let callUpdate = CXCallUpdate()
        callUpdate.remoteHandle = callHandle
        callUpdate.hasVideo = false
        activeCallInvites[callInvite.uuid.uuidString] = callInvite
        // will pass incoming call to CallKit
        callKitProvider.reportNewIncomingCall(with: callInvite.uuid, update: callUpdate) { error in
            //...
        }
    }

    func cancelledCallInviteReceived(cancelledCallInvite: CancelledCallInvite, error: Error) { }
}

Then we will implement the CXProviderDelegate method to answer the call:

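func provider(_ provider: CXProvider, perform action: CXAnswerCallAction) {
    defaultAudioDevice.block()

    // Tell CallKit whether answering actually succeeded.
    if answerCall(uuid: action.callUUID) {
        action.fulfill()
    } else {
        action.fail()
    }
}

// Returns false when there is no pending invite for this UUID.
func answerCall(uuid: UUID) -> Bool {
    guard let callInvite = activeCallInvites[uuid.uuidString] else { return false }

    let acceptOptions = AcceptOptions(callInvite: callInvite) { builder in
        builder.uuid = callInvite.uuid
    }

    let call = callInvite.accept(options: acceptOptions, delegate: self)
    activeCall = call
    // The call UUID was set from the invite in the accept options above,
    // so use it as the key instead of force-unwrapping call.uuid.
    activeCalls[callInvite.uuid.uuidString] = call

    activeCallInvites.removeValue(forKey: uuid.uuidString)
    return true
}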

So we simply receive the call invite, store it in our dictionary of current invites, and then accept the call for a given UUID. The call to "callKitProvider.reportNewIncomingCall.." is what actually notifies iOS about the incoming call and then renders the native answer/decline call UI. When we are finished, we remove the invite from our call invites and assign the accepted call to activeCall.

The method func provider(_ provider: CXProvider, perform action: CXAnswerCallAction) is called when the user selects the green phone button to answer the call.

The activeCall is a Call object which contains information such as the "from" and "to" numbers, the call SID, or whether the call is currently muted or on hold.

We need to call .fulfill() on our action to let CallKit know it was successful. In the event of a failure, you should call action.fail() instead.
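
The invite and call bookkeeping described above can also be pulled into a small helper that is easy to test on its own. The sketch below is a simplified model, not Twilio's API: PendingInvite and ConnectedCall are hypothetical stand-ins for CallInvite and Call so that it runs without the SDK, and matching a cancelled invite by its call SID is an assumption of this sketch:

```swift
import Foundation

// Hypothetical stand-ins for Twilio's CallInvite and Call types.
struct PendingInvite { let uuid: UUID; let callSid: String }
struct ConnectedCall { let uuid: UUID; let callSid: String }

final class CallRegistry {
    private(set) var invites: [UUID: PendingInvite] = [:]
    private(set) var calls: [UUID: ConnectedCall] = [:]

    /// Remember an invite until the user answers or the caller gives up.
    func add(_ invite: PendingInvite) {
        invites[invite.uuid] = invite
    }

    /// Moves an invite into the active calls; returns nil for an unknown UUID.
    func accept(uuid: UUID) -> ConnectedCall? {
        guard let invite = invites.removeValue(forKey: uuid) else { return nil }
        let call = ConnectedCall(uuid: invite.uuid, callSid: invite.callSid)
        calls[uuid] = call
        return call
    }

    /// Drops an invite that was cancelled before it was answered and returns
    /// its UUID, so the caller can report the ended call to CallKit.
    func cancelInvite(callSid: String) -> UUID? {
        guard let invite = invites.values.first(where: { $0.callSid == callSid }) else {
            return nil
        }
        invites.removeValue(forKey: invite.uuid)
        return invite.uuid
    }

    /// Removes a finished call; returns nil if it was not active.
    func end(uuid: UUID) -> ConnectedCall? {
        calls.removeValue(forKey: uuid)
    }
}
```

Keying by UUID rather than by its string form avoids repeated uuidString conversions, and a nil result from accept(uuid:) is the natural point to call action.fail() instead of action.fulfill().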

Hanging up
To hang up from a custom button within our app, we will use our CXCallController object:

func hangupButtonPressed(sender: UIButton) {
    guard let uuid = activeCall?.uuid else { return }
    let endCallAction = CXEndCallAction(call: uuid)
    let transaction = CXTransaction(action: endCallAction)
    callKitCallController.request(transaction) { error in
       //...
    }
}


This will then trigger the CXProviderDelegate's CXEndCallAction, where we can tell Twilio to disconnect:

func provider(_ provider: CXProvider, perform action: CXEndCallAction) {
    if let call = activeCalls[action.callUUID.uuidString] {
        call.disconnect()
    }
    action.fulfill()
}

If we want to hang up from the system Phone UI by pressing the red disconnect button, it is important to note that this alone does not actually hang up the call. All it does is invoke the CXEndCallAction as above. It is in that method that we must tell Twilio to disconnect the call, as well as call action.fulfill(). Then the call will be completely terminated.

Managing Audio
There are a few things you'll need for managing audio in the app. First, you need to request microphone access using:

AVAudioSession.sharedInstance().requestRecordPermission { permissionGranted in
   // permissionGranted is bool
}

You can retrieve the current permission status with AVAudioSession.sharedInstance().recordPermission. We also need to assign the default audio device before we can receive any calls:

TwilioVoice.audioDevice = defaultAudioDevice

And as above, when the CXProviderDelegate receives the CXAnswerCallAction, you need to set up the audio session:

defaultAudioDevice.block()

Troubleshooting
At this point we have enough code to answer VoIP calls using Twilio in our iOS app. If for some reason the calls are still not going through, here are a few ideas for troubleshooting:
- If the call is arriving on the system UI, but you can't hear any voice, check to make sure you have configured the audio session correctly, and that you have requested and been granted microphone permission
- If the call is not going through at all, ensure that the access token is successfully registered. Also be sure to check the certificate. In the Twilio console where you upload your certificate, there is a box you can check/uncheck for "sandbox testing only." If you are testing in iOS's debug configuration, this will need to be checked.
- Check the server setup and ensure the PUSH_CREDENTIAL_SID is being set.

Resources
- WWDC 2016 Video, Enhancing VoIP Apps with CallKit:
https://developer.apple.com/videos/play/wwdc2016/230/ (highly recommended)
- Twilio Voice QuickStart Guide:
https://github.com/twilio/voice-quickstart-ios
- CallKit Documentation: https://developer.apple.com/documentation/callkit?language=objc

Joey Bodnar

Comments

eric
November 22nd, 2021
Could you serve the source code of the example?
