Being private doesn’t mean not being social. To get the full benefit of the all-in-one privacy app MySudo, you need to be able to connect privately and securely with family, friends, colleagues and other people you know who are using the app.
That’s why we’re introducing mobile private contact matching—so you have a very convenient way of having secure and private communications with your most important contacts. But in this post, we want to take you behind the scenes to show you the two ways of approaching mobile private contact matching.
Private communication apps like MySudo allow you to have fully end-to-end encrypted (E2EE) communications. This assures you that the service provider (including us at Anonyome Labs) or any unauthorized users cannot intercept and view your private communications. MySudo even goes a step further than traditional E2EE message providers by using E2EE communications to protect messaging, voice calling, video calling and email.*
We’ve just introduced Invite a friend into MySudo, and now we’re working on another feature that will let you more easily connect with and grow your contacts network for secure and private end-to-end encrypted communications: contact matching.
Contact matching in MySudo will help you connect with people from your mobile address book who are also using MySudo. Through this opt-in feature, MySudo will tell you when someone you know is also using MySudo, so you can easily connect and communicate with them securely and privately via E2EE. Protecting your sensitive contact information is paramount for MySudo, and we’ll get to that.
Here’s a quick overview of mobile private contact matching
Mobile private contact matching can be done in two different ways. The first is a server-side matching approach that performs the matching operations on a cloud server. The second uses a client-side matching approach where the matching is performed on the user’s mobile device. Let’s look at the privacy-protecting processes for both methods.
Cloud service-based contact matching
The algorithm for a server-side implementation is quite straightforward. A cloud service computes the intersection of the set of all MySudo users and the entries listed in a particular user’s address book. This process is similar to the processes that WhatsApp, Signal and Telegram use.
You can see the server-side contact matching algorithm in this diagram:
When a new user (e.g. User 1) installs a communication app on their mobile device, the app reports a unique identifier (usually the user’s mobile phone number) to the contact matching service. Next, once the user grants permission for the app to access their mobile device’s address book, the app uploads the address book entries to the contact matching service. As subsequent users (e.g. User 2) also upload their contact lists, the cloud service runs a matching algorithm that looks for registered users in other users’ address books (e.g. User 1 being in User 2’s address book). If there is a match, then User 2’s app is notified that User 1 also has the app. At this point, User 2 can begin communicating with User 1 (the app usually makes this easy).
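The flow described above can be sketched in a few lines of Python. This is a minimal illustration only, not MySudo’s or any other app’s implementation; the class name `ContactMatchingService`, its methods and the sample phone numbers are all hypothetical, and identifiers are kept in the clear to keep the logic visible.

```python
class ContactMatchingService:
    """Toy in-memory stand-in for a cloud contact matching service (hypothetical)."""

    def __init__(self):
        self.registered = set()    # identifiers of users who have installed the app
        self.address_books = {}    # user identifier -> set of uploaded contacts

    def register(self, user_id):
        # The app reports the user's unique identifier (usually a phone number).
        self.registered.add(user_id)

    def upload_address_book(self, user_id, contacts):
        # The app uploads the address book after the user grants permission.
        self.address_books[user_id] = set(contacts)
        # Matching step: which of this user's contacts are registered users?
        return sorted(self.address_books[user_id] & self.registered)


service = ContactMatchingService()
service.register("+15550001")                      # User 1 installs the app
service.upload_address_book("+15550001", ["+15550777"])
service.register("+15550002")                      # User 2 installs the app
matches = service.upload_address_book("+15550002", ["+15550001", "+15550999"])
# matches now holds User 1's identifier, so User 2's app can be notified
```

Note that in this plain form the service holds every identifier and every address book, which is the privacy problem the measures below try to reduce.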
This method can do even better on privacy, with:
- Hashed identifiers: To protect the privacy of User 1’s identifier and User 2’s address book entries at the contact matching service, hashed values of user identifiers/address book entries are sent rather than the values in the clear (see here and here). In this way a casual observer is not able to see the private user contact information. However, because phone numbers have a known structure and limited entropy, hashes provide only limited privacy protection (for more, see here and here).
- Temporary storage: As well as sending hashed values of the address book entries, the contact matching service deletes the address book entries as soon as it completes the matching process. This removes the problem of having a server continuously storing both user identifiers and all user address book entries.
- Incremental contact matching: Going further, rather than sending the full address book on each occasion, only changes to the address book are sent to the contact matching service. This idea is based on the notion that the database of registered users changes only gradually over time. Similarly, the address book contacts of a user also change only slowly. Given that clients are able to store the last state for each of their contacts, they only need to query the server for changes since the last synchronization.
- Secure processing hardware: The contact matching service is run within a secure enclave, such as Intel SGX. This isolates it from other servers in the service provider’s network, which may make it more difficult for a bad actor to get the data. It doesn’t, however, stop a privileged insider from accessing the data.
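Two of the points above lend themselves to a short illustration: why hashing phone numbers gives only limited protection, and what an incremental upload might contain. The sketch below is an assumption-laden toy, not a description of any real service. The function names, the choice of SHA-256 and the sample numbers are hypothetical, and the brute-force search is limited to four unknown digits so it runs instantly.

```python
import hashlib


def hash_identifier(phone: str) -> str:
    # Hash an identifier before sending it (SHA-256 is an assumption here).
    return hashlib.sha256(phone.encode()).hexdigest()


def brute_force(target_hash: str, prefix: str, digits: int):
    # Phone numbers have a known structure, so an attacker can simply
    # enumerate every candidate with the given prefix and compare hashes.
    for n in range(10 ** digits):
        candidate = f"{prefix}{n:0{digits}d}"
        if hash_identifier(candidate) == target_hash:
            return candidate
    return None


def address_book_delta(previous: set, current: set) -> dict:
    # Incremental matching: only the changes since the last sync are sent.
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


leaked_hash = hash_identifier("+15550123")
recovered = brute_force(leaked_hash, "+1555", 4)   # at most 10,000 guesses

delta = address_book_delta({"+15550001", "+15550002"},
                           {"+15550002", "+15550003"})
```

A full national numbering plan is larger than this four-digit example, but it is still small enough to enumerate, which is why hashing alone is described above as limited protection.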
Like we said, server-side contact matching is the most common approach you’ll find in communication apps. This is because it’s simple to implement, reduces the amount of data transferred from mobile devices to the service, and has low processing requirements for mobile devices. Now let’s look at the other method.
Mobile client-based contact matching
This method performs the contact matching on the user’s mobile device. It addresses two privacy concerns: the client should not be able to learn the identifier(s) of every user using the app, and the service provider should not learn the address book entries of the user.
To stop the disclosure of contact information, the client-side implementation uses algorithms known as private set intersection (PSI) (see here and here). With the PSI algorithms, it is still possible to compute the intersection of all users using an app, and the entries listed in a particular user’s address book, without exposing the non-intersecting entries to the service provider, or exposing the full list of users using the app. This dramatically increases the privacy protection for the user’s identifiers and sensitive contact address data.
The PSI algorithms work like this:
- The service provider and the user both have a set of contacts that they want to intersect.
- The service provider creates a symmetric key with which to encrypt their database of users using the app.
- The service provider inserts the encrypted database into a probabilistic database structure for efficient membership testing, such as a Bloom Filter or Cuckoo Filter. This database structure is then sent to the user’s mobile app.
- The user’s app and the service provider then take part in an Oblivious Pseudo-Random Function (OPRF) protocol. With this protocol, the user’s app is able to encrypt its address book entries under the service provider’s symmetric key, without learning the key and without the service provider learning the address book entries.
- The user’s app is able to locally match all entries in their address book that have already been registered with the service provider.
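The steps above can be sketched as a toy Python program. Everything here is an assumption made for illustration: the blinded-exponentiation OPRF, the modulus `P`, the tiny `BloomFilter` class and the names `Server` and `client_match` are hypothetical and are not taken from MySudo or any specific PSI library. The group used is not a vetted choice and the code is not secure as written; real deployments would use a standardized prime-order elliptic-curve group and a reviewed implementation.

```python
import hashlib
import math
import secrets

P = 2**255 - 19  # a well-known prime, used here only as a toy modulus


def hash_to_group(value: str) -> int:
    digest = hashlib.sha256(value.encode()).digest()
    return 1 + int.from_bytes(digest, "big") % (P - 1)   # element in [1, P-1]


def finalize(value: str, element: int) -> bytes:
    return hashlib.sha256(value.encode() + element.to_bytes(32, "big")).digest()


class BloomFilter:
    def __init__(self, size_bits: int = 8192, num_hashes: int = 4):
        self.size_bits, self.num_hashes = size_bits, num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: bytes):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class Server:
    def __init__(self, registered_users):
        self._key = secrets.randbelow(P - 2) + 1   # secret key k, never sent
        self.filter = BloomFilter()                # this is what the app downloads
        for user in registered_users:
            element = pow(hash_to_group(user), self._key, P)
            self.filter.add(finalize(user, element))

    def evaluate(self, blinded: int) -> int:
        # The server only ever sees a blinded value, not the contact itself.
        return pow(blinded, self._key, P)


def client_match(address_book, bloom, evaluate):
    matches = []
    for entry in address_book:
        while True:                                # pick an invertible blind r
            r = secrets.randbelow(P - 2) + 1
            if math.gcd(r, P - 1) == 1:
                break
        blinded = pow(hash_to_group(entry), r, P)          # H(x)^r
        evaluated = evaluate(blinded)                      # H(x)^(r*k)
        unblinded = pow(evaluated, pow(r, -1, P - 1), P)   # H(x)^k
        if finalize(entry, unblinded) in bloom:            # local membership test
            matches.append(entry)
    return matches


server = Server(["+15550001", "+15550002"])
found = client_match(["+15550002", "+15550999"], server.filter, server.evaluate)
```

In this sketch the client learns only which of its own entries are registered, and the server sees only blinded values. Because a Bloom filter is probabilistic, a small false-positive rate is possible, which is one of the trade-offs of using such a structure.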
With PSI algorithms, all matching entries are identified, and all other contact information stays confidential. At this stage we don’t know of any commercial communication apps using PSI algorithms. To encourage commercial adoption, over the last few years industry efforts have focused on reducing the communication and computational overheads of the PSI algorithms.
So which algorithm is more private?
Here we compare the two approaches for contact matching:
| Algorithm property | Cloud service-based contact matching | Mobile device-based contact matching |
| --- | --- | --- |
| Complexity of implementation | Low | High |
| Data transfer to/from mobile | Low | High |
| Processing requirements at mobile | Low | High |
| Processing requirements at central server | High | Low |
| Contact privacy preservation | High | Very high |
On the left is a breakdown of different aspects of the cloud service-based contact matching. The assumption is that these services use hashed data, remove address book entries from the server after matching has occurred, and use incremental contact matching. You can see the three main advantages of the system:
- It’s not complex to implement the algorithm.
- The amount of data transferred to and from the mobile device is low.
- Mobile device processing requirements are also low.
The processing requirements at the server are significant, which is acceptable because there is plenty of processing power in the cloud. With the mitigations we’ve noted, the privacy is considered to be at a high level.
The column on the right shows the breakdown for the mobile device-based contact matching using PSI algorithms. In this comparison, the first three requirements are disadvantages:
- The complexity of implementation of the algorithm is high.
- Data transferred to/from the mobile is high.
- Mobile processing requirements are also high.
The processing requirements at the server are low. Importantly, the privacy preservation of user identifier and user contact information using the client-side implementation is very high.
You’ll see in this diagram the increasing level of privacy preservation offered by some popular communication apps:
On the left side of the privacy preservation spectrum is a cloud service-based contact matching method, with a medium level of privacy protection. You’ll find this in WhatsApp and Telegram.
In the middle, again based on cloud service-based contact matching, is Signal, which provides some remediations to lift the overall privacy preservation of the solution to a high level. On the right side is the mobile private contact matching solution based on PSI. There are no known apps that implement this approach, but if there were, their level of privacy preservation would be very high.
In a future article we’ll show you how MySudo will approach mobile private contact matching.
*MySudo also allows for communication with non-MySudo users and organizations. Communications of this type are unencrypted and rely on the user provisioning a MySudo phone number or email address. Although the communications are unencrypted, this method still provides a level of privacy protection for the user by ensuring the user does not need to disclose their personal mobile number or personal email address when dealing with unknown users and organizations.