Nextcloud Talk – Voice and Video Calling – First Impressions

Once upon a time, Skype was THE voice and video calling platform for me. It was independent, decentralized and offered end-to-end encryption. But that was a long time ago. Today it’s centralized, more closed source than ever, and encryption seems to be rather optional. But on PCs there was little else that was usable and universal, perhaps until now. A few days ago I started to test Nextcloud Talk which, despite its name, is a full-blown voice and video conferencing and calling solution.

The Basics

I actually installed Nextcloud 13 around six weeks ago, but since I was very skeptical that it would work well, I didn’t give it a try until now. Too often I had been disappointed by other products. But this time things were different.

When Nextcloud is already installed, ‘Talk’ can be installed with the click of a button and works out of the box without any additional configuration. Calling someone is as easy as selecting another user of my Nextcloud installation at home from a drop-down list and pressing the ‘Join Call’ button in the web browser. The called user gets a notification message on the screen if Nextcloud is open in a browser tab or if they’ve installed the Nextcloud Talk Android or iOS app. Unfortunately, the user is not alerted in the good old telephony fashion, so the invitation is easily overlooked.

External Users, Voice, Video and Group Calling

It’s also possible to invite somebody to a group voice and video call by generating a link in the ‘Talk’ app in Nextcloud in the browser and then sending the link by any means, e.g. by email or messenger. When the recipient clicks on the link, a new web page opens that leads to my Nextcloud instance, from which the WebRTC-based client is started in the recipient’s web browser. It’s all very seamless: no software needs to be installed, the recipient just has to confirm that the web browser is allowed to use the microphone and camera.

While not a telephone replacement due to the missing alerting, which should not be too hard to implement on Android and perhaps also on iOS, it is still a great tool to make end-to-end encrypted voice and video calls. Also, all metadata is exchanged via my Nextcloud instance, so this information is not stored anywhere else!

Over the past few days I’ve made several calls, many with a duration of more than one hour. Voice and video quality was excellent, and video calls use around 2 Mbit/s of bandwidth in each direction. Audio-only calls just use a few kilobytes a second.
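To put the 2 Mbit/s figure into perspective, here is a rough back-of-the-envelope calculation of how much data a one-hour video call moves in each direction. It is only a sketch: it assumes the bitrate stays constant at the value I observed, which a real WebRTC call does not guarantee, and the function name is just something I made up for illustration.

```python
def call_volume_megabytes(bitrate_mbit_s: float, duration_s: float) -> float:
    """Data sent in one direction, in decimal megabytes (8 Mbit = 1 MB).

    Assumes a constant bitrate for the whole call, which is a simplification.
    """
    return bitrate_mbit_s * duration_s / 8


# One hour of video at the roughly 2 Mbit/s observed per direction.
one_hour_video_mb = call_volume_megabytes(2.0, 3600)
print(f"One-hour video call: about {one_hour_video_mb:.0f} MB per direction")
```

That is close to a gigabyte per direction for an hour of video, which is worth keeping in mind on a metered mobile connection.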

Direct Media Streaming

Voice and video calls are not limited to two people; it’s also possible to establish audio and video conferences with several people. There is no central element. Media is streamed peer to peer, as I noticed when I took a closer look with Wireshark, even though both ends of the connection were behind an IPv4 NAT gateway or behind IPv6 (!!!) firewalls that only allow outgoing connection establishments. I haven’t yet checked what kind of STUN/TURN etc. servers are used to figure out how to connect; that’s still on the to-do list.

Not having a central distribution point also means that in a conference call, each client seems to send its voice and video channel separately to each party. In other words, you’d better have a fat uplink for larger conferences to keep up video quality.
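How fat that uplink needs to be can be estimated with a little arithmetic. The sketch below assumes a full mesh in which every participant sends one separate stream to every other participant, and that each stream uses the roughly 2 Mbit/s I saw in one-to-one video calls. Both are assumptions on my part: I have not verified how Talk adapts the bitrate in larger conferences, and the function names are purely illustrative.

```python
def mesh_uplink_mbit(participants: int, per_stream_mbit: float = 2.0) -> float:
    """Uplink needed by ONE participant in a full-mesh call.

    Each participant sends its stream separately to every other participant,
    so the uplink grows with (participants - 1).
    """
    if participants < 2:
        raise ValueError("a call needs at least two participants")
    return (participants - 1) * per_stream_mbit


def mesh_connection_count(participants: int) -> int:
    """Number of peer-to-peer connections in a full mesh: n * (n - 1) / 2."""
    return participants * (participants - 1) // 2


for n in (2, 3, 5, 8):
    print(
        f"{n} participants: {mesh_uplink_mbit(n):.0f} Mbit/s uplink each, "
        f"{mesh_connection_count(n)} connections in total"
    )
```

Under these assumptions, a five-party video conference already asks for around 8 Mbit/s of uplink per participant, which is more than many home connections offer.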

CPU Intensive

Firefox’s WebRTC implementation is quite resource hungry. Even audio-only calls drive my CPU utilization up to 30%, and video calls require even more. Despite this, it is possible to make audio calls to a browser on a mobile device that doesn’t have the Talk app installed. Video calling works as well in theory, but in practice the processor of a reasonably recent Android phone was too slow to handle the video well. I didn’t try video calling with the Android app, another thing for the to-do list.

Where to go from here?

In summary, I was very impressed at how well the solution already works! One thing I would really like to see in the next feature release is real alerting on mobile devices rather than just an info message and a short tone or vibration that is easily missed. With that in place, I could very well imagine replacing a lot of ‘ordinary’ phone calls between family members with Talk. In addition to the superior voice quality, especially when roaming without HD voice being available, it’s the end-to-end encryption and the metadata being created only on my Nextcloud server at home that would be the killer features for me!