by Yue Song, 3GPP CT4 Chair
First published June 2024, in Highlights Issue 08
In March 2024, CT#103 approved new technical specifications TS 29.175 and TS 29.330, which marked the completion of the Release 18 work on NG-RTC (Next Generation Real Time Communication).
NG-RTC is a technology stack for real-time communication services built on the architecture of IMS (IP Multimedia Subsystem). By bringing in technologies such as the IMS Data Channel, AI media processing and the Service Based Interface, NG-RTC provides users with a richer, interactive and ultra-high-definition intelligent calling experience.
Meanwhile, the introduction of a more flexible and open architecture facilitates the entry of third-party calling applications and builds a new ecosystem for multi-dimensional and interactive real-time communication.
IMS Data Channel
The IMS Data Channel mechanism was defined in Release 16; its key purpose is to introduce a content-agnostic data transmission tunnel into IMS. The IMS Data Channel has several advantages over the data channels provided directly by EPC/5GC (an illustrative negotiation sketch follows the list):
- It is created along with an IMS session, and therefore the content transmitted within the IMS Data Channel is naturally linked with a voice/video call, which makes it much easier to design enhanced voice/video features;
- The IMS Data Channel is by nature correlated with an authenticated user identity, so applications built on the IMS Data Channel can easily obtain the user identity whenever it is needed;
- Similar to IMS audio/video media, dedicated bearers/QoS flows can be established for the IMS Data Channel, by which the QoS can be guaranteed.
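As a rough sketch of that negotiation, and assuming the WebRTC-style data channel framing that the IMS Data Channel builds on, the fragment below shows how a data channel media line can be carried in the same SDP offer as the voice media, which is what ties the channel to the call. The ports, stream id, application label and sub-protocol are example values, not values taken from the 3GPP specifications.

```python
# Illustrative only: a simplified SDP fragment showing an IMS Data Channel
# (SCTP over DTLS, WebRTC data channel style) negotiated alongside the voice
# media of the same IMS session.  All values are examples, not normative.

def build_offer_fragment(audio_port: int, dc_port: int, dc_stream_id: int) -> str:
    lines = [
        # Ordinary voice media line, as in any IMS call.
        f"m=audio {audio_port} RTP/AVP 96",
        "a=rtpmap:96 AMR-WB/16000",
        # Data channel media line: SCTP over DTLS with WebRTC data channel framing.
        f"m=application {dc_port} UDP/DTLS/SCTP webrtc-datachannel",
        "a=sctp-port:5000",
        # dcmap associates an SCTP stream with a data channel and a sub-protocol
        # (the label and sub-protocol here are hypothetical).
        f'a=dcmap:{dc_stream_id} label="example-app";subprotocol="http"',
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_offer_fragment(audio_port=49170, dc_port=10001, dc_stream_id=100))
```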
Enhancement to IMS Architecture
To facilitate IMS Data Channel based applications, 3GPP has enhanced the IMS architecture by standardizing several new network functions:
Media Function (MF): It acts as the media plane function for the IMS Data Channel, terminating the IMS Data Channel from the UE and establishing further IMS Data Channels towards the DC Application Server or another Media Function. Besides proxying the IMS Data Channel payload, the MF can also process the content of the media flow, e.g. video rendering, speech-to-text and image recognition.
DC Signaling Function (DCSF): It acts as the control plane function for the IMS Data Channel. The DCSF subscribes to IMS session events from the IMS AS and, based on those events, instructs the MF to perform the media plane handling, i.e. IMS DC forwarding and media processing (a sketch of this interaction follows the list). The DCSF also exposes IMS Data Channel and media processing APIs towards the DC Application Servers; by consuming these APIs, the application servers can implement their service logic.
DC Application Server: This does not refer to a specific network function but to a class of application servers that consume the capabilities exposed by the network and provide the service logic to subscribers or enterprise users.
DC Application Repository (DC AR): To execute a given piece of service logic, an application (taking the form of an applet) needs to run on the UE. These applications need not be pre-installed on the UE; they can be downloaded and launched at runtime, and the DC AR is the repository from which the UE obtains them. The UE does not download applications directly from the DC AR, but via the DCSF and the MF.
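As a minimal sketch of the DCSF-to-MF interaction mentioned above, the snippet below shows a DCSF-style handler that reacts to an IMS session event by asking the MF to set up data channel media resources. The URLs, resource paths, JSON fields and event names are invented for illustration only; the normative service definitions are in the Release 18 stage 3 specifications.

```python
"""Illustrative DCSF control-plane behaviour: on an IMS session event
notification from the IMS AS, instruct the MF to allocate data channel
media resources.  All endpoints and field names below are hypothetical."""

import requests

MF_BASE_URL = "https://mf.example.operator.com"  # hypothetical MF endpoint


def on_session_event(notification: dict) -> None:
    """Handle a session event notification pushed by the IMS AS (hypothetical shape)."""
    if notification.get("eventType") != "SESSION_ESTABLISHED":
        return  # this sketch only reacts to session establishment

    session_id = notification["sessionId"]
    calling_user = notification["callingIdentity"]

    # Ask the MF to terminate the UE's bootstrap data channel for this session.
    # The path and payload are purely illustrative.
    resp = requests.post(
        f"{MF_BASE_URL}/media-resources",
        json={
            "sessionId": session_id,
            "user": calling_user,
            "action": "CREATE_BOOTSTRAP_DATA_CHANNEL",
        },
        timeout=5,
    )
    resp.raise_for_status()
    print(f"MF allocated data channel resources for session {session_id}")
```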
Figure from TS 23.228 "IP Multimedia Subsystem (IMS); Stage 2"
Service Based Interface
Another key enhancement to the IMS architecture is the introduction of the Service Based Interface (SBI). IMS has traditionally used the SIP protocol for most of its interfaces, and the strict state machine behind SIP does not fit well with the many new services emerging today. A new SBI (Nimsas) has been defined for the IMS AS; by consuming the APIs over Nimsas, the DCSF can subscribe to IMS session events, and flexible service triggering is achieved through this mechanism. Furthermore, the DCSF exposes APIs towards DC applications via the Ndcsf interface, so application servers can dynamically discover and invoke the APIs provided by the network. The overall intention behind introducing SBI into IMS is to make the real-time communication network more open and flexible.
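The subscribe/notify pattern behind such an SBI can be sketched as follows: the consumer creates a subscription resource with a callback URI and later receives event notifications on that URI. This is not the OpenAPI defined in TS 29.175; the resource path, field names and event names are assumptions used only to show the shape of an SBI subscription.

```python
"""Illustrative sketch of an SBI subscribe/notify interaction between the
DCSF and the IMS AS.  Endpoints, paths and field names are hypothetical."""

import requests

IMS_AS_BASE_URL = "https://ims-as.example.operator.com"          # hypothetical
DCSF_CALLBACK_URI = "https://dcsf.example.operator.com/notify"   # hypothetical


def subscribe_to_session_events() -> str:
    """Create an event subscription at the IMS AS and return its resource URI."""
    resp = requests.post(
        f"{IMS_AS_BASE_URL}/subscriptions",
        json={
            "notificationUri": DCSF_CALLBACK_URI,
            "events": ["SESSION_ESTABLISHED", "SESSION_RELEASED"],
        },
        timeout=5,
    )
    resp.raise_for_status()
    # In the SBI style, the created resource is identified by the Location header.
    return resp.headers["Location"]


if __name__ == "__main__":
    print("Subscription created at:", subscribe_to_session_events())
```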
AI Media Processing
In the legacy network, the processing of media flows is quite limited, i.e. media forwarding, transcoding and announcement playing. In the NG-RTC network, by contrast, the Media Function is designed to host additional media processing functionalities and to carry out far more extensive processing. Here are some examples:
- With NLP functionality, the MF can convert real-time speech into text and transmit the converted text to the UE via the IMS Data Channel. This enables a subtitle service for audio/video calls, which helps people with hearing impairments (a sketch of such a pipeline follows the list);
- With image recognition and 3D rendering functionalities, the MF can generate a digital avatar of the subscriber; the avatar can then be driven by the subscriber's speech and movement, enabling metaverse services;
- With the assistance of an LLM, the MF can even process the subscriber's speech comprehensively. By understanding the user's intention, the MF can then invoke APIs provided by various application servers, effectively acting as an AI assistant.
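As a minimal sketch of the subtitle example, the snippet below shows an MF-side loop that turns speech segments into text and pushes the text to the UE over the data channel. The transcribe hook and the DataChannel class are hypothetical stand-ins; a real MF would integrate an ASR engine and its own data channel stack.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class DataChannel:
    """Minimal stand-in for a data channel towards the UE."""
    stream_id: int

    def send_text(self, text: str) -> None:
        # A real implementation would frame the text as a data channel message.
        print(f"[stream {self.stream_id}] {text}")


def transcribe(audio_segment: bytes) -> str:
    """Hypothetical ASR hook; a real MF would call a speech-to-text engine here."""
    return "<recognised text>"


def subtitle_loop(audio_segments: Iterable[bytes], dc: DataChannel) -> None:
    """Convert each speech segment to text and deliver it as a subtitle."""
    for segment in audio_segments:
        text = transcribe(segment)
        if text:
            dc.send_text(text)


if __name__ == "__main__":
    # Two dummy audio segments stand in for decoded speech from the call.
    subtitle_loop([b"...", b"..."], DataChannel(stream_id=100))
```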
Future Outlook
NG-RTC phase 2 is now under development in WG SA2. In Rel-19, new mechanisms such as the standalone IMS Data Channel and interworking of the IMS Data Channel between PLMNs are to be standardized. The target of NG-RTC is not just to define a few specific services; it is to create a platform on which a variety of enriched real-time communication services can be built and a more creative ecosystem of real-time communication can be established. NG-RTC has opened new opportunities for real-time services.
For more on WG CT4: www.3gpp.org/3gpp-groups