Multiparty Multimedia Session                                  T. Lohmar
Control Working Group                                      Ericsson GmbH
Internet-Draft                                                 J. Gordon
Intended status: Experimental                         RealNetworks, Inc.
Expires: January 30, 2010                                   T. Einarsson
                                                             Ericsson AB
                                                           July 29, 2009

                  Fast Content Switching with RTSP 2.0
                    draft-lohmar-mmusic-rtsp-fcs-00

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 30, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Lohmar, et al.          Expires January 30, 2010                [Page 1]

Internet-Draft                  RTSP-FCS                       July 2009

Abstract

   RTSP defines the setup and control for on demand and live streaming
   media sessions, which are delivered via an external media transport
   protocol such as RTP/UDP. RTSP does not define a mechanism to change
   the content during an on-going streaming session. Such a mechanism
   improves the streaming experience when a user browses through
   multiple offerings on a single streaming site.

   This document describes several methods to improve content switching.
   The basic principle is to re-use already established transport
   sessions (e.g. RTP/UDP sessions) and negotiate new content to be
   delivered on the existing sessions. If additional transport sessions
   are necessary, those sessions are established separately. This
   principle of re-using the RTSP control and transport sessions
   decreases the content switch delay to a large extent and improves the
   end-user experience.

   The present document defines a mechanism for switching to new
   content, both when the client already has the content description
   available and when it does not.

   This document additionally considers switching of a single media
   stream in a session, when several alternative media components are
   available. For instance, the content may provide several alternate
   audio tracks in different languages to be played with a single video
   stream.

   The principle of Fast Content Switching and Start-up is also defined
   in 3GPP TS 26.234 [3GPP.26.234] for RTSP 1.0 [RFC2326].

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Table of Contents

   1.  Introduction
     1.1.  High level procedure description
     1.2.  New features for RTSP 2.0
   2.  RTSP 2.0 Protocol Extension
     2.1.  "Switch-Stream" RTSP Header Field
     2.2.  Semantics of RTSP PLAY method
     2.3.  RTP Transport Sessions
   3.  Switching to new content with available description
   4.  Switching to new content without content description
   5.  Switching Media within an RTSP session
   6.  Adding Media Components to an ongoing session
   7.  Removing Media Components from an ongoing session
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   RTSP defines the setup and control for on demand and live streaming
   media sessions. The media data is delivered via an external media
   transport protocol, typically RTP/UDP. RTSP does not define a
   mechanism to change the content during an on-going streaming session.
   When changing from one RTSP resource to another offered by the same
   server, the streaming session must be torn down and re-established.
   This procedure can take an excessive amount of time and resources.

   The present document defines a mechanism to change the streaming
   content during an on-going RTSP session. The existing transport
   sessions are re-used. Additional transport sessions may be
   established when the new content contains more media components than
   the old content, and unnecessary transport sessions may be released
   when they are no longer needed. Such a mechanism improves the
   streaming experience when a user browses through various content
   offered by the same streaming server.

   The RTSP protocol extensions defined in the present document are
   applicable for both live and on-demand streaming sessions.

   A number of general RTSP extensions are defined to enable fast
   content switching during an on-going streaming session. These
   extensions are to be implemented by RTSP clients and servers wishing
   to support any of these features. Feature tags are used to exchange
   supported capabilities.

1.1.  High level procedure description

   A streaming client needs to have an RTSP URI to start the streaming
   session. The required codecs and transmission formats are described
   via SDP or another supported media description format. The client may
   need to retrieve the content description from the RTSP server prior
   to session setup.

   In order to establish a streaming session, the client first needs to
   establish transport sessions for each media component it will be
   streaming. Each media component is uniquely identified by a media
   control URI. Transport parameters such as UDP source and destination
   ports are exchanged during the set-up of the transport sessions. The
   control URIs are described in the session description.

   When all needed transport sessions are established, the actual
   content delivery and reception process can begin (RTSP PLAY method).
   The client receives the needed synchronization information with the
   PLAY response from the server. This includes timestamp
   synchronization, sequence numbers and synchronization source
   identifiers (SSRCs) per media component (identified by the
   corresponding media control URIs).

   When the user changes to new content on the same server, only one
   PLAY transaction is needed when the number of media components is the
   same (e.g. the old session contains audio and video media components
   and the new content does as well). New synchronization information is
   provided with the RTP-Info in the PLAY response. The replacement of
   the media control URIs for the transport sessions is also settled
   during the PLAY transaction.

   In case the number of needed media components is increased (e.g.
   switching from audio and video content to audio, video and
   timed-text), then the client establishes additional transport
   sessions. Synchronization with the earlier established components is
   realized through the RTSP PLAY transaction. In case the new content
   needs fewer media components than the old content, the unnecessary
   components are released.

   The fast content switching procedure does not require a client to
   have all SDP information before switching to new content. It may
   request the SDP information for the new content with the RTSP PLAY
   request. Therefore, the procedures for content switching with and
   without an available session description are slightly different and
   are handled separately below.

   Content Switching with content description:
      If the client has already received the content description before
      the fast content switch request, it can process it prior to the
      switch request and determine whether or not additional media
      components are needed. In case new media components are needed,
      the client may establish the new transport sessions either before
      or pipelined with the PLAY request for the new content.

   Content Switching without content description:
      Often the client does not have the content description but only
      an aggregate RTSP content description URI. Since the streaming
      sessions will frequently consist of the same number of media
      components (e.g. one audio and one video), it is very likely that
      the client can re-use established transport sessions. In case
      fewer media sessions are required, the client can release its
      established resources when it receives the response. If
      additional media sessions are needed, the client may establish
      these sessions after the media description response is received.

   Switching an individual component in a streaming session:
      This document also defines a mechanism to change parts of the
      content during an ongoing streaming session. For instance, the
      user may choose to change the audio stream from the original
      language track to a translation in another language.
      Alternatively, the user may wish to change only the video
      component - such as a different camera view during a racing
      event - while keeping the same audio stream.

   In most cases, a content switch can be performed with a single RTSP
   transaction. However, in order to preserve interoperability with
   RTSP-aware intermediate devices such as application layer gateways,
   RTSP clients should ensure that SETUP requests and responses are sent
   for each transport session to be used. This allows the intermediate
   device to establish the necessary configuration, for instance routing
   of UDP ports for RTP/UDP sessions. Once the transport session has
   been negotiated with a SETUP request and response, it may be reused
   for subsequent content upon a switch.

1.2.  New features for RTSP 2.0

   The following new RTSP feature tags are defined:

   o  "rtsp-switch" feature-tag, defined in Section 3.

   o  "rtsp-switch-req-sdp" feature-tag, defined in Section 4.

   o  "rtsp-switch-stream" feature-tag, defined in Section 5.

   In addition the following new RTSP header fields are defined:

   o  "Switch-Stream" header, defined in Section 2.1.

   o  "SDP-Requested" header, defined in Section 4.

2.  RTSP 2.0 Protocol Extension

   In order to change the content of an on-going RTSP session, a number
   of streaming parameters are changed for the new content. The new
   parameters for the transport session can be provided with the
   RTP-Info field in the PLAY response. A new protocol header to replace
   the RTSP control and media URIs is described in the next section,
   followed by the expected semantics of the RTSP PLAY method.

2.1.  "Switch-Stream" RTSP Header Field

   The "Switch-Stream" header field may be used in an RTSP PLAY request
   or response message. It is used to describe the replacement of media
   streams after a content switch. The "Switch-Stream" header field may
   be used with aggregated control and with media control URIs.

   The "Switch-Stream" header syntax in ABNF [RFC5234] is as follows:

   Switch-Stream = "Switch-Stream" COLON switch-spec
                   *(COMMA switch-spec) CRLF
   switch-spec   = old-stream ";" new-stream
   old-stream    = "old" "=" ( (DQ rtsp-url DQ) / (DQ DQ) )
   new-stream    = "new" "=" ( (DQ rtsp-url DQ) / (DQ DQ) )
   rtsp-url      = <RTSP URL as defined in RFC 2326 [RFC2326]>
   DQ            = %x22 ; US-ASCII double-quote mark (34)
   LWS           = [CRLF] 1*( SP / HT )
   SWS           = [LWS] ; sep whitespace
   COMMA         = *( SP / HT ) "," SWS ; comma
   COLON         = *( SP / HT ) ":" SWS ; colon

   In this syntax, a switch-spec always carries both the "old" and the
   "new" parameter; a stream URI that is not indicated is written as an
   empty quoted string.

   If both old media stream and new media stream URIs are indicated in
   the "Switch-Stream" header field of a PLAY request from an RTSP
   client to an RTSP server, then the server MUST interpret this as a
   request to replace the specified media stream with the new media
   stream, hence reusing the existing transport parameters.

   If the "Switch-Stream" header field is included in a PLAY response
   from an RTSP server to an RTSP client, then this header informs the
   client about the media streams that are currently being streamed to
   the client. The old media stream MAY be omitted in this case.

   If only the new media stream URI is indicated in the "Switch-Stream"
   header field of a PLAY request from an RTSP client to an RTSP server,
   then the RTSP server MUST interpret this as a request to switch to
   the new media stream. The server decides the mapping. The RTSP server
   MUST indicate the SSRC of the new media stream in the RTP-Info of the
   reply, in order to enable the client to locate the new stream.

   If only the old stream URI is indicated in the "Switch-Stream" header
   field of a PLAY request from an RTSP client to an RTSP server, then
   the server MUST interpret this as a request for complete removal of
   the specified media stream. The client and the server release the
   resources for this stream without explicit TEARDOWN signalling, as
   such signalling may lead intermediate application level gateways or
   RTSP proxies to also release the RTSP session and the other transport
   resources.

   The usage of the "Switch-Stream" header is defined in Section 3,
   Section 4, Section 5 and Section 7.

2.2.  Semantics of RTSP PLAY method

   A PLAY request sent while a stream is still playing is a legal
   operation in RTSP 2.0 with different semantics than those defined in
   RTSP 1.0. The later PLAY request replaces the first PLAY request. The
   new PLAY request MAY change RTSP session attributes such as the
   session identifier.

2.3.  RTP Transport Sessions

   The most common media transport for RTSP is RTP/UDP. The following
   descriptions and definitions are applicable when using RTP/UDP as
   media transport. Other media transport protocols may require similar
   considerations, which are not defined in the present document.

   After switching content as defined by this document, the SSRC
   identifier used on a specific RTP session may change. In the event
   that the SSRC changes, it MUST be included in the RTP-Info header.

   The RTSP server MUST change the SSRC value for a specific RTP session
   after a fast content switching operation is performed if any of the
   following apply:

   o  the payload types of the old and new media streams are the same
      but the payload configuration differs;

   o  the clockrate of the new media stream is different from the old;
      or

   o  the mapping of the new media stream is otherwise unknown to the
      RTSP client.
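
   The three conditions in the list above can be read as a simple
   predicate. The following non-normative Python sketch shows one way a
   server might evaluate them. The StreamParams type, its field names
   and the representation of the payload configuration as a string are
   assumptions made for illustration only; this document does not define
   them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamParams:
    """Hypothetical summary of one media stream on an RTP session."""
    payload_type: int
    payload_config: str  # assumed: e.g. the format parameters from the SDP
    clock_rate: int
    mapping_known_to_client: bool = True


def ssrc_must_change(old: StreamParams, new: StreamParams) -> bool:
    """Return True if any of the listed conditions requires a new SSRC."""
    same_type_other_config = (
        old.payload_type == new.payload_type
        and old.payload_config != new.payload_config
    )
    clock_rate_differs = old.clock_rate != new.clock_rate
    mapping_unknown = not new.mapping_known_to_client
    return same_type_other_config or clock_rate_differs or mapping_unknown
```

   A result of False only means that none of the listed conditions
   forces a change; as stated above, the SSRC may still change after a
   switch.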

   In case the SSRC remains unchanged after a content switch, the RTP
   sequence numbering MUST be continuous and monotonically increasing
   and the timestamp clock MUST maintain continuity. Additionally, if
   the payload type also remains the same, the old media stream and new
   media stream MUST NOT contain any packets with decoding dependencies
   on unsent packets. Otherwise, if the SSRC is updated, a random RTP
   sequence number and timestamp SHOULD be chosen using the same or
   similar mechanisms to those used when initiating a new media session
   as described in RFC 3550 [RFC3550].

3.  Switching to new content with available description

   This clause defines all necessary RTSP client and server features for
   fast content switching when the client already has the new content
   description locally available; for instance, the client may have
   previously fetched the SDP using RTSP DESCRIBE or HTTP GET. Clients
   should assume that the fast content switching procedure defined
   herein is supported for all content items offered by this server.
   This feature reduces the RTSP protocol overhead of switching to a
   single client-server interaction.

   The feature-tag indicating this feature is "rtsp-switch". The client
   should probe the server capabilities as early as possible in the
   communication using the "rtsp-switch" tag in the "Supported" header.
   The client SHOULD use the "Require" header with this feature tag
   value when requesting this behaviour from the server. The server MUST
   use the PLAY method as defined in Section 2.2 and
   [I-D.ietf-mmusic-rfc2326bis] when the client requests this feature,
   thus replacing the currently streaming content with the newly
   requested media.

   When the RTSP client wants to change the content of the RTSP session,
   the RTSP client sends a PLAY request with the aggregated control URI
   of the new content to the RTSP server. The aggregated control URI is
   defined as in [I-D.ietf-mmusic-rfc2326bis].

   The RTSP client MUST add the media control URIs of the new streams
   requested in the "Switch-Stream" header field of the RTSP PLAY
   request. Whenever possible, the RTSP client SHOULD map media control
   URIs of the same media type (e.g. audio or video) in the old content
   to the same media type of the new session. Note that this is only
   applicable for media types which are present in both the old and new
   content. The server MUST always include the "Switch-Stream" header in
   the response, indicating the new media streams being sent. The
   "Switch-Stream" header field is defined in Section 2.1.

   If the SSRC identifiers have changed, then the server MUST indicate
   the new SSRC values of the new media streams within the "RTP-Info"
   header in the RTSP PLAY response.

   Note: if the new session contains more media components than the
   current session, the client MAY switch according to this section,
   describing the desired components in the "Switch-Stream" header, and
   add the missing components using the method defined in Section 6.

   If fewer media components are present in the new session than in the
   existing session, the client and the server remove the unused
   components as defined in Section 7.

4.  Switching to new content without content description

   Clients should assume that the fast content switching procedure
   defined here is supported for all content items offered by this
   server. The client uses the RTSP URI of the content session as the
   request URI to describe the new content item.

   Without an SDP or other adequate content description, the client is
   unable to specify the streams to which it wishes to subscribe. In
   order to initiate a content switch within a single RTSP round trip,
   the client MAY perform a PLAY request to initiate a switch via
   content URI without specifying individual streams. This allows the
   client to request that the server return a content description,
   initiate a new session, set up all relevant media streams (or make an
   appropriate stream selection), and begin playback. The content URI
   used in the PLAY request is the same content URI used in a DESCRIBE
   request. In order to signal that it wishes to receive the description
   and make a switch, the client MUST include the "SDP-Requested" header
   as defined below.

   SDP-Requested-Header = "SDP-Requested" COLON "1"

   If a server receives a PLAY request and completes all actions
   successfully, the server responds with the content description,
   Session-ID, RTP-Info or other required transport headers, and a
   "Switch-Stream" header (Section 2.1), and begins streaming the new
   content immediately. Whenever possible, the RTSP server SHOULD map
   the media control URIs of the same media type (e.g. audio or video)
   in the old content to the same media type of the new session. Note:
   this is only applicable for media types which are present in both the
   old and new content.

   In case of RTP transport, the RTP-Info in the PLAY response MUST
   contain the SSRC for each stream.

   The server MAY issue a new session ID in the response, or it may
   re-use the existing session ID. The client MUST be prepared for
   either case.

   If the server is not yet able to begin streaming, it responds with a
   202 (Accepted) success code and an appropriate session description.
   The client may then perform a switch as described in Section 3,
   specifying the streams it would like to receive. This condition can
   occur if the server requires further client input regarding stream
   setup prior to beginning playback - for instance if the content
   requested is available in multiple alternate languages and the server
   does not have the information necessary to choose a language.
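
   The request described in this section might be assembled as in the
   following non-normative Python sketch. The function names are
   hypothetical. The sketch assumes the usual RTSP request layout
   (request line with the "RTSP/2.0" version token, "CSeq" and "Session"
   headers, CRLF line endings), carries the "rtsp-switch-req-sdp"
   feature tag listed in Section 1.2 in a "Require" header, and omits
   any further headers a real client would send.

```python
def build_sdp_requested_play(content_uri: str, cseq: int, session_id: str) -> str:
    """Return a PLAY request that asks for a switch plus the description."""
    lines = [
        "PLAY {} RTSP/2.0".format(content_uri),
        "CSeq: {}".format(cseq),
        "Session: {}".format(session_id),
        # Feature tag from Section 1.2; placing it in Require is an
        # assumption of this sketch.
        "Require: rtsp-switch-req-sdp",
        # Signals that the description is requested with the switch.
        "SDP-Requested: 1",
    ]
    # An empty line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


def needs_explicit_stream_selection(status_code: int) -> bool:
    """True for 202 (Accepted): streaming has not started yet and the
    client has to follow up with a switch that names the streams."""
    return status_code == 202
```

   A client using this sketch would send the string returned by
   build_sdp_requested_play() on its existing RTSP connection and, if
   needs_explicit_stream_selection() is true for the response status,
   continue with the procedure of Section 3 using the returned session
   description.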

   If the server is not yet able to begin transmitting all the media
   streams, it MAY begin a subset of the streams and respond with a 206
   (Partial Data) success code and a session description. The
   "Switch-Stream" header and the "RTP-Info" header will indicate which
   streams have been selected for playback. The client MAY then add
   additional media components as described in Section 6.

   If fewer media components are present in the new content than are
   currently in use, then the server responds with a 200 (OK). The
   client MUST then remove the "unused" media components as defined in
   Section 7. The client and the server SHOULD release the resources for
   the unused streams without explicit TEARDOWN signalling.

   The feature tag "rtsp-switch-req-sdp" is defined to describe support
   for this feature. The client SHOULD probe the server capabilities as
   early as possible in the communication using the "Supported" header
   and SHOULD use the "Require" header with this feature tag value when
   requesting this behaviour from the server. The server MUST use the
   PLAY method semantics defined in Section 2.2 and
   [I-D.ietf-mmusic-rfc2326bis] when the client requests this feature.

5.  Switching Media within an RTSP session

   Some content may be available for streaming in different
   representations. An example of such a use case is the live streaming
   of a sporting event with multiple camera views. The session
   description available at the receiver describes multiple options for
   one or several media types (e.g. video, audio, or subtitles). Upon
   initial setup of the session, the player (or the user) selects the
   preferred combination of the presentation to be consumed and sets up
   the corresponding media streams. At a later point, the user may
   trigger a switch to a different media stream carrying an alternative
   representation of the media. The PLAY request is sent with the
   "Switch-Stream" header field as defined in Section 2.1, indicating
   the URIs of both the old media stream and the new replacement stream.

   Upon receiving a PLAY request with a "Switch-Stream" header field for
   an active session, an RTSP server that supports this feature switches
   to the new media stream using the same transport parameters described
   in the initial SETUP request for the old media stream. After
   successfully processing the request, the RTSP server MUST reply with
   an "RTP-Info" header indicating all active media streams in the
   changed session, including those that are unchanged. The "RTP-Info"
   header MAY include the SSRCs for each active media stream; it MUST
   contain all new SSRCs of the changed streams if the SSRCs have
   changed. The response MAY also include the "Switch-Stream" header,
   indicating the stream switches that were successful. If the
   "Switch-Stream" header field is not present in a successful response
   and the RTSP server has indicated support for the media switching
   functionality, the receiver MUST assume that all requested switches
   were successful.

   The feature tag "rtsp-switch-stream" is defined to describe support
   for this feature. This feature tag differs from the feature tag
   "rtsp-switch" described in Section 3, which indicates support for
   content (aggregated stream) switching. The client SHOULD use the
   "Require" header with this feature tag value when requesting this
   behaviour from the server. The server MUST use the PLAY method
   semantics as defined in Section 2.2 and [I-D.ietf-mmusic-rfc2326bis]
   when the client requests this feature.

   Note that several media streams of a presentation may be switched at
   the same time in a single PLAY request.

6.  Adding Media Components to an ongoing session

   It may happen that the new content stream consists of more media
   components than the ongoing content stream. In such a case, it is
   RECOMMENDED that the client switch to the new content with the
   already established resources and then add further components.

   The client SHOULD pipeline the SETUP requests for the new components
   after the content switching request. The client MUST issue a PLAY
   request after the successful establishment of the new media
   components to start all media components. If the server has
   successfully added the media component, the "RTP-Info" header in the
   RTSP PLAY response MUST contain the synchronization information for
   all media components.

   The session ID value of the already established session MUST be
   included in the SETUP request to indicate the relation of the new
   media component to the established session.

7.  Removing Media Components from an ongoing session

   An RTSP client wishing to terminate the streaming of a specific media
   stream MAY send a PLAY request with a "Switch-Stream" header
   (Section 2.1) indicating the URI of the media stream to be torn down
   as the "old-stream". No URI for the "new-stream" should be specified.

   Upon receiving a PLAY request with a "Switch-Stream" header field
   indicating that one or more media streams are to be terminated, the
   server MUST stop streaming the indicated media streams and SHOULD
   release the unused resources (e.g. RTP/RTCP UDP ports). The other
   media streams MUST NOT be interrupted.

   After successfully processing the request, the server MUST reply with
   a success response message and a "Session" header field, even if the
   session contains no more media streams. The RTSP client MUST use
   TEARDOWN only to tear down the whole session.

8.  IANA Considerations

   The feature tags "rtsp-switch", "rtsp-switch-req-sdp" and
   "rtsp-switch-stream" need to be registered with IANA.

9.  Security Considerations

   Same as for RTSP [I-D.ietf-mmusic-rfc2326bis].

10.  Acknowledgements

   This document has benefited greatly from the discussions and comments
   from all those participating in 3GPP TSG SA4. In particular, the
   following individuals have contributed to this specification: Gamze
   Seckin, Imed Bouazizi, Frederic Gabin, Igor Curcio, Francois Martin.

11.  References

11.1.  Normative References

   [I-D.ietf-mmusic-rfc2326bis]
              Schulzrinne, H., Rao, A., Lanphier, R., Westerlund, M.,
              and M. Stiemerling, "Real Time Streaming Protocol 2.0
              (RTSP)", draft-ietf-mmusic-rfc2326bis-22 (work in
              progress), July 2009.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

11.2.  Informative References

   [3GPP.26.234]
              3GPP, "Transparent end-to-end Packet-switched Streaming
              Service (PSS); Protocols and codecs", 3GPP TS 26.234
              7.5.0, March 2008.

   [RFC2326]  Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
              Streaming Protocol (RTSP)", RFC 2326, April 1998.

Authors' Addresses

   Thorsten Lohmar
   Ericsson GmbH
   Ericsson Allee 1
   Herzogenrath  52134
   Germany

   Phone: +49-2407-5757816
   Fax:   +49-2407-575400
   Email: Thorsten.Lohmar@ericsson.com

   Jamie Gordon
   RealNetworks, Inc.
   2601 Elliott Avenue
   Seattle, WA  98121
   USA

   Phone: +1-206-674-2700
   Email: jgordon@real.com

   Torbjoern Einarsson
   Ericsson AB
   Faeroegatan 6
   Stockholm  164 80
   Sweden

   Email: Torbjorn.Einarsson@ericsson.com