RTP Audio Video Profile: History

The RTP audio/video profile (RTP/AVP) is a profile for the Real-time Transport Protocol (RTP) that specifies the technical parameters of audio and video streams. RTP defines a general-purpose data format but does not specify how encoded data should use its features: which payload type value to put in the RTP header, which sampling rate and clock rate (the rate at which the RTP timestamp increments) to use, and so on. An RTP profile specifies these details. The RTP audio/video profile maps specific audio and video codecs and their sampling rates to RTP payload types and clock rates, specifies how to encode each data format as an RTP data payload, and specifies how to describe these mappings using the Session Description Protocol (SDP).
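
As an illustration of how such a mapping is described in SDP, the following minimal Python sketch formats and parses `a=rtpmap:` attribute lines of the form `a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]`. The `RtpMap` type and the two helper functions are hypothetical names introduced for this example; they are not part of any standard library or of the profile itself.

```python
from typing import NamedTuple, Optional


class RtpMap(NamedTuple):
    """One payload type mapping (hypothetical helper type for this sketch)."""
    payload_type: int
    encoding_name: str
    clock_rate: int
    channels: Optional[int] = None  # omitted in SDP when not signalled


def format_rtpmap(m: RtpMap) -> str:
    """Render a mapping as an SDP rtpmap attribute line."""
    line = f"a=rtpmap:{m.payload_type} {m.encoding_name}/{m.clock_rate}"
    if m.channels is not None:
        line += f"/{m.channels}"
    return line


def parse_rtpmap(line: str) -> RtpMap:
    """Parse an SDP rtpmap attribute line back into a mapping."""
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        raise ValueError(f"not an rtpmap attribute: {line!r}")
    pt_text, encoding = line[len(prefix):].split(" ", 1)
    parts = encoding.strip().split("/")
    if len(parts) not in (2, 3):
        raise ValueError(f"malformed encoding description: {encoding!r}")
    channels = int(parts[2]) if len(parts) == 3 else None
    return RtpMap(int(pt_text), parts[0], int(parts[1]), channels)


if __name__ == "__main__":
    # Static payload type 0 (PCMU, 8000 Hz clock) from the table below.
    print(format_rtpmap(RtpMap(0, "PCMU", 8000)))
    # A dynamic payload type; 96 is an arbitrary choice from the dynamic range.
    print(parse_rtpmap("a=rtpmap:96 opus/48000/2"))
```

Static payload types such as 0 (PCMU) already have a fixed meaning under the profile, whereas a dynamic payload type only acquires one through a mapping like the second line.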

  • video codecs
  • real-time
  • sampling rate

1. RTP/AVP Audio and Video Payload Types

Payload type (PT) Name Type No. of channels Clock rate (Hz)[1] Frame size (ms) Default packet size (ms) Description References
0 PCMU audio 1 8000 any 20 ITU-T G.711 PCM µ-Law audio 64 kbit/s RFC 3551
1 reserved (previously FS-1016 CELP) audio 1 8000     reserved, previously FS-1016 CELP audio 4.8 kbit/s RFC 3551, previously RFC 1890
2 reserved (previously G721 or G726-32) audio 1 8000     reserved, previously ITU-T G.721 ADPCM audio 32 kbit/s or ITU-T G.726 audio 32 kbit/s RFC 3551, previously RFC 1890
3 GSM audio 1 8000 20 20 European GSM Full Rate audio 13 kbit/s (GSM 06.10) RFC 3551
4 G723 audio 1 8000 30 30 ITU-T G.723.1 audio RFC 3551
5 DVI4 audio 1 8000 any 20 IMA ADPCM audio 32 kbit/s RFC 3551
6 DVI4 audio 1 16000 any 20 IMA ADPCM audio 64 kbit/s RFC 3551
7 LPC audio 1 8000 any 20 Experimental Linear Predictive Coding audio 5.6 kbit/s RFC 3551
8 PCMA audio 1 8000 any 20 ITU-T G.711 PCM A-Law audio 64 kbit/s RFC 3551
9 G722 audio 1 8000[2] any 20 ITU-T G.722 audio 64 kbit/s RFC 3551 - Page 14
10 L16 audio 2 44100 any 20 Linear PCM 16-bit Stereo audio 1411.2 kbit/s,[3][4][5] uncompressed RFC 3551, Page 27
11 L16 audio 1 44100 any 20 Linear PCM 16-bit audio 705.6 kbit/s, uncompressed RFC 3551, Page 27
12 QCELP audio 1 8000 20 20 Qualcomm Code Excited Linear Prediction RFC 2658, RFC 3551
13 CN audio 1 8000     Comfort noise. Payload type used with audio codecs that do not support comfort noise as part of the codec itself such as G.711, G.722.1, G.722, G.726, G.727, G.728, GSM 06.10, Siren, and RTAudio. RFC 3389
14 MPA audio 1, 2 90000 8–72   MPEG-1 or MPEG-2 audio only RFC 3551, RFC 2250
15 G728 audio 1 8000 2.5 20 ITU-T G.728 audio 16 kbit/s RFC 3551
16 DVI4 audio 1 11025 any 20 IMA ADPCM audio 44.1 kbit/s RFC 3551
17 DVI4 audio 1 22050 any 20 IMA ADPCM audio 88.2 kbit/s RFC 3551
18 G729 audio 1 8000 10 20 ITU-T G.729 and G.729a audio 8 kbit/s; Annex B is implied unless the annexb=no parameter is used RFC 3551, Page 20, RFC 3555, Page 15
19 reserved (previously CN) audio         reserved, previously comfort noise RFC 3551
25 CELB video   90000     Sun CellB video[6] RFC 2029
26 JPEG video   90000     JPEG video RFC 2435
28 nv video   90000     Xerox PARC's Network Video (nv)[7] RFC 3551, Page 32
31 H261 video   90000     ITU-T H.261 video RFC 4587
32 MPV video   90000     MPEG-1 and MPEG-2 video RFC 2250
33 MP2T audio/video   90000     MPEG-2 transport stream RFC 2250
34 H263 video   90000     H.263 video, first version (1996) RFC 3551, RFC 2190
72–76 reserved           reserved because RTCP packet types 200–204 would otherwise be indistinguishable from RTP payload types 72–76 with the marker bit set RFC 3550, RFC 3551
dynamic H263-1998 video   90000     H.263 video, second version (1998) RFC 3551, RFC 4629, RFC 2190
dynamic H263-2000 video   90000     H.263 video, third version (2000) RFC 4629
dynamic (or profile) H264 AVC video   90000     H.264 video (MPEG-4 Part 10) RFC 6184, previously RFC 3984
dynamic (or profile) H264 SVC video   90000     H.264 video RFC 6190
dynamic (or profile) H265 video   90000     H.265 video (HEVC) RFC 7798
dynamic (or profile) theora video   90000     Theora video draft-barbato-avt-rtp-theora
dynamic iLBC audio 1 8000 20, 30 20, 30 Internet low Bitrate Codec 13.33 or 15.2 kbit/s RFC 3952
dynamic PCMA-WB audio 1 16000 5   ITU-T G.711.1 A-law RFC 5391
dynamic PCMU-WB audio 1 16000 5   ITU-T G.711.1 µ-law RFC 5391
dynamic G718 audio   32000 (placeholder) 20   ITU-T G.718 draft-ietf-payload-rtp-g718
dynamic G719 audio (various) 48000 20   ITU-T G.719 RFC 5404
dynamic G7221 audio   16000, 32000 20   ITU-T G.722.1 and G.722.1 Annex C RFC 5577
dynamic G726-16 audio 1 8000 any 20 ITU-T G.726 audio 16 kbit/s RFC 3551
dynamic G726-24 audio 1 8000 any 20 ITU-T G.726 audio 24 kbit/s RFC 3551
dynamic G726-32 audio 1 8000 any 20 ITU-T G.726 audio 32 kbit/s RFC 3551
dynamic G726-40 audio 1 8000 any 20 ITU-T G.726 audio 40 kbit/s RFC 3551
dynamic G729D audio 1 8000 10 20 ITU-T G.729 Annex D RFC 3551
dynamic G729E audio 1 8000 10 20 ITU-T G.729 Annex E RFC 3551
dynamic G7291 audio   16000 20   ITU-T G.729.1 RFC 4749
dynamic GSM-EFR audio 1 8000 20 20 ETSI GSM-EFR (GSM 06.60) RFC 3551
dynamic GSM-HR-08 audio 1 8000 20   ETSI GSM-HR (GSM 06.20) RFC 5993
dynamic (or profile) AMR audio (various) 8000 20   Adaptive Multi-Rate audio RFC 4867
dynamic (or profile) AMR-WB audio (various) 16000 20   Adaptive Multi-Rate Wideband audio (ITU-T G.722.2) RFC 4867
dynamic (or profile) AMR-WB+ audio 1, 2 or omit 72000 13.3–40   Extended Adaptive Multi Rate – WideBand audio RFC 4352
dynamic (or profile) vorbis audio (various) (various)     Vorbis audio RFC 5215
dynamic (or profile) opus audio 1, 2 48000[8] 2.5–60 20 Opus audio RFC 7587
dynamic (or profile) speex audio 1 8000, 16000, 32000 20   Speex audio RFC 5574
dynamic mpa-robust audio 1, 2 90000 24–72   Loss-Tolerant MP3 audio RFC 5219 (previously RFC 3119)
dynamic (or profile) MP4A-LATM audio   90000 or others     MPEG-4 Audio RFC 6416 (previously RFC 3016)
dynamic (or profile) MP4V-ES video   90000 or others     MPEG-4 Visual RFC 6416 (previously RFC 3016)
dynamic (or profile) mpeg4-generic audio/video   90000 or other     MPEG-4 Elementary Streams RFC 3640
dynamic VP8 video   90000     VP8 video RFC 7741
dynamic VP9 video   90000     VP9 video draft-ietf-payload-vp9
dynamic L8 audio (various) (various) any 20 Linear PCM 8-bit audio with 128 offset RFC 3551 Section 4.5.10 and Table 5
dynamic DAT12 audio (various) (various) any 20 (by analogy with L16) IEC 61119 12-bit nonlinear audio RFC 3190 Section 3
dynamic L16 audio (various) (various) any 20 Linear PCM 16-bit audio RFC 3551 Section 4.5.11, RFC 2586
dynamic L20 audio (various) (various) any 20 (by analogy with L16) Linear PCM 20-bit audio RFC 3190 Section 4
dynamic L24 audio (various) (various) any 20 (by analogy with L16) Linear PCM 24-bit audio RFC 3190 Section 4
dynamic raw video   90000     Uncompressed Video RFC 4175
dynamic ac3 audio (various) 32000, 44100, 48000     Dolby AC-3 audio RFC 4184
dynamic eac3 audio (various) 32000, 44100, 48000     Enhanced AC-3 audio RFC 4598
dynamic t140 text   1000     Text over IP RFC 4103
dynamic EVRC, EVRC0, EVRC1 audio   8000     EVRC audio RFC 4788
dynamic EVRCB, EVRCB0, EVRCB1 audio   8000     EVRC-B audio RFC 4788
dynamic EVRCWB, EVRCWB0, EVRCWB1 audio   16000     EVRC-WB audio RFC 5188
dynamic jpeg2000 video   90000     JPEG 2000 video RFC 5371
dynamic UEMCLIP audio   8000, 16000     UEMCLIP audio RFC 5686
dynamic ATRAC3 audio   44100     ATRAC3 audio RFC 5584
dynamic ATRAC-X audio   44100, 48000     ATRAC3+ audio RFC 5584
dynamic ATRAC-ADVANCED-LOSSLESS audio   (various)     ATRAC Advanced Lossless audio RFC 5584
dynamic DV video   90000     DV video RFC 3189
dynamic BT656 video         ITU-R BT.656 video RFC 3555
dynamic BMPEG video         Bundled MPEG-2 video RFC 2343
dynamic SMPTE292M video         SMPTE 292M video RFC 3497
dynamic RED audio         Redundant Audio Data RFC 2198
dynamic VDVI audio         Variable-rate DVI4 audio RFC 3551
dynamic MP1S video         MPEG-1 Systems Streams video RFC 2250
dynamic MP2P video         MPEG-2 Program Streams video RFC 2250
dynamic tone audio   8000 (default)     tone RFC 4733
dynamic telephone-event audio   8000 (default)     DTMF tone RFC 4733
dynamic aptx audio 2 – 6 (equal to sampling rate) 4000 ÷ sample rate 4[9] aptX audio RFC 7310
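
The timing notes attached to the table can be checked with a little arithmetic. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the aptX helper simply rounds down to a whole number of samples, which is one reading of the aptX note rather than a full implementation of RFC 7310.

```python
from fractions import Fraction


def ticks_per_packet(clock_rate_hz: int, duration_ms) -> Fraction:
    """RTP timestamp increment for one packet of the given duration."""
    return Fraction(clock_rate_hz) * Fraction(duration_ms) / 1000


def ticks_per_frame(clock_rate_hz: int, frames_per_second) -> Fraction:
    """RTP timestamp increment between consecutive video frames."""
    return Fraction(clock_rate_hz) / Fraction(frames_per_second)


def aptx_interval_ms(sample_rate_hz: int, nominal_ms: int = 4) -> float:
    """Round the nominal interval down to a whole number of samples.

    Assumption: only the integer-sample constraint from the aptX note
    is modelled here.
    """
    samples = sample_rate_hz * nominal_ms // 1000  # floor to whole samples
    return samples * 1000 / sample_rate_hz


if __name__ == "__main__":
    # PCMU: 8000 Hz clock, 20 ms default packet.
    print(ticks_per_packet(8000, 20))
    # G.722: 16000 Hz sampling, but the RTP clock rate is 8000.
    print(ticks_per_packet(16000, 20), "samples vs",
          ticks_per_packet(8000, 20), "timestamp ticks")
    # Video at 30 frames per second on the 90000 Hz clock.
    print(ticks_per_frame(90000, 30))
    # aptX at 44100 Hz: a nominal 4 ms interval shrinks slightly.
    print(round(aptx_interval_ms(44100), 2))
```

Exact fractions are used so that non-integer frame rates such as 30000/1001 do not pick up floating-point error.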

For each payload type, RFC 3551 either specifies the payload format or gives a reference to the document that does. Payload type values 96–127 are used for payloads defined dynamically during a session. RFC 3551 recommends dynamically assigned port numbers, although ports 5004 and 5005 have been registered for use with the profile when a dynamically assigned port is not required. The standard also describes the process of registering new payload types with IANA.
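
The payload type ranges mentioned here and in the table can be sketched in Python as follows. In the RTP header the marker bit and the 7-bit payload type share the second octet, which is why payload types 72–76 with the marker bit set would look like RTCP packet types 200–204. The function names and category labels are hypothetical and only cover the ranges this entry mentions.

```python
def second_octet(marker: bool, payload_type: int) -> int:
    """Second octet of an RTP header: marker bit (MSB) plus 7-bit payload type."""
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type must fit in 7 bits")
    return (int(marker) << 7) | payload_type


def classify_payload_type(payload_type: int) -> str:
    """Coarse classification using only the ranges named in this entry."""
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type must fit in 7 bits")
    if 72 <= payload_type <= 76:
        return "reserved (RTCP conflict)"
    if 96 <= payload_type <= 127:
        return "dynamic"
    return "static or unassigned"


if __name__ == "__main__":
    # Payload types 72..76 with the marker bit set map onto 200..204.
    print([second_octet(True, pt) for pt in range(72, 77)])
    for pt in (0, 72, 96, 127):
        print(pt, classify_payload_type(pt))
```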

Applications operating under this profile should always support PCMU (payload type 0). Previously, DVI4 (payload type 5) was also recommended, but this recommendation was removed in August 2013 by RFC 7007 because "many RTP deployments do not support DVI4, and there is little reason to use it when much more modern codecs are available."

The content is sourced from: https://handwiki.org/wiki/RTP_audio_video_profile

References

  1. The "clock rate" is the rate at which the timestamp in the RTP header is incremented, which need not be the same as the codec's sampling rate. For instance, video codecs typically use a clock rate of 90000 so their frames can be more precisely aligned with the RTCP NTP timestamp, even though video sampling rates are typically in the range of 1–60 samples per second.
  2. Although the sampling rate for G.722 is 16000, its clock rate is 8000 to remain backwards compatible with RFC 1890, which incorrectly used this value.[10]
  3. "RFC 2586 - The Audio/L16 MIME content type". May 1999. https://tools.ietf.org/html/rfc2586. Retrieved 2010-03-16. 
  4. "RFC 3108 - Conventions for the use of the Session Description Protocol (SDP) for ATM Bearer Connections". May 2001. https://tools.ietf.org/html/rfc3108#page-62. Retrieved 2010-03-16. 
  5. "RFC 4856 - Media Type Registration of Payload Formats in the RTP Profile for Audio and Video Conferences - Registration of Media Type audio/L16". March 2007. https://tools.ietf.org/html/rfc4856#page-18. Retrieved 2010-03-16. 
  6. XIL Programmer's Guide, Chapter 22 "CellB Codec". August 1997. Retrieved on 2014-07-19. https://docs.oracle.com/cd/E19504-01/802-5863/802-5863.pdf
  7. nv - network video on Henning Schulzrinne's website, Network Video on The University of Toronto's website, Retrieved on 2009-07-09. http://www.cs.columbia.edu/~hgs/rtp/nv.html
  8. Because Opus can change sampling rates dynamically, its clock rate is fixed at 48000, even when the codec will be operated at a lower sampling rate. The maxplaybackrate and sprop-maxcapturerate parameters in SDP can be used to indicate hints/preferences about the maximum sampling rate to encode/decode.
  9. For aptX, the packetization interval must be rounded down to the nearest packet interval that can contain an integer number of samples. So at sampling rates of 11025, 22050, or 44100, a packetization rate of "4" is rounded down to 3.99.
  10. RFC 3551, RTP Profile for Audio and Video Conferences with Minimal Control, H. Schulzrinne, S. Casner, The Internet Society (July 2003).