1. 20 Years of Streaming
in 20 Minutes
Yuriy Reznik, Brightcove
Christian Timmerer, Bitmovin & Alpen-Adria-Universität Klagenfurt
Dec 3rd, 2020
2. Your Team
Yuriy Reznik
Technology Fellow and
Head of Research at Brightcove
http://reznik.org/
Christian Timmerer
Assoc.-Prof at
Alpen-Adria-Universität Klagenfurt
CIO | Head of Research and
Standardization at Bitmovin
http://timmerer.com/
3. ● Human life before streaming…
● Early streaming systems
● First ABR streaming systems
● Early standards: RTSP, ISMA, 3GPP PSS, etc.
● Shift to HTTP
Agenda
Part 1
5. 1993: MBONE
▸ Virtual multicast network connecting several universities & ISPs
▸ RTP-based video conferencing tool (vic) is used to stream videos
▸ 1994 Rolling Stones concert – first major event streamed online
1995: RealAudio, 1997: RealVideo
▸ First commercially successful mass-scale streaming system
▸ Proprietary protocols, codecs: PNA, RealAudio, RealVideo
▸ Worked over UDP, TCP, and HTTP (“cloaking” mode)
▸ First major broadcast: 1995 Seattle Mariners vs New York Yankees
1996: VDOnet, Vivo, NetShow, VXtreme, ...
▸ Many vendors initially competed in the streaming space
▸ Vivo & Xing were acquired by Real, VXtreme by Microsoft
▸ By 1998, 3 main vendors remained: Real, Microsoft and Apple
1998: RealSystem G2
▸ First ABR streaming system
Early Streaming Systems
6. Discovery of pre-roll delay
▸ Many early systems (Vivo, NetShow, VDOnet, etc.) tried to use H.324 / H.323 video conferencing stacks for streaming, but they worked very poorly!
▸ The first important discovery, and the first deviation of streaming system design from video conferencing, was the introduction of a much longer initial delay.
Original uses of the pre-roll delay / buffer
▸ Leaky bucket model: reducing probability of stalls with network bandwidth fluctuations
▸ Reordering of out-of-order received UDP packets
▸ Limited retransmissions (ARQ)
▸ Interleaving / multiple-description coding of audio
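As a rough illustration of the leaky bucket use of the pre-roll buffer, the sketch below simulates a buffer that is filled at a fluctuating network rate and drained at the media bitrate once the pre-roll has been collected. The function name, the one-second time step, and the bandwidth trace are hypothetical choices made for this example; they are not taken from any of the systems named above.

```python
def count_stall_seconds(bandwidth_kbps, media_kbps, preroll_s):
    """Count seconds of stalled playback for a given pre-roll delay.

    bandwidth_kbps: per-second network throughput samples (hypothetical trace)
    media_kbps:     constant media bitrate drained during playback
    preroll_s:      seconds of media to buffer before playback starts
    """
    buffered_kbit = 0.0
    playing = False
    stalls = 0
    for rate in bandwidth_kbps:
        buffered_kbit += rate  # one second of download fills the bucket
        if not playing:
            # Playback starts in the second after the pre-roll is collected.
            if buffered_kbit >= preroll_s * media_kbps:
                playing = True
            continue
        if buffered_kbit >= media_kbps:
            buffered_kbit -= media_kbps  # one second of playback drains it
        else:
            stalls += 1  # not enough data for this second: playback stalls
    return stalls


# Hypothetical trace with a three-second bandwidth dip, 30 kbps media.
trace = [30, 30, 5, 5, 5, 60, 60, 30, 30, 30]
print(count_stall_seconds(trace, media_kbps=30, preroll_s=1))  # 2
print(count_stall_seconds(trace, media_kbps=30, preroll_s=3))  # 0
```

With this trace, a one-second pre-roll leaves too little data to ride out the dip, while a three-second pre-roll absorbs it at the cost of a later start.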
Interleaved packetization (RealAudio, 1995):
▸ 20-ms audio frames after encoder:
▸ UDP packets:
▸ Effects of loss of a packet:
▸ Missing audio frames were bidirectionally predicted/synthesized during decoding.
▸ This worked remarkably well even with heavy (5-10%) packet loss rates!
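The idea behind interleaved packetization can be sketched as follows. The exact interleaving pattern used by RealAudio is not given here, so the stride-based layout below, along with all function names, is an assumption chosen only to show why losing one packet produces isolated gaps rather than one long gap.

```python
def interleave(num_frames, frames_per_packet):
    """Spread consecutive frames across different packets.

    Packet p carries frames p, p + P, p + 2P, ... where P is the number
    of packets. Assumes num_frames is a multiple of frames_per_packet.
    """
    num_packets = num_frames // frames_per_packet
    return [list(range(p, num_frames, num_packets)) for p in range(num_packets)]


def lost_frames(packets, lost_packet_index):
    """Return the frame numbers missing when one packet is dropped."""
    return sorted(packets[lost_packet_index])


def longest_gap(frame_numbers):
    """Length of the longest run of consecutive missing frames."""
    longest = run = 0
    previous = None
    for n in frame_numbers:
        run = run + 1 if previous is not None and n == previous + 1 else 1
        longest = max(longest, run)
        previous = n
    return longest


packets = interleave(num_frames=12, frames_per_packet=3)
print(packets)                               # [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]
print(longest_gap(lost_frames(packets, 1)))  # 1 (frames 1, 5, 9 are isolated)
print(longest_gap([3, 4, 5]))                # 3 (sequential packing loses a whole run)
```

With 20-ms frames, a lost packet in the sequential layout removes 60 ms of contiguous audio, whereas in the interleaved layout it removes three separate 20-ms frames, each with received neighbours that the decoder can use to synthesize the missing one.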
Working with Lossy & Time-Variant Networks
Expected delay, throughput, and throughput distribution caused by retransmissions:
7. 1998: RealSystem G2, “SureStream” technology
▸ First commercially successful ABR streaming system
▸ Encoder: Encoded streams: Player:
References
▸ B. Girod, et al, “Scalable codec architectures for Internet video-on-demand,” ACSSC, pp. 357 – 361, 1997.
▸ G. Conklin, et al, “Video Coding for Streaming Media Delivery on the Internet," TCSVT, 11 (3), pp. 20-34, 2001.
▸ US Patents: 6314466, 6480541, 7075986, 7885340
First ABR Streaming System
Figure callouts: multi-rate packaging option; selection of streams to include; panel showing which stream is selected
8. RealSystem G2 architecture (1998):
▸ RTSP for session control; RDT, RTP, or TCP for stream transmission. The public IP network was used for delivery.
▸ Stream adaptation was performed by the server, but it was client-driven: the client sent requests to switch streams
▸ The server was also responsible for retransmissions, mixing in FEC packets, etc. Everything was sent in “packets”
Challenges it faced:
▸ Scalability: each single server was scalable to ~10K streams, and anything beyond was a major challenge
▸ To address this, RealNetworks deployed proxy servers (RealProxy) that could be chained
▸ RBN (Real Broadcast Network) was the first chained deployment of such servers for streaming of high-volume events
▸ Reliability/quality: the public IP network was horrible, as were end-user links and the robustness of the servers and clients
▸ Implementation complexity: advanced error resilience and error concealment functionality was required for every decoder in
every client. All codecs had to be custom written and supported on all platforms.
How Did the First ABR Streaming System Work?
9. 1996: RTSP – Real-Time Streaming Protocol
▸ Session protocol for packet-based streaming
▸ Main contributors: RealNetworks, Netscape, Columbia University
▸ Used as the foundation for most streaming systems of the 1998-2008 era
2000: ISMA – Internet Streaming Media Alliance
▸ Forum created by Apple, Cisco, Kasenna, Philips, and Sun
▸ ISMA 2.0: RTSP+RTP+RTCP + H.264 and HE-AAC codecs
▸ ISOBMFF with hint tracks is employed for storage of encoded streams
▸ ISMA 2.0 was supported by many servers and clients of that era
2006: 3GPP PSS – Packet Switched Streaming
▸ Describes an RTSP+RTP+RTCP-based adaptive bitrate (ABR) streaming system with several standard video, audio, and speech codecs
▸ 3GPP version of RTSP/RTP-based stack
2006: 3GPP2 MSS – Multimedia Streaming Services
▸ Similar to 3GPP PSS, but differs in speech codecs & network stack
Early Streaming Standards & Industry Fora
Full protocol stack in 3GPP2 MSS:
Session setup and streaming phases:
10. Networks have improved!!
▸ When streaming started, 28k and 56k modems were the common connections available
▸ But by mid-2000s consumers moved to Cable, DSL, or other high-speed connections
▸ Bitrates were up 5-100x, latencies were 4-10x down, packet losses were under 1-2%
▸ This relaxed requirements dramatically!
▸ Progressive download became a feasible alternative to streaming!
CDNs became ubiquitous
▸ By the mid-2000s, Akamai, Limelight, and other CDNs were widely deployed
▸ CDNs provided much higher density and reach than RTSP-based delivery networks
Other practical & business reasons
▸ The space was fragmented: Real, Microsoft, Apple, and then Adobe used significantly different implementations
of RTSP stacks. Even codecs and file formats were different!
▸ RTSP systems were complex: error concealment was a major pain, and most hardware decoders did not support it!
▸ And eventually folks started looking for simpler ways to do streaming... much simpler!
And… one day a really simple solution was found… streaming over HTTP!
Why Do Today’s Streaming Systems Use HTTP?
11. ● Video Streaming over HTTP/TCP
○ Progressive Download
○ Pseudo Streaming
○ HTTP Adaptive Streaming (HAS)
● MPEG DASH, HLS, CMAF
● Summary: What’s Next?
Agenda
Part 2
12. “The nice thing about standards
is that you have so many to
choose from.”
Andrew S. Tanenbaum, Computer Networks
14. Video Delivery over HTTP/TCP
Progressive Download
○ Enables playback while still downloading
○ Server sends the file as fast as possible
Pseudo Streaming
○ Enables seeking via media indexing
○ Server paces transmission based on encoding rate
Chunked Streaming
○ Content is divided into short-duration chunks
○ Enables live streaming and ad insertion
Adaptive Streaming
○ Multiple versions of the content are created
○ Enables adaptation to network and device conditions
Acknowledgment (figure): Ali C. Begen
15. Progressive Download
Playback starts only after there are several seconds of data in the playback buffer
Download will continue as fast as possible
Fetched content will be wasted if the viewer clicks away
Bing Wang, Jim Kurose, Prashant Shenoy, and Don Towsley. 2004. Multimedia streaming via TCP: an analytic performance study.
ACM International Conference on Multimedia (MM'04). DOI:https://doi.org/10.1145/1027527.1027735
“TCP generally provides good streaming performance when the achievable TCP throughput is roughly twice the media bitrate, with only a few seconds of startup delay”
HTTP Request
HTTP Response
Can seek only within the fetched content
Acknowledgment (figure): Ali C. Begen
16. Streaming is cooler, more viewer friendly
L. De Cicco, S. Mascolo. An Experimental Investigation of the Akamai Adaptive Video Streaming. In Proc. of USAB 2010, special session Interactive Multimedia Applications (WIMA), Klagenfurt, Austria, 3-4 November 2010, LNCS 6389, pp. 447-464, Springer-Verlag, doi:10.1007/978-3-642-16607-5
Implements “pseudo streaming”
and mimics RTSP-style streaming
but via HTTP.
Playback starts when there are just a few seconds of data
Download rate will match the encoding bitrate, and downloading pauses if the player pauses
→ Less waste
Can seek to anywhere in the entire content
Acknowledgment (figure): Ali C. Begen
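The "less waste" point can be made concrete with a back-of-the-envelope estimate. The sketch below compares how much fetched data is never played when a viewer leaves early, for an unpaced progressive download versus a paced download that stays only a few seconds ahead of the playhead. The function name, the five-second look-ahead, and the example numbers are assumptions for illustration only.

```python
def wasted_megabits(watched_s, duration_s, media_mbps, link_mbps,
                    paced, buffer_ahead_s=5.0):
    """Estimate data fetched but never played when the viewer leaves early.

    An unpaced progressive download fetches at full link speed; the paced
    variant keeps at most buffer_ahead_s seconds of media ahead of the
    playhead. Startup delay is ignored for simplicity.
    """
    total_mbit = duration_s * media_mbps
    played_mbit = min(watched_s, duration_s) * media_mbps
    if paced:
        fetched_mbit = min(total_mbit, (watched_s + buffer_ahead_s) * media_mbps)
    else:
        fetched_mbit = min(total_mbit, watched_s * link_mbps)
    return max(0.0, fetched_mbit - played_mbit)


# 10-minute video at 2 Mbps over a 10 Mbps link; viewer leaves after 60 s.
print(wasted_megabits(60, 600, 2, 10, paced=False))  # 480
print(wasted_megabits(60, 600, 2, 10, paced=True))   # 10.0
```

Under these assumed numbers, the unpaced download has already fetched half of the file by the time the viewer leaves, while the paced download has fetched only the few seconds it keeps in reserve.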
17. February 2009: most of today’s streaming media in the Internet is
shipped using HTTP with progressive download (cf. YouTube, etc.).
The MPEG answer to this issue is MPEG Modern Transport (MMT):
● transport and file format friendly stream format
● cross-layer optimization between video and transport layer
● error resilience for MPEG streams
● easy conversion from/to other transport mechanisms, and
● content adaptation to different networks
… later on split into MMT (MPEG Media Transport) &
“HTTP Streaming of MPEG Media”
Early Days of HTTP Adaptive Streaming (HAS)
May 2009: HTTP Live Streaming (HLS) (draft-pantos-http-live-streaming-00) with variants “to allow clients to switch between encodings dynamically”.
18. April 2010: MPEG issues CfP for HTTP Streaming of MPEG Media
● “there is no standard for HTTP-based streaming of MPEG
media”
July 2010: Evaluation of responses to CfP
● WD of ISO/IEC 23001-6
Dynamic Adaptive Streaming over HTTP (DASH)
October 2010: Committee Draft (CD)
● First implementations available (GPAC, VLC)
January 2011: Draft International Standard (DIS)
November 2011: Final Draft International Standard (FDIS)
● DASH ratified
● DASH Promoters Group
April 2012: ISO/IEC 23009-1:2012 (1st edition) published
● First Web-/Javascript-based clients (DASH-JS)
● DASH Industry Forum: https://dashif.org/ (dash.js)
HTTP Streaming of MPEG Media
Evaluation of responses to CfP of
HTTP Streaming of MPEG Media,
July 2010, Geneva, CH
19. HTTP Adaptive Streaming 101
Adaptation logic is within the
client, not normatively specified
by a standard, subject to
research and development
Christian Timmerer and Hermann Hellwagner. 2020. HTTP Adaptive Streaming – Where Is It Heading?. In
Brazilian Symposium on Multimedia and the Web (WebMedia ’20), November 30-December 4, 2020, São Luís,
Brazil. ACM, New York, NY, USA, 2 pages. https://doi.org/10.1145/3428658.3434574
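Because the adaptation logic is left to the client, many different strategies exist. The sketch below shows one simple throughput-based rule, purely as an illustration: it estimates throughput from recent segment downloads and picks the highest bitrate that fits within a safety margin. The function names, the harmonic-mean estimator, and the 0.8 safety factor are assumptions of this example and are not prescribed by any standard.

```python
def estimate_throughput_kbps(segment_bits, download_seconds):
    """Harmonic mean of recent per-segment download throughputs, in kbps."""
    rates = [bits / secs / 1000.0
             for bits, secs in zip(segment_bits, download_seconds)]
    return len(rates) / sum(1.0 / r for r in rates)


def select_bitrate_kbps(ladder_kbps, throughput_kbps, safety=0.8):
    """Pick the highest rung that fits within a fraction of the throughput.

    Falls back to the lowest rung when nothing fits.
    """
    budget = safety * throughput_kbps
    candidates = [b for b in sorted(ladder_kbps) if b <= budget]
    return candidates[-1] if candidates else min(ladder_kbps)


ladder = [500, 1000, 2000, 3000]  # kbps, example bitrate ladder
# Two 4-Mbit segments downloaded in 1 s and 4 s: roughly 1600 kbps.
estimate = estimate_throughput_kbps([4_000_000, 4_000_000], [1.0, 4.0])
print(select_bitrate_kbps(ladder, estimate))  # 1000
print(select_bitrate_kbps(ladder, 2600))      # 2000
print(select_bitrate_kbps(ladder, 400))       # 500
```

The harmonic mean is used here because it is less sensitive to a single unusually fast download than the arithmetic mean; buffer-based or hybrid rules are equally valid design choices.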
21. MPEG DASH Data Model
MPD
  Period id = 1, start = 0 s
  Period id = 2, start = 100 s
    Adaptation Set 0: subtitle turkish
    Adaptation Set 1: video, BaseURL=http://abr.rocks.com/
      Representation 1: Rate = 500 Kbps
      Representation 2: Rate = 1 Mbps
      Representation 3: Rate = 2 Mbps, Resolution = 720p
        Segment Info: Duration = 10 s, Template: 3/$Number$.mp4
        Segment Access:
          Initialization Segment: http://abr.rocks.com/3/0.mp4
          Media Segment 1: start = 0 s, http://abr.rocks.com/3/1.mp4
          Media Segment 2: start = 10 s, http://abr.rocks.com/3/2.mp4
      Representation 4: Rate = 3 Mbps
    Adaptation Set 2: audio english
    Adaptation Set 3: audio german
  Period id = 3, start = 300 s
  Period id = 4, start = 850 s
Figure annotations:
● Splicing of arbitrary content like ads
● Selection of components/tracks
● Selection of representations
● Well-defined media format
● Chunks with addresses and timing
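The segment addressing in this example can be sketched in a few lines. The code below expands the `$Number$` template against the BaseURL and maps a playback position to a segment number, using the values from the slide (10 s segments, media segments numbered from 1). The function names are hypothetical, and the sketch deliberately handles only the plain `$Number$` identifier, not format tags or other template identifiers.

```python
def segment_number_for_time(position_s, segment_duration_s, start_number=1):
    """Map a playback position inside the Period to a media segment number."""
    return start_number + int(position_s // segment_duration_s)


def segment_url(base_url, template, number):
    """Expand the $Number$ identifier of a segment template into a URL."""
    return base_url + template.replace("$Number$", str(number))


base_url = "http://abr.rocks.com/"
template = "3/$Number$.mp4"

# Position 0 s falls in Media Segment 1, position 10 s in Media Segment 2.
print(segment_url(base_url, template, segment_number_for_time(0, 10)))
# http://abr.rocks.com/3/1.mp4
print(segment_url(base_url, template, segment_number_for_time(10, 10)))
# http://abr.rocks.com/3/2.mp4
```

Because segment URLs follow from the template and the timing alone, the client can request any segment without the manifest listing each one.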
22. MPEG DASH Status (10/20)
AMD2
• SRD
• URL param insertion
• Role extensions
AMD3
• AuthN/AuthZ
• NTP anchor
• External MPD link
• Period continuity
• Generalized HTTP header
extensions & queries
23009-5
Server &
Network
Assisted DASH
23009-6
Full Duplex
DASH
Additional Tools under development
• Reducing redundancy in multi-DRM
linear MPDs
• Bandwidth change signaling track
• Interactive story telling / DASH
AMD1
• Server-client NTP sync
• Extended profiles
AMD4
• Flexible segment &
Broadcast TV profile
• MPD chaining
• MPD fallback
• Preselections
• Data URLs in MPD
• Labels
• Switching across adaptation sets
2nd Edition 23009-1:2014
MPEG DASH
1st Edition
23009-1:2012
• Events
• Asset Identifier
✔ ✔
23009-4
Segment
Encryption &
Authentication
23009-8
Session based
DASH
operations
✔
23009-2 Conformance and Reference Software
3rd Edition 23009-1:2019
🚧
4th Edition 23009-1:2019
AMD5 (AMD1 to 3rd edition)
Device information, quality equivalence descriptor, timed text roles, announcing popular content, flexible IOP signaling, early available periods, signaling missing/alternative segments
AMD1 (to 4th edition)
CMAF support, event/timed metadata processing, resynchronization, patch method for MPD updates, preroll
5th Edition 23009-1:2021
4th Edition 23009-1:2020 ✔
23. Media delivery has three main components:
● Media format
● Manifest
● Delivery
CMAF defines the media format only
(fragments, headers, segments, chunks, tracks)
Common Media Application Format (CMAF)
Figure labels: Encoder, Encryption, Packaging; CMAF Track File, CMAF Header, CMAF Segment, CMAF Fragment, CMAF Chunk, RAP
CMAF uses ISOBMFF and common encryption (CENC)
● CENC means the media fragments can be decrypted/decoded using different DRMs
● CMAF does not mandate CTR or CBC mode
Any delivery method may be used for delivering CMAF content: HTTP, RTP multicast/unicast,
LTE broadcast
CMAF is prerequisite for low latency HAS (i.e., DASH-LL, LL-HLS)
Abdelhak Bentaleb, Christian Timmerer, Ali C. Begen, and Roger Zimmermann. 2020. Performance Analysis
of ACTE: A Bandwidth Prediction Method for Low-latency Chunked Streaming. ACM Trans. Multimedia
Comput. Commun. Appl. 16, 2s, Article 69 (July 2020), 24 pages. DOI:https://doi.org/10.1145/3387921
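Since CMAF builds on ISOBMFF, its structures can be inspected by walking the top-level boxes of a file or stream. The sketch below is a minimal box walker run on synthetic bytes; the helper names are hypothetical and the demo data is not real media. It relies on the general ISOBMFF box layout (32-bit size, four-character type, optional 64-bit size) and on the understanding that a CMAF header is carried in `ftyp` and `moov` boxes while each chunk is carried as a `moof` and `mdat` pair.

```python
import struct


def iter_boxes(data):
    """Yield (type, payload_size) for each top-level ISOBMFF box in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the data.
            size = len(data) - offset
        if size < header:
            break  # malformed box; stop rather than loop forever
        yield box_type.decode("ascii", errors="replace"), size - header
        offset += size


def make_box(box_type, payload=b""):
    """Build a minimal box for the demonstration (not real media)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic stream: header boxes followed by one chunk.
stream = (make_box(b"ftyp", b"cmfc") + make_box(b"moov")
          + make_box(b"moof") + make_box(b"mdat", b"\x00" * 10))
print(list(iter_boxes(stream)))
# [('ftyp', 4), ('moov', 0), ('moof', 0), ('mdat', 10)]
```

A low-latency player can start processing as soon as one complete `moof` and `mdat` pair has arrived, instead of waiting for a whole segment.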
24. RTSP, RTP, RTCP, et al. (UDP)
Move Networks: long gone; some components in Slingbox;
now discontinued
Legacy
● Microsoft Smooth Streaming
● Adobe Flash
● Adobe HTTP Dynamic Streaming
State of the Art: CMAF
● Apple HTTP Live Streaming
● MPEG Dynamic Adaptive Streaming over HTTP
What’s next? QUIC (UDP, again!?) -- history repeats (again)
Summary: What’s Next?
2020 Bitmovin Video Developer Survey.
https://go.bitmovin.com/video-developer-report-2020
25. Thank you for your attention!
Yuriy Reznik, Brightcove
Christian Timmerer, Bitmovin & Alpen-Adria-Universität Klagenfurt
Dec 3rd, 2020