Computer Networking: A Top-Down Approach

Internet routing paths

Overall Thoughts

I had a lot of fun reading this book. Learning about dozens of protocols in full detail paints a vivid picture of how the Internet actually works. Tangents about real-world applications, system design, and Internet history provide nice motivation, and help you engage with the material by making you wonder. The end-of-chapter interviews with networking legends were always fun to read. There are countless references to a broad range of relevant and interesting external materials — from must-know software tools to foundational research papers.

It is very readable for someone with an academic computer science background, and probably readable by someone with some general computer science familiarity who is willing to do their fair share of Googling.

A lot of information is covered at a great pace and depth. The first five chapters teach core networking concepts; chapters 6 to 9 cover advanced topics — a fun way to test and expand your core networking knowledge.

I should note that before reading this book in full, I had already taken a computer networks class in college which heavily referenced this book, so I was familiar with most of the concepts ahead of time. Additionally, I did not take notes for chapters 4, 5, 6, 7, or 8. If enough people ask, I can be convinced to write them, but otherwise this is very low on my to-do list.


High-Level Summary

Computer networks are formed via interconnected protocol-executing machines. The protocols are established by large communities and are documented in RFCs. There are several important characteristics of protocols:

  • Packet structure and length
  • Initiation and termination
  • Error detection and correction
  • Reliability
  • Security

Combinations of protocols can be used to aggregate properties — the OSI model defines general protocol paradigms. Protocols tend to be modelled with state machines.

The Internet consists of end systems and routers, which identify and locate each other via IP addresses, DNS, and routing tables.

Many Internet resources need to be managed:

  • IP address space
  • DNS and routing table entries
  • Throughput
    • Multiplexing
    • Congestion control

General architectures have been created for high-throughput services, such as content delivery networks and data centers.

Several Internet utility programs exist, such as traceroute, ping, and nslookup.

The economics of the Internet is a function of end-users paying competing Internet Service Providers (ISPs), and ISPs paying competing Autonomous Systems (ASes).


Notes

Chapter 1 — Computer Networks and the Internet

1.1 — What is the Internet?

  • The Internet
  • 1.1.1 A Nuts-And-Bolts Description
    • End systems/hosts connected by a network of communication links and packet switches
      • Hardware and software components
    • End systems/hosts access the Internet through lower-tier Internet Service Providers (ISPs), which are interconnected through national and international upper-tier ISPs
    • Protocols are run to control the sending and receiving of information
      • TCP/IP
    • Internet standards developed by the Internet Engineering Task Force (IETF), whose standard documents are requests for comments (RFCs)
  • 1.1.2 A Services Description
    • Infrastructure that provides services to distributed applications
      • Distributed since they run on multiple end systems
    • End systems and the Internet provide a socket interface for communication
  • 1.1.3 What Is a Protocol?
    • Defines the format and order of messages exchanged between two or more communicating entities, as well as the actions taken on the transmission and/or receipt of a message or other event

1.2 — The Network Edge

  • Internet of Things (IoT) — concept of loads of things (besides traditional computers) being connected to the Internet (watches, glasses, thermostats)
  • Hosts
    • Host application programs
    • Clients or servers
  • 1.2.1 Access Networks
    • Digital Subscriber Line (DSL)
    • DSL Internet access
      • Often from same local telephone company (telco) that provides local phone access
      • Exchange data with a digital subscriber line access multiplexer (DSLAM) located in the telco's local central office (CO)
      • DSL modem translates digital data to high-frequency tones
      • DSLAM translates back analogue signals from other home PCs to digital data, separates data and phone signals, and sends each into the respective networks
      • Residential telephone line carries data and traditional telephone signals simultaneously, encoded at different frequencies
        • Upstream and downstream data rates are different (asymmetric)
    • Cable
    • A hybrid fiber-coaxial access network
      • Uses cable television company's existing infrastructure
      • Uses fiber and coaxial cables, referred to as hybrid fiber coax (HFC)
      • Cable modems connect to home PCs (similar to DSL modems), convert digital signals to analogue, and divide the HFC network into typically asymmetric channels, defined by the Data Over Cable Service Interface Specification (DOCSIS) to be 42.8 Mbps downstream and 30.7 Mbps upstream
      • Cable modem termination system (CMTS) converts analogue signal from cable modems back into digital
      • Shared broadcast medium: concurrent users share the downstream rate, and a distributed multiple access protocol is needed to coordinate upstream transmissions and avoid collisions
    • Fiber to the home (FTTH)
    • FTTH Internet access
      • Active optical networks (AONs) direct signals only to the relevant customer, passive optical networks (PONs) do not
      • Also referred to as fiber optic service (FIOS)
      • Optical network terminators (ONTs) are present in each home and connect to a neighborhood splitter via a dedicated optical fiber
      • Optical line terminator (OLT) converts between optical and electrical signals
      • Gigabit per second potential
    • Dial-Up
      • Over traditional phone lines, based on same model as DSL, slow (56 kbps)
    • Satellite
      • > 1 Mbps
    • Local Area Networks (LAN)
      • Access in the enterprise and home typically work using Ethernet switches and WiFi base stations
        • Both Mbps to Gbps
    • Wide-area wireless access
      • Third-generation (3G) wireless, provides packet-switched Internet access at > 1 Mbps
      • Fourth-generation (4G)
      • Long-Term Evolution (LTE) has its roots in 3G, with downstream rates > 10 Mbps
  • 1.2.2 Physical Media
    • Guided (waves along a solid medium) and unguided (waves propagate in the atmosphere and in outer space)
    • Twisted-pair copper wire
      • Least expensive, most common guided transmission medium
      • Two insulated copper wires ~1 mm thick in a spiral pattern
        • Reduces electrical interference from similar nearby pairs
      • Bundled in a cable
      • Single communication link
      • Unshielded twisted pair (UTP) is common for LANs, 10 Mbps to 10 Gbps
    • Coaxial cable
      • Guided and can be a shared medium
      • Two concentric copper conductors
      • Insulation and shielding
      • Common in cable television systems
      • High data transmission rates
    • Fiber Optics
      • Thin, flexible, guided medium which conducts pulses of light
      • 10 Gbps to 100 Gbps
      • Immune to electromagnetic interference
      • Low signal attenuation up to 100 km
      • Very hard to tap
      • High cost
    • Terrestrial Radio Channels
      • Carry signals in the electromagnetic spectrum, potentially for long distances
      • No physical wire, can penetrate walls, provide connectivity to a mobile user
      • Potential packet loss and decrease in signal strength over distance and around/through obstructing objects (shadow fading)
    • Satellite Radio Channels
      • Geostationary satellites
        • Remains permanently above the same spot on Earth
        • 36,000 km above Earth's surface
        • Signal propagation of 280 ms
        • >100 Mbps
      • Low-earth orbiting (LEO) satellites
        • Independent orbit
        • Require many satellites to continuously cover an area

1.3 — The Network Core

  • 1.3.1 Packet Switching
    • Long messages are divided into packets
    • Packets travel through communication links and packet switches (routers and link-layer switches)
    • Store-and-forward transmission: packet switch must receive entire packet before beginning to transmit the first bit of the packet onto the outbound link
    • Packet switches have output buffers for each attached link, which store packets before transmission
      • Resulting in queueing delays
      • Packet loss if the output buffer a packet needs to be placed into is full
    • Forwarding tables in each packet switch map portions of destination addresses to outbound links, and the entries of forwarding tables are set by routing protocols
    • Advantages
      • Sharing of transmission capacity
        • Allows for faster service (consider a file transfer under TDM when other users are idle: packet switching uses the full link, while a circuit is stuck at one slot's rate)
      • Simpler to implement
      • Less costly
      • Provides essentially the same performance as circuit switching and can support significantly more users (because user activity is probabilistic)
    • Disadvantages
      • Not great for real-time services due to variable end-to-end delays
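The store-and-forward rule has a simple consequence for delay: each of the N links must receive all L bits before forwarding, so one packet crossing N links of rate R incurs N·L/R of transmission delay. A minimal sketch (numbers are illustrative, not from the book):

```python
# Store-and-forward: each of the N links must receive the whole packet
# (L bits) before transmitting its first bit onto the next link, so one
# packet crossing N links of rate R incurs N * L / R of transmission
# delay (queueing and propagation delays ignored here).

def store_and_forward_delay(n_links, packet_bits, rate_bps):
    return n_links * packet_bits / rate_bps

# Illustrative numbers: a 12,000-bit packet over 3 links of 1.5 Mbps
delay_s = store_and_forward_delay(3, 12_000, 1.5e6)  # 0.024 s
```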
  • 1.3.2 Circuit Switching
    • Resources (buffers, link transmission rate) needed along a path are reserved for the duration of the communication session between end systems
    • Traditional telephone networks have this structure
      • Circuit: the connection between the sender and receiver
    • Multiplexing
      • Frequency-division multiplexing (FDM)
        • Frequency band of a link is divided among the connections established across the link
          • Bandwidth: width of the frequency band (e.g., 4 kHz per connection in traditional telephone networks)
        • FM (frequency modulation) radio stations use FDM
      • Time-division multiplexing (TDM)
        • Time is divided into frames of fixed duration, frames are divided into a fixed number of slots
    • Advantages
      • Guarantees smooth service
    • Disadvantages
      • Dedicated circuits waste resources during silent periods
      • Difficult to establish end-to-end circuits and reserve end-to-end transmission
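For contrast with packet switching, the per-circuit rate under TDM and the resulting transfer time can be sketched; the numbers follow the book's running example (a 1.536 Mbps link divided into 24 TDM slots, 500 ms of circuit setup, a 640,000-bit file):

```python
# TDM circuit switching: a connection gets one slot per frame, i.e. a
# fixed 1/n_slots share of the link, for the whole session.

def tdm_circuit_rate(link_bps, n_slots):
    return link_bps / n_slots

def circuit_transfer_time(file_bits, link_bps, n_slots, setup_s):
    # Circuit establishment delay, then the file drains at the slot rate
    return setup_s + file_bits / tdm_circuit_rate(link_bps, n_slots)

rate_bps = tdm_circuit_rate(1.536e6, 24)                  # 64 kbps per circuit
total_s = circuit_transfer_time(640_000, 1.536e6, 24, 0.5)  # 10.5 s
```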
  • 1.3.3 A Network of Networks
  • Interconnection of ISPs
    • ISPs are interconnected in a tiered structure
    • Access ISPs (customers) pay the regional ISP (provider) it connects to
    • Regional ISPs (customers) pay the tier-1 ISP (provider) it connects to
      • Regional ISP may be split into provincial ISPs and national ISPs
    • Points of presence (PoPs)
      • Exist on all tiers above access level
      • Are groups of one or more routers at the same location in the provider's network where customer ISPs can connect into the provider ISP
    • Multi-home: customer ISP connected to multiple provider ISPs
    • Peering: pairs of ISPs at the same level can directly connect their networks together
      • Traffic exchanged over a peering link should only be traffic between hosts attached to the two peering ISPs
      • Allows them to avoid the cost of routing through provider ISPs
    • Internet Exchange Points (IXPs)
      • Third parties which create a meeting point for multiple ISPs to peer
    • Content-provider networks
      • Private networks which try to bypass the upper tiers of the Internet by peering with lower-tier ISPs (either directly or at IXPs)
      • Ultimately still have to be customers to tier-1 ISPs, but reduce their costs by exchanging less traffic with them through peering
    • Around a dozen tier-1 ISPs
    • Hundreds of thousands of lower-tier ISPs

1.4 — Delay, Loss, and Throughput in Packet-Switched Networks

  • 1.4.1 Overview of Delay in Packet-Switched Networks
    • Total nodal delay
      • Nodal processing delay (microseconds)
        • Examine packet header
          • Check for bit errors
        • Determine output link
      • Queueing delay (microseconds to milliseconds)
      • Transmission delay: transmit packets onto link (microseconds to milliseconds)
      • Propagation delay (milliseconds)
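The four components sum to the total nodal delay, with transmission delay L/R and propagation delay d/s. A sketch with made-up values (packet size, link rate, and distance are illustrative only):

```python
# Total nodal delay = processing + queueing + transmission + propagation.

def nodal_delay(proc_s, queue_s, packet_bits, link_bps, dist_m, prop_mps):
    transmission_s = packet_bits / link_bps   # L / R
    propagation_s = dist_m / prop_mps         # d / s
    return proc_s + queue_s + transmission_s + propagation_s

# Hypothetical: 1,500-byte packet, 10 Mbps link, 1,000 km of fiber (~2e8 m/s)
delay_s = nodal_delay(proc_s=2e-6, queue_s=0.0,
                      packet_bits=1500 * 8, link_bps=10e6,
                      dist_m=1_000_000, prop_mps=2e8)
# transmission 1.2 ms + propagation 5 ms + processing 2 us = 6.202 ms
```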
  • 1.4.2 Queueing Delay and Packet Loss
    • Varies from packet to packet, as a result of
      • Fluctuations in traffic intensity (\(La/R\))
      • Dependence of average queueing delay on traffic intensity
        • \(L\) = bit length of packets
        • \(a\) = average rate packets arrive at a queue
        • \(R\) = transmission rate
      • Burstiness of packet arrival
    • Packet loss
      • Full, finite capacity queues will have to drop packets upon new arrivals
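Traffic intensity \(La/R\) ties the symbols above together: as it approaches 1, average queueing delay grows without bound, and above 1 the queue grows faster than the link can drain. A tiny sketch with hypothetical values:

```python
# Traffic intensity L * a / R:
#   L = packet length (bits), a = average arrival rate (packets/s),
#   R = link transmission rate (bits/s).

def traffic_intensity(packet_bits, arrivals_per_s, rate_bps):
    return packet_bits * arrivals_per_s / rate_bps

# Hypothetical: 1,000-bit packets arriving 500/s on a 1 Mbps link
intensity = traffic_intensity(1000, 500, 1e6)  # 0.5: stable, modest delay
```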
  • 1.4.3 End-to-End Delay
    • Depends on physical medium
      • End systems using shared media (such as WiFi) may purposefully delay their transmissions as part of the protocol for sharing the medium with other end systems
    • Traceroute
  • 1.4.4 Throughput in Computer Networks
    • Instantaneous and average throughput
    • Transmission rate along a path is equal to that of the bottleneck link
      • Fluid analogy
      • Typically the access network
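The fluid analogy reduces to taking a minimum over the path's link rates. A sketch with a hypothetical path whose access link is the bottleneck:

```python
# End-to-end throughput equals the rate of the slowest (bottleneck) link,
# like the narrowest section of a pipe.

def path_throughput(link_rates_bps):
    return min(link_rates_bps)

def transfer_time(file_bits, link_rates_bps):
    return file_bits / path_throughput(link_rates_bps)

# Hypothetical path: 1 Gbps core, 100 Mbps regional, 10 Mbps access link
rates = [1e9, 100e6, 10e6]
seconds = transfer_time(80e6, rates)  # an 80 Mb file takes 8 s
```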

1.5 — Protocol Layers and Their Service Models

  • 1.5.1 Layered Architecture
    • Each layer provides its service via
      • Certain actions
      • Using the services of the layer directly below it
    • Advantages to layering
      • Easier to update system components
    • Disadvantages
      • Potential duplicate lower-layer functionality
      • Isolation of information
    • Internet protocol stack
      • Application layer
        • Distributed over multiple end systems
        • Messages
      • Transport layer
        • Connection-oriented service to its applications
        • Segments
      • Network layer
        • Host to host
        • Datagrams
      • Link layer
        • Node to node
        • Frames
      • Physical layer
        • Move individual bits within frames from node to node
    • Open Systems Interconnection (OSI) protocol stack
      • Presentation layer and session layer below application layer
      • Presentation layer
        • Deals with message interpretation
        • Data compression, data encryption, and data description
      • Session layer
        • Delimiting and synchronizing data exchange
        • Facilitate checkpointing and a recovery scheme
  • 1.5.2 Encapsulation
    • Packets from above layers become the payloads of packets from below layers, with a new packet header prepended
    • Fragmentation
    • Tunneling

1.6 — Networks Under Attack

  • Malware: malicious software
  • Botnet: group of similarly compromised devices
  • Malware is often self-replicating: seeks entry into other hosts over the Internet from newly infected hosts
  • Viruses: Malware requiring user interaction to infect the user's device
  • Worm: Malware that can enter a device without explicit user interaction
  • Denial-of-service (DoS) attacks render a piece of infrastructure unusable by legitimate users
    • Vulnerability attack to stop services or crash systems
    • Bandwidth flooding on the target's access link
    • Connection flooding to consume all threads/processes
    • Distributed DoS (DDoS) are DoS attacks from multiple sources
  • Packet sniffing
    • Wireshark
  • IP spoofing: injecting packets into the Internet with a false source address

1.7 — History of Computer Networking and the Internet

  • 1.7.1 The Development of Packet Switching: 1961 to 1972
    • Packet switching as an efficient and robust alternative to circuit switching
    • ARPAnet: the first packet-switched computer network
      • By the end of 1969, it had 4 nodes
      • By 1972, it had 15 nodes
  • 1.7.2 Proprietary Networks and Internetworking: 1972 to 1980
    • Network control protocol (NCP): first host-to-host protocol
    • New packet-switching networks developed: ALOHAnet's microwave network and DARPA's packet-satellite and packet-radio networks
    • ALOHA protocol, the first multiple-access protocol (for a radio frequency), was developed
    • Internetting: mission to network all created networks
    • TCP split into modern day TCP and IP
    • Ethernet protocol was built on the ALOHA protocol
    • By the end of the decade, there were approximately 200 hosts connected to the ARPAnet
  • 1.7.3 A Proliferation of Networks: 1980 to 1990
    • Universities linked
    • 1983 — transition from NCP to TCP/IP
    • DNS developed
    • French launched the Minitel project, which aimed to bring data networking into everyone's home
    • 1986 — NSFNET (National Science Foundation) created to provide access to NSF-sponsored supercomputing centers
      • By the end of the decade, served as a primary backbone linking regional networks
    • By the end of the decade, there were approximately 100,000 hosts connected to the public Internet
  • 1.7.4 The Internet Explosion: The 1990s
    • ARPAnet ceased to exist
    • 1991 — NSFNET lifted its restrictions on commercial use
    • 1995 — NSFNET was decommissioned. Internet backbone traffic was carried by commercial ISPs
    • Emergence of World Wide Web application, which was invented at CERN
    • E-mail
    • Instant messaging
    • Peer-to-peer file sharing of MP3s
    • Internet startups flooded the stock market, and many collapsed
  • 1.7.5 The New Millennium
    • Aggressive deployment of broadband Internet access to homes
    • Ubiquity of high-speed (> 54 Mbps) public WiFi networks and medium-speed (tens of Mbps) Internet access via 4G cellular telephony networks
    • Online social networks
    • Private networks by online service providers
    • Cloud applications

Chapter 2 — Application Layer

2.1 — Principles of Network Applications

  • 2.1.1 Network Application Architectures
    • Application architecture dictates how the application is structured over various end systems
    • Client-server architecture
      • Server
        • Always-on host
        • Fixed, well-known IP address
      • Client
        • Issues requests
      • Clients communicate indirectly via a server
      • Data centers are used to create a powerful virtual server, to better deal with all of the client requests
    • Peer-to-Peer (P2P)
      • Minimal (or no) reliance on dedicated servers in data centers
      • Direct communication between pairs of intermittently connected hosts (peers)
      • Self-scalability
        • Peers generate workload by requesting files, but also add service capacity to the system by distributing files to other peers
      • Disadvantages due to highly decentralized structure
        • Security
        • Performance
        • Reliability
    • Hybrids of client-server and P2P
      • For example, many instant messaging applications use servers to track the IP addresses of users, but user-to-user messages are sent directly between hosts
  • 2.1.2 Processes Communicating
    • Process — a program running within an end system, governed by the rules of the system's operating system
      • Want process communication which works independently from the end systems' operating systems
    • Processes exchange messages
    • Broadly, in each communication session between a pair of processes, there is always a client and server process
      • Client process: the one that initiates the communication
      • Server process: the one that waits to be contacted
    • Socket: software interface which allows a process to send and receive messages from a network
      • Application Programming Interface (API) between the application and the network
    • Application developer's control on the transport-layer
      • Choice of transport protocol
      • Ability to configure transport-layer parameters
        • Maximum buffer size
        • Maximum segment size
    • Host is identified by IP address
    • Process is identified by port number
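The socket interface and IP-plus-port addressing can be exercised end to end on a single machine. A minimal TCP echo sketch (loopback only; the OS picks a free port, and nothing here is an Internet-facing service):

```python
import socket
import threading

def run_echo_server(server_sock):
    conn, _addr = server_sock.accept()   # server process waits to be contacted
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)               # echo the message back

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))            # port 0: let the OS choose a port
server.listen(1)
port = server.getsockname()[1]           # the port number identifies the process
threading.Thread(target=run_echo_server, args=(server,), daemon=True).start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))      # client process initiates contact
client.sendall(b"hello, socket")
reply = client.recv(1024)
client.close()
server.close()
```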
  • 2.1.3 Transport Services Available to Applications
    • Reliable data transfer
      • Sending process can pass its data into the socket and know with complete confidence that the data will (eventually) arrive without errors at the receiving process
      • Not necessary for loss-tolerant applications
    • Throughput
      • Guarantees are great for bandwidth-sensitive applications
      • Not necessary for elastic applications
    • Timing
      • Guarantees are great for interactive real-time applications
      • Not necessary for non-real-time applications
    • Security
      • Confidentiality
      • Data integrity
      • End-point authentication
  • 2.1.4 Transport Services Provided by the Internet
    • TCP
      • Connection-oriented service
        • Client and server exchange transport-layer control information with each other before the application-level messages begin to flow (handshaking)
        • Full-duplex TCP connection forms between the sockets of the two processes
      • Reliable data transfer service
      • Congestion-control mechanism
    • UDP
      • Connectionless service
      • Unreliable data transfer service
      • No congestion-control mechanism
    • Security
      • Provided by neither TCP nor UDP
      • Secure Sockets Layer (SSL) enhancement for TCP providing confidentiality, data integrity, end-point authentication
        • Implemented in the application layer
  • 2.1.5 Application-Layer Protocols
    • Define
      • The types of messages exchanged between application processes
      • The syntax of various message types (fields and delineation between fields)
      • Semantics of the fields
      • Rules for determining when and how processes send and respond to messages
  • 2.1.6 Network Applications Covered in This Book
    • The Web
    • Electronic mail
    • Directory services
    • Video streaming
    • P2P applications

2.2 — The Web and HyperText Transfer Protocol (HTTP)

  • 2.2.1 Overview of HTTP
    • Web's application-layer protocol
    • Implemented in client and server programs
    • Web page
      • Consists of objects (files)
      • Base HTML file with referenced objects
      • Each object is addressable by a Universal Resource Locator (URL)
    • Uses TCP
    • Client-server application architecture
    • Stateless
      • Maintains no information about the clients
    • Persistent connection by default
  • 2.2.2 Non-Persistent and Persistent Connections
    • Non-Persistent: each request/response pair is sent over a separate TCP connection
    • Persistent: each request/response pair is sent over the same TCP connection
    • Round-trip time (RTT) — time for a small packet to travel from client to server and then back to the client
      • Persistent connections avoid the extra
        • Two RTTs per object: one for the TCP handshake, one for the request/response
        • Per-connection allocation of TCP buffers and variables
        • Per-object delays when requests are not pipelined (HTTP/2 goes further with multiplexed streams)
  • 2.2.3 HTTP Message Format
    • ASCII
    • Request
    • General format of an HTTP request message
      • Methods
        • GET
        • POST
        • HEAD
        • PUT
        • DELETE
      • Browser generates header lines as a function of
        • The browser type and version
        • User configuration of the browser
        • Whether the browser currently has a cached, but possibly out-of-date version of the object
    • Response
    • General format of an HTTP response message
      • Web servers generate headers in a similar way to browsers
        • Different products, versions, configurations
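The request format can be made concrete by assembling the raw bytes: a request line, header lines separated by CRLF, and a blank line ending the header section. The hostname and user agent below are hypothetical:

```python
CRLF = "\r\n"

def build_get_request(host, path, user_agent="example-agent/1.0"):
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, URL, version
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "Connection: close",      # ask the server not to persist the connection
    ]
    # Blank line (an empty CRLF) terminates the header section
    return CRLF.join(lines) + CRLF + CRLF

request = build_get_request("www.example.com", "/index.html")
```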
  • 2.2.4 User-Server Interaction: Cookies
    • An HTTP server is stateless
      • Simplifies server design
      • However, often desirable to identify users
    • Cookies
    • Keeping user state with cookies
  • 2.2.5 Web Caching
  • Clients requesting objects through a Web cache
    • Functions as both a server and a client
    • Typically purchased and installed by an ISP
      • Relatively cheap — public-domain software that runs on inexpensive PCs
    • Can substantially reduce the
      • Response time for a client request
      • Traffic on an institution's access link to the Internet
      • Web traffic in the Internet as a whole
    • Want to maximize cache hit rates
    • Content Distribution Networks (CDNs) install many distributed caches throughout the Internet, localizing much of the traffic
    • A cache can verify that its objects are up to date using conditional GET messages
      • GET method
      • If-Modified-Since header
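A back-of-the-envelope sketch of why caching cuts response time; the hit rate and delays below are hypothetical:

```python
# Average response time with a Web cache: hits are served from the fast
# local cache, misses pay the full origin-server delay.

def avg_response_time(hit_rate, hit_delay_s, miss_delay_s):
    return hit_rate * hit_delay_s + (1 - hit_rate) * miss_delay_s

# Hypothetical: 40% hit rate, 10 ms from the cache, 2 s from the origin
t = avg_response_time(0.4, 0.010, 2.0)  # 1.204 s instead of 2 s
```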

2.3 — Electronic Mail in the Internet

  • Alice sends a message to Bob
  • User agent — allow users to
    • Read
    • Reply to
    • Forward
    • Save
    • Compose messages
  • Mail server
    • Each recipient has a mailbox in a mail server, which
      • Manages and maintains the messages sent to the recipient
    • Authenticates recipients
    • Deals with failure
      • Holds messages in a message queue and attempts (re)transmission until success
  • 2.3.1 SMTP
    • Transfer messages from senders' mail servers to the recipients' mail servers
    • Everything encoded in 7-bit ASCII (old protocol, designed before people emailed large attachments)
    • Common commands
      • HELO
      • MAIL FROM
      • RCPT TO
      • DATA
      • QUIT
    • Each command followed by CRLF
    • Terminate DATA with CRLF.CRLF
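The command sequence above, rendered as the byte stream a client would write (the addresses are the book's Alice/Bob examples; no server is contacted, and server replies are omitted):

```python
CRLF = "\r\n"

def smtp_session(sender, recipient, body):
    commands = [
        "HELO client.example.com",   # hypothetical client hostname
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
        body,
        ".",                         # CRLF . CRLF terminates the message body
        "QUIT",
    ]
    return CRLF.join(commands) + CRLF  # each command ends in CRLF

session = smtp_session("alice@crepes.fr", "bob@hamburger.edu", "Hello Bob")
```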
  • 2.3.2 Comparison with HTTP
    • Pull vs push protocol
      • HTTP is mainly a pull protocol (clients pull information that was previously uploaded to a server)
      • SMTP is primarily a push protocol
    • Encoding
      • SMTP requires each message to be in 7-bit ASCII format; binary data must be encoded into 7-bit ASCII before transfer and decoded afterwards
      • HTTP data does not impose this restriction
    • Handling other media types
      • HTTP encapsulates each object in its own HTTP response message
      • SMTP places all of the message's objects into one message
  • 2.3.3 Mail Message Formats
    • Message header
      • From:
      • To:
      • Subject:
    • Blank line after message header, then body
  • 2.3.4 Mail Access Protocols
    • Could have recipient's local PC run a mail server, to create more direct communication, but recipient's PC would have to be always on and connected to the Internet
    • Easier to have the recipient's local PC run a user agent program which pulls from the appropriate mailbox in the appropriate mail server (via a mail access protocol)
    • Post Office Protocol — Version 3 (POP3)
      • Simplistic — does not carry state information across POP3 sessions
      • Does not provide any means for a user to maintain a folder hierarchy on a remote server that can be accessed from any computer
    • Internet Mail Access Protocol (IMAP)
      • Each message associated with a folder
      • Maintains user state information across IMAP sessions
    • HTTP

2.4 — DNS — The Internet's Directory Service

  • 2.4.1 Services Provided by DNS
    • Identify a host via hostname and IP address
      • Hostnames preferred by people, fixed-length IP addresses preferred by routers
    • Domain name system (DNS) translates memorable hostnames to IP addresses
      • Distributed database implemented in a hierarchy of DNS servers
        • Often UNIX machines running Berkeley Internet Name Domain (BIND) software
    • Application-layer protocol that allows hosts to query the distributed database
    • Commonly employed by other application-layer protocols (HTTP, SMTP) to translate user-supplied hostnames to IP addresses
    • Host aliasing — hosts with complicated hostnames can have alias names
    • Mail server hostname aliasing
    • Load distribution
      • Servers can be replicated over multiple end systems with distinct IP addresses
      • DNS can associate a set of IP addresses with one canonical hostname
      • The order of the sequence of IP addresses in the set is rotated within each reply
        • Load balances since the first IP address in the reply is commonly used
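The rotation trick can be sketched directly; the class name and addresses are made up:

```python
from collections import deque

class RoundRobinRecord:
    """Toy model of an authoritative server rotating the address set."""
    def __init__(self, addresses):
        self.addresses = deque(addresses)

    def reply(self):
        answer = list(self.addresses)
        self.addresses.rotate(-1)    # next reply starts one entry later
        return answer

record = RoundRobinRecord(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
first = record.reply()[0]   # clients commonly take the first address
second = record.reply()[0]  # a later client lands on a different server
```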
  • 2.4.2 Overview of How DNS Works
  • Interaction of the various DNS servers
    • Distributed to avoid
      • A single point of failure
      • High, growing traffic volume
      • A single centralized database being distant from many querying clients
      • Constant maintenance to account for constant new hosts
    • Classes of DNS servers
      • Root DNS servers
        • Provide IP addresses of top-level domain (TLD) servers
        • Over 400 scattered globally
      • TLD servers
        • com, org, net, uk, fr
        • Provide IP addresses for authoritative DNS servers
      • Authoritative DNS servers
        • Houses an organization's publicly accessible DNS records that map the names of hosts within the organization to IP addresses
    • Local DNS (LDNS) server
      • Provided by ISPs to allow connected hosts to query the DNS database
    • TLD servers do not always know the IP addresses of authoritative DNS servers for a queried hostname
      • May only know of an intermediate DNS server (often have hostname dns.organization), which knows the authoritative DNS server for the hostname
    • Recursive and iterative queries
    • DNS servers cache responses to queries for some time (often set to two days)
  • 2.4.3 DNS Records and Messages
    • DNS servers store resource records (RRs)
      • Four-tuple (Name, Value, Type, time to live (TTL))
      • The meaning of Name and Value depend on Type
        • Type=A
          • Name is a hostname
          • Value is the IP address of the hostname
          • (relay1.bar.foo.com, 145.37.93.126, A, TTL)
        • Type=NS
          • Name is a domain
          • Value is the hostname of an authoritative DNS server that knows how to obtain the IP addresses for hosts in the domain
          • (foo.com, dns.foo.com, NS, TTL)
        • Type=CNAME
          • Name is an alias hostname
          • Value is a canonical hostname for the alias hostname
          • (foo.com, relay1.bar.foo.com, CNAME, TTL)
        • Type=MX
          • Name is an alias hostname
          • Value is a canonical name of a mail server for the alias hostname
          • (foo.com, mail.bar.foo.com, MX, TTL)
          • MX record allows a company's mail server and Web server to have identical (aliased) hostnames
    • DNS record format
    • nslookup program to send DNS query messages
    • Insert records into the DNS database via a registrar
      • Commercial entity that verifies the uniqueness of a domain name and enters it into the DNS database for a small fee
    • Need to provide names and IP addresses of your primary and secondary authoritative DNS servers
    • DNS vulnerabilities
      • DDoS to TLD servers
        • Partial damage control from caching at LDNS servers
      • Man-in-the-middle
        • Intercept queries from hosts and return bogus replies
      • DNS poisoning attack
        • Attacker sends bogus replies to an LDNS server, tricking the server into accepting bogus records into its cache
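Tying the record types of 2.4.3 together: resolving a name may mean following a CNAME chain until an A record is found. A toy resolver over the example records above (no network involved; TTLs omitted):

```python
# Toy table of (Name, Type) -> Value resource records, mirroring the
# examples in the notes.
RECORDS = {
    ("relay1.bar.foo.com", "A"): "145.37.93.126",
    ("foo.com", "CNAME"): "relay1.bar.foo.com",
}

def resolve(name, records, max_hops=10):
    for _ in range(max_hops):          # bound alias chains to avoid loops
        if (name, "A") in records:
            return records[(name, "A")]
        if (name, "CNAME") in records:
            name = records[(name, "CNAME")]  # follow the alias
        else:
            return None
    return None

address = resolve("foo.com", RECORDS)  # follows the CNAME to the A record
```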

2.5 — Peer-to-Peer File Distribution

  • Distribution time for P2P and client-server architectures
  • BitTorrent
    • Torrent: collection of all peers participating in the distribution of a particular file
    • Peers may selfishly or altruistically leave or stay in the torrent after acquiring the entire file
    • Tracker: infrastructure node
      • When a peer joins a torrent, it registers itself with the tracker
      • The tracker randomly selects a subset of the peers in the torrent and gives their IPs to the new peer
      • Peers periodically inform the tracker that they are still in the torrent
    • Attempt to form TCP connections with each peer, to form a set of neighboring peers
    • Periodically, ask neighboring peers which chunks they have
    • Rarest first: request chunks that are the rarest among her neighbors
    • Give priority to neighbors that are supplying data at the highest rate (tit-for-tat incentive mechanism)
      • Top four highest supply rate neighboring peers are sent chunks back (unchoked)
        • Over time, peers capable of uploading at compatible rates tend to be in each other's neighborhoods
    • Every 30 seconds, one additional, randomly selected neighbor is also sent chunks (optimistically unchoked)
    • All other neighboring peers not in these five are choked
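The rarest-first selection rule can be sketched as below; the neighbor names and chunk sets are made up, and a real BitTorrent client would learn chunk availability from its neighbors' periodic messages:

```python
from collections import Counter

def rarest_first(my_chunks, neighbor_chunks):
    """Pick the chunk to request next: one I lack that is held by the
    fewest neighbors (ties broken by lowest chunk index)."""
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks)
    wanted = [(counts[c], c) for c in counts if c not in my_chunks]
    return min(wanted)[1] if wanted else None

# Illustrative neighborhood: chunk 0 is common, chunks 2 and 3 are rare
neighbors = {
    "peer_a": {0, 1, 2},
    "peer_b": {0, 1},
    "peer_c": {0, 3},
}
print(rarest_first({0}, neighbors))  # chunks 2 and 3 each held once; picks 2
```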
  • Distributed Hash Tables (DHT)
    • Database entries distributed over peers in a P2P system
    • Partition keyspace
    • Ring structure
      • Peers know some entries, the IP addresses of their two neighboring peers, and one distant peer
      • If peer is queried and doesn't have the corresponding value, asks neighboring/distant peer (whichever is closest to key with respect to keyspace partition)
    • Trust neighbors to give true values
    • More secure since no single point of failure
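The ring lookup described above can be sketched as follows (a simplified, Chord-like toy; the identifier space, topology, and `ring_dist` metric are assumptions for illustration):

```python
KEYSPACE = 12  # size of the circular identifier space (illustrative)

def ring_dist(a, b):
    """Shortest distance between two identifiers on the ring."""
    d = abs(a - b)
    return min(d, KEYSPACE - d)

class Peer:
    def __init__(self, ident):
        self.id = ident
        self.store = {}   # entries this peer is responsible for
        self.known = []   # two ring neighbors plus one distant peer

    def lookup(self, key):
        if key in self.store:
            return self.store[key]
        # Forward to whichever known peer is closest to the key on the ring
        nxt = min(self.known, key=lambda p: ring_dist(p.id, key))
        return nxt.lookup(key)

# Four peers on a ring of 12 identifiers (illustrative topology)
p0, p3, p6, p9 = (Peer(i) for i in (0, 3, 6, 9))
p0.known = [p3, p9, p6]   # neighbors 3 and 9, distant peer 6
p3.known = [p0, p6, p9]
p6.known = [p3, p9, p0]
p9.known = [p6, p0, p3]
p6.store[7] = "value-at-7"   # peer 6 is closest to key 7, so it holds the entry

print(p0.lookup(7))          # query forwarded to the peer closest to the key
```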

2.6 — Video Streaming and Content Delivery Networks

  • 2.6.1 Internet Video
    • Prerecorded video placed on servers
    • Users send requests to servers to view videos on demand
    • Video
      • Sequence of images being displayed at a constant rate (24-30 images/second)
        • Uncompressed image: array of pixels encoded into a number of bits to represent luminance and color
      • High bit rate
      • Compressible
        • To different rates
      • For continuous playout, the network must provide an average throughput to the streaming application that is at least as large as the bit rate of the compressed video
  • 2.6.2 HTTP Streaming and Dynamic Adaptive Streaming over HTTP (DASH)
    • Video encoded into several different versions, each with a different bit rate and, correspondingly, a different quality level
    • Allows clients to select different quality chunks based on perceived bandwidth
      • Exponentially weighted moving average accounting for variance
    • Manifest file: provides a URL for each video version along with its bit rate
      • Client selects one chunk at a time by specifying a URL and a byte range in an HTTP GET request message
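The client-side rate adaptation might look roughly like this; `safety` is a stand-in for the variance accounting mentioned above, and the bit rates and samples are illustrative:

```python
def ewma_update(estimate, sample, alpha=0.125):
    """Exponentially weighted moving average of per-chunk throughput."""
    return (1 - alpha) * estimate + alpha * sample

def select_version(versions_kbps, estimated_kbps, safety=0.8):
    """Pick the highest-bit-rate version the estimated bandwidth can sustain;
    `safety` hedges against throughput variance (a simplification)."""
    usable = [v for v in sorted(versions_kbps) if v <= safety * estimated_kbps]
    return usable[-1] if usable else min(versions_kbps)

versions = [300, 750, 1500, 3000]     # illustrative bit rates (kbps)
estimate = 1000.0
for sample in [2400, 2600, 2500]:     # measured throughput per chunk (kbps)
    estimate = ewma_update(estimate, sample)
print(select_version(versions, estimate))
```

The EWMA means the estimate rises toward the new samples only gradually, so a burst of high throughput does not immediately trigger a jump to the top quality level.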
  • 2.6.3 Content Distribution Networks (CDNs)
    • Manage servers in multiple geographically distributed locations
    • Stores copies of the videos in its servers
    • Attempts to direct each user request to a CDN location that will provide the best user experience
    • DNS redirects a user's request to a CDN server
    • Intercept requests for videos using DNS
    • Determine a suitable CDN server cluster for that client
      • Geographically closest
        • Tends to work well for majority of clients
        • Can be bad for some, since geographically closest may not minimize number of hops of the network path
        • Some end-users are configured to use remotely located LDNSs, in which case this is a bad heuristic for distance
        • Ignores variance in delay and available bandwidth over time of Internet paths
      • Based on current traffic conditions
        • Real-time measurements of delay and loss between clusters and clients
        • Make all clusters within a CDN periodically send probes to all LDNSs globally
          • Unfortunately, many LDNSs are configured to ignore such probes
    • Redirect the client's request to a server in that cluster
  • Can be private (Google's CDN) or third-party (Akamai)
  • Placement philosophies
    • Enter deep: deploy server clusters in access ISPs, globally
      • Reduce user-perceived delay
      • Highly distributed, difficult to maintain and manage the clusters
    • Bring home: build large clusters at a smaller number of sites (typically IXPs)
      • Greater user-perceived delay
      • Lower maintenance and management overhead
  • CDNs may not place copies of every video in every cluster
    • Resolved by clusters retrieving the videos from other clusters and storing a local copy while streaming the video
    • Remove videos that are not frequently requested
  • 2.6.4 Case Studies: Netflix, YouTube, and KanKan
    • Netflix
      • Combines Amazon cloud and its own private CDN infrastructure
      • Amazon cloud
        • Content ingestion: studio master versions of movies are uploaded to hosts in the Amazon cloud
        • Content processing: the machines in the cloud create many different formats for each movie (for all client video players), each at multiple bit rates, allowing for DASH
        • Uploading versions to its CDN
        • Netflix video streaming platform
      • Initially employed third-party CDN companies to distribute video content, but now has its own private CDN
        • Server racks in both IXPs and residential ISPs, combining bring home and enter deep philosophies
          • Each server rack has several 10 Gbps Ethernet ports and over 100 terabytes of storage
        • Does not use pull-caching — instead uses push caching — caching videos to its CDN servers during off-peak hours
    • YouTube
      • Private CDN to distribute videos
      • Server clusters in many hundreds of different IXP and ISP locations
      • Uses pull caching
      • Does not use adaptive streaming — instead requires users to manually select a version
      • Limits prefetching of video to reduce resource waste via repositioning and early termination
      • Processing of uploaded content occurs within Google data centers
    • Kankan
      • P2P, along with client-server delivery
      • Similar to BitTorrent file downloading
      • Pushes video content to hundreds of servers in China
      • Recently migrated to a hybrid CDN-P2P streaming system
        • Client requests beginning of content from CDN servers, and in parallel requests content from peers
        • When total P2P traffic is sufficient for video playback, the client ceases streaming from the CDN, only using peers
        • If P2P streaming traffic becomes insufficient, the client restarts CDN connections and returns to the mode of hybrid CDN-P2P streaming

2.7 — Socket Programming

    • Send raw byte representation of data
    • 2.7.1 Socket Programming with UDP
    • The client-server application using UDP
      • Specify (server IP, server port) when sending
    • 2.7.2 Socket Programming with TCP
    • The TCPServer process has two sockets
    • The client-server application using TCP
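The UDP pattern from 2.7.1 can be sketched as a minimal echo pair (loopback address and message are arbitrary; the OS picks a free port):

```python
import socket
import threading

def serve(server_sock):
    data, client_addr = server_sock.recvfrom(2048)   # no connection setup
    server_sock.sendto(data.upper(), client_addr)    # reply to sender's (IP, port)

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))                        # port 0 = any free port
threading.Thread(target=serve, args=(server,)).start()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"hello", server.getsockname())        # specify (server IP, server port)
reply, _ = client.recvfrom(2048)
print(reply.decode())
client.close()
server.close()
```

Note there is no handshake and no connection state: each `sendto` carries the destination address explicitly, which is exactly the two-tuple addressing discussed in 3.2.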

Chapter 3 — Transport Layer

3.1 — Introduction and Transport-Layer Services

    • 3.1.1 Relationship Between Transport and Network Layers
      • Transport-layer protocol provides logical communication between processes running on different hosts
      • Network-layer protocol provides logical communication between hosts
        • Logical communication: appearance of direct connection
    • 3.1.2 Overview of the Transport Layer in the Internet
      • Internet provides User Datagram Protocol (UDP) and Transmission Control Protocol (TCP)
      • Transport layer segments
      • Internet's network-layer protocol (IP) is a best-effort (unreliable) delivery service

3.2 — Multiplexing and Demultiplexing

    • Demultiplexing: delivering data in a transport-layer segment to the correct socket
    • Multiplexing: gathering data chunks at the source host from different sockets, encapsulating each data chunk with header information (for later demultiplexing), creating segments, and passing segments to the network layer
    • Uses source and destination port numbers
      • 16-bit number ranging from 0 to 65535
      • 0 to 1023 are well-known and reserved for use by well-known application protocols
    • UDP sockets are fully identified by the two-tuple (destination IP address, destination port number)
    • TCP sockets are fully identified by a four-tuple (source IP address, source port number, destination IP address, destination port number)
      • Need this additional information due to the protocol being connection-oriented
    • Port scanning: determining which applications are listening on which ports
      • nmap
    • Not always a one-to-one correspondence between connection sockets and processes — today's high-performing Web servers often use one process, and create a new thread with a new connection socket for each new client connection

3.3 — Connectionless Transport: UDP

    • Advantages for applications using UDP
      • Finer application-level control over what data is sent, and when
      • No connection establishment
        • No handshaking between sending and receiving transport-layer entities before sending a segment
      • No connection state
      • Small packet header overhead
    • Popular Internet applications and their underlying transport protocols
    • 3.3.1 — UDP Segment Structure
      • UDP segment structure
    • 3.3.2 — UDP Checksum
      • 1s complement of the sum of all the 16-bit words in the segment
      • Example of system design's end-end principle
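The 16-bit one's-complement checksum can be sketched as below (the UDP pseudo-header is omitted for brevity; the sample bytes are arbitrary):

```python
def udp_checksum(data: bytes) -> int:
    """1s complement of the 1s-complement sum of the 16-bit words in `data`."""
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length data
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # wrap carry back into the sum
    return ~total & 0xFFFF

words = b"\x01\x02\x03\x04"
checksum = udp_checksum(words)

# Receiver check: the sum of all words plus the checksum should be 0xFFFF
total = 0
for i in range(0, len(words), 2):
    total += (words[i] << 8) | words[i + 1]
    total = (total & 0xFFFF) + (total >> 16)
print(hex((total + checksum) & 0xFFFF))
```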

3.4 — Principles of Reliable Data Transfer

    • 3.4.1 Building a Reliable Data Transfer Protocol
      • Use finite-state machine (FSM) definitions for the sender and receiver
      • rdt1.0 — a protocol for a completely reliable channel
        • Sending side waits for procedure call from upper-layer protocol and sends packet
        • Receiving side waits for procedure call from lower-layer protocol and extracts packet
      • rdt2.0 — reliable data transfer over a channel with bit errors
        • Need a form of error detection, attach checksum field to all packets
        • Need receiver feedback
          • Positive acknowledgements (ACK) and negative acknowledgements (NAK)
          • NAKs trigger retransmission of packets
          • Automatic Repeat reQuest (ARQ) protocols: reliable data transfer protocols based on retransmissions
        • Stop-and-wait protocols: sender does not send a new piece of data until it is sure that the receiver has correctly received the current packet (via an ACK)
      • rdt2.1 — accounting for potentially corrupted receiver feedback
        • Add sequence numbers to packets. That way if the receiver sent an ACK which became corrupted into a NAK, causing the sender to retransmit the packet the receiver already received, the receiver would be able to know that the new packet is a duplicate packet
      • rdt2.2 — equivalent to rdt2.1, but is NAK free
        • Instead of sending a NAK when receiving a corrupted packet, send an ACK + sequence number for the last correctly received packet
      • rdt3.0 — reliable data transfer over a lossy channel with bit errors
        • Implement a countdown timer that can interrupt the sender after a given amount of time, which prompts retransmission of the corresponding packet
        • rdt3.0 is sometimes known as the alternating-bit protocol, since sequence numbers alternate between 0 and 1
    • 3.4.2 Pipelined Reliable Data Transfer Protocols
      • Stop-and-wait protocols have very low link utilization
      • A pipelined protocol in operation
      • To accommodate pipelining
        • Range of sequence numbers must be increased
        • Sender and receiver sides of the protocols may have to buffer more than one packet
      • Two approaches: Go-Back-N (GBN) and selective repeat (SR) protocols
    • 3.4.3 Go-Back-N
    • Sender's view of sequence numbers in Go-Back-N
      • Sender transmits multiple packets
      • Can have no more than \(N\) unacknowledged packets in the pipeline
        • Must track this number, and decline the procedure call to send if there are already \(N\) unacknowledged packets in the pipeline
        • Limit creates flow control
      • Sliding-window protocol: window slides over once base packet is ACKed
      • Receiver feedback is a cumulative acknowledgement, indicating that all packets with a sequence number up to and including \(n\) have been correctly received
      • Sender maintains a single timer for the oldest transmitted but unacknowledged packet. On timeout, the sender retransmits all previously sent, unacknowledged packets (up to window size \(N,\) hence the name Go-Back-N)
        • Due to this retransmission of everything, the receiver discards out-of-order packets — no buffering is done
        • Bad when window size and bandwidth-delay product are both large, as many packets can be unnecessarily retransmitted
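The sender-side window bookkeeping above can be sketched as follows (actual transmission and timers are elided; sequence numbers are left unbounded for simplicity):

```python
class GBNSender:
    """Sketch of Go-Back-N sender-side sequence-number logic."""
    def __init__(self, window_size):
        self.N = window_size
        self.base = 0          # oldest unacknowledged sequence number
        self.next_seq = 0      # next sequence number to use

    def send(self):
        if self.next_seq >= self.base + self.N:
            return None        # window full: decline the call to send
        seq = self.next_seq
        self.next_seq += 1
        return seq             # caller would transmit packet `seq`

    def ack(self, n):
        """Cumulative ACK for packet n slides the window forward."""
        self.base = max(self.base, n + 1)

    def timeout(self):
        """On timeout, retransmit every sent-but-unacknowledged packet."""
        return list(range(self.base, self.next_seq))

s = GBNSender(window_size=4)
sent = [s.send() for _ in range(5)]   # fifth call is refused (window full)
s.ack(1)                              # cumulative ACK covers packets 0 and 1
print(sent, s.timeout())
```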
    • 3.4.4 Selective Repeat
    • Selective-repeat (SR) sender and receiver views of sequence-number space
      • Same as GBN, except
        • Individually acknowledge correctly received packets
        • Receiver buffers out-of-order packets
        • Potential for receiver to send ACKs for packets with a sequence number below that of rcv_base (if previously transmitted ACK is lost)
          • Thus, window size must be at most half of the size of the sequence number space
    • Note: have assumed in order delivery
      • If not, channel can be thought of as buffering packets and spontaneously emitting these packets at any point in the future
      • Since sequence numbers are reused, care must be taken to guard against distinct packets that share a sequence number
      • In reality, assume that packets cannot "live" in the network for longer than some amount of time (3 minutes assumed in TCP extensions for high-speed networks)

3.5 — Connection-Oriented Transport: TCP

    • 3.5.1 The TCP Connection
      • Logical connection, with common state residing only in the TCP processes in the two communicating end systems
      • Full-duplex service
      • Point-to-point — between a single sender and a single receiver
      • TCP send and receive buffers
      • Application data passed into TCP send buffer
      • TCP intermittently passes chunks of data to the network layer
      • Maximum segment size (\(MSS\)): maximum amount of data that can be grabbed from the TCP send buffer and placed in a segment (accounting for TCP/IP header length, typically 40 bytes)
        • Determined by largest link-layer frame that can be sent by the local sending host (maximum transmission unit (\(MTU\)), usually 1500 bytes)
        • Thus typical values are \(MTU = 1500\) bytes, \(MSS = 1460\) bytes
    • 3.5.2 TCP Segment Structure
    • TCP segment structure
      • Flags
        • CWR and ECE
          • Used for explicit congestion notification
        • URG
          • Indicates that the sending-side upper-layer entity has marked data in the segment as "urgent" (whose location is pointed to by the urgent data pointer; rarely used in practice)
        • ACK
          • Indicates whether the value in the acknowledgment number field is valid or not
        • PSH
          • Indicates whether the receiver should pass data to the upper layer protocol process immediately (rare usage in practice)
        • RST, SYN, FIN
          • Used for connection setup and teardown
      • Sequence number: byte-stream number of the first byte in the segment
        • A randomly chosen initial sequence number minimizes the possibility that a segment still present in the network from an earlier, already-terminated connection between the same two hosts (using the same port numbers) is mistaken for a valid segment
      • Acknowledgment number: next byte that the sending host expects from the receiving host (cumulative acknowledgments)
        • TCP RFC does not impose rules on buffering out-of-order bytes, but in practice this is often done to increase the efficiency of network bandwidth
        • Piggybacking: TCP segment with ACK and data
    • 3.5.3 Round-Trip Time Estimation and Timeout
      • Want timeout to be greater than RTT to avoid unnecessary retransmissions, but not so much larger that link utilization is low
      • \(SampleRTT\): amount of time between when a segment is sent (passed to IP) and when an acknowledgment for the segment is received
        • Most TCP implementations only take one \(SampleRTT\) measurement (track \(SampleRTT\) for a single transmitted but currently unacknowledged segment) at a time (roughly once per RTT)
        • Never tracked for retransmitted packets
        • Fluctuates from segment to segment due to congestion in routers and varying load on the end systems
      • \(EstimatedRTT = (1 - a) \times EstimatedRTT + a \times SampleRTT\)
        • \(a\) is recommended to be 0.125
        • Exponentially weighted moving average (EWMA)
      • \(DevRTT = (1 - b) \times DevRTT + b \times \lvert SampleRTT - EstimatedRTT \rvert\)
        • \(b\) is recommended to be 0.25
        • Also EWMA
      • Finally, \(TimeoutInterval = EstimatedRTT + 4 \times DevRTT\)
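Putting the three formulas together (starting values and samples are illustrative, in milliseconds):

```python
def update_timeout(estimated_rtt, dev_rtt, sample_rtt, a=0.125, b=0.25):
    """EWMA updates for EstimatedRTT and DevRTT, plus the resulting
    TimeoutInterval, per the formulas above."""
    estimated_rtt = (1 - a) * estimated_rtt + a * sample_rtt
    dev_rtt = (1 - b) * dev_rtt + b * abs(sample_rtt - estimated_rtt)
    return estimated_rtt, dev_rtt, estimated_rtt + 4 * dev_rtt

est, dev = 100.0, 10.0          # illustrative starting values (msec)
for sample in [120, 90, 110]:   # SampleRTT measurements
    est, dev, timeout = update_timeout(est, dev, sample)
print(round(est, 1), round(timeout, 1))
```

Because both updates are EWMAs, a single outlier sample nudges the timeout rather than resetting it, while a sustained RTT increase raises it quickly via the \(4 \times DevRTT\) term.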
    • 3.5.4 Reliable Data Transfer
      • TCP ensures that the byte stream that a process reads out of its TCP receive buffer is exactly the same byte stream as that sent by the end system on the other side of the connection
      • Specific TCP details and mechanisms
        • SendBase variable in TCP sender: sequence number of the oldest unacknowledged byte
        • Doubling the timeout interval for a segment after a timeout
          • Provides a limited form of congestion control
        • Some versions of TCP have an implicit NAK mechanism — TCP fast retransmit
          • Three duplicate ACKs for a given segment serve as an implicit NAK for the following segment, triggering retransmission before timeout
        • Delayed ACK: when in-order segment arrives with expected sequence number, wait up to 500 msec for the arrival of another in-order segment. If next in-order segment does not arrive in this interval, send an ACK
    • 3.5.5 Flow Control
    • The receive window (rwnd) and the receive buffer (RcvBuffer)
      • Eliminate possibility of sender overflowing the receiver's buffer
      • Receiver maintains
        • \(LastByteRcvd - LastByteRead \leq RcvBuffer\)
          • Since overflowing buffers is not permitted
        • Receive window \(rwnd = RcvBuffer - (LastByteRcvd - LastByteRead)\)
      • Sender maintains
        • \(rwnd\) from receiver: how much free buffer space was recently available at the receiver
        • \(LastByteSent - LastByteAcked \leq rwnd\)
      • Problem: \(rwnd = 0\) being advertised, and sender currently has nothing to send
        • Sender will never be told that \(rwnd\) has increased (after application reads from full TCP receive buffer) since this happens only if the receiver needs to send data, or has an acknowledgment to send
        • Solution: sender can send segments with one data byte when \(rwnd = 0\)
      • Note: UDP does not provide flow control, hence segment loss can occur at the receiver due to buffer overflow
    • 3.5.6 TCP Connection Management
      • Connection establishment (three way handshake)
        • Client-side sends TCP segment (SYN) with
          • SYN bit set to 1
          • Randomly chosen initial sequence number
        • Server-side receives TCP segment and
          • Allocates TCP buffers and variables to the connection
          • Sends TCP segment (SYNACK) with
            • SYN bit set to 1
            • Acknowledgement field set to received sequence number + 1
            • Randomly chosen initial sequence number
        • Client-side receives TCP segment and
          • Allocates TCP buffers and variables to the connection
          • Sends TCP segment (ACK) with
            • SYN bit set to 0
            • Acknowledgment field set to received sequence number + 1
            • Potential first payload of data
      • Connection closing
        • Close means resources (buffers and variables) are deallocated
        • FIN is a TCP segment with FIN bit set to 1
        • Timed wait is usually either 30 seconds, 1 minute or 2 minutes
        • Afterwards, the connection formally closes and all resources (including port numbers) are released
      • SYN flood attack
        • DoS
        • Send large number of TCP SYN segments, without completing the third handshake step, exhausting the server's connection resources
        • Defend with SYN cookies
          • In response to TCP SYN, server does not create a half-open TCP connection
          • Instead, creates an initial TCP sequence number that is a hash function of source and destination IP, port, and a secret number, and responds with SYNACK segment to the sender of the TCP SYN message
          • A legitimate client responds with an ACK segment, whose acknowledgment field = output of hash function + 1
            • Server verifies this value using the fields present in the ACK segment, and the secret number stored in memory. If valid, resources are allocated and a fully open connection is created
          • Otherwise, no harm is done to the server
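A sketch of the SYN cookie computation; SHA-256 and the truncation to 32 bits are assumptions for illustration (real implementations use their own hash constructions):

```python
import hashlib

SECRET = b"server-secret"   # known only to the server (illustrative)

def syn_cookie(src_ip, src_port, dst_ip, dst_port):
    """Initial sequence number derived from the connection 4-tuple and a
    secret, so the server need not store half-open connection state."""
    h = hashlib.sha256()
    h.update(f"{src_ip}:{src_port}:{dst_ip}:{dst_port}".encode())
    h.update(SECRET)
    return int.from_bytes(h.digest()[:4], "big")   # 32-bit sequence number

# Server sends a SYNACK with seq = cookie and stores nothing
cookie = syn_cookie("10.0.0.5", 51234, "10.0.0.1", 80)

# A legitimate client ACKs with acknowledgment field = cookie + 1
ack_field = cookie + 1

# On the ACK, the server recomputes the cookie from the segment's own
# fields and verifies before allocating any resources
valid = ack_field == syn_cookie("10.0.0.5", 51234, "10.0.0.1", 80) + 1
print(valid)
```

A spoofed SYN thus costs the server only a hash computation, not buffer or variable allocation.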
      • nmap
        • Send TCP SYN segment, and either receive
          • TCP SYNACK segment
            • Application running TCP on corresponding port, nmap returns "open"
          • TCP RST segment
            • SYN reached target host, but the host is not running an application with TCP on corresponding port
            • Not blocked by firewall
          • Nothing
            • Likely blocked by firewall

3.6 — Principles of Congestion Control

    • 3.6.1 The Causes and the Costs of Congestion
      • Offered load: sending + retransmission rate of a host onto a link
      • Retransmissions, especially unnecessary retransmissions (for example due to a premature timeout) cause link utilization to plummet
      • If a packet is dropped along a multihop path, the transmission capacity that was used at each of the upstream links to forward that packet to the point at which it is dropped ends up having been wasted
    • 3.6.2 Approaches to Congestion Control
    • Two feedback pathways for network-indicated congestion information
      • End-to-end congestion control
        • Network congestion must be inferred by the end systems based only on observed network behavior (packet loss and delay)
      • Network-assisted congestion control
        • Routers provide explicit feedback to the sender and/or receiver regarding the congestion state of the network, typically using a choke packet

3.7 — TCP Congestion Control

    • End-to-end congestion control, since the IP provides no explicit feedback to the end systems regarding network congestion
    • TCP congestion-control mechanism operating at the sender tracks the variable congestion window (cwnd)
      • \(LastByteSent - LastByteAcked \leq min(cwnd, rwnd)\)
        • \(min\)(congestion control limit, flow control limit)
      • Bandwidth probing: uses ACKs to increase congestion window
      • Uses timeouts or three duplicate ACKs to decrease congestion window
      • Too small of a value and link utilization is unnecessarily low
      • Too high of a value potentially creates lots of congestion, also lowering link utilization (and wasting transmission capacity up to that point)
    • Slow start
    • TCP slow start
      • When a TCP connection begins, the value of \(cwnd\) is typically \(1\;MSS\)
      • \(cwnd\) increases by \(1\;MSS\) every time a transmitted segment is first acknowledged
      • If there is a loss event
        • Slow start threshold (\(ssthresh\)) \(= \dfrac{cwnd}{2}\)
        • \(cwnd\) is reset to \(1\;MSS\)
        • Slow start is re-entered, until \(cwnd = ssthresh\), in which case congestion avoidance mode is entered
    • Congestion avoidance
      • Increase \(cwnd\) by \(1\;MSS\) every RTT
      • If there is a timeout
        • Same as loss in slow start mode
      • If triple duplicate ACKs are received
        • \(ssthresh = \dfrac{cwnd}{2}\)
        • \(cwnd = \dfrac{cwnd}{2} + 3\;MSS\)
        • Fast recovery mode can be entered
    • TCP Tahoe only uses slow start and congestion avoidance modes
    • Fast recovery
      • Recommended but not required
      • \(cwnd\) increases by \(1\;MSS\) for every duplicate ACK received for the missing segment that caused the entry of fast recovery mode
      • When the ACK for the missing segment is received, congestion avoidance mode is entered after setting \(cwnd = ssthresh\)
      • If there is a timeout
        • Same as loss in slow start mode
      • Implemented by TCP Reno
    • Evolution of TCP's congestion window (Tahoe and Reno)
      • TCP Tahoe and Reno behave the same until triple duplicate ACK at transmission round 8
      • Loss at \(cwnd = 12\), so \(ssthresh = 6\)
      • TCP Reno sets \(cwnd = \dfrac{cwnd}{2} + 3 (= 9\;MSS)\) and enters fast recovery mode
      • TCP Tahoe sets \(cwnd = 1\;MSS\) and enters slow start mode
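The two reactions to a triple duplicate ACK can be captured in a few lines (values in MSS units, integer arithmetic for simplicity):

```python
def react_to_triple_dup_ack(cwnd, variant):
    """cwnd and ssthresh updates on a triple-duplicate-ACK loss event,
    per the rules above. Returns (ssthresh, new_cwnd) in MSS units."""
    ssthresh = cwnd // 2
    if variant == "reno":
        return ssthresh, cwnd // 2 + 3    # enter fast recovery
    return ssthresh, 1                    # tahoe: back to slow start

print(react_to_triple_dup_ack(12, "reno"))   # loss at cwnd = 12
print(react_to_triple_dup_ack(12, "tahoe"))
```

This reproduces the transmission-round-8 example above: both variants set \(ssthresh = 6\), but Reno resumes at \(9\;MSS\) while Tahoe restarts from \(1\;MSS\).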
    • TCP splitting
      • Desirable for cloud services to provide a high level of responsiveness
      • If end system is far from a data center, RTT will be large, potentially leading to poor response time performance due to TCP slow start
      • Solution: clients establish TCP connection to nearby front-end server. Front-end server maintains a persistent TCP connection to the data center with a large TCP congestion window
    • TCP's congestion control is often referred to as an additive-increase, multiplicative-decrease (AIMD) form of congestion control
    • Additive-increase, multiplicative-decrease congestion control
      • Ignore slow start
      • Assume loss is indicated by duplicate ACKs rather than timeouts
      • Assume fast recovery mode is entered
      • "Saw tooth" behavior
    • TCP Vegas
      • Tries to detect congestion in the routers between source and destination before packet loss occurs by observing RTT
      • If congestion is detected, lower the rate of transmission linearly
    • TCP's congestion control causes it to use high speed links inefficiently
      • Only takes one loss event for rate of transmission to completely reset
      • Researchers are investigating new versions of TCP for high-speed environments
    • 3.7.1 Fairness
      • Link capacity split equally among end systems connected to link
      • TCP connection 1 and 2 sharing a link of transmission rate \(R\)
      • Throughput realized by TCP connections 1 and 2
        • Start at point A
          • Total throughput into link \(< R\), so no loss
        • Moves to point B
          • Total throughput into link \(> R\), so loss
          • Both connections set \(cwnd = \dfrac{cwnd}{2}\)
        • Moves to point C
          • Connection 1, whose \(cwnd\) was greater than that of connection 2, experiences a greater reduction than connection 2
            • Closer to equal bandwidth sharing
        • Converges to fairness
          • Though in reality, RTT varies
            • Hosts with lower RTT are able to grab the available bandwidth (increase their \(cwnd\)) at the link more quickly as it becomes free
      • TCP's congestion control mechanism incentivizes high bandwidth applications to run over UDP
      • Can use multiple TCP connections in parallel to grab more bandwidth
    • 3.7.2 Explicit Congestion Notification (ECN): Network-assisted Congestion Control
    • Explicit Congestion Notification: network-assisted congestion control
      • ECN Echo bit = 1 should cause the TCP sender to halve \(cwnd\)

Chapter 9 — Multimedia Networking

9.1 — Multimedia Networking Applications

    • 9.1.1 Properties of Video
      • High bit rate (100 kbps to 3 Mbps)
      • Compressible
        • Can produce multiple versions of the same video for flexibility
      • Redundancy
        • Spatial
        • Temporal
    • 9.1.2 Properties of Audio
      • Analog-to-digital conversion
        • Pulse Code Modulation
        • Compressible
          • MPEG 1 layer 3 (MP3)
          • Advanced Audio Coding (AAC)
    • 9.1.3 Types of Multimedia Network Applications
      • Streaming stored audio/video
        • Streaming: begin media playout after receiving only some of media
        • Interactive
        • Continuous playout
        • Video
          • Average throughput is the most important performance measure
          • Streamed from CDN or P2P applications
      • Conversational voice/video-over-IP
        • Voice-over-IP (internet telephony, VoIP)
        • Delay-sensitive
        • Loss-tolerant
      • Streaming live audio/video
        • Typically via CDNs

9.2 — Streaming Stored Video

    • Extensive client-side application buffering for smooth playback
    • Client playout delay in video streaming
      • Absorb variations in server-to-client delay and server-to-client bandwidth
    • UDP streaming
      • Server transmits video at a rate that matches the client's video consumption rate
      • Typically consists of a small client-side buffer (holds less than a second of video)
      • Typically uses the upper-level Real-Time Transport Protocol (RTP)
      • Client and server also maintain a separate control connection where the client communicates changes in session state (pause, resume, reposition)
      • Disadvantages
        • Often fails to provide continuous playout due to fluctuations in bandwidth between server and client
        • Requires a media control server, such as a Real-Time Streaming Protocol (RTSP) server to process client-to-server interactivity requests and to track client state for each ongoing client session
        • Many firewalls block UDP traffic
          • Consequently, rarely used
    • HTTP streaming
    • Streaming stored video over HTTP/TCP
      • Server stores video as an ordinary file, addressed by a URL
      • Client uses HTTP GET requests over TCP for that URL
      • Server sends video as quickly as TCP allows with flow and congestion control
      • Client application begins playback once enough bytes are present in the client application buffer (decompresses and displays frames)
      • Disadvantages
        • Transmission rate often exhibits "saw-tooth" shape as a result of TCP congestion control
        • Potential significant delay as a result of TCP's retransmission mechanism
          • Fixable with client buffering and prefetching, so typically used today (YouTube, Netflix)
      • Prefetching video
        • Stored in client application buffer
        • "Back pressure" from full TCP buffers will force the server to reduce its transmission rate
          • A full client application buffer indirectly imposes a limit on the rate that video can be sent from server to client
      • Early termination and repositioning the video
        • Client application uses the HTTP byte-range header in the HTTP GET request message to specify the desired frames
        • Early termination and repositioning the video waste bandwidth and server resources
          • Smaller client application buffers reduce waste
    • Adaptive HTTP streaming (Dynamic Adaptive Streaming over HTTP, DASH)
      • See 2.6.2

9.3 — Voice-over-IP (internet telephony, VoIP)

    • 9.3.1 Limitations of the Best-Effort IP Service
      • Receiver must take care in determining
        • When to play back a chunk
        • What to do with a missing chunk
      • Packet loss
        • Buffer overflow in routers along path between sender and receiver
        • Could be eliminated with TCP, but
          • Retransmission mechanisms are often considered unacceptable for conversational real-time audio applications
          • Loss would result in reduction of the sender's transmission rate, likely to a rate lower than the receiver's drain rate, possibly leading to buffer starvation and severely impacting voice intelligibility at the receiver
            • Thus, most VoIP applications run over UDP (Skype, unless a user is behind a NAT/firewall which blocks UDP segments)
      • End-to-end delay
        • Accumulation of transmission, processing, queuing delays in routers; propagation delays in links; and end-system processing delays
        • Perception
          • < 150 msecs not perceived by a human listener
          • 150 to 400 msecs are acceptable but not ideal
          • > 400 msecs seriously hinder interactivity
      • Packet jitter
        • Varying queuing delays at routers
    • 9.3.2 Removing Jitter at the Receiver for Audio
      • Prepend each chunk with a timestamp
      • Delay playout of chunks at the receiver
      • Fixed playout delay
        • Receiver attempts to play out each chunk exactly \(q\) msecs after the chunk is generated
      • Adaptive playout delay
        • Playout time of \(i\)th packet is calculated via estimations of network delay and the variance of the network delay
          • Exponentially weighted moving averages for both
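The adaptive playout scheme above can be sketched directly from its EWMA formulas. The parameter names (`u` for the EWMA weight, `K` for the safety multiplier) and their default values are illustrative; for packet \(i\) with sender timestamp \(t_i\) and receiver arrival time \(r_i\), the delay estimate and its deviation are updated, and the first packet of a talk spurt is scheduled at \(t_i + d_i + K v_i\).

```python
# Sketch of adaptive playout-delay estimation via EWMAs. Parameter names
# u and K, and their defaults, are assumed for illustration.
#   d_i = (1 - u) * d_{i-1} + u * (r_i - t_i)        # delay estimate
#   v_i = (1 - u) * v_{i-1} + u * |r_i - t_i - d_i|  # deviation estimate

def update_estimates(d, v, t_i, r_i, u=0.01):
    """One EWMA step for the delay and delay-deviation estimates."""
    d = (1 - u) * d + u * (r_i - t_i)
    v = (1 - u) * v + u * abs(r_i - t_i - d)
    return d, v

def playout_time(t_i, d, v, K=4):
    """Playout time for the first packet of a talk spurt; later packets
    in the same spurt are played out at fixed spacing after it."""
    return t_i + d + K * v
```

Larger `K` trades extra playout delay for a lower probability of a packet arriving after its scheduled playout time.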
    • 9.3.3 Recovering from Packet Loss
      • Loss recovery schemes
        • Forward Error Correction (FEC)
          • Add redundant information to the original packet stream
          • Send a redundant encoded chunk every \(n\) chunks = XOR of the original \(n\) chunks.
            • If any single packet in the \(n+1\) packets is lost, the receiver can recover the packet by XORing all of the other packets
            • If more than one packet is lost, there is no recovery
          • Piggybacking lower-quality redundant information: append a lower-resolution encoding of the previous chunk to each new packet, so a single lost packet can be concealed by playing the low-bit-rate copy carried in the next packet
        • Interleaving
          • Resequence chunks before transmission so that adjacent chunks are separated in the transmitted stream; a single packet loss then produces several small, easily concealed gaps instead of one large gap
          • Adds no bandwidth overhead, but increases playout delay
        • Error concealment
          • Produce a replacement for a lost packet that is similar to the original
          • Possible via short-term self-similarity
          • Packet repetition
          • Interpolation
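The XOR-based FEC scheme above is small enough to sketch end to end: one redundant chunk per group of \(n\) originals, with recovery of any single lost packet.

```python
# Sketch of XOR-based FEC: after every n chunks, send one redundant chunk
# equal to the byte-wise XOR of the n originals. If exactly one of the
# n + 1 packets is lost, the receiver recovers it by XORing the rest.

from functools import reduce

def xor_chunks(chunks):
    """Byte-wise XOR of equal-length chunks."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), chunks)

def make_group(chunks):
    """The original chunks plus one redundant XOR (parity) chunk."""
    return list(chunks) + [xor_chunks(chunks)]

def recover(group_with_one_loss):
    """Recover the single missing chunk (marked None) from the others."""
    present = [c for c in group_with_one_loss if c is not None]
    return xor_chunks(present)

chunks = [b"aaaa", b"bbbb", b"cccc"]
group = make_group(chunks)
group[1] = None                 # simulate losing the second packet
assert recover(group) == b"bbbb"
```

Smaller \(n\) means faster recovery and tolerance of more frequent losses, at the cost of proportionally more redundant bandwidth.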
    • 9.3.4 Case Study: VoIP with Skype
      • Proprietary
      • Clients can use many different codecs
      • Audio and video packets via UDP (by default, TCP otherwise)
      • Control packets via TCP
      • FEC for loss recovery
      • Adapts streams to current network conditions
      • P2P
      • Skype super peers relay data between two callers behind UDP-blocking NATs
      • For video calls with \(N > 2\) participants, each participant's video stream is routed to a server cluster, which relays to each participant the streams of the \(N - 1\) other participants, avoiding the likely low-bandwidth upstream links of each participant

    9.4 — Protocols for Real-Time Conversational Applications

    • 9.4.1 Real-Time Transfer Protocol (RTP)
      • Runs on top of UDP
      • RTP header fields
        • Payload type = audio or video encoding
        • Synchronization source identifier (SSRC) uniquely identifies the source of the RTP stream
      • No assurance of timely data delivery or other quality-of-service (QoS) guarantees
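The fixed part of the RTP header (12 bytes, per RFC 3550) can be packed in a few lines; this sketch shows where the payload type, sequence number, timestamp, and SSRC fields sit. The specific field values used below are illustrative.

```python
# Sketch of packing the 12-byte RTP fixed header (RFC 3550), carried
# inside a UDP payload. Field values below are illustrative.
import struct

def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=0):
    version, padding, extension, csrc_count = 2, 0, 0, 0
    byte0 = (version << 6) | (padding << 5) | (extension << 4) | csrc_count
    byte1 = (marker << 7) | payload_type
    # Big-endian: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronization source identifier.
    return struct.pack(">BBHII", byte0, byte1, seq, timestamp, ssrc)

# PCM mu-law audio is payload type 0 in the RTP audio/video profile;
# for 8000 Hz audio, the timestamp advances by 160 per 20 ms chunk.
header = pack_rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=0x1234)
assert len(header) == 12
```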
    • 9.4.2 Session Initiation Protocol (SIP)
    • SIP
      • Establishes calls
      • Can allow the caller to determine the current IP address of the callee
      • Example call setup (caller's host at umass, callee registered with the NYU SIP registrar)
        • 1: SIP INVITE message to the umass SIP proxy
        • 2: SIP proxy does DNS lookup on the SIP registrar (not shown), and forwards the INVITE message to the SIP registrar
        • 3: Redirect response since host is registered with the NYU SIP registrar
      • Call management (adding new media streams, changing the encoding of a media stream, inviting new participants during the call)
      • Ends calls

    9.5 — Network Support for Multimedia

    • Three network-level approaches to supporting multimedia applications
    • 9.5.1 Dimensioning Best-Effort Networks
      • Bandwidth provisioning: how much capacity to provide at network links in a given topology to achieve a given level of performance
      • Network dimensioning: how to design a network topology (where to place routers, how to interconnect routers with links, what capacity to assign links) to achieve a given level of end-to-end performance
      • Need
        • Models of traffic demand between network end points
        • Well-defined performance requirements
        • Models to predict end-to-end performance for a given workload model, and techniques to find a minimal cost bandwidth allocation that will result in all user requirements being met
      • Internet could support multimedia traffic at an appropriate performance level if dimensioned correctly, but does not, primarily due to economic and organizational reasons
        • Users may not be willing to pay their ISPs enough for sufficient bandwidth to support multimedia applications over a best-effort Internet
        • Different ISPs would have to be willing to cooperate to ensure that end-to-end paths are properly dimensioned to support multimedia applications
    • 9.5.2 Providing Multiple Classes of Service
      • Type-of-service (ToS) field in the IPv4 header could be used for packet marking
      • Ideally provide traffic isolation among classes, so one class is not adversely affected by another class of traffic that misbehaves
        • Traffic policing (drop or delay packets that violate a criterion)
        • Link-level packet-scheduling to provide logically distinct links of different capacities within the same physical link
      • Desirable to use resources (link bandwidth, buffers) as efficiently as possible
        • Strictly partitioned (logically distinct) links can waste resources, since capacity left unused by one class cannot be used by another
      • Criteria for policing
        • Long-term average rate (e.g., 6,000 packets/minute)
        • Peak rate (e.g., 1,500 packets/second)
        • Burst size (e.g., at most 750 packets sent back-to-back)
      • The leaky bucket policer
        • Packets can only be sent if a token can be removed from the bucket
        • Bucket holds up to \(b\) tokens (the maximum burst size); tokens are generated at rate \(r\)
        • At most \(rt + b\) packets can be sent in any interval of length \(t\), so the long-term average rate is limited to \(r\)
        • Can use two leaky buckets in series to police a flow's peak rate as well as its average rate
          • First bucket \((r_1, b_1)\): limits the long-term average rate (at most \(r_1 t + b_1\) packets in any interval of length \(t\))
          • Second bucket \((r_2, b_2)\) with \(r_2 > r_1\) and small \(b_2\): limits the peak rate; together, at most \(\min(r_1 t + b_1,\; r_2 t + b_2)\) packets in any interval of length \(t\)
      • Leaky bucket + Weighted Fair Queuing = Provable Maximum Delay in Queue
      • \(n\) multiplexed leaky bucket flows with WFQ scheduling
        • \(d_{max} = \dfrac{b_1}{R w_1 / \sum_j{w_j}}\) for the maximum queuing delay of class 1
          • \(R\) is the transmission rate of the link
          • If at most \(b_1\) packets are in the queue and packets are served at a rate of at least \(\dfrac{R w_1}{\sum_j{w_j}}\), the last packet waits no longer than \(b_1\) divided by that service rate
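A token-bucket policer and the WFQ delay bound for a policed class can both be sketched in a few lines. The rates, bucket sizes, and weights below are illustrative.

```python
# Sketch of a leaky (token) bucket policer, plus the maximum-delay bound
# for a leaky-bucket-policed class under WFQ. Parameters are illustrative.

class TokenBucket:
    def __init__(self, rate, size):
        self.rate, self.size = rate, size      # tokens/sec r, bucket depth b
        self.tokens, self.last = size, 0.0     # bucket starts full

    def conforms(self, now):
        """Admit a packet at time `now` iff a token is available."""
        self.tokens = min(self.size, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# A bucket with r = 2 tokens/sec, b = 3 admits an initial burst of 3
# packets, then only the long-term rate r.
tb = TokenBucket(rate=2.0, size=3)
burst = [tb.conforms(0.0) for _ in range(4)]   # fourth packet rejected

# WFQ bound: with at most b1 packets queued and a guaranteed service
# rate of R * w1 / sum(weights), the maximum queuing delay of class 1 is:
def d_max(b1, R, w1, weights):
    return b1 / (R * w1 / sum(weights))
```

For example, with \(b_1 = 3\) packets, \(R = 10\) packets/sec, and two equal weights, the bound is \(3 / (10 \cdot 0.5) = 0.6\) sec.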
    • 9.5.3 Diffserv
      • Differentiated service for different classes of traffic
      • Network edge router functions: packet classification and traffic conditioning
      • Network core router function: forwarding
        • Each packet is forwarded onto its next hops according to per-hop behavior (PHB) associated with the packet's class
      • Very scalable, no need to maintain connection-specific information
      • End users may have to agree to limit their packet-sending rate to conform to a declared traffic profile
      • Metering functions compare incoming packet flow with the negotiated traffic profile, determine whether a packet is behaving as expected, and act accordingly (forward if behaving, delay/drop if not)
      • Expedited forwarding PHB specifies that the departure rate of a class of traffic must equal or exceed a configured rate
      • Assured forwarding PHB divides traffic into four classes, where each class is guaranteed with some minimum amount of bandwidth and buffering
    • 9.5.4 Per-Connection Quality-of-Service (QoS) Guarantees: Resource Reservation and Call Admission
      • For a network to make guarantees, it must integrate call admission
      • A flow must declare its QoS requirement, and have the network either accept the flow (at the required QoS) or block the flow
      • This requires the reservation of sufficient resources at each and every network router on its source-to-destination path (call setup)
      • The call setup process: the flow characterizes its traffic and declares its QoS requirement, signaling carries the request along the source-to-destination path, and each router decides whether to admit the call and reserve resources for it

    Sources

    Main image: "Internet map 1024" by The Opte Project, hosted by Wikimedia, license

    Textbook: Computer Networking: A Top-Down Approach, by James Kurose and Keith Ross