SMART Overlay: Duplicate Traffic Elimination
The Internet consumes 5% of the world's energy. About 90% of Internet traffic is video, and most of it is “redundant”: on YouTube, for example, the most popular 10% of videos account for 90% of total views. As a result, the same data are transmitted over the Internet again and again. Our challenge is to design the first traffic deduplication technique for more efficient network communication between video sources (video servers, or proxy servers in a CDN) and clients.
Our solution is a smart overlay network consisting of novel smart routers deployed over the Internet backbone. Each video stream travels from a video source to a client through a series of smart routers. As in other overlay network designs, the TCP packets exchanged between any two smart routers are still carried over IP. When a smart router recognizes that two video streams passing through it carry the same video, it reuses the data packets of the older stream to serve the younger stream at a later time, and asks the upstream smart router to stop transmitting the younger stream, saving network resources. Applied opportunistically at each smart router, this traffic deduplication dynamically merges independent streams on the overlay into a streaming tree. A streaming tree is not a multicast tree: all clients of a multicast tree are at the same play point in the video stream, whereas each client of a streaming tree can be at its own play point in the same video stream. This new capability enables us to exploit the efficiency of multicast for video on demand.
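The merging step described above can be illustrated with a small sketch. It is a toy model under stated assumptions, not the project's router software: the class `SmartRouter`, its methods, the per-packet sequence numbers, and a buffer sized in packets are all hypothetical names and simplifications introduced here. The sketch buffers recent packets of the older stream and accepts a merge only when the younger stream's play point falls inside the buffered range.

```python
from collections import deque


class SmartRouter:
    """Toy model of one smart router; every name here is hypothetical."""

    def __init__(self, buffer_packets):
        self.buffer_packets = buffer_packets
        self.buffers = {}  # video_id -> deque of (seq, payload) from the older stream
        self.merged = {}   # (video_id, client_id) -> next seq the client needs

    def on_older_stream_packet(self, video_id, seq, payload):
        # Keep only the most recent packets; deque(maxlen=...) evicts the oldest.
        buf = self.buffers.setdefault(video_id, deque(maxlen=self.buffer_packets))
        buf.append((seq, payload))

    def try_merge(self, video_id, client_id, next_seq):
        """Return True if the younger stream can be fed from the buffer,
        meaning the upstream router may be asked to stop sending it."""
        buf = self.buffers.get(video_id)
        if not buf or next_seq < buf[0][0] or next_seq > buf[-1][0] + 1:
            return False  # play point lies outside the buffered range
        self.merged[(video_id, client_id)] = next_seq
        return True

    def serve(self, video_id, client_id):
        """Return buffered payloads the merged client has not yet received."""
        buf = self.buffers[video_id]
        next_seq = self.merged[(video_id, client_id)]
        out = [payload for seq, payload in buf if seq >= next_seq]
        if out:
            self.merged[(video_id, client_id)] = buf[-1][0] + 1
        return out


router = SmartRouter(buffer_packets=3)
for seq in range(5):
    router.on_older_stream_packet("video-a", seq, f"pkt{seq}")

print(router.try_merge("video-a", "client-2", next_seq=1))  # False: packet 1 was evicted
print(router.try_merge("video-a", "client-2", next_seq=3))  # True: packets 2..4 are buffered
print(router.serve("video-a", "client-2"))                  # ['pkt3', 'pkt4']
```

In this model the buffer size bounds how far apart two play points can be and still merge, which is why a larger buffer would allow more streams to join the same streaming tree. The sketch ignores packet loss, and it does not model a merged client falling so far behind that its data is evicted.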
The proposed overlay is not a P2P streaming technique. A peer in a P2P design such as BitTorrent receives the data for a given video from many different peers. In contrast, each smart router always receives a given video stream from a designated smart router chosen by the routing algorithm. In other words, our design is a routing technique, not a file-sharing technique. More importantly, P2P streaming is not a deduplication technique and does not reduce network traffic, which is the primary problem our design addresses.
Our overlay design is not a CDN. It is a network communication technique, not a caching technique. Streams are merged in the router fabric at line rate, and when a streaming tree session terminates, the video data is not kept in our routers. In contrast, a CDN caches a video for a very long duration, subject to some cache replacement policy (e.g., LRU). Furthermore, proxy servers do not save bandwidth between themselves and their users (i.e., they perform no traffic deduplication). Such savings can be achieved by deploying our smart routers in regional networks, although our current focus is on savings in the Internet backbone.
Our experimental setup consists of six nodes in the PlanetLab network, each running our smart routing software (i.e., acting as a software router). Experiments on this prototype indicate that several GBytes of traffic can be saved per second with a buffer of just 2 GB at each router. These results suggest that a larger SMART overlay deployment could eliminate significant duplicate traffic.