Researchers at Aalborg University, MIT and Caltech have developed a new mathematically-based technique that can boost internet data speeds by up to 10 times, by making the nodes of a network much smarter and more adaptable. The advance also vastly improves the security of data transmissions, and could find its way into 5G mobile networks, satellite communications and the Internet of Things.
The problem with TCP/IP
Data is sent over the internet in "packets," or small chunks of digital information. The exact format of the packets and the procedure for delivering them to their destination is described by a suite of protocols known as TCP/IP, or the internet protocol suite, designed in the early 70s.
Back when it was conceived, the internet protocol suite was a tremendous leap forward that revolutionized our paradigm for transmitting digital information. Remarkably, 40 years on, it still forms the backbone of the internet. However, despite all its merits, few would say that it is particularly efficient, secure or flexible.
For instance, TCP has to hand data to the recipient in the exact order in which it was sent, so one missing packet holds up everything behind it until it is retransmitted. If even a single packet is lost for any reason, the protocol interprets this as a sign that the network is congested: the transmission speed is immediately halved, and from there it attempts to rise again only very slowly. This is ideal in some situations and terribly inefficient in others. The issue is that the protocol doesn't have the intelligence to know what the right thing to do is.
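The halve-then-creep-back behavior can be sketched in a few lines of Python. This is a deliberately simplified model rather than real TCP: it ignores slow start, timeouts and fast recovery, and the function name `aimd_window` and its parameters are made up for illustration.

```python
def aimd_window(loss_events, rounds, start=1.0, increase=1.0):
    """Return the sending window (in packets) after each round trip.

    loss_events: set of round indices in which a packet loss is detected.
    Assumption: one fixed additive step per round trip, one halving per loss.
    """
    window = start
    history = []
    for rnd in range(rounds):
        if rnd in loss_events:
            # Multiplicative decrease: the loss is read as congestion.
            window = max(1.0, window / 2)
        else:
            # Additive increase: grow by a small fixed step per round trip.
            window += increase
        history.append(window)
    return history


# A single loss in round 10 of an otherwise clean 20-round transfer.
trace = aimd_window(loss_events={10}, rounds=20)
print(trace[9], trace[10], trace[16])  # window before, right after, and six rounds after the loss
```

In this toy run the window drops from 11 packets to 5.5 after one loss and needs six more round trips to climb back past where it was, even if the loss had nothing to do with congestion.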
Also, although the packets could take a theoretically infinite number of paths to travel between point A and point B in a network, it turns out that data in a TCP connection typically travels along the same path, which makes it quite easy for an eavesdropper to spy on your communications.
Network coding – the solution?
An interesting proposal that might offer the solution to these problems is so-called network coding, which aims to make each node in the network much smarter than it currently is. In TCP/IP, the nodes of the network are just simple switches that can only store data packets and then forward them to the next node along their predetermined route; by contrast, in network coding each node can process packets as needed, for instance by re-routing or re-encoding them.
Adding intelligence at the node level may be a truly disruptive change, because it allows for unparalleled flexibility in the way information is handled. For instance, it can take advantage of multipath TCP (implemented in iOS 7) and, on top of it, add an encoding mechanism that further increases security and speed, or even enable data storage right within the nodes of the network.
In a recent study, a team of researchers from Aalborg University (Denmark), MIT and Caltech has built an implementation of just such a protocol, displaying some impressive speed gains. In a demo, a four-minute-long mobile video was downloaded five times faster than with state-of-the-art technology, and was then streamed without interruptions.
"In experiments with our network coding of Internet traffic, equipment manufacturers experienced speeds that are five to 10 times faster than usual. And this technology can be used in satellite communication, mobile communication and regular internet communication from computers," says Prof. Frank Fitzek, who led the study.
How it works
Whether the contents of a packet are part of a YouTube video, a text or a song, they are nonetheless encoded by a string of zeros and ones, which can also be seen as a number in binary format.
In TCP/IP, the nodes of a network treat data packets individually by simply storing their content and relaying it to the next node. But in the protocol developed by Fitzek and colleagues, the content of the packet is seen as an actual number, and packets are processed in chunks. Each node builds a set of linear equations, using both the numbers extracted from the content of the packets and a set of randomly generated coefficients.
Each linear equation forms a "coded packet" where the coefficients are stored inside the coded packet's header, and the unknown variables are the actual contents of the packets, treated as a number. In other words, each coded packet contains partial information on several "standard" packets at once, but multiplied by different coefficients.
As you might remember from high school math, you need N independent linear equations to solve for N unknown variables. Because each coded packet contains a single equation, this means that the recipient will need N packets (with coefficient sets that are not just combinations of one another) before it can decode the data.
But why go to the trouble of complicating things so much? The answer is that now, unlike with TCP/IP, the recipient doesn't need to receive packets in order. In fact, the order in which packets are received becomes completely irrelevant. All that matters is that the recipient obtains N coded packets, all with different coefficients, so it can solve the equations and obtain the original data.
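To make the idea concrete, here is a minimal sketch of coding and decoding with random coefficients. It rests on assumptions that are not from the study: the arithmetic is done byte by byte modulo the prime 257 (chosen only because it keeps the example short), and the names `encode` and `decode` are hypothetical. It is not the researchers' implementation.

```python
import random

P = 257  # prime modulus (assumption): every byte value 0..255 is a valid element


def encode(packets, rng):
    """Build one coded packet: random coefficients plus the combined payload."""
    coeffs = [rng.randrange(P) for _ in packets]
    payload = [sum(c * pkt[i] for c, pkt in zip(coeffs, packets)) % P
               for i in range(len(packets[0]))]
    return coeffs, payload


def decode(coded, n):
    """Solve the linear system by Gauss-Jordan elimination modulo P.

    Returns the n original packets, or None if the equations received so far
    are not independent enough (more coded packets are needed).
    """
    rows = [list(c) + list(p) for c, p in coded]
    for col in range(n):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, P)  # modular inverse (Python 3.8+)
        rows[col] = [x * inv % P for x in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % P for x, y in zip(rows[r], rows[col])]
    return [bytes(rows[i][n:]) for i in range(n)]


rng = random.Random(7)
originals = [b"net", b"cod", b"ing"]                    # N = 3 equal-length packets
received = [encode(originals, rng) for _ in range(5)]  # two spare coded packets
rng.shuffle(received)                                  # arrival order is scrambled
del received[0]                                        # one coded packet is lost
decoded = decode(received, len(originals))
too_few = decode(received[:2], len(originals))         # only 2 equations for 3 unknowns
```

The receiver never asks which coded packet went missing or in what order the rest arrived; any sufficiently large set of independent equations will do, while two equations alone are not enough to recover three packets.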
This flexibility in the order means that the whole system is much more efficient, because all the packets are interchangeable. A lost packet is no longer cause for severe transmission delays as in TCP/IP.
And because the order doesn't matter, the packets can now travel along different paths through the network. This also increases security, because it becomes nearly impossible for anyone to intercept the communication by tapping into a single line.
What's next?
The technology could find application in 5G telecommunications, the Internet of Things, and software-defined networks. Moreover, the intelligence of the network also opens up the possibility of vastly distributed storage solutions directly within the network.
"I think the technology will be integrated in most products because it has some crucial and necessary functions," says Fitzek. "The only thing that can stop the development is patents. Previously, individual companies had a solid grip on patents for coding. But our approach is to make it as accessible as possible."
Sources: Aalborg University, Frank Fitzek
With IPv6, for instance, routers no longer perform packet fragmentation, because there is a need for them to be faster, dumber and cheaper. Out-of-sequence packets are generally fine with TCP because the packets are each numbered for reassembly.
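A rough sketch of that reassembly, for anyone curious. It is heavily simplified (no sequence-number wraparound, no overlapping segments, no retransmission logic) and the function name `reassemble` is made up:

```python
def reassemble(segments, initial_seq=0):
    """Rebuild a byte stream from (sequence_number, data) pairs in any order.

    Returns the contiguous bytes available so far. It stops at the first gap,
    which is where a real receiver would have to wait for a retransmission.
    """
    buffered = dict(segments)      # sequence number -> payload
    stream = bytearray()
    next_seq = initial_seq
    while next_seq in buffered:
        data = buffered[next_seq]
        if not data:               # guard: an empty segment would never advance
            break
        stream += data
        next_seq += len(data)      # sequence numbers count bytes, not packets
    return bytes(stream)


out_of_order = [(5, b" worl"), (0, b"hello"), (10, b"d")]
with_gap = [(0, b"hello"), (10, b"d")]       # the middle segment never arrived

print(reassemble(out_of_order))  # b'hello world'
print(reassemble(with_gap))      # b'hello'
```

So arrival order by itself is not a problem; what stalls delivery is a missing segment, because nothing after the gap can be handed to the application.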
Data could be sped up essentially by rolling CDN functionality directly into the protocol (today it is mostly a feature bolted on to DNS). Hosts can be assigned a CDN node through DHCP option 82 and then reference it with their GET request; the server then points them to that CDN, and the client reports performance statistics back to the centralized controller for monitoring. It saves the trouble of having to know the location of all the caching DNS servers doing lookups to guess which cluster to serve the client out of, and you can always still just point the client request elsewhere if there is a reason to (like degraded performance).
This wouldn't be terribly difficult to deploy into the protocol, but I believe Verizon and some other people own patents that would prevent it. Thankfully some of those patents are nearing the end of their lifespan.
The notion that the free market increases competition and lowers prices in the US is a lie at the Big Corporation level. Look at other markets, like cable/data services. The companies don't compete against each other, they just quietly collude to set up fiefdoms around the country like a bunch of mini-monopolies, thereby keeping prices high and those huge bonuses coming in for the executives. There are HUNDREDS of examples of less gigantic companies (with fewer lawyers, resources and Congressional clout) where companies are found guilty of collusion (essentially conspiracy) and they just quietly pay their fines and then try to find another way to game the system. Monsanto, Japanese agribusiness chemical suppliers and recent flatscreen manufacturers come to mind. WE NEED STRONG WATCH DOGS TO PROTECT CONSUMERS, AND GOVERNMENT OFFICIALS NOT OWNED BY THE CORPORATIONS THEY'RE SUPPOSED TO REGULATE.
According to Pulitzer Prize-winning investigative reporter David Cay Johnston, author of "The Fine Print":
The Internet was invented in the U.S., but we've fallen behind other countries in terms of access and speed. Our service is more expensive than in any of those countries. Why? Wealthy corporations have worked the regulatory system to their advantage, so that the fees that banks and phone and cable companies have added over the years have made your bills incrementally larger but have added up to big money for corporations.
Through various fees above the stated cost, in our phone bills, we've actually been paying, over the years, to create the cable network that provides Internet access and cable TV. We've paid, between cable company rate increases and telephone company rate increases, over a half-trillion dollars to get the Internet.
Per bit of information, we pay 38 TIMES what the Japanese pay. The US now ranks 29th in the speed of our Internet, according to Pando Networks. We're way behind countries like Lithuania, Ukraine and Moldavia in the speed of our Internet.
American triple-play packages average about $160 a month, including fees. The same service in France is only $38 a month with an Internet that is 10 times faster downloading and 20 times faster uploading, with much broader international television stations than you get here.
Thanks to our donation/payola loving local, state and national elected officials, US consumers now dance to the tune of our corporations, NOT the other way around.
This country may have once been the envy of the world, but it's a sad, corrupt place now.
This proposed method sounds like just another compression algorithm. It will also suffer from dropped packets. If it doesn't, it would have had to bloat out the payloads to provide redundancy. You can't get something for nothing.
As for storing data in the nodes.... well since the "node" is a router, I don't think that will happen any time soon.
It's true the executives are making too much money, but that's also true of most public companies. After accounting for other factors like population density and economy I might still pay too much for Internet, but the difference is likely a minuscule percentage of our yearly budget. It's really amazing how much mental energy people spend complaining about it.
People pay half their income in taxes and still have to pay out of pocket for things like education, healthcare, and retirement. Middle class housing in some areas is anywhere from $250k to $1,000,000 and some people come out of college with $100k in student loans to pay back, but what matters is the ~$180 you could save in a year by paying $15/month less for Internet access. If it was really that profitable, more companies would be coming up with the money needed to enter the market or expand aggressively. There are a lot of costs associated with new deployments and it takes a long time to realize the investment of building out.
@christopher See http://en.wikipedia.org/wiki/IPv6#Simplified_processing_by_routers for a half-decent read. Essentially, things like packet fragmentation, checksum, and queue timers were dropped in the move from IPv4 to IPv6. Checksum was purposely removed because it's a redundant function with lower layers of the IP stack. IPv6 also uses a simplified fixed-length header. It's expensive to run routers with a lot of features and processing power when you mostly just need them to move a lot of data cheaply and efficiently.
There is a huge difference in cost/bit for optical transport hardware vs router hardware, in part because transport can just dumbly push on the data without having to do computationally expensive work like reading packet headers along the way. Facebook, Google etc. are frustrated with router costs and have taken to rolling their own (SDN-based) routing platforms to force down costs. Another example of this was the industry move from SONET-based circuits to cheaper Ethernet-based circuits/hardware.
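To give a feel for how little there is to parse, here is a small sketch that unpacks the 40-byte fixed IPv6 header with Python's standard `struct` module. The helper name `parse_ipv6_header` and the sample addresses (from the 2001:db8::/32 documentation range) are illustrative only, and extension headers are ignored.

```python
import ipaddress
import struct

# Fixed IPv6 header: 4 bytes (version, traffic class, flow label),
# 2 bytes payload length, 1 byte next header, 1 byte hop limit,
# then two 16-byte addresses. Note there is no checksum field.
IPV6_HEADER = struct.Struct("!IHBB16s16s")


def parse_ipv6_header(raw):
    """Decode the fixed header from the first 40 bytes of a packet."""
    word, payload_len, next_header, hop_limit, src, dst = IPV6_HEADER.unpack(raw[:40])
    return {
        "version": word >> 28,
        "traffic_class": (word >> 20) & 0xFF,
        "flow_label": word & 0xFFFFF,
        "payload_length": payload_len,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "src": ipaddress.IPv6Address(src),
        "dst": ipaddress.IPv6Address(dst),
    }


# Hand-built sample header: version 6, next header 6 (TCP), hop limit 64.
sample = IPV6_HEADER.pack(
    6 << 28, 0, 6, 64,
    ipaddress.IPv6Address("2001:db8::1").packed,
    ipaddress.IPv6Address("2001:db8::2").packed,
)
fields = parse_ipv6_header(sample)
print(IPV6_HEADER.size, fields["version"], fields["hop_limit"])
```

Every field sits at a fixed offset, so there is no variable-length options area for a router to walk through as there is in IPv4.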
There are better solutions to the problem than this (like pushing content delivery storage closer to the edge). There also isn't much need to send an existing TCP packet flow over multiple physical paths because routers have thousands of data transfers at any given time and they can just load balance per flow to achieve utilization of another path without having to subject individual flows to jitter.
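Per-flow load balancing is easy to picture as a hash over the flow identifiers. The sketch below is only an illustration: `pick_path` is a made-up name, the path labels are placeholders, and SHA-256 stands in for whatever much cheaper hash real router hardware uses.

```python
import hashlib


def pick_path(flow, paths):
    """Map a flow 5-tuple to one of the available paths, deterministically."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return paths[int.from_bytes(digest[:4], "big") % len(paths)]


paths = ["link-A", "link-B", "link-C"]  # hypothetical equal-cost links
# (source IP, destination IP, protocol, source port, destination port)
flow = ("192.0.2.10", "198.51.100.7", "tcp", 49152, 443)

first = pick_path(flow, paths)
second = pick_path(flow, paths)
```

Because the choice depends only on the flow's identifiers, every packet of a given transfer follows the same link and stays in order, while different transfers spread out across the links.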
Don't get me wrong, there is nothing wrong with new ideas and an outside-the-box look at the problem, but I don't see it as a viable solution.
The answer will be complex, but the principle is simple: limit the involvement of politics as much as possible.
piper Tom: "... as much as possible ..." is correct. When people are willing to give up the worship of rulers, i.e., so-called leaders, as a viable social system, then we can begin to use the newly freed up market to provide security. It will quickly be realized that it was always "possible" to live a civilized life without being controlled, and be more prosperous.
We don't have to fear self-governance any more than free markets. You can't have one without the other.