Peer-to-Peer Technology to Trump Traditional Cloud Storage

On a lovely Thursday afternoon in June, I sat down with Sarah Koo, a data scientist with BitTorrent. I grew up in a time and place where Peer-to-Peer (P2P) transfer was almost commonplace, and I was fortunate enough to have the opportunity to learn more about this unique technology.

 

When Peer-to-Peer transfer is mentioned, copyright infringement is the first thing that comes to mind. Though P2P transfer is used, at times, to download Game of Thrones, the technology itself is an innovative protocol designed to transfer large files efficiently. I want to state explicitly that neither I, Sarah, nor BitTorrent condones any form of illegal file transfer. I am advocating the technology itself, which is commonly used in both academia and business.

 

Bram Cohen created the BitTorrent protocol about a decade ago. This genius literally coded the program by himself in his bedroom, like many of our city’s tech entrepreneurs. Traditional file transfer protocols connect every downloader to a single server. If 100 users download the same file, all 100 connect to the same server, so download speeds drop and the strain on the server grows as demand for the file increases. Cohen's protocol instead creates a distinct virtual network for each file, and anyone within that network can help distribute the file.


Let's say UC Berkeley is working on a massive database and needs to distribute it to 10,000 researchers across the country. If each of these 10,000 researchers downloads the file directly from the Cal server, the strain on that single server is intense and the transfer is extremely inefficient. P2P transfer creates a network that links all invited researchers together, and once any researcher finishes downloading the database, they can act as a distributor for other researchers within the network. Thus, the work of distributing the database is crowdsourced to all its participants.

BitTorrent, the most popular P2P transfer application, breaks large files into small segments for even more efficient distribution. If the database described above is 20 gigabytes, BitTorrent could break the file into 20 equal-sized 1-gigabyte segments. “If I request the file from peers, I can be getting it from 20 different people at the same time.” Thus, each peer only needs to distribute 1/20th of the overall load.


Because BitTorrent breaks files into many individual segments, peers can act as distributors the moment they finish downloading any single piece of the overall file. Once all the segments are downloaded, BitTorrent reassembles them into the original file. The ability to deconstruct files and then reassemble the segments once the transfer is complete is groundbreaking, especially in the field of cloud storage.
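To make the idea of segments concrete, here is a minimal Python sketch of splitting a file into fixed-size pieces and reassembling them in order, whatever order they arrived in. The function names are made up for illustration, and the per-piece hash check is a detail the interview did not cover; BitTorrent does verify pieces with hashes, but this is only a toy model under those assumptions, not the real client.

```python
import hashlib
from typing import Dict, List


def split_into_pieces(data: bytes, piece_size: int) -> List[bytes]:
    """Cut data into consecutive pieces; the last one may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hashes(pieces: List[bytes]) -> List[str]:
    """One SHA-1 digest per piece, published alongside the file."""
    return [hashlib.sha1(piece).hexdigest() for piece in pieces]


def reassemble(received: Dict[int, bytes], expected: List[str]) -> bytes:
    """Rebuild the file from pieces keyed by index, verifying each one."""
    ordered = []
    for index, digest in enumerate(expected):
        piece = received[index]  # raises KeyError if a piece is missing
        if hashlib.sha1(piece).hexdigest() != digest:
            raise ValueError(f"piece {index} failed verification")
        ordered.append(piece)
    return b"".join(ordered)


if __name__ == "__main__":
    original = bytes(range(256)) * 10          # stand-in for a large file
    pieces = split_into_pieces(original, 100)
    hashes = piece_hashes(pieces)
    # Pretend the pieces arrived from different peers, in reverse order.
    arrived = {i: p for i, p in reversed(list(enumerate(pieces)))}
    assert reassemble(arrived, hashes) == original
```

Because every piece carries its own index and checksum, a peer can hand out piece 7 before it has ever seen piece 3, which is what lets downloaders become distributors so early.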

BitTorrent has used this technology to build sync applications that are common in industries such as video editing. In Hollywood, uncut video files run into the hundreds of gigabytes. If multiple video editors are working on the same film, it’s inefficient to retransfer the whole video any time a minute change is made. Many editors use this segmentation technology to transfer and sync only the portions that were altered. Instead of retransferring the whole 100 GB video, Hollywood only transfers the 1 GB scene that was actually edited. Now that’s efficiency.
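One simple way to picture "sync only what changed" is to hash each segment of the old and new versions and compare the lists. The sketch below does exactly that; the names are hypothetical, and it assumes fixed-size segments and edits that overwrite data in place. It is not a description of how BitTorrent's sync product actually works.

```python
import hashlib
from typing import List


def segment_hashes(data: bytes, segment_size: int) -> List[str]:
    """SHA-256 digest of each fixed-size segment of data."""
    return [
        hashlib.sha256(data[i:i + segment_size]).hexdigest()
        for i in range(0, len(data), segment_size)
    ]


def changed_segments(old: bytes, new: bytes, segment_size: int) -> List[int]:
    """Indices of segments in `new` that differ from, or are absent in, `old`."""
    old_hashes = segment_hashes(old, segment_size)
    new_hashes = segment_hashes(new, segment_size)
    return [
        index
        for index, digest in enumerate(new_hashes)
        if index >= len(old_hashes) or old_hashes[index] != digest
    ]


if __name__ == "__main__":
    film = b"a" * 1000
    # Overwrite ten bytes in the middle of the "film".
    edited = film[:450] + b"b" * 10 + film[460:]
    print(changed_segments(film, edited, 100))
```

Only the segments listed need to travel over the network. One caveat with fixed-size segments: inserting or deleting bytes shifts everything after the edit, so every later segment would look changed.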

 

In our current age of NSA surveillance and CISPA legislation, privacy and security have become major concerns. BitTorrent is unique among cloud storage applications because it’s only a protocol, not a storage system. “There’s no copy of (your files) on our cloud.” “If the government would be to subpoena our records, we wouldn’t have anything to turn over.”


The lack of centralized storage is both a blessing and a curse. On one hand, your files never leave your possession; on the other hand, there’s no backup other than your own. There are ways to mitigate this issue, such as syncing to a second machine that you control, but it’s something to keep in mind. I want to emphasize again that BitTorrent is only a protocol, not a storage service.

Finally, I want to mention another firm that uses a similar technology, or at least one that I found interesting. Bitcasa acts as a traditional cloud storage service, but it’s developing a technology called Infinite Drive: when a duplicate file is uploaded, the service searches its database for an existing copy and earmarks the original instead of storing the duplicate.

If two teenage girls upload the same Justin Bieber album, only one copy is actually stored on Bitcasa's servers. When the album is downloaded, both users access the same file. The technology is interesting, but there are significant areas of concern. If one user alters the album, how would that affect the other users who share the same download address? I don’t think this technology is ready for widespread use, but it’s something to look out for in the future.
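For the curious, here is a toy Python sketch of deduplicated storage, keyed by a hash of each file's contents. It is a generic illustration, not Bitcasa's implementation; the `DedupStore` class and its methods are invented for this example. It also shows one common answer to the question above: an edited file hashes to a new key, so the other user's copy is left untouched.

```python
import hashlib
from typing import Dict, Tuple


class DedupStore:
    """Stores each distinct file body once, keyed by its SHA-256 digest."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}                # digest -> contents
        self.user_files: Dict[Tuple[str, str], str] = {}  # (user, name) -> digest

    def upload(self, user: str, name: str, data: bytes) -> bool:
        """Record the file for this user; return True if new bytes were stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        self.user_files[(user, name)] = digest
        return is_new

    def download(self, user: str, name: str) -> bytes:
        """Return whatever contents this user's file currently points to."""
        return self.blobs[self.user_files[(user, name)]]


if __name__ == "__main__":
    store = DedupStore()
    store.upload("alice", "album.zip", b"original songs")
    store.upload("beth", "album.zip", b"original songs")   # duplicate, not stored again
    store.upload("alice", "album.zip", b"alice's remix")   # alice edits her copy
    print(len(store.blobs))
    print(store.download("beth", "album.zip"))
```

This sketch never deletes anything, so contents that no user points to anymore would pile up; a real service would need some form of cleanup.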


It’s funny; it seems like only yesterday that Dropbox launched, but this Generation Y technology has already become old news. Earlier in this article, I inadvertently used the word “traditional” to describe cloud storage; I might as well get out the cane and monocle while I'm at it. The future of file storage will take advantage of Peer-to-Peer transfer protocols. P2P is an ingenious method that crowdsources the distribution of large files, and its capabilities are greatly magnified by BitTorrent’s segmentation technology.

Peer-to-Peer file transfer is secure, but more importantly, it’s efficient.