Top tech companies embrace open source: Facebook, Google, Twitter and LinkedIn open source 12 technologies

1. Google – MapReduce
2. Google – Kubernetes
3. Google – TensorFlow
4. Facebook – Open Compute Project
5. Google – Open Compute Project
6. Facebook – Big Sur
7. Facebook – Torch
8. Facebook – Cassandra
9. Twitter – Aurora, Storm
10. Netflix – Chaos Monkey
11. LinkedIn – Kafka
12. Airbnb – Airflow
These companies have to manage data on an exceptional scale, given that they own some of the largest data centres in the world, and doing so requires innovative methods. In some cases, those methods slowly filter down to a much wider range of businesses. So why do they disclose their trade secrets to the wider world? By giving outsiders a glimpse of their internal systems, these internet firms gain access to a lively community of developers who can improve their technology for free. Facebook, for instance, has claimed that its Open Compute Project has saved it $2 billion in data centre costs. Google has now joined Facebook’s Open Compute Project too, providing more insight into the operation of its huge data centres and allowing others to take advantage of its energy-efficient hardware designs. Here are some of the innovative technologies from the big web companies that are now seeing wider use.
1. Google – MapReduce
Google has released more than 20 million lines of code and hundreds of open source projects. The MapReduce programming model is one of its most prominent creations, allowing the company to crunch huge data sets across large clusters of servers. Although MapReduce is no longer used at Google, its legacy lives on: together with the Google File System, it inspired the open source Hadoop platform. Since its creation by former Yahoo employee Doug Cutting, Hadoop has become widely used in recent years, with several IT vendors selling their own services based on the software.
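To give a feel for the programming model, here is a minimal single-process sketch of the classic word count example in Python. It is an illustration only, not Google's or Hadoop's implementation: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this example, and a real cluster would run the map and reduce steps in parallel on many servers.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # On a cluster, each document (or split) would be mapped on a different server.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    # Each key could likewise be reduced on a different server.
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Because each map call looks at one document and each reduce call looks at one key, the work can be spread across machines without the steps needing to share state.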
2. Google – Kubernetes
Although many point out that containerisation is not new, the technology has been one of the major buzzwords of recent years. Google has relied on its secretive Borg and Omega technologies to run workloads internally for years, and reportedly uses around two billion containers to manage applications in its data centres. These platforms provided the basis for its open source Kubernetes container cluster management platform, which has been publicly available since June 2014. Kubernetes has been adopted by a variety of large businesses looking for a lightweight alternative to virtual machines.
3. Google – TensorFlow
Google’s AI system TensorFlow sits at the heart of the search capabilities in Google Photos, as well as its voice recognition tools and Google Translate. The machine learning tool was open sourced last month to help accelerate wider development around the technology, which is still in its early stages. “It’s a highly scalable machine learning system,” Google CEO Sundar Pichai said of TensorFlow in a blog post. “TensorFlow is faster, smarter, and more flexible than our old system, so it can be adapted much more easily to new products and research.”
4. Facebook – Open Compute Project
Facebook has taken an interesting approach to its open source efforts, concentrating more on hardware than on software. Four years ago, in an attempt to “revolutionise data centre hardware,” the social media giant launched the initiative, which aims to share its data centre design innovations. The company has said of the project: “The result is that today we have open-sourced every major physical component of our data centre stack – a stack that is powerful enough to connect 1.39 billion people around the world and is efficient enough to have saved us $2 billion in infrastructure costs over the last three years. But we’re not finished – not even close.” Initiatives include Yosemite, announced earlier this year as what Facebook claims to be the ‘first open source modular chassis for high powered microservers’. It remains to be seen whether OCP will filter down to more mainstream businesses. Some larger enterprises, such as big banks, are already using it, with Goldman Sachs represented on the board, and a number of tech firms such as Apple and Microsoft have been won over.
5. Google – Open Compute Project
Google has also joined OCP, adding to the ranks of service providers and some of the world’s biggest banks, such as Goldman Sachs and Bank of America. Its first contribution will be a new server rack design that distributes power to servers at 48 volts, compared with the 12 volts that is common in most data centres, which could help cloud data centres reduce their energy bills. “Today’s launch is a first step in a larger effort. We think there are other areas of possible collaboration with OCP,” wrote John Zipfel, technical program manager, Google, in a blog post. “We’ve recently begun engaging the industry to identify better disk solutions for cloud based applications. And we think that we can work with OCP to go even further, looking up the software stack to standardise server and networking management systems.”
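One reason a higher distribution voltage can save energy is basic circuit arithmetic: for the same power, four times the voltage means a quarter of the current, and resistive losses in the conductors scale with the square of the current. The Python sketch below works through that calculation. The rack power and conductor resistance are invented illustrative figures, not numbers from Google's design, and the model ignores conversion stages and everything other than resistive loss.

```python
def line_current(power_watts, volts):
    """Current needed to deliver a given power at a given voltage (I = P / V)."""
    return power_watts / volts


def resistive_loss(power_watts, volts, resistance_ohms):
    """Power dissipated in the conductors as heat (P_loss = I^2 * R)."""
    current = line_current(power_watts, volts)
    return current ** 2 * resistance_ohms


# Hypothetical figures chosen only to make the arithmetic easy to follow.
RACK_POWER_W = 12_000
CONDUCTOR_RESISTANCE_OHM = 0.001

loss_12v = resistive_loss(RACK_POWER_W, 12, CONDUCTOR_RESISTANCE_OHM)
loss_48v = resistive_loss(RACK_POWER_W, 48, CONDUCTOR_RESISTANCE_OHM)

print(f"Loss at 12 V: {loss_12v:.1f} W")
print(f"Loss at 48 V: {loss_48v:.1f} W")
print(f"Reduction factor: {loss_12v / loss_48v:.1f}x")
```

Under these assumptions the conductor loss at 48 volts is one sixteenth of the loss at 12 volts, whatever resistance value is plugged in.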
6. Facebook – Big Sur
Facebook recently extended its open source hardware concept with a server design aimed directly at AI use cases, dubbed Big Sur. By sharing its server blueprints, the social network went a step further than Google, which recently decided to open source its own AI software library, TensorFlow. Facebook has also recently started to build custom servers based around Nvidia GPUs – chips that were originally designed for rendering computer game graphics but have proved to be well suited to deep learning.
7. Facebook – Torch
Facebook uses deep learning internally, for instance to filter information in Facebook feeds, and has open-sourced some of the modules it developed for the Torch deep learning framework. The algorithms created by its Facebook AI Research (FAIR) team are claimed to be faster than those already available in Torch, which is also used by others such as Google and Twitter.
8. Facebook – Cassandra
Facebook engineers Avinash Lakshman and Prashant Malik created the non-relational database Cassandra to power the social network’s inbox search function. On its public release in 2008, Lakshman said, “Cassandra is a distributed storage system for managing structured data that is designed to scale to a very large size across many commodity servers, with no single point of failure.” Even though Facebook no longer uses Cassandra itself, it is used within other large tech firms such as Apple, Twitter, and Netflix, while five-year-old software firm DataStax is helping promote the technology among more traditional enterprises.
9. Twitter – Aurora, Storm
The social media firm is a major open source software user, and has given back to the community in several ways. Former Google developer Bill Farner created its Aurora framework, taking his lead from Google’s Borg system. Aurora builds on top of Apache Mesos and offers common features that allow any site to run large-scale production applications. It is able to make scheduling decisions, such as moving a service onto a healthy machine in the event of a failure, which improves reliability. “Aurora is software that keeps services running in the face of many types of failure, and provides engineers a convenient, automated way to create and update these services,” the company said in a blog post. A number of other companies are now using the software. Other Twitter projects include Bootstrap and Storm, the latter of which is used to analyse large-scale data streams generated by millions of Twitter feeds.
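The kind of scheduling decision described above, moving a service onto a healthy machine after a failure, can be sketched in a few lines of Python. This is a toy illustration under simple assumptions, not Aurora's actual algorithm or API: the reschedule function, its arguments and the least-loaded placement rule are all invented for this example.

```python
def reschedule(assignments, healthy):
    """Return new assignments with every service on an unhealthy machine moved.

    assignments: dict mapping service name -> machine name
    healthy: set of machine names currently passing health checks
    """
    if not healthy:
        raise RuntimeError("no healthy machines available")

    # Count how many services each healthy machine already runs.
    load = {machine: 0 for machine in healthy}
    for machine in assignments.values():
        if machine in load:
            load[machine] += 1

    updated = {}
    for service, machine in sorted(assignments.items()):
        if machine not in healthy:
            # Assumed policy: pick the least-loaded healthy machine,
            # breaking ties by machine name so the result is deterministic.
            machine = min(load, key=lambda m: (load[m], m))
            load[machine] += 1
        updated[service] = machine
    return updated
```

For example, if "web" and "worker" run on a machine that stops responding, calling reschedule with the remaining healthy machines returns a new mapping in which both services have been placed elsewhere, while services on healthy machines stay where they are.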
10. Netflix – Chaos Monkey
As a major AWS user, Netflix wanted a way to test the resilience of its applications running in the cloud. Chaos Monkey was developed to deliberately cause problems with virtual machines hosted by the public cloud provider. It belongs to the company’s Simian Army, a set of tools for testing that its systems are able to respond to random failures on the network.
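The idea can be illustrated with a small simulation in Python: pick a running instance at random, terminate it, and then check that the service still meets its health requirement. This is a sketch of the concept only and does not use Netflix's code or any cloud provider's API. The Service class, the unleash_monkey function and the instance IDs are hypothetical.

```python
import random


class Service:
    """A toy service backed by a pool of interchangeable instances."""

    def __init__(self, instance_ids):
        self.instances = set(instance_ids)

    def terminate(self, instance_id):
        # Removing the id stands in for shutting down a virtual machine.
        self.instances.discard(instance_id)

    def is_healthy(self, minimum=1):
        # The service survives as long as enough instances remain.
        return len(self.instances) >= minimum


def unleash_monkey(service, rng):
    """Terminate one randomly chosen instance and return its id (or None)."""
    if not service.instances:
        return None
    victim = rng.choice(sorted(service.instances))
    service.terminate(victim)
    return victim


service = Service(["i-1", "i-2", "i-3"])
victim = unleash_monkey(service, random.Random())
print(f"Terminated {victim}; service healthy: {service.is_healthy(minimum=2)}")
```

A service that only stays healthy when no instance ever fails would be exposed by a test like this long before a real outage does the same job in production.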
11. LinkedIn – Kafka
Kafka was created by business networking site LinkedIn for internal use before being open sourced in 2011. The team of engineers that built the real-time, distributed messaging system left the company last year to start a new business focused on Kafka, called Confluent. Kafka counts a number of large tech companies among its users, such as Netflix, Spotify, and Uber, as well as more traditional enterprises looking to modernise their operations, such as William Hill.
12. Airbnb – Airflow
Last year, at its OpenAir engineering summit, the home-rental firm revealed plans to open source two data mining tools, Airflow and Aerosolve. Airflow is a data workflow management framework, available under the Apache licence, that supports the authoring, scheduling and monitoring of data pipelines. Airbnb has also opened up its Aerosolve machine learning tool, which is used internally to support features such as its price suggestion engine for those renting out properties.
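At the core of a workflow manager of this kind is the idea that a pipeline is a set of tasks with dependencies, and each task runs only after everything it depends on has finished. The Python sketch below shows that idea using the standard library's graphlib module (Python 3.9 or later). It does not use Airflow's own API: the run_pipeline function and the task names in the example are invented for illustration, and real schedulers add timing, retries and monitoring on top.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies):
    """Run each task once, after all tasks it depends on; return the run order.

    tasks: dict mapping task name -> zero-argument callable
    dependencies: dict mapping task name -> set of upstream task names
                  (every upstream name is assumed to be a key in tasks)
    """
    graph = {name: dependencies.get(name, set()) for name in tasks}
    # static_order() yields each task only after all of its upstream tasks.
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        tasks[name]()
    return order


if __name__ == "__main__":
    steps = ["extract", "clean", "train", "report"]
    tasks = {name: (lambda name=name: print(f"running {name}")) for name in steps}
    dependencies = {"clean": {"extract"}, "train": {"clean"}, "report": {"clean"}}
    run_pipeline(tasks, dependencies)
```

In this example "train" and "report" both wait for "clean", which in turn waits for "extract"; the relative order of "train" and "report" is left to the sorter because neither depends on the other.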