In this post we take a look at Facebook’s latest upgrade, Big Basin, to understand how some of the biggest tech giants are preparing for the onslaught of AI and big data. By preparing our server infrastructure to handle the need for more processing power and better storage, we can make sure our organizations stay in the lead.
Facebook’s New Server Upgrade
Earlier this year Facebook introduced its latest server hardware upgrade, Big Basin. This GPU powered hardware system replaces Big Sur, which was Facebook’s first system dedicated to machine learning and AI from 2015. Big Basin is designed to train neural models that are 30% larger so Facebook can experiment faster and more efficiently. This is achieved through greater arithmetic throughput and a memory increase from 12GB to 16GB.
One major feature of Big Basin is the modularity of each component. This allows new technologies to be added without a complete redesign. Each component can be scaled independently depending on the needs of the business. This modularity also makes servicing and repairs more efficient, requiring less downtime overall.
Why does Facebook continue to invest in fast multi-GPU servers? Because it understands that the business depends on it. Without top-of-the-line hardware, Facebook can’t continue to lead the market in AI and Big Data. Let’s dive into each of these areas separately to see how they apply to your business.
Facebook’s Big Basin server was designed with AI in mind. It makes complete sense when you look at their AI first business strategy. Translations, image searching, and recommendation engines all rely on AI technology to enhance the user experience. But you don’t have to be Facebook to see the benefit of using AI for business.
Companies are turning to AI to assist data scientists in identifying trends and recommending strategies for the company to focus on. Technology like idiomatic can crunch through a huge number of unsorted customer conversations to pull out useful quantitative data. Unlocking the knowledge that lives in unstructured conversations with customers can empower the Voice of the Customer team to make strong product decisions. PWC uses AI to model complex financial situations and identify future opportunities for each customer. They can look at current customer behavior and determine how each segment feels about using insurance and investment products, and how that changes over time. Amazon Web Services uses machine learning to predict future capacity needs. In 2015, a study suggested that 25% of companies currently use AI, or would in the next year, to enable better business decision making.
But all of this relies on the technological ability to enable AI in your organization. What does that mean in practice? Essentially, top of the line GPUs. For simulations that require the same data or algorithm run over and over again, GPUs far exceed the capabilities of CPU computing. While CPUs handle the majority of the code, sending any code that requires parallel computation to GPUs massively improves speed. AI requires computers to run simulations many, many times over, similar to password-cracking algorithms. Because the simulations are very similar, you can tweak each variable slightly and take advantage of the GPU shared memory to run many more simulations much faster. This is why Big Basin is a GPU based hardware system – it’s designed to crunch enormous amounts of data to power their AI systems. To get an idea of the power involved, take a look at this:
Processing speed is especially important for deep learning and AI because of the need for iteration. As engineers see the results of experiments, they make adjustments and learn from mistakes. If the processing is too slow, a deep-learning approach can become disheartening. Improvement is slow, a return on investment seems far away and engineers don’t gain practical experience as quickly, all of which can drastically impact business strategy. Say you have a few hypotheses that you want to test when building your neural network. If you aren’t using top quality GPUs, you’ll have to wait a long time between testing each hypothesis, which can draw out development for weeks or months. It’s worth the investment in fast GPUs.
Data can come from anywhere. Your Internet-of-Things toaster, social media feeds, purchasing trends or attention-tracking advertisements are all generating data at a far higher rate than we’ve ever seen before. The last estimate is that digital data created worldwide would grow from 4.4 zettabytes in 2013 to 44 zettabytes by 2020. A zettabyte of data is equal to about 250 billion DVDs, and this growth is coming from everywhere. For example – a Ford GT generates about 100GB of data per hour.
The ability to make this influx of data work for you depends on your server infrastructure. Even if you’re collecting massive amounts of data, it’s not worth anything if you can’t analyze it, and quickly. This is where big data relies on technology. Facebook uses big data to drive its leading ad-tech platform, making advertisements hyper-targeted.
As our data storage needs to expand to handle Big Data, we need to keep two things in mind: accessibility and compatibility. Without a strong strategy, data can become fragmented across multiple servers, regions, and formats. This makes it incredibly difficult to form any conclusive analysis.
Just as AI relies on high GPU computing power to run neural network processing, Big Data relies on quick storage and transport systems to retrieve and analyze data. Modular systems tend to scale well and also allow DevOps teams to work on each component separately, leading to more flexibility. Because so much data has to be shuttled back and forth, investing in secure 10-gigabit connections will make sure your operation has the power and security to last. These features can be grouped into the 3 vs: data storage capacity (volume), rapid retrieval (velocity), and analysis (verification).
Big data and AI work together to superpower your strategy teams. But to function well, your data needs to be accessible and your servers need to be flexible enough to handle AI improvements as fast as they come. Which, it turns out, is pretty quick.
What This Means For Your Business
Poor server infrastructure should never be the reason your team doesn’t jump on opportunities that come their way. If Facebook’s AI team wasn’t able to “move fast and break things” because their tools couldn’t keep up with neural network processing demands, they wouldn’t be where they are today.
As AI and Big Data continue to dominate the business landscape, server infrastructure needs to stay flexible and scalable. We have to adopt new technology quickly and need to be able to scale existing components to keep up with ever increasing data collection requirements. Clayton Christensen recently tweeted, “Any strategy is (at best) only temporarily correct.” When strategy changes on a dime, your technology stack better keep.
Facebook open sources all of its hardware design specifications, so head on over and check it out if you’re looking for ways to stay flexible and ready for the next big business advantage.