Every two days, we now create as much data as we did from the dawn of civilization up until 2003: roughly five exabytes of information. Eric Schmidt, Google’s CEO at the time, shared that statistic at a conference in 2010, and the growth shows no sign of slowing down. As the amount and complexity of data keep increasing, we need better hardware systems to store and analyse it, and speed is one of the primary concerns in big data analytics. That is why Cray, a US-based computer manufacturer providing enterprise-level computing solutions, built the Urika-GX, a supercomputer designed with big data in focus. The company has just unveiled the Urika-GX analytics system, which ships with open source software to help enterprises overcome big data challenges.
Urika-GX is an open, enterprise framework designed primarily for the analytics market. The system gives data scientists higher performance and the ability to derive insights from huge amounts of data quickly. Highly interactive analytics, integrated graph analysis and rapid pattern matching are among its most notable improvements. Here is more about the upcoming data powerhouse.
Companies now make more and more of their decisions on a data-driven basis, because data-driven decisions give them a competitive advantage. To truly gain that advantage, they need to make these decisions more often, more quickly and as flexibly as possible. This is the thinking that led to Urika-GX, and from what we’ve heard so far, it lives up to expectations. The GX will enable companies to test different hypotheses concurrently and get results quickly.
The Urika-GX is meant for the data scientist who wants to do high-quality analysis and data discovery with standard tools like Spark and Hadoop, running on turbocharged hardware built for data analytics. Its makers believe the improved performance can help researchers go deeper into the data and find new relationships, patterns and dependencies.
Under the hood
The Urika-GX features top-of-the-line hardware. It is powered by Intel’s Broadwell processors, with up to 1,728 cores per deployment. It supports up to 35 terabytes of local SSD storage and up to 22 terabytes of memory for high-speed data handling. Cray’s Aries supercomputing interconnect provides the network performance needed for the most demanding big data problems. The initial configuration options will feature 16, 32 or 48 nodes in an industry-standard 42U 19-inch rack. The system runs a standard Apache software stack with Hadoop and Spark pre-integrated. The base software stack runs on Linux, specifically a CentOS-based system trimmed down to a minimal, lightweight OS.
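To put the configuration options in perspective, here is a rough back-of-the-envelope sketch in Python. It rests on two assumptions the article does not confirm: that the quoted core, memory and SSD figures describe the largest 48-node configuration, and that resources scale evenly with node count. The function name `estimate_config` is made up for illustration.

```python
# Rough sizing sketch. ASSUMPTIONS (not confirmed by the article):
#   1. The quoted figures describe the largest, 48-node configuration.
#   2. Cores, memory and SSD capacity scale evenly with node count.
MAX_NODES = 48
MAX_CORES = 1728
MAX_MEMORY_TB = 22
MAX_SSD_TB = 35


def estimate_config(nodes):
    """Estimate the resources of a configuration with `nodes` nodes."""
    return {
        "nodes": nodes,
        "cores": MAX_CORES * nodes // MAX_NODES,
        "memory_tb": round(MAX_MEMORY_TB * nodes / MAX_NODES, 1),
        "ssd_tb": round(MAX_SSD_TB * nodes / MAX_NODES, 1),
    }


# Print an estimate for each of the three announced configuration sizes.
for node_count in (16, 32, 48):
    print(estimate_config(node_count))
```

Under these assumptions each node would carry 36 cores, which is consistent with a dual-socket Broadwell server, but actual per-node specifications may differ.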
Performance is the main advantage of the Urika-GX. In Cray’s own benchmark tests against a major cloud provider, the system was twice as fast on simple tasks such as loading and partitioning data, and four times faster on complex tasks such as PageRank (the link-analysis algorithm Google uses to rank web pages). This improved performance and usability can make data analytics faster and less complicated.
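For readers unfamiliar with PageRank, the idea can be sketched in a few lines of plain Python. This is a minimal textbook-style power iteration on a made-up four-page graph, not the benchmark code Cray ran (the article does not describe it); the function name, the `toy_web` data and the parameter values are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Every linked-to page must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank everywhere.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: page "C" is linked to by every other page.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

Each iteration touches every link in the graph, which is why PageRank on web-scale data is a demanding test of both memory and interconnect speed, and a common choice for benchmarking analytics systems.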
As the scope, size and complexity of big data analytics rapidly increase, the problems and challenges associated with it are also on the rise. The growing pressure to get faster insights from data, combined with the struggle to support newer applications, demands a system that can take higher loads. Because the Urika-GX runs on an open source framework, enterprises are free to test and adopt newer technologies on the system.
The Cray Urika-GX system gives enterprises a powerful tool for delivering insights from data at faster rates. The system is presently being tested in the healthcare, cybersecurity and life sciences industries. The initial versions are expected to be available by Q3 2016, and larger ones will be out by the end of the year. With hardware fine-tuned for performance and highly optimized software, the new supercomputer is set to make big data analytics smoother and faster.