New Graph Database Performance Benchmark Confirms Graph Databases Are Ready For Solving Real-World Business Intelligence, Data Challenges
TigerGraph Consistently Outperforms Neo4j in Graph Data Management Performance Study
TigerGraph, the only scalable graph database for the enterprise, announced the results of a comprehensive graph data management study by University of California researchers that compared TigerGraph and Neo4j, measuring each system’s performance against a key industry benchmark. This benchmark determines a graph database’s data management capabilities using business intelligence (BI) queries.
The study is a complete implementation of the Linked Data Benchmark Council Social Network Benchmark (LDBC SNB), considered the reference standard for evaluating graph technology. It compared two native graph database systems, TigerGraph and Neo4j, in their loading, storage and execution of 46 queries spanning short-running (OLTP) and long-running (OLAP) workloads in three categories: interactive short, interactive complex, and business intelligence. The business intelligence queries explore large portions of the graph in search of occurrences of patterns that combine both structural and attribute predicates of varying complexity.
The study found that TigerGraph consistently outperformed Neo4j, in some cases by more than a factor of 100, with the gap widening as the data size grew. Neo4j was able to complete only 12 of the 25 sophisticated BI queries in a reasonable time (five hours). An example of a BI query is one that finds all the comments which are in response to a particular post or set of posts, and then adds up the number of comments or posts by person. Given that people can add replies to each comment, the graph traversal can be both deep (10 or more hops) and variable in depth. Since BI graph queries can be computationally and logically complex, the LDBC BI benchmark is a good measure of a graph database’s real-world ability to operate at scale. Built-in parallelism in a graph database such as TigerGraph is key to efficiently answering these complex BI queries.
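To make the example BI query above concrete, here is a minimal Python sketch of the same idea: starting from a set of posts, follow reply links to any depth and count the resulting comments per person. It is only an illustration under stated assumptions. The function name, the dictionary-based graph layout and the toy data are all hypothetical, and this is neither the LDBC query definition nor how TigerGraph or Neo4j executes such a query.

```python
from collections import Counter, defaultdict, deque


def count_replies_by_person(reply_of, author_of, root_posts):
    """Count, per person, the comments that reply directly or transitively
    to any of the given root posts (hypothetical helper for illustration)."""
    # Invert the child -> parent reply edges into parent -> children lists.
    children = defaultdict(list)
    for child, parent in reply_of.items():
        children[parent].append(child)

    counts = Counter()
    seen = set(root_posts)
    queue = deque(root_posts)
    # Breadth-first traversal: the reply depth is not known in advance.
    while queue:
        message = queue.popleft()
        for reply in children.get(message, ()):
            if reply not in seen:
                seen.add(reply)
                counts[author_of[reply]] += 1
                queue.append(reply)
    return counts


# Toy data (assumed for illustration): c1..c4 sit in the reply tree of
# post p1, while c5 replies to a different post, p2.
reply_of = {"c1": "p1", "c2": "c1", "c3": "c2", "c4": "p1", "c5": "p2"}
author_of = {
    "p1": "alice", "p2": "bob",
    "c1": "bob", "c2": "carol", "c3": "bob", "c4": "carol", "c5": "alice",
}

result = count_replies_by_person(reply_of, author_of, ["p1"])
print(dict(result))
```

On a single machine this traversal visits every reply once. At the benchmark's larger scale factors the same pattern touches a large share of the graph, which is why the text above points to parallel execution as the deciding factor.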
The study, performed by University of California, Merced computer scientists Florin Rusu and Zhiyi Huang, is the first complete test of graph database vendors’ performance with intensive analytical and transactional workloads. In addition to thoroughly sizing up the performance of the 46 queries on four data input scale factors, from 1GB to 1TB, the study also measured bulk loading time and storage size.
As such, the study is a unique assessment of graph analytics platforms’ ability to handle real-world data challenges in real time, regardless of how large or complex the data set is. The power to execute increasingly arduous computations in real time is crucial for many of today’s most important applications, such as fraud and money laundering detection, customer 360, security analytics, hyper-personalized recommendation engines, artificial intelligence and machine learning.
In thoroughly covering the full spectrum of the LDBC SNB benchmark – from interactive short to BI – the study addressed what Gartner recently described as a need for data and analytics leaders to “consider the ecosystem holistically in order to get value most efficiently from your data and analytics landscape” and that “it no longer makes sense to evaluate analytic and operational use cases in isolation.”
Key findings include:
- TigerGraph is the first scalable graph database to demonstrate both analytical and transactional processing capability. In the graph database market, vendors can handle either OLTP or OLAP, but not both – until now. As the first evaluation of performance with both interactive and sophisticated BI query workloads, the study demonstrated TigerGraph’s unprecedented ability to scale to ever-larger data sets when performing complex BI queries, even those involving graph patterns, which are considered harder than relational joins.
- Only TigerGraph could scale to 1TB. TigerGraph was shown to consistently outperform Neo4j on the majority of queries, and by two or more orders of magnitude (a factor of 100 or more) on certain interactive complex and business intelligence queries. On the larger datasets, Neo4j often timed out, unable to complete the query within five hours.
- TigerGraph shows significantly compressed storage on typed property graph data as compared to Neo4j. While bulk loading time is comparable for these two databases, the storage sizes are significantly different, with Neo4j’s being approximately four times that of TigerGraph in all settings.
“This study confirms that graph is ready for complex business intelligence and analytics in addition to its known operational capabilities,” said Dr. Yu Xu, CEO and founder, TigerGraph. “TigerGraph’s customers in pharmaceutical, healthcare, financial services, internet, telecom, and government have been using our native parallel graph architecture to blend operational and analytics capabilities on the same graph platform to deliver innovative applications with new capabilities. It’s great to see an independent benchmark from the University of Merced confirming the maturity of graph technology to facilitate broader industry adoption.”