Neo4j Introduces Property Sharding to Boost Scalability and Support Mixed Workloads
Neo4j has announced a significant advancement in its graph database technology with the introduction of property sharding, a feature aimed at overcoming previous scalability challenges and enabling simultaneous transactional and analytical workloads on a single system.
Infinigraph: A New Distributed Graph Architecture
Earlier this month, Neo4j launched Infinigraph, an innovative distributed graph architecture designed for its self-managed platform. This system is also expected to be available soon in AuraDB, Neo4j's fully managed cloud service. Infinigraph is tailored for enterprise users such as BT Group, Dun & Bradstreet, and BASF, emphasizing scalable and versatile graph database capabilities.
What Sets Infinigraph Apart?
- Graph Data Model: Graph databases organize data via nodes and edges, making them ideal for representing complex relationships—like connections among companies, individuals, or social media interactions.
- Enhanced Scalability: With property sharding, the graph structure (nodes and relationships) is kept together in a single shard, while the properties attached to it are distributed across multiple machines. This allows horizontal scaling without compromising search and traversal performance.
- Unified Workloads: Unlike traditional systems, Infinigraph can handle both real-time transactional operations and deep analytical queries within the same environment, eliminating the need for separate extract-transform-load (ETL) processes or redundant infrastructure.
> Sudhir Hasbe, Technology President at Neo4j: “Infinigraph sets a new standard for enterprise graph databases: one system that runs real-time operations and deep analytics together, at full fidelity and massive scale.”
Property Sharding Explained
The core innovation, property sharding, keeps the graph's structure together as a single cohesive unit and partitions the properties attached to it across distributed machines. Because traversals, which are critical for analysis, follow the structure rather than the property data, they can stay within a single shard, which is intended to preserve performance as the data grows.
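The idea of keeping structure together while spreading properties out can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Neo4j's implementation: the names `GraphShard` and `PropertyStore` and the modulo placement rule are hypothetical. It shows a traversal that reads only the structure shard, with properties looked up afterwards from whichever property shard holds them.

```python
from dataclasses import dataclass, field


@dataclass
class GraphShard:
    """Hypothetical structure shard: node ids and adjacency only, no properties."""
    adjacency: dict[int, list[int]] = field(default_factory=dict)

    def add_edge(self, src: int, dst: int) -> None:
        self.adjacency.setdefault(src, []).append(dst)
        self.adjacency.setdefault(dst, [])

    def neighbours_within(self, start: int, hops: int) -> set[int]:
        """Breadth-first traversal that never leaves this shard."""
        seen = {start}
        frontier = {start}
        for _ in range(hops):
            frontier = {n for f in frontier for n in self.adjacency.get(f, [])} - seen
            seen |= frontier
        return seen


class PropertyStore:
    """Hypothetical set of property shards, each standing in for one machine."""

    def __init__(self, shard_count: int) -> None:
        self.shards: list[dict[int, dict]] = [{} for _ in range(shard_count)]

    def _shard_for(self, node_id: int) -> dict[int, dict]:
        # Assumed placement rule: node id modulo shard count.
        return self.shards[node_id % len(self.shards)]

    def put(self, node_id: int, props: dict) -> None:
        self._shard_for(node_id)[node_id] = props

    def get(self, node_id: int) -> dict:
        return self._shard_for(node_id).get(node_id, {})


graph = GraphShard()
props = PropertyStore(shard_count=3)
for src, dst in [(1, 2), (2, 3), (3, 4)]:
    graph.add_edge(src, dst)
for node_id, name in [(1, "Acme"), (2, "Birch"), (3, "Cobalt"), (4, "Delta")]:
    props.put(node_id, {"name": name})

# Step 1: traverse using structure only (one shard, no property reads).
reachable = graph.neighbours_within(1, hops=2)
# Step 2: fetch properties only for the nodes the traversal returned.
names = sorted(props.get(n)["name"] for n in reachable)
print(sorted(reachable))  # [1, 2, 3]
print(names)              # ['Acme', 'Birch', 'Cobalt']
```

In this sketch the traversal cost does not depend on how many property shards exist, which is the property the design is described as preserving; a real system also has to handle replication, transactions, and property-based filtering during traversal, none of which is modelled here.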
Limitations and Industry Perspective
While Infinigraph addresses many scalability issues, industry experts like Robin Schumacher from Gartner suggest that the system might not fully replace traditional relational database management systems (RDBMS) for transactional workloads. He notes:
> “Neo4j’s scalability struggles are well-known, and Infinigraph may help tackle larger graph workloads, but it’s unlikely to displace RDBMS for all transactional use cases.”
Historically, scalability issues have influenced enterprise choices. For instance, Jaguar Land Rover opted for TigerGraph over Neo4j in 2021 due to performance concerns with large distributed graphs.
Cost and Competition
Cost remains a concern for many users. Notably, NASA shifted from Neo4j to Memgraph, citing cost advantages after evaluating the total cost of ownership. This highlights ongoing competition and the importance of balancing performance with budget constraints.
Broader Context
Some experts claim that dedicated graph databases might not be necessary for all graph tasks. For example, Carnegie Mellon University’s Andy Pavlo argues that relational databases like PostgreSQL—with extensions like Apache AGE—can perform graph queries effectively, broadening the options for data professionals.
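As a rough illustration of the general point that a relational engine can answer a graph-style question, the sketch below runs a reachability query with a standard SQL recursive common table expression. It is a stand-in under stated assumptions: it uses SQLite through Python's built-in `sqlite3` module rather than PostgreSQL or Apache AGE, and the `edges` table, the sample rows, and the helper names are hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical edge list."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO edges (src, dst) VALUES (?, ?)",
        [
            ("alice", "bob"),
            ("bob", "carol"),
            ("carol", "alice"),  # cycle back to the start
            ("dave", "erin"),    # separate component
        ],
    )
    return conn


def reachable_from(conn: sqlite3.Connection, start: str) -> list[str]:
    """Return every node reachable from `start`, including `start` itself."""
    query = """
        WITH RECURSIVE reachable(node) AS (
            SELECT ?
            UNION  -- UNION (not UNION ALL) drops duplicates, so cycles terminate
            SELECT e.dst
            FROM edges AS e
            JOIN reachable AS r ON e.src = r.node
        )
        SELECT node FROM reachable ORDER BY node
    """
    return [row[0] for row in conn.execute(query, (start,))]


conn = build_demo_db()
print(reachable_from(conn, "alice"))  # ['alice', 'bob', 'carol']
print(reachable_from(conn, "dave"))   # ['dave', 'erin']
```

Whether this approach performs well enough at scale is exactly what the debate is about; the sketch only shows that the query is expressible in plain SQL.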
---
Summary
Neo4j's property sharding and Infinigraph mark a notable step toward addressing scalability and workload integration challenges in graph databases. While promising, the industry continues to evaluate the best fit for specific use cases, balancing factors like cost, performance, and complexity.