Commit e4c5098

Add documentation for shard global projections
1 parent 820b17a commit e4c5098

1 file changed

+54
-5
lines changed
doc/modules/ROOT/pages/production-deployment/fabric.adoc

Lines changed: 54 additions & 5 deletions
@@ -9,8 +9,10 @@ Neo4j Fabric is a way to store and retrieve data in multiple databases, whether
99
For more information about Fabric itself, please visit the https://neo4j.com/docs/operations-manual/4.4/fabric/introduction/[Fabric documentation].
1010

1111
A typical Neo4j Fabric setup consists of two components: one or more shards that hold the data and one or more Fabric proxies that coordinate the distributed queries.
12-
Currently, the way of running the Neo4j Graph Data Science library in a Fabric deployment is to run GDS on the shards.
13-
Executing GDS on a Fabric proxy is currently not supported.
12+
There are two ways of running the Neo4j Graph Data Science library in a Fabric deployment, both of which are covered in this section:
13+
14+
. Running GDS on a Fabric <<fabric-shard, _shard_>>
15+
. Running GDS on a Fabric <<fabric-proxy, _proxy_>>
1416

1517
[[fabric-shard]]
1618
== Running GDS on the Shards
@@ -75,7 +77,54 @@ The query first connects to the analytical database where the PageRank algorithm
7577
The algorithm results are streamed to the proxy, together with the unique node id.
7678
For every row returned by the first subquery, the operational database is then queried for the person's name, again using the unique node id to identify the `Person` node across the shards.
7779

78-
[[fabric-shard-limitations]]
79-
=== Limitations
8080

81-
* It is not possible to run algorithms across shards.
81+
[[fabric-proxy]]
82+
== Running GDS on the Fabric Proxy
83+
84+
In this mode of using GDS in a Fabric environment, the GDS operations are executed on the Fabric proxy server.
85+
The graph projections then use the data stored on the shards to construct the in-memory graph.
86+
87+
NOTE: Currently only xref:management-ops/projections/graph-project-cypher-aggregation.adoc[Cypher Aggregation] is supported for projecting in-memory graphs on a Fabric proxy.
88+
89+
Graph algorithms can then be executed on the Fabric proxy, similar to a single machine setup.
90+
This scenario is useful if a graph that logically represents a single graph is distributed across several Fabric shards.
91+
92+
[[fabric-proxy-setup]]
93+
=== Setup
94+
95+
In this scenario we need to set up the proxy to run the Neo4j Graph Data Science library.
96+
97+
The DBMS that manages the Fabric proxy database needs to have the GDS plugin installed and configured.
98+
For more information see xref:installation/index.adoc[Installation].
99+
The proxy node should also be configured to handle the amount of data received from the shards and to execute graph projections and algorithms.
100+
101+
Fabric shards do not need any special configuration, i.e., the GDS library plugin does not need to be installed.
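
One way to check that the plugin is available on the proxy is to call the `gds.version()` function from a session connected to the proxy DBMS.
This is a minimal sketch and assumes the plugin has been installed and the DBMS restarted; it does not verify memory or Fabric settings.

[source, cypher, role=noplay]
----
// Returns the installed GDS version if the plugin is loaded on this DBMS.
// If the plugin is missing, the query fails with an unknown function error.
RETURN gds.version() AS gdsVersion
----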
102+
103+
[[fabric-proxy-examples]]
104+
=== Examples
105+
106+
Let's assume we have a Fabric setup with two shards.
107+
Both shards function as operational databases and hold graphs with the schema `(:Person)-[:KNOWS]->(:Person)`.
108+
109+
We now need to query the shards in order to drive the import process on the proxy node.
110+
111+
[source, cypher, role=noplay]
112+
----
113+
CALL {
114+
USE FABRIC_DB_NAME.FABRIC_SHARD_0_NAME
115+
MATCH (p:Person) OPTIONAL MATCH (p)-[:KNOWS]->(n:Person)
116+
RETURN p, n
117+
UNION
118+
USE FABRIC_DB_NAME.FABRIC_SHARD_1_NAME
119+
MATCH (p:Person) OPTIONAL MATCH (p)-[:KNOWS]->(n:Person)
120+
RETURN p, n
121+
}
122+
WITH gds.alpha.graph.project('graph', p, n) AS graph
123+
RETURN
124+
graph.graphName AS graphName,
125+
graph.nodeCount AS nodeCount,
126+
graph.relationshipCount AS relationshipCount
127+
----
128+
129+
We have now projected a graph with 5 nodes and 4 relationships.
130+
This graph can now be used like any in-memory graph in a standalone GDS deployment.
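
As an illustration, an algorithm could then be run against the projected graph on the proxy.
The following is a sketch only: it assumes the projection above succeeded under the name `graph`, and it uses PageRank in stream mode, as in the shard example.
Whether an additional `USE` clause is needed depends on how the session is connected to the Fabric database, and is not shown here.

[source, cypher, role=noplay]
----
// Run PageRank on the in-memory graph named 'graph' (projected above)
// and return the ten highest-scoring nodes.
CALL gds.pageRank.stream('graph')
YIELD nodeId, score
RETURN nodeId, score
ORDER BY score DESC
LIMIT 10
----

When the graph is no longer needed, it can be removed from memory with `CALL gds.graph.drop('graph')`.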
