MongoDB | Model Tree Structures
In MongoDB, there are several ways to represent tree data structures.
Parent References
Here’s what the tree might look like:
- Comment 1 (id: ObjectId("60925b2f912bf6f590cdd3a3"))
- Comment 2 (id: ObjectId("60925b8a912bf6f590cdd3a6"))
- Comment 3 (id: ObjectId("60925b9c912bf6f590cdd3a7"))
- Comment 4 (id: ObjectId("60925bb4912bf6f590cdd3a8"))
- Comment 5 (id: ObjectId("60925bc4912bf6f590cdd3a9"))
- Comment 6 (id: ObjectId("60925bd5912bf6f590cdd3aa"))
In this approach, each document has a reference to its parent. For example, consider a directory structure where each directory can have subdirectories:
{
"_id": ObjectId("60925b2f912bf6f590cdd3a3"),
"parent": null,
"text": "This is a top level comment"
},
{
"_id": ObjectId("60925b8a912bf6f590cdd3a6"),
"parent": ObjectId("60925b2f912bf6f590cdd3a3"),
"text": "This is a reply to the first comment"
},
{
"_id": ObjectId("60925b9c912bf6f590cdd3a7"),
"parent": ObjectId("60925b8a912bf6f590cdd3a6"),
"text": "This is a reply to the second comment"
},
{
"_id": ObjectId("60925bb4912bf6f590cdd3a8"),
"parent": ObjectId("60925b9c912bf6f590cdd3a7"),
"text": "This is a reply to the third comment"
},
{
"_id": ObjectId("60925bc4912bf6f590cdd3a9"),
"parent": ObjectId("60925b8a912bf6f590cdd3a6"),
"text": "This is a reply to the second comment"
},
{
"_id": ObjectId("60925bd5912bf6f590cdd3aa"),
"parent": null,
"text": "This is a top level comment"
}
In this example, each document has a parent
field that contains the ID of its parent document. The root directory has a null
value for its parent
field.
Child References
Here’s what the tree might look like:
- Electronics (id: ObjectId("60925af9912bf6f590cdd3a1"))
- Phones (id: ObjectId("60925b0a912bf6f590cdd3a2"))
- Smartphones (id: ObjectId("60925b23912bf6f590cdd3a4"))
- Feature Phones (id: ObjectId("60925b31912bf6f590cdd3a5"))
- Computers (id: ObjectId("60925b1a912bf6f590cdd3a3"))
In this approach, each document has a reference to its children. For example, consider a category tree where each category can have subcategories:
{
"_id": ObjectId("60925af9912bf6f590cdd3a1"),
"name": "Electronics",
"children": [
ObjectId("60925b0a912bf6f590cdd3a2"),
ObjectId("60925b1a912bf6f590cdd3a3")
],
"parent": null
},
{
"_id": ObjectId("60925b0a912bf6f590cdd3a2"),
"name": "Phones",
"children": [
ObjectId("60925b23912bf6f590cdd3a4"),
ObjectId("60925b31912bf6f590cdd3a5")
],
"parent": ObjectId("60925af9912bf6f590cdd3a1")
},
{
"_id": ObjectId("60925b1a912bf6f590cdd3a3"),
"name": "Computers",
"children": [],
"parent": ObjectId("60925af9912bf6f590cdd3a1")
},
{
"_id": ObjectId("60925b23912bf6f590cdd3a4"),
"name": "Smartphones",
"children": [],
"parent": ObjectId("60925b0a912bf6f590cdd3a2")
},
{
"_id": ObjectId("60925b31912bf6f590cdd3a5"),
"name": "Feature Phones",
"children": [],
"parent": ObjectId("60925b0a912bf6f590cdd3a2")
}
In this example, each document has a children
field that contains an array of IDs of its child documents. The root category has a null
value for its parent
field.
Materialized Paths
Here’s what the tree might look like:
1
└── 2
├── 3
│ └── 4
└── 5
6
In this approach, each document contains a path field that represents its position in the tree. For example, consider a comment tree where each comment can have child comments:
{
"_id": ObjectId("60925b8a912bf6f590cdd3a6"),
"path": "1/2/",
"text": "This is a reply to the first comment",
},
{
"_id": ObjectId("60925b9c912bf6f590cdd3a7"),
"path": "1/2/3/",
"text": "This is a reply to the second comment",
},
{
"_id": ObjectId("60925bb4912bf6f590cdd3a8"),
"path": "1/2/3/4/",
"text": "This is a reply to the third comment",
},
{
"_id": ObjectId("60925bc4912bf6f590cdd3a9"),
"path": "1/2/5/",
"text": "This is a reply to the second comment",
},
{
"_id": ObjectId("60925bd5912bf6f590cdd3aa"),
"path": "6/",
"text": "This is a top level comment",
}
Each document has a path
field that represents its position in the tree. The path
field contains the IDs of all its ancestors in the tree, separated by slashes. For example, the first comment has a path
of 1/
because it is a child of the top level comment. The second comment has a path
of 1/2/
because it is a child of the first comment, which is a child of the top level comment. The third comment has a path
of 1/2/3/
because it is a child of the second comment, which is a child of the first comment, which is a child of the top level comment.
Nested Sets
The Nested Sets pattern identifies each node in the tree as stops in a round-trip traversal of the tree. The application visits each node in the tree twice; first during the initial trip, and second during the return trip. Example of how to represent the same tree:
{
"_id": ObjectId("60925b2f912bf6f590cdd3a3"),
"left": 1,
"right": 12,
"text": "This is a top level comment"
},
{
"_id": ObjectId("60925b8a912bf6f590cdd3a6"),
"left": 2,
"right": 5,
"text": "This is a reply to the first comment"
},
{
"_id": ObjectId("60925b9c912bf6f590cdd3a7"),
"left": 6,
"right": 9,
"text": "This is a reply to the second comment"
},
{
"_id": ObjectId("60925bb4912bf6f590cdd3a8"),
"left": 10,
"right": 11,
"text": "This is a reply to the third comment"
},
{
"_id": ObjectId("60925bc4912bf6f590cdd3a9"),
"left": 7,
"right": 8,
"text": "This is a reply to the second comment"
},
{
"_id": ObjectId("60925bd5912bf6f590cdd3aa"),
"left": 3,
"right": 4,
"text": "This is a top level comment"
}
Each document has a left
and right
field that represent its position in the tree. The left
and right
values are assigned in a way that guarantees that the range between left
and right
contains all the descendants of the node. For example, the top level comment has a left
of 1 and a right
of 12, which means that all its descendants have a left
value greater than 1 and a right
value less than 12. The first comment has a left
of 2 and a right
of 5, which means that all its descendants have a left
value greater than 2 and a right
value less than 5. The second comment has a left
of 6 and a right
of 9, which means that all its descendants have a left
value greater than 6 and a right
value less than 9. And so on.
Array of Ancestors
One approach is to use the Array of Ancestors model, which involves adding an ancestors
array to each document in the tree that contains the IDs of all its ancestors.
Here’s an example to demonstrate how the Array of Ancestors model works. Let’s say we have a simple tree structure consisting of comments on a blog post. Each comment can have zero or more child comments, forming a hierarchy. The top level comments have no parent and form the root of the tree. Here’s what the tree might look like:
Top Level Comment 1 (_id:60925b2f912bf6f590cdd3a3)
|
├── Reply to Comment 1 (_id:60925b8a912bf6f590cdd3a6)
| ├── Reply to Comment 2 (_id:60925b9c912bf6f590cdd3a7)
| | └── Reply to Comment 3 (_id:60925bb4912bf6f590cdd3a8)
| └── Reply to Comment 2 (_id:60925bc4912bf6f590cdd3a9)
|
Top Level Comment 2 (_id:60925bd5912bf6f590cdd3aa)
In this example, Comment 1 is the top level comment and has two children (Comment 2 and Comment 6). Comment 2 has two children (Comment 3 and Comment 5), and Comment 3 has one child (Comment 4).
To represent this tree using the Array of Ancestors model, we would add an ancestors
field to each comment document that contains an array of ancestor IDs. For example:
{
"_id": ObjectId("60925b2f912bf6f590cdd3a3"),
"ancestors": [], // no ancestors, this is the top level comment
"text": "This is a top level comment"
},
{
"_id": ObjectId("60925b8a912bf6f590cdd3a6"),
"ancestors": [
ObjectId("60925b2f912bf6f590cdd3a3")
],
"text": "This is a reply to the first comment"
},
{
"_id": ObjectId("60925b9c912bf6f590cdd3a7"),
"ancestors": [
ObjectId("60925b2f912bf6f590cdd3a3"),
ObjectId("60925b8a912bf6f590cdd3a6")
],
"text": "This is a reply to the second comment"
},
{
"_id": ObjectId("60925bb4912bf6f590cdd3a8"),
"ancestors": [
ObjectId("60925b2f912bf6f590cdd3a3"),
ObjectId("60925b8a912bf6f590cdd3a6"),
ObjectId("60925b9c912bf6f590cdd3a7")
],
"text": "This is a reply to the third comment"
},
{
"_id": ObjectId("60925bc4912bf6f590cdd3a9"),
"ancestors": [
ObjectId("60925b2f912bf6f590cdd3a3"),
ObjectId("60925b8a912bf6f590cdd3a6")
],
"text": "This is a reply to the second comment"
},
{
"_id": ObjectId("60925bd5912bf6f590cdd3aa"),
"ancestors": [], // no ancestors, this is the top level comment
"text": "This is a top level comment"
}
Each document has an ancestors
field that contains an array of object IDs representing all its ancestors in the tree, starting from the top level comment and ending with its immediate parent. For example, the first comment has an empty ancestors
array because.
Member discussion