An array is embedded in the Mongo aggregate nested array document, which is to aggregate the embedded array.
- MongoDB 3.6
Data sample
{
"_id" : ObjectId("5aa63f7034cdbd6928f831ff"),
"domain" : "metacafe.com",
"cdn" : "LLNW",
"geo_bytes" : [
{
"bytes" : 223,
"geo" : "Malta"
},
{
"bytes" : 446,
"geo" : "Canada"
},
{
"bytes" : 892,
"geo" : "The Philippines"
},
{
"bytes" : 223,
"geo" : "Spain"
}
],
"time" : 1520697600
}
geo_bytes is an embedded array, geo represents region, bytes represents traffic, and time is timestamp. The demand is similar: calculate the country distribution of traffic in a week? , the query description is to sum the bytes field according to the geo field.
Query writing
db.getCollection('daily_coll').aggregate(
{"$match": {
"time": {"$gte": 1520697600, "$lt": 1520697601},
}},
{"$unwind": "$geo_bytes"},
{"$group": {
"_id": {
"geo": "$geo_bytes.geo"
},
"bytes": {"$sum": "$geo_bytes.bytes"}
}},
{"$project":{
"_id": 0,
"geo": "$_id.geo",
"bytes": "$bytes"
}},
{"$sort": {"bytes": -1}}
)
The key is $unwind, Document address Its function is to make the embedded array refer to the current document and make the embedded structure flat.
The above sample data is queried as follows
db.getCollection('daily_coll').aggregate(
{"$match": {
"time": {"$gte": 1520697600, "$lt": 1520697601},
}},
{"$unwind": "$geo_bytes"}
)
As a result, the data will be split into four pieces
{
"_id" : ObjectId("5aa63f7034cdbd6928f831ff"),
"domain" : "metacafe.com",
"cdn" : "LLNW",
"geo_bytes" : {
"bytes" : 223,
"geo" : "Malta"
}
"time" : 1520697600
}
{
"_id" : ObjectId("5aa63f7034cdbd6928f831ff"),
"domain" : "metacafe.com",
"cdn" : "LLNW",
"geo_bytes" : {
"bytes" : 446,
"geo" : "Canada"
}
"time" : 1520697600
}
...
Then we can use the pipeline to aggregate.