
How to sum multidimensional arrays across documents with MongoDB

  •  1
  • JoeSlav  · Tech Community  · 6 years ago

    Suppose I have many documents like the following:

    {
      _id: "abc",
      values: {
        0: { 0: 999999, 1: 999999, …, 59: 1000000 },
        1: { 0: 2000000, 1: 2000000, …, 59: 1000000 },
        …,
        58: { 0: 1600000, 1: 1200000, …, 59: 1100000 },
        59: { 0: 1300000, 1: 1400000, …, 59: 1500000 }
      }
    }
    {
      _id: "def",
      values: {
        0: { 0: 999999, 1: 999999, …, 59: 1000000 },
        1: { 0: 2000000, 1: 2000000, …, 59: 1000000 },
        …,
        58: { 0: 1600000, 1: 1200000, …, 59: 1100000 },
        59: { 0: 1300000, 1: 1400000, …, 59: 1500000 }
      }
    }
    

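    Basically, each document holds a multidimensional array of 60x60 items. Can aggregation (or any other MongoDB construct) be used to easily sum two (or more) of these matrices? That is, values[x][y] of abc and def should be added together, and likewise for every other element. Ideally, the output would be a similar multidimensional array.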

    This answer seems to suggest that it is possible for one-dimensional arrays, but I'm not sure about multidimensional ones.
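
To make the requested operation concrete, here is a minimal sketch in plain JavaScript (not MongoDB code) of the element-wise sum over nested objects. The helper name `sumNested` and the tiny 2x2 sample values are made up for illustration; they only stand in for the 60x60 `values` field described above.

```javascript
// Hypothetical helper: element-wise sum of nested { x: { y: number } } objects.
function sumNested(matrices) {
  const total = {};
  for (const m of matrices) {
    for (const [x, row] of Object.entries(m)) {
      if (!(x in total)) total[x] = {};
      for (const [y, v] of Object.entries(row)) {
        // Missing cells are treated as 0 so ragged inputs still add up.
        total[x][y] = (total[x][y] || 0) + v;
      }
    }
  }
  return total;
}

// Two tiny 2x2 stand-ins for the 60x60 "values" field (made-up numbers).
const abc = { 0: { 0: 1, 1: 2 }, 1: { 0: 3, 1: 4 } };
const def = { 0: { 0: 10, 1: 20 }, 1: { 0: 30, 1: 40 } };

const summed = sumNested([abc, def]);
console.log(JSON.stringify(summed));
// {"0":{"0":11,"1":22},"1":{"0":33,"1":44}}
```

This runs client-side after fetching the documents; the question is whether the same result can be produced inside the database.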

    Edit:

    Here is a sample of the actual data, which has a slightly different format:

    db.col.find({}, { _id: 0, hit: 1 })
    { "hit" : [ [ 570, 0, 630, 630, 636, 735, 672, 615, 648, 648, 618, 0 ], 
    [ 492, 0, 471, 471, 570, 564, 0, 590, 513, 432, 471, 477 ], 
    [ 387, 0, 0, 0, 0, 0, 0, 456, 0, 480, 351, 415 ], 
    [ 432, 528, 0, 0, 495, 509, 0, 579, 0, 552, 0, 594 ], 
    [ 558, 603, 594, 624, 672, 0, 0, 705, 783, 0, 756, 816 ], 
    [ 0, 858, 951, 1027, 0, 0, 1058, 1131, 0, 0, 1260, 1260 ], 
    [ 1269, 0, 1287, 0, 1326, 0, 1386, 1386, 1470, 0, 0, 0 ], 
    [ 1623, 0, 1695, 1764, 1671, 1671, 0, 1824, 1872, 0, 0, 0 ], 
    [ 1950, 1894, 2034, 2034, 0, 0, 1941, 0, 2070, 1911, 2049, 2055 ], 
    [ 2052, 2052, 0, 0, 0, 2085, 2007, 2073, 0, 0, 0, 1941 ], 
    [ 1878, 1896, 0, 1875, 0, 0, 1677, 0, 1722, 0, 1545, 0 ], 
    [ 0, 0, 1317, 1469, 1501, 1634, 1494, 0, 0, 1290, 0, 0 ], 
    [ 0, 1485, 1375, 1491, 1530, 1407, 0, 0, 0, 1611, 0, 0 ], 
    [ 1652, 1800, 1686, 1643, 1923, 0, 0, 0, 1737, 1604, 1797, 0 ], 
    [ 1842, 1806, 0, 1830, 1896, 1947, 0, 1710, 1734, 1725, 0, 0 ], 
    [ 0, 0, 1932, 0, 1908, 1878, 1941, 1931, 2007, 2013, 1995, 1995 ], 
    [ 0, 2025, 2004, 1927, 0, 0, 1939, 1835, 1962, 1863, 0, 1815 ], 
    [ 0, 0, 1839, 1755, 1821, 1821, 1751, 1656, 0, 0, 1467, 0 ], 
    [ 0, 1632, 1546, 1449, 0, 1551, 1449, 0, 0, 1554, 0, 1491 ], 
    [ 1463, 1411, 0, 1491, 0, 0, 1551, 1467, 0, 0, 0, 1464 ], 
    [ 0, 0, 1311, 0, 0, 1471, 0, 0, 1581, 0, 1368, 1368 ], 
    [ 1296, 0, 0, 0, 1176, 1381, 0, 1170, 1194, 1194, 1193, 1137 ], 
    [ 0, 1244, 1221, 1039, 0, 1041, 930, 921, 1033, 813, 0, 0 ], 
    [ 0, 0, 0, 1010, 0, 0, 918, 783, 0, 609, 693, 645 ] ] }
    

    Here is the appropriate query (thanks to Veeram for correcting my code in the comments):

    db.col.aggregate([ 
    { $project: { _id: 0, hit: 1 } }, 
    { $unwind: { path: "$hit", includeArrayIndex: "x" } }, 
    { $unwind: { path: "$hit", includeArrayIndex: "y" } }, 
    { $group: { _id: { x: "$x", y: "$y" }, hit: { $sum: "$hit" } } }, 
    { $sort: { "_id.x": 1, "_id.y": 1 } }, 
    { $group: { _id: "$_id.x", hit: { $push: "$hit" } } }, 
    { $sort: { "_id": 1 } }, 
    { $group: { _id: null, hit: { $push: "$hit" } } } 
    ])
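
What the stages of that query do can be sketched in plain JavaScript. This is only an imitation of the unwind / group / sort / push sequence, not MongoDB code; the function name `sumHits` and the 2x2 sample documents are assumptions made for illustration.

```javascript
// Plain-JavaScript imitation of the pipeline stages; not MongoDB code.
function sumHits(docs) {
  // Both $unwind stages: one record per (x, y, hit) cell.
  const cells = [];
  for (const doc of docs) {
    doc.hit.forEach((row, x) => {
      row.forEach((hit, y) => cells.push({ x, y, hit }));
    });
  }

  // First $group: add up the cells that share the same (x, y).
  const sums = new Map();
  for (const { x, y, hit } of cells) {
    const key = `${x},${y}`;
    const entry = sums.get(key) || { x, y, hit: 0 };
    entry.hit += hit;
    sums.set(key, entry);
  }

  // $sort by x then y; the two $push groups then rebuild rows and the matrix.
  const sorted = [...sums.values()].sort((a, b) => a.x - b.x || a.y - b.y);
  const result = [];
  for (const { x, hit } of sorted) {
    if (!result[x]) result[x] = [];
    result[x].push(hit);
  }
  return result;
}

// Made-up 2x2 documents standing in for the real "hit" matrices.
const docs = [
  { hit: [[1, 2], [3, 4]] },
  { hit: [[10, 20], [30, 40]] },
];
console.log(JSON.stringify(sumHits(docs)));
// [[11,22],[33,44]]
```

Because the indexes produced by `includeArrayIndex` are numbers, sorting on them gives the natural row and column order.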
    
    1 answer  |  as of 6 years ago
  •  1
  •   mickl    6 years ago

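    Handling dynamic property names requires two operators: $objectToArray and $arrayToObject. To sum the values across all documents, you can represent each x,y pair as a separate document (using $unwind) and then use several $group stages to get back to a single document. To restore the original order of rows and columns, you can apply $sort twice: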

    db.col.aggregate([
        {
            $project: {
                values: {
                    $map: {
                        input: { $objectToArray: "$values" },
                        as: "obj",
                        in: { k: "$$obj.k", v: { $objectToArray: "$$obj.v" } }
                    }
                }
            }
        },
        {
            $unwind: "$values"
        },
        {
            $unwind: "$values.v"
        },
        {
            $project: {
                x: "$values.k",
                y: "$values.v.k",
                value: "$values.v.v"
            }
        },
        {
            $group: {
                _id: { x: "$x", y: "$y" },
                value: { $sum: "$value" }
            }
        },
        {
            $sort: {
                "_id.y": 1
            }
        },
        {
            $group: {
                _id: "$_id.x",
                v: { $push: { k: "$_id.y", v: "$value" } }
            }
        },
        {
            $sort: {
                "_id": 1
            }
        },
        {
            $group: {
                _id: null,
                values: { $push: { k: "$_id", v: "$v" } }
            }
        },
        {
            $project: {
                values: {
                    $arrayToObject: {
                        $map: {
                            input: "$values",
                            as: "obj",
                            in: {
                                k: "$$obj.k",
                                v: { $arrayToObject: "$$obj.v" }
                            }
                        }
                    }
                }
            }
        }
    ])
    

    For your sample data, it outputs:

    {
        "_id" : null,
        "values" : {
                "0" : {
                        "0" : 1999998,
                        "1" : 1999998,
                        "59" : 2000000
                },
                "1" : {
                        "0" : 4000000,
                        "1" : 4000000,
                        "59" : 2000000
                },
                "58" : {
                        "0" : 3200000,
                        "1" : 2400000,
                        "59" : 2200000
                },
                "59" : {
                        "0" : 2600000,
                        "1" : 2800000,
                        "59" : 3000000
                }
        }
    }
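
The reshaping done by $objectToArray and $arrayToObject in the pipeline above can be mimicked in plain JavaScript. This is a rough sketch, not MongoDB code: the helper names `objectToArray` and `arrayToObject` are illustrative stand-ins that mirror the operators' `{ k, v }` shape, and the key list is made up to show how sorting behaves.

```javascript
// Rough JavaScript analogue of $objectToArray: { a: 1 } -> [{ k: "a", v: 1 }].
const objectToArray = (obj) =>
  Object.entries(obj).map(([k, v]) => ({ k, v }));

// Rough analogue of $arrayToObject: [{ k: "a", v: 1 }] -> { a: 1 }.
const arrayToObject = (pairs) =>
  Object.fromEntries(pairs.map(({ k, v }) => [k, v]));

const row = { 0: 999999, 1: 999999, 59: 1000000 };
const pairs = objectToArray(row);
console.log(pairs[2]); // { k: '59', v: 1000000 }

// Keys come back as strings, so a plain sort is lexicographic, not numeric.
const keys = ["0", "1", "2", "10", "59"];
const lexicographic = [...keys].sort();
const numeric = [...keys].sort((a, b) => Number(a) - Number(b));
console.log(lexicographic); // [ '0', '1', '10', '2', '59' ]
console.log(numeric);       // [ '0', '1', '2', '10', '59' ]
```

One consequence worth keeping in mind: the `k` values are strings, so the two $sort stages likely order the keys lexicographically ("10" before "2") on a full 60x60 matrix. The summed values are unaffected, since the result is an object keyed by name. If numeric key order matters, converting the keys before sorting (for example with $toInt, available since MongoDB 4.0) might be one way around it; that variant is an untested assumption here.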