代码之家  ›  专栏  ›  技术社区  ›  Dmitry Bubnenkov

使用Pandas聚合JSON对象中的类似字段

  •  0
  • Dmitry Bubnenkov  · 技术社区  · 6 年前

    我有JSON字符串(请参阅代码中的内容):

    import numpy as np
    import pandas as pd
    
    x = [{
                "t_registration_number": "31807380529",
                "t_customer_inn": "2221123815",
                "t_customer_kpp": "222101001",
                "t_customer_ogrn": "1072221001709",
                "t_customer_short_name": "КАУ ГОСУДАРСТВЕННАЯ ЭКСПЕРТИЗА АЛТАЙСКОГО КРАЯ",
                "t_placer_inn": "2221123815",
                "t_placer_kpp": "222101001",
                "t_placer_ogrn": "1072221001709",
                "t_placer_short_name": "КАУ ГОСУДАРСТВЕННАЯ ЭКСПЕРТИЗА АЛТАЙСКОГО КРАЯ",
                "r_rus_name": "Алтайский край",
                "t_publication_date": "2018-12-28",
                "s_inn": "7706196090",
                "s_kpp": "",
                "s_name": "ООО Страховая компания СОГЛАСИЕ",
                "s_lotguid": "f12b1c2a-fb9d-4bff-9430-7fc29ec9dc88",
                "n_prc_diff_nmc": "1",
                "nlots_nmck": 500000
        },
        {
                "t_registration_number": "31805988205",
                "t_customer_inn": "2801010011",
                "t_customer_kpp": "280101001",
                "t_customer_ogrn": "1022800512624",
                "t_customer_short_name": "ГАУЗ АО ДЕТСКАЯ ГКБ",
                "t_placer_inn": "2801010011",
                "t_placer_kpp": "280101001",
                "t_placer_ogrn": "1022800512624",
                "t_placer_short_name": "ГАУЗ АО ДЕТСКАЯ ГКБ",
                "r_rus_name": "Амурская область",
                "t_publication_date": "2018-01-09",
                "s_inn": "2723071046",
                "s_kpp": "272301001",
                "s_name": "ООО МЕДИАС",
                "s_lotguid": "2fd7440e-e0a1-4fa0-ae7b-b901b1e378d5",
                "n_prc_diff_nmc": "2",
                "nlots_nmck": 34384
        }, 
    {
                "t_registration_number": "31805988205",
                "t_customer_inn": "2801010011",
                "t_customer_kpp": "280101001",
                "t_customer_ogrn": "1022800512624",
                "t_customer_short_name": "ГАУЗ АО ДЕТСКАЯ ГКБ",
                "t_placer_inn": "2801010011",
                "t_placer_kpp": "280101001",
                "t_placer_ogrn": "1022800512624",
                "t_placer_short_name": "ГАУЗ АО ДЕТСКАЯ ГКБ",
                "r_rus_name": "Амурская область",
                "t_publication_date": "2018-01-09",
                "s_inn": "7018040688",
                "s_kpp": "701701001",
                "s_name": "ООО СКАН - М",
                "s_lotguid": "2fd7440e-e0a1-4fa0-ae7b-b901b1e378d5",
                "n_prc_diff_nmc": "2",
                "nlots_nmck": 34384
    }
    ]
    
    
    df = pd.DataFrame(x)
    
    result = df.groupby(['s_lotguid','t_registration_number','t_customer_inn','t_customer_kpp','t_customer_ogrn','t_customer_short_name','t_placer_inn','t_placer_kpp','t_placer_ogrn','t_placer_short_name','r_rus_name','t_publication_date']).agg(','.join)#.to_dict()
    

    最后2个JSON对象是similar,除了3个字段:“s\u inn”、“s\u kpp”、“s\u name”

    我需要汇总数据,最终得到:

    {
                "t_registration_number": "31807380529",
                "t_customer_inn": "2221123815",
                "t_customer_kpp": "222101001",
                "t_customer_ogrn": "1072221001709",
                "t_customer_short_name": "КАУ ГОСУДАРСТВЕННАЯ ЭКСПЕРТИЗА АЛТАЙСКОГО КРАЯ",
                "t_placer_inn": "2221123815",
                "t_placer_kpp": "222101001",
                "t_placer_ogrn": "1072221001709",
                "t_placer_short_name": "КАУ ГОСУДАРСТВЕННАЯ ЭКСПЕРТИЗА АЛТАЙСКОГО КРАЯ",
                "r_rus_name": "Алтайский край",
                "t_publication_date": "2018-12-28",
                "s_inn": "7706196090",
                "s_kpp": "",
                "s_name": "ООО Страховая компания СОГЛАСИЕ",
                "s_lotguid": "f12b1c2a-fb9d-4bff-9430-7fc29ec9dc88",
                "n_prc_diff_nmc": "1",
                "nlots_nmck": 500000
        },
    {
            "t_registration_number": "31805988205",
            "t_customer_inn": "2801010011",
            "t_customer_kpp": "280101001",
            "t_customer_ogrn": "1022800512624",
            "t_customer_short_name": "ГАУЗ АО ДЕТСКАЯ ГКБ",
            "t_placer_inn": "2801010011",
            "t_placer_kpp": "280101001",
            "t_placer_ogrn": "1022800512624",
            "t_placer_short_name": "ГАУЗ АО ДЕТСКАЯ ГКБ",
            "r_rus_name": "Амурская область",
            "t_publication_date": "2018-01-09",
            "s_inn": "2723071046, 7018040688",
            "s_kpp": "272301001, 701701001",
            "s_name": "ООО МЕДИАС, ООО СКАН – М",
            "s_lotguid": "2fd7440e-e0a1-4fa0-ae7b-b901b1e378d5",
            "n_prc_diff_nmc": "2",
            "nlots_nmck": 34384
    }
    

    或在理想情况下:

    {
                "t_registration_number": "31807380529",
                "t_customer_inn": "2221123815",
                "t_customer_kpp": "222101001",
                "t_customer_ogrn": "1072221001709",
                "t_customer_short_name": "КАУ ГОСУДАРСТВЕННАЯ ЭКСПЕРТИЗА АЛТАЙСКОГО КРАЯ",
                "t_placer_inn": "2221123815",
                "t_placer_kpp": "222101001",
                "t_placer_ogrn": "1072221001709",
                "t_placer_short_name": "КАУ ГОСУДАРСТВЕННАЯ ЭКСПЕРТИЗА АЛТАЙСКОГО КРАЯ",
                "r_rus_name": "Алтайский край",
                "t_publication_date": "2018-12-28",
                "s_inn": "7706196090",
                "s_kpp": "",
                "s_name": "ООО Страховая компания СОГЛАСИЕ",
                "s_lotguid": "f12b1c2a-fb9d-4bff-9430-7fc29ec9dc88",
                "n_prc_diff_nmc": "1",
                "nlots_nmck": 500000
        },
    
    {
            "t_registration_number": "31805988205",
            "t_customer_inn": "2801010011",
            "t_customer_kpp": "280101001",
            "t_customer_ogrn": "1022800512624",
            "t_customer_short_name": "ГАУЗ АО ДЕТСКАЯ ГКБ",
            "t_placer_inn": "2801010011",
            "t_placer_kpp": "280101001",
            "t_placer_ogrn": "1022800512624",
            "t_placer_short_name": "ГАУЗ АО ДЕТСКАЯ ГКБ",
            "r_rus_name": "Амурская область",
            "t_publication_date": "2018-01-09",
            "aggregated": [
                {
                    "s_inn": "2723071046",
                    "s_kpp": "272301001",
                    "s_name": "ООО МЕДИАС",            
                },
                {
                    "s_inn": "7018040688",
                    "s_kpp": "701701001",
                    "s_name": "ООО СКАН – М",            
                }
            ],
            "s_lotguid": "2fd7440e-e0a1-4fa0-ae7b-b901b1e378d5",
            "n_prc_diff_nmc": "2",
            "nlots_nmck": 34384
    } 
    

    有人能帮我吗?我真的不知道怎么做。

    0 回复  |  直到 6 年前