
Python: converting bytes/unicode tab-delimited data to a CSV file

  •  1
  • Alex B  · 6 years ago

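    I am pulling the following rows of data from an API. The data is prefixed with b, which according to the Python 3.3 documentation means we are dealing with "a bytes literal", in which the escape sequences \t and \n represent the ASCII horizontal tab (TAB) and the ASCII linefeed (LF), respectively.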

    b'settlement-id\tsettlement-start-date\tsettlement-end-date\tdeposit-date\ttotal-amount\tcurrency\ttransaction-type\torder-id\tmerchant-order-id\tadjustment-id\tshipment-id\tmarketplace-name\tamount-type\tamount-description\tamount\tfulfillment-id\tposted-date\tposted-date-time\torder-item-code\tmerchant-order-item-id\tmerchant-adjustment-item-id\tsku\tquantity-purchased\n7293436482\t03.05.2018 09:10:07 UTC\t04.05.2018 20:30:23 UTC\t06.05.2018 20:30:23 UTC\t53,44\tEUR\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\n7293436482\t\t\t\t\t\tOrder\t303-3746292-6119509\t\t\tDRGC8lFbB\tAmazon.de\tItemPrice\tPrincipal\t179,99\tMFN\t03.05.2018\t03.05.2018 17:12:22 UTC\t30407746733299\t\t\t3700546702556-180412-chp-18c10347-1\t1\n7293436482\t\t\t\t\t\tOrder\t303-3746292-6119509\t\t\tDRGC8lFbB\tAmazon.de\tItemFees\tCommission\t-32,40\tMFN\t03.05.2018\t03.05.2018 17:12:22 UTC\t30407746733299\t\t\t3700546702556-180412-chp-18c10347-1\t1\n7293436482\t\t\t\t\t\tRefund\t305-1251749-5602732\t305-1251749-5602732\tamzn1:crow:YZkTuxs4RhO8FpZez3cGCg\t\tAmazon.de\tItemPrice\tPrincipal\t-109,99\tAFN\t04.05.2018\t04.05.2018 18:24:39 UTC\t38048998219979\t\t142721169810\t3700546702082-180124-jpn-131N28-6\t\n7293436482\t\t\t\t\t\tRefund\t305-1251749-5602732\t305-1251749-5602732\tamzn1:crow:YZkTuxs4RhO8FpZez3cGCg\t\tAmazon.de\tItemFees\tCommission\t19,80\tAFN\t04.05.2018\t04.05.2018 18:24:39 UTC\t38048998219979\t\t142721169810\t3700546702082-180124-jpn-131N28-6\t\n7293436482\t\t\t\t\t\tRefund\t305-1251749-5602732\t305-1251749-5602732\tamzn1:crow:YZkTuxs4RhO8FpZez3cGCg\t\tAmazon.de\tItemFees\tRefundCommission\t-3,96\tAFN\t04.05.2018\t04.05.2018 18:24:39 UTC\t38048998219979\t\t142721169810\t3700546702082-180124-jpn-131N28-6\t\n'
    

    When I use .decode("utf-8") I get the corresponding tab-delimited data:

    settlement-id   settlement-start-date   settlement-end-date deposit-date    total-amount    currency    transaction-type    order-id    merchant-order-id   adjustment-id   shipment-id marketplace-name    amount-type amount-description  amount  fulfillment-id  posted-date posted-date-time    order-item-code merchant-order-item-id  merchant-adjustment-item-id sku quantity-purchased
    7293436482  03.05.2018 09:10:07 UTC 04.05.2018 20:30:23 UTC 06.05.2018 20:30:23 UTC 53,44   EUR                                                                 
    7293436482                      Order   303-3746292-6119509         DRGC8lFbB   Amazon.de   ItemPrice   Principal   179,99  MFN 03.05.2018  03.05.2018 17:12:22 UTC 30407746733299          3700546702556-180412-chp-18c10347-1 1
    7293436482                      Order   303-3746292-6119509         DRGC8lFbB   Amazon.de   ItemFees    Commission  -32,40  MFN 03.05.2018  03.05.2018 17:12:22 UTC 30407746733299          3700546702556-180412-chp-18c10347-1 1
    7293436482                      Refund  305-1251749-5602732 305-1251749-5602732 amzn1:crow:YZkTuxs4RhO8FpZez3cGCg       Amazon.de   ItemPrice   Principal   -109,99 AFN 04.05.2018  04.05.2018 18:24:39 UTC 38048998219979      142721169810    3700546702082-180124-jpn-131N28-6   
    7293436482                      Refund  305-1251749-5602732 305-1251749-5602732 amzn1:crow:YZkTuxs4RhO8FpZez3cGCg       Amazon.de   ItemFees    Commission  19,80   AFN 04.05.2018  04.05.2018 18:24:39 UTC 38048998219979      142721169810    3700546702082-180124-jpn-131N28-6   
    7293436482                      Refund  305-1251749-5602732 305-1251749-5602732 amzn1:crow:YZkTuxs4RhO8FpZez3cGCg       Amazon.de   ItemFees    RefundCommission    -3,96   AFN 04.05.2018  04.05.2018 18:24:39 UTC 38048998219979      142721169810    3700546702082-180124-jpn-131N28-6   
    

    However, I can't seem to save this data to a tab-delimited CSV file. I have tried several ways of writing it to a CSV file, and all of them failed, including:

    with open("folder_GET_V2_SETTLEMENT_REPORT_DATA_FLAT_FILE_V2_/" + grl_id + ".csv", "w") as csv_file:
        writer = csv.writer(csv_file)
        for row in csv_file:
            print(row)
    

    This gives me the following error:

        for row in csv_file:
    io.UnsupportedOperation: not readable
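
    The error can be reproduced in isolation. The following is a minimal sketch, assuming a throwaway temporary file (the path and the variable names are hypothetical and not part of the original script): a file opened with mode "w" is write-only, so looping over it, which tries to read lines, raises io.UnsupportedOperation.

```python
import io
import os
import tempfile

# Hypothetical throwaway path, used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

error_message = None
with open(path, "w") as csv_file:
    try:
        # Mode "w" gives a write-only file object, so iterating over it
        # (which reads lines) raises io.UnsupportedOperation
        for row in csv_file:
            print(row)
    except io.UnsupportedOperation as exc:
        error_message = str(exc)

print(error_message)
```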
    

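    Update: So the problem was elsewhere. In fact, during my various tests I had already produced the same file as you did, but because the output looked wrong I assumed it had not worked. When the file is opened in Excel, the data is split into two columns.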

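    I now understand that this happens because some of the numbers use the European decimal notation, with a comma, for example 179,99. Excel therefore treats the comma as a delimiter, whereas the file reads correctly if I open it in Notepad.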
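
    One possible workaround, sketched here under the assumption that Excel is splitting on the commas and that amounts never contain thousands separators, is to rewrite decimal commas as periods before the rows are written. The helper name normalize_decimal and the sample row are illustrative only.

```python
import re

# Matches a whole field that looks like a number with a decimal comma, e.g. "-32,40"
DECIMAL_COMMA = re.compile(r"-?\d+,\d+")

def normalize_decimal(field):
    """Replace the decimal comma with a period if the field is purely numeric."""
    if DECIMAL_COMMA.fullmatch(field):
        return field.replace(",", ".")
    # Leave dates, IDs and free text untouched
    return field

# Illustrative row using values in the same format as the report
row = ["7293436482", "Order", "Amazon.de", "179,99", "-32,40", "03.05.2018 17:12:22 UTC"]
print([normalize_decimal(field) for field in row])
```

    Only fields that consist entirely of digits, an optional minus sign and one comma are changed, so dates such as 03.05.2018 and identifiers pass through as they are.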

    1 answer  |  as of 6 years ago
  •  1
  •   mahesh    6 years ago

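    You get the error because the file is opened for writing, but the for loop then tries to read from it. If I understand correctly, you want to take the bytes object and write it out cleanly to a tab-delimited CSV file. The following code does that: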

    import csv, re
    
    orig = b'settlement-id\tsettlement-start-date\tsettlement-end-date\tdeposit-date\ttotal-amount\tcurrency\ttransaction-type\torder-id\tmerchant-order-id\tadjustment-id\tshipment-id\tmarketplace-name\tamount-type\tamount-description\tamount\tfulfillment-id\tposted-date\tposted-date-time\torder-item-code\tmerchant-order-item-id\tmerchant-adjustment-item-id\tsku\tquantity-purchased\n7293436482\t03.05.2018 09:10:07 UTC\t04.05.2018 20:30:23 UTC\t06.05.2018 20:30:23 UTC\t53,44\tEUR\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\n7293436482\t\t\t\t\t\tOrder\t303-3746292-6119509\t\t\tDRGC8lFbB\tAmazon.de\tItemPrice\tPrincipal\t179,99\tMFN\t03.05.2018\t03.05.2018 17:12:22 UTC\t30407746733299\t\t\t3700546702556-180412-chp-18c10347-1\t1\n7293436482\t\t\t\t\t\tOrder\t303-3746292-6119509\t\t\tDRGC8lFbB\tAmazon.de\tItemFees\tCommission\t-32,40\tMFN\t03.05.2018\t03.05.2018 17:12:22 UTC\t30407746733299\t\t\t3700546702556-180412-chp-18c10347-1\t1\n7293436482\t\t\t\t\t\tRefund\t305-1251749-5602732\t305-1251749-5602732\tamzn1:crow:YZkTuxs4RhO8FpZez3cGCg\t\tAmazon.de\tItemPrice\tPrincipal\t-109,99\tAFN\t04.05.2018\t04.05.2018 18:24:39 UTC\t38048998219979\t\t142721169810\t3700546702082-180124-jpn-131N28-6\t\n7293436482\t\t\t\t\t\tRefund\t305-1251749-5602732\t305-1251749-5602732\tamzn1:crow:YZkTuxs4RhO8FpZez3cGCg\t\tAmazon.de\tItemFees\tCommission\t19,80\tAFN\t04.05.2018\t04.05.2018 18:24:39 UTC\t38048998219979\t\t142721169810\t3700546702082-180124-jpn-131N28-6\t\n7293436482\t\t\t\t\t\tRefund\t305-1251749-5602732\t305-1251749-5602732\tamzn1:crow:YZkTuxs4RhO8FpZez3cGCg\t\tAmazon.de\tItemFees\tRefundCommission\t-3,96\tAFN\t04.05.2018\t04.05.2018 18:24:39 UTC\t38048998219979\t\t142721169810\t3700546702082-180124-jpn-131N28-6\t\n'
    
    # Split the decoded string into a list of lines
    data = orig.decode('utf-8').splitlines()
    
    # Open the file for writing; newline="" lets the csv module control line endings
    with open("tmp.csv", "w", newline="") as csv_file:
        # Create the writer object with a tab delimiter
        writer = csv.writer(csv_file, delimiter='\t')
        for line in data:
            # writerow() needs a list of fields, so split on tabs only;
            # this keeps empty fields and the spaces inside the date values
            writer.writerow(line.split('\t'))
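
    As an alternative sketch, csv.reader can do the splitting itself when given the tab delimiter, which keeps empty columns and the spaces inside values such as "03.05.2018 17:12:22 UTC". The sample below is a shortened, illustrative stand-in for the API response (only four columns), and it writes to an in-memory buffer instead of a file so the result can be checked directly.

```python
import csv
import io

# Shortened illustrative sample in the same shape as the API response
sample = (
    b"settlement-id\ttransaction-type\tamount\tposted-date-time\n"
    b"7293436482\tOrder\t179,99\t03.05.2018 17:12:22 UTC\n"
    b"7293436482\t\t\t\n"
)

# Decode the bytes, then let csv.reader split each line on tabs
text = sample.decode("utf-8")
rows = list(csv.reader(io.StringIO(text, newline=""), delimiter="\t"))

# Write the rows back out as tab-delimited text (in memory here)
out = io.StringIO(newline="")
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerows(rows)

# Every row keeps all four columns, including the empty ones
print([len(row) for row in rows])
```

    To write to disk instead, the in-memory buffer can be swapped for a file opened with open(path, "w", newline=""), as the csv module documentation recommends.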