
Skipping the first line when reading an object with the get_object API

  •  0
  • Prasanna Nandakumar  · Tech Community  · 6 years ago

    How do I skip the first line when reading an object with the get_object API?

    import os
    import boto3
    import json
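    import logging
    
    # Module-level logger; the handler below calls log.info
    log = logging.getLogger()
    log.setLevel(logging.INFO)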
    
    def lambda_handler(event, context):
    
        # Fetch the bucket name and the file
        bucket = event['Records'][0]['s3']['bucket']['name']
        key = event['Records'][0]['s3']['object']['key']
    
    
        # Generate record in DynamoDB
        try :
            # Declare S3 bucket and DynamoDB Boto3 Clients
            s3_client = boto3.client('s3')
            dynamodb = boto3.resource('dynamodb')
    
            # Read the Object using get_object API
            obj = s3_client.get_object(Bucket=bucket, Key=key)
            rows = obj['Body'].read().decode("utf-8").split('\n')
    
            tableName = os.environ['DB_TABLE_NAME']
            table = dynamodb.Table(tableName)
    
            log.info("TableName: " + tableName)
    
            # Need client just to access the Exception
            dynamodb_client = boto3.client('dynamodb')
    
            try :
                # Write the CSV file to the DynamoDB Table
                with table.batch_writer() as batch:
                    for row in rows:       
                        batch.put_item(Item={
                            'x': row.split(',')[0],
                            'c': row.split(',')[1],
                            'w': row.split(',')[2],
                            'f': row.split(',')[3]
                            })
    
    
                print('Finished Inserting into TableName: ' + tableName)
            except dynamodb_client.exceptions.ResourceNotFoundException as tableNotFoundEx:
                return ('ERROR: Unable to locate DynamoDB table: ', tableName)
    
    
        except KeyError as dynamoDBKeyError:
            msg = 'ERROR: Need DynamoDB Environment Var: DB_TABLE_NAME'
            print(dynamoDBKeyError)
            return msg
    

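    The code above reads a CSV file and inserts it into DynamoDB. The problem is that the header row (the column names) is also inserted into the table. How can I skip the first line and start parsing from the second? `next` does not work for me.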

    1 answer  |  as of 6 years ago
        1
  •  3
  •   Jack T    6 years ago

    Maybe not the best solution, but this should do the trick:

    import os
    import boto3
    import json
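    import logging
    
    # Module-level logger; the handler below calls log.info
    log = logging.getLogger()
    log.setLevel(logging.INFO)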
    
    def lambda_handler(event, context):
    
        # Fetch the bucket name and the file
        bucket = event['Records'][0]['s3']['bucket']['name']
        key = event['Records'][0]['s3']['object']['key']
    
    
        # Generate record in DynamoDB
        try :
            # Declare S3 bucket and DynamoDB Boto3 Clients
            s3_client = boto3.client('s3')
            dynamodb = boto3.resource('dynamodb')
    
            # Read the Object using get_object API
            obj = s3_client.get_object(Bucket=bucket, Key=key)
            rows = obj['Body'].read().decode("utf-8").split('\n')
    
            tableName = os.environ['DB_TABLE_NAME']
            table = dynamodb.Table(tableName)
    
            log.info("TableName: " + tableName)
    
            # Need client just to access the Exception
            dynamodb_client = boto3.client('dynamodb')
    
            try :
                first = True
                # Write the CSV file to the DynamoDB Table
                with table.batch_writer() as batch:
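                    for row in rows:
                        if first:
                            # Skip the header row (column names)
                            first = False
                        elif row.strip():
                            # Ignore blank lines, e.g. the empty string left by a trailing newline
                            fields = row.split(',')
                            batch.put_item(Item={
                                'x': fields[0],
                                'c': fields[1],
                                'w': fields[2],
                                'f': fields[3]
                                })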
    
    
                print('Finished Inserting into TableName: ' + tableName)
            except dynamodb_client.exceptions.ResourceNotFoundException as tableNotFoundEx:
                return ('ERROR: Unable to locate DynamoDB table: ', tableName)
    
    
        except KeyError as dynamoDBKeyError:
            msg = 'ERROR: Need DynamoDB Environment Var: DB_TABLE_NAME'
            print(dynamoDBKeyError)
            return msg
    

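    It would be cleaner to use a `for i in range(1, len(rows))` loop (or simply iterate over `rows[1:]`), but the version above requires fewer changes to your code.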
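
    As for `next`: it only works on iterators, and `rows` is a plain list, so `next(rows)` raises a `TypeError`. Below is a minimal sketch of an alternative that uses the standard `csv` module, assuming the decoded object body is available as one string. The helper name `parse_rows` is hypothetical; the keys `x`, `c`, `w`, `f` are taken from the question.

```python
import csv
import io


def parse_rows(text):
    """Return one dict per data row of the CSV text, skipping the header line.

    `parse_rows` is a hypothetical helper, not part of boto3.
    """
    reader = csv.reader(io.StringIO(text))
    # csv.reader is an iterator, so next() can consume the header row.
    # The default of None avoids StopIteration on an empty file.
    next(reader, None)

    items = []
    for row in reader:
        # Skip blank or short lines (csv.reader yields [] for an empty line)
        if len(row) < 4:
            continue
        items.append({'x': row[0], 'c': row[1], 'w': row[2], 'f': row[3]})
    return items
```

    Inside the handler, the result could then be written with `for item in parse_rows(body): batch.put_item(Item=item)`, where `body` is `obj['Body'].read().decode("utf-8")`. Using `csv` also copes with quoted fields that contain commas, which `row.split(',')` does not.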