
gcloud ml-engine returns an error on large files

  •  5
  • Chase Midler  · Tech Community  · 7 years ago

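    I have a trained model that takes a fairly large input. I usually pass it as a numpy array of shape (1, 473, 473, 3). When I serialize that to JSON, I end up with a file of about 9.2 MB. Even if I convert it to a base64 encoding for the JSON file, the input is still fairly large.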

    ml-engine predict rejects my request when I send the JSON file, with the following error:

    (gcloud.ml-engine.predict) HTTP request failed. Response: {
    "error": {
        "code": 400,
        "message": "Request payload size exceeds the limit: 1572864 bytes.",
        "status": "INVALID_ARGUMENT"
      }
    }
    

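    It looks like I can't send anything larger than about 1.5 MB to ML Engine. Is this really the case? How do other people run online predictions on large inputs? Do I have to spin up a Compute Engine instance, or will I run into the same problem there?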
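    To make the size problem concrete, here is a rough sketch that estimates the payload for an input of this kind. It assumes a single 473x473 image with 3 channels, i.e. shape (1, 473, 473, 3), and uses random stand-in values rather than real data, so the exact JSON size will differ from the 9.2 MB mentioned above. The limit constant is the one quoted in the error message.

```python
import base64
import json
import random
import struct

LIMIT_BYTES = 1572864  # limit quoted in the error message above

# Assumed shape: one 473x473 image with 3 channels.
shape = (1, 473, 473, 3)
num_values = 1
for dim in shape:
    num_values *= dim

# Random stand-in data; real values would serialize to a similar order of size.
random.seed(0)
values = [random.random() for _ in range(num_values)]

# Size of the values written as a JSON list of decimal numbers.
json_size = len(json.dumps(values))

# Size of the same values packed as float32 and then base64-encoded.
packed = struct.pack("<%df" % num_values, *values)
b64_size = len(base64.b64encode(packed))

print("values:", num_values)
print("JSON text bytes:", json_size, "over limit:", json_size > LIMIT_BYTES)
print("base64 float32 bytes:", b64_size, "over limit:", b64_size > LIMIT_BYTES)
```

    Under these assumptions both encodings exceed the limit, which matches the observation that base64 alone does not solve the problem.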

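    Edit:

    I started from a Keras model and am trying to export it for TensorFlow Serving. I load my Keras model into a variable named "model" and have a directory defined as "export_path". I build the TensorFlow Serving model like this: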

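    # Imports assumed for TensorFlow 1.x; they are not shown in the original post.
    from keras import backend as K
    from tensorflow.python.saved_model import builder as saved_model_builder
    from tensorflow.python.saved_model import signature_constants, tag_constants
    from tensorflow.python.saved_model.signature_def_utils_impl import predict_signature_def

    # Assumption: the Keras session holds the loaded model's variables.
    sess = K.get_session()
    signature = predict_signature_def(inputs={'input': model.input},
                                    outputs={'output': model.output})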
    builder = saved_model_builder.SavedModelBuilder(export_path)
    builder.add_meta_graph_and_variables(
        sess=sess,
        tags=[tag_constants.SERVING],
        signature_def_map={
            signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
        }
    )
    builder.save()
    

    What would the input look like for this signature_def? Would the JSON be something like {'input': 'https://storage.googleapis.com/projectid/bucket/filename'}, where the file is the (1, 473, 473, 3) numpy array?

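    Second edit: Looking at the code Lak Lakshmanan posted, I tried a few different variations of reading an image URL and parsing the file that way, without success. This is what I tried: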

    inputs = {'imageurl': tf.placeholder(tf.string, shape=[None])}
    filename = tf.squeeze(inputs['imageurl']) 
    image = read_and_preprocess(filename)#custom preprocessing function
    image = tf.placeholder_with_default(image, shape=[None, HEIGHT, WIDTH, NUM_CHANNELS])
    features = {'image' : image}
    inputs.update(features)
    signature = predict_signature_def(inputs= inputs,
                                    outputs={'output': model.output})
    
    
    with K.get_session() as session:
        """Convert the Keras HDF5 model into TensorFlow SavedModel."""
        builder = saved_model_builder.SavedModelBuilder(export_path)
        builder.add_meta_graph_and_variables(
            sess=session,
            tags=[tag_constants.SERVING],
            signature_def_map={
                signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
            }
        )
        builder.save()
    

    I think the problem is in the mapping from the imageurl placeholder to the features. Any thoughts on what I'm doing wrong?

    2 answers
        1
  •  5
  •   Lak    4 years ago

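    What I typically do is have the JSON refer to a file in Google Cloud Storage. Users first have to upload their file to GCS and then invoke prediction. This approach has other advantages as well, since the storage utilities support parallel and multithreaded uploads.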
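    As a rough illustration of how small such a request becomes, here is a sketch that writes a request file containing only a Cloud Storage path. It assumes the serving signature takes a string input named "imageurl" (the name used in the TensorFlow 1.0 example in this answer) and that the request file holds one JSON object per line; the bucket and file names are hypothetical.

```python
import json

LIMIT_BYTES = 1572864  # limit quoted in the error message in the question

# Hypothetical Cloud Storage path; replace with a file you have uploaded.
image_path = "gs://my-bucket/images/example_1.jpg"

# One JSON object per line, keyed by the serving input name (assumed: "imageurl").
lines = [json.dumps({"imageurl": image_path})]
request_body = "\n".join(lines) + "\n"

with open("request.json", "w") as f:
    f.write(request_body)

print(request_body, end="")
print("request bytes:", len(request_body.encode("utf-8")))
```

    A file like this could then be passed to gcloud ml-engine predict with the --json-instances flag. The request stays far below the limit no matter how large the referenced file is, because the model reads the file itself at prediction time.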

    In TensorFlow 2.0, this is what the serving function looks like:

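    import os
    import shutil
    import tensorflow as tf

    # model, preprocess and CLASS_NAMES are defined elsewhere (see the full code linked below).
    @tf.function(input_signature=[tf.TensorSpec([None,], dtype=tf.string)])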
    def predict_bytes(img_bytes):
        input_images = tf.map_fn(
            preprocess,
            img_bytes,
            fn_output_signature=tf.float32
        )
        batch_pred = model(input_images) # same as model.predict()
        top_prob = tf.math.reduce_max(batch_pred, axis=[1])
        pred_label_index = tf.math.argmax(batch_pred, axis=1)
        pred_label = tf.gather(tf.convert_to_tensor(CLASS_NAMES), pred_label_index)
        return {
            'probability': top_prob,
            'flower_type_int': pred_label_index,
            'flower_type_str': pred_label
        }
    
    @tf.function(input_signature=[tf.TensorSpec([None,], dtype=tf.string)])
    def predict_filename(filenames):
        img_bytes = tf.map_fn(
            tf.io.read_file,
            filenames
        )
        result = predict_bytes(img_bytes)
        result['filename'] = filenames
        return result
    
    shutil.rmtree('export', ignore_errors=True)
    os.mkdir('export')
    model.save('export/flowers_model3',
              signatures={
                  'serving_default': predict_filename,
                  'from_bytes': predict_bytes
              })
    

    The full code is here: https://nbviewer.jupyter.org/github/GoogleCloudPlatform/practical-ml-vision-book/blob/master/09_deploying/09d_bytes.ipynb

    TensorFlow 1.0

    In TensorFlow 1.0, the code looks like this:

    def serving_input_fn():
        # Note: only handles one image at a time ... 
        inputs = {'imageurl': tf.placeholder(tf.string, shape=())}
        filename = tf.squeeze(inputs['imageurl']) # make it a scalar
        image = read_and_preprocess(filename)
        # make the outer dimension unknown (and not 1)
        image = tf.placeholder_with_default(image, shape=[None, HEIGHT, WIDTH, NUM_CHANNELS])
    
        features = {'image' : image}
        return tf.estimator.export.ServingInputReceiver(features, inputs)
    

    The full code is here: https://github.com/GoogleCloudPlatform/training-data-analyst/blob/61ab2e175a629a968024a5d09e9f4666126f4894/courses/machine_learning/deepdive/08_image/flowersmodel/trainer/model.py#L119

        2
  •  2
  •   janiemi    4 years ago

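    I ran into the same error when running predictions with large images on AI Platform. I worked around the payload limit by encoding the images as PNG before sending them to AI Platform.

    My Keras model does not take PNG-encoded images as input, so I had to convert the Keras model to a TensorFlow Estimator and define its serving input function so that it contains the code that decodes the PNG-encoded images back into the format the model expects.

    Example code for a model that expects two different grayscale images as input: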

    import tensorflow as tf
    from tensorflow.keras.estimator import model_to_estimator
    from tensorflow.estimator.export import ServingInputReceiver
    
    IMG_PNG_1 = "encoded_png_image_1"
    IMG_PNG_2 = "encoded_png_image_2"
    
    
    def create_serving_fn(image_height, image_width):
        def serving_input_fn():
            def preprocess_png(png_encoded_img):
                img = tf.reshape(png_encoded_img, shape=())
                img = tf.io.decode_png(img, channels=1)
                img = img / 255
                img = tf.expand_dims(img, axis=0)
                return img
    
            # receiver_tensors worked only when the shape parameter wasn't defined
            receiver_tensors = {
                IMG_PNG_1: tf.compat.v1.placeholder(tf.string),
                IMG_PNG_2: tf.compat.v1.placeholder(tf.string)
            }
    
            img_1 = preprocess_png(png_encoded_img=receiver_tensors[IMG_PNG_1])
            img_2 = preprocess_png(png_encoded_img=receiver_tensors[IMG_PNG_2])
    
            input_img_1 = tf.compat.v1.placeholder_with_default(img_1, shape=[None, image_height, image_width, 1])
            input_img_2 = tf.compat.v1.placeholder_with_default(img_2, shape=[None, image_height, image_width, 1])
    
            features = {
                "model_input_1": input_img_1,
                "model_input_2": input_img_2,
            }
    
            return ServingInputReceiver(features=features, receiver_tensors=receiver_tensors)
    
        return serving_input_fn
    
    # Convert trained Keras model to Estimator
    estimator = model_to_estimator(keras_model=model)
    save_path = "location_of_the_SavedModel"
    export_path = estimator.export_saved_model(
        export_dir_base=save_path,
        serving_input_receiver_fn=create_serving_fn(1000, 1000)
    )
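
    The client side of this approach is not shown above, so here is a rough sketch of why PNG encoding shrinks the request. It uses only the standard library, including a minimal hand-written grayscale PNG encoder, and a synthetic gradient image that compresses much better than a real photo would, so actual savings will vary. Wrapping the bytes in a {"b64": ...} object is an assumption about how the binary value is sent; check the platform documentation for the exact format and any naming requirements for binary inputs.

```python
import base64
import json
import struct
import zlib

LIMIT_BYTES = 1572864  # limit quoted in the error message in the question
IMG_PNG_1 = "encoded_png_image_1"
IMG_PNG_2 = "encoded_png_image_2"


def png_chunk(tag, data):
    """Build one PNG chunk: length, tag, data, CRC over tag and data."""
    body = tag + data
    return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))


def encode_gray_png(rows, width, height):
    """Encode rows of 0-255 integers as an 8-bit grayscale PNG."""
    header = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Every scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in rows)
    return (b"\x89PNG\r\n\x1a\n"
            + png_chunk(b"IHDR", header)
            + png_chunk(b"IDAT", zlib.compress(raw, 9))
            + png_chunk(b"IEND", b""))


# Synthetic 1000x1000 gradient; a stand-in for a real grayscale image.
width, height = 1000, 1000
rows = [[(x + y) % 256 for x in range(width)] for y in range(height)]

png_b64 = base64.b64encode(encode_gray_png(rows, width, height)).decode("ascii")

# Both inputs reuse the same image here purely for illustration.
instance = {IMG_PNG_1: {"b64": png_b64}, IMG_PNG_2: {"b64": png_b64}}
png_request_size = len(json.dumps(instance))

# For comparison: the same two images sent as nested lists of numbers.
list_request_size = 2 * len(json.dumps(rows))

print("PNG request bytes:", png_request_size)
print("nested-list request bytes:", list_request_size)
print("limit:", LIMIT_BYTES)
```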