
Transfer learning problem with TensorFlow and Keras

  •  3
  •  Austin A  ·  asked 6 years ago

    I have been trying to recreate this blog post . The write-up is very thorough, and the code is shared via Colab.

    What I'm trying to do is extract layers from the pretrained VGG19 network and create a new network with those layers as its outputs. However, when I assemble the new network, it closely resembles the VGG19 network and appears to contain layers that I did not extract. An example is below.

    import tensorflow as tf
    from tensorflow.python.keras import models
    
    ## Create network based on VGG19 arch with pretrained weights
    vgg = tf.keras.applications.vgg19.VGG19(include_top=False, weights='imagenet')
    vgg.trainable = False
    

    When we look at the summary of the VGG19 model, we see the architecture we expect.

    vgg.summary()
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    input_2 (InputLayer)         (None, None, None, 3)     0         
    _________________________________________________________________
    block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
    _________________________________________________________________
    block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
    _________________________________________________________________
    block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
    _________________________________________________________________
    block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
    _________________________________________________________________
    block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
    _________________________________________________________________
    block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
    _________________________________________________________________
    block3_conv1 (Conv2D)        (None, None, None, 256)   295168    
    _________________________________________________________________
    block3_conv2 (Conv2D)        (None, None, None, 256)   590080    
    _________________________________________________________________
    block3_conv3 (Conv2D)        (None, None, None, 256)   590080    
    _________________________________________________________________
    block3_conv4 (Conv2D)        (None, None, None, 256)   590080    
    _________________________________________________________________
    block3_pool (MaxPooling2D)   (None, None, None, 256)   0         
    _________________________________________________________________
    block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   
    _________________________________________________________________
    block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   
    _________________________________________________________________
    block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   
    _________________________________________________________________
    block4_conv4 (Conv2D)        (None, None, None, 512)   2359808   
    _________________________________________________________________
    block4_pool (MaxPooling2D)   (None, None, None, 512)   0         
    _________________________________________________________________
    block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   
    _________________________________________________________________
    block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   
    _________________________________________________________________
    block5_conv3 (Conv2D)        (None, None, None, 512)   2359808   
    _________________________________________________________________
    block5_conv4 (Conv2D)        (None, None, None, 512)   2359808   
    _________________________________________________________________
    block5_pool (MaxPooling2D)   (None, None, None, 512)   0         
    =================================================================
    Total params: 20,024,384
    Trainable params: 0
    Non-trainable params: 20,024,384
    _________________________________________________________________
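    The layer names shown in this summary are what get_layer looks up below; they can also be listed programmatically. A minimal sketch, assuming vgg is the model created above:

    ## List the layer names of the pretrained network
    layer_names = [layer.name for layer in vgg.layers]
    print(layer_names[:4])  # e.g. ['input_2', 'block1_conv1', 'block1_conv2', 'block1_pool']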
    

    Then we extract the layers and create a new model.

    ## Layers to extract
    content_layers = ['block5_conv2'] 
    style_layers = ['block1_conv1','block2_conv1','block3_conv1','block4_conv1','block5_conv1']
    ## Get output layers corresponding to style and content layers 
    style_outputs = [vgg.get_layer(name).output for name in style_layers]
    content_outputs = [vgg.get_layer(name).output for name in content_layers]
    model_outputs = style_outputs + content_outputs
    
    new_model = models.Model(vgg.input, model_outputs)
    

    When new_model is created, I believe we should end up with a much smaller model. However, the model summary shows that the new model is very close to the original (it contains 19 of the 22 layers in VGG19) and that it includes layers we did not extract.

    new_model.summary()
    
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    input_2 (InputLayer)         (None, None, None, 3)     0         
    _________________________________________________________________
    block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
    _________________________________________________________________
    block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
    _________________________________________________________________
    block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
    _________________________________________________________________
    block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
    _________________________________________________________________
    block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
    _________________________________________________________________
    block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
    _________________________________________________________________
    block3_conv1 (Conv2D)        (None, None, None, 256)   295168    
    _________________________________________________________________
    block3_conv2 (Conv2D)        (None, None, None, 256)   590080    
    _________________________________________________________________
    block3_conv3 (Conv2D)        (None, None, None, 256)   590080    
    _________________________________________________________________
    block3_conv4 (Conv2D)        (None, None, None, 256)   590080    
    _________________________________________________________________
    block3_pool (MaxPooling2D)   (None, None, None, 256)   0         
    _________________________________________________________________
    block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   
    _________________________________________________________________
    block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   
    _________________________________________________________________
    block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   
    _________________________________________________________________
    block4_conv4 (Conv2D)        (None, None, None, 512)   2359808   
    _________________________________________________________________
    block4_pool (MaxPooling2D)   (None, None, None, 512)   0         
    _________________________________________________________________
    block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   
    _________________________________________________________________
    block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   
    =================================================================
    Total params: 15,304,768
    Trainable params: 15,304,768
    Non-trainable params: 0
    _________________________________________________________________
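    The layer counts can also be checked directly, which confirms the 19-of-22 figure above. A minimal sketch, assuming vgg and new_model are as defined earlier:

    ## Compare the number of layers in the original and the derived model
    print(len(vgg.layers), len(new_model.layers))  # 22 19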
    

    So my questions are…

    1. Why do layers that I did not extract show up in new_model? Are they inferred during model instantiation, per the docs? This seems far too close to the VGG19 architecture to have been inferred.
    2. From my understanding of the Keras Model (functional API), passing multiple output layers should create a model with multiple outputs; however, the new model appears to be sequential and has only a single output layer. Is that the case?
    1 Answer  |  6 years ago
  •  2
  •  umutto  ·  answered 6 years ago

    1. Why do layers that I did not extract show up in new_model?

    This is because when you create the model with models.Model(vgg.input, model_outputs), the "intermediate" layers between vgg.input and the output layers are included as well. This is expected behavior, since that is how VGG is constructed.

    For example, if you were to create a model this way: models.Model(vgg.input, vgg.get_layer('block2_pool').output), every intermediate layer between input_1 and block2_pool would be included, since the input has to flow through them before reaching block2_pool. Below is a partial graph of VGG that may help illustrate this.

    [Image: partial graph of the VGG architecture]
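
    To see this concretely, you can cut the graph off at block2_pool and list the layers that come along with it. A minimal sketch, assuming vgg is the VGG19 instance created in the question:

    from tensorflow.python.keras import models

    ## A model cut off at block2_pool still contains every layer the input must flow through
    sub_model = models.Model(vgg.input, vgg.get_layer('block2_pool').output)
    print([layer.name for layer in sub_model.layers])
    # e.g. ['input_1', 'block1_conv1', 'block1_conv2', 'block1_pool',
    #       'block2_conv1', 'block2_conv2', 'block2_pool']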

    Now, if I'm not misunderstanding, if you want to create a model that does not include the intermediate layers (which would probably perform poorly), you have to build it yourself. The functional API is very useful for this. There are examples in the documentation, but the gist of what you want to do is this:

    from keras.layers import Conv2D, Input
    
    x_input = Input(shape=(28, 28, 1,))
    block1_conv1 = Conv2D(64, (3, 3), padding='same')(x_input)
    block2_conv2 = Conv2D(128, (3, 3), padding='same')(x_input)
    ...
    
    new_model = models.Model(x_input, [block1_conv1, block2_conv2, ...])
    
    2. …however, the new model appears to be sequential and has only a single output layer. Is that the case?

    No, your model has the multiple outputs you wanted. model.summary() should display which layers are connected to what (which helps in understanding the structure), but I believe there is a small bug in some versions that prevents this. In any case, you can see that your model has multiple outputs by checking new_model.output, which should give you:

    [<tf.Tensor 'block1_conv1/Relu:0' shape=(?, ?, ?, 64) dtype=float32>,
     <tf.Tensor 'block2_conv1/Relu:0' shape=(?, ?, ?, 128) dtype=float32>,
     <tf.Tensor 'block3_conv1/Relu:0' shape=(?, ?, ?, 256) dtype=float32>,
     <tf.Tensor 'block4_conv1/Relu:0' shape=(?, ?, ?, 512) dtype=float32>,
     <tf.Tensor 'block5_conv1/Relu:0' shape=(?, ?, ?, 512) dtype=float32>,
     <tf.Tensor 'block5_conv2/Relu:0' shape=(?, ?, ?, 512) dtype=float32>]
    

    Printing new_model.summary() sequentially is just a design choice, as it would get hairy with complex models.
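
    For completeness, a multi-output model like this returns one feature map per requested layer when you run it, and the list can then be split back into style and content features. A minimal sketch, assuming new_model and style_layers are as defined in the question, with a random array standing in for a real preprocessed image:

    import numpy as np

    ## Stand-in for a preprocessed image batch of shape (1, H, W, 3)
    dummy_image = np.random.rand(1, 224, 224, 3).astype('float32')

    ## predict returns a list with one feature map per requested output layer
    feature_maps = new_model.predict(dummy_image)
    style_features = feature_maps[:len(style_layers)]    # five style feature maps
    content_features = feature_maps[len(style_layers):]  # one content feature map (block5_conv2)
    print(len(style_features), len(content_features))    # 5 1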