Preface
I. Keras provides three ways to define models
1. The Sequential API
The Sequential API lets you build a model for most problems by stacking layers one after another. While this approach is simple and covers the construction of many deep-learning architectures, it has a limitation: it cannot express models with shared layers or with multiple inputs or outputs.
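For illustration, a minimal Sequential model (the layer sizes here are my own, not from the original article):

from keras.models import Sequential
from keras.layers import Dense

# Layers are stacked one after another; each has exactly one input and one output.
model = Sequential()
model.add(Dense(64, activation="relu", input_shape=(100,)))
model.add(Dense(10, activation="softmax"))
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])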
2. The Functional API
The Keras Functional API provides a more flexible way to build network models.
It allows you to define models with multiple inputs or outputs, as well as models with shared layers. Beyond that, it lets you define ad-hoc directed acyclic network graphs.
Models are defined by creating layer instances and connecting them directly to one another, then defining a Model that specifies which layers serve as the model's inputs and outputs.
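A minimal sketch of a shared-layer, multi-input model (the names and sizes are illustrative, not from the original article); a Sequential model could not express this graph:

from keras.layers import Input, Dense, concatenate
from keras.models import Model

# Two inputs pass through the same Dense layer instance (a shared layer).
inp_a = Input(shape=(32,))
inp_b = Input(shape=(32,))
shared = Dense(16, activation="relu")
merged = concatenate([shared(inp_a), shared(inp_b)])
out = Dense(1, activation="sigmoid")(merged)

# The Model specifies which layers act as inputs and outputs.
model = Model(inputs=[inp_a, inp_b], outputs=out)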
3. The Subclassing API
The subclassing API defines a model imperatively: you subclass the Model class, create the layers in __init__, and write the forward pass in call(), much like PyTorch's nn.Module.
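Since the original article leaves this style without a code listing, here is a minimal sketch of my own, assuming tf.keras-style subclassing (the layer sizes are illustrative):

import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.dense1 = tf.keras.layers.Dense(64, activation="relu")
        self.dense2 = tf.keras.layers.Dense(10, activation="softmax")

    def call(self, inputs):
        # The forward pass is written imperatively, as in PyTorch.
        x = self.dense1(inputs)
        return self.dense2(x)

model = MyModel()
out = model(tf.zeros((1, 100)))  # weights are created on the first call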
Supplementary: comparing model building in Keras and PyTorch
Using the CIFAR-10 dataset, a Residual Network is built in each framework as an example to compare their similarities and differences.
Dataset format
PyTorch dataset format
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

# Download and construct CIFAR-10 dataset.
# ToTensor is needed so that indexing the dataset returns a tensor
# rather than a PIL image.
train_dataset = torchvision.datasets.CIFAR10(root='../../data/',
                                             train=True,
                                             transform=transforms.ToTensor(),
                                             download=True)

# Fetch one data pair (read data from disk).
image, label = train_dataset[0]
print(image.size())  # torch.Size([3, 32, 32])
print(label)         # 6
print(train_dataset.data.shape)  # (50000, 32, 32, 3)
# type(train_dataset.targets) == list
print(len(train_dataset.targets))  # 50000

# Data loader (this provides queues and threads in a very simple way).
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=64,
                                           shuffle=True)

"""
# Demonstrate the data structure returned by the DataLoader.
# When iteration starts, queue and thread start to load data from files.
data_iter = iter(train_loader)

# Mini-batch images and labels.
images, labels = next(data_iter)
print(images.shape)  # torch.Size([64, 3, 32, 32])
print(labels.shape)  # torch.Size([64])
# After passing through the DataLoader, the labels are PyTorch tensors
# rather than a Python list.
"""

# Typical usage of the data loader is as below.
for images, labels in train_loader:
    # Training code should be written here.
    pass
Keras dataset format
import keras
from keras.datasets import cifar10

(train_x, train_y), (test_x, test_y) = cifar10.load_data()
print(train_x.shape)  # ndarray, shape: (50000, 32, 32, 3)
print(train_y.shape)  # (50000, 1)
The data formats fed into the network differ
""" 1: pytorch 都是內置torch.xxTensor輸入網絡,而keras的則是原生ndarray類型 2: 對于multi-class的其中一種loss,即cross-entropy loss 而言, pytorch的api為 CorssEntropyLoss, 但y_true不能用one-hoe編碼!這與keras,tensorflow 都不同。tensorflow相應的api為softmax_cross_entropy 他們的api都僅限于multi-class classification 3*: 其實上面提到的api都屬于categorical cross-entropy loss, 又叫 softmax loss,是函數內部先進行了 softmax 激活,再經過cross-entropy loss。 這個loss是cross-entropy loss的變種, cross-entropy loss又叫logistic loss 或 multinomial logistic loss。 實現這種loss的函數不包括激活函數,需要自定義。 pytorch對應的api為BCEloss(僅限于 binary classification), tensorflow 對應的api為 log_loss。 cross-entropy loss的第二個變種是 binary cross-entropy loss 又叫 sigmoid cross- entropy loss。 函數內部先進行了sigmoid激活,再經過cross-entropy loss。 pytorch對應的api為BCEWithLogitsLoss, tensorflow對應的api為sigmoid_cross_entropy """ # pytorch criterion = nn.CrossEntropyLoss() ... for epoch in range (num_epochs): for i, (images, labels) in enumerate (train_loader): images = images.to(device) labels = labels.to(device) # Forward pass outputs = model(images) # 對于multi-class cross-entropy loss # 輸入y_true不需要one-hot編碼 loss = criterion(outputs, labels) ... # keras # 對于multi-class cross-entropy loss # 輸入y_true需要one-hot編碼 train_y = keras.utils.to_categorical(train_y, 10 ) ... model.fit_generator(datagen.flow(train_x, train_y, batch_size = 128 ), validation_data = [test_x,test_y], epochs = epochs,steps_per_epoch = steps_per_epoch, verbose = 1 ) ... |
Overall workflow
Keras workflow
from keras.optimizers import Adam

model = myModel()
model.compile(optimizer=Adam(0.001),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit_generator(datagen.flow(train_x, train_y, batch_size=128),
                    validation_data=(test_x, test_y),
                    epochs=epochs, steps_per_epoch=steps_per_epoch,
                    verbose=1, workers=4)

# Evaluate the accuracy on the test dataset.
accuracy = model.evaluate(x=test_x, y=test_y, batch_size=128)

# Save the whole network (architecture + weights).
model.save("cifar10model.h5")
"""
# https://blog.csdn.net/jiandanjinxin/article/details/77152530
# To load it back:
# keras.models.load_model("cifar10model.h5")

# Save the architecture only:
# json_string = model.to_json()
# open('my_model_architecture.json', 'w').write(json_string)
# To load it back:
# from keras.models import model_from_json
# model = model_from_json(open('my_model_architecture.json').read())

# Save the weights only:
# model.save_weights('my_model_weights.h5')
# Loading requires an identical model to be initialized in code first:
# model.load_weights('my_model_weights.h5')
# To load weights into a different architecture (with some layers in common),
# e.g. for fine-tuning or transfer learning, load by layer name:
# model.load_weights('my_model_weights.h5', by_name=True)
"""
PyTorch workflow
model = myModel()

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        # Zero the gradients accumulated in the previous iteration.
        optimizer.zero_grad()
        # Backpropagate and compute the gradients.
        loss.backward()
        # Update the weights.
        optimizer.step()

# model.eval() switches the model to evaluation mode; dropout and batch
# normalization behave differently in training and evaluation.
# In eval() mode, PyTorch freezes BN and dropout, using the statistics
# learned during training instead of per-batch statistics.
# Otherwise, with a small test batch_size, the BN layers can easily
# distort the outputs (e.g. badly shifted colors in generated images).
model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the model on the test images: {} %'.format(100 * correct / total))

# Save the model checkpoint.
# This saves only the weights (state_dict).
torch.save(model.state_dict(), 'resnet.ckpt')
"""
# To load it back:
# myModel.load_state_dict(torch.load('params.ckpt'))

# To save the whole network (architecture + weights):
# torch.save(resnet, 'model.ckpt')
# To load it back:
# model = torch.load('model.ckpt')
"""
Workflow comparison
# https://blog.csdn.net/dss_dssssd/article/details/83892824
"""
1: Prepare the data (note the different data formats).
2: Define the network structure (model).
3: Define the loss function.
4: Define the optimization algorithm (optimizer).
5: Training - Keras
   5.1: Compile the model (passing in the loss function, optimizer, etc.).
   5.2: Train the model (fit or fit_generator, passing in the data).
5: Training - PyTorch
   Iterative training:
   5.1: Prepare the input data and labels as tensors (optional).
   5.2: Forward pass: compute the network output and the loss.
   5.3: Backward pass: update the parameters.
        None of the following three calls may be omitted:
        5.3.1: Zero the gradients from the previous iteration: optimizer.zero_grad()
        5.3.2: Backpropagate and compute the gradients: loss.backward()
        5.3.3: Update the weights: optimizer.step()
6: Evaluate on the test set - Keras: model.evaluate
6: Evaluate on the test set - PyTorch: iterate over the test set with a custom metric.
7: Save the network (optional).
See the code above for the concrete implementations.
"""
Building the network
Network comparison
1. With Keras you do not need input_channels; the layer infers it internally, whereas PyTorch requires you to declare input_channels explicitly.
2. PyTorch's Conv2d requires an explicit padding value, whereas Keras offers the two options "same" and "valid" ("valid" means padding=0).
3. Keras's Flatten operation can be seen as the equivalent of PyTorch's view.
4. Keras's dimension order is generally (H, W, C) (with a TensorFlow backend), whereas PyTorch's order is (C, H, W); see the snippet after this list.
5. The concrete translation between the two frameworks follows below; however, since I had not studied PyTorch and had only just started with Keras, I cannot guarantee correctness and will revisit this after learning more.
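To illustrate point 4, a small sketch of my own (not from the original article) converting between the two memory layouts:

import numpy as np
import torch

x_keras = np.zeros((100, 32, 32, 3), dtype=np.float32)   # NHWC, as Keras/TensorFlow expects
x_torch = torch.from_numpy(x_keras).permute(0, 3, 1, 2)  # NCHW, as PyTorch expects
print(x_torch.shape)  # torch.Size([100, 3, 32, 32])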
Building the Residual Network in PyTorch
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hyper-parameters
num_epochs = 80
learning_rate = 0.001

# Image preprocessing modules
transform = transforms.Compose([
    transforms.Pad(4),
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32),
    transforms.ToTensor()])

# CIFAR-10 dataset
# train_dataset.data.shape
# Out[31]: (50000, 32, 32, 3)
# train_dataset.targets is a list, len(train_dataset.targets) == 50000
train_dataset = torchvision.datasets.CIFAR10(root='../../data/',
                                             train=True,
                                             transform=transform,
                                             download=True)

test_dataset = torchvision.datasets.CIFAR10(root='../../data/',
                                            train=False,
                                            transform=transforms.ToTensor())

# Data loader
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=100,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=100,
                                          shuffle=False)

# 3x3 convolution
def conv3x3(in_channels, out_channels, stride=1):
    return nn.Conv2d(in_channels, out_channels, kernel_size=3,
                     stride=stride, padding=1, bias=False)

# Residual block
class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1, downsample=None):
        super(ResidualBlock, self).__init__()
        self.conv1 = conv3x3(in_channels, out_channels, stride)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(out_channels, out_channels)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.downsample = downsample

    def forward(self, x):
        residual = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        if self.downsample:
            residual = self.downsample(x)
        out += residual
        out = self.relu(out)
        return out

# ResNet
class ResNet(nn.Module):
    def __init__(self, block, layers, num_classes=10):
        super(ResNet, self).__init__()
        self.in_channels = 16
        self.conv = conv3x3(3, 16)
        self.bn = nn.BatchNorm2d(16)
        self.relu = nn.ReLU(inplace=True)
        self.layer1 = self.make_layer(block, 16, layers[0])
        self.layer2 = self.make_layer(block, 32, layers[1], 2)
        self.layer3 = self.make_layer(block, 64, layers[2], 2)
        self.avg_pool = nn.AvgPool2d(8)
        self.fc = nn.Linear(64, num_classes)

    def make_layer(self, block, out_channels, blocks, stride=1):
        downsample = None
        if (stride != 1) or (self.in_channels != out_channels):
            downsample = nn.Sequential(
                conv3x3(self.in_channels, out_channels, stride=stride),
                nn.BatchNorm2d(out_channels))
        layers = []
        layers.append(block(self.in_channels, out_channels, stride, downsample))
        self.in_channels = out_channels
        for i in range(1, blocks):
            layers.append(block(out_channels, out_channels))
        # [*[1,2,3]]
        # Out[96]: [1, 2, 3]
        return nn.Sequential(*layers)

    def forward(self, x):
        out = self.conv(x)  # out.shape: torch.Size([100, 16, 32, 32])
        out = self.bn(out)
        out = self.relu(out)
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.avg_pool(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out

model = ResNet(ResidualBlock, [2, 2, 2]).to(device)

# pip install torchsummary or
# git clone https://github.com/sksq96/pytorch-summary
from torchsummary import summary
# input_size = (C, H, W)
summary(model, input_size=(3, 32, 32))

images, labels = next(iter(train_loader))
outputs = model(images.to(device))

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# For updating the learning rate
def update_lr(optimizer, lr):
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

# Train the model
total_step = len(train_loader)
curr_lr = learning_rate
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i + 1) % 100 == 0:
            print("Epoch [{}/{}], Step [{}/{}] Loss: {:.4f}"
                  .format(epoch + 1, num_epochs, i + 1, total_step, loss.item()))

    # Decay the learning rate
    if (epoch + 1) % 20 == 0:
        curr_lr /= 3
        update_lr(optimizer, curr_lr)

# Test the model
model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the model on the test images: {} %'.format(100 * correct / total))

# Save the model checkpoint
torch.save(model.state_dict(), 'resnet.ckpt')
The corresponding network construction in Keras
""" #pytorch def conv3x3(in_channels, out_channels, stride=1): return nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False) """ def conv3x3(x,out_channels, stride = 1 ): #out = spatial_2d_padding(x,padding=((1, 1), (1, 1)), data_format="channels_last") return Conv2D(filters = out_channels, kernel_size = [ 3 , 3 ], strides = (stride,stride),padding = "same" )(x) """ # pytorch # Residual block class ResidualBlock(nn.Module): def __init__(self, in_channels, out_channels, stride=1, downsample=None): super(ResidualBlock, self).__init__() self.conv1 = conv3x3(in_channels, out_channels, stride) self.bn1 = nn.BatchNorm2d(out_channels) self.relu = nn.ReLU(inplace=True) self.conv2 = conv3x3(out_channels, out_channels) self.bn2 = nn.BatchNorm2d(out_channels) self.downsample = downsample def forward(self, x): residual = x out = self.conv1(x) out = self.bn1(out) out = self.relu(out) out = self.conv2(out) out = self.bn2(out) if self.downsample: residual = self.downsample(x) out += residual out = self.relu(out) return out """ def ResidualBlock(x, out_channels, stride = 1 , downsample = False ): residual = x out = conv3x3(x, out_channels,stride) out = BatchNormalization()(out) out = Activation( "relu" )(out) out = conv3x3(out, out_channels) out = BatchNormalization()(out) if downsample: residual = conv3x3(residual, out_channels, stride = stride) residual = BatchNormalization()(residual) out = keras.layers.add([residual,out]) out = Activation( "relu" )(out) return out """ #pytorch def make_layer(self, block, out_channels, blocks, stride=1): downsample = None if (stride != 1) or (self.in_channels != out_channels): downsample = nn.Sequential( conv3x3(self.in_channels, out_channels, stride=stride), nn.BatchNorm2d(out_channels)) layers = [] layers.append(block(self.in_channels, out_channels, stride, downsample)) self.in_channels = out_channels for i in range(1, blocks): layers.append(block(out_channels, out_channels)) # [*[1,2,3]] # Out[96]: [1, 2, 3] return nn.Sequential(*layers) """ def make_layer(x, out_channels, blocks, stride = 1 ): # tf backend: x.output_shape[-1]==out_channels #print("x.shape[-1] ",x.shape[-1]) downsample = False if (stride ! = 1 ) or (out_channels ! = x.shape[ - 1 ]): downsample = True out = ResidualBlock(x, out_channels, stride, downsample) for i in range ( 1 , blocks): out = ResidualBlock(out, out_channels) return out def KerasResidual(input_shape): images = Input (input_shape) out = conv3x3(images, 16 ) # out.shape=(None, 32, 32, 16) out = BatchNormalization()(out) out = Activation( "relu" )(out) layer1_out = make_layer(out, 16 , layers[ 0 ]) layer2_out = make_layer(layer1_out, 32 , layers[ 1 ], 2 ) layer3_out = make_layer(layer2_out, 64 , layers[ 2 ], 2 ) out = AveragePooling2D(pool_size = ( 8 , 8 ))(layer3_out) out = Flatten()(out) # pytorch 的nn.CrossEntropyLoss()會首先執行softmax計算 # 當換成keras時,沒有tf類似的softmax_cross_entropy # 自帶的categorical_crossentropy不會執行激活操作,因此得在Dense層加上activation out = Dense(units = 10 , activation = "softmax" )(out) model = Model(inputs = images,outputs = out) return model input_shape = ( 32 , 32 , 3 ) layers = [ 2 , 2 , 2 ] mymodel = KerasResidual(input_shape) mymodel.summary() |
PyTorch model summary
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 16, 32, 32]             432
       BatchNorm2d-2           [-1, 16, 32, 32]              32
              ReLU-3           [-1, 16, 32, 32]               0
            Conv2d-4           [-1, 16, 32, 32]           2,304
       BatchNorm2d-5           [-1, 16, 32, 32]              32
              ReLU-6           [-1, 16, 32, 32]               0
            Conv2d-7           [-1, 16, 32, 32]           2,304
       BatchNorm2d-8           [-1, 16, 32, 32]              32
              ReLU-9           [-1, 16, 32, 32]               0
    ResidualBlock-10           [-1, 16, 32, 32]               0
           Conv2d-11           [-1, 16, 32, 32]           2,304
      BatchNorm2d-12           [-1, 16, 32, 32]              32
             ReLU-13           [-1, 16, 32, 32]               0
           Conv2d-14           [-1, 16, 32, 32]           2,304
      BatchNorm2d-15           [-1, 16, 32, 32]              32
             ReLU-16           [-1, 16, 32, 32]               0
    ResidualBlock-17           [-1, 16, 32, 32]               0
           Conv2d-18           [-1, 32, 16, 16]           4,608
      BatchNorm2d-19           [-1, 32, 16, 16]              64
             ReLU-20           [-1, 32, 16, 16]               0
           Conv2d-21           [-1, 32, 16, 16]           9,216
      BatchNorm2d-22           [-1, 32, 16, 16]              64
           Conv2d-23           [-1, 32, 16, 16]           4,608
      BatchNorm2d-24           [-1, 32, 16, 16]              64
             ReLU-25           [-1, 32, 16, 16]               0
    ResidualBlock-26           [-1, 32, 16, 16]               0
           Conv2d-27           [-1, 32, 16, 16]           9,216
      BatchNorm2d-28           [-1, 32, 16, 16]              64
             ReLU-29           [-1, 32, 16, 16]               0
           Conv2d-30           [-1, 32, 16, 16]           9,216
      BatchNorm2d-31           [-1, 32, 16, 16]              64
             ReLU-32           [-1, 32, 16, 16]               0
    ResidualBlock-33           [-1, 32, 16, 16]               0
           Conv2d-34             [-1, 64, 8, 8]          18,432
      BatchNorm2d-35             [-1, 64, 8, 8]             128
             ReLU-36             [-1, 64, 8, 8]               0
           Conv2d-37             [-1, 64, 8, 8]          36,864
      BatchNorm2d-38             [-1, 64, 8, 8]             128
           Conv2d-39             [-1, 64, 8, 8]          18,432
      BatchNorm2d-40             [-1, 64, 8, 8]             128
             ReLU-41             [-1, 64, 8, 8]               0
    ResidualBlock-42             [-1, 64, 8, 8]               0
           Conv2d-43             [-1, 64, 8, 8]          36,864
      BatchNorm2d-44             [-1, 64, 8, 8]             128
             ReLU-45             [-1, 64, 8, 8]               0
           Conv2d-46             [-1, 64, 8, 8]          36,864
      BatchNorm2d-47             [-1, 64, 8, 8]             128
             ReLU-48             [-1, 64, 8, 8]               0
    ResidualBlock-49             [-1, 64, 8, 8]               0
        AvgPool2d-50             [-1, 64, 1, 1]               0
           Linear-51                   [-1, 10]             650
================================================================
Total params: 195,738
Trainable params: 195,738
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 3.63
Params size (MB): 0.75
Estimated Total Size (MB): 4.38
----------------------------------------------------------------
Keras model summary
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_26 (InputLayer)           (None, 32, 32, 3)    0
__________________________________________________________________________________________________
conv2d_103 (Conv2D)             (None, 32, 32, 16)   448         input_26[0][0]
__________________________________________________________________________________________________
batch_normalization_99 (BatchNo (None, 32, 32, 16)   64          conv2d_103[0][0]
__________________________________________________________________________________________________
activation_87 (Activation)      (None, 32, 32, 16)   0           batch_normalization_99[0][0]
__________________________________________________________________________________________________
conv2d_104 (Conv2D)             (None, 32, 32, 16)   2320        activation_87[0][0]
__________________________________________________________________________________________________
batch_normalization_100 (BatchN (None, 32, 32, 16)   64          conv2d_104[0][0]
__________________________________________________________________________________________________
activation_88 (Activation)      (None, 32, 32, 16)   0           batch_normalization_100[0][0]
__________________________________________________________________________________________________
conv2d_105 (Conv2D)             (None, 32, 32, 16)   2320        activation_88[0][0]
__________________________________________________________________________________________________
batch_normalization_101 (BatchN (None, 32, 32, 16)   64          conv2d_105[0][0]
__________________________________________________________________________________________________
add_34 (Add)                    (None, 32, 32, 16)   0           activation_87[0][0]
                                                                 batch_normalization_101[0][0]
__________________________________________________________________________________________________
activation_89 (Activation)      (None, 32, 32, 16)   0           add_34[0][0]
__________________________________________________________________________________________________
conv2d_106 (Conv2D)             (None, 32, 32, 16)   2320        activation_89[0][0]
__________________________________________________________________________________________________
batch_normalization_102 (BatchN (None, 32, 32, 16)   64          conv2d_106[0][0]
__________________________________________________________________________________________________
activation_90 (Activation)      (None, 32, 32, 16)   0           batch_normalization_102[0][0]
__________________________________________________________________________________________________
conv2d_107 (Conv2D)             (None, 32, 32, 16)   2320        activation_90[0][0]
__________________________________________________________________________________________________
batch_normalization_103 (BatchN (None, 32, 32, 16)   64          conv2d_107[0][0]
__________________________________________________________________________________________________
add_35 (Add)                    (None, 32, 32, 16)   0           activation_89[0][0]
                                                                 batch_normalization_103[0][0]
__________________________________________________________________________________________________
activation_91 (Activation)      (None, 32, 32, 16)   0           add_35[0][0]
__________________________________________________________________________________________________
conv2d_108 (Conv2D)             (None, 16, 16, 32)   4640        activation_91[0][0]
__________________________________________________________________________________________________
batch_normalization_104 (BatchN (None, 16, 16, 32)   128         conv2d_108[0][0]
__________________________________________________________________________________________________
activation_92 (Activation)      (None, 16, 16, 32)   0           batch_normalization_104[0][0]
__________________________________________________________________________________________________
conv2d_110 (Conv2D)             (None, 16, 16, 32)   4640        activation_91[0][0]
__________________________________________________________________________________________________
conv2d_109 (Conv2D)             (None, 16, 16, 32)   9248        activation_92[0][0]
__________________________________________________________________________________________________
batch_normalization_106 (BatchN (None, 16, 16, 32)   128         conv2d_110[0][0]
__________________________________________________________________________________________________
batch_normalization_105 (BatchN (None, 16, 16, 32)   128         conv2d_109[0][0]
__________________________________________________________________________________________________
add_36 (Add)                    (None, 16, 16, 32)   0           batch_normalization_106[0][0]
                                                                 batch_normalization_105[0][0]
__________________________________________________________________________________________________
activation_93 (Activation)      (None, 16, 16, 32)   0           add_36[0][0]
__________________________________________________________________________________________________
conv2d_111 (Conv2D)             (None, 16, 16, 32)   9248        activation_93[0][0]
__________________________________________________________________________________________________
batch_normalization_107 (BatchN (None, 16, 16, 32)   128         conv2d_111[0][0]
__________________________________________________________________________________________________
activation_94 (Activation)      (None, 16, 16, 32)   0           batch_normalization_107[0][0]
__________________________________________________________________________________________________
conv2d_112 (Conv2D)             (None, 16, 16, 32)   9248        activation_94[0][0]
__________________________________________________________________________________________________
batch_normalization_108 (BatchN (None, 16, 16, 32)   128         conv2d_112[0][0]
__________________________________________________________________________________________________
add_37 (Add)                    (None, 16, 16, 32)   0           activation_93[0][0]
                                                                 batch_normalization_108[0][0]
__________________________________________________________________________________________________
activation_95 (Activation)      (None, 16, 16, 32)   0           add_37[0][0]
__________________________________________________________________________________________________
conv2d_113 (Conv2D)             (None, 8, 8, 64)     18496       activation_95[0][0]
__________________________________________________________________________________________________
batch_normalization_109 (BatchN (None, 8, 8, 64)     256         conv2d_113[0][0]
__________________________________________________________________________________________________
activation_96 (Activation)      (None, 8, 8, 64)     0           batch_normalization_109[0][0]
__________________________________________________________________________________________________
conv2d_115 (Conv2D)             (None, 8, 8, 64)     18496       activation_95[0][0]
__________________________________________________________________________________________________
conv2d_114 (Conv2D)             (None, 8, 8, 64)     36928       activation_96[0][0]
__________________________________________________________________________________________________
batch_normalization_111 (BatchN (None, 8, 8, 64)     256         conv2d_115[0][0]
__________________________________________________________________________________________________
batch_normalization_110 (BatchN (None, 8, 8, 64)     256         conv2d_114[0][0]
__________________________________________________________________________________________________
add_38 (Add)                    (None, 8, 8, 64)     0           batch_normalization_111[0][0]
                                                                 batch_normalization_110[0][0]
__________________________________________________________________________________________________
activation_97 (Activation)      (None, 8, 8, 64)     0           add_38[0][0]
__________________________________________________________________________________________________
conv2d_116 (Conv2D)             (None, 8, 8, 64)     36928       activation_97[0][0]
__________________________________________________________________________________________________
batch_normalization_112 (BatchN (None, 8, 8, 64)     256         conv2d_116[0][0]
__________________________________________________________________________________________________
activation_98 (Activation)      (None, 8, 8, 64)     0           batch_normalization_112[0][0]
__________________________________________________________________________________________________
conv2d_117 (Conv2D)             (None, 8, 8, 64)     36928       activation_98[0][0]
__________________________________________________________________________________________________
batch_normalization_113 (BatchN (None, 8, 8, 64)     256         conv2d_117[0][0]
__________________________________________________________________________________________________
add_39 (Add)                    (None, 8, 8, 64)     0           activation_97[0][0]
                                                                 batch_normalization_113[0][0]
__________________________________________________________________________________________________
activation_99 (Activation)      (None, 8, 8, 64)     0           add_39[0][0]
__________________________________________________________________________________________________
average_pooling2d_2 (AveragePoo (None, 1, 1, 64)     0           activation_99[0][0]
__________________________________________________________________________________________________
flatten_2 (Flatten)             (None, 64)           0           average_pooling2d_2[0][0]
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 10)           650         flatten_2[0][0]
==================================================================================================
Total params: 197,418
Trainable params: 196,298
Non-trainable params: 1,120
__________________________________________________________________________________________________
This concludes the walkthrough of Keras's three model-definition styles and their differences; hopefully it serves as a useful reference.
Original link: https://blog.csdn.net/yeziyezi1986/article/details/106780379