Is

**Mathematica** (v12, my uninformed attempt at a manual conversion)

```
h = StringSplit@Import[
   "https://raw.githubusercontent.com/IBM/tensorflow-hangul-recognition/master/labels/2350-common-hangul.txt"];
n = NetChain[{
    ConvolutionLayer[32, 5], Ramp, PoolingLayer[2, 2],
    ConvolutionLayer[64, 5], Ramp, PoolingLayer[2, 2],
    ConvolutionLayer[128, 3], Ramp, PoolingLayer[2, 2],
    FlattenLayer[],
    LinearLayer[1024], Ramp,
    DropoutLayer[],
    LinearLayer[h // Length],
    SoftmaxLayer[]},
   "Input" -> NetEncoder[{"Image", {64, 64}, ColorSpace -> "Grayscale"}],
   "Output" -> NetDecoder[{"Class", h}]]
```

a good conversion of

**TensorFlow** (v1?, source network, excerpt from the complete GitHub file)

```python
# First convolutional layer. 32 feature maps.
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_conv1 = tf.nn.conv2d(x_image, W_conv1, strides=[1, 1, 1, 1], padding='SAME')
h_conv1 = tf.nn.relu(x_conv1 + b_conv1)

# Max-pooling.
h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# Second convolutional layer. 64 feature maps.
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
x_conv2 = tf.nn.conv2d(h_pool1, W_conv2, strides=[1, 1, 1, 1], padding='SAME')
h_conv2 = tf.nn.relu(x_conv2 + b_conv2)
h_pool2 = tf.nn.max_pool(h_conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# Third convolutional layer. 128 feature maps.
W_conv3 = weight_variable([3, 3, 64, 128])
b_conv3 = bias_variable([128])
x_conv3 = tf.nn.conv2d(h_pool2, W_conv3, strides=[1, 1, 1, 1], padding='SAME')
h_conv3 = tf.nn.relu(x_conv3 + b_conv3)
h_pool3 = tf.nn.max_pool(h_conv3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# Fully connected layer. Here we choose to have 1024 neurons in this layer.
h_pool_flat = tf.reshape(h_pool3, [-1, 8*8*128])
W_fc1 = weight_variable([8*8*128, 1024])
b_fc1 = bias_variable([1024])
h_fc1 = tf.nn.relu(tf.matmul(h_pool_flat, W_fc1) + b_fc1)

# Dropout layer. This helps fight overfitting.
keep_prob = tf.placeholder(tf.float32, name=keep_prob_node_name)
h_fc1_drop = tf.nn.dropout(h_fc1, rate=1-keep_prob)

# Classification layer.
W_fc2 = weight_variable([1024, num_classes])
b_fc2 = bias_variable([num_classes])
y = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

# This isn't used for training, but for when using the saved model.
tf.nn.softmax(y, name=output_node_name)
```
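To check my understanding of the shapes, I traced the spatial sizes implied by the TensorFlow code above in plain Python (my own sketch; with `padding='SAME'` and these strides, the output size of each layer is `ceil(input / stride)`):

```python
import math

def same_out(size, stride):
    # With padding='SAME', the output spatial size depends only on the stride.
    return math.ceil(size / stride)

size = 64                      # 64x64 grayscale input
for stride in [1, 2,           # conv1 (stride 1), pool1 (stride 2)
               1, 2,           # conv2, pool2
               1, 2]:          # conv3, pool3
    size = same_out(size, stride)

print(size, size * size * 128)   # 8 8192 -> matches the 8*8*128 flatten
```

So the `tf.reshape(h_pool3, [-1, 8*8*128])` only works because every layer uses `'SAME'` padding.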

- How can the Mathematica model be improved to match the TensorFlow version exactly?
- Is there a resource anywhere for learning the correspondences between the two?
- Specifically, I am not sure about:
  - `padding='SAME'` – how do I stay true to this in Mathematica?
  - Is `tf.nn.relu(x_conv1 + b_conv1)` equivalent to `Ramp`?
  - Is `tf.matmul` equivalent to `LinearLayer`?
  - Is `FlattenLayer[]` the right counterpart of the `tf.reshape` flattening step?
  - Should it be `DropoutLayer[.5]`? (TensorFlow switches `keep_prob` between `0.5` and `1.0`; see the complete file linked above.)
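The padding point seems to be the big one: if I am reading the docs right, Mathematica's `ConvolutionLayer` defaults to `PaddingSize -> 0`, which is TensorFlow's `'VALID'`, not `'SAME'`. A quick size comparison in plain Python (my own sketch):

```python
import math

def valid_out(size, kernel, stride=1):
    # padding='VALID' / PaddingSize -> 0: no padding is added.
    return (size - kernel) // stride + 1

def same_out(size, stride):
    # padding='SAME': output size depends only on the stride.
    return math.ceil(size / stride)

# TensorFlow path (all 'SAME'): 64 -> 32 -> 16 -> 8
s = 64
for stride in [1, 2, 1, 2, 1, 2]:
    s = same_out(s, stride)
print("SAME: ", s)   # 8 -> flatten is 8*8*128

# Mathematica defaults (no PaddingSize option): convolutions shrink the image
v = 64
v = valid_out(v, 5); v = valid_out(v, 2, 2)   # conv 5x5 -> 60, pool -> 30
v = valid_out(v, 5); v = valid_out(v, 2, 2)   # conv 5x5 -> 26, pool -> 13
v = valid_out(v, 3); v = valid_out(v, 2, 2)   # conv 3x3 -> 11, pool -> 5
print("VALID:", v)   # 5 -> flatten would be 5*5*128
```

If this is right, the two networks have differently sized fully connected layers, and presumably something like `PaddingSize -> 2` on the 5x5 convolutions and `PaddingSize -> 1` on the 3x3 one would be needed to mimic `'SAME'` (kernel `k`, stride 1 needs `(k-1)/2` padding per side).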

I feel like I am making critical mistakes somewhere. The resulting network is too sensitive.