Issue
In keras / tensorflow it is often quite simple to describe layers directly as functions that map their input to an output, like so:
from tensorflow.keras.layers import Conv2D, Add

def resnet_block(x, kernel_size):
    ch = x.shape[-1]
    out = Conv2D(ch, kernel_size, strides=(1, 1), padding='same', activation='relu')(x)
    out = Conv2D(ch, kernel_size, strides=(1, 1), padding='same', activation='relu')(out)
    out = Add()([x, out])
    return out
whereas subclassing Layer to get something like
r = ResNetBlock(kernel_size=(3,3))
y = r(x)
is a little more cumbersome (or even a lot more cumbersome for more complex examples).
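For context, a subclassed equivalent would look roughly like this (a sketch of what I mean, not code I actually use):

from tensorflow.keras.layers import Layer, Conv2D, Add

class ResNetBlock(Layer):
    def __init__(self, kernel_size, **kwargs):
        super(ResNetBlock, self).__init__(**kwargs)
        self.kernel_size = kernel_size

    def build(self, input_shape):
        # create the two convolutions once the channel count is known
        ch = input_shape[-1]
        self.conv1 = Conv2D(ch, self.kernel_size, strides=(1, 1), padding='same', activation='relu')
        self.conv2 = Conv2D(ch, self.kernel_size, strides=(1, 1), padding='same', activation='relu')
        self.add = Add()

    def call(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        return self.add([x, out])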
Since keras seems perfectly happy to construct the underlying weights of its layers when they're being called for the first time, I was wondering if it was possible to just wrap functions such as the one above and let keras figure things out once there are inputs, i.e. I would like it to look like this:
r = FunctionWrapperLayer(lambda x:resnet_block(x, kernel_size=(3,3)))
y = r(x)
I've made an attempt at implementing FunctionWrapperLayer, which looks as follows:
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Layer

class FunctionWrapperLayer(Layer):
    def __init__(self, fn):
        super(FunctionWrapperLayer, self).__init__()
        self.fn = fn

    def build(self, input_shape):
        shape = input_shape[1:]
        inputs = Input(shape)
        outputs = self.fn(inputs)
        self.model = Model(inputs=inputs, outputs=outputs)
        self.model.compile()

    def call(self, x):
        return self.model(x)
This looks like it might work; however, I've run into some bizarre issues whenever I use activations, e.g. with
import tensorflow as tf

def bad(x):
    out = tf.keras.activations.sigmoid(x)
    out = Conv2D(1, (1, 1), strides=(1, 1), padding='same')(out)
    return out
x = tf.constant(tf.reshape(tf.range(48, dtype=tf.float32), [1, 4, -1, 1]))
w = FunctionWrapperLayer(bad)
w(x)
I get the following error:
FailedPreconditionError: Error while reading resource variable _AnonymousVar34 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar34/class tensorflow::Var does not exist.
[[node conv2d_6/BiasAdd/ReadVariableOp (defined at <ipython-input-33-fc380d9255c5>:12) ]] [Op:__inference_keras_scratch_graph_353]
What this suggests to me is that there is something inherently wrong with initializing models like that in the build method. Maybe someone has a better idea as to what might be going on there or how else to get the functionality I would like.
Update: As mentioned by jr15, the above does work when the function involved only uses keras layers. However, the following ALSO works, which has me a little puzzled:
i = Input(x.shape[1:])
o = bad(i)
model = Model(inputs=i, outputs=o)
model(x)
Incidentally, model.submodules yields
(<tensorflow.python.keras.engine.input_layer.InputLayer at 0x219d80c77c0>,
<tensorflow.python.keras.engine.base_layer.TensorFlowOpLayer at 0x219d7afc820>,
<tensorflow.python.keras.layers.convolutional.Conv2D at 0x219d7deafa0>)
meaning the activation is automatically turned into a "TensorFlowOpLayer" when doing it like that.
Another update: Looking at the original error message, it seems like the activation isn't the only culprit. If I remove the convolution and use the wrapper everything works as well and again I find a "TensorFlowOpLayer" when inspecting the submodules.
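Based on jr15's observation, a possible workaround for the old behaviour would be to wrap the raw op in a Lambda layer, so that the wrapped function only uses Keras layers (untested sketch, reusing FunctionWrapperLayer and x from above; the function name is just for illustration):

import tensorflow as tf
from tensorflow.keras.layers import Lambda, Conv2D

def good(x):
    # wrap the raw TF op in a Lambda layer so only Keras layers appear in the traced model
    out = Lambda(tf.keras.activations.sigmoid)(x)
    out = Conv2D(1, (1, 1), strides=(1, 1), padding='same')(out)
    return out

w = FunctionWrapperLayer(good)
w(x)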
Solution
With TensorFlow 2.4 it apparently just works now. The submodules now show a "TFOpLambda" layer.
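For example, something along these lines (reusing bad, x, and the wrapper from the question) now runs without the FailedPreconditionError:

import tensorflow as tf
print(tf.__version__)          # 2.4.x or later

w = FunctionWrapperLayer(bad)  # the same function that failed before
y = w(x)
print(w.model.submodules)      # now lists a TFOpLambda layer for the sigmoid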
To anybody interested, here is some slightly improved wrapper code that also accommodates multi-input models:
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Layer

class FunctionWrapperLayer(Layer):
    def __init__(self, fn):
        super(FunctionWrapperLayer, self).__init__()
        self.fn = fn

    def build(self, input_shapes):
        super(FunctionWrapperLayer, self).build(input_shapes)
        if type(input_shapes) is list:
            inputs = [Input(shape[1:]) for shape in input_shapes]
        else:
            inputs = Input(input_shapes[1:])
        outputs = self.fn(inputs)
        self.fn_model = Model(inputs=inputs, outputs=outputs)
        self.fn_model.compile()

    def call(self, x):
        return self.fn_model(x)
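For example, a multi-input call could look like this (hypothetical sketch; the concatenation function and shapes are made up for illustration):

import tensorflow as tf
from tensorflow.keras.layers import Concatenate, Conv2D

def concat_then_conv(inputs):
    a, b = inputs
    out = Concatenate()([a, b])
    return Conv2D(4, (3, 3), padding='same')(out)

layer = FunctionWrapperLayer(concat_then_conv)
x1 = tf.random.normal([1, 8, 8, 2])
x2 = tf.random.normal([1, 8, 8, 3])
y = layer([x1, x2])  # build receives a list of shapes, so the multi-input branch is used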
Answered By - Cereal