Autoencoders without decoders


The purpose of an autoencoder is to obtain a latent representation z of some input X, where z = g(X).
If we had an oracle that told us the true latent representation z for each X, we could train the function
g in a supervised manner and be done. We don’t have such an oracle, so we create a function d(z) = X’ which reconstructs the original input from the latent representation, and then compare the reconstruction to the original input.
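For reference, this standard setup can be sketched in a few lines of NumPy: a toy linear autoencoder trained by gradient descent on the reconstruction MSE. All names, shapes, and hyperparameters here are illustrative choices of mine, not taken from the original code:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 16))           # a batch of inputs
W_enc = rng.standard_normal((16, 4)) * 0.1  # encoder weights: g(X) = X @ W_enc
W_dec = rng.standard_normal((4, 16)) * 0.1  # decoder weights: d(z) = z @ W_dec

def mse(a, b):
    return float(np.mean((a - b) ** 2))

lr = 0.01
loss_before = mse(X @ W_enc @ W_dec, X)
for _ in range(200):
    z = X @ W_enc                # latent representation
    X_rec = z @ W_dec            # reconstruction X'
    err = X_rec - X              # gradient of the MSE w.r.t. X_rec, up to a constant
    grad_dec = z.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc
loss_after = mse(X @ W_enc @ W_dec, X)
assert loss_after < loss_before  # reconstruction error decreases with training
```

The point is only the shape of the setup: the encoder is never supervised directly; its only training signal flows back through the decoder's reconstruction error.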
Can’t we train the encoder g without needing a decoder d? I was thinking that perhaps we can simply take the gradient of z w.r.t. X as our reconstruction. In TensorFlow this can be done in one line:

# Instead of
reconstruction = decoder(z)
# we use the gradient of z w.r.t. the input
# (tf.gradients returns a list with one tensor per entry in xs)
reconstruction = tf.gradients(ys=z, xs=X)[0]

I wrote up some code that compares this ‘virtual’ autoencoder to a standard convolutional autoencoder (CAE) on MNIST. It appears to work, and it reaches a significantly lower MSE than the standard CAE.
It’s not learning a useful representation of the data, but what is it doing?
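One way to probe the question: `tf.gradients(ys=z, xs=X)` computes the gradient of `sum(z)` with respect to `X`. For a linear encoder that gradient is the same for every input, so the ‘reconstruction’ ignores X entirely. A NumPy sketch makes this explicit via finite differences (toy shapes and names of my own, not the original code; this doesn't settle what the trained nonlinear model is doing, it only shows the linear case):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 4))   # linear encoder: z = X @ W
X1 = rng.standard_normal((1, 16))
X2 = rng.standard_normal((1, 16))  # a completely different input

def grad_sum_z(X, W, eps=1e-5):
    """Finite-difference estimate of d(sum(z))/dX, i.e. what
    tf.gradients(ys=z, xs=X) computes for this encoder."""
    g = np.zeros_like(X)
    base = (X @ W).sum()
    for i in range(X.shape[1]):
        Xp = X.copy()
        Xp[0, i] += eps
        g[0, i] = ((Xp @ W).sum() - base) / eps
    return g

g1 = grad_sum_z(X1, W)
g2 = grad_sum_z(X2, W)
assert np.allclose(g1, g2, atol=1e-4)             # same 'reconstruction' for both inputs
assert np.allclose(g1, W.sum(axis=1), atol=1e-4)  # it is just the row sums of W
```

For a nonlinear encoder the gradient does depend on X, but this at least shows the ‘reconstruction’ term is a very different object from d(z): it is a sensitivity map of the summed latent code, not an attempt to invert g.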

The code can be found at: