A learnable layer that upsamples spatial feature maps by applying the transpose of the convolution operation.
A transposed convolutional layer is a neural network component designed to increase the spatial resolution of feature maps, effectively performing a learned upsampling operation. Unlike a standard convolutional layer, which typically reduces spatial dimensions by sliding a filter across an input, the transposed convolution works in the opposite direction — mapping a smaller input to a larger output. Despite being commonly called a "deconvolution," it does not mathematically invert a convolution; rather, it computes the transpose of the convolution operation, which is where the name originates.
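A minimal PyTorch sketch of this learned upsampling; the channel counts, kernel size, and stride here are illustrative choices, not prescribed values. With stride 2, the layer maps an 8x8 feature map to 16x16, following the output-size relation (H_in - 1) * stride - 2 * padding + kernel_size.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 16, 8, 8)  # (batch, channels, height, width) -- example sizes

# Transposed convolution with stride 2 roughly doubles spatial resolution.
up = nn.ConvTranspose2d(in_channels=16, out_channels=8,
                        kernel_size=4, stride=2, padding=1)
y = up(x)

# Output size: (8 - 1) * 2 - 2 * 1 + 4 = 16 per spatial dimension.
print(y.shape)  # torch.Size([1, 8, 16, 16])
```

The kernel-size/stride/padding combination (4, 2, 1) is a common choice precisely because it yields an exact doubling of resolution.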
Mechanically, the transposed convolution achieves upsampling by inserting zeros between input values (which is why it is sometimes described as a fractionally strided convolution) and then applying a learned filter via a standard convolution. This expands the spatial footprint of the input while allowing the network to learn the best way to fill in the upsampled space. Because the filters are trained end-to-end via backpropagation, the layer has a significant advantage over fixed interpolation methods like bilinear or nearest-neighbor upsampling, which cannot adapt to the task at hand.
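The zero-insertion view can be checked directly. The sketch below, a 1D single-channel example with assumed sizes, computes a transposed convolution two ways: once with the built-in operation, and once by hand, inserting zeros between input values, padding by kernel_size - 1, and running a standard convolution with the flipped kernel.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 1, 5)   # (batch, channels, length) -- example sizes
w = torch.randn(1, 1, 3)   # single-channel kernel of size 3

# Direct transposed convolution: stride 2, no padding, no bias.
y_direct = F.conv_transpose1d(x, w, stride=2)

# Equivalent by hand: insert (stride - 1) zeros between input values...
dilated = torch.zeros(1, 1, 2 * x.shape[-1] - 1)
dilated[..., ::2] = x
# ...then a standard convolution with the flipped kernel,
# padded by kernel_size - 1 on each side.
y_manual = F.conv1d(dilated, w.flip(-1), padding=2)

assert torch.allclose(y_direct, y_manual, atol=1e-6)
```

The kernel flip appears because deep-learning "convolution" is actually cross-correlation; transposing it reintroduces the flip.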
Transposed convolutional layers became central to many influential architectures in the mid-2010s. Fully Convolutional Networks (FCNs) used them to project coarse, deep feature maps back to full image resolution for pixel-wise semantic segmentation. Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) rely on transposed convolutions in their generator and decoder components, respectively, to synthesize high-resolution images from compact latent representations. Encoder-decoder architectures like U-Net also depend on them to recover spatial detail lost during downsampling.
Despite their utility, transposed convolutions can produce a characteristic checkerboard artifact in generated images, caused by uneven overlap of the learned filters during upsampling. This has led practitioners to sometimes prefer alternatives such as resize-convolution, where a fixed upsampling step is followed by a standard convolution. Nevertheless, transposed convolutional layers remain a foundational building block in generative modeling, image segmentation, and any task requiring learned spatial upscaling.
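The resize-convolution alternative mentioned above can be sketched as follows; the layer sizes and the choice of nearest-neighbor interpolation are assumptions for illustration. Because the fixed upsampling step covers every output position uniformly, the uneven-overlap pattern that causes checkerboard artifacts is avoided.

```python
import torch
import torch.nn as nn

# Resize-convolution: fixed nearest-neighbor upsampling followed by a
# standard convolution whose filters are still learned end-to-end.
resize_conv = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="nearest"),
    nn.Conv2d(16, 8, kernel_size=3, padding=1),
)

x = torch.randn(1, 16, 8, 8)  # example input
y = resize_conv(x)
print(y.shape)  # torch.Size([1, 8, 16, 16])
```

The same output shape as a stride-2 transposed convolution is produced, but the upsampling itself is no longer learned, only the convolution that follows it.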