Learned image compression has demonstrated superior rate-distortion performance compared to traditional standards in recent years. Despite these advances, legacy formats such as JPEG remain dominant. The primary barrier to adoption is not quality but computational complexity: real-time applications demand low decoding latency for a smooth user experience, whereas current learned models often require significantly longer processing times, rendering them impractical on standard hardware. We identify that the majority of this computational burden lies in the initial high-resolution convolutional layers that transform pixels into feature maps. To address this, we propose a hierarchical feature extraction transform. By using fewer channels for high-resolution inputs and increasing the channel depth only as the spatial dimensions are reduced toward the latent representation, we substantially cut the computational load without sacrificing compression efficiency. This strategy reduces forward-pass complexity from 1256 kMAC/pixel to just 270 kMAC/pixel. This architectural shift offers an immediate path to deploying efficient learned compression on existing devices without relying on future hardware acceleration.
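To make the complexity argument concrete, the following is a back-of-the-envelope sketch of how multiply-accumulate operations per pixel accumulate across strided convolution layers. The layer shapes and channel widths are illustrative assumptions, not the published LoC-LIC configuration, so the totals only show the trend rather than reproduce the 1256 and 270 kMAC/pixel figures.

```python
# Back-of-the-envelope MAC/pixel count for strided convolution layers.
# Layer shapes below are illustrative assumptions, not the exact LoC-LIC configuration.

def conv_macs_per_input_pixel(c_in, c_out, kernel, stride):
    """MACs per input pixel of a conv layer: each output position costs
    c_in * c_out * kernel^2 MACs, and there are 1/stride^2 output positions per input pixel."""
    return c_in * c_out * kernel * kernel / (stride * stride)

# A conventional transform keeps a wide channel count from the first,
# full-resolution layer onward, so the earliest layers dominate the cost.
wide = [
    (3, 192, 5, 2),    # full resolution -> 1/2
    (192, 192, 5, 2),  # 1/2 -> 1/4
    (192, 192, 5, 2),  # 1/4 -> 1/8
    (192, 192, 5, 2),  # 1/8 -> 1/16
]

# A hierarchical transform starts narrow at full resolution and widens
# the channels only as the spatial resolution shrinks.
hierarchical = [
    (3, 48, 5, 2),
    (48, 96, 5, 2),
    (96, 192, 5, 2),
    (192, 192, 5, 2),
]

def total_kmacs_per_pixel(layers):
    total, scale = 0.0, 1.0  # scale = fraction of original pixels seen by this layer
    for c_in, c_out, k, s in layers:
        total += scale * conv_macs_per_input_pixel(c_in, c_out, k, s)
        scale /= s * s  # each stride-2 layer quarters the pixel count for the next one
    return total / 1000.0

print(f"wide transform:         {total_kmacs_per_pixel(wide):.1f} kMAC/pixel")
print(f"hierarchical transform: {total_kmacs_per_pixel(hierarchical):.1f} kMAC/pixel")
```

Even in this toy count, the full-resolution layers dominate the wide transform's cost, and narrowing them yields roughly a four-fold reduction, which is the same trend the full model exhibits.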
Our approach uses hierarchical feature extraction transforms to map images from the pixel domain to the latent domain and back, reducing both memory and computational complexity. Its key features are a narrow channel width at high spatial resolutions, a channel depth that grows only as the spatial dimensions shrink, and a resulting reduction in forward-pass complexity from 1256 kMAC/pixel to 270 kMAC/pixel, as sketched below.
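As an illustration of the analysis (pixel-to-latent) side, here is a minimal PyTorch sketch of a hierarchical feature extraction transform that keeps the channel count small at full resolution and widens it only after each downsampling step. The channel widths (48/96/144/192), kernel sizes, and activations are assumptions for the sketch, not the published LoC-LIC architecture.

```python
import torch
import torch.nn as nn


class HierarchicalAnalysisTransform(nn.Module):
    """Sketch of a hierarchical analysis transform: few channels while the spatial
    resolution is high, more channels only after downsampling. Channel widths,
    kernel sizes, and activations are illustrative assumptions."""

    def __init__(self, channels=(48, 96, 144, 192)):
        super().__init__()
        c1, c2, c3, c4 = channels
        self.stages = nn.Sequential(
            nn.Conv2d(3, c1, kernel_size=5, stride=2, padding=2),   # full res -> 1/2, narrow
            nn.GELU(),
            nn.Conv2d(c1, c2, kernel_size=5, stride=2, padding=2),  # 1/2 -> 1/4, wider
            nn.GELU(),
            nn.Conv2d(c2, c3, kernel_size=5, stride=2, padding=2),  # 1/4 -> 1/8
            nn.GELU(),
            nn.Conv2d(c3, c4, kernel_size=5, stride=2, padding=2),  # 1/8 -> 1/16, widest
        )

    def forward(self, x):
        return self.stages(x)


if __name__ == "__main__":
    y = HierarchicalAnalysisTransform()(torch.randn(1, 3, 256, 256))
    print(y.shape)  # torch.Size([1, 192, 16, 16])
```

Because each stride-2 stage quarters the number of spatial positions, the widest and most expensive convolutions only ever run on the smallest feature maps; the synthesis transform mirrors this idea in reverse.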
Our proposed method demonstrates significant improvements in both computational and memory efficiency. The following figures illustrate the performance of our model compared to state-of-the-art methods.
@article{ameen2026loclic,
  title={LoC-LIC: Low Complexity Learned Image Coding Using Hierarchical Feature Transforms},
  author={Ameen, Ayman A and Richter, Thomas and Kaup, Andr{\'e}},
  year={2026}
}