Research

My research lies in the broad area of computer vision and machine learning, with a particular focus on data- and computation-efficient deep learning models and their applications in computer vision. In the long term, my goal is to develop intelligent systems that can learn from, adapt to, and interact with their environment, and autonomously perform complex tasks in the real world.

My work includes an adaptive downsampling method [1] that selectively resamples pixels to accelerate inference without sacrificing accuracy, and a spatiotemporal state space model for prostate cancer detection in multi-parametric MRI (mpMRI) [2]. Other projects include open-vocabulary segmentation with vision-language models (VLMs) [3], efficient image enhancement with neural implicit representations [4] and diffusion models [5], semantic line detection with Hough representations [6], palmprint recognition with purely synthetic training data [7], and token pruning for VLMs.

References

[1]
K. Zhao, L. Ruan, H. Jiang, X. Zhu, X. Zhang, and D. Zeng, “Beyond Predictive Resampling: Learning Input-Agnostic Downsampling for Efficient Aligned Vision Recognition,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2026, pp. 3–11. doi: 10.1609/aaai.v40i16.38319.
[2]
K. Zhao et al., “PCa-Mamba: Spatiotemporal State Space Models for Prostate Cancer Detection in Multi-Parametric MRI,” Medical Image Analysis, p. 104033, 2026, doi: 10.1016/j.media.2026.104033.
[3]
K. Zhao et al., “Open-Vocabulary Camouflaged Object Segmentation with Cascaded Vision Language Models,” Computational Visual Media, vol. 8, no. 3, pp. 331–368, 2025, doi: 10.26599/CVM.2025.9450512.
[4]
K. Pang, K. Zhao, A. L. Y. Hung, H. Zheng, R. Yan, and K. Sung, “NExpR: Neural Explicit Representation for fast arbitrary-scale medical image super-resolution,” Computers in Biology and Medicine, vol. 184, p. 109354, 2025, doi: 10.1016/j.compbiomed.2024.109354.
[5]
K. Zhao, A. L. Y. Hung, K. Pang, H. Zheng, and K. Sung, “MRI Super-Resolution with Partial Diffusion Models,” IEEE Transactions on Medical Imaging, 2024, doi: 10.1109/TMI.2024.3483109.
[6]
K. Zhao, Q. Han, C.-B. Zhang, J. Xu, and M.-M. Cheng, “Deep hough transform for semantic line detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 9, pp. 4793–4806, 2021, doi: 10.1109/TPAMI.2021.3077129.
[7]
K. Zhao et al., “Bézierpalm: A free lunch for palmprint recognition,” in European Conference on Computer Vision, Springer, 2022, pp. 19–36. doi: 10.1007/978-3-031-19778-9_2.

This page is written in MDX, and the citations are rendered automatically from publications.bib using the rehype-citation plugin.