Benlin Liu

I am a Ph.D. student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, working with Ranjay Krishna. I am affiliated with the UW Graphics and Imaging Laboratory (GRAIL).

I received my master's degree from the Department of Computer Science at UCLA, where I was a research assistant under the supervision of Prof. Cho-Jui Hsieh. I also collaborated with Prof. Xiaolong Wang at UCSD. I obtained my B.Eng. degree from the Department of Electronic Engineering, Tsinghua University, where I worked with Prof. Jiwen Lu of the Department of Automation. As an undergraduate, I visited the GRASP Lab at the University of Pennsylvania and worked with Prof. Jianbo Shi.

My research lies at the intersection of computer vision and machine learning, with a current focus on diffusion models, self-supervised learning, and large language models. My past research focused on efficient machine learning models and on learning more generalizable vision models in a data-efficient way.

Email  /  CV  /  Google Scholar  /  Github  /  Twitter

  • 2023-07: Two papers were accepted to ICCV 2023!
  • 2023-05: I'll be a student researcher at Google DeepMind this summer!
  • 2022-06: I am greatly honored to be chosen as a UW Reality Lab-Meta Fellow!
  • 2021-09: DynamicViT is accepted to NeurIPS 2021.
  • 2021-07: Two papers are accepted to ICCV 2021.
  • 2020-12: One paper on image classification is accepted to AAAI 2021.
  • 2020-07: One paper on knowledge distillation is accepted to ECCV 2020.

    Publications

    * indicates equal contribution

    Efficient Inference of Vision and Language Instruction-Following Models with Elastic Cache
    In Submission

    Enhancing large language model efficiency, especially in multimodal contexts, with 'importance-driven cache merging' to manage KV cache memory needs, thereby boosting long instruction following and long output generation in multimodal chatbots!

    TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
    Yushi Hu, Benlin Liu, Jungo Kasai, Yizhong Wang, Mari Ostendorf, Ranjay Krishna, Noah A. Smith
    IEEE/CVF International Conference on Computer Vision (ICCV), 2023
    [Paper][Project Page][Code]

    Fine-grained and accurate evaluation of synthesized images using Image-to-Text Models (e.g. GPT-4, BLIP-2) and Large Language Models (e.g. GPT-3.5). More accurate than CLIP!

    Unleashing Text-to-Image Diffusion Models for Visual Perception
    Wenliang Zhao*, Yongming Rao*, Zuyan Liu*, Benlin Liu, Jie Zhou, Jiwen Lu
    IEEE/CVF International Conference on Computer Vision (ICCV), 2023
    [Paper][Project Page][Code]

    Text-to-Image generation models (e.g. Stable Diffusion) are not only for creating cool stuff, but can also be applied to multiple dense prediction tasks!

    DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
    Yongming Rao, Wenliang Zhao, Benlin Liu, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh
    Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS), 2021
    [Paper][Project Page][Code][Video]

    We propose a dynamic token sparsification framework to prune redundant tokens progressively and dynamically for vision transformer acceleration.

    RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection
    Benlin Liu*, Yongming Rao*, Yi Wei, Jiwen Lu, Cho-Jui Hsieh, Jie Zhou
    IEEE/CVF International Conference on Computer Vision (ICCV), 2021

    We propose to generate random layouts of a scene by making use of the objects in the synthetic CAD dataset and learn the 3D scene representation by applying object-level contrastive learning on two random scenes generated from the same set of synthetic objects.

    Robust Object Detection via Instance-Level Temporal Cycle Confusion
    Xin Wang, Benlin Liu*, Thomas E. Huang*, Fisher Yu, Xiaolong Wang, Joseph E. Gonzalez, Trevor Darrell
    IEEE/CVF International Conference on Computer Vision (ICCV), 2021
    [Paper][Project Page]

    We introduce a new self-supervised task on videos to improve the out-of-domain generalization of object detectors.

    Multi-Proxy Wasserstein Classifier for Image Classification
    Benlin Liu*, Yongming Rao*, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh
    35th AAAI Conference on Artificial Intelligence (AAAI), 2021

    We present a new Multi-Proxy Wasserstein Classifier that improves image classification models by using optimal transport theory to compute a non-uniform matching flow between the elements in the feature map of a sample and multiple proxies of a class.

    MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation
    Benlin Liu, Yongming Rao, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh
    16th European Conference on Computer Vision (ECCV), 2020

    We boost the performance of CNNs by learning soft targets for shallow layers via meta-learning.

    Academic Services

  • Conference Reviewer: CVPR 2021-2024, WACV 2021-2023, NeurIPS 2023, ICCV 2023, ECCV 2022, ICLR 2022-2024

    © Benlin Liu | Last updated: Nov 30, 2023