News
2023-07: Two papers were accepted to ICCV 2023!
2023-05: I'll be a student researcher at Google DeepMind this summer!
2022-06: I am greatly honored to be chosen as a UW Reality Lab-Meta Fellow!
2021-09: DynamicViT is accepted to NeurIPS 2021.
2021-07: Two papers are accepted to ICCV 2021.
2020-12: One paper on image classification is accepted to AAAI 2021.
2020-07: One paper on knowledge distillation is accepted to ECCV 2020.
|
Publications
* indicates equal contribution
|
|
Efficient Inference of Vision and Language Instruction-Following Models with Elastic Cache
Anonymous
In Submission
[Paper]
We improve the efficiency of large language models, especially in multimodal contexts, with 'importance-driven cache merging' to manage KV cache memory needs, thereby boosting long instruction following and long output generation in multimodal chatbots!
|
|
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
Yushi Hu, Benlin Liu, Jungo Kasai, Yizhong Wang, Mari Ostendorf, Ranjay Krishna, Noah A. Smith
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
[Paper][Project Page][Code]
Fine-grained and accurate evaluation of synthesized images using Image-to-Text Models (e.g. GPT-4, BLIP-2) and Large Language Models (e.g. GPT-3.5). More accurate than CLIP!
|
|
Unleashing Text-to-Image Diffusion Models for Visual Perception
Wenliang Zhao*, Yongming Rao*, Zuyan Liu*, Benlin Liu, Jie Zhou, Jiwen Lu
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
[Paper][Project Page][Code]
Text-to-Image generation models (e.g. Stable Diffusion) are not only for creating cool stuff, but can also be applied to multiple dense prediction tasks!
|
|
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
Yongming Rao, Wenliang Zhao, Benlin Liu, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh
Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS), 2021
[Paper][Project Page][Code][Video]
We propose a dynamic token sparsification framework to prune redundant tokens progressively and dynamically for vision transformer acceleration.
|
|
RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection
Benlin Liu*, Yongming Rao*, Yi Wei, Jiwen Lu, Cho-Jui Hsieh, Jie Zhou
IEEE/CVF International Conference on Computer Vision (ICCV), 2021
[Paper]
We propose to generate random layouts of a scene by making use of the objects in the synthetic CAD dataset and learn the 3D scene representation by applying object-level contrastive learning on two random scenes generated from the same set of synthetic objects.
|
|
Robust Object Detection via Instance-Level Temporal Cycle Confusion
Xin Wang, Benlin Liu*, Thomas E. Huang*, Fisher Yu, Xiaolong Wang, Joseph E. Gonzalez, Trevor Darrell
IEEE/CVF International Conference on Computer Vision (ICCV), 2021
[Paper][Project Page]
We introduce a new self-supervised task on videos to improve the out-of-domain generalization of object detectors.
|
|
Multi-Proxy Wasserstein Classifier for Image Classification
Benlin Liu*, Yongming Rao*, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh
35th AAAI Conference on Artificial Intelligence (AAAI), 2021
[Paper]
We present a new Multi-Proxy Wasserstein Classifier to improve image classification models by calculating a non-uniform matching flow between the elements in the feature map of a sample and multiple proxies of a class using optimal transport theory.
|
|
MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation
Benlin Liu, Yongming Rao, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh
16th European Conference on Computer Vision (ECCV), 2020
[Paper]
We boost the performance of CNNs by learning soft targets for shallow layers via meta-learning.
|
Academic Services
Conference Reviewer: CVPR 2021-2024, WACV 2021-2023, NeurIPS 2023, ICCV 2023, ECCV 2022, ICLR 2022-2024
|
© Benlin Liu | Last updated: Nov 30, 2023
|