Jilan Xu
jilanxu18 at fudan dot edu dot cn

I am a final-year PhD student at Fudan University, advised by Professor Yuejie Zhang. I also work closely with Professor Weidi Xie. My research focuses on multimodal machine learning, video understanding, and medical image analysis. I hope that someday medical AI agents will heal the world and make it a better place for the entire human race.

Google Scholar  /  Twitter  /  GitHub  /  Zhihu 

News

  • I'm actively looking for postdoc/job positions in 2025; please feel free to email me!
  • [05/2024] Our CVPR papers Egoinstructor and EgoExoLearn were also accepted to the 1st LPVL Workshop @ CVPR 2024
  • [04/2024] We ranked 1st in Track 2 and 4th in Track 1 of the 4th COV19D Competition @ CVPR 2024

Research
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World
Yifei Huang*, Guo Chen*, Jilan Xu*, Mingfang Zhang*, Baoqi Pei, Hongjie Zhang, Lu Dong, Yali Wang, Limin Wang, Yu Qiao
CVPR 2024  
arXiv / project page / code

A cross-view benchmark dataset that emulates the human demonstration following process, containing recorded egocentric videos guided by exocentric-view demonstration videos.

Retrieval-Augmented Egocentric Video Captioning
Jilan Xu, Yifei Huang, Junlin Hou, Guo Chen, Yuejie Zhang, Rui Feng, Weidi Xie
CVPR 2024  
arXiv / project page / code

Given an egocentric video, Egoinstructor automatically retrieves relevant exocentric instructional videos to assist egocentric video captioning.

Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision
Jilan Xu, Junlin Hou, Yuejie Zhang, Rui Feng, Yi Wang, Yu Qiao, Weidi Xie
CVPR 2023  
arXiv / project page / code

Training open-vocabulary semantic segmentation models with image-text pairs only, which enables zero-shot transfer to various segmentation datasets.

CREAM: Weakly supervised object localization via class re-activation mapping
Jilan Xu, Junlin Hou, Yuejie Zhang, Rui Feng, Rui-Wei Zhao, Tao Zhang, Xuequan Lu, Shang Gao
CVPR 2022  
arXiv

A weakly-supervised object localization model that generates better CAMs via soft-clustering algorithms.

Does video-text pretraining help open-vocabulary online action detection?
Qingsong Zhao, Yi Wang, Jilan Xu, Yinan He, Zifan Song, Limin Wang, Yu Qiao, Cairong Zhao
NeurIPS 2024  
arXiv

A zero-shot online action detector that leverages vision-language models and enables open-world temporal understanding.

InternVideo: General Video Foundation Models via Generative and Discriminative Learning
Yi Wang, Kunchang Li, Yizhuo Li, Yinan He, Bingkun Huang, Zhiyu Zhao, Hongjie Zhang, Jilan Xu, Yi Liu, Zun Wang, Sen Xing, Guo Chen, Junting Pan, Jiashuo Yu, Yali Wang, Limin Wang, Yu Qiao
Tech report 2022  
arXiv / code

A foundation model for video and video-text understanding, achieving state-of-the-art results on over 30 benchmark datasets.

Concept-Attention Whitening for Interpretable Skin Lesion Diagnosis
Junlin Hou, Jilan Xu, Hao Chen
MICCAI 2024  
arXiv

An XAI framework that aligns the axes of the latent space with concepts of interest for interpretable skin lesion diagnosis.

Anatomical structure-guided medical vision-language pre-training
Qingqiu Li, Xiaohan Yan, Jilan Xu, Runtian Yuan, Yuejie Zhang, Rui Feng, Quanli Shen, Xiaobo Zhang, Shujun Wang
MICCAI 2024  
arXiv

An anatomical structure-guided visual-text pre-training framework that leverages anatomical knowledge.

CMC_v2: Towards More Accurate COVID-19 Detection with Discriminative Video Priors
Junlin Hou, Jilan Xu, Nan Zhang, Yi Wang, Yuejie Zhang, Xiaobo Zhang, Rui Feng
ECCV 2022 AIMIA Workshop  
arXiv / code

A Transformer-based model with contrastive representation enhancement. Winner of the 2nd COVID-19 Detection Challenge at ECCV 2022.

TCCNet: Temporally Consistent Context-Free Network for Semi-supervised Video Polyp Segmentation
Xiaotong Li, Jilan Xu, Yuejie Zhang, Rui Feng, Rui-Wei Zhao, Tao Zhang, Xuequan Lu, Shang Gao
IJCAI 2022, Oral  
paper

Co-training a model for semi-supervised video polyp segmentation, achieving comparable results using only 15% labeled data.

CMC-COV19D: Contrastive Mixup Classification for COVID-19 Diagnosis
Junlin Hou*, Jilan Xu*, Rui Feng, Yuejie Zhang, Fei Shan, Weiya Shi
ICCV 2021 AIMIA Workshop  
paper / code

A ResNeSt-50 model combined with a contrastive mixup technique for 3D COVID-19 CT image classification. Winner of the 1st COVID-19 Detection Challenge.

Data-Efficient Histopathology Image Analysis with Deformation Representation Learning
Jilan Xu, Junlin Hou, Yuejie Zhang, Rui Feng, Chunyang Ruan, Tao Zhang, Weiguo Fan
BIBM 2020, Oral  
paper

Introducing a self-supervised deformation representation learning technique for histopathology image analysis.

Awards & Honors

  • Winner of the 4th COV19D Competition Track 2 (COVID-19 Domain Adaptation Challenge) and ranked 4th in Track 1 (COVID-19 Detection Challenge) @ CVPR 2024
  • Winner of the MMAC Challenge Track 1 (Classification of Myopic Maculopathy) and Track 2 (Segmentation of Myopic Maculopathy Plus Lesions) @ MICCAI 2023
  • Winner of the 1st and 2nd COVID-19 Detection Challenges @ ICCV 2021 and ECCV 2022
  • Winner of the 1st COVID-19 Severity Detection Challenge @ ECCV 2022
  • VenusTech Enterprise Scholarship

Working Experience

  • Research intern @ Shanghai AI Laboratory, supervised by Dr. Yifei Huang, Yi Wang, and Prof. Yu Qiao.
  • Research intern @ Bell AI Lab, Shanghai, supervised by Dr. Chenhui Ye.
  • Google Winter AI Camp. Our team won the best presentation award!
  • SWE intern @ Morgan Stanley Technology, supervised by Ray Zhou.

Academic Services

Conference Reviewer: ICLR25, NeurIPS24, ECCV24, MICCAI24, CVPR24, CVPR23, ICCV23, NeurIPS22

Journal Reviewer: Nature Communications, TPAMI, IJCV, TMM, Neurocomputing

TA: Data Structure, The Theory of Computation

