Published work:
WOK-indexed papers: 113; Chinese core journal papers: 1
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Learning Dynamic Prior Knowledge for Text-to-Face Pixel Synthesis
X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval
Towards Open-Ended Text-to-Face Generation, Combination and Manipulation
Clover: Towards A Unified Video-Language Alignment and Fusion Model
What Goes beyond Multi-modal Fusion in One-stage Referring Expression Comprehension: An Empirical Study
Towards Lightweight Transformer via Group-wise Transformation for Vision-and-Language Tasks
PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation
End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge Distillation
SeqTR: A Simple yet Universal Network for Visual Grounding
Global2Local: A Joint-Hierarchical Attention for Video Captioning
Differentiated Relevances Embedding for Group-based Referring Expression Comprehension
Plenty is Plague: Fine-Grained Learning for Visual Question Answering
Fast Monocular Depth Estimation via Side Prediction Aggregation with Continuous Spatial Refinement
Knowing What it is: Semantic-Enhanced Dual Attention Transformer
Multi-Branch Distance-Sensitive Self-Attention Network for Image Captioning
Knowing What to Learn: A Metric-Oriented Focal Mechanism for Image Captioning
Active Teacher for Semi-Supervised Object Detection
DIFNet: Boosting Visual Information Flow for Image Captioning
An Information Theoretic Approach for Attention-Driven Face Forgery Detection