Published works:
WOK papers: 537; Chinese core journal papers: 11; other papers: 1; invention patents: 8.
Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation
Adaptive Feature Selection for No-Reference Image Quality Assessment using Contrastive Mitigating Semantic Noise Sensitivity
PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization
Aligning and Prompting Everything All at Once for Universal Visual Perception
Lottery Jackpots Exist in Pre-Trained Models
X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation
Code Search Debiasing: Improve Search Results beyond Overall Ranking Performance
I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization
Semi-Supervised Panoptic Narrative Grounding
PixelFace+: Towards Controllable Face Generation and Manipulation with Text Descriptions and Segmentation Masks
Learning Occlusion Disentanglement with Fine-grained Localization for Occluded Person Re-identification
Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation
Improving Human-Object Interaction Detection via Virtual Image Learning
Beat: Bi-directional One-to-Many Embedding Alignment for Text-based Person Retrieval
Shadow Removal by High-Quality Shadow Synthesis
CAPro: Webly Supervised Learning with Cross-Modality Aligned Prototypes
JM3D & JM3D-LLM: Elevating 3D Representation with Joint Multi-modal Cues
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
AutoDiffusion: Training-Free Optimization of Time Steps and Architectures for Automated Diffusion Model Acceleration
Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models
Towards Unified Token Learning for Vision-Language Tracking
DLIP: Distilling Language-Image Pre-training
M3PS: End-to-End Multi-Grained Multi-Modal Attribute-Aware Product Summarization in E-commerce
EALink: An Efficient and Accurate Pre-trained Framework for Issue-Commit Link Recovery
HODN: Disentangling Human-Object Feature for HOI Detection
Towards Language-Guided Visual Recognition via Dynamic Convolutions
Continual Face Forgery Detection via Historical Distribution Preserving
Pseudo-label Alignment for Semi-supervised Instance Segmentation
Towards General Visual-Linguistic Face Forgery Detection
RefBERT: A Two-Stage Pre-trained Framework for Automatic Rename Refactoring
Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer
CF-ViT: A General Coarse-to-Fine Method for Vision Transformer
OMPQ: Orthogonal Mixed Precision Quantization
Approximated Prompt Tuning for Vision-Language Pre-trained Models
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
Shadow-Aware Dynamic Convolution for Shadow Removal
Spatial Re-parameterization for N:M Sparsity
STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection
Adapting Pre-trained Language Models to Vision-Language Tasks via Dynamic Visual Prompting
Towards Local Visual Modeling for Image Captioning
CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models
DiffRate: Differentiable Compression Rate for Efficient Vision Transformers
Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models
MultiQuant: A Novel Multi-Branch Topology Method for Arbitrary Bit-width Network Quantization
Distribution-Flexible Subset Quantization for Post-Quantizing Super-Resolution Networks
InterFormer: Real-time Interactive Image Segmentation
Latent Feature Relation Consistency for Adversarial Robustness
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance
CAT: Collaborative Adversarial Training
You Only Segment Once: Towards Real-Time Panoptic Segmentation
Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-Identification
Active Teacher for Semi-Supervised Object Detection
DistilPose: Tokenized Pose Regression with Heatmap Distillation
Dynamic Support Network for Few-Shot Class Incremental Learning
HGNN+: General Hypergraph Neural Networks
Self-supervised Graph Representation Learning for Black Market Account Detection
Towards End-to-end Semi-supervised Learning for One-stage Object Detection
Towards Efficient Visual Adaption via Structural Re-parameterization
Bi-directional Masks for Efficient N:M Sparse Training
Real-Time Image Demoiréing on Mobile Devices
Exploring Invariant Representation for Visible-Infrared Person Re-Identification
Spectral Aware Softmax for Visible-Infrared Person Re-Identification
Unsupervised Domain Adaptation on Person Re-Identification via Dual-level Asymmetric Mutual Learning
DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning
Training Compact CNNs for Image Classification Using Dynamic-Coded Filter Fusion
Prioritized Subnet Sampling for Resource-Adaptive Supernet Training
Transformer Tracking via Frequency Fusion
Positive-Sample-Free Object Tracking via a Soft Constraint
Toward Unified Token Learning for Vision-Language Tracking
Semantically Consistent Visual Representation for Adversarial Robustness
Bilateral Knowledge Interaction Network for Referring Image Segmentation
A Survivor in the Era of Large-Scale Pretraining: An Empirical Study of One-Stage Referring Expression Comprehension
RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension
Interactive Object Placement with Reinforcement Learning
RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension
Meta Architecture for Point Cloud Analysis
Discriminator-Cooperated Feature Map Distillation for GAN Compression
Clover: Towards A Unified Video-Language Alignment and Fusion Model
Category-aware Allocation Transformer for Weakly Supervised Object Localization
SMMix: Self-Motivated Image Mixing for Vision Transformers
Automatic Network Pruning via Hilbert-Schmidt Independence Criterion Lasso under Information Bottleneck Principle
Improving Adversarial Robustness via Information Bottleneck Distillation
Discover and Align Taxonomic Context Priors for Open-world Semi-Supervised Learning
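Model Stealing Attack Method Based on Sampling and Weighted Loss Functions
Scientia Sinica Informationis, 1674-7267, 2023-05-16.
Information Technology: Neural Network Representation and Model Compression, Part 1: Convolutional Neural Networks
State Administration for Market Regulation; Standardization Administration of China, 2023-03-17.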