Published work:
WOK papers: 83; Chinese core journal papers: 12; other papers: 1; invention patents: 6.
DYNAMIC LANGUAGE GROUP-BASED MOE: ENHANCING EFFICIENCY AND FLEXIBILITY FOR CODE-SWITCHING SPEECH RECOGNITION
LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation
MM-TTS: Multi-Modal Prompt Based Style Transfer for Expressive Text-to-Speech Synthesis
IMPROVING MULTI-SPEAKER ASR WITH OVERLAP-AWARE ENCODING AND MONOTONIC ATTENTION
THE XMUSPEECH SYSTEM FOR AUDIO-VISUAL TARGET SPEAKER EXTRACTION IN MISP 2023 CHALLENGE
REFLOW-TTS: A RECTIFIED FLOW MODEL FOR HIGH-FIDELITY TEXT-TO-SPEECH
COMMUNITY DETECTION GRAPH CONVOLUTIONAL NETWORK FOR OVERLAP-AWARE SPEAKER DIARIZATION
Interpretable Style Transfer for Text-to-Speech with ControlVAE and Diffusion Bridge
Cross-Modal Semantic Alignment before Fusion for Two-Pass End-to-End Spoken Language Understanding
Conformer-based Language Embedding with Self-Knowledge Distillation for Spoken Language Identification
Meta Learning with Adaptive Loss Weight for Low-Resource Speech Recognition
The XMU System for Audio-Visual Diarization and Recognition in MISP Challenge 2022
Unsupervised Speaker Verification Using Pre-Trained Model and Label Correction
Community Detection Graph Convolutional Network for Overlap-Aware Speaker Diarization
Towards A Unified Conformer Structure: from ASR to ASV Task
CASA-Net: Cross-attention and Self-attention for End-to-End Audio-visual Speaker Diarization
A Pipelined Framework with Serialized Output Training for Overlapping Speech Recognition
TOWARDS A UNIFIED CONFORMER STRUCTURE: FROM ASR TO ASV TASK
Spatial-aware Speaker Diarization for Multi-channel Multi-party Meeting
Deep Representation Decomposition for Rate-Invariant Speaker Verification
GRAPH CONVOLUTIONAL NETWORK BASED SEMI-SUPERVISED LEARNING ON MULTI-SPEAKER MEETING DATA
The XMUSPEECH System for Accented English Automatic Speech Recognition
Spatial-aware Speaker Diarization for Multi-channel Multi-party Meeting
Oriental Language Recognition (OLR) 2021: Summary and Analysis
A Multi-task Framework of Speaker Recognition with TTS Data Augmentation
Towards Language-universal Mandarin-English Speech Recognition with Unsupervised Label Synchronous Adaptation
Respiratory Sound Classification: From Fluid-Solid Coupling Analysis to Feature-Band Attention
When Speaker Recognition Meets Noisy Labels: Optimizations for Front-Ends and Back-Ends
Reverberation aware deep learning for environment tolerant microphone array DOA estimation
When Speaker Recognition Meets Noisy Labels: Optimizations for Front-ends and Back-ends
Deep joint learning for language recognition
MULTI-FEATURE LEARNING WITH CANONICAL CORRELATION ANALYSIS CONSTRAINT FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
LIGHTSPEECH: LIGHTWEIGHT NON-AUTOREGRESSIVE MULTI-SPEAKER TEXT-TO-SPEECH
LIGHT-TTS: LIGHTWEIGHT MULTI-SPEAKER MULTI-LINGUAL TEXT-TO-SPEECH
END-TO-END MULTI-ACCENT SPEECH RECOGNITION WITH UNSUPERVISED ACCENT MODELLING
ASV-SUBTOOLS: OPEN SOURCE TOOLKIT FOR AUTOMATIC SPEAKER VERIFICATION
Automatic Error Correction for Speaker Embedding Learning with Noisy Labels
An Integrated Framework for Two-pass Personalized Voice Trigger
Additive Phoneme-aware Margin Softmax Loss for Language Recognition
Phoneme-aware and Channel-wise Attentive Learning for Text Dependent Speaker Verification
Real-time End-to-End Monaural Multi-speaker Speech Recognition
Oriental Language Recognition (OLR) 2020: Summary and Analysis
OLR 2021 Challenge: Datasets, Rules and Baselines
AP20-OLR Challenge: Three Tasks and Their Baselines
XMU-TS Systems for NIST SRE19 CTS Challenge
Extraction of noise-robust speaker embedding based on generative adversarial networks
Speaker embedding extraction with multi-feature integration structure
Phone-aware multi-task learning and length expanding for short-duration language recognition
Training Multi-task Adversarial Network for Extracting Noise-robust Speaker Embedding
Improving the generalized performance of deep embedding for text-independent speaker verification
Anti-spoofing speaker verification system with multi-feature integration and multi-task learning
Deep speaker embedding extraction with channel-wise feature responses and additive supervision softmax loss function
Electroencephalogram-based brain-computer interface for the Chinese spelling system: a survey
Evaluation of the I-vector system for text-dependent speaker verification
Transfer learning for PLDA-based speaker verification
Classification between normal and adventitious lung sounds using deep neural network
Speech enhancement based on nonparametric factor analysis
Adaptive noise cancellation and classification of lung sounds under practical environment
A transfer learning method for PLDA-based speaker verification
Advancement in the EEG-based Chinese spelling systems
Transfer learning for speaker verification on short utterances
Modified-prior PLDA based speaker recognition system
Modified-prior PLDA and score calibration for duration mismatch compensation in speaker recognition system
Duration dependent covariance regularization in PLDA modeling for speaker verification
A Robust Speaker-Adaptive and Text-Prompted Speaker Verification System
Fuzzy neural network based dynamic path planning
GMM-UBM for text-dependent speaker recognition
Precise prediction model and simplified scoring system for sustained combined response to interferon-alpha
A method based on spectrogram and pitch for biometric authentication
A word alignment model based on multiobjective evolutionary algorithms
Voiceprint verification based on two-level decision HMM-UBM
Incorporating syntax-based language models in phrase-based SMT models
A chunk-based reordering model for phrase-based SMT systems
Real-time speaker verification based on GMM-UBM for PDA
Embedded speech recognition system for intelligent robot
Translation memory sharing models in XMCAT
A discriminative training approach for text-independent speaker recognition
A model for ranking sentence pairs in parallel corpora
A hybrid method for syntactic and semantic structure disambiguation for Chinese
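Semi-supervised speaker verification system based on a pre-trained model. Journal of Tsinghua University (Science and Technology), 1000-0054, 2024-07-31.
Transfer learning of self-supervised models for the Minnan dialect. Journal of Xiamen University (Natural Science), 0438-0479, 2024-07-28.
Research on end-to-end multilingual speech recognition. Journal of Signal Processing, 1003-0530, 2021-10-15.
An optimization method for feature extraction in speaker recognition systems. Journal of Xiamen University (Natural Science), 0438-0479, 2020-11-28.
Design and implementation of an end-to-end Minnan speech synthesis system. Journal of Xiamen University (Natural Science), 0438-0479, 2020-04-15.
A microphone array dereverberation algorithm using sidelobe enhancement. Journal of Xiamen University (Natural Science), 0438-0479, 2017.
Respiratory sound classification based on the minimum classification error criterion. Journal of Xiamen University (Natural Science), 0438-0479, 2016-05-16.
Speaker recognition system based on modified-prior PLDA. Journal of Tianjin University (Science and Technology), 0493-2137, 2015-08-15.
A microphone array speech enhancement algorithm that tracks the direction of a moving sound source. Journal of Xiamen University (Natural Science), 0438-0479, 2015-07-28.
A GSC microphone array speech enhancement method using an adjustable beamformer. Journal of Xiamen University (Natural Science), 0438-0479, 2013-03-28.
An embedded voiceprint recognition system using the DTW algorithm and speech enhancement. Journal of Xiamen University (Natural Science), 0438-0479, 2012.
A GMM-based real-time speaker recognition system. Audio Engineering, 1002-8684, 2007-06-17.
Application of voiceprint recognition in open instrument management. Journal of Huaqiao University (Natural Science), 1000-5013, 2015-09-20.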