About
I am a Senior Research Scientist and Senior Research Manager at the Fujitsu R&D Center in Beijing. My current research interests include music, audio, speech, and image signal processing. I received my PhD in computer science from the Harbin Institute of Technology in the spring of 2013, under the supervision of Prof. Jiqing Han. Before that, I received a Master of Science in computer science from the same institute in the summer of 2008, where I studied under Prof. Haifeng Li. I earned a bachelor’s degree in computer science from Northeastern University in Shenyang, where I was admitted without taking the Chinese national college entrance examination (Gaokao). Last, but certainly not least, I married my wonderful wife, Lei Shi, in the summer of 2012. (By the way, I am originally from the suburbs of Yangzhou City, Jiangsu Province.)
Publications
(Note: Most of my papers can be found on arXiv.)
Books
- Jiqing Han, Ziqiang Shi. 声学事件检测理论与方法 (Theory and Methods of Acoustic Event Detection) [M]. Science Press, 2016. (Purchase link: http://item.jd.com/10563712295.html)
Preprints
- Ziqiang Shi, Rujie Liu. Generative Modelling with High-Order Langevin Dynamics. https://arxiv.org/abs/2404.12814
- Ziqiang Shi, Rujie Liu, Jiqing Han. LaFurca: Iterative Refined Speech Separation Based on Context-Aware Dual-Path Parallel Bi-LSTM. 2020. https://arxiv.org/abs/2001.08998 (Achieved 20.55 dB SDR improvement, 20.35 dB SI-SDR improvement, a PESQ of 3.69, and an ESTOI of 94.86% on the WSJ0-2mix dataset. The separated voices can be heard on this page: https://shiziqiang.github.io/tastas/)
- Ziqiang Shi, Tieran Zheng, Jiqing Han. Identifiability of multivariate logistic mixture models. arxiv.org/abs/1208.3546.
Journal Papers
- Liwen Zhang, Jiqing Han, Ziqiang Shi. Learning Temporal Relations from Semantic Neighbors for Acoustic Scene Classification. IEEE Signal Processing Letters. 2020.
- Liwen Zhang, Ziqiang Shi, Jiqing Han. Pyramidal Temporal Pooling with Discriminative Mapping for Audio Classification. IEEE/ACM Trans. on Audio, Speech and Language Processing, 2020, DOI:10.1109/TASLP.2020.2966868
- Ziqiang Shi, Jiqing Han, Tieran Zheng. Soft Margin Based Low-rank Audio Signal Classification. Neural Processing Letters, 2014, DOI:10.1007/s11063-014-9357-6.
- Ziqiang Shi, Jiqing Han, Tieran Zheng, Shiwen Deng. Audio Segment Classification Using Online Learning Based Tensor Representation Feature Discrimination[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2013, 21(1): 184-194.
- Ziqiang Shi, Jiqing Han, Tieran Zheng, Ji Li. Identification of Objectionable Audio Segments Based on Pseudo and Heterogeneous Mixture Models[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2013, 21(3): 611-623.
- Ziqiang Shi, Jiqing Han, Tieran Zheng. Audio Classification with Low-rank Matrix Representation Features[J]. ACM Transactions on Intelligent Systems and Technology, 2014, 5(1).
- Ziqiang Shi, Tieran Zheng, Jiqing Han, Boyang Gao. Erotic Audio Recognition Using Heterogeneous Ensemble Classifiers[J]. International Journal of Computer and Electrical Engineering, vol. 4, no. 5, pp. 666-669, 2012.
- Shi Ziqiang, Gao Boyang, Zheng Tieran, Han Jiqing, “Study On The Recognition Of Objectionable Audio”, International Journal of Pattern Recognition and Artificial Intelligence, 2010, 24(6): 981-994.
- Ziqiang Shi, Haifeng Li, Jiayin Sun, “SVM-Based Recognition of Singing Voice in Popular Music” (in Chinese), Computer Engineering and Applications, 2008, 44(25): 126-128.
Conference Papers
- Ziqiang Shi, Rujie Liu. MULTIMEDIA GENERATIVE MODELLING WITH HIGH-ORDER LANGEVIN DYNAMICS. ICME 2024.
- Ziqiang Shi, Rujie Liu. LANGWAVE: REALISTIC VOICE GENERATION BASED ON HIGH-ORDER LANGEVIN DYNAMICS. ICASSP 2024.
- Ziqiang Shi, Rujie Liu. NOISY IMAGE RESTORATION BASED ON CONDITIONAL ACCELERATION SCORE APPROXIMATION. ICASSP 2024.
- Ziqiang Shi, Rujie Liu. Conditional Velocity Score Estimation for Image Restoration. WACV 2024. (Best paper award)
- Ziqiang Shi, Zhongling Liu, Liu Liu, Rujie Liu, Takuma Yamamoto, Xiaoyu Mi, and Daisuke Uchida. CheckSORT: Refined synthetic data combination and optimized sort for automatic retail checkout. In CVPR Workshop, 2023. (1st prize in the 7th AI CITY CHALLENGE)
- Zhongling Liu, Rujie Liu, Ziqiang Shi, Liu Liu, Xiaoyu Mi, Kentaro Murase. SEMI-SUPERVISED CONTRASTIVE LEARNING WITH SOFT MASK ATTENTION FOR FACIAL ACTION UNIT DETECTION. ICASSP 2023.
- Shoule Wu, Ziqiang Shi. ItoWave: Ito Stochastic Differential Equation Is All You Need For Wave Generation. ICASSP 2022. https://arxiv.org/abs/2201.12519
- Zhongling Liu, Ziqiang Shi, Rujie Liu, Liu Liu, Xiaoyu Mi, Kentaro Murase. Expression-assisted facial action unit detection through an attention mechanism and smooth class-weighted loss. Thirteenth International Conference on Signal Processing Systems (ICSPS 2021).
- Ziqiang Shi, Liu Liu, Zhongling Liu, Rujie Liu, Xiaoyu Mi, Murase Kentaro. HiCOMEX: Facial action unit recognition based on hierarchy intensity distribution and COMEX relation learning. 2021 4th International Conference on Intelligent Robotics and Control Engineering, IRCE 2021. (Best oral presentation)
- Ziqiang Shi, Rujie Liu, Jiqing Han. Speech Separation Based on Multi-Stage Elaborated Dual-Path Deep BiLSTM with Auxiliary Identity Loss. Interspeech 2020. https://arxiv.org/abs/2008.03149
- Liwen Zhang, Jiqing Han, Ziqiang Shi. ATReSN-Net: Capturing Attentive Temporal Relations in Semantic Neighborhood for Acoustic Scene Classification. Interspeech 2020.
- Shi Ziqiang, Liu Liu, Liu Rujie. HODGE AND PODGE: HYBRID SUPERVISED SOUND EVENT DETECTION WITH MULTI-HOT MIXMATCH AND COMPOSITION CONSISTENCE TRAINING. EUSIPCO 2020. https://arxiv.org/abs/2002.06021
- Liwen Zhang, Ziqiang Shi, et al. FurcaNeXt: End-to-end monaural speech separation with dynamic gated dilated temporal convolutional networks. MMM 2020.
- Ziqiang Shi, et al. HODGEPODGE: SOUND EVENT DETECTION BASED ON ENSEMBLE OF SEMI-SUPERVISED LEARNING METHODS. DCASE 2019 Workshop. arxiv.org/abs/1907.07398. (Ranked 3rd in DCASE 2019 Challenge Task 4: “Sound event detection in domestic environments”.)
- Ziqiang Shi, et al. FurcaNet: An end-to-end deep gated convolutional, long short-term memory, deep neural networks for single channel speech separation. National Conference on Man-Machine Speech Communication (NCMMSC) 2019. arxiv.org/abs/1902.00651
- Ziqiang Shi, et al. Is CQT more suitable for monaural speech separation than STFT? An empirical study. National Conference on Man-Machine Speech Communication (NCMMSC) 2019. arxiv.org/abs/1902.00631.
- Ziqiang Shi, Huibin Lin, Liu Liu, Rujie Liu, Jiqing Han, and Anyan Shi. Deep Attention Gated Dilated Temporal Convolutional Networks with Intra-Parallel Convolutional Modules for End-to-End Monaural Speech Separation [C]. Interspeech 2019.
- Ziqiang Shi, Huibin Lin, Liu Liu, Rujie Liu, Shoji Hayakawa, Shouji Harada and Jiqing Han. End-to-End Monaural Speech Separation with Multi-Scale Dynamic Weighted Gated Dilated Convolutional Pyramid Network [C]. Interspeech 2019.
- Ziqiang Shi, Huibin Lin, Liu Liu, Rujie Liu, Shoji Hayakawa and Jiqing Han. FurcaX: End-to-end monaural speech separation based on deep gated (de)convolutional neural networks with adversarial example training [C]. ICASSP 2019.
- Ziqiang Shi, Huibin Lin, Liu Liu, Rujie Liu, Shoji Hayakawa and Jiqing Han. Deep Clustering With Constant Q Transform For Multi-Talker Single Channel Speech Separation [C]. IEEE FRUCT 2018.
- Ziqiang Shi, Huibin Lin, Liu Liu, Rujie Liu. Double Joint Bayesian Modeling of DNN Local I-Vector for Text Dependent Speaker Verification with Random Digit Strings [C]. Interspeech 2018.
- Ziqiang Shi, Huibin Lin, Liu Liu, Rujie Liu. Latent Factor Analysis of Deep Bottleneck Features for Speaker Verification with Random Digit Strings [C]. Interspeech 2018.
- Ziqiang Shi, Liu Liu, Huibin Lin, Rujie Liu. Joint Learning of J-Vector Extractor and Joint Bayesian Model for Text Dependent Speaker Verification [C]. Interspeech 2018.
- Ziqiang Shi, Mengjiao Wang, Liu Liu, Huibin Lin, Rujie Liu. A Double Joint Bayesian Approach for J-Vector Based Text-dependent Speaker Verification[C]. Speaker Odyssey 2018.
- Ziqiang Shi, Rujie Liu. A better convergence analysis of the block coordinate descent method for large scale machine learning[C]. ICMLA 2017.
- Ziqiang Shi, Liu Liu, Mengjiao Wang, Rujie Liu. Multi-View (Joint) Probability Linear Discrimination Analysis For J-Vector Based Text Dependent Speaker Verification[C]. ASRU 2017.
- Ziqiang Shi, Rujie Liu. Online and stochastic Douglas-Rachford splitting method for large scale machine learning[C]. ACML workshop on Learning on big data 2016.
- Ziqiang Shi, Rujie Liu. Empirical study of PROXTONE and PROXTONE+ for Fast Learning of Large Scale Sparse Models[C]. ICSP 2016.
- Ziqiang Shi, Rujie Liu. Large Scale Optimization with Proximal Stochastic Newton-type Gradient Descent [C]. ECML 2015. (Acceptance rate: 89/383=23%)
- Ziqiang Shi, Rujie Liu. Online and Stochastic Universal Gradient Methods for Minimizing Regularized Hölder Continuous Finite Sums in Machine Learning[C]. PAKDD 2015. (Acceptance rate: 90/405=22%)
- Ziqiang Shi, Tieran Zheng, Jiqing Han, Ji Li. Guarantees of Augmented Trace Norm Models in Tensor Recovery[C]. IJCAI 2013. (Acceptance rate: 413/1473=28%)
- Ziqiang Shi, Tieran Zheng, Jiqing Han, Shiwen Deng. Low-rank Audio Signal Classification Under Soft Margin and Trace Norm Constraints[C]. Interspeech 2012, pp. 2401-2404.
- Ziqiang Shi, Jiqing Han, Tieran Zheng, “Anchor Space Based Audio Scene Recognition” (in Chinese), National Conference on Man-Machine Speech Communication (NCMMSC), 2011.
- Ziqiang Shi, Jiqing Han, Tieran Zheng, “A Novel Framework Based on Trace Norm Minimization for Audio Event Detection”, ICONIP 2011, Part II, LNCS 7063, pp. 646-654. Springer, Heidelberg (2011).
- Ziqiang Shi, Jiqing Han, Tieran Zheng, “Heterogeneous Mixture Models Using Sparse Representation Features For Applause And Laugh Detection”, IEEE International Workshop on Machine Learning For Signal Processing (MLSP), pp. 1-5, 2011.
- Ziqiang Shi, Jiqing Han, Tieran Zheng, “Real-World Speech/Non-Speech Audio Classification Based on Sparse Representation Features and GPCs”, Interspeech 2011, pp. 2401-2404.
- Miao Li, Jin Li, Jiqing Han, Ziqiang Shi, “Singing Melody Extraction from Pop Songs Using a Novel Feature and Viterbi Search”, IEEE International Conference on Computational Intelligence and Software Engineering (CiSE), pp. 1-4, 2010.
- Jin Li, Jiqing Han, Ziqiang Shi, “An Efficient Approach to Humming Transcription for Query-by-Humming System”, IEEE International Congress on Image and Signal Processing (CISP), pp. 3746-3749, 2010.
- Hao Xue, Haifeng Li, Chang Gao, Ziqiang Shi, “Computationally Efficient Audio Segmentation through a Multi-Stage BIC Approach”, IEEE International Conference on Image and Signal Processing (CISP), pp. 3774-3777, 2010.
- Ziqiang Shi, Boyang Gao, Tieran Zheng, Jiqing Han, “Objectionable Audio Content Understanding Based On In-Class Clustering Method”, IEEE International Conference on Network Infrastructure and Digital Content (IC-NIDC2009), pp. 712-716, 2009.
- Ziqiang Shi, Boyang Gao, Jiqing Han, Zhen Wu, “Study of Objectionable Sound Recognition based on Histogram Features and SVM”, IEEE International Conference on Image and Signal Processing (CISP), pp. 1-4, 2009.
PhD Thesis
ROBUST ACOUSTIC EVENT DETECTION BASED ON LONG-TERM FEATURES (基于长时特征的鲁棒声学事件检测).
Reviewing
IEEE Signal Processing Letters, Applied Acoustics, Speech Communication, Multimedia Tools and Applications, IEEE Transactions on Audio, Speech, and Language Processing, Acta Automatica Sinica, Acta Electronica Sinica, ECML, AAAI, IJCAI, WACV.
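Honors
- Best Paper Award, IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024)
- Champion, 7th AI City Challenge (AI CITY CHALLENGE 2023, https://www.aicitychallenge.org/2023-challenge-winners/)
- Associate Researcher (2021)
- Young Talent, International High-End Business Talents of Chaoyang District, Beijing (2019)
- Third place, 5th Detection and Classification of Acoustic Scenes and Events Challenge (DCASE 2019, https://dcase.community/challenge2019/)
- Fujitsu R&D Center General Manager's Special Award: Award for Localized Promotion and Application of Information Processing Technology (2015)
- Fujitsu R&D Center General Manager's Special Award: Outstanding Team Contribution Award (2014)
- Nominee, Outstanding Doctoral Dissertation of the Harbin Institute of Technology (2014; 3/42, School of Computer Science)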
TEL: +86-13621160486
E-mails: shiziqiang7@gmail.com and shiziqiang@fujitsu.com.
Blog: http://blog.sciencenet.cn/u/Riemann7