Full Program

Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2022

The full conference program runs over three days, November 8-10, 2022, at the Empress Convention Center.

Session Room Chair
TuAM1-1 (SS13: Advanced Topics on Sound Event and Scene Analysis) Chiang Mai 1 Nobutaka Ono, Keisuke Imoto, Tatsuya Komatsu
Date Time Title Authors
8 November 2022 10.35-10.55 On Sorting and Padding Multiple Targets for Sound Event Localization and Detection With Permutation Invariant and Location-Based Training Robin Scheibler; Tatsuya Komatsu; Yusuke Fujita; Michael Hentschel
10.55-11.15 How Information on Acoustic Scenes and Sound Events Mutually Benefits Event Detection and Scene Classification Tasks Ami Igarashi; Keisuke Imoto; Yuka Komatsu; Shunsuke Tsubaki; Shuto Hario; Tatsuya Komatsu
11.15-11.35 Compressed Sensing of Sparse Spectrum Using Distributed Sound-To-Light Conversion Device Blinkies Satoshi Motoyama; Natsuki Ueno; Yuma Kinoshita; Nobutaka Ono
11.35-11.55 CochlScene: Acquisition of Acoustic Scene Data Using Crowdsourcing Il-Young Jeong; Jeongsoo Park
11.55-12.15 Vision Transformer Based Audio Classification Using Patch-Level Feature Fusion Juan Luo; Jielong Yang; Eng Siong Chng; Xionghu Zhong
12.15-12.35 Self-Consistency Training With Hierarchical Temporal Aggregation for Sound Event Detection Yunlong Li; Xiujuan Zhu; Mingyu Wang; Ying Hu
Session Room Chair
TuAM1-2 (Speech, Language, and Audio 1) Chiang Mai 2 Tomoki Toda
Date Time Title Authors
8 November 2022 10.35-10.55 Music Similarity Calculation of Individual Instrumental Sounds Using Metric Learning Yuka Hashizume; Li Li; Tomoki Toda
10.55-11.15 Investigation of Noise-Reverberation-Robustness of Modulation Spectral Features for Speech-Emotion Recognition Taiyang Guo; Sixia Li; Masashi Unoki; Shogo Okada
11.15-11.35 Combine Waveform and Spectral Methods for Single-Channel Speech Enhancement Miao Li; Hui Zhang; Xueliang Zhang
11.35-11.55 Perceptual Loss Function for Speech Enhancement Based on Generative Adversarial Learning Xin Bai; Xueliang Zhang; Hui Zhang; Haifeng Huang
11.55-12.15 Joint Speech Activity and Overlap Detection With Multi-Exit Architecture Ziqing Du; Kai Liu; Xucheng Wan; Huan Zhou
Session Room Chair
TuAM1-3 (Human Biometrics and Security Systems) Chiang Mai 3 Jessada Karnjana
Date Time Title Authors
8 November 2022 10.35-10.55 On Wrist Vein Recognition for Human Biometrics Felix Marattukalam; David Cole; Pranav Gulati; Waleed H. Abdulla
10.55-11.15 Continuous Authentication on Unconstrained Activities Using Window and Cycle Based Segmentation Lina Septiana; Narishige Abe; Tomoaki Matsunami; Hidetsugu Uchida; Kazuki Osamura; Shigefumi Yamada
11.15-11.35 Smoothed Teager Energy Cepstral Feature for Replay Attack Detection on Voice Assistants Madhu R Kamble; Anand Therattil; Hemant A. Patil; M. Ali Basha Shaik; Vikram Vij
11.35-11.55 Disentangled Speaker Representation Learning via Mutual Information Minimization Sung Hwan Mun; Min Hyun Han; Minchan Kim; Dongjune Lee; Nam Soo Kim
11.55-12.15 Contribution of Timbre and Shimmer Features to Deepfake Speech Detection Anuwat Chaiwongyen; Norranat Songsriboonsit; Suradej Duangpummet; Jessada Karnjana; Waree Kongprawechnon; Masashi Unoki
12.15-12.35 Combined 2D and 3D Convolution Residual Attention Network for Hand Gesture Recognition Chang-Ting Tsai; Jian-Jiun Ding
Session Room Chair
TuAM1-4 (Signal Image and Information Processing Theory and Methods) Board Room 2 Daranee Hormdee
Date Time Title Authors
8 November 2022 10.35-10.55 Investigate Bidirectional Functional Brain Networks Using Directed Information Qiang Li
10.55-11.15 Effective ASR Error Correction Leveraging Phonetic, Semantic Information and N-Best Hypotheses Hsin-Wei Wang; Bi-Cheng Yan; Yi-Cheng Wang; Berlin Chen
11.15-11.35 A Lossless Audio Codec Based on Hierarchical Residual Prediction Taiyo Mineo; Hayaru Shouno
11.35-11.55 Investigating Low-Distortion Speech Enhancement With Discrete Cosine Transform Features for Robust Speech Recognition Yu-Sheng Tsao; Jeih-weih Hung; Kuan-Hsun Ho; Berlin Chen
11.55-12.15 Consistent MDT-Tucker: A Hankel Structure Constrained Tucker Decomposition in Delay Embedded Space Ryuki Yamamoto; Hidekata Hontani; Akira Imakura; Tatsuya Yokota
12.15-12.35 Sound Reproduction With a Circular Loudspeaker Array Using Differential Beamforming Method Yankai Zhang; Jiayi Mao; Yefeng Cai; Chao Ye
Session Room Chair
TuAM1-5 (SS01: Reconfigurable Computing and Performance Evaluation) Board Room 3 Ukrit Mankong
Date Time Title Authors
8 November 2022 10.35-10.55 Design and System Implementation of a Configurable Optical Interconnection Network Bowen Yang; Junyong Deng; Jiaying Luo; Yu Feng
10.55-11.15 2S-AGCN Human Behavior Recognition Based on New Partition Strategy Jin Wu; Lei Wang; Gege Chong; Haoran Feng
11.15-11.35 Design of Optimal FIR Digital Filter by Swarm Optimization Technique Jin Wu; Yaqiong Gao; Ling Yang; Zhengdong Su
11.35-11.55 Design and Implementation of Reconfigurable Array Structure for Convolutional Neural Network Supporting Data Reuse Rui Shan; Ziqing Huo; Xiaoshuo Li; Huan Chang; Rui Qin
11.55-12.15 DBR: A Depth-Branch-Resorting Algorithm for Locality Exploration in Graph Processing Lin Jiang; Ru Feng; Junjie Wang; Junyong Deng
12.15-12.35 Performance Evaluation of Popularity-Aware Dynamic Clustering Scheme for Distributed Caching in ICN Mikiya Yoshida; Yusuke Ito; Yurino Sato; Hiroyuki Koga
Session Room Chair
TuAM1-6 (SS03: Security Techniques of Speaker Recognition) Chiang Mai 4 Xiao-Lei Zhang
Date Time Title Authors
8 November 2022 10.35-10.55 Masking Speech Feature to Detect Adversarial Examples for Speaker Verification Xing Chen; Jiadi Yao; Xiao-Lei Zhang
10.55-11.15 F0 Modification via PV-TSM Algorithm for Speaker Anonymization Across Gender Candy Olivia Mawalim; Shogo Okada; Masashi Unoki
11.15-11.35 Pay Attention to Hard Trials Lantian Li; Di Wang; Dong Wang
11.35-11.55 A Multi-Task Framework of Speaker Recognition With TTS Data Augmentation Xingjia Xie; Yiming Zhi; Beibei Ouyang; Qingyang Hong; Lin Li
11.55-12.15 Source Tracing: Detecting Voice Spoofing Tinglong Zhu; Xingming Wang; Xiaoyi Qin; Ming Li
12.15-12.35 Replay Attack Detection Based on Voice and Non-Voice Sections for Speaker Verification Ananda Garin Mills; Patthranit Kaewcharuay; Pannathorn Sathirasattayanon; Suradej Duangpummet; Kasorn Galajit; Jessada Karnjana; Pakinee Aimmanee
Session Room Chair
TuAM1-7 (Speech, Language, and Audio 2) Chiang Mai 5 Natthanan Promsuk
Date Time Title Authors
8 November 2022 10.35-10.55 Learning Emotion Information for Expressive Speech Synthesis Using Multi-Resolution Modulation-Filtered Cochleagram Kaili Zhang; Masashi Unoki
10.55-11.15 VocEmb4SVS: Improving Singing Voice Separation With Vocal Embeddings Chenyi Li; Yi Li; Xuhao Du; Yaolong Ju; Shichao Hu; Zhiyong Wu
11.15-11.35 Dialect-Aware Semi-Supervised Learning for End-To-End Multi-Dialect Speech Recognition Sayaka Shiota; Ryo Imaizumi; Ryo Masumura; Hitoshi Kiya
11.35-11.55 Design and Construction of Japanese Multimodal Utterance Corpus With Improved Emotion Balance and Naturalness Daisuke Horii; Akinori Ito; Takashi Nose
11.55-12.15 Non-Parallel Voice Conversion Based on Free-Energy Minimization of Speaker-Conditional Restricted Boltzmann Machine Takuya Kishida; Toru Nakashika
12.15-12.35 The TNT Team System Descriptions of Cantonese, Mongolian and Kazakh for IARPA OpenASR21 Challenge Kai Tang; Jing Zhao; Jinghao Yan; Jian Kang; Haoyu Wang; Jinpeng Li; Shuzhou Chai; Guan-Bo Wang; Shen Huang; Guoguo Chen; Pengfei Hu; Wei-Qiang Zhang
Session Room Chair
TuAM1-8 (SS10: Real-world sensing technologies of human function) Board Room 4 Yumie Ono/Toshihisa Tanaka
Date Time Title Authors
8 November 2022 10.35-10.55 Evaluation of Cognitive Test Results Using Concentration Estimation From Facial Videos Terumi Umematsu; Masanori Tsujikawa; Hideyuki Sawada
10.55-11.15 Clustering of Advertising Images Using Electroencephalogram Ingon Chanpornpakdi; Motoi Noda; Toshihisa Tanaka; Yuval Harpaz; Amir B. Geva
11.15-11.35 Evaluation of Influence of Positions and Numbers of EEG Electrodes on Quantification of Independent Component Matrix Ingon Chanpornpakdi; Ryohei Mizuochi; Maro G Machizawa
11.35-11.55 Wearable Microfluidic Biosensor for Real-Time Sweat Content Monitoring Hiroyuki Kudo; Yuto Goto
11.55-12.15 Ear-EEG Based Eye State Classification Using Convolutional Neural Network Chang-Hee Han; Han-Jeong Hwang
12.15-12.35 Development of Virtual-Reality-Based Exergame for Lower-Extremity Rehabilitation of Stroke Patients Mamiko Sasakawa; Daigo Ito; Ryo Ogura; Takanori Tominaga; Yumie Ono
Session Room Chair
TuPM1-1 (Speech, Language, and Audio 1) Chiang Mai 1 Rohan Kumar Das
Date Time Title Authors
8 November 2022 15.20-15.40 Is Your Baby Fine at Home? Baby Cry Sound Detection in Domestic Environments Tanmay Khandelwal; Rohan Kumar Das; Eng-Siong Chng
15.40-16.00 Acoustic Echo and Noise Canceller Using Shared-Error Normalized Least Mean Square Algorithm Kenta Iwai; Takanobu Nishiura
16.00-16.20 Subband-Based Spectrogram Fusion for Speech Enhancement by Combining Mapping and Masking Approaches Hao Shi; Longbiao Wang; Sheng Li; Jianwu Dang; Tatsuya Kawahara
16.20-16.40 Neural Virtual Microphone Estimator: Application to Multi-Talker Reverberant Mixtures Hanako Segawa; Tsubasa Ochiai; Marc Delcroix; Tomohiro Nakatani; Rintaro Ikeshita; Shoko Araki; Takeshi Yamada; Shoji Makino
16.40-17.00 SE-Mixer: Towards an Efficient Attention-Free Neural Network for Speech Enhancement Kai Wang; Bengbeng He; Wei-Ping Zhu
17.00-17.20 How Should We Evaluate Synthesized Environmental Sounds Yuki Okamoto; Keisuke Imoto; Shinnosuke Takamichi; Takahiro Fukumori; Yoichi Yamashita
17.20-17.40 FeatureCut: An Adaptive Data Augmentation for Automated Audio Captioning Zhongjie Ye; Yuqing Wang; Helin Wang; Dongchao Yang; Yuexian Zou
Session Room Chair
TuPM1-2 (Signal Processing Systems: Design and Implementation) Chiang Mai 2 Kasemsit Teeyapan
Date Time Title Authors
8 November 2022 15.20-15.40 Robust Steerable Differential Beamformer for Concentric Circular Array With Directional Microphones Weilong Huang; Jinwei Feng
15.40-16.00 A Deep Proximal-Unfolding Method for Monaural Speech Dereverberation Meihuang Wang; Minmin Yuan; Andong Li; Chengshi Zheng; Xiaodong Li
16.00-16.20 Speech Enhancement Using Self-Supervised Pre-Trained Model and Vector Quantization Xiao-Ying Zhao; Qiu-Shi Zhu; Jie Zhang
16.20-16.40 HouseX: A Fine-Grained House Music Dataset and Its Potential in the Music Industry Xinyu Li
16.40-17.00 Interpretable Control for Emotional Text-To-Speech System Toward Development of Sympathetic Educational-Support Robots Jingyi Feng; Tomohiro Yoshikawa; Tomoki Toda
17.00-17.20 Direction-Aware Target Speaker Extraction With a Dual-Channel System Based on Conditional Variational Autoencoders Under Underdetermined Conditions Rui Wang; Li Li; Tomoki Toda
17.20-17.40 LCN: Label Correction Based on Network Prediction for Cross-Modal Retrieval With Noisy Labels Daiki Okamura; Ryosuke Harakawa; Masahiro Iwahashi
Session Room Chair
TuPM1-3 (Signal Image and Information Processing Theory and Methods) Chiang Mai 3 Tatsuya Yokota
Date Time Title Authors
8 November 2022 15.20-15.40 Using Self-Learning Representations for Objective Assessment of Patient Voice in Dysphonia Shaoxiang Dang; Tetsuya Matsumoto; Yoshinori Takeuchi; Hiroaki Kudo; Takashi Tsuboi; Yasuhiro Tanaka; Masahisa Katsuno
15.40-16.00 Fast Signal Completion Algorithm With Cyclic Convolutional Smoothing Hiromu Takayama; Tatsuya Yokota
16.00-16.20 Single-Channel Speech Enhancement Student Under Multi-Channel Speech Enhancement Teacher Yuzhu Zhang; Hui Zhang; Xueliang Zhang
16.20-16.40 Distance-Based Dynamic Weight: A Novel Framework for Multi-Source Information Fusion Cuiping Cheng; Xiaoning Zhang; Taihao Li
16.40-17.00 Improvement of the Direction-Of-Arrival Estimation Method Using a Single Channel Microphone by Correcting a Spectral Slope of Speech Masaki Ikeuchi; Hiroki Tanji; Takahiro Murakami
17.00-17.20 Studying Human-Based Speaker Diarization and Comparing to State-Of-The-Art Systems Simon W. McKnight; Aidan O. T. Hogg; Vincent W. Neo; Patrick A. Naylor
17.20-17.40 Optimization of CU Partition Based on Texture Degree in H.266/VVC Jingyuan Tang; Songlin Sun
Session Room Chair
TuPM1-4 (SS02: Deep Learning Systems and Applications for Cloud, Fog, and Edge) Board Room 2 Jia-Ching Wang
Date Time Title Authors
8 November 2022 15.20-15.40 Selection of Supplementary Acoustic Data for Meta-Learning in Under-Resourced Speech Recognition I-Ting Hsieh; Chung-Hsien Wu; Zhe-Hong Zhao
15.40-16.00 Using Prosodic Phrase-Based VQVAE on Audio ALBERT for Speech Emotion Recognition Jia-Hao Hsu; Chung-Hsien Wu; Tsung-Hsien Yang
16.00-16.20 ESPnet-ONNX: Bridging a Gap Between Research and Production Masao Someki; Yosuke Higuchi; Tomoki Hayashi; Shinji Watanabe
16.20-16.40 Multi-Loss Function in Robust Convolutional Autoencoder for Reconstruction Low-Quality Fingerprint Image Farchan Hakim Raswa; Franki Halberd; Agus Harjoko; Wahyono; Chung-Ting Lee; Yung-Hui Li; Jia-Ching Wang
Session Room Chair
TuPM1-5 (Research Review) Board Room 3 Jesin James
Date Time Title Authors
8 November 2022 15.20-15.40 EmotionGUI: Visualisation and Annotation of Emotions in a 2D Space for Multi-Modal Signals Jesin James; Felix Marattukalam; Owen Eng; Aron Jeremiah
15.40-16.00 Enhancing the Performance of Automatic Speech Recognition With Optical Microphone Technology Through Data Augmentation Approach: A Pilot Study Ruei-Ci Shen; Ji-Yan Han; Ying-Hui Lai
16.00-16.20 Process Monitoring Based on Nearest Correlation and Variational Graph Auto-Encoder and Its Application to Tennessee Eastman Process Yoshiaki Uchida; Koichi Fujiwara
16.20-16.40 Decoding of Individual Emotions Induced During Interaction With Voice-User Interface Using Electroencephalography Jun-Seok Lee; Ga-Young Choi; Ji-Yoon Lee; Jong-Gyu Shin; Sang-Ho Kim; Han-Jeong Hwang
16.40-17.00 Leverage Limited Features of Partial Fingerprint Recognition Using Improved Siamese Network With Self-Spatial Attention Farchan Hakim Raswa; Franki Halberd; Agus Harjoko; Chung-Ting Lee; Yung-Hui Li; Pao-Chi Chang; Jia-Ching Wang
17.00-17.20 Design and Signal Analysis of a Compact Antenna for UWB MIMO Systems Long Jin; Yangmiao Lin; Iickho Song; Ruohan Zhang
17.20-17.40 A Filtered-x Active Noise Control Algorithm Robust to Impulsive Noise Using Novel Subband Adaptive Filter Algorithm Chan Park; Minho Lee; PooGyeon Park
Session Room Chair
TuPM1-6 (Speech, Language, and Audio 2) Chiang Mai 4 Christian H Ritz
Date Time Title Authors
8 November 2022 15.20-15.40 Neural Conversational Speech Synthesis With Flexible Control of Emotion Dimensions Hiroki Mori; Hironao Nishino
15.40-16.00 Temporal Feedback Convolutional Recurrent Neural Networks for Speech Command Recognition Taejun Kim; Juhan Nam
16.20-16.40 Impact of Compression on the Performance of the Room Impulse Response Interpolation Approach to Spatial Audio Synthesis Hualin Ren; Christian Ritz; Jiahong Zhao; Daeyoung Jang
16.40-17.00 Machine Anomalous Sound Detection Based on Self-Supervised Classification Shuxian Wang; Jun Du; Yajian Wang
17.00-17.20 A Study on Low-Latency Recognition-Synthesis-Based Any-To-One Voice Conversion Yi-Yang Ding; Li-Juan Liu; Yu Hu; Zhen-Hua Ling
17.20-17.40 Speech Enhancement With Perceptually-Motivated Optimization and Dual Transformations Xucheng Wan; Kai Liu; Ziqing Du; Huan Zhou
Session Room Chair
TuPM1-7 (SS12: Advanced signal detection and inspection technology) Chiang Mai 5 Settha Tangkawanit
Date Time Title Authors
8 November 2022 15.20-15.40 Automatic Sound Detection and Notification System Using MFCC Jaruwat Patmanee; Prapatson Kotipang; Pawarisorn Sinpeang; Surachet Kanprachar; Settha Tangkawanit
15.40-16.00 Sound Identification Using MFCC With Machine Learning Pattarapong Kammee; Chairat Pinthong; Surachet Kanprachar; Settha Tangkawanit
16.20-16.40 Direct-Lattice Adaptive Notch Filter for Frequency Estimation and Tracking Prayuth Inban; Rachu Punchalard; Chawalit Benjangkaprasert
16.40-17.00 Distance Estimation Between Camera and Vehicles From an Image Using YOLO and Machine Learning Rattapoom Waranusast; Panomkhawn Riyamongkol; Pattanawadee Pattanathaburt
17.00-17.20 OCR Application for Cancer Care Settha Tangkawanit; Jiraporn Pooksook; Jirarat Ieamsaard; Panupong Sornkhom
17.20-17.40 The Development of Mobile Application for Assisting COVID-19 Antigen Test Kit Results Reading Rattapoom Waranusast; Pattanawadee Pattanathaburt
17.40-18.00 Matched Filter Detector for Textile Fiber Classification of Signals With Near-Infrared Spectrum Suchart Yammen; Wachira Limsripraphan
Session Room Chair
WedAM1-1 (SS11: Transfer Learning for Real World) Chiang Mai 1 Xiaoxu Li/Dome Potikanond
Date Time Title Authors
9 November 2022 9.00-9.20 Semantics-Guided Knowledge Integration for Domain Adaptation Few-Shot Relation Extraction Zeyuan Wang; Yifan Du; Guangwei Zhang; Ruifan Li; Yongping Xiong; Chuang Zhang
9.20-9.40 PVGCRA: Prediction Variance Guided Cross Region Domain Adaptation Ran Xu; Yixiang Huang; Chuang Zhang
9.40-10.00 Multi-Branch Network for Few-Shot Learning Kai Ren; Zijie Guo; Zhimin Zhang; Rui Zhu; Xiaoxu Li
10.00-10.20 Few-Shot Classification With Feature Reconstruction Bias Zhen Li; Lang Wang; Shuo Ding; Xiaochen Yang; Xiaoxu Li
10.20-10.40 Dual Prototypical Network for Robust Few-Shot Image Classification Qi Song; Zebin Peng; Luchen Ji; Xiaochen Yang; Xiaoxu Li
10.40-11.00 Graph Evolving and Embedding in Transformer Jen-Tzung Chien; Chia-Wei Tsao
Session Room Chair
WedAM1-2 (Speech, Language, and Audio 1) Chiang Mai 2 Xiaofen Xing
Date Time Title Authors
9 November 2022 9.00-9.20 Punctuation Restoration for Singaporean Spoken Languages: English, Malay, and Mandarin Abhinav Rao; Ho Thi-Nga; Chng Eng Siong
9.20-9.40 C-CycleTransGAN: A Non-Parallel Controllable Cross-Gender Voice Conversion Model With CycleGAN and Transformer Changzeng Fu; Chaoran Liu; Carlos Toshinori Ishi; Hiroshi Ishiguro
9.40-10.00 The Realization and Perception of Narrow Focus in English Sentences by Cantonese EFL Learners Chong Cao; Aijun Li
10.00-10.20 Cross-Lingual Dysarthria Severity Classification for English, Korean, and Tamil Eun Jung Yeo; Kwanghee Choi; Sunhee Kim; Minhwa Chung
10.20-10.40 3M: An Effective Multi-View, Multi-Granularity, and Multi-Aspect Modeling Approach to English Pronunciation Assessment Fu-An Chao; Tien-Hong Lo; Tzu-I Wu; Yao-Ting Sung; Berlin Chen
10.40-11.00 I Feel Stressed Out: A Mandarin Speech Stress Dataset With New Paradigm Shuaiqi Chen; Xiaofen Xing; Guodong Liang; Xiangmin Xu
Session Room Chair
WedAM1-3 (Deep Learning: Algorithm, Implementations, and Applications) Chiang Mai 3 Hiroyoshi Ito
Date Time Title Authors
9 November 2022 9.00-9.20 End-To-End Reinforcement Learning of Robotic Manipulation With Robust Keypoints Representation Tianying Wang; En Yen Puang; Marcus Lee; Wei Jing; Yan Wu
9.20-9.40 BEAM - an Algorithm for Detecting Phishing Link Sea Ran Cleon Liew; Ngai Fong Law
9.40-10.00 I2CR: Improving Noise Robustness on Keyword Spotting Using Inter-Intra Contrastive Regularization Dianwen Ng; Jia Qi Yip; Tanmay Surana; Zhao Yang; Chong Zhang; Yukun Ma; Chongjia Ni; Eng Siong Chng; Bin Ma
10.00-10.20 Human-In-The-Loop Chord Progression Generator With Generative Adversarial Network Yoshiteru Matsumoto; Hiroyoshi Ito; Hiroko Terasawa; Yuya Yamamoto; Yuzuru Hiraga; Masaki Matsubara
10.20-10.40 A Resource-Limited FPGA-Based MobileNetV3 Accelerator Yutana Jewajinda; Thanapol Thongkum
10.40-11.00 CG-Net: A Compound Gaussian Prior Based Unrolled Imaging Network Carter A Lyons; Raghu G. Raj; Margaret Cheney
Session Room Chair
WedAM1-4 (Signal Image and Information Processing Theory and Methods) Board Room 2 Mingyi He
Date Time Title Authors
9 November 2022 9.00-9.20 A Policy-Based Approach to the SpecAugment Method for Low Resource E2E ASR Rui Li; Guodong Ma; Dexin Zhao; Ranran Zeng; Xiaoyu Li; Hao Huang
9.20-9.40 Manifold Rewiring for Unlabeled Imaging Valentin Debarnot; Vinith Kishore; Cheng Shi; Ivan Dokmanic
9.40-10.00 CRDet: An Object-Context-Aware Detection Network for Oriented Object in Aerial Images Lele Liang; Linghan Li; Qi Liu; Yuchao Dai; Mingyi He
10.00-10.20 Effects of Incorporating a Deep-Unfolding Framework Into a Deep Neural Network: Implications for Image Restoration Tatsuki Itasaka; Masahiro Okuda
10.20-10.40 Cross-Modal Knowledge Distillation With Dropout-Based Confidence Won Ik Cho; Jeunghun Kim; Nam Soo Kim
10.40-11.00 A Multi-Objective Perceptual Aware Loss Function for End-To-End Target Speaker Separation Zhan Jin; Bang Zeng; Fan Zhang
Session Room Chair
WedAM1-5 (Research Review) Board Room 3 Ying-Hui Lai
Date Time Title Authors
9 November 2022 9.00-9.20 EEG-Based Anomaly Detection Model by One-Class Support Vector Machine for Dream Enactment Behavior in REM Sleep Behavior Disorder Shumpei Date; Koichi Fujiwara; Yukiyoshi Sumi; Hiroshi Kadotani; Makoto Imai; Keiko Ogawa
9.20-9.40 Development of Heat Stroke Detection Model Based on Heart Rate Variability Using LSTM-AutoEncoder Shota Saeda; Koshi Ota; Koichi Fujiwara; Takatomi Kubo; Toshitaka Yamakawa; Aozora Yamamoto; Yuki Maruno; Manabu Kano
9.40-10.00 Driving Fitness Evaluation Model for Patients With Schizophrenia Based on Driving Data of Healthy Participants and Random Forest Shuji Tsunoda; Koichi Fujiwara; Seiko Miyata; Akiko Yamaguchi; Shogo Kitagawa; Yuki Konishi; Reiji Yoshimura; Isao Taguchi; Yutaka Sawa; Kunihiro Iwamoto; Norio Ozaki
10.00-10.20 Method for Estimating Test Contrast Peak Time in Computed Tomography Angiography Toshihide Otsuki; Kazuto Sakamoto; Homare Saisho; Hiroyoshi Yokoi; Toshitaka Yamakawa
10.20-10.40 Development of an Epileptic Seizure Prediction Algorithm Based on R-R Intervals With Temporal Convolutional Networks Rikumo Ode; Koichi Fujiwara; Miho Miyajima; Toshitaka Yamakawa; Manabu Kano; Taketoshi Maehara
Session Room Chair
WedAM1-6 (SS17: Emerging Diseases and Smart Image Processing) Chiang Mai 4 Krisana Chinnasarn
Date Time Title Authors
9 November 2022 9.00-9.20 Pre-Processing SARS-CoV-2 Sequence Data for Application of Machine Learning Techniques for Visualization and Clustering of Virus Characteristics Juhyeon Kim; Insung Ahn
9.20-9.40 Educational Multi-Purpose Kit for Coding and Robotic Design Atikhun Thongpool; Daranee Hormdee; Raksit Chutipakdeevong; Wasan Tansakul
9.40-10.00 Forecasting Dengue Fever in France and Thailand Using XGBoost Thanin Methiyothin; Insung Ahn
10.00-10.20 Fine-Tuning BERT for Question and Answering Using PubMed Abstract Dataset Saeyeon Cheon; Insung Ahn
10.20-10.40 Coarse X-Ray Lumbar Vertebrae Pose Localization Using Triangulation Correspondence Watcharaphong Yookwan; Jiranun Sangrueng; Krisana Chinnasarn
10.40-11.00 4G Signal RSSI Recommendation System for ISP Quality of Service Improvement Tanatpon Duangta; Watcharaphong Yookwan; Krisana Chinnasarn; Anuparp Boonsongsrikul
Session Room Chair
WedAM1-7 (Speech, Language, and Audio 2) Chiang Mai 5 Wei-Ping Zhu
Date Time Title Authors
9 November 2022 9.00-9.20 SE-DPTUNet: Dual-Path Transformer Based U-Net for Speech Enhancement Bengbeng He; Kai Wang; Wei-Ping Zhu
9.20-9.40 Encoder Re-Training With Mixture Signals on FastMVAE Method Shuhei Yamaji; Taishi Nakashima; Nobutaka Ono; Li Li; Hirokazu Kameoka
9.40-10.00 Unsupervised Disentanglement of Timbral, Pitch, and Variation Features From Musical Instrument Sounds With Random Perturbation Keitaro Tanaka; Yoshiaki Bando; Kazuyoshi Yoshii; Shigeo Morishima
10.00-10.20 Estimation of Transfer Coefficients and Signals of Sound-To-Light Conversion Device Blinky Under Saturation Kosuke Nishida; Natsuki Ueno; Yuma Kinoshita; Nobutaka Ono
10.20-10.40 Design and Evaluation of Instrument Sound Identification Difficulty for the Deaf and Hard-Of-Hearing Shiho Akaki; Rumi Hiraga; Keiichi Yasu; Keiji Tabuchi; Hiroko Terasawa
10.40-11.00 Correcting, Rescoring and Matching: An N-Best List Selection Framework for Speech Recognition Chin-Hung Kuo; Kuan-Yu Chen
Session Room Chair
WedAM1-8 (SS04: Advanced Signal Processing and Machine Learning for Audio and Speech Applications) Board Room 4 Shoji Makino
Date Time Title Authors
9 November 2022 9.00-9.20 Hyperbolic Timbre Embedding for Musical Instrument Sound Synthesis Based on Variational Autoencoders Futa Nakashima; Tomohiko Nakamura; Norihiro Takamune; Satoru Fukayama; Hiroshi Saruwatari
9.20-9.40 Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-To-Speech Yusuke Nakai; Yuki Saito; Kenta Udagawa; Hiroshi Saruwatari
9.40-10.00 Inverse-Free Online Independent Vector Analysis With Flexible Iterative Source Steering Taishi Nakashima; Nobutaka Ono
10.00-10.20 Accelerating Online Algorithm Using Geometrically Constrained Independent Vector Analysis With Iterative Source Steering Kana Goto; Tetsuya Ueda; Li Li; Takeshi Yamada; Shoji Makino
10.20-10.40 A Dilated Inception Convolutional Neural Network for Gridless DOA Estimation Under Low SNR Scenarios Zhi-Wei Tan; Yuan Liu; Andy W. H. Khong
10.40-11.00 Efficient Low-Latency Convolution With Uniform Filter Partition and Its Evaluation on Real-Time Blind Source Separation Yui Kuriki; Taishi Nakashima; Kouei Yamaoka; Natsuki Ueno; Yukoh Wakabayashi; Nobutaka Ono; Ryo Sato
Session Room Chair
WedPM1-1 (SS05: Advanced Image and Video Processing using Deep Learning) Chiang Mai 1 Chul Lee
Date Time Title Authors
9 November 2022 14.00-14.20 Object Segmentation Using Parametric Representation Hochang Rhee; Hyung Il Koo; Nam Ik Cho
14.20-14.40 Deep Color Constancy Using Multi-Band NIR Jeong-Won Ha; Dong-keun Han; Min-Je Park; Jong-Ok Kim
14.40-15.00 Smooth Panoramic Walkthrough for Adjacent Panoramic Viewpoints With Dense Spherical Matching Points Kyungjune Lee; Mingyu Jang; Sanghoon Lee; Kim Taewan
15.00-15.20 Region Adaptive Self-Attention for an Accurate Facial Emotion Recognition Seongmin Lee; Jeonghaeng Lee; Minsik Kim; Sanghoon Lee
15.20-15.40 Quality Enhancement of Screen Content Video Using Dual-Input CNN Ziyin Huang; Yue Cao; Sik-Ho Tsang; Yui-Lam Chan; Kin-Man Lam
15.40-16.00 Underwater Image Enhancement Using Realistic Dataset With Turbidity and Color Distortion Eunpil Park; Eunsung Jo; Jae-Young Sim
Session Room Chair
WedPM1-2 (Speech, Language, and Audio 1) Chiang Mai 2 Ashish Panda
Date Time Title Authors
9 November 2022 14.00-14.20 Neural Vocoder Feature Estimation for Dry Singing Voice Separation Jaekwon Im; Soonbeom Choi; Sangeon Yong; Juhan Nam
14.20-14.40 Adapting GCC-PHAT to Co-Prime Circular Microphone Arrays for Speech Direction of Arrival Estimation Using Neural Networks Jiahong Zhao; Christian Ritz
14.40-15.00 A Novel Approach to Structured Pruning of Neural Network for Designing Compact Audio-Visual Wake Word Spotting System Haotian Wang; Jun Du; Hengshun Zhou; Heng Lu; Yuhang Cao
15.00-15.20 Hierarchic Temporal Convolutional Network With Attention Fusion for Target Speaker Extraction Zihao Chen; Wenbo Qiu; Haitao Xu; Ying Hu
15.20-15.40 Acoustic Model Adaption Using x-Vectors for Improved Automatic Speech Recognition Meet Soni; Aditya Raikar; Ashish Panda; Sunil Kumar Kopparapu
15.40-16.00 Acoustic Pornography Recognition Using Convolutional Neural Networks and Bag of Refinements Lifeng Zhou; Kaifeng Wei; Yuke Li; Yiya Hao; Weiqiang Yang; Haoqi Zhu
Session Room Chair
WedPM1-3 (Deep Learning: Algorithm, Implementations, and Applications) Chiang Mai 3 Jen-Tzung Chien
Date Time Title Authors
9 November 2022 14.00-14.20 An Optimal Vehicle Counting Framework for Non-Canonical CCTV Placements Ng Chin Hooi; Edwin Tan Chee Pin; Chiew Yeong Shiong; Lim Mei Kuan
14.20-14.40 Response Sentence Modification Using a Sentence Vector for a Flexible Response Generation of Retrieval-Based Dialogue Systems Ryota Yahagi; Akinori Ito; Takashi Nose; Yuya Chiba
14.40-15.00 End-To-End Stereo Audio Coding Using Deep Neural Networks Wootaek Lim; Inseon Jang; Seungkwon Beack; Jongmo Sung; Taejin Lee
15.00-15.20 Neural Beamformer With Automatic Detection of Notable Sounds for Acoustic Scene Classification Sota Ichikawa; Takeshi Yamada; Shoji Makino
15.20-15.40 DNN-Based Frequency-Domain Permutation Solver for Multichannel Audio Source Separation Fumiya Hasuike; Daichi Kitamura; Rui Watanabe
15.40-16.00 Detection Method From 4K Images Using SSD300 Without Retraining Kei Irie; Kiyoshi Nishikawa
Session Room Chair
WedPM1-4 (Signal Image and Information Processing Theory and Methods) Board Room 2 Zhang Ke
Date Time Title Authors
9 November 2022 14.00-14.20 PAformer: Visually Indistinguishable Bolt Defect Recognition Based on Bolt Position and Attributes Wenshuo Lou; Ke Zhang; Yangjie Xiao; Xiwang Guo; Jiacun Wang
14.20-14.40 Adapted Spectrogram Transformer for Unsupervised Cross-Domain Acoustic Anomaly Detection Gilles Van De Vyver; Zhaoyi Liu; Koustabh Dolui; Danny Hughes; Sam Michiels
14.40-15.00 A Two-Stage Cascading Method Based on Finetuning in Semi-Supervised Domain Adaptation Semantic Segmentation Huiying Chang; Kaixin Chen; Ming Wu
15.00-15.20 Landmark Management in the Application of Radar SLAM Shuai Sun; Beth Jelfs; Kamran Ghorbani; Glenn I. Matthews; Chris Gilliam
15.20-15.40 Parameterization of Dominant Spectral Peak Trajectory for Whisper Speech Recognition Chang Feng; Xiaolong Wu; Mingxing Xu; Thomas Fang Zheng
15.40-16.00 Specific Emitter Identification at Different Time Based on Multi-Domain Migration Jiaxu Liu; Jianqing Li; Jiao Wang; Hao Huang
Session Room Chair
WedPM1-5 (Research Review) Board Room 3 Koichi Fujiwara
Date Time Title Authors
9 November 2022 14.00-14.20 Long-Term Prognostic Prediction of West Syndrome Based on Scalp EEG Using Convolution Neural Network Autoencoder Tatsuki Saito; Koichi Fujiwara; Jun Natsume; Ryosuke Suzui
14.20-14.40 Modification of RRI Data by NBEATS Model Hongtao Chen; Koichi Fujiwara; Manabu Kano
14.40-15.00 Transformer With Noise Divider Mun-Hyung Lee; Seon-Woo Lee; Jung-Mu Choi; Jang-Woo Kwon
15.00-15.20 Schizophrenia Classification Based on the Natural Language Processing Technology: A Pilot Study Ying Hsuan Chen; Pei-Yun Lin; Tsung-Tse Ho; Yuh-Jer Chang; Ying-Hui Lai
15.20-15.40 Signed Graph Balancing Based on Spectral Clustering Haruki Yokota; Junya Hara; Yuichi Tanaka
15.40-16.00 Graph Signal Sampling for Multiple Generator Functions Junya Hara; Yuichi Tanaka
Session Room Chair
WedPM1-6 (Signal Processing for Audio and Speech Applications) Chiang Mai 4 Tomoyosi Akiba
Date Time Title Authors
9 November 2022 14.00-14.20 Semi-Supervised ASR Based on Iterative Joint Training With Discrete Speech Synthesis Keiya Takagi; Tomoyosi Akiba; Hajime Tsukada
14.20-14.40 Analysis of Amplitude and Frequency Perturbation in the Voice for Fake Audio Detection Kai Li; Yao Wang; Minh Le Nguyen; Masato Akagi; Masashi Unoki
14.40-15.00 Deep Hashing for Speaker Identification and Retrieval Based on Auditory Sparse Representation Dung Kim Tran; Masato Akagi; Masashi Unoki
15.00-15.20 Divide and Conquer: A Low-Complexity Neural Network for Monophonic Speech Enhancement Bingxiao Fang; Liang Liu
15.20-15.40 Domain Adaptation and Language Conditioning to Improve Phonetic Posteriorgram Based Cross-Lingual Voice Conversion Pin-Chieh Hsu; Nobuaki Minematsu; Daisuke Saito
15.40-16.00 Von Mises Mixture Model-Based DNN for Sign Indetermination Problem in Phase Reconstruction Nguyen Binh Thien; Yukoh Wakabayashi; Geng Yuting; Kenta Iwai; Takanobu Nishiura
Session Room Chair
WedPM1-7 (Speech, Language, and Audio 2) Chiang Mai 5 Daranee Hormdee
Date Time Title Authors
9 November 2022 14.00-14.20 Speaker Representation Learning via Contrastive Loss With Maximal Speaker Separability Zhe Li; Man Wai Mak
14.20-14.40 Design of Discriminators in GAN-Based Unsupervised Learning of Neural Post-Processors for Suppressing Localized Spectral Distortion Riku Ogino; Kohei Saijo; Tetsuji Ogawa
14.40-15.00 Simultaneous Frequency Estimation for Three or More Sinusoids Based on Sinusoidal Constraint Differential Equation Kenta Yamada; Yoshiki Masuyama; Yukoh Wakabayashi; Nobutaka Ono
15.00-15.20 Do You Know How Humans Sound? Exploring a Qualification Test Design for Crowdsourced Evaluation of Voice Synthesis Quality Moe Yaegashi; Susumu Saito; Teppei Nakano; Tetsuji Ogawa
15.20-15.40 Exploring the Gender Difference on Mandarin Tone Realization in Lombard Speech Weizhong Zhang; Jian Gong; Kai Sheng; Yuhong Sun; William Bellamy; Xiaoli Ji
Session Room Chair
WedPM1-8 (Data Analytics and Machine Learning) Board Room 4 Chern Hong Lim
Date Time Title Authors
9 November 2022 14.00-14.20 Improving Co-SVD for Cold-Start Recommendations Using Sparsity Reduction Low Jia Ming; Chern Hong Lim; Ian K. T. Tan
14.20-14.40 Epoch-Wise Double Descent Triggered by Learning a Single Sample Aoshi Kawaguchi; Hiroshi Kera; Toshihiko Yamasaki
14.40-15.00 Current Source Localization Using Deep Prior With Depth Weighting Hajime Yano; Rio Yamana; Ryoichi Takashima; Tetsuya Takiguchi; Seiji Nakagawa
15.00-15.20 A Proposal for Emotion-Expressive Editor: EmoEditor by Font Changing Yuki Shimamura; Michiharu Niimi
15.20-15.40 Traceback Memory Reduction for Three-Sequence Alignment Algorithm With Affine Gap Models Rui-Ting Chien; Mao-Jan Lin; Yang-Ming Yeh; Yi-Chang Lu
15.40-16.00 Acceleration of Subspace Learning Machine via Particle Swarm Optimization and Parallel Processing Hongyu Fu; Yijing Yang; Yuhuai Liu; Joseph Lin; Ethan Harrison; Vinod K. Mishra; C.-C. Jay Kuo
Session Room Chair
WedPM2-1 (SS05: Advanced Image and Video Processing using Deep Learning) Chiang Mai 1 Chul Lee
Date Time Title Authors
9 November 2022 16.20-16.40 Enhanced Bidirectional Motion Estimation Using Feature Refinement for HDR Imaging An Gia Vien; Truong Thanh Nhat Mai; Seonghyun Park; Gahyeon Kim; Chul Lee
16.40-17.00 Fast Asymmetric Bilateral Motion Estimation for Video Frame Interpolation Jintae Kim; Junheum Park; Chang-Su Kim
17.00-17.20 Future Object Localization in Autonomous Driving Using Ego-Centric Images and Motions Seoyoung Jo; Jung-Kyung Lee; Je-won Kang
17.20-17.40 Restoration of High-Frequency Components in Under Display Camera Images Youngjin Oh; Gu Yong Park; Nam Ik Cho
17.40-18.00 Non-Intrusive Speech Intelligibility Estimation Using Deep Learning With Speech Enhancement and Convolutional Layers Kazushi Nakazawa; Kazuhiro Kondo
18.00-18.20 Unified Angle Adjustment Network for Image Composition Enhancement Jinwon Ko; Nyeong-Ho Shin; Seonho Lee; Chang-Su Kim
Session Room Chair
WedPM2-2 (Speech, Language, and Audio 1) Chiang Mai 2 Kasemsit Teeyapan
Date Time Title Authors
9 November 2022 16.20-16.40 Automated Audio Captioning With Epochal Difficult Captions for Curriculum Learning Andrew Koh; Soham Tiwari; Chng Eng Siong
16.40-17.00 Application of Deep Learning-Based Single-Channel Speech Enhancement for Frequency-Modulation Transmitted Speech Ying Ma; Xueliang Zhang
17.00-17.20 An Empirical Study of Training Mixture Generation Strategies on Speech Separation: Dynamic Mixing and Augmentation Shukjae Choi; Younglo Lee; Jihwan Park; Hyung Yong Kim; Byeong-Yeol Kim; Zhong-Qiu Wang; Shinji Watanabe
17.20-17.40 Speech Intelligibility Prediction for Hearing Aids Using an Auditory Model and Acoustic Parameters Benita Angela Titalim; Candy Olivia Mawalim; Shogo Okada; Masashi Unoki
17.40-18.00 Predicting Speech Fluency in Children Using Automatic Acoustic Features Lionel Fontan; Shinyoung Kim; Verdiana De Fino; Sylvain Detey
18.00-18.20 TC-SKNet With GridMask for Low-Complexity Classification of Acoustic Scene Luyuan Xie; Yan Zhong; Lin Yang; Zhaoyu Yan; Zhonghai Wu; Junjie Wang
Session Room Chair
WedPM2-3 (Deep Learning: Algorithm, Implementations, and Applications) Chiang Mai 3 Masaomi Kimura
Date Time Title Authors
9 November 2022 16.20-16.40 Design and Control of a Muscle-Skeleton Robot Elbow Based on Reinforcement Learning Jianyin Fan; Haoran Xu; Yuwei Du; Jing Jin; Qiang Wang
16.40-17.00 Non-Autoregressive Speech Recognition With Error Correction Module Yukun Qian; Xuyi Zhuang; Zehua Zhang; Lianyu Zhou; Xu Lin; Mingjiang Wan
17.00-17.20 A Method for Adversarial Example Generation by Perturbing Selected Pixels KAMEGAWA Tomoki; KIMURA Masaomi
17.20-17.40 A Title Generation Method With Transformer for Journal Articles MATSUMOTO Riku; KIMURA Masaomi
17.40-18.00 Catastrophic Forgetting Avoidance Method for a Classification Model by Model Synthesis and Introduction of Background Data HIRAYAMA Akari; KIMURA Masaomi
18.00-18.20 Consistency Regularization for GAN-Based Neural Vocoders Kotaro Onishi; Toru Nakashika
18.20-18.40 Parallel Training of TN and ITN Models Through CycleGAN for Improved Sequence to Sequence Learning Performance Md. Mizanur Rahaman Nayan; Mohammad Ariful Haque
Session Room Chair
WedPM2-4 (SS14: Emerging Signal Processing Technology for Medical Applications / Biomedical Signal Processing and Systems) Board Room 2 Yuttapong Jiraraksopakun
Date Time Title Authors
9 November 2022 16.20-16.40 Laparoscope Manipulating Robot (LMR) Navigation Using Deep Learning-Based Surgical Instruments Detection Nyi Nyi Myo; Apiwat Boonkong; Daranee Hormdee; Suphachoke Sonsilphong; Amornthep Sonsilphong; Kovit Khampitak
16.40-17.00 Human-Machine Interface Device Using Piezoelectric Sensors Based on Facial Muscle Movements for Wheelchair Control Charoenporn Bouyam; Theerat Saichoo; Nannaphat Siribunyaphat; Yunyong Punsawad
17.00-17.20 Obstructive Sleep Apnea Classification Using Snore Sounds Based on Deep Learning Apichada Sillaparaya; Apichai Bhatranand; Chudanat Sudthongkhong; Kosin Chamnongthai; Yuttapong Jiraraksopakun
17.20-17.40 Heart Rate Estimation of Car Driver Using Radar Sensors and Blind Source Separation Keito Murata; Daichi Kitamura; Ryo Saito; Daichi Ueki
17.40-18.00 Total Variation Algorithms for PAT Image Reconstruction Mary Anjaley Josy John; Imad Barhumi
18.00-18.20 Visual Function and Emotional Regulation in Achromatic Color and Chromatic Color Using Low Resolution Brain Electromagnetic Tomography Analysis (LORETA) Watchara Sroykham; Yodchanan Wongsawat
18.20-18.40 Effect of Electrooculography on Electroencephalography Classifying Accuracy in Deep Learning and Reducing Number of Channels in Motor-Imagery Brain-Computer Interface Musashi Ino; Yoshihiro Kono; Nobuaki Kobayashi
Session Room Chair
WedPM2-5 (SS16: Emerging Techniques in Multimedia Data Analytics and Codings) Board Room 3 Patiwet Wuttisarnwattana / Kampol Woradit
Date Time Title Authors
9 November 2022 16.20-16.40 Optimal Deep Multi-Route Self-Attention for Single Image Super-Resolution Nisawan Ngambenjavichaikul; Sovann Chen; Supavadee Aramvith
16.40-17.00 Object Detection in Aerial Images With Attention-Based Regression Loss Chandler Timm C. Doloriel; Rhandley D. Cajote
17.00-17.20 Performance Analysis of JPEG XR With Deep Learning-Based Image Super-Resolution Taingliv Min; Supavadee Aramvith
17.20-17.40 MCSNet: Multi-Channel Sharing Network for Single Image Super-Resolution Wazir Muhammad; Supavadee Aramvith; Watchara Ruangsang
17.40-18.00 DCAN: Deep Consecutive Attention Network for Video Super Resolution Talha Saleem; Sovann Chen; Supavadee Aramvith
18.00-18.20 Wiener Filter-Based Color Attribute Quality Enhancement for Geometry-Based Point Cloud Compression Jinrui Xing; Hui Yuan; Chen Chen; Wei Gao
18.20-18.40 Mixed Context Techniques in the Adaptive Arithmetic Coding Process for DC Term and Lossless Image Encoding Evan Shih; Jian-Jiun Ding
Session Room Chair
WedPM2-6 (Signal Processing for Audio and Speech Applications) Chiang Mai 4 Sunao Hara / Sutasinee Thovuttikul
Date Time Title Authors
9 November 2022 16.20-16.40 Prediction Method of Soundscape Impressions Using Environmental Sounds and Aerial Photographs Yusuke Ono; Sunao Hara; Masanobu Abe
16.40-17.00 Robust Speech Dereverberation Based on Adaptive Weighted Prediction Error Algorithm With Eigenvector Extraction Yitong Chen; Wen Zhang
17.00-17.20 Multi-Task Learning for Speech Emotion and Emotion Intensity Recognition Pengcheng Yue; Leyuan Qu; Shukai Zheng; Taihao Li
17.20-17.40 Karaoke Generation From Songs: Recent Trends and Opportunities Preet Patel; Ansh Ray; Khushboo Thakkar; Kahan Sheth; Sapan H Mankad
17.40-18.00 Multi-Branch Learning for Noisy and Reverberant Monaural Speech Separation Chao Ma; Dongmei Li
18.00-18.20 Significance of Quadrature and In-Phase Components for Synthetic Spoofed Speech Detection Priyanka Gupta; Piyushkumar K. Chodingala; Hemant A. Patil
Session Room Chair
WedPM2-7 (SS20: High Performance Intelligent Technologies for Image and Video Applications) Chiang Mai 5 Jing-Ming Guo
Date Time Title Authors
9 November 2022 16.20-16.40 Mammography Quality Evaluation and Model Interpretation Based on CNN-Based Inframammary Fold Classification Yi-Chong Zeng; Yu-Cheng Wu; Chen-Yen Yeh; Shu-Chi Li; Tzu-Han Chou; Yi-Wen Huang; Giu-Cheng Hsu; Hsian-He Hsu
16.40-17.00 Hybrid Image Compression Framework Based on Single Image Training Tien-Ying Kuo; Yu-Jen Wei; Kuan-Yu Su
17.00-17.20 Highly Robust Action Retrieval Using View-Invariant Pose Feature and Simple Yet Effective Query Expansion Method Noboru Yoshida; Jianquan Liu
17.20-17.40 A Unified Compression and Watermarking Scheme for MT-BTC Images Jing-Ming Guo; Sankarasrinivasan Seshathiri
17.40-18.00 Fusion With Hierarchical Graphs for Multimodal Emotion Recognition Shuyun Tang; Zhaojie Luo; Guoshun Nan; Jun Baba; Yuichiro Yoshikawa; Hiroshi Ishiguro
18.00-18.20 Multi-Stage Superpixel-Based Segmentation Algorithm Using Fully Convolutional Networks and Discriminative Features Pei-Chi Huang; Jian-Jiun Ding
18.20-18.40 Deep Learning Acceleration Design Based on Low-Rank Approximation Yi-Hsiang Chang; Gwo Giun (Chris) Lee; Shiu-Yu Chen
Session Room Chair
WedPM2-8 (Data Analytics and Machine Learning) Board Room 4 Wanus Srimaharaj
Date Time Title Authors
9 November 2022 16.20-16.40 Internet of Behavior and Brain Response Identification for Cognitive Performance Analysis Wanus Srimaharaj; Roungsan Chaisricharoen
16.40-17.00 Refinement of Utterance Fluency Feature Extraction and Automated Scoring of L2 Oral Fluency With Dialogic Features Ryuki Matsuura; Shungo Suzuki; Mao Saeki; Tetsuji Ogawa; Yoichi Matsuyama
17.00-17.20 A Vision Transformer-Based Approach to Bearing Fault Classification via Vibration Signals Abid Hasan Zim; Aeyan Ashraf; Aquib Iqbal; Asad Malik; Minoru Kuribayashi
17.20-17.40 Analysis Method for Motion Factors Related to Joint Contact Forces at the Knee During Walking Using Grad-CAM Satoshi Suwa; Koh Inoue; Ryo Matsuoka
17.40-18.00 A Dataset and a Lightweight Object Detection Network for Thermal Image-Based Home Surveillance Zhengqiang Shao; Longbin Yan; Jie Chen; Jingdong Chen
18.00-18.20 SCQ: Self-Supervised Cross-Modal Quantization for Unsupervised Large-Scale Retrieval Fuga Nakamura; Ryosuke Harakawa; Masahiro Iwahashi
Session Room Chair
ThAM1-1 (Image Video Multimedia) Chiang Mai 1 Masaaki Ikehara
Date Time Title Authors
10 November 2022 9.00-9.20 Single Image Raindrop Removal Using a Non-Local Operator and Feature Maps in the Frequency Domain Shinya Ezumi; Masaaki Ikehara
9.20-9.40 Dual-Teacher Distillation for Low-Light Image Enhancement Jeong-Hyeok Park; Tae-Hyeon Kim; Jong-Ok Kim
9.40-10.00 Automatic Data Augmentation Method With Improved Interpretability for Image Classification in Computer Vision Applications Dair Ungarbayev; Osman Demirel; Muhammad Tahir Akhtar
10.00-10.20 Learning to Sharpen Partially Blurred Image via Iterative Blurred Region Mining and Recovery Jung Yeh; Wen-Li Wei; Duan-Yu Chen; Jen-Chun Lin
10.20-10.40 Shape-Bias Evaluation of Pretrained Models Using Image Decomposition Akinori Iwata; Masahiro Okuda
10.40-11.00 Proposal of Associative Watermarking Method Ryoto Kanegae; Masaki Kawamura
Session Room Chair
ThAM1-2 (Speech, Language, and Audio 1) Chiang Mai 2 Ying Hu / Toshio Irino
Date Time Title Authors
10 November 2022 9.00-9.20 DMF-Net: A Decoupling-Style Multi-Band Fusion Model for Full-Band Speech Enhancement Guochen Yu; Yuansheng Guan; Weixin Meng; Chengshi Zheng; Hui Wang; Yutian Wang
9.20-9.40 Speak Like a Dog: Human to Non-Human Creature Voice Conversion Kohei Suzuki; Shoki Sakamoto; Tadahiro Taniguchi; Hirokazu Kameoka
9.40-10.00 Pre-Trained Multimodal End-To-End Network for Spoken Language Assessment Incorporating Prompts Binghuai Lin; Liyuan Wang
10.00-10.20 Gated Fusion of Handcrafted and Deep Features for Robust Automatic Pronunciation Assessment Binghuai Lin; Liyuan Wang
10.20-10.40 Effective Data Screening Technique for Crowdsourced Speech Intelligibility Experiments: Evaluation With IRM-Based Speech Enhancement Ayako Yamamoto; Toshio Irino; Shoko Araki; Kenichi Arai; Atsunori Ogawa; Keisuke Kinoshita; Tomohiro Nakatani
Session Room Chair
ThAM1-3 (Deep Learning: Algorithm, Implementations, and Applications) Chiang Mai 3 Kasemsit Teeyapan
Date Time Title Authors
10 November 2022 9.00-9.20 Leveraging Pre-Trained Acoustic Feature Extractor for Affective Vocal Bursts Tasks Bagus Tris Atmaja; Akira Sasou
9.20-9.40 Flow-Based Variational Sequence Autoencoder Jen-Tzung Chien; Tien-Ching Luo
9.40-10.00 Speech Intelligibility Prediction Through Direct Estimation of Word Accuracy Using Conformer Naoyuki Kamo; Kenichi Arai; Atsunori Ogawa; Shoko Araki; Tomohiro Nakatani; Keisuke Kinoshita; Marc Delcroix; Tsubasa Ochiai; Toshio Irino
10.00-10.20 DNN-Rule Hybrid Dyna-Q for Sample-Efficient Task-Oriented Dialog Policy Learning Mingxin Zhang; Takahiro Shinozaki
10.20-10.40 MoCoVC: Non-Parallel Voice Conversion With Momentum Contrastive Representation Learning Kotaro Onishi; Toru Nakashika
10.40-11.00 Controllable Voice Conversion Based on Quantization of Voice Factor Scores Takumi Isako; Kotaro Onishi; Takuya Kishida; Toru Nakashika
Session Room Chair
ThAM1-4 (Biomedical Signal Processing and Systems) Board Room 2 Daranee Hormdee
Date Time Title Authors
10 November 2022 9.00-9.20 Deep Adaptive Denoising Auto-Encoder Networks for ECG Noise Cancelation via Time-Frequency Domain Amir Mohammadisarab; Poorya Aghaomidi; Jalil Mazloum; Mohammad Ali Akbarzadeh; Mahdi Orooji; Nader Mokari; Halim Yanikomeroglu
9.20-9.40 User-Item Recommendation Approaches to Detect Genomic Variant Interactions Emma Andrade; Nicholas Tom; Mario Banuelos
9.40-10.00 Teager Energy Cepstral Coefficients for Classification of Dysarthric Speech Severity-Level Aastha Kachhi; Anand Therattil; Ankur T. Patil; Hardik B. Sailor; Hemant A. Patil
10.00-10.20 Decoding Emotional Valence from EEG in Immersive Virtual Reality Guanxiong Pei; Bingjie Li; Taihao Li; Ruohao Xu; Jianmin Dong; Jia Jin
10.20-10.40 Design of A Wearable System for Hypoxic Training Management Using Blood Oxygenation and Heart Rate Takuma Kitagawa; Toshitaka Yamakawa
10.40-11.00 MedBERT: A Pre-Trained Language Model for Biomedical Named Entity Recognition Charangan Vasantharajan; Kyaw Zin Tun; Ho Thi-Nga; Sparsh Jain; Tong Rong; Chng Eng Siong
Session Room Chair
ThAM1-5 (SS21: Recent Advances and Applications in Encrypted Domain) Board Room 3 Simying Ong
Date Time Title Authors
10 November 2022 9.00-9.20 Encrypted JPEG Image Retrieval via Huffman-Code Based Self-Attention Networks Zhixun Lu; Qihua Feng; Peiya Li
9.20-9.40 Reversible Data Hiding in Encrypted Text Using Paillier Cryptosystem Asad Malik; Aeyan Ashraf; Hanzhou Wu; Minoru Kuribayashi
9.40-10.00 Scrambling-Embedding in Partially-Encrypted Images Koi Yee Ng; Simying Ong
10.00-10.20 Image Classification Using Vision Transformer for EtC Images Genki HAMANO; Shoko IMAIZUMI; Hitoshi KIYA
10.20-10.40 Image Watermarking Based on Saliency Detection and Multiple Transformations Ahmed Khan; KokSheik Wong; Vishnu Monn Baskaran
Session Room Chair
ThAM1-6 (SS19: Towards real-world human-centric acoustic signal processing) Chiang Mai 4 Sermsak Uatrongjit
Date Time Title Authors
10 November 2022 9.00-9.20 A Fast Converge Spectral Modulation Sensitive Active Noise Control System Kah-Meng Cheong; Yih Liang Shen; Tai-Shih Chi
9.20-9.40 Multimodal Forgery Detection Using Ensemble Learning Ammarah Hashmi; Sahibzada Adil Shahzad; Wasim Ahmad; Chia Wen Lin; Yu Tsao; Hsin-Min Wang
9.40-10.00 Speech Enhancement-Assisted Voice Conversion in Noisy Environments Yun-Ju Chan; Chiang-Jen Peng; Syu-Siang Wang; Hsin-Min Wang; Yu Tsao; Tai-Shih Chi
10.00-10.20 Effect of Noise on the Perceptual Contribution of Cochlea-Scaled Entropy and Speech Level in Mandarin Sentence Understanding Weikang Wu; Shangdi Liao; Fei Chen
10.20-10.40 EEG-Based Auditory Attention Detection With Estimated Speech Sources Separated From an Ideal-Binary-Masking Process Lei Wang; Fei Chen
10.40-11.00 Automatic Step Detection of Tandem Gait Test in Patients With Vestibular Hypofunction Using Wearable Sensors Yi-Ju Huang; Chien-Pin Liu; Kuan-Chung Ting; Chia-Yeh Hsieh; Kai-Chun Liu; Chia-Tai Chan
Session Room Chair
ThAM1-7 (SS22: Recent Advances in Biometrics and Security) Chiang Mai 5 Koichi Ito
Date Time Title Authors
10 November 2022 9.00-9.20 Continuous Authentication for Smartphones Using Face Images and Touch-Screen Operation Shuto Kinoshita; Yuka Watanabe; Yasushi Yamazaki
9.20-9.40 Spoofing Attack Detection in Face Recognition System Using Vision Transformer With Patch-Wise Data Augmentation Kota Watanabe; Koichi Ito; Takafumi Aoki
9.40-10.00 A Simple and Accurate CNN for Iris Recognition Shokei Kawakami; Hiroya Kawai; Koichi Ito; Takafumi Aoki; Yoshiko Yasumura; Masakazu Fujio; Yosuke Kaga; Kenta Takahashi
10.00-10.20 Eyeglass Frame Segmentation for Face Image Processing Kanta Miura; Takamichi Miyamoto; Kazuyuki Sakurai; Koichi Ito; Takafumi Aoki
10.20-10.40 A Fair Model is Not Fair in a Biased Environment Yuya Sato; Soshi Maeda; Muku Akasaka; Masakatsu Nishigaki; Tetsushi Ohki
Session Room Chair
ThAM1-8 (Other related speech processing) Board Room 4 Sansanee Auephanwiriyakul
Date Time Title Authors
10 November 2022 9.00-9.20 Intelligibility Prediction of Enhanced Speech Using Recognition Accuracy of End-To-End ASR System Kenichi Arai; Atsunori Ogawa; Shoko Araki; Keisuke Kinoshita; Tomohiro Nakatani; Naoyuki Kamo; Toshio Irino
9.20-9.40 Hi, KIA: A Speech Emotion Recognition Dataset for Wake-Up Words Taesu Kim; SeungHeon Doh; Gyunpyo Lee; Hyeongseok Jeon; Juhan Nam; Hyeon-Jeong Suk
9.40-10.00 Improving Speech Emotion Recognition via Fine-Tuning ASR With Speaker Information Bao Thang Ta; Tung Lam Nguyen; Dinh Son Dang; Nhat Minh Le; Van Hai Do
10.00-10.20 3CMLF: Three-Stage Curriculum-Based Mutual Learning Framework for Audio-Text Retrieval Yi-Wen Chao; Dongchao Yang; Rongzhi Gu; Yuexian Zou
Session Room Chair
ThPM1-1 (Image Video Multimedia) Chiang Mai 1 Masaki Kawamura
Date Time Title Authors
10 November 2022 12.30-12.50 Neural Network Based Watermarking Trained With Quantized Activation Function Shingo Yamauchi; Masaki Kawamura
12.50-13.10 A Multiframe Super-Resolution Pipeline for Sub-Image-Typed Light Field Data Chien-Han Hsu; Yi-Hsien Lin; Yen-Po Lin; Yi-Chang Lu
13.10-13.30 Restoring Edge and Color Using Weighted Near-Infrared Image and Color Transmission Maps for Robust Haze Removal Onhi Kato; Akira Kubota
13.30-13.50 Dense View Interpolation of 4D Light Fields for Real-Time Augmented Reality Applications Hidemichi Yoshino; Kazuya Kodama; Takayuki Hamamoto
13.50-14.10 Bolt Looseness Identification Using Faster R-CNN and Grid Mask Augmentation Natchapon Panmatharit; Yuttapong Jiraraksopakun; Anek Siripanichgorn; Punnarai Siricharoen
14.10-14.30 Large-Scale Blind Face Super-Resolution via Edge Guided Frequency Aware Generative Facial Prior Networks Xi Cheng; Wan-Chi Siu; Jian Yang
Session Room Chair
ThPM1-2 (Speech, Language, and Audio 1) Chiang Mai 2 Takanobu Nishiura
Date Time Title Authors
10 November 2022 12.30-12.50 Language-Based Audio Retrieval With Converging Tied Layers and Contrastive Loss Andrew Koh; Chng Eng Siong
12.50-13.10 D²Net: A Denoising and Dereverberation Network Based on Two-Branch Encoder and Dual-Path Transformer Liusong Wang; Wenbing Wei; Yadong Chen; Ying Hu
13.10-13.30 Direct Speech-Reply Generation From Text-Dialogue Context Kenichi Fujita; Yusuke Ijima; Hiroaki Sugiyama
13.30-13.50 Sequence-Wise Optimization for Quasi-Harmonic Speech Waveform Modeling Shaowen Chen; Tomoki Toda
13.50-14.10 Lattice-Based Data Augmentation for Code-Switching Speech Recognition Roland Hartanto; Kuniaki Uto; Koichi Shinoda
14.10-14.30 Phase-Aware Audio Super-Resolution for Music Signals Using Wasserstein Generative Adversarial Network Yanqiao Yan; Binh Thien Nguyen; Yuting Geng; Kenta Iwai; Takanobu Nishiura
Session Room Chair
ThPM1-3 (Deep Learning: Algorithm, Implementations, and Applications) Chiang Mai 3 Jen-Chun Lin
Date Time Title Authors
10 November 2022 12.30-12.50 Speech Emotion Recognition Based on the Reconstruction of Acoustic and Text Features in Latent Space Jennifer Santoso; Rintaro Sekiguchi; Takeshi Yamada; Kenkichi Ishizuka; Taiichi Hashimoto; Shoji Makino
12.50-13.10 A Light CNN With Split Batch Normalization for Spoofed Speech Detection Using Data Augmentation Haojian Lin; Yang Ai; Zhenhua Ling
13.10-13.30 On the Optimal Classifier for Affective Vocal Bursts and Stuttering Predictions Based on Pre-Trained Acoustic Embedding Bagus Tris Atmaja; Zanjabila; Akira Sasou
13.30-13.50 Nonlinear Residual Echo Suppression Based on Gated Dual Signal Transformation LSTM Network Kai Xie; Ziye Yang; Jie Chen
13.50-14.10 Adaptive End-To-End Text-To-Speech Synthesis Based on Error Correction Feedback From Humans Kazuki Fujii; Yuki Saito; Hiroshi Saruwatari
14.10-14.30 Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-To-Speech Byoung Jin Choi; Myeonghun Jeong; Minchan Kim; Sung Hwan Mun; Nam Soo Kim
Session Room Chair
ThPM1-4 (SS07: Latest Wireless Technologies for Sensing and Communications) Board Room 2 Osamu Takyu
Date Time Title Authors
10 November 2022 12.30-12.50 Performance Evaluation of FISTA With Constant Inertial Parameter Kaito Kameda; Ryo Hayakawa; Kazunori Hayashi; Youji Iiguni
12.50-13.10 An Approximated ADMM Based Algorithm for \(\ell_1-\ell_2\) Optimization Problem Rui Lin; Kazunori Hayashi
13.10-13.30 Antenna Beamforming Selection With Low Complexity and High Exploitation of White Space in Frequency Spectrum Sharing Kizuku Kawamura; Kohei Akimoto; Osamu Takyu
13.30-13.50 Individual Memory Driven Transformer Deep Learning Model for Multi-Cell Massive MIMO Beam Prediction Taisei Urakami; Haohui Jia; Na Chen; Minoru Okada
13.50-14.10 Deep Unfolding-Aided Sum-Product Algorithm for Error Correction of CRC Coded Short Message Qilin Zhang; Shinsuke Ibi; Takumi Takahashi; Hisato Iwai
14.10-14.30 Successive Interference Cancellation for Signal Demodulation of Multiple LPWA Systems Shinichiro Kakuda; Takeo Fujii; Shusuke Narieda
Session Room Chair
ThPM1-5 (SS08: Digital Convergence of 5G/B5G, AIoT and Security) Board Room 3 Kampol Woradit
Date Time Title Authors
10 November 2022 12.30-12.50 Evaluation of Voice Service in LEO Communication With 3GPP PUSCH Repetition Enhancement Shou-Hong Liu; Chun-Tai Liu; Wei-Hung Chou; JenYi Pan
12.50-13.10 Modeling of Malware Diffusion With Mobile Devices in Intermittently Connected Networks Hideyoshi Miura; Shoya Abukawa; Tomotaka Kimura; Kouji Hirata
13.10-13.30 Software Defined Radio Access Network Sharing by Multi-Operator Core Networks Wen-Ping Lai; Wen-Ru Chen; Ming-Jay Lai; Hong-Lun Lai; Chia-Ying Lin; Po-Chen Tseng
13.30-13.50 Machine Learning Based End-To-End Constellation Training for Communication Systems Po-Chiang Lin
13.50-14.10 Flow-Based DDoS Detection Using Deep Neural Network With Radial Basis Function Neural Network Ting-Chung Leung; Lee Chung-Nan
14.10-14.30 Implement a Continuous Learning Model to Detect Different Types of DDoS Attacks With Hierarchical Temporal Memory Hung Manh Nguyen; Yu-Kuen Lai
Session Room Chair
ThPM1-6 (SS23: Selected Papers from APSIPA Workshop in Hanoi, Vietnam) Chiang Mai 4 Nguyen Linh Trung
Date Time Title Authors
10 November 2022 12.30-12.50 Dynamic Hand Gesture Recognition From Egocentric Videos Based on SlowFast Architecture Ha-Dang Ho; Hong-Quan Nguyen; Thuy-Binh Nguyen; Sinh-Thuong Vu; Thi-Lan Le
12.50-13.10 Deep Learning-Based Signal Detection for Dual-Mode Index Modulation 3D-OFDM Dang-Y Hoang; Tien-Hoa Nguyen; Vu-Duc Ngo; Trung Tan Nguyen; Nguyen Cong Luong; Thien Van Luong
13.10-13.30 A Comparison of Feature Selection and Feature Extraction in Network Intrusion Detection Systems Tuan-Cuong Vuong; Hung Tran; Mai Xuan Trang; Vu-Duc Ngo; Thien Van Luong
13.30-13.50 Deep Neural Network-Based Detector for Single-Carrier Index Modulation NOMA Toan Gian; Vu-Duc Ngo; Tien-Hoa Nguyen; Trung Tan Nguyen; Thien Van Luong
13.50-14.10 Vibration Measurement Using Spatial Shifting Coherent Digital Holography Long Hai Ngo; Quang Duc Pham
14.10-14.30 Robust Online Tucker Dictionary Learning From Multidimensional Data Streams Le Trung Thanh; Tran Trong Duy; Karim Abed-Meraim; Nguyen Linh Trung; Adel Hafiane
Session Room Chair
ThPM1-7 (SS06: Adversarial Attacks and Defense) Chiang Mai 5 Minoru Kuribayashi
Date Time Title Authors
10 November 2022 12.30-12.50 Survey on Vision Based Fake News Detection and Its Impact Analysis Mehul S Raval; Mohendra Roy; Minoru Kuribayashi
12.50-13.10 StyleGAN Encoder-Based Attack for Block Scrambled Face Images AprilPyone MaungMaung; Hitoshi Kiya
13.10-13.30 On the Adversarial Transferability of ConvMixer Models Ryota Iijima; Miki Tanaka; Isao Echizen; Hitoshi Kiya
13.30-13.50 Detection and Correction of Adversarial Examples Based on JPEG-Compression-Derived Distortion Kenta Tsunomori; Yuma Yamasaki; Minoru Kuribayashi; Nobuo Funabiki; Isao Echizen
13.50-14.10 Defense Against Adversarial Examples Using Beneficial Noise Param Raval; Harin Khakhi; Minoru Kuribayashi; Mehul S. Raval
14.10-14.30 Privacy Protection Against Automated Tracking System Using Adversarial Patch Hiroto Takiwaki; Minoru Kuribayashi; Nobuo Funabiki; Mehul Shirishchandra Raval
Session Room Chair
ThPM1-8 (Industrial Forum "New era opened by AI-based image processing") Board Room 4 Jangwoo Kwon
Date Time Title Authors
10 November 2022 12.30-14.30 Towards Best Possible Deep Learning Acceleration on the Edge – A Compression-Compilation Co-Design Framework Yanzhi Wang, Northeastern University, Chairman and former CEO of CoCoPIE Inc., USA
Empowering Future Pathology with Artificial Intelligence Shuhao Wang, Co-founder and CTO of Thorough Future, China
Session Room Chair
ThPM2-1 (Image Video Multimedia) Chiang Mai 1 Nam Ik Cho
Date Time Title Authors
10 November 2022 15.00-15.20 Syllable Analysis Data Augmentation for Khmer Ancient Palm Leaf Recognition Nimol Thuon; Jun Du; Jianshu Zhang
15.20-15.40 Multi-Class Vehicle Counting System for Multi-View Traffic Videos Wichukorn Kuntintara; Kanokphan Lertniphonphan; Punnarai Siricharoen
15.40-16.00 Table Structure Recognition Based on Grid Shape Graph Eunji Lee; Junhyeong Kwon; Haeyoon Yang; Jaewoo Park; Soonyoung Lee; Hyung Il Koo; Nam Ik Cho
16.00-16.20 Feature Distillation Network for Multi-Band NIR Colorization Tae-Sung Park; Tae-Hyeon Kim; Jong-Ok Kim
16.20-16.40 Blur Detection for Surveillance Camera System Yikun Pan; Sik-Ho Tsang; Yui-Lam Chan; Daniel P.K. Lun
16.40-17.00 Lip Sync Matters: A Novel Multimodal Forgery Detector Sahibzada Adil Shahzad; Ammarah Hashmi; Sarwar Khan; Yan-Tsung Peng; Yu Tsao; Hsin-Min Wang
Session Room Chair
ThPM2-2 (Speech, Language, and Audio 1) Chiang Mai 2 Kittichai Wantanajittikul
Date Time Title Authors
10 November 2022 15.00-15.20 Frame-Level Matching Scheme Using Posteriorgram Probability Distance of Spoken Data to Improve Search Accuracy of Spoken Term Detection Reo Minakawa; Kazunori Kojima; Shi-wook Lee; Yoshiaki Itoh
15.20-15.40 Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis Yuta Matsunaga; Takaaki Saeki; Shinnosuke Takamichi; Hiroshi Saruwatari
15.40-16.00 Using Perceptual Quality Features in the Design of the Loss Function for Speech Enhancement Nicholas Eng; Yusuke Hioka; Catherine I Watson
16.00-16.20 Correlation Loss for MOS Prediction of Synthetic Speech Beibei Hu; Qiang Li
16.20-16.40 Back-Translation-Style Data Augmentation for Mandarin Chinese Polyphone Disambiguation Chunyu Qiang; Peng Yang; Hao Che; Jinba Xiao; Xiaorui Wang; Zhongyuan Wang
16.40-17.00 Classification of Short Audio Acoustic Scenes Based on Data Augmentation Methods Xuan Zhang; Yunfei Shao; Junjie Xu; Yong Ma; Wei-Qiang Zhang
Session Room Chair
ThPM2-3 (Deep Learning: Algorithm, Implementations, and Applications) Chiang Mai 3 Kasemsit Teeyapan
Date Time Title Authors
10 November 2022 15.00-15.20 Improving Unsupervised Anomalous Sound Detection Performance of Autoencoder and Its Variant With Pretrained Deep Belief Network Yufeng Deng; Jia Liu; Wei-Qiang Zhang
15.20-15.40 ASGAN-VC: One-Shot Voice Conversion With Additional Style Embedding and Generative Adversarial Networks Wei-Cheng Li; Tzer-Jen Wei
15.40-16.00 Fusing Multiple Bandwidth Spectrograms for Improving Speech Enhancement Hao Shi; Yuchun Shu; Longbiao Wang; Jianwu Dang; Tatsuya Kawahara
16.00-16.20 End-To-End Two-Dimensional Sound Source Localization With Ad-Hoc Microphone Arrays Yijun Gong; Shupei Liu; Xiao-Lei Zhang
16.20-16.40 Exploring Speaker Age Estimation on Different Self-Supervised Learning Models Tuan Duc Truong; Tran The Anh; Eng-Siong Chng
16.40-17.00 Mandarin Singing Voice Synthesis With Denoising Diffusion Probabilistic Wasserstein GAN Yin-Ping Cho; Yu Tsao; Hsin-Min Wang; Yi-Wen Liu
Session Room Chair
ThPM2-4 (SS18: Metaverse: Future of Internet) Board Room 2 Navadon Khunlertgit
Date Time Title Authors
10 November 2022 15.00-15.20 Physiological Study on the Effect of Game Events in Response to Player's Laughter Mikito Fukuda; Yoshiko Arimoto
15.20-15.40 Development of a Virtual Telecommunication System Research Laboratory Siwanart Jearavongtakul; Imran Saeed Mirza; Lunchakorn Wuttisittikulkij; Pruk Sasithong; Suebphong Noisri; Pisit Vanichchanunt
15.40-16.00 Camera-Based Log System for Human Physical Distance Tracking in Classroom Somrudee Deepaisarn; Angkoon Angkoonsawaengsuk; Charn Arunkit; Chayud Srisumarnk; Krongkan Nimmanwatthana; Nanmanas Linphrachaya; Nattapol Chiewnawintawat; Rinrada Tanthanathewin; Sivakorn Seinglek; Suphachok Buaruk; Virach Sornlertlamvanich
16.00-16.20 Detecting Replay Attacks Using Single-Channel Audio: The Temporal Autocorrelation of Speech Shih-Kuang Lee; Yu Tsao; Hsin-Min Wang
Session Room Chair
ThPM2-5 (Wireless Communication and Networking) Board Room 3 Poompat Saengudomlert
Date Time Title Authors
10 November 2022 15.00-15.20 Automatic Detection of Dimmable Pulse Position Modulation for Visible Light Communication Poompat Saengudomlert; Karel Sterckx
15.20-15.40 Estimation of Angular Power Spectrum Using Multikernel Adaptive Filtering Eiji Ninomiya; Masahiro Yukawa; Renato L. G. Cavalcante; Lorenzo Miretti
15.40-16.00 Novel Smart Sectoring and Beam Designs in mmWave Broadcast Channels Yan-Yin He; Shang-Ho (Lawrence) Tsai; Jen-Ming Wu
16.00-16.20 New Methods for Fast Detection for Embedded Cognitive Radio Grégoire de Broglie; Louis Morge-Rollet; Denis Le Jeune; Frédéric Le Roy; Christian Roland; Charles Canaff; Jean-Philippe Diguet
Session Room Chair
ThPM2-6 (SS23: Selected Papers from APSIPA Workshop in Hanoi, Vietnam) Chiang Mai 4 Nguyen Linh Trung
Date Time Title Authors
10 November 2022 15.00-15.20 Needle Localization and Segmentation for Radiofrequency Ablation of Liver Tumors Under CT Image Guidance Le Quoc Anh; Luu Manh Ha; Theo van Walsum; Adriaan Moelker; Dao Viet Hang; Pham Cam Phuong; Vu Duy Thanh
15.20-15.40 End-To-End Visual-Guided Audio Source Separation With Enhanced Losses Duc-Huy Pham; Quang-Anh Do; Thanh Thi-Hien Duong; Thi-Lan Le; Phi Le Nguyen
15.40-16.00 Automated Classification of Lung Injury From X-Ray Images Using Deep Learning Network Huy Le; Thanh-Ha Do
16.00-16.20 AI-Based Video Analysis for Traffic Monitoring Bui Son Tung; Phung The Ngoc; Do Duy Thanh; Nguyen Hong Thinh
16.20-16.40 Adaptive Filtering-Based Heavy-Noise Removal in Born Iterative Method Tran Quang-Huy; Luong Thi Theu; Nguyen Canh Minh; Duc-Nghia Tran; Duc-Tan Tran
16.40-17.00 A Novel Deep Learning-Based Approach for Sleep Apnea Detection Using Single-Lead ECG Signals Anh-Tu Nguyen; Thao Nguyen; Huy-Khiem Le; Huy-Hieu Pham; Cuong Do
Session Room Chair
ThPM2-7 (SS15: Advanced Sensing Technologies using Wireless Signal) Chiang Mai 5 Kampol Woradit
Date Time Title Authors
10 November 2022 15.00-15.20 Multi-Resolution GPR Clutter Suppression Method Based on Low-Rank and Sparse Decomposition Yanjie Cao; Xiaopeng Yang; Tian Lan
15.20-15.40 Indoor Human Motion Recognition Method Based on Kernel-Distance Doppler Velocity Estimation and Lightweight Network Weicheng Gao; Xiaopeng Yang; Xiaodong Qu; Jiancheng Liao; Zixiang Yin; Ding Zhang
15.40-16.00 Mainlobe Interference Suppression Method Based on Blocking Matrix Preprocessing With Low Sidelobe Constraint Meng Haoyu; Qu Xiaodong; Zhang Xingyu; Li Wolin; Zhang Zhengyan; Yang Xiaopeng
16.00-16.20 Continuous Tracking of Indoor Human Targets Based on Millimeter Wave Radar Meiqiu Jiang; Shisheng Guo; Haolan Luo; Guolong Cui
16.20-16.40 Reconfigurable Intelligent Surfaces Aided WiFi Imaging Ying He; Dongheng Zhang; Yan Chen