Tutorials ©APSIPA ASC 2013

Tutorial #1: Ekachai Leelarasmee
Tutorial #2: Tomasz M. Rutkowski
Tutorial #3: Waleed H. Abdulla | Iman T. Ardekani
Tutorial #4: Y.-W. Peter Hong | Tsung-Hui Chang
Tutorial #5: Koichi Shinoda | Jen-Tzung Chien
Tutorial #6: Weisi Lin
Tutorial #7: Ying-Dar Lin
Tutorial #8: Oscar C. Au

Tutorial #1
Power Line Communication: Introduction and Implementation to Solar Farm Monitoring

Ekachai Leelarasmee
Chulalongkorn University, Bangkok, Thailand

Abstract
This tutorial presents the use of power cables to carry both electrical energy and communication signals. Power line communication (PLC) technologies inject the signal into the cable as a modulated carrier at a high frequency, around 100 kHz or more. PLC is used in many areas, such as home automation, internet access and automatic meter reading. Several standards, such as G3 and PRIME, have been proposed and implemented. Its advantages are that no additional communication wires are needed and that it can cover a larger area than radio frequency communication, e.g. at 2.4 GHz.
Although PLC is well known for AC power lines, where communication devices are connected in parallel, it can also be used where these devices are connected in series. Such a situation arises in a DC solar farm, where the voltages of series-connected photovoltaic panels are to be monitored remotely. Circuits that implement this type of transmission are quite different from those of normal parallel PLC and will be described.

Tutorial #2
Beyond the Visual and Imagery Based BCI – The New Developments in Spatial Auditory and Somatosensory Based Paradigms

Tomasz M. Rutkowski
University of Tsukuba, Japan

Abstract
The tutorial content is based on the workshops to be delivered this year by the proposer at IEEE World Haptics Congress 2013 and BCI Meeting 2013. State-of-the-art stimulus-driven BCI paradigms rely mostly on visual modalities. Recently, somatosensory (tactile or haptic) and auditory approaches have been proposed as alternative ways to deliver sensory stimulation, which could be crucial for patients suffering from weak or lost eyesight or hearing. Several techniques have already been developed to connect the BCI to a traditional haptic interface or to use such interfaces as stimulation sources. The tutorial will present recent developments and discuss the pros and cons of the tactile and auditory BCI approaches. Vibrotactile stimulation also makes it possible to create a bone-conduction sensory effect when exciters are applied to the head area. This concept, created by the tutorial author and collaborators, is still very preliminary yet already has existing applications. It offers an interesting possibility of delivering multimodal stimuli (somatosensory and auditory combined) to TLS/ALS subjects with a very fast information transfer rate. I will also present classical haptic HCI examples to discuss possible applications for future BCI prototypes. The new BCI paradigms also bring novel data-driven signal processing and machine learning methods, which will be presented and discussed during the tutorial.

Tutorial #3
Active Noise Control: Fundamentals and Recent Advances

Waleed H. Abdulla
The University of Auckland, New Zealand

Iman T. Ardekani
The University of Auckland, New Zealand

Abstract
Noise is a major cause of concern in modern life! It is the fastest growing pollutant in the urban environment; it causes annoyance, illness and loss of quality of life, and reduces life expectancy. Medical studies show that noise affects the nervous and hormonal systems and consequently disrupts the stability of the human biological system. In this tutorial we will see how noise at a certain location can be minimized by introducing an anti-noise signal. Theoretically, a mechanical disturbance such as acoustic noise, machinery vibration or seismic vibration can be actively cancelled by an artificially generated disturbance (control field) of the same type which is equal in magnitude and opposite in phase. For example, acoustic noise at a specific spot can be reduced by using a loudspeaker to introduce another noise (anti-noise) that combines destructively with the original noise. This basic principle is the essence of Active Noise Control.
Active Noise Control (ANC) is an elegant approach to neutralizing noise in the acoustic domain. ANC is developing rapidly because it permits improvements in noise control with potential benefits in size, weight and system cost. In addition, noise can be reduced without physical modification of the existing noise sources or their physical arrangement. ANC systems have advanced over time with the digital technology revolution. Many algorithms and techniques have been developed to realize efficient active noise control systems in response to the growing demand for such systems.
In this tutorial we will discuss the fundamentals of ANC systems from both theoretical and practical aspects. We will also talk about recent advancements, a novel analysis technique based on root locus, and 3D ANC, where we discuss the pathway to optimizing a zone of silence rather than a point of silence.
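The cancellation principle stated in this abstract (an anti-noise signal equal in magnitude and opposite in phase) can be illustrated with a minimal sketch. It is not taken from the tutorial material: the single 200 Hz tone, the 8 kHz sample rate and the function names are assumptions chosen for illustration, and a practical ANC system would adapt the anti-noise signal rather than know it in advance.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def residual_rms(gain_error=0.0, phase_error_deg=0.0,
                 freq_hz=200.0, sample_rate_hz=8000, n_samples=8000):
    """RMS of noise + anti-noise at the cancellation point.

    The noise is a unit-amplitude sine (an assumption for illustration).
    The anti-noise is the inverted sine, optionally with a gain error
    and a phase error to show how imperfect matching limits cancellation.
    """
    phase_error = math.radians(phase_error_deg)
    residual = []
    for n in range(n_samples):
        angle = 2.0 * math.pi * freq_hz * n / sample_rate_hz
        noise = math.sin(angle)
        anti_noise = -(1.0 + gain_error) * math.sin(angle + phase_error)
        residual.append(noise + anti_noise)
    return rms(residual)

def attenuation_db(residual_level, noise_level=math.sqrt(0.5)):
    """Attenuation relative to the RMS of a unit-amplitude sine."""
    if residual_level == 0.0:
        return math.inf
    return 20.0 * math.log10(noise_level / residual_level)

perfect = residual_rms()                         # ideal anti-noise
mismatched = residual_rms(phase_error_deg=10.0)  # 10 degree phase error

print("residual RMS, ideal anti-noise:", perfect)
print("attenuation with 10 degree phase error (dB):",
      round(attenuation_db(mismatched), 2))
```

With an ideal anti-noise signal the residual vanishes, while a phase mismatch leaves a residual tone, which is one reason practical systems rely on adaptive algorithms.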

Tutorial #4
MIMO Signal Processing Techniques to Enhance Physical Layer Security

Y.-W. Peter Hong
National Tsing Hua University, Taiwan

Tsung-Hui Chang
National Taiwan University of Science and Technology, Taiwan

Abstract
This tutorial provides an overview of signal processing techniques proposed to enhance information security in the physical layer of multi-antenna wireless communication systems. Wireless physical-layer secrecy has attracted much attention in recent years due to the broadcast nature of the wireless medium and its inherent vulnerability to eavesdropping. Motivated by results in information theory, signal processing techniques in both the data transmission and the channel estimation phases have been explored in the literature to enlarge the signal quality difference between the target receiver (also called the destination) and the eavesdropper. In the data transmission phase, secrecy beamforming and precoding schemes are used to enhance signal quality at the destination while limiting the signal strength at the eavesdropper. Artificial noise (AN) is also added on top of beamformed or precoded signals to further degrade reception at the eavesdropper. In the channel estimation phase, training procedures are developed to yield different channel estimation performance at the destination and the eavesdropper. As a result, the effective signal-to-noise ratios at the two terminals will differ, and a more favorable secrecy channel will be available for use in the data transmission phase.
Unlike most talks on physical-layer secrecy, we focus on the signal processing aspects of the problem as opposed to coding or information-theoretic discussions. In particular, this tutorial covers four main topics: (i) a basic review of information-theoretic secrecy, (ii) secrecy beamforming and precoding techniques for data transmission, (iii) secrecy-enhancing artificial noise and jamming signal usage, and (iv) training designs for discriminatory channel estimation. Extensions to more advanced wireless applications will also be discussed.
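The idea of adding artificial noise on top of a beamformed signal can be sketched as follows. This is a hedged illustration, not the presenters' method: it assumes one common construction in which the transmitter knows the destination channel h, uses a matched (maximum-ratio) beamformer, and places the artificial noise in the null space of h so that it vanishes at the destination. The channel model, antenna count, seed and all function names are assumptions, and receiver noise is omitted.

```python
import math
import random

def random_channel(n_tx, rng):
    """Random complex channel vector (illustrative Rayleigh-like model)."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
            for _ in range(n_tx)]

def receive(channel, tx):
    """Noise-free sample seen by a single-antenna receiver: sum_i c_i * x_i."""
    return sum(c * x for c, x in zip(channel, tx))

def matched_beamformer(h):
    """Unit-norm beamformer aligned with the destination channel h."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in h))
    return [c.conjugate() / norm for c in h]

def artificial_noise(w, rng):
    """Random vector with its component along w removed.

    Because w is aligned with conj(h), the result is orthogonal to the
    destination channel and produces no interference there.
    """
    n = random_channel(len(w), rng)
    along_w = sum(wi.conjugate() * ni for wi, ni in zip(w, n))
    return [ni - wi * along_w for wi, ni in zip(w, n)]

rng = random.Random(0)           # fixed seed, for repeatability only
h = random_channel(4, rng)       # transmitter-to-destination channel
g = random_channel(4, rng)       # transmitter-to-eavesdropper channel
w = matched_beamformer(h)
z = artificial_noise(w, rng)

symbol = 1 + 0j
tx = [wi * symbol + zi for wi, zi in zip(w, z)]

an_at_destination = abs(receive(h, z))   # numerically zero
an_at_eavesdropper = abs(receive(g, z))  # generally non-zero
print("AN magnitude at destination: ", an_at_destination)
print("AN magnitude at eavesdropper:", an_at_eavesdropper)
```

The destination receives the symbol scaled by the channel norm with no artificial-noise term, while the eavesdropper, whose channel g differs from h, sees the symbol corrupted by the artificial noise.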

Tutorial #5
Machine Learning for Multimedia Sequential Pattern Recognition

Koichi Shinoda
Tokyo Institute of Technology, Japan

Jen-Tzung Chien
National Chiao Tung University, Taiwan

Abstract
In this tutorial, we will present state-of-the-art machine learning approaches for multimedia sequential pattern recognition. Sequential patterns, e.g. speech, music, language, image and video, are ubiquitous in real-world information systems. Extensive knowledge of statistical models is required to design a flexible and robust system that can cope with heterogeneous environments. This tutorial starts with an introduction to sequential pattern modeling based on hidden Markov models (HMMs) and surveys a series of machine learning approaches to different issues in pattern recognition, including mismatched conditions, poor alignment, missing labels, ambiguous classes, over-trained parameters and non-stationary environments. We will address recent advances in adaptive learning, sparse learning, semi-supervised learning and discriminative training, and present solutions to challenging problems in speech recognition, face recognition, video indexing, speaker clustering and person authentication. In speech recognition, we will present robust statistics for feature normalization, Bayesian sparse learning and semi-supervised learning for the acoustic model, and semantic topic modeling for the language model. Online learning for speaker clustering will also be addressed. In addition, we will present how online learning and discriminative learning work for face recognition. In video information retrieval, we will first introduce the research activities in the NIST TRECVID workshop, in which many teams from all over the world compete with each other in several predetermined tasks related to video information retrieval. Then, we will explain the video semantic indexing task and its state-of-the-art statistical framework based on Gaussian mixture model (GMM) supervectors and support vector machines (SVMs). We will also explain how maximum a posteriori adaptation is effectively used in this task.
Next, we explain another task, multimedia event detection, and introduce several machine learning approaches for it. We also briefly introduce the other tasks in TRECVID and show how machine learning is effectively used in those tasks. Finally, we will point out new trends in pattern recognition and machine learning approaches for multimedia signal and information processing.

Tutorial #6
Perceptual Quality Evaluation for Image and Video: from Modules to Systems

Weisi Lin
Nanyang Technological University, Singapore

Abstract
Since the human visual system (HVS) is the ultimate receiver and appreciator of the majority of processed images and video, it is preferable to use a perceptually plausible criterion in visual signal quality evaluation, as well as in the related system design and optimization. After millions of years of evolution, the HVS has developed unique characteristics, so it is meaningful and important to make machines perceive as the HVS does. Significant research effort has been made during the past two decades toward modelling the HVS' picture quality evaluation mechanism and applying the resultant models to various situations (e.g., quality metrics, image/video compression, watermarking, channel coding, signal restoration/enhancement, computer graphics, visual content retrieval and medical instrumentation).
In this tutorial, we will first introduce the problems under attack, the relevant physiological/psychological knowledge, and the work so far in the related fields. Afterward, we will present the three major parts of this tutorial. In the first major part, the basic computational modules are discussed. These include the models for signal decomposition, just-noticeable difference (JND), visual attention, and common artifact detection. In the second major part, different types of perceptually driven techniques (in either embedded or standalone forms; both the classical and the state-of-the-art ones) will be presented for picture quality evaluation. Finally, we will discuss the emerging trends and future R&D possibilities.
This tutorial aims to provide a systematic, comprehensive and up-to-date overview of perception-based evaluation for images and video. It can also serve as a practical user's guide to the various relevant techniques (with well-cited works highlighted). All approaches are presented with clear classification and careful comparison and comments whenever possible, based upon our understanding and experience in these areas (in both academic and industrial aspects).

Tutorial #7
Research Roadmap Driven by Network Benchmarking Lab (NBL): Deep Packet Inspection, Traffic Forensics, WLAN/LTE, Embedded Benchmarking, and Beyond

Ying-Dar Lin
National Chiao Tung University, Taiwan

Abstract
Most researchers look for topics in the literature. Our research, however, has been driven mostly by development, which in turn has been driven by industrial projects or lab work. We first compare three different sources of research topics. We then derive two research tracks driven by product development and product testing, named the blue track and the green track, respectively. Each track is further divided into a development plane and a research plane. The blue track on product development has fostered a startup company (L7 Networks Inc.) and a textbook (Computer Networks: An Open Source Approach, McGraw-Hill 2011) at the development plane, and a research roadmap on QoS and deep packet inspection (DPI) at the research plane. On the other hand, the green track on product testing has triggered a third-party test bed, Network Benchmarking Lab (NBL, www.nbl.org.tw), at the development plane and a research roadmap on traffic forensics, WLAN/LTE, and embedded benchmarking at the research plane. Throughout this talk, we illustrate how development and research can be highly interleaved. At the end, we share lessons accumulated over the past decade. The audience will see how research can be conducted in a different way.

Tutorial #8
Next Generation Video Coding: H.265/HEVC and Its Extensions

Oscar C. Au
The Hong Kong University of Science and Technology, Hong Kong

Abstract
In March 2013, H.265/HEVC was completed and achieved its FDIS status. It is surely the most significant event in the digital video compression field in a decade. Through the collaborative effort of many experts, H.265/HEVC provides approximately twice the compression performance of the prior standard, i.e. it maintains the same level of video quality while using only half the bit rate. In particular, it places special emphasis on hardware-friendly design and parallel-processing architectures. The Joint Collaborative Team on Video Coding (JCT-VC) is now working hard on developing extensions of H.265/HEVC to enhance the design and address different application scenarios (e.g. enhanced chroma formats, scalable video coding (SVC), and 3D applications).
In this tutorial, we review the development of H.265/HEVC and the main coding tools that were accepted. We also examine some coding tools that were not accepted, as well as hotly discussed topics that attracted a lot of attention and study during the meetings. The development status of the SVC extension and the 3D extension is introduced. There are plenty of research opportunities in H.265/HEVC and beyond. Participants will gain an understanding of novel techniques in next-generation video coding standards, along with some perspectives on future applications and research opportunities.