Seminar on Cluster Analysis and Machine Learning

【Introduction】

Venue: Room 9313, Extension Education Building, National Hsinchu University of Education
1st	Date: 2010.05.21 (Fri)	Time: 10:00-11:00	Speaker: Dr. 楊鎮槐, Postdoctoral Researcher, Department of Applied Mathematics, Chung Yuan Christian University
Title: An integrated control chart pattern recognition system with parameter estimation using RBF neural networks
Abstract: Abnormal patterns in control charts indicate that unnatural causes of variation are present in statistical process control (SPC) and need to be eliminated. Hence, control chart pattern recognition is important in SPC. Although pattern recognition techniques have been widely applied to identify abnormal patterns in control charts, a complete pattern recognition system should support both pattern identification and parameter estimation. In this paper, we present radial basis function (RBF) neural networks for parameter estimation of abnormal control chart patterns, and then propose an integrated control chart pattern recognition system that combines a correlation coefficient method for pattern identification with RBF neural networks for parameter estimation. This control chart
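The correlation coefficient idea for pattern identification mentioned in this abstract can be illustrated with a minimal sketch. It assumes that a window of control chart observations is compared against a few reference shapes and assigned to the shape with the highest Pearson correlation. The reference shapes, the function names, and the cycle period of 8 are hypothetical choices made for illustration; they are not taken from the talk.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant sequence has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def reference_patterns(n):
    """Hypothetical reference shapes of length n (assumptions, not from the talk)."""
    return {
        "upward trend": [float(t) for t in range(n)],
        "upward shift": [0.0 if t < n // 2 else 1.0 for t in range(n)],
        "cycle": [math.sin(2 * math.pi * t / 8) for t in range(n)],
    }


def identify_pattern(window):
    """Return the best-matching pattern name and all correlation scores."""
    references = reference_patterns(len(window))
    scores = {name: pearson(window, ref) for name, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


# A noisy linear drift should correlate most strongly with the trend shape.
window = [0.5 * t + 0.2 * (-1) ** t for t in range(32)]
name, _ = identify_pattern(window)
print(name)
```

Because the Pearson coefficient is invariant to scale and offset, this kind of matching says which shape is present but not how large it is, which is why a separate parameter estimation step is needed.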
2nd	Date: 2010.06.08 (Tue)	Time: 14:00-15:00	Speaker: 謝俊男, Ph.D. Candidate, Department of Applied Mathematics, Chung Yuan Christian University
Title: Mountain Method and Mountain C-Regressions
Abstract: Since Quandt [The estimation of the parameters of a linear regression system obeying two separate regimes, Journal of the American Statistical Association 53 (1958) 873–880] initiated the research on 2-regressions analysis, switching regression has been widely studied and applied in psychology, economics, social science and music perception. In fuzzy clustering, the fuzzy c-means (FCM) is the most commonly used algorithm. Hathaway and Bezdek [Switching regression models and fuzzy clustering, IEEE Transactions on Fuzzy Systems 1 (1993) 195–204] embedded FCM into switching regression, calling the result fuzzy c-regressions (FCR). However, FCR depends heavily on initial values. In this paper, we propose a mountain c-regressions (MCR) method for solving the initial-value problem. First, we perform a data transformation on the switching regression data set, and then apply the modified mountain clustering to the transformed data to extract c cluster centers. These c cluster centers in the transformed space correspond to c regression models in the original data set. The proposed MCR method can form well-estimated c regression models for switching regression data sets. Owing to the properties of the transformation, the proposed MCR is also robust to noise and outliers. Several examples show the effectiveness and superiority of our proposed method.
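As a rough illustration of how a mountain-type procedure extracts cluster centers without initial values, the sketch below follows the general mountain-function idea: build a density-like height at each candidate, take the highest peak as a center, then subtract that peak's influence before looking for the next one. It uses the data points themselves as candidates. The function name and the `alpha` and `beta` parameters are assumptions for illustration; this is not the modified mountain clustering or the MCR transformation described in the talk.

```python
import math


def mountain_centers(points, c, alpha=1.0, beta=1.0):
    """Extract c centers by repeatedly picking and flattening the highest peak."""

    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    # Height at each candidate: sum of exponentially decaying contributions
    # from every data point, so dense regions form tall "mountains".
    heights = [
        sum(math.exp(-alpha * dist(v, x)) for x in points) for v in points
    ]
    centers = []
    for _ in range(c):
        k = max(range(len(points)), key=heights.__getitem__)
        peak = heights[k]
        centers.append(points[k])
        # Remove the influence of the chosen peak from all candidates.
        heights = [
            h - peak * math.exp(-beta * dist(v, points[k]))
            for h, v in zip(heights, points)
        ]
    return centers


# Two small, well-separated groups around (0, 0) and (5, 5).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0), (5.0, 4.9),
]
print(sorted(mountain_centers(points, c=2)))
```

No starting centers are supplied anywhere in this procedure, which is the property that makes a mountain-type step attractive as a remedy for initialization sensitivity.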
3rd	Date: 2010.07.15 (Thu)	Time: 10:00-11:00	Speaker: Dr. 賴建佑, Ph.D., Chung Yuan Christian University
Title: A Robust Automatic Merging Possibilistic Clustering Method
Abstract: Krishnapuram and Keller (1993) proposed possibilistic c-means (PCM) clustering by relaxing the constraint in fuzzy c-means (FCM) that the memberships of a data point across classes sum to 1. The PCM algorithm has a tendency to produce coincident clusters. This can be a merit of PCM as a good mode-seeking algorithm if initial values and parameters are suitably chosen. However, the performance of PCM depends heavily on initialization and parameter selection. In this paper we propose a mechanism of robust automatic merging. We then create an automatic merging possibilistic clustering method (AM-PCM) that not only solves these parameter selection and initialization problems, but also obtains an optimal cluster number. The proposed AM-PCM algorithm first uses all data points as initial cluster centers and then automatically merges the surrounding points around each cluster mode, so that it can self-organize data groups according to the original data structure. AM-PCM is robust to parameters, noise, cluster number, different volumes and initializations. The computational complexity of AM-PCM is also analyzed. Comparisons between AM-PCM and other clustering methods are made. Some numerical data and real data sets are used to show these good aspects of AM-PCM. Experimental results and comparisons demonstrate that the proposed AM-PCM is an effective and parameter-free robust clustering algorithm. Joint work with Miin-Shen Yang.
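The effect of relaxing the sum-to-1 constraint can be seen by comparing the two membership formulas for a single data point with the cluster centers held fixed. The sketch below uses the standard FCM membership update and the Krishnapuram and Keller possibilistic membership form; the function names and the choice of scale parameters `etas` are assumptions for illustration, and this is not the AM-PCM algorithm.

```python
def squared_distances(x, centers):
    """Squared Euclidean distance from x to each center."""
    return [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]


def fcm_memberships(x, centers, m=2.0):
    """FCM memberships of x: constrained to sum to 1 across clusters."""
    d2 = squared_distances(x, centers)
    zeros = [d == 0.0 for d in d2]
    if any(zeros):
        # x coincides with one or more centers: share full membership among them.
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    inv = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def pcm_memberships(x, centers, etas, m=2.0):
    """PCM memberships of x: each in (0, 1], with no sum constraint."""
    d2 = squared_distances(x, centers)
    return [
        1.0 / (1.0 + (d / eta) ** (1.0 / (m - 1.0)))
        for d, eta in zip(d2, etas)
    ]


centers = [(0.0, 0.0), (4.0, 0.0)]
outlier = (2.0, 50.0)
print(fcm_memberships(outlier, centers))              # forced to sum to 1
print(pcm_memberships(outlier, centers, [1.0, 1.0]))  # both close to 0
```

Under FCM the distant outlier still receives a membership of 0.5 in each cluster, whereas under PCM its memberships are both near zero, which is the source of PCM's robustness to noise and also of its dependence on the scale parameters.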
4th	Date: 2010.08.05 (Thu)	Time: 10:00-11:00	Speaker: 廖家德, Ph.D. Student, Department of Computer Science, National Tsing Hua University
Title: Novel Robust Kernels for Visual Learning Problems
Abstract: Robustness, the ability of learning algorithms to resist data corruption and irrelevant data variations and consequently make correct decisions, is critical for most visual learning problems. Recent advances include the remarkable development of kernel methods in visual learning applications. By paying special attention to kernel design, we develop different robust kernels for kernel methods and improve robustness in dealing with image-related problems. In the first part, we compute the residual with respect to the estimate using the proposed kernel function. Elements with excessively large residuals are considered outliers and are down-weighted according to the error size. In the second part, we design a kernel function by generalizing the notions of robust error function and tangent distance. This kernel is expected to be insensitive to both irrelevant data transformations and noise disturbances. From a theoretical point of view, these kernels are proved to satisfy Mercer's condition, so they are valid for use in a class of kernel-based learning algorithms to enhance their robustness. From a practical point of view, these kernels are shown to bring significant improvement in the robustness of the incorporated algorithms for many visual learning applications.
5th	Date: 2010.09.17 (Fri)	Time: 10:00-11:00	Speaker: Dr. 黃思皓, Department of Computer Science, National Tsing Hua University
Title: AdaBoost Learning Algorithm and Its Application
Abstract: Object detection has been a popular research topic, covering face detection and recognition, human detection, industrial inspection, and intelligent multimedia analysis and understanding. Fundamental research issues: multimedia representation and classification algorithms. Applications: image retrieval with relevance feedback; face detection in the wavelet compressed domain; contrarian trading strategy via a dual classifier model; vertebra detection and segmentation for spinal MRI.
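For readers unfamiliar with the algorithm named in the title, the following is a minimal sketch of discrete AdaBoost with one-dimensional decision stumps as weak learners. The data, function names, and stump search are assumptions for illustration only; the abstract does not describe the speaker's classifiers or features.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak learner: predicts `polarity` above the threshold, `-polarity` otherwise."""
    return polarity if x > threshold else -polarity


def candidate_thresholds(xs):
    """One threshold below all values plus the midpoints between sorted values."""
    values = sorted(set(xs))
    mids = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return [values[0] - 1.0] + mids


def adaboost_train(xs, ys, rounds):
    """Train discrete AdaBoost; labels in ys must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Pick the stump with the smallest weighted training error.
        best = None
        for thr in candidate_thresholds(xs):
            for pol in (1, -1):
                err = sum(
                    w for w, x, y in zip(weights, xs, ys)
                    if stump_predict(x, thr, pol) != y
                )
                if best is None or err < best[0]:
                    best = (err, thr, pol)
        err, thr, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Increase the weight of misclassified points, then renormalize.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, thr, pol))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, x):
    """Sign of the weighted vote of all stumps."""
    score = sum(a * stump_predict(x, thr, pol) for a, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# No single stump separates this labelling, but three boosted stumps do.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost_train(xs, ys, rounds=3)
print([adaboost_predict(model, x) for x in xs])
```

The example labelling is chosen so that a single threshold cannot classify it, which makes the benefit of combining weak learners visible on a data set small enough to trace by hand.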
6th	Date: 2010.09.21 (Tue)	Time: 14:00-16:00	Speaker: Prof. 盧鴻興, Institute of Statistics, National Chiao Tung University
Title: Introduction to image science
Abstract: This talk will give a general introduction to image science. Specific examples of medical images will be used as illustrative applications.
7th	Date: 2010.09.28 (Tue)	Time: 14:00-16:00	Speaker: 王才沛, Department of Computer Science, National Chiao Tung University
Title: Possibilistic Shell Clustering of Template-Based Shapes
Abstract: In this talk, we present a new type of alternating optimization-based possibilistic c-shell algorithm for clustering template-based shapes. A cluster prototype consists of a copy of the template after translation, scaling, rotation, and/or affine transformations. This extends the capability of shell clustering beyond the few standard geometrical shapes that have appeared in the literature so far. We use a number of 2-D datasets, consisting of both synthetic and real-world images, to illustrate the capability of our algorithm in detecting generic-template-based shapes in images. We also describe a progressive clustering procedure aimed at relaxing the requirements for a known number of clusters and good initialization, as well as new performance measures for shell-clustering algorithms.
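8th	Date: 2010.10.12 (Tue)	Time: 14:10-15:00	Speaker: Prof. 蘇木春, Chair, Department of Computer Science and Information Engineering, National Central University
Title: Applications of Human-Computer Interaction to Digital Learning Companions
Abstract: Is the traditional way of learning in education efficient? Is there a more efficient way to learn? If so, how should it be done? Digital learning changes: (1) the learning context; (2) the learning platform; (3) the teaching materials and how they are compiled. However, if e-books or online teaching offer only online reading, they have the following drawbacks: - They lack anything to motivate students to read. - Students' learning histories cannot be fully recorded while they use e-books. - The learning system cannot easily diagnose students' learning status or provide learning support. - Teachers cannot easily keep track of students' learning status, so they cannot adjust the teaching content in time or interact with students promptly. A possible solution: a digital learning companion (learning robot) with human-computer interaction capabilities.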
9th	Date: 2010.10.19 (Tue)	Time: 15:30-17:00	Speaker: Prof. 李百靈, Department of Statistics, Tamkang University
Title: Correlation-Based Functional Clustering via Subspace Projection
Abstract: A correlation-based functional clustering method is proposed for grouping curves with similar shapes. A correlation between two random functions, defined through the functional inner product, is used as a similarity measure. Curves with similar shapes are embedded in the cluster subspace spanned by a mean shape function and eigenfunctions of the covariance kernel. The cluster membership prediction for each curve attempts to maximize the functional correlation between the observed and predicted curves via shape standardization and subspace projection among all possible clusters. The proposed method accounts for shape differentials through the functional multiplicative random-effects shape function model for each cluster, which regards random scales and intercept shifts as a nuisance. A consistent estimate is proposed for the random scale effect, whose sample variance estimate is also consistent. The derived identifiability conditions for the clustering procedure unravel the predictability of cluster memberships. Simulation studies and a real data example illustrate the proposed method.
10th	Date: 2010.10.22 (Fri)	Time: 10:30-11:30	Speaker: Prof. 李宗錂, Department of Applied Mathematics, National Sun Yat-sen University
Title: Rank revealing, updating, and downdating problems
Abstract: The rank revealing problem arises widely in scientific computing and engineering applications, such as signal processing, information retrieval, numerical polynomial algebra, and so on. Some of those applications give rise to a large matrix whose rank or nullity is known a priori to be small. Although the Golub-Reinsch SVD algorithm can be applied to calculate all singular values and hence determine the numerical rank, it becomes unnecessarily expensive in these situations. Furthermore, it is relatively difficult to update the SVD when the matrix is altered by inserting or removing a few rows or columns. Several alternative methods have been proposed for this purpose. In general, they compute a gap-revealing factorization for estimating the singular values. Recently we proposed two new rank-revealing methods to deal with large matrices with high ranks and low ranks respectively. Rather than computing a gap-revealing factorization, our methods construct an orthonormal basis for the numerical subspace of the matrix directly. The resulting algorithms are quite efficient, and they output only those numerical subspaces that are usually in demand in real applications. For the new methods, the corresponding updating and downdating algorithms become quite straightforward.
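To make the notion of numerical rank and of an orthonormal basis for a numerical subspace concrete, here is a small sketch that uses Gram-Schmidt orthogonalization with column pivoting and an absolute tolerance. This is a textbook substitute chosen for brevity, not the rank-revealing methods proposed in the talk; the function name and the tolerance `tol` are assumptions.

```python
import math


def numerical_range_basis(columns, tol=1e-10):
    """Orthonormal basis for the numerical column space of a matrix.

    `columns` is a list of column vectors. The number of basis vectors
    returned is the numerical rank with respect to the tolerance `tol`.
    """
    residuals = [[float(v) for v in col] for col in columns]
    basis = []
    while residuals:
        norms = [math.sqrt(sum(v * v for v in r)) for r in residuals]
        k = max(range(len(residuals)), key=norms.__getitem__)
        if norms[k] <= tol:
            break  # every remaining column lies (numerically) in the span
        q = [v / norms[k] for v in residuals.pop(k)]
        basis.append(q)
        # Remove the component along q from each remaining column.
        updated = []
        for r in residuals:
            proj = sum(a * b for a, b in zip(r, q))
            updated.append([a - proj * b for a, b in zip(r, q)])
        residuals = updated
    return basis


# The third column is the sum of the first two, so the rank is 2.
columns = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]
basis = numerical_range_basis(columns)
print(len(basis))
```

Pivoting on the largest remaining residual norm is what lets the loop stop early once the leftover columns fall below the tolerance, so only the subspace of interest is ever built.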
11th	Date: 2010.11.09 (Tue)	Time: 14:00-15:30	Speaker: Prof. 楊敏生, Department of Applied Mathematics, Chung Yuan Christian University
Title: Maximum likelihood clustering via a multimodal probability model
Abstract: In this talk, we consider a newly proposed multimodal probability model for clustering. Based on the probability model, we use maximum likelihood (ML) estimation to establish an ML clustering approach. We show that partitional clustering can be equivalent to ML clustering under the multimodal probability model. This ML clustering approach can lead to most partitional clustering algorithms, such as k-means, fuzzy c-means (FCM), possibilistic c-means (PCM), mean shift, classification ML and latent class methods. Furthermore, we construct two ML clustering frameworks based on the multimodal probability model for producing new clustering algorithms. One framework induces penalized-type clustering algorithms. The other induces entropy-type clustering algorithms. Several numerical and real data sets are used for comparisons. These experimental results show that our new constructions based on the proposed ML clustering can produce useful and effective clustering algorithms.
12th	Date: 2010.11.26 (Fri)	Time: 10:30-12:00	Speaker: Prof. 吳國龍, Department of Information Management, Kun Shan University
Title: Robust cluster validity indexes
Abstract: Cluster validity indexes can be used to evaluate the fitness of data partitions produced by a clustering algorithm and are usually independent of clustering algorithms. However, the values of validity indexes may be heavily influenced by noise and outliers. Noise and outliers may not influence the results of clustering algorithms, but they may still affect the values of validity indexes. In the literature, there is little discussion about the robustness of cluster validity indexes. In this study, we analyze the robustness of a validity index using the function of M-estimate and then propose several robust-type validity indexes. We discuss the validity measure on a single data point and focus on those validity indexes that can be categorized as mean-type validity indexes. We then propose median-type validity indexes that are robust to noise and outliers. Comparative examples with numerical and real data sets show that the proposed median-type validity indexes work better than the mean-type validity indexes.
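The contrast between mean-type and median-type aggregation can be illustrated with a toy example. The sketch assumes a per-point validity measure equal to the squared distance from a point to its nearest cluster center, then aggregates it over all points by either the mean or the median. The measure and the function names are hypothetical and are not the validity indexes proposed in the talk.

```python
import statistics


def nearest_center_sq_dist(x, centers):
    """Per-point measure: squared distance from x to its nearest center."""
    return min(sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers)


def compactness(points, centers, aggregate):
    """Aggregate the per-point measure with the given function (mean or median)."""
    return aggregate([nearest_center_sq_dist(p, centers) for p in points])


centers = [(0.0, 0.0), (10.0, 10.0)]
clean = [
    (0.0, 1.0), (1.0, 0.0), (0.0, -1.0),
    (10.0, 11.0), (11.0, 10.0), (10.0, 9.0),
]
noisy = clean + [(100.0, 100.0)]  # one far-away outlier

print(compactness(clean, centers, statistics.mean),
      compactness(noisy, centers, statistics.mean))
print(compactness(clean, centers, statistics.median),
      compactness(noisy, centers, statistics.median))
```

With the centers unchanged, the single outlier inflates the mean-type value by several orders of magnitude while the median-type value stays the same, which matches the abstract's point that outliers can distort an index even when they do not change the clustering result.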
13th	Date: 2010.12.21 (Tue)	Time: 14:00-17:00	Speaker: Prof. 柳清地, Department of Biostatistics, Boston University School of Public Health, USA
Title: A Forest-Based Approach in Association Study
Abstract: Multiple genes, gene-by-gene interactions, and gene-by-environment interactions are believed to underlie most complex diseases. However, such interactions are difficult to identify. While there have been recent successes in identifying genetic variants for complex diseases, it remains difficult to identify gene-gene and gene-environment interactions. To overcome this difficulty, we propose a forest-based approach and a concept of variable importance. The validity of the proposed approach is demonstrated by a simulation study, and its use is illustrated by a real data analysis. Analyses of both real data and simulated data based on published genetic models show the effectiveness of our approach.

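This seminar aims to bring together scholars from mathematics, statistics, computer science, and related fields, with a focus on research problems in model building, data analysis, statistical methods, and computational methods for cluster analysis and machine learning. Its applications will cover fields such as e-commerce. The seminar also provides a cross-disciplinary exchange platform for cluster analysis and machine learning. Through academic talks (once every two to three weeks, and once each in July and August), it promotes interaction among participants and aims to achieve substantive cross-disciplinary academic collaboration.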

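Seminar participants:
洪文良 (Department of Applied Mathematics / Graduate Institute of Computer Science, National Hsinchu University of Education)
張延彰 (Department of Applied Mathematics, National Hsinchu University of Education)
李金龍 (Department of Applied Mathematics, National Hsinchu University of Education)
楊敏生 (Department of Applied Mathematics, Chung Yuan Christian University)
賴尚宏 (Department of Computer Science, National Tsing Hua University)
王才沛 (Department of Computer Science, National Chiao Tung University)
吳國龍 (Department of Information Management, Kun Shan University)
黃思皓 (Postdoctoral Researcher, Department of Computer Science, National Tsing Hua University)
廖家德 (Ph.D. Student, Department of Computer Science, National Tsing Hua University)
楊鎮槐 (Postdoctoral Researcher, Department of Applied Mathematics, Chung Yuan Christian University)
謝俊男 (Ph.D. Program, Department of Applied Mathematics, Chung Yuan Christian University)
賴建佑 (Ph.D. Program, Department of Applied Mathematics, Chung Yuan Christian University)
張簡守仁 (Ph.D. Program, Department of Applied Mathematics, Chung Yuan Christian University)
林志穎 (Ph.D. Program, Department of Applied Mathematics, Chung Yuan Christian University)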

Master's Program, Department of Applied Mathematics, National Hsinchu University of Education

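Organizer: Department of Applied Mathematics, National Hsinchu University of Education
Sponsor: Mathematics Research Promotion Center, National Science Council
*Contact: 鄭雅卉
*Phone: 03-5213132#5901
*E-Mail: w97006@mail.nhcue.edu.tw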

<<Everyone is welcome to visit our department and take part in the seminar activities!>>

Department of Applied Mathematics, National Tsing Hua University