Dissimilarity Learning for Nominal Data

Victor Cheng, Chun-Hung Li, James T. Kwok, Chi-Kwong Li

Abstract: Defining a good distance (dissimilarity) measure between patterns is crucial in many classification and clustering algorithms. While much work has been done on continuous attributes, nominal attributes are more difficult to handle. A popular approach is to use the value difference metric (VDM) to define a real-valued distance measure on nominal values. However, VDM treats each attribute separately and ignores possible interactions among attributes. In this paper, we propose using adaptive dissimilarity matrices to measure the dissimilarities between nominal values. These matrices are learned by optimizing an error function on the training samples. Experimental results show that this approach leads to better classification performance. Moreover, it allows easier interpretation of the (dis)similarity between different nominal values.
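
For readers unfamiliar with the VDM baseline mentioned in the abstract, the following is a minimal sketch of its basic, unweighted form for a single nominal attribute: the distance between two values is the sum over classes of the absolute difference between their class-conditional probabilities, raised to a power q. The function name vdm_distance_fn, the default q=1, and the toy data are illustrative assumptions, not taken from the paper, and this is not the adaptive dissimilarity matrix method the paper proposes.

```python
from collections import Counter


def vdm_distance_fn(values, labels, q=1):
    """Build a VDM distance function for ONE nominal attribute.

    values: observed values of the attribute, one per training sample
    labels: class labels, aligned with `values`
    q: exponent on the probability differences (assumed default: 1)
    """
    classes = sorted(set(labels))
    value_counts = Counter(values)        # N(v): samples with value v
    joint = Counter(zip(values, labels))  # N(v, c): samples with value v and class c

    def cond_prob(v, c):
        # Empirical estimate of P(class = c | attribute = v)
        return joint[(v, c)] / value_counts[v]

    def distance(a, b):
        # Sum over classes of |P(c | a) - P(c | b)|^q
        return sum(abs(cond_prob(a, c) - cond_prob(b, c)) ** q for c in classes)

    return distance


# Toy data (hypothetical): one colour attribute and a binary class label.
colours = ["red", "red", "blue", "blue", "green", "green"]
classes = ["+", "+", "+", "-", "-", "-"]

d = vdm_distance_fn(colours, classes)
print(d("red", "blue"), d("red", "green"))
```

Because the probabilities are estimated from one attribute at a time, two values are close exactly when they co-occur with the classes in similar proportions; nothing in the computation looks at the other attributes, which is the limitation the abstract points out.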

Pattern Recognition, 37(7): 1471-1477, July 2004.

PDF: http://www.cs.ust.hk/~jamesk/papers/pr04.pdf

