A knowledge-light approach to regression using case-based reasoning
Citation:
Neil McDonnell, 'A knowledge-light approach to regression using case-based reasoning' [thesis], Trinity College (Dublin, Ireland), School of Computer Science & Statistics, 2007, pp. 165.

Abstract:
Case-based reasoning (CBR) is among the most influential paradigms in modern machine
learning. It advocates a strategy of storing specific experiences in the form of cases, and
solving new problems by re-using solutions from similar past cases. The most difficult aspect
of CBR is deciding how to adapt past solutions to precisely match the circumstances of new
problems. No generally applicable method of doing this has been found; different domains and
tasks have their own individual characteristics, and successful adaptation has usually relied on
the presence of explicit, hand-coded domain knowledge. Such knowledge is usually difficult
both to acquire and maintain. For this reason, most CBR systems in operation today are
‘retrieval only’ in that they do not attempt to adapt the solutions of past cases to solve new
problems.
For certain machine learning tasks, however, customisation of old solutions can be
performed using only knowledge contained within the set of stored cases. One such task is
regression (i.e. predicting the value of a numeric variable). Regression is among the oldest
machine learning tasks, dating back to Francis Galton’s work on predicting the heights of
parents and their children in nineteenth century England. A modern example would be to
predict tomorrow’s stock market prices based on today’s financial data. Many different
approaches to solving regression problems have been developed over the years, for example
k-NN, locally weighted linear regression and artificial neural networks.
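As a point of reference for the methods named above, plain k-NN regression can be sketched in a few lines. This is an illustrative sketch only; the function name and toy data are my own, not taken from the thesis:

```python
# Minimal k-NN regression sketch: predict a numeric target as the
# mean of the targets of the k stored cases nearest to the query.
import math

def knn_predict(cases, query, k=3):
    """cases: list of (feature_vector, target) pairs; query: feature vector."""
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = sorted(cases, key=lambda c: dist(c[0], query))[:k]
    return sum(target for _, target in nearest) / k

cases = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]
print(knn_predict(cases, (2.5,), k=3))  # mean of the 3 nearest targets: 20.0
```

Note that the prediction is a simple average of retrieved solutions; no adaptation of past solutions takes place, which is the gap the thesis addresses.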
The aim of this thesis is to apply CBR to the problem of regression. It begins by analysing
previous attempts to do this, paying particular attention to those aspects that might be
improved. One CBR-based approach from the mid-1990s is examined in considerable detail.
It works by finding the differences between a new problem and a similar past problem, then
searching for a pair of stored cases with the same differences between them. These stored
cases indicate the effect of the differences on the solution. This ‘case differences’ approach has
much to recommend it. In particular, the knowledge needed to solve new problems is
automatically generated from stored cases—no additional external knowledge must be added.
Unfortunately, it also suffers from some theoretical limitations that greatly restrict its use.
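The retrieve-match-apply idea described above can be sketched for a one-dimensional numeric domain. This is a hedged illustration of the general 'case differences' scheme, with names and data of my own invention, not the thesis's algorithm:

```python
# Illustrative 'case differences' sketch for scalar problems:
# 1. retrieve the stored case nearest the query;
# 2. find a pair of stored cases whose problem difference matches
#    the gap between the query and the retrieved case;
# 3. apply that pair's solution difference to the retrieved solution.

def case_diff_predict(cases, query):
    """cases: list of (problem, solution) pairs with scalar problems."""
    base = min(cases, key=lambda c: abs(c[0] - query))
    gap = query - base[0]
    pairs = [(a, b) for a in cases for b in cases if a is not b]
    a, b = min(pairs, key=lambda p: abs((p[1][0] - p[0][0]) - gap))
    return base[1] + (b[1] - a[1])

cases = [(1.0, 10.0), (2.0, 20.0), (4.0, 40.0)]
print(case_diff_predict(cases, 3.0))  # 20.0 + (20.0 - 10.0) = 30.0
```

The adaptation knowledge (the effect of a problem difference on the solution) comes entirely from the case base itself, which is the property the abstract highlights; the theoretical limitations arise when no stored pair exhibits the required difference.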
This thesis presents two new CBR-based regression algorithms that build on the strengths of
previous approaches while addressing their limitations. One is a minor variant of the
traditional k-NN algorithm, while the other uses the case differences approach and is more
sophisticated. The main contribution of the second algorithm is that it uses locally weighted
linear regression as a guide to help choose past cases that are likely to be useful for solving new
problems. It also takes steps to increase robustness when basing predictions on noisy datasets.
An experimental evaluation of the new techniques shows that they perform well relative to
standard regression algorithms on a range of datasets.
Author: McDonnell, Neil
Advisor: Cunningham, Pádraig
Qualification name: Doctor of Philosophy (Ph.D.)
Publisher: Trinity College (Dublin, Ireland). School of Computer Science & Statistics
Type of material: thesis
Availability: Full text available