Although Support Vector Machines (SVMs) have been used to build highly accurate classification and regression models in many real-world problem domains, the most significant barrier to their use is that an SVM produces a black-box model that is difficult to understand. The procedure for converting such opaque models into transparent ones is called rule extraction. This thesis investigates the task of extracting comprehensible models from trained SVMs, thereby alleviating this limitation. Its primary contribution is a set of algorithms that address this limitation by taking a novel approach to extracting comprehensible models. Its basic contributions are a systematic review of the literature on rule extraction from SVMs, the identification of gaps in that literature, and novel approaches for addressing those gaps.
The contributions are grouped into three classes: decompositional, pedagogical, and eclectic/hybrid approaches. The decompositional approach is closely intertwined with the internal workings of the SVM. The pedagogical approach uses the SVM as an oracle to re-label training examples as well as artificially generated examples. The eclectic/hybrid approach combines these two methods.
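The pedagogical idea described above can be illustrated with a minimal sketch. It is not the thesis's algorithm: the "SVM" here is a hypothetical stand-in (an RBF-kernel decision function with hand-picked support vectors, `GAMMA`, and `BIAS`), and the rule learner is a single-threshold decision stump chosen only for brevity. All names are assumptions made for illustration.

```python
import math
import random

# Hypothetical stand-in for a trained SVM: an RBF-kernel decision function
# with hand-picked (support vector, alpha_i * y_i) pairs. Nothing is learned here.
SUPPORT_VECTORS = [((1.0, 2.0), 1.0), ((3.0, 2.0), -1.0)]
GAMMA = 0.5
BIAS = 0.0


def svm_oracle(x):
    """Return the black-box label (+1 or -1) for a 2-D point x."""
    score = BIAS
    for sv, coef in SUPPORT_VECTORS:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, sv))
        score += coef * math.exp(-GAMMA * sq_dist)
    return 1 if score >= 0 else -1


def fit_stump(points, labels):
    """Find the single-feature threshold rule that best matches the labels."""
    best = None  # (agreement, feature, threshold, label_if_below)
    for feature in range(len(points[0])):
        values = sorted({p[feature] for p in points})
        # Candidate thresholds are midpoints between neighbouring values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            for below in (1, -1):
                agree = sum(
                    (below if p[feature] <= threshold else -below) == y
                    for p, y in zip(points, labels)
                )
                if best is None or agree > best[0]:
                    best = (agree, feature, threshold, below)
    return best[1:]


def apply_rule(rule, x):
    """Apply an extracted (feature, threshold, label_if_below) rule to x."""
    feature, threshold, below = rule
    return below if x[feature] <= threshold else -below


rng = random.Random(0)
training = [(rng.uniform(0, 4), rng.uniform(0, 4)) for _ in range(100)]
synthetic = [(rng.uniform(0, 4), rng.uniform(0, 4)) for _ in range(100)]

# Pedagogical step: ignore the original targets and let the oracle label
# both the training examples and the artificially generated ones.
relabeled = training + synthetic
oracle_labels = [svm_oracle(x) for x in relabeled]
rule = fit_stump(relabeled, oracle_labels)

# Fidelity: how often the extracted rule agrees with the oracle on new points.
held_out = [(rng.uniform(0, 4), rng.uniform(0, 4)) for _ in range(200)]
fidelity = sum(apply_rule(rule, x) == svm_oracle(x) for x in held_out) / len(held_out)

feature, threshold, below = rule
print(f"IF x{feature} <= {threshold:.2f} THEN class {below:+d} ELSE class {-below:+d}")
print(f"fidelity = {fidelity:.2f}")
```

The rule learner never inspects the support vectors or kernel; it only sees the oracle's answers. A decompositional method would instead read those internals directly.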
The thesis addresses several problems from the finance domain, such as bankruptcy prediction in banks and firms, churn prediction in analytical CRM, and insurance fraud detection. In addition, the proposed approaches are tested on benchmark datasets: Iris, Wine, and WBC for classification problems, and Auto MPG, Body Fat, Boston Housing, Forest Fires, and Pollution for regression problems. Rule extraction from unbalanced datasets and from active-learning-based approaches is also explored. For classification problems, rule extraction methods such as FRBS, DT, ANFIS, CART, and NBTree are used; for regression problems, ANFIS, DENFIS, and CART are employed. Results are analyzed using accuracy, sensitivity, specificity, fidelity, AUC, and the t-test. The proposed approaches prove viable for extracting accurate, effective, and comprehensible rule sets in benchmark and real-world problem domains across classification and regression problems. Future directions are indicated for extending the approaches to newer variations of SVM as well as to other problem domains.
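As a rough illustration of four of the measures named above, the following sketch computes accuracy, sensitivity, and specificity of the extracted rules against the true labels, and fidelity as the share of examples on which the rules agree with the SVM. The function names and the toy label vectors are assumptions made for illustration, not data from the thesis; AUC and the t-test are left out.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(y_true, svm_pred, rule_pred, positive=1):
    """Score the extracted rules against the truth and against the SVM."""
    tp, fp, tn, fn = confusion_counts(y_true, rule_pred, positive)
    n = len(y_true)
    return {
        "accuracy": (tp + tn) / n,
        # Sensitivity: share of actual positives the rules catch.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Specificity: share of actual negatives the rules catch.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        # Fidelity: share of examples where rules and SVM give the same label.
        "fidelity": sum(r == s for r, s in zip(rule_pred, svm_pred)) / n,
    }


# Toy labels (hypothetical): 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
svm_pred = [1, 1, 0, 0, 0, 1, 0, 1]
rule_pred = [1, 1, 0, 0, 0, 0, 0, 1]

scores = evaluate(y_true, svm_pred, rule_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Note that fidelity is measured against the SVM's predictions rather than the true labels, so a rule set can have high fidelity and still inherit the SVM's mistakes.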
Table of Contents
- Introduction
- Support Vector Machine (SVM)
  - Linear SVM
  - Non-linear SVM
  - Kernel Functions
  - Soft Margin SVM
- Rule Extraction from SVM
  - Rule Extraction Methods
    - Decomposition Methods
    - Association Rule Mining Methods
    - Hyperplane-Based Methods
    - Approximation Methods
    - Other Methods
  - Comparison of Rule Extraction Methods
- Applications of Rule Extraction from SVM
  - Banking and Finance
    - Credit Scoring
    - Fraud Detection
    - Market Segmentation
  - Other Applications
- Conclusion
Objectives and Key Themes
This thesis aims to explore the application of rule extraction techniques from Support Vector Machine (SVM) models in the domains of banking and finance. The work investigates the potential of utilizing SVM as a classification technique, extracting interpretable rules from the trained models, and analyzing their applicability to real-world financial problems. The key themes explored in this research include:
- Support Vector Machine (SVM) models and their role in classification tasks.
- Rule extraction techniques for generating human-understandable insights from SVM models.
- Applications of rule extraction from SVM models in the areas of credit scoring, fraud detection, and market segmentation.
- Evaluation of the effectiveness of rule extraction techniques in generating accurate and insightful rules.
- The potential of SVM-based rule extraction for improving decision-making in banking and financial applications.
Chapter Summaries
The introductory chapter provides an overview of the research topic and its significance. It introduces the concept of Support Vector Machine (SVM) as a powerful classification technique and highlights the increasing need for interpretable models in various applications, especially in banking and finance.
Chapter 2 delves into the fundamentals of Support Vector Machines (SVM). It covers the key concepts, including linear SVM, non-linear SVM, kernel functions, and soft margin SVM, providing a comprehensive understanding of this classification technique.
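For orientation, the concepts listed for Chapter 2 can be summarized in the standard textbook formulation below. The notation ($w$, $b$, $\xi_i$, $C$, $\phi$, $K$, $\alpha_i$, $\gamma$) is the conventional one and is an assumption here; the thesis may use different symbols.

```latex
% Soft-margin SVM (primal form), for training pairs (x_i, y_i) with y_i \in \{-1, +1\}
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i\bigl(w^\top \phi(x_i) + b\bigr) \ge 1 - \xi_i,\ \ \xi_i \ge 0

% Kernelized decision function (dual form)
f(x) = \operatorname{sign}\Bigl(\sum_{i=1}^{n} \alpha_i\, y_i\, K(x_i, x) + b\Bigr),
\qquad K(x, x') = \phi(x)^\top \phi(x')

% Example kernel: Gaussian RBF
K(x, x') = \exp\bigl(-\gamma \lVert x - x' \rVert^2\bigr)
```

The linear SVM is the special case where $\phi$ is the identity map, and the hard-margin SVM is recovered by forcing every slack variable $\xi_i$ to zero. Only the training points with $\alpha_i > 0$ (the support vectors) contribute to $f(x)$, which is why the model is compact but hard to read as a set of rules.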
Chapter 3 explores various methods for extracting rules from SVM models. It discusses different approaches, including decomposition methods, association rule mining methods, hyperplane-based methods, approximation methods, and others. This chapter aims to provide a comprehensive review of existing rule extraction methods, highlighting their advantages, limitations, and suitability for different applications.
Chapter 4 focuses on the application of rule extraction from SVM models in the banking and finance domain. It examines specific cases, such as credit scoring, fraud detection, and market segmentation, demonstrating the potential of this approach for solving real-world financial problems.
Keywords
The primary keywords and focus topics of this thesis encompass Support Vector Machine (SVM), rule extraction, classification, banking, finance, credit scoring, fraud detection, market segmentation, interpretability, and decision-making. The research explores the application of SVM for classification and the development of rule extraction techniques to enhance interpretability and decision-making in financial contexts.
Cite this paper:
- Mohammed Farquad (Author), 2010, Rule Extraction from Support Vector Machine, Munich, GRIN Verlag, https://www.hausarbeiten.de/document/193686