This project has three parts, all implemented in Python using JupyterLab. The first part programs K-nearest neighbors (KNN) and Naive Bayes (NB) from scratch, then compares the results of the two algorithms to determine which performs better.
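As a rough illustration of what a from-scratch KNN classifier can look like, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names (`euclidean`, `knn_predict`), the choice of Euclidean distance, the default `k=3`, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Sort training indices by distance to the query point.
    order = sorted(range(len(train_X)), key=lambda i: euclidean(train_X[i], query))
    # Vote among the labels of the k closest points.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy data (made up for illustration): two well-separated 2-D clusters.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_X, train_y, (1.1, 1.0), k=3))  # prints "a"
print(knn_predict(train_X, train_y, (5.1, 5.0), k=3))  # prints "b"
```

Sorting all distances keeps the sketch short; for larger datasets a partial selection or a spatial index would likely be preferable.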
In the second part, we use the sklearn library to compare the two algorithms on a larger dataset in four settings: a reduced training set, a large training set, the absence of a teacher, and an unknown distribution.
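The training-set-size settings can be pictured with a small experiment that measures test error for growing training subsets. The paper uses sklearn for this part; the sketch below is a dependency-free stand-in that covers only a 1-nearest-neighbor rule. The synthetic Gaussian data, the subset sizes, and all function names are assumptions made for illustration.

```python
import random


def nearest_neighbor(train, point):
    """1-NN rule: label of the closest training example (squared distance)."""
    closest = min(train, key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], point)))
    return closest[1]


def error_rate(train, test):
    """Fraction of test examples that the 1-NN rule misclassifies."""
    wrong = sum(1 for x, y in test if nearest_neighbor(train, x) != y)
    return wrong / len(test)


def make_data(n, rng):
    """Synthetic two-class data: 2-D Gaussians centred at (0, 0) and (2, 2)."""
    data = []
    for i in range(n):
        label = i % 2                 # alternate labels so classes stay balanced
        centre = 2.0 * label
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    return data


rng = random.Random(0)                # fixed seed so runs are repeatable
pool = make_data(400, rng)            # pool from which training subsets are drawn
test = make_data(200, rng)            # held-out test set, shared by every size

for size in (10, 50, 400):
    subset = pool[:size]              # every even-length prefix is class-balanced
    print(size, round(error_rate(subset, test), 3))
```

Holding the test set fixed while only the training subset grows means any change in the printed error can be attributed to the amount of training data.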
In the third part, we compare the same algorithms on image classification with 10 classes, using feature descriptors.
Table of Contents
- Introduction
- 1. Programming of a discrimination method: KNNs (K-nearest neighbors) and Naïve Bayes (NB)
- 1.1 Data
- 1.2 Implementation of KNNs Algorithm
- 1.3 Implementation of NBs Algorithm
- 1.4 Experiment and Results
- 2. Comparison of the two methods (parametric and nonparametric)
- 2.1 Influence of the size of the training set: the case of reduced size
- 2.2 Influence of the size of the training set: the case of large size
- 2.3 In the case of the absence of a teacher
- 2.4 Distribution unknown
- 3. Approach based on Descriptors
- 3.1 Calculation of descriptors
- 3.2 Implementation of a classification system
- Conclusion
- References
Objectives and Key Themes
This project explores the implementation and comparison of KNNs and Naïve Bayes algorithms for classification tasks, using Python and JupyterLab. The project aims to understand the performance of these algorithms in different scenarios, including varying training set sizes, the absence of a teacher, and unknown data distributions. It also investigates the use of feature descriptors for image classification.
- Implementation and comparison of KNNs and Naïve Bayes algorithms
- Impact of training set size on algorithm performance
- Exploring algorithm behavior in the absence of a teacher and with unknown data distributions
- Utilizing feature descriptors for image classification
- Assessing the effectiveness of different algorithms in various scenarios
Chapter Summaries
The project begins by introducing the KNN and NB algorithms and outlining the objectives and structure of the work. The first chapter programs both algorithms from scratch in Python. It describes the data used, including its distribution in the vector space, and explains the implementation of each algorithm. The chapter concludes with an analysis of the experimental results and a discussion of the error rate function for KNN.
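For the Naive Bayes side, a from-scratch classifier might look like the sketch below. It assumes a Gaussian likelihood per feature, which this summary does not confirm as the paper's choice. The function names, the small variance-smoothing constant, and the toy data are likewise assumptions. The `error_rate` helper is applied to NB here only to keep the example self-contained; the chapter's own error-rate discussion concerns KNN.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate per-class prior, feature means and feature variances."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small constant keeps each variance strictly positive (assumed smoothing).
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, features):
    """Return the class with the highest log-posterior, treating features as independent."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for x, m, var in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def error_rate(model, X, y):
    """Fraction of examples whose predicted label differs from the true label."""
    wrong = sum(1 for features, label in zip(X, y)
                if predict_gaussian_nb(model, features) != label)
    return wrong / len(y)


# Toy data (made up for illustration): two well-separated 2-D clusters.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
y = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, (1.1, 0.9)))  # prints "a"
print(error_rate(model, X, y))                 # prints 0.0
```

Working with log-probabilities avoids numerical underflow when many per-feature likelihoods are multiplied together.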
The second chapter delves into a comparative analysis of the two algorithms. It examines the influence of both reduced and large training sets on their performance, explores their behavior in the absence of a teacher, and investigates their effectiveness when dealing with unknown data distributions.
The third chapter shifts to the application of feature descriptors for image classification. It outlines the process of calculating descriptors and implementing a classification system using the two algorithms, but does not delve into the specifics of the image data or classification results.
Keywords
This project focuses on the use and comparison of the KNN and Naïve Bayes algorithms for classification tasks. Key terms include: data distribution, training set size, error rate function, feature descriptors, and image classification. The project investigates the performance of these algorithms in various scenarios, including the absence of a teacher and unknown data distributions. It also explores the effectiveness of feature descriptors for image classification.
- Cite this paper
- Marwan Al Omari (Author), 2021, Understanding of Algorithms. KNNs and Naive Bayes, Munich, GRIN Verlag, https://www.hausarbeiten.de/document/1215098