This research work seeks to explore and provide an improved approach for software inspection and fault detection. It systematically investigates and characterizes software faults and failures to improve fault detection and prevention mechanisms in the quality software development process. Firstly, it contributes an Adaptive PSO-based association rule mining technique for software fault classification using ANN. This work finds the best support and confidence values to obtain the best rules for software fault classification, and the ANN is used to classify the actual defects. Secondly, it provides a Fault Prediction Approach (FPA) based on probabilistic models to improve software inspection. This provides a cost-effective way to detect defects accurately through software inspection in order to develop quality software. The proposed FPA investigates probabilistic methods using a modified Naive Bayes classification to estimate the probable faults in an experimental context and to suggest fault-controlling development.
Software reliability engineering has become very important as the complexity of systems has increased exponentially with technological advances. Because all systems today depend on many other systems and interfaces, failures arise not only from application errors but also from a number of environmental factors. The impact of these failures depends on the nature of the system, but many of them cause customer dissatisfaction and business loss. System testing and fault detection have become the most important processes in the software life cycle. Various failure prediction models can be analyzed and suggested so that failures can be detected at an early stage and much test effort can be saved. Software development introduces many defects in the design phase. In the past, many practices for reducing defects in software development have been presented. These practices have shown varying effects on defect reduction.
The work also presents a methodology for defect detection and better maintenance strategies that can help predict whether software modules are defective or non-defective before they are deployed in any software project.
Table of contents
CHAPTER-1
INTRODUCTION
1.1 Overview
1.2 Research Motivation
1.3 Problem Definition
1.4 Research Objective
1.5 Research Contribution
1.6 Dissertation Organisation
CHAPTER-2
BACKGROUND STUDY
2.1 Overview
2.2 Software Fault Detection Mechanism
2.2.1 Detection Using Automated Static Analysis
2.2.2 Detection Using Graph mining
2.2.3 Detection Using Classifiers
2.2.4 Detection Using Pattern Mining
2.3 Software Fault Prevention Mechanism
2.3.1 Importance of Fault Prevention
2.3.2 Activities in Fault Prevention
2.4 Faults Prevention Benefits And Limitations
2.5 Approach to Improve Defect Prediction Model
2.5.1 Software Defect Prediction
2.5.2 Major Experimental Components In Defect Prediction Modeling
2.6 Related Works
CHAPTER-3
Adaptive PSO Based Association Rule Mining Technique for Software Defect Classification using ANN
3.1 Overview
3.2 Problem for Software Defect Classification
3.3 Proposed Methodology
3.3.1 Frequent Item Set Mining
3.3.2 Adaptive Particle Swarm Optimization (APSO)
3.3.3 Classification Phase using Artificial Neural Network
3.4 Experiment Evaluation
3.4.1 Setup
REFERENCES
CHAPTER-1
INTRODUCTION
This chapter presents an overview of the thesis. It gives the reader an insight into the current situation regarding fault occurrences in software development. It presents the Research Motivation, Problem Definition, Research Objective and Research Contribution. Finally, it presents the organisation of the thesis.
1.1 Overview
All software contains faults and undoubtedly some of these faults will result in failures. The consequences of such failures are sometimes unacceptable. Better understanding and quantification of the relationships between software faults and failures are essential to more efficient detection and elimination of faults and prevention of failures, and thus, to improvement of software quality.
A failure is defined as a departure of the system or system component behavior from its required behavior. On the other hand, a fault is an accidental condition, which if encountered, may cause the system or system component to fail to perform as required. Thus, faults represent problems that developers see, while failures represent problems that the users (human or computer) see. Not every fault corresponds to a failure since the conditions under which fault(s) result in a failure may never be met. It should be emphasized that faults can be introduced at any phase of the software life cycle, that is, they can be tied to any software artifacts such as requirements, design, and source code.
Inspection is an effective defect identification process in software development, but it is also an expensive quality assurance activity. A number of variations of the inspection process have been suggested and tried [1]. All inspection processes include steps for individual defect detection and for defect collection, which may include a team meeting. A basic question is whether defect detection during inspection is an individual activity, where inspectors find most of the defects before or instead of a meeting, or rather a group activity, where most defects are detected during discussion in a meeting. According to P. Johnson [2] and L. Votta [3], meetings often have little detection effect, but may lengthen the duration of an inspection considerably. In all cases, intensive individual defect detection work is an important prerequisite for fruitful elicitation of defects.
Software inspections are being applied with great success to detect defects in different kinds of software documents such as specifications, designs, test plans, or source code. In an inspection, several reviewers independently inspect the same document. Afterward, all defects detected by the inspection team are collected. Some defects will be detected by more than one reviewer; hence, the outcome of an inspection is a zero-one matrix showing which reviewer detected which defect.
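The zero-one matrix described above can be illustrated with a minimal Python sketch. The function name `detection_matrix`, the reviewer labels and the defect identifiers are hypothetical and are used only for illustration; they are not taken from any inspection tool.

```python
from typing import Dict, List, Set, Tuple


def detection_matrix(
    findings: Dict[str, Set[str]]
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Build the reviewer-by-defect zero-one matrix of an inspection.

    `findings` maps each reviewer to the set of defects that reviewer
    reported. Entry [i][j] is 1 if reviewer i detected defect j, else 0.
    """
    reviewers = sorted(findings)
    # Collect every defect reported by at least one reviewer.
    defects = sorted(set().union(*findings.values()))
    matrix = [
        [1 if defect in findings[reviewer] else 0 for defect in defects]
        for reviewer in reviewers
    ]
    return reviewers, defects, matrix


# Hypothetical inspection outcome with three reviewers.
findings = {"R1": {"D1", "D2"}, "R2": {"D2", "D3"}, "R3": {"D2"}}
reviewers, defects, matrix = detection_matrix(findings)

# Column sums show how many reviewers detected each defect.
detections_per_defect = [sum(column) for column in zip(*matrix)]
```

In this example defect D2 is found by all three reviewers, while D1 and D3 are each found by only one, which is the kind of overlap information that defect content estimators rely on.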
Defects in software usually reduce software quality because they cause failures [4], [5]. A software failure occurs when a service delivered by the software deviates from its intended functionality. An error is the discrepancy between the delivered and the intended functionality, and the adjudged or hypothesized cause of an error is a fault, which is commonly known among developers as a defect or a bug.
Usually, not all the defects contained in a document are detected during an inspection. After the inspection, management must decide whether to reinspect the document to find more defects or pass the document on to the next development step. A common way to reach a decision is to prescribe a certain level of defect-freeness of the documents. For example, management could demand that a document be 95 percent defect-free before it is released. In reality, it is unknown how many defects actually are contained in a document. Therefore, it is an important problem in software engineering practice to reliably estimate the number of defects in a document from the outcome of an inspection.
This research focuses on reducing faults during software development by proposing an improved inspection mechanism and a novel defect detection process for achieving high-quality development.
1.2 Research Motivation
Software engineering denotes a systematic, disciplined, and quantifiable approach to software development, which involves a large part of human effort. In human based activities, defects, which are product anomalies that deviate from required quality properties, are inevitable and need to be systematically detected, tracked, and resolved.
No single fault-detection technique is capable of addressing all fault-detection concerns [6]. An important focus of research has been on the effectiveness and efficiency of the defect detection process [7], [8]. Team size, document quality and size, and the defect detection process that the inspectors in the team use are expected to have an important impact on the number of defects the team finds during inspection. Reading techniques used in the individual defect detection step of an inspection have been shown to considerably impact individual defect detection effectiveness and efficiency. In practice, having more inspectors on a team typically increases the inspection interval, mostly due to people's schedules [9].
Although some evidence of nonlocalized faults has been observed by others, this phenomenon has not been specifically investigated and quantified. To the best of our knowledge, very few exceptions have been studied in past years, among them [6], which reported that most fixes are confined to single, small, well-encapsulated modules. On the other hand, in [10], the author commented that many major software failures have been traced to unanticipated combinations of otherwise minor problems, but did not provide specific data to support this claim. Additionally, thorough investigation of several serious accidents involving safety- and mission-critical systems [7], [11] led to the conclusion that these accidents can be attributed to combinations of faults, system misconfiguration, and procedure violations.
The scope, limitations and issues of inspection approaches and defect detection in different domains of software development mentioned above motivate us to enhance the process of quality software development by reducing the software failures caused by various faults or errors.
1.3 Problem Definition
Various practices for fault reduction in software development have been presented in the past. These practices have shown that the effectiveness of inspections can vary widely [5], [12]. Primarily, this variation may be due to the fact that companies may not have implemented an optimal inspection solution given their environment and development situation. Other probable causes might be differences in the artifacts being inspected and in the resources available to perform inspections.
Secondly, the inspection packages for individual inspectors should include information on the planned effort for the inspection to enable inspectors to fit their inspection tasks into their work schedules. So far, there is little empirical information available on the effort inspection techniques need for best results and the effectiveness of these techniques for different levels of actual inspection effort invested.
However, regardless of the specific situation at hand, one needs to decide whether the inspected artifact is of sufficient quality or whether a reinspection is warranted. The current standard methods use the zero-one matrix as the only input for computing the estimate. Such a decision should be made as objectively as possible, which raises a practical decision problem that needs to be addressed for quality and cost-effective development.
The current standard techniques for estimating the defect content after an inspection fall into two categories: the capture-recapture methods and the curve-fitting methods [1]. Several studies show that the capture-recapture and curve-fitting estimates are much too unreliable to be used in practice. The methods show extreme outliers and a high variation in the estimation error. One explanation is that the standard methods do not take into account the observations and experience gained in past inspections.
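As an illustration of the capture-recapture idea, the following minimal Python sketch applies the classical two-sample Lincoln-Petersen estimator to the findings of two reviewers. The choice of this particular estimator is an assumption made for illustration; the text does not state which capture-recapture model is meant, and the function name and defect identifiers are hypothetical.

```python
from typing import Set


def lincoln_petersen(found_a: Set[str], found_b: Set[str]) -> float:
    """Estimate the total number of defects from two reviewers' findings.

    Uses N = n_a * n_b / m, where n_a and n_b are the numbers of defects
    found by each reviewer and m is the number found by both.
    """
    overlap = len(found_a & found_b)
    if overlap == 0:
        # The estimator is undefined when the reviewers share no defect.
        raise ValueError("no common defects: estimate is undefined")
    return len(found_a) * len(found_b) / overlap


# Hypothetical findings of two reviewers inspecting the same document.
reviewer_a = {"D1", "D2", "D3", "D4"}
reviewer_b = {"D3", "D4", "D5"}

estimated_total = lincoln_petersen(reviewer_a, reviewer_b)
detected = len(reviewer_a | reviewer_b)
estimated_remaining = estimated_total - detected
```

With only two defects in common, a single additional overlap would change the estimate from 6 to 4, which gives a sense of why such estimates vary so strongly on small inspections.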
Software defect prediction models [13] are weak because of their inability to cope with the as yet unknown relationship between faults and failures. This relationship is complex because a single fault may result in many failures and vice versa. Since associating failures with the faults that caused them is a difficult task, simplifying assumptions and heuristics are often used. Having accurate defect prediction models or processes in place to reduce the number of faults, and thereby system failures, is therefore another very important problem in software development.
1.4 Research Objective
Quality is well understood to be an important factor in software. Deming succinctly describes the business chain reaction resulting from quality: improving quality leads to decreasing rework, costs, and schedules, which all lead to improved capability, which leads to lower prices and larger market share, which leads to increased profits and business continuity [14]. Software process improvement is inspired by this chain reaction and focuses on implementing disciplined processes, i.e., performing work consistently according to documented policies and procedures [15]. If these disciplined processes conform to accepted best practice for doing the work, and if they are continually and measurably improving, they are characterized as mature processes.
Software inspection is a proven approach that enables the detection and removal of defects in software artifacts soon after these artifacts are created [7]. Inspections not only contribute toward software quality improvement, but can also lead to budget and time benefits through early defect detection.
The main objective of this research is to present enhanced approaches for inspection and defect detection that determine whether every functional scenario defined in the specification is implemented correctly by a set of program paths, and whether every program path contributes to the implementation of some functional scenario in the specification.
It also explores the adequacy, accuracy, scalability, and uncertainty of architecture-based software reliability models using large-scale, open-source case studies, because certain assumptions of these models do not appear to hold true and some heuristics are not justified.
To achieve this enhancement, it systematically investigates and characterizes software faults and failures based on data extracted from the change tracking systems of large-scale, real-world software projects. The research objective also contributes to the body of empirical knowledge about faults and failures. To this end, the work not only pursues its specific research goals, but also compares its results with related studies wherever required.
1.5 Research Contribution
This thesis contributes four novel mechanisms for the reduction of faults based on software defect classification and fault prediction in software systems.
Firstly, it contributes an Adaptive PSO based Association Rule Mining Technique for Software Defect Classification using ANN. This work discovers the best support and confidence values to obtain the best rules for software defect classification, and an Artificial Neural Network (ANN) is used to classify the actual defects.
Secondly, it contributes a Fault Prediction Approach based on the Probabilistic Model for Improvising Software Inspection. This work improves software inspection so that defects are detected accurately and cost-effectively for quality software development. The proposed FPA investigates probabilistic methods using a modified Naive Bayes classification to estimate the probable faults in an experimental context and to suggest fault-controlling development.
Thirdly, it contributes a Defect Classification using Relational Association Rule Mining Based on Fuzzy Classifier along with Modified Artificial Bee Colony Algorithm. This work proposes a fuzzy method to identify the relationships among features, which may express the quantitative information that exists in the vector characterizing a software entity. The rules are generated from the selected feature subset by employing Relational Association Rule Mining to identify the defects in the database.
Finally, it contributes a Rule-based Prediction Method for Defect Detection in Software System. This work presents a rule-based prediction (RBP) method for defect detection and for planning a better maintenance strategy, which can help forecast whether a software module is defective or non-defective before it is deployed in any software project.
1.6 Dissertation Organisation
The thesis comprises seven chapters, as follows.
- Chapter-1 presents the "Introduction" of the thesis, describing the Overview, Research Motivation, Problem Definition, Research Objective, Research Contribution and the organisation of the chapters.
- Chapter-2 presents the "Background Study", which investigates the Software Fault Detection Mechanism, the Software Fault Prevention Mechanism and the benefits and limitations of fault prevention. It then discusses approaches to improve the defect prediction model and the related works on software fault detection and prevention mechanisms.
- Chapter-3 presents an "Adaptive PSO based Association Rule Mining Technique for Software Defect Classification using ANN". It first presents the related works on software defect classification, and then discusses the proposed classification methodology, the experiment evaluation and the analysis of its results.
CHAPTER-2
BACKGROUND STUDY
This chapter investigates current practices for software fault detection and prevention in software development. It discusses software fault detection and prevention mechanisms, and the benefits and limitations of these mechanisms. It also discusses approaches to improve the defect prediction model and the related works in software fault detection and prevention.
2.1 Overview
The need for distributed and complex commercial applications in enterprises demands error-free, high-quality application systems. This makes it extremely important to develop quality, fault-free software. It is also very important to design software that is reliable and easy to maintain, as maintenance involves a great deal of human effort, cost and time during the software life cycle. A software development process performs various activities to minimize faults, such as fault prediction, detection, prevention and correction.
Assurance of software quality with testing alone is not enough since its effect comes late in development and is rather costly. Inspection can be applied early in the life cycle, immediately after the development of a product, and has established an impressive track record to find defects in software development products in time to determine product quality and to save rework later in the project [16].
Software is a single entity that has established a strong impact on all domains, including education, defence, medical, scientific, transportation, telecommunications and others. The activities in these domains always demand high-quality software to meet their need for accurate service [17], [18], [19]. Software quality means an error-free product that is able to produce predictable results and can be delivered within the constraints of time and cost. Therefore, the need for a systematic approach to developing high-quality software has increased with the competitiveness of today's business world, technology advances, the complexity of hardware and changing business requirements. So far, various techniques have been proposed for predicting fault-prone modules, and they have been evaluated in terms of prediction performance. However, the quality improvement and cost reduction they actually deliver against business objectives are rarely assessed.
Software failures are mainly caused by design deficiencies that occur when a software engineer either misunderstands a specification or simply makes an error. It is estimated that 60-90% of current computer errors are caused by software failures [20], [21], [22]. Failure prediction has been studied in the context of fault-prone modules, self-healing systems, developer information, maintenance models, etc., but many aspects, such as modelling and weighting the impact of different types of faults in different types of software systems, must still be explored to understand fault severity in software development.
Performance and reliability requirements are fundamental to the development of high-assurance systems. Failure analysis has proved a useful tool for detecting and preventing requirements failures early in the software life cycle. By adapting a generic fault taxonomy, one is better able to avoid past mistakes and to develop requirements specifications with fewer failures overall. Fewer failures in the software specification, with respect to the requirements for performance and reliability, will result in high security and quality systems. The scope of this research work is to provide an overview of the mechanisms for fault detection and the techniques for fault prevention that can be followed in the quality software development process.
2.2 Software Fault Detection Mechanism
A defect refers to any flaw or imperfection in a work activity for a software product or software process, caused by an error, fault or failure. The IEEE Standards define an Error as a human action that leads to inaccurate results, and a Fault as a wrong decision made while understanding the information given to solve the problem or the application process. A single error can lead to one or more faults, and several faults can lead to a failure. To avoid such failures in software products, fault detection activities are carried out in every phase of the software development life cycle based on need and criticality.
A. Monden et al. [17] propose a simulation model that uses fault prediction results in software testing to measure the cost-effectiveness of test effort allocation strategies. The proposed model estimates the number of discoverable faults with respect to the resource allocation strategy, a set of modules, and the result of fault prediction. In a case study applying fault prediction to the acceptance testing of a small system in the telecommunications industry, the results of their simulation model showed that the best strategy was to make the test effort proportional to the "number of failures expected in a module". By using this strategy with their best fault prediction model, the test effort could be reduced by 25% while still detecting as many faults as are normally found in testing, even though the company required approximately 6% of the test effort for the collection of statistics, data cleansing and modelling.
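The allocation strategy reported as best in this case study, test effort proportional to the number of faults expected in each module, can be sketched in a few lines of Python. This is only an illustration of proportional allocation and not the simulation model of [17]; the module names, predicted fault counts and effort budget are hypothetical.

```python
from typing import Dict


def allocate_effort(
    expected_faults: Dict[str, float], total_effort: float
) -> Dict[str, float]:
    """Split a test effort budget in proportion to predicted faults."""
    total_faults = sum(expected_faults.values())
    if total_faults <= 0:
        raise ValueError("at least one module must have expected faults")
    return {
        module: total_effort * faults / total_faults
        for module, faults in expected_faults.items()
    }


# Hypothetical prediction output (expected faults per module) and a
# hypothetical budget of 40 person-hours of test effort.
predicted = {"parser": 3.0, "ui": 1.0, "db": 0.0}
plan = allocate_effort(predicted, total_effort=40.0)
```

A module with no predicted faults receives no effort under this rule, so in practice a minimum effort per module may be preferable; that refinement is left out of the sketch.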
2.2.1 Detection Using Automated Static Analysis
Manual code analysis is one of the oldest practices and is still in use, but Automated Static Analysis (ASA) tools are increasingly used, especially for standard problems such as non-compliance faults, possible memory leaks and variable usage. They have an essential place in the development phase because they save significant rework effort and reduce fault leakage into test cycles. FindBugs, Checkstyle and PMD are some of the commonly used tools for Java, and similar tools exist for most technologies. Although ASA plays an important role in the development cycle, it is not widely practiced in maintenance mode. However, for systems whose source code is compatible with ASA tools, these tools can be used as a hygiene factor and a good detection mechanism, as any error introduced in the field is highly expensive. ASA tools nevertheless cannot find many of the flaws that may result in failures. A study on the effectiveness of ASA tools on open source code reveals that they detect less than 3% of the failures [18].
S. Liu et al. [19] address the problem that software inspection is a static analysis technique commonly used for fault detection, but one which suffers from a lack of rigor. They take advantage of formal specification and analysis to support a systematic and rigorous inspection method. The purpose of the method is to use inspection to determine whether every functional scenario defined in the specification is implemented correctly by a set of program paths, and whether every program path contributes to the implementation of some functional scenario in the specification. The method consists of deriving functional scenarios from the specification, deriving paths from the program, linking scenarios to paths, analyzing the paths against the scenarios, and producing an inspection report, and it allows for the systematic and automatic generation of a checklist for inspection.
2.2.2 Detection Using Graph mining
Graph mining is a dynamic, control-flow-based approach that helps identify faults that may be non-crashing in nature. Reduced call graphs are used for simplicity of processing. Each node of the graph represents a function, and a call from one function to another is represented by an edge. Edge weights are assigned based on call frequencies. Variations in call frequency and changes in the call structure indicate potential failures. Problems in the data passed between methods can also affect the call graph through their side effects.
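The call graph representation described above can be sketched in Python as follows. The sketch builds a weighted call graph from a list of observed calls and flags edges whose presence or frequency differs between a reference run and an observed run. The function names, the threshold `ratio` and the example traces are assumptions made for illustration, not part of any specific graph mining technique from the literature.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]  # (caller, callee)


def call_graph(calls: Iterable[Edge]) -> Counter:
    """Weighted call graph: each edge maps to its call frequency."""
    return Counter(calls)


def suspicious_edges(
    reference: Counter, observed: Counter, ratio: float = 2.0
) -> List[Tuple[Edge, str]]:
    """Flag edges that differ in structure or in call frequency."""
    flagged = []
    for edge in sorted(set(reference) | set(observed)):
        ref, obs = reference.get(edge, 0), observed.get(edge, 0)
        if ref == 0 or obs == 0:
            # Edge exists in only one of the two graphs.
            flagged.append((edge, "structure"))
        elif max(ref, obs) / min(ref, obs) >= ratio:
            # Edge exists in both graphs but its weight changed a lot.
            flagged.append((edge, "frequency"))
    return flagged


# Hypothetical traces from a passing run and a suspect run.
passing = call_graph([("main", "parse")] * 2 + [("parse", "read")] * 4)
suspect = call_graph(
    [("main", "parse")] * 2 + [("parse", "read")] + [("parse", "log")]
)
report = suspicious_edges(passing, suspect)
```

Here the new edge from `parse` to `log` is reported as a structural change and the drop in calls from `parse` to `read` as a frequency change, while the unchanged edge is not reported.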
2.2.3 Detection Using Classifiers
Classifiers based on clustering algorithms, decision trees or neural networks can be used to distinguish abnormal events from normal events for detection. Classifiers are also trained by labelling faulty traces when a fault is observed. Commonly used classifiers include Naive Bayes and bagging. Bayesian classification is a supervised learning method as well as a statistical method for classification. It assumes an underlying probabilistic model, which allows uncertainty about the model to be captured in a principled way by determining the probabilities of outcomes. Recent research work [16] in this area proposes an unsupervised model that captures the probability distribution of the normal behaviour of each code region, in order to identify events in which the code behaves abnormally. This information is used to filter the labelled anomalies submitted to the ranking algorithm, so that it focuses on anomalous observations.
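To make the Bayesian classification step concrete, the following is a minimal Python sketch of a standard Bernoulli Naive Bayes classifier with Laplace smoothing, applied to binary module features. It is the textbook form of the classifier, not the modified Naive Bayes proposed in this thesis, and the feature meanings, class labels and training data are hypothetical.

```python
import math
from typing import Dict, List, Sequence


class BernoulliNaiveBayes:
    """Naive Bayes over binary features with Laplace smoothing."""

    def fit(self, X: Sequence[Sequence[int]], y: Sequence[str]):
        self.classes: List[str] = sorted(set(y))
        self.log_prior: Dict[str, float] = {}
        self.prob: Dict[str, List[float]] = {}
        n_features = len(X[0])
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # P(feature_j = 1 | class c), smoothed to avoid zeros.
            self.prob[c] = [
                (sum(row[j] for row in rows) + 1) / (len(rows) + 2)
                for j in range(n_features)
            ]
        return self

    def predict(self, x: Sequence[int]) -> str:
        def score(c: str) -> float:
            # Log posterior up to a constant, assuming independent features.
            return self.log_prior[c] + sum(
                math.log(p if v else 1.0 - p)
                for p, v in zip(self.prob[c], x)
            )

        return max(self.classes, key=score)


# Hypothetical features per module: [high complexity, recently changed].
X = [[1, 1], [1, 0], [0, 0], [0, 1], [0, 0], [1, 1]]
y = ["defective", "defective", "clean", "clean", "clean", "defective"]
model = BernoulliNaiveBayes().fit(X, y)
label_complex_changed = model.predict([1, 1])
label_simple_stable = model.predict([0, 0])
```

Working in log space avoids numerical underflow when the number of features grows, which is why the scores are sums of logarithms rather than products of probabilities.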
Machine learning classifiers [23] have recently been introduced to predict faults in changes made to source files. The classifier is first trained on the software development history, and is then used to predict whether an upcoming change causes an error. The drawbacks of existing classifier-based bug prediction techniques are insufficient performance for practical use and slow prediction times due to the large number of machine-learned features.
S. Shivaji et al. [24] investigate several feature selection techniques that are generally applicable to classification-based fault prediction, using Naive Bayes and Support Vector Machine (SVM) classifiers. The techniques discard less important features until optimal classification performance is achieved. The total number of features used for training is substantially reduced, often to less than 10 percent of the original. Both Naive Bayes and SVM with feature selection provide a significant improvement in buggy F-measure compared to prior change classification fault prediction results, such as those proposed in [6].
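For reference, the buggy F-measure mentioned above is the harmonic mean of precision and recall computed for the buggy class. A minimal Python sketch follows, with hypothetical labels and predictions used only for illustration.

```python
from typing import Sequence


def f_measure(
    actual: Sequence[str], predicted: Sequence[str], positive: str = "buggy"
) -> float:
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical change classification results.
actual = ["buggy", "buggy", "buggy", "clean", "clean"]
predicted = ["buggy", "buggy", "clean", "buggy", "clean"]
score = f_measure(actual, predicted)
```

In this example two of three buggy changes are found and two of three buggy predictions are correct, so precision, recall and the F-measure are all two thirds.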
2.2.4 Detection Using Pattern Mining
Pattern-based detection is also classifier-based, but it uses unique iterative patterns to classify sequential data, applying software trace analysis to failure detection. First, a set of discriminative features capturing repetitive series of events is extracted from program execution traces. Subsequently, feature selection is performed to choose the best features for classification. A classifier model is then trained with these feature sets and used to identify failures. Process modelling allows for the analysis and improvement of processes that coordinate multiple people and tools working together to perform a task. Process modelling generally focuses on the normative process, that is, how the collaboration transpires when everything goes as desired. Unfortunately, real-world processes rarely go that smoothly. A more complete analysis of a process requires that the process model also include details of what to do when exceptional situations occur.
B.S. Lerner et al. [25] have shown that in many cases there are abstract patterns that capture the relationship between exception handling tasks and the normative process. Just as object-oriented design patterns facilitate the development, documentation and maintenance of object-oriented programs, they believe that process patterns can facilitate the development, documentation and maintenance of process models. They focus on the exception handling patterns that they have observed over many years of process modelling. They also describe these patterns using three process modelling notations: UML 2.0 Activity Diagram [26], BPMN and Little-JIL [27]. They provide both the abstract structure of each pattern and examples of the pattern in use. They present some preliminary statistical data to support the contention that these patterns are commonly found in practice, and discuss the relative merits of the three notations with respect to their ability to represent these patterns.
2.3 Software Fault Prevention Mechanism
In software development, many faults emerge during the development process. It is a mistake to believe that faults are injected only at the beginning of the cycle and removed through the rest of the development process [20]. Faults occur all the way through the development process. Therefore, fault prevention becomes an essential part of improving the quality of software processes.
Fault prevention is a quality improvement process that aims to identify common causes of faults and to change the relevant processes so that faults of the same type do not recur. It also increases the quality of a software product and reduces overall costs, time and resources. This ensures that a project can keep time, cost and quality in balance. The purpose of fault prevention is to identify faults at the beginning of the life cycle and to prevent them from happening again.
2.3.1 Importance of Fault Prevention
Fault prevention is an important activity in any software project development cycle. Most software project teams focus on fault detection and correction, so fault prevention often becomes a neglected component. It is therefore appropriate to take measures right from the early stages of the project to prevent faults from being introduced into the product. Such measures are low cost, and the total savings achieved at later stages are quite high compared to the cost of fixing faults. Thus, the time spent analysing faults in the early stages reduces cost and resources. Knowledge of fault injection methods and processes enables fault prevention. Once this knowledge is put into practice, quality improves. It also enhances overall productivity.
2.3.2 Activities in Fault Prevention
- Fault Identification
Fault identification can be a pre-planned activity aimed at highlighting specific faults. In general, faults can be identified in design reviews, code inspections, GUI reviews, and function and unit testing activities performed at different stages of the software development life cycle. Once the faults are identified, they are classified using a classification approach.
- Fault Classification
Faults can be classified using the general Orthogonal Defect Classification (ODC) technique [28] to find the fault group and its type. The ODC technique classifies a fault at the time it first occurs and at the time it gets fixed. In the ODC methodology, each fault is classified into orthogonal (mutually exclusive) attributes, some technical and some managerial. These attributes provide the information needed to analyze massive amounts of data and to arrive at root causes and patterns. Together with good action planning and tracking, this can achieve fault reduction and a high level of learning.
- Fault Analysis
Fault analysis is a continuous process of quality improvement using fault data. Fault analysis generally classifies faults into categories and directs process improvement efforts by attempting to identify possible causes. Root Cause Analysis (RCA) has played a useful role in the analysis of software faults. The goal of RCA is to identify the root cause of faults and to initiate action so that the source of the faults is eliminated.
- Fault Prevention
Fault prevention is an important activity in any software project. Its objective is to identify the causes of faults and prevent them from recurring. Fault prevention involves analyzing the faults encountered in the past and taking specific actions to prevent the occurrence of these types of faults in the future. Fault prevention can be applied in one or more phases of the software life cycle to improve the quality of the software process.
The benefits of analyzing software faults and failures are widely recognized. However, detailed studies based on concrete data are rare. M. Hamill et al. [21] analyze the fault and failure data from two large, real-world case studies. The results show that individual failures are often caused by multiple faults distributed throughout the system. This observation is important because it does not support several heuristics and assumptions used in the past. Moreover, it is clear that finding and fixing the faults that lead to such failures in large, complex systems remain difficult and challenging tasks despite the advances in software development.
2.4 Fault Prevention Benefits and Limitations
Fault prevention strategies exist, but they reflect a high level of test discipline maturity, and the associated testing effort represents the most cost-effective expenditure. Detecting errors throughout the development life cycle, from design specifications to code implementation, helps to prevent errors from escaping to later phases. Test strategies can therefore be classified into two categories: fault detection technologies and fault prevention technologies.
The limitations include the lack of specific domain knowledge when software has to be developed and implemented for new and diverse domains. On many occasions, appropriate quality requirements are not specified in the first place. The inspection operation is labour intensive and requires high skill. Sometimes well-developed quality measurements have not been identified at design time.
2.5 Approach to Improve Defect Prediction Model
Defect models, which identify defect-prone software modules using a variety of software metrics, serve two main purposes. First, defect models can be used to predict modules that are likely to be defect-prone. Software Quality Assurance (SQA) teams can use defect models in a prediction setting to effectively allocate their limited resources to the modules that are most likely to be defective. Second, defect models can be used to understand the impact that various software metrics have on the defect-proneness of a module.
The insights derived from defect models can help software teams to avoid past pitfalls that lead to defective modules. The predictions and insights that are derived from defect prediction models may not be accurate and reliable if researchers do not consider the impact of the experimental components (e.g., datasets, metrics, and classifiers) of defect prediction modeling. Indeed, there exists a plethora of research that raises concerns about the impact of experimental components on defect prediction modeling. For example, Shepperd et al. found that the reported performance of a defect prediction model shares a strong relationship with the group of researchers who construct the models. Their observations suggest that many published defect prediction studies are biased and call their validity into question.
To assess the impact of experimental components on defect prediction modeling, the association between the reported performance of a defect model and the experimental components used (i.e., datasets, metrics, and classifiers) is considered. The experimental components (e.g., metrics) that are used to construct defect prediction models share a stronger relationship with the reported performance than the research group does, suggesting that the experimental components of defect prediction modeling may impact the conclusions of defect prediction studies.
2.5.1 Software Defect Prediction
The main goals of most software defect prediction studies are (1) to predict where defects might appear in the future and (2) to quantify the relative importance of various factors (i.e., independent variables) used to build a model. Ideally, these predictions will correctly classify software artifacts (e.g., subsystems and files) as being defect-prone or not, and in some cases they may also predict the number of defects, the defect density (the number of defects / SLOC), and/or the likelihood of a software artifact including a defect. Figure 2.1 shows an overview of the software defect prediction process.
First, data is collected from software repositories, which archive historical data related to the development of the software project, such as source code and defect repositories. Then, various metrics that are used as independent variables (e.g., SLOC) and a dependent variable (e.g., the number of defects) are extracted to build a model. The relationship between the independent variables and the dependent variable is modeled using statistical and machine learning techniques. Finally, the performance of the built model is measured using several criteria such as precision, recall, and the AUC (Area Under the Curve) of the ROC (Receiver Operating Characteristic) curve.
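The evaluation criteria named above can be sketched in a few lines. This is a minimal illustration, not the procedure used in the studies discussed here: the labels, scores, and the 0.5 decision threshold are hypothetical, and the AUC is computed through its rank interpretation (the probability that a defective module receives a higher score than a clean one).

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = defective, 0 = clean)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def auc(y_true, scores):
    """AUC as the probability that a defective module outranks a clean one.

    Ties between a defective and a clean module count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one module of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical modules: true labels and predicted defect probabilities.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

precision, recall = precision_recall(labels, predicted)
area = auc(labels, scores)
```

Precision and recall depend on the chosen threshold, whereas the AUC summarizes the ranking quality over all thresholds, which is one reason it is often reported alongside them.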
2.5.2 Major Experimental Components In Defect Prediction Modeling
This section addresses three major experimental components in defect prediction modeling:
1. Overlooking noise generated by issue report mislabeling
The first component concerns the impact that realistic noise generated by issue report mislabeling has on the predictions and insights of defect prediction models.
2. Overlooking the parameters of classification techniques
The second component concerns the impact that the parameter settings of classification techniques have on the accuracy and reliability of the performance of defect prediction models when automated parameter optimization is applied.
3. Overlooking the most accurate and reliable model validation techniques
The third component concerns the impact that model validation techniques have on the accuracy and reliability of the performance estimates of defect prediction models.
These experimental components have an impact on the predictions and insights of defect prediction models. Case studies of systems that span both proprietary and open-source domains demonstrate that the experimental components (e.g., metric family) that are used to construct defect prediction models share a stronger relationship with the reported performance than the research group does. Noise generated by issue report mislabeling has an impact on the predictions and insights of defect prediction models. Parameter settings of classification techniques have an impact on the accuracy and reliability of the performance of defect prediction models. Model validation techniques have an impact on the accuracy and reliability of the performance estimates that are produced by defect prediction models.
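The interplay between parameter settings and model validation can be illustrated with a small sketch. It assumes k-fold cross-validation as the validation technique and a deliberately trivial classifier (a threshold on a single metric) whose one parameter is tuned on the training folds only; the data, the candidate thresholds, and all function names are hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def accuracy(threshold, rows):
    """Accuracy of the rule 'defective if metric >= threshold'."""
    hits = sum(1 for metric, label in rows if (metric >= threshold) == bool(label))
    return hits / len(rows)


def cross_validate(rows, thresholds, k=5, seed=0):
    """Tune the threshold on the training part, score it on the held-out fold."""
    folds = k_fold_indices(len(rows), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [r for i, r in enumerate(rows) if i not in held]
        test = [rows[i] for i in held_out]
        # Parameter optimization uses the training data only.
        best = max(thresholds, key=lambda t: accuracy(t, train))
        scores.append(accuracy(best, test))
    return sum(scores) / len(scores)


# Hypothetical data: (complexity metric, defect label) pairs.
data = [(m, 1 if m >= 10 else 0) for m in range(20)]
estimate = cross_validate(data, thresholds=[5, 10, 15], k=5)
```

Tuning inside each fold keeps the held-out data unseen during parameter selection, so the averaged score is not inflated by the optimization step.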
2.6 Related Works
No single software fault detection technique is capable of addressing all concerns in error detection. Like software reviews and testing, static analysis tools (automated static analysis) can be used to remove faults before a software product is released. Inspection, prototyping, testing, and proofs of correctness are several approaches to identifying faults. Formal inspections are the most effective and expensive quality assurance techniques for identifying faults in the early stages of development. Prototyping helps to overcome faults because several requirements become clearly understood. Testing is one of the least effective techniques; faults that escaped detection in the early stages can still be detected at the time of testing. Proofs of correctness are a good means of detection, especially at the coding level. Correctness by construction is the most effective and economical way of building software.
J. Zheng et al. [29] determined the extent to which automated static analysis can help in the economic production of a high-quality product. They analyzed static analysis faults and customer-reported failures for three large industrial software systems developed at Nortel Networks. The data show that automated static analysis is an affordable means of software fault detection. Using the Orthogonal Defect Classification scheme, they found that automated static analysis is effective at identifying assignment and checking faults, so that subsequent software production phases can focus on more complex, functional, and algorithmic faults.
Khoshgoftaar and Allen [30], [31] proposed a model that ranks modules according to software quality factors such as future fault density. The inputs to the model are software complexity metrics such as LOC, the number of unique operators, and complexity. A stepwise regression is then performed to find weights for each factor. Briand et al. [32] used object-oriented metrics to predict classes that are likely to contain faults, applying PCA in combination with logistic regression to predict fault-prone classes. Morasca and Ruhe [33] predicted risky modules in commercial software using rough set theory and logistic regression.
Over the years, several software tools have been developed to support log-based failure analysis, integrating state-of-the-art techniques to collect, manipulate, and model the log data, for example MEADEP [34], Analyze-NOW [22], and SEC [35], [36]. However, log-based analysis is not supported by fully automated procedures, so most of the processing burden falls on log analysts, who often have limited knowledge of the system. For instance, the authors of [22] defined a complex algorithm that identifies OS reboots from the log on the basis of a sequential analysis of log messages. Moreover, since one fault activation can cause multiple messages in the log, considerable effort has to be spent on coalescing the entries that relate to the same fault manifestation [37], [38], [39]. Pre-processing tasks are critical to obtaining accurate failure analysis [40], [41].
While many case studies of fault prediction applied to industrial data sets have been reported [42], [43], [44], few studies have estimated the reduction in test effort or the increase in software quality achieved through early fault detection. Li et al. [45] reported experiences of applying field fault prediction at ABB Inc. Their experiences address practical questions such as how to select a suitable modelling method and how to evaluate the accuracy of the forecasts across several releases over time.
Mende and Koschke [46] and Kamei et al. [47] proposed effort-aware measures for assessing fault prediction accuracy. Conventional evaluation measures such as recall, precision, Alberg diagrams, and ROC curves ignore the cost of the quality assurance activities, whereas the cost of auditing or reviewing a module is expected to be roughly proportional to its size. C. F. Kemerer et al. [48] studied the influence of the review rate on software quality while controlling for a comprehensive range of factors that can affect the analysis. The data come from the Personal Software Process (PSP), which implements the inspection activities carried out by development groups. In particular, the PSP design and code review rates correspond to the preparation rates in inspections.
CHAPTER-3
Adaptive PSO Based Association Rule Mining Technique for Software Defect Classification using ANN
This chapter proposes a system that categorizes various defects using an association rule mining based defect classification approach, which is applied to group the actual defects after detection. Association rule mining algorithms at times produce meaningless rules. To avoid this problem, the rules have to be optimized prior to classification on the basis of their support and confidence values. In this research, the Adaptive Particle Swarm Optimization (APSO) algorithm is used to find the best support and confidence values and thus the best rules. Finally, an Artificial Neural Network (ANN) is used to classify the defects.
3.1 Overview
Defects are damaging at all stages of software development. A defect is a flaw, deficiency, or inaccuracy in a software product. Defects persist throughout the life of software; every defect that arises in any stage of the software is a defect in that software [49]. Research on code reviews has typically focused on defect counts rather than defect types, which offers an incomplete view of the benefits of code review [50]. The large body of research on the differences among defects and their nature cannot be covered completely here, so this overview concentrates on a number of notable examples. The work can be roughly separated into three classes: (1) defect taxonomies, (2) root cause analysis, and (3) defect classification [51]. The cost of finding and repairing bugs or defects is the largest single expense element in the history of software. Bug repairs start with requirements and continue through development [52].
Software defect detection aims to automatically identify defective software modules for efficient software testing in order to improve the quality of a software system [53]. Software quality has long been an important issue. Several disciplines are concerned with the quality of software, but the focus here is on defect prevention for outsourced projects.
Recently, quality has been accepted as the most important factor with a strong effect on the success of a software product [54]. The quality of a software system can be described using different attributes such as reliability and maintainability [55]. Among software quality improvement activities, testing is likely the most important. It is thus of particular interest to assess and understand how effective a given set of tests is with respect to its ability to detect the most troublesome (post-release) defects [56].
Software quality begins with the removal or significant reduction of software defects, ahead of other quality attributes such as maintainability, portability, reliability, or usability [57]. Defect prevention is a quality assurance alternative for improving the quality of a software product. The goal of defect prevention methods is to produce better quality products within the available budget and time. The quality of a product relates directly to the number of defects: fewer defects lead to better quality [58].
Software defect prevention work generally concentrates on specific inspection and testing methods. ODC is a mechanism by which the software defects that occur in the software development life cycle are exploited [59]. Defect prevention is one of the most essential yet habitually overlooked elements of software quality assurance in any project. When applied at all stages of software development, it can reduce the time, overheads, and resources required to produce a high-quality product [60].
Defect prevention is a process of identifying defects and their causes, fixing them, and preventing them from recurring [61]. Defect prevention can improve both quality and productivity. When the number of defects introduced decreases, quality improves because the number of residual defects in the delivered software decreases [62]. Applied at all stages of software development, defect prevention reduces the time, overheads, and resources needed to produce a high-quality product [63].
3.2 Problem for Software Defect Classification
1. The cost of finding the defects represents one of the most expensive software development activities.
2. Association rule mining is a major technique for generating patterns. However, it produces a huge number of association rules, some of which may be insignificant.
3. The genetic optimization algorithm is limited by its random solutions and its convergence.
4. Association rule mining is frequently very expensive.
5. Extracting rules from software using a single mining technique is not efficient, so an optimization technique is required.
6. The Naive Bayes classifier requires a very large number of records to obtain good results.
3.3 Proposed Methodology
Defect prevention is the most vital but habitually neglected aspect of software quality assurance in any project. If applied at all stages of software development, it can reduce the time, overheads, and resources involved in producing a high-quality product. The proposed system categorizes different defects using an association rule mining based defect classification method, which is applied to group the defects after detection. Association rule mining sometimes produces meaningless rules, and it is very difficult to classify defects on the basis of such rules. To prevent this problem, the rules must be optimized prior to classification on the basis of their support and confidence values.
This research focuses on finding the best rules using the Adaptive Particle Swarm Optimization (APSO) technique, which finds the best support and confidence values and thus the best rules. After the best rules are obtained, the defects are classified using an artificial neural network. Finally, the quality is assessed using different quality metrics such as defect density and sensitivity.
The software defect prediction dataset KC1 is used in the defect prediction work. The input attributes used from the dataset are essential complexity, design complexity, total operators + operands, intelligence, and the defect label (false or true). The features are subjected to frequent item set mining in order to extract the association rules.
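Frequent item set mining operates on discrete items, while the KC1 metrics are numeric. The text does not say how the attributes are turned into items, so the following is only one possible sketch: each numeric metric is mapped to a "low" or "high" item by comparing it with a cut point. The attribute names, the cut points, and the function `to_items` are all hypothetical.

```python
def to_items(record, cut_points):
    """Turn numeric metrics into 'name=low/high' items; keep the label as an item."""
    items = set()
    for name, value in record.items():
        if name == "defects":
            items.add(f"defects={value}")
        else:
            # Assumed binarization: above the cut point counts as 'high'.
            level = "high" if value > cut_points[name] else "low"
            items.add(f"{name}={level}")
    return frozenset(items)


# Hypothetical module record and cut points (not taken from KC1).
cuts = {"essential_complexity": 3.0, "design_complexity": 4.0}
module = {"essential_complexity": 5.0, "design_complexity": 2.0, "defects": True}
transaction = to_items(module, cuts)
```

Each module then becomes one transaction, that is, a set of items from which support and confidence values can be counted.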
3.3.1. Frequent Item Set Mining
Every item is a member of the set of candidate item sets. The support values are computed for the item sets individually, using the following formula:
[Equation not included in this excerpt]
where A and B are frequent item sets.
The support values computed for the individual item sets are compared with the minimum support value. Item sets with a support value below the minimum support value are removed, and the remaining item sets are selected. Next, the selected item sets are merged with one another to form larger item sets. The support value is computed again for the merged item sets, and those below the minimum are removed. Through this removal and pruning step, the item sets used for generating association rules are found. The confidence value is then determined using the following formula:
[Equation not included in this excerpt]
where A and B are frequent item sets.
To generate association rules, the frequent item sets obtained at the end and the minimum confidence value are used. All of the generated item sets are taken into account, and the item sets whose items occur in a frequent item set are identified. Using formula (2), the confidence values for the selected item sets are computed. The item sets with a confidence value greater than the assigned minimum confidence value are selected, and the remaining item sets are discarded. The selected item sets are the association rules. The association rules produced are given as input to the adaptive PSO algorithm for optimization.
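The mining steps described in this subsection follow the familiar level-wise (Apriori-style) pattern. Since the formulas themselves are not reproduced here, the sketch below assumes the standard definitions: the support of an item set is the fraction of transactions containing it, and the confidence of a rule A -> B is support(A and B together) divided by support(A). The transactions and item names are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise search: keep frequent sets, merge them into larger candidates, prune."""
    if not 0 < min_support <= 1:
        raise ValueError("min_support must lie in (0, 1]")
    items = {item for t in transactions for item in t}
    level = [frozenset([i]) for i in items]
    frequent = {}
    while level:
        kept = {s: support(s, transactions) for s in level}
        kept = {s: v for s, v in kept.items() if v >= min_support}
        frequent.update(kept)
        # Merge surviving sets that differ by one item to form the next candidates.
        level = list({a | b for a in kept for b in kept if len(a | b) == len(a) + 1})
    return frequent


def rules(frequent, transactions, min_confidence):
    """Rules A -> B with confidence = support(A and B) / support(A)."""
    found = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                conf = sup / support(lhs, transactions)
                if conf >= min_confidence:
                    found.append((lhs, itemset - lhs, sup, conf))
    return found


# Hypothetical transactions built from binarized metrics.
transactions = [
    frozenset({"ev=high", "iv=high", "defect=true"}),
    frozenset({"ev=high", "iv=low", "defect=true"}),
    frozenset({"ev=low", "iv=low", "defect=false"}),
    frozenset({"ev=high", "iv=high", "defect=true"}),
]
freq = frequent_itemsets(transactions, min_support=0.5)
strong = rules(freq, transactions, min_confidence=0.9)
```

The resulting list of (antecedent, consequent, support, confidence) tuples is the kind of rule set that a later optimization stage could filter further.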
3.3.2. Adaptive Particle Swarm Optimization (APSO)
Particle Swarm Optimization (PSO) is a population-based search algorithm. It is designed to simulate the behaviour of birds searching for food in a cornfield or of a school of fish. The technique can efficiently find optimal or near-optimal solutions in large search spaces. Two kinds of values are used in PSO: the first is the "individual best" and the second is the "global best".
Adaptive Particle Swarm Optimization (APSO) is used here because it gives a more precise result, since the inertia weight is also taken into account. The association rules are optimized using APSO. The working procedure of APSO is similar to that of the PSO mechanism.
A. Association rules optimization using APSO
- Swarm initialization: For a population of size u, generate the particles randomly.
- Define the fitness function: The fitness function is established from the given population and the constraints. Eqn. (3.3) is the fitness function.
[Equation not included in this excerpt]
- gb and pb initialization: Initially, the fitness value computed for each particle is set as the pb (personal best) value of that particle. Among the pb values, the best one is chosen as the gb (global best) value.
- Velocity computation: The new velocity is determined by Eqn. (3.4),
[Equation not included in this excerpt]
- Inertia weight calculation
[Equation not included in this excerpt]
where,
wmx and wmn - maximum and minimum inertia weight.
mit - maximum number of iterations.
- Swarm update: Compute the fitness function again and update the pb and gb values. If the new value is better than the previous one, replace the old value with the new one, and select the best pb as the gb.
- Stopping criterion: Continue until the solution is good enough or the maximum number of iterations is reached. After the association rules are optimized, the defects are classified using the Artificial Neural Network.
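The steps above can be sketched as follows. Because the fitness, velocity, and inertia equations are not reproduced in this excerpt, the sketch relies on assumptions: the standard PSO velocity update, an inertia weight that falls linearly from its maximum to its minimum over the iterations (a common choice, consistent with the symbols wmx, wmn, and mit), and a hypothetical fitness function. The parameter values and the two-dimensional search space (a support and a confidence threshold) are illustrative only.

```python
import random


def apso(fitness, bounds, swarm_size=20, max_iter=50, w_max=0.9, w_min=0.4,
         c1=2.0, c2=2.0, seed=0):
    """Maximize fitness over the box bounds with a linearly decreasing inertia weight."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Swarm initialization: random positions, zero velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    # pb and gb initialization.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(swarm_size), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for it in range(max_iter):
        # Assumed inertia schedule: w falls linearly from w_max towards w_min.
        w = w_max - (w_max - w_min) * it / max_iter
        for i in range(swarm_size):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))  # stay inside the box
            # Swarm update: refresh pb and gb when the new position is better.
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def toy_fitness(p):
    """Hypothetical fitness: best at support 0.3 and confidence 0.8."""
    return -((p[0] - 0.3) ** 2 + (p[1] - 0.8) ** 2)


best, best_fit = apso(toy_fitness, bounds=[(0.0, 1.0), (0.0, 1.0)])
```

A large inertia weight early on favours exploration of the search space, while the smaller weight in later iterations lets the swarm settle around the best position found so far.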
[Figure not included in this excerpt]
Fig. 3.2 Architecture of the proposed methodology
3.3.3 Classification Phase using Artificial Neural Network
Among the classification methods, a Feed Forward Back Propagation Neural Network (FFBNN) classifier is used for classifying software defects. The neural network is a three-layer standard classifier with n input nodes, l hidden nodes, and k output nodes. If two hidden layers are used, the first hidden layer associates each pair into a single meaningful unit, and the second is regarded as the actual hidden layer, which operates on the input data classified by the first hidden layer.
In the proposed work, the input units are the association rules, HUa are the hidden units, and f is the output unit. This defines the structure of the FFBNN classifier.
A. NN Function Steps
1) Assign weights to every neuron except the neurons in the input layer.
2) Build the neural network with the extracted features {A1, A2, A3, A4, A5} as the input units, HUa as the hidden units, and f as the output unit.
3) The proposed bias function for the input layer is computed as,
[Equation not included in this excerpt]
The activation function for the output layer is computed as,
[Equation not included in this excerpt]
4) Compute the learning error as given below.
[Equation not included in this excerpt]
where,
LE - learning error of FFBNN.
Yn' - Desired outputs.
Zn' - Actual outputs.
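A forward pass through such a network can be sketched as follows. The bias, activation, and error equations are not reproduced in this excerpt, so the sketch assumes a weighted sum plus bias at each unit, a sigmoid activation, and a mean squared difference between desired and actual outputs as the learning error; the layer sizes and weights are hypothetical.

```python
import math


def sigmoid(x):
    """Assumed activation function: logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """One forward pass: input units -> hidden units -> single output unit f."""
    hidden = [sigmoid(b + sum(w * a for w, a in zip(ws, inputs)))
              for ws, b in zip(hidden_w, hidden_b)]
    output = sigmoid(out_b + sum(w * h for w, h in zip(out_w, hidden)))
    return hidden, output


def learning_error(desired, actual):
    """Mean squared difference between desired and actual outputs (assumed form)."""
    return sum((y - z) ** 2 for y, z in zip(desired, actual)) / len(desired)


# Hypothetical network: five input features, two hidden units, all weights zero.
features = [1.0, 0.0, 1.0, 0.0, 1.0]
hidden, f = forward(features, [[0.0] * 5, [0.0] * 5], [0.0, 0.0], [0.0, 0.0], 0.0)
error = learning_error([1.0, 0.0], [0.5, 0.5])
```

With all weights at zero every unit outputs 0.5, which makes the example easy to check by hand before real weights are trained.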
B. Learning Algorithm – Back Propagation Algorithm used for minimizing the Error
In the feed-forward neural network, the back propagation algorithm is used as the learning algorithm. Back propagation is a supervised learning technique and a generalization of the delta rule. To build the training set, it requires a dataset of the desired outputs for many inputs. Back propagation is best suited to feed-forward networks. The learning algorithm requires that the activation function used by the neurons be differentiable.
- Back propagation Algorithm Steps for FFBNN
1) The weights for the neurons of the hidden layer and the output layer are initialized by selecting them randomly, whereas the input layer has a constant weight.
2) The proposed bias function and the activation function are computed using Eqns. (3.7) and (3.8) for the FFBNN.
3) The back propagation error is determined for each node, and then the weights are updated as follows,
[Equation not included in this excerpt]
4) The weight change ΔW(n') is computed as follows.
[Equation not included in this excerpt]
where,
δ - Learning rate, which normally ranges from 0.2 to 0.5.
E(BP) - BP error
5) Steps (2) and (3) are repeated until the BP error is minimized, i.e., E(BP) < 0.1.
6) When the minimum value is reached, the FFBNN is well trained for the testing phase. The trained FFBNN classifier is then tested on the association rules using the attributes. The classification of defects is carried out, and the classifier categorizes all of the defects.
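The training loop described in these steps can be sketched as below. The update equations are not reproduced in this excerpt, so the sketch assumes the textbook delta-rule updates for a sigmoid network with one hidden layer, a learning rate of 0.5 (the upper end of the range given above), and a mean squared error as the stopping measure with the 0.1 threshold. The training data (a logical OR) and all names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_ffbnn(samples, n_hidden=3, rate=0.5, target_error=0.1,
                max_epochs=5000, seed=0):
    """Train a one-hidden-layer network with the delta rule until the error is small."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Step 1: random initial weights (the last entry of each row is the bias).
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def predict(x):
        h = [sigmoid(sum(w * a for w, a in zip(ws, x + [1.0]))) for ws in w_h]
        return h, sigmoid(sum(w * a for w, a in zip(w_o, h + [1.0])))

    error = float("inf")
    for _ in range(max_epochs):
        for x, y in samples:
            h, z = predict(x)                      # Step 2: forward pass
            delta_o = (y - z) * z * (1 - z)        # Step 3: output error term
            delta_h = [delta_o * w_o[j] * h[j] * (1 - h[j]) for j in range(n_hidden)]
            for j, a in enumerate(h + [1.0]):      # Step 4: weight updates
                w_o[j] += rate * delta_o * a
            for j in range(n_hidden):
                for i, a in enumerate(x + [1.0]):
                    w_h[j][i] += rate * delta_h[j] * a
        error = sum((y - predict(x)[1]) ** 2 for x, y in samples) / len(samples)
        if error < target_error:                   # Step 5: stopping rule
            break
    return predict, error


# Hypothetical training set: logical OR of two binary inputs.
or_samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
model, final_error = train_ffbnn(or_samples)
```

The hidden-layer error terms are computed before the output weights change, so each update uses the weights that produced the current prediction.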
3.4 Experiment Evaluation
3.4.1 Setup
The proposed defect classification system with an artificial neural network is implemented on the NetBeans platform, version 7.2 (JDK 1.7). The data are taken from the PROMISE repository, namely the KC1 software defect prediction dataset from http://promise.site.uottawa.ca/SERepository/datasets/kc1.arff. PROMISE stands for Predictor Models in Software Engineering. When working with PROMISE data sets, noise has to be handled (e.g., with outlier removal, feature subset selection, or statistical methods that deal with outliers better), and missing data has to be filled in with surrogates from other features.
The quality of the system is determined using quality metrics. The quality metrics computed for the proposed approach are as follows: Defect Density, Sensitivity, Specificity, and Accuracy.
B. Sensitivity
Sensitivity (also called the true positive rate) measures the proportion of actual positives that are correctly identified.
C. Specificity
Specificity measures the proportion of actual negatives that are correctly identified.
D. Accuracy
Accuracy is computed from both the sensitivity and specificity relations. FAR and FRR can be calculated using sensitivity and specificity.
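These metrics all derive from the four cells of a confusion matrix, as the following sketch shows. It assumes that FAR and FRR stand for the false acceptance and false rejection rates and that they are obtained as 1 - specificity and 1 - sensitivity respectively; the text does not spell this out. The labels, predictions, and the defect density example values are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy, FAR and FRR for binary labels.

    Assumes both classes occur in y_true (1 = defective, 0 = clean).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "far": 1.0 - specificity,   # assumed: false acceptance rate
        "frr": 1.0 - sensitivity,   # assumed: false rejection rate
    }


def defect_density(defects, sloc):
    """Number of defects per source line of code."""
    return defects / sloc


# Hypothetical labels and predictions for eight modules.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
guess = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(truth, guess)
density = defect_density(defects=12, sloc=4000)
```

Reporting sensitivity and specificity together guards against a classifier that looks accurate only because one class dominates the data.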
[Figure not included in this excerpt]
The existing approach used for comparison is a software defect classification method based on association rule mining, the ABC algorithm, and a Naive Bayes classifier. Figure 3.8 shows that the accuracy of the proposed approach is higher than that of the existing method. The computed performance measures thus show that the proposed approach is more effective than the existing method.
REFERENCES
[1] L. C. Briand, K. El Emam, B. G. Freimut, O. Laitenberger, "A Comprehensive Evaluation of Capture-Recapture Models for Estimating Software Defect Content", IEEE Transactions on Software Engineering, Vol. 26, No. 6, June 2000.
[2] P. M. Johnson, "Does Every Inspection Really Need a Meeting?", Empirical Software Eng. J., vol. 3, no. 3, pp. 9-35, 1998.
[3] L. Votta, "Does Every Inspection Need a Meeting?" ACM Software Eng. Notes, vol. 18, no. 5, pp. 107-114, Dec. 1993.
[4] M Hamill and K Goseva-Popstojanova, "Common Trends in Software Fault and Failure Data" IEEE Transactions On Software Engineering, Vol. 35, No. 4, July/August 2009.
[5] J. Tian, "Software Quality Engineering: Testing, Quality Assurance, and Quantifiable Improvement", John Wiley & Sons, 2005.
[6] M. Young and R.N. Taylor, "Rethinking the Taxonomy of Fault Detection Techniques", Proc. Int'l Conf. Software Eng., pp. 53-62, 1989.
[7] V. R. Basili, S. Green, O. Laitenberger, F. Lanubile, F. Shull, S. Soerumgaard, and M. Zelkowitz, "The Empirical Investigation of Perspective-Based Reading", Empirical Software Eng. J., vol. 1, no. 2, pp. 133-164, 1996.
[8] A Porter, L. Votta, and V. Basili, "Comparing Detection Methods for Software Requirements Inspections: a Replicated Experiment", IEEE Trans. Software Eng., vol. 21, no. 6, pp. 563-575, June 1995.
[9] A. Porter, H. Siy, A. Mockus, and L. Votta, "Understanding the Sources of Variation in Software Inspections", ACM Trans. Software Eng. and Methodology, vol. 7, no. 1, pp. 41-79, Jan. 1998.
[10] M Cinque, D Cotroneo, and A Pecchia, "Event Logs for the Analysis of Software Failures: A Rule-Based Approach", IEEE Transactions On Software Engineering, Vol. 39, No. 6, June 2013.
[11] A. Güneş Koru, D. Zhang, K. El Emam, and H. Liu, "An Investigation into the Functional Form of the Size-Defect Relationship for Software Modules", IEEE Transactions on Software Engineering, Vol. 35, No. 2, March/April 2009.
[12] Stefan Biffl and Michael Halling, "Investigating the Defect Detection Effectiveness and Cost Benefit of Nominal Inspection Teams" IEEE Transactions On Software Engineering, Vol. 29, No. 5, May 2003.
[13] S Shivaji, E. J Whitehead Jr., R Akella and S Kim, "Reducing Features to Improve Code Change-Based Bug Prediction", IEEE Transactions On Software Engineering, Vol. 39, No. 4, April-2013.
[14] C F. Kemerer and Mark C. Paulk, "The Impact of Design and Code Reviews on Software Quality: An Empirical Study Based on PSP Data", IEEE Transactions On Software Engineering, Vol. 35, No. 4, July/August 2009.
[15] A. Ackerman, L. Buchwald, and F. Lewski, "Software Inspections: An Effective Verification Process", IEEE Software, vol. 6, no. 3, pp. 31-36, 1989.
[16] G. Bronevetsky, I. Laguna, B. R. de Supinski, S. Bagchi, "Automatic fault characterization via abnormality-enhanced classification", Dependable Systems and Networks (DSN), 2012 42nd Annual IEEE/IFIP International Conference on, pp. 1-12, 25-28 June 2012.
[17] A Monden, T Hayashi, S Shinoda, K Shirai, J Yoshida, M Barker and K Matsumoto, "Assessing the Cost Effectiveness of Fault Prediction in Acceptance Testing", IEEE Transactions on Software Engineering, DOI-098-5589, 2013.
[18] Fadi Wedyan, Dalal Alrmuny and James M. Bieman, "The Effectiveness of Automated Static Analysis Tools for Fault Detection and Refactoring Prediction", ICST '09. International Conference, pp. 141-150, 1-4 April 2009.
[19] S Liu, Y Chen, F Nagoya and J A. McDermid, "Formal Specification-Based Inspection for Verification of Programs", IEEE Transactions on Software Engineering, Vol. 38, No. 5, September/October 2012.
[20] David Lo, Hong Cheng, Jiawei Han, SiauCheng Khoo and Chengnian Sun, "Classification of Software Behaviors for Failure Detection: A Discriminative Pattern Mining Approach", KDD '09 Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. Pages 557-566 ACM, USA, 2009.
[21] M Hamill and K Goseva-Popstojanova, "Common Trends in Software Fault and Failure Data" IEEE Transactions on Software Engineering, Vol. 35, No. 4, July/August 2009.
[22] A. Thakur and R.K. Iyer, "Analyze-NOW - An Environment for Collection and Analysis of Failures in a Network of Workstations", IEEE Trans. Reliability, vol. 45, no. 4, pp. 561-570, Dec. 1996.
[23] V. Challagulla, F. Bastani, I. Yen, and R. Paul, "Empirical Assessment of Machine Learning Based Software Defect Prediction Techniques", Proc. IEEE 10th Int’l Workshop Object-Oriented Real-Time Dependable Systems, pp. 263-270, 2005.
[24] S Shivaji, E. J Whitehead Jr., R Akella and S Kim, "Reducing Features to Improve Code Change-Based Bug Prediction", IEEE Transactions on Software Engineering, Vol. 39, No. 4, April-2013.
[25] B. S. Lerner, S Christov, L J. Osterweil, R Bendraou, U Kannengiesser and A Wise, "Exception Handling Patterns for Process Modeling", IEEE Transactions On Software Engineering, Vol. 36, No. 2, March/April 2010.
[26] OMG, Unified Modelling Language, Superstructure Specification, Version 2.1.1, http://www.omg.org/spec/UML/2.1.1/Superstructure/PDF/, 2010.
[27] A. Wise, "Little-JIL 1.5 Language Report", technical report, Dept. of Computer Science, Univ. of Massachusetts, 2006.
[28] Orthogonal Defect Classification – A concept for In-Process Measurements, IEEE Transactions on Software Engineering, SE-18.p.943-956.
[29] J Zheng, L Williams, N Nagappan, W Snipes, J P. Hudepohl and M A. Vouk, "On the Value of Static Analysis for Fault Detection in Software", IEEE Transactions on Software Engineering, Vol. 32, No. 4, April 2006.
[30] T. Khoshgoftaar and E. Allen, "Predicting the Order of Fault-Prone Modules in Legacy Software", Proc. Int’l Symp. Software Reliability Eng., pp. 344-353, 1998.
[31] T. Khoshgoftaar and E. Allen, "Ordering Fault-Prone Software Modules", Software Quality J., vol. 11, no. 1, pp. 19-37, 2003.
[32] L.C. Briand, J. Wüst, S.V. Ikonomovski, and H. Lounis, "Investigating Quality Factors in Object-Oriented Designs: An Industrial Case Study", Proc. Int’l Conf. Software Eng., pp. 345-354, 1999.
[33] S. Morasca and G. Ruhe, "A Hybrid Approach to Analyze Empirical Software Engineering Data and Its Application to Predict Module Fault-Proneness in Maintenance", J. Systems Software, vol. 53, no. 3, pp. 225-237, 2000.
[34] D. Tang, M. Hecht, J. Miller, and J. Handal, "Meadep: A Dependability Evaluation Tool for Engineers", IEEE Trans. Reliability, vol. 47, no. 4, pp. 443-450, Dec. 1998.
[35] R. Vaarandi, "SEC—A Lightweight Event Correlation Tool", Proc. Workshop IP Operations and Management, 2002.
[36] J.P. Rouillard, "Real-Time Log File Analysis Using the Simple Event Correlator (SEC)", Proc. USENIX Systems Administration Conf., 2004.
[37] J.P. Hansen and D.P. Siewiorek, "Models for Time Coalescence in Event Logs", Proc. Int’l Symp. Fault-Tolerant Computing, pp. 221-227, 1992.
[38] Y. Liang, Y. Zhang, A. Sivasubramaniam, M. Jette, and R.K. Sahoo, "Bluegene/L Failure Analysis and Prediction Models", Proc. Int’l Conf. Dependable Systems and Networks, pp. 425-434, 2006.
[39] A. Pecchia, D. Cotroneo, Z. Kalbarczyk, and R.K. Iyer, "Improving Log-Based Field Failure Data Analysis of Multi-Node Computing Systems", Proc. Int’l Conf. Dependable Systems and Networks, pp. 97-108, 2011.
[40] D. Yuan, J. Zheng, S. Park, Y. Zhou, and S. Savage, "Improving Software Diagnosability via Log Enhancement", Proc. Int’l Conf. Architectural Support for Programming Languages and Operating Systems, pp. 3-14, 2011.
[41] J.A. Duraes and H.S. Madeira, "Emulation of Software Faults: A Field Data Study and a Practical Approach", IEEE Trans. Software Eng., vol. 32, no. 11, pp. 849-867, Nov. 2006.
[42] N. Ohlsson, and H. Alberg, "Predicting fault-prone software modules in telephone switches", IEEE Trans. Software Engineering, vol. 22, no. 12, pp. 886-894, 1996.
[43] T. J. Ostrand, E. J. Weyuker, and R. M. Bell, "Predicting the location and number of faults in large software systems", IEEE Trans. on Software Engineering, vol. 31, no. 4, pp. 340-355, 2005.
[44] A. Tosun, B. Turhan, and A. Bener, "Practical considerations in deploying AI for defect prediction: a case study within the Turkish telecommunication industry", Proc. 5th Int’l Conf. on Predictor Models in Software Engineering (PROMISE’09), pp. 1-9, 2009.
[45] P. L. Li, J. Herbsleb, M. Shaw, and B. Robinson, "Experiences and results from initiating field defect prediction and product test prioritization efforts at ABB Inc.", Proc. 28th Int’l Conf. on Software Engineering, pp. 413-422, 2006.
[46] T. Mende and R. Koschke, "Revisiting the evaluation of defect prediction models", Proc. Int’l Conference on Predictor Models in Software Engineering (PROMISE’09), pp. 1–10, 2009.
[47] Y. Kamei, S. Matsumoto, A. Monden, K. Matsumoto, B. Adams, and A. E. Hassan, "Revisiting common bug prediction findings using effort aware models", Proc. 26th IEEE Int’l Conference on Software Maintenance (ICSM2010), pp. 1-10, 2010.
[48] C F. Kemerer and Mark C. Paulk, "The Impact of Design and Code Reviews on Software Quality: An Empirical Study Based on PSP Data", IEEE Transactions on Software Engineering, Vol. 35, No. 4, July/August 2009.
[49] Hafiz Ansar Khan, "Establishing a Defect Management Process Model for Software Quality Improvement", International Journal of future Computer and Communication, Vol. 2, No. 6, 2013.
[50] Mika V. Mäntylä and Casper Lassenius, "What Types of Defects are really Discovered in Code Reviews?", IEEE Transactions on Software Engineering, Vol. 35, 2009.
[51] Stefan Wagner, "Defect Classification and Defect Types Revisited", ACM journal on software testing, 2008.
[52] Capers Jones, "Software Defect Origins and Removal Methods", 2012.
[53] Yuan Jiang, Ming Li and Zhi-Hua Zhou, "Software Defect Detection with ROCUS", Journal of Computer Science and Technology, 2011.
[54] Muhammad Faizan, Sami Ulhaq and M.N.A. Khan, "Defect Prevention and Process Improvement Methodology for Outsourced Software Projects", Middle-East Journal of Scientific Research, Vol. 19, No. 5, pp. 674-682, 2014.
[55] Stefan Wagner, "A Model and Sensitivity analysis of the Quality Economics of Defect-Detection Techniques", ACM journal on software testing, 2006.
[56] Audris Mockus and Nachiappan Nagappan, "Test Coverage and Post-Verification Defects: A Multiple Case Study", Proceedings of the Third International Symposium on Empirical Software Engineering and Measurement, ESEM, pp. 291-301, 2009.
[57] Iraj Hirmanpour and Joe Schofield, "Defect Management through the Personal Software Process", The Journal of Defense Software Engineering, 2003.
[58] Muhammad Faizan, Muhammad Naeem Ahmed Khan and Sami Ulhaq, "Contemporary Trends in Defect Prevention: A Survey Report", International Journal on Modern Education and Computer Science, Vol. 3, pp. 14-20, 2012.
[59] Prakriti Trivedi and Som Pachori, "Modelling and Analysing of Software Defect Prevention Using ODC", International Journal of Advanced Computer Science and Applications, Vol. 1, No. 3, 2010.
[60] V. Suma and T.R. Gopalakrishnan Nair, "Effective Defect Prevention Approach in Software Process for Achieving Better Quality Levels", World Academy of Science, Engineering and Technology, 2008.
[61] V. Suma, "Defect Prevention Approaches in Medium Scale IT Enterprises", National Conference on Recent Research Trends in Information Technology, 2008.
[62] Pankaj Jalote and Naresh Agrawal, "Using Defect analysis Feedback for Improving Quality and Productivity in Iterative Software Development", In proceedings of ITI 3rd International Conference on Information and Communications Technology, pp. 703-713, 2007.
[63] Abhiraja Sharma, Naveen Hemrajani, Savita Shiwani and Ruchi Dave, "Defect Prevention Technique in Test Case of Software Process for Quality Improvement", International Journal on Computer Technology Applications, Vol. 3, No. 1, pp. 56-61, 2012.
[64] Eric Bean, "Defect Prevention and Detection in Software for Automated test Equipment", IEEE Transactions on Instrumentation and Measurement, Vol. 11, No. 4, pp. 16-23, 2008.
[65] Sakthi Kumaresh and Baskaran Ramachandran, "Defect Prevention based on 5 Dimensions of Defect Origin", International Journal of Software Engineering & Applications (IJSEA), Vol. 3, No. 4, 2012.
[66] Sayed Mehran Sharafi, "SHADD: A scenario-based approach to software architectural defects detection", Elsevier Journal of Advances in Engineering Software, 2012.
[67] Arpita Mittal and Sanjay Kumar Dubey, "Defect Handling in Software Metrics", International Journal of Advanced Research in Computer and Communication Engineering, Vol. 1, May 2012.
[68] T.R. Gopalakrishnan Nair, V. Suma and P. Kumar Tiwari, "Significance of Depth of Inspection and Inspection Performance Metrics for Consistent Defect Management in Software Industry", Journal of IET software, 2012.
[69] Marcos Kalinowski, David N. Card and Guilherme H. Travassos, "Evidence-Based Guidelines to Defect Causal Analysis", IEEE Transactions on Software Engineering, 2012.
[70] T.R. Gopalakrishnan Nair and R. Selvarani, "Defect Proneness Estimation and Feedback approach for Software design quality Improvement", Elsevier Journal of Information and Software technology, 2012.
[71] Shivkumar Shivaji, E. James Whitehead Jr., Ram Akella, and Sunghun Kim, "Reducing Features to Improve Code Change-Based Bug Prediction", IEEE Transactions On Software Engineering, Vol. 39, No. 4, April 2013.
[72] M Cinque, D Cotroneo, and A Pecchia, "Event Logs for the Analysis of Software Failures: A Rule-Based Approach", IEEE Transactions on Software Engineering, Vol. 39, No. 6, June 2013.
[73] D.L. Parnas and M. Lawford, "The Role of Inspection in Software Quality Assurance," IEEE Trans. Software Eng., vol. 29, no. 8, pp. 674-676, Aug. 2003.
[74] A Porter, L. Votta, and V. Basili, "Comparing Detection Methods for Software Requirements Inspections: a Replicated Experiment", IEEE Trans. Software Eng., vol. 21, no. 6, pp. 563-575, June 1995.
[75] J Zheng, L Williams, N Nagappan, W Snipes, J P. Hudepohl and M A. Vouk, "On the Value of Static Analysis for Fault Detection in Software", IEEE Transactions on Software Engineering, Vol. 32, No. 4, April 2006.
[76] S Liu, Y Chen, F Nagoya and J A. McDermid, "Formal Specification-Based Inspection for Verification of Programs", IEEE Transactions on Software Engineering, vol. 38, no. 5, September/October 2012.
[77] F Padberg, T Ragg, and R Schoknecht, "Using Machine Learning for Estimating the Defect Content After an Inspection", IEEE Transactions on Software Engineering, Vol. 30, No. 1, January 2004.
[78] A. Ackerman, L. Buchwald, and F. Lewski, "Software Inspections: An Effective Verification Process", IEEE Software, vol. 6, no. 3, pp. 31-36, 1989.
[79] O. Laitenberger and J.-M. DeBaud, "An Encompassing Life Cycle Centric Survey of Software Inspection", J. Systems and Software, vol. 50, no. 1, pp. 5-31, Jan. 2000.
[80] M.E. Fagan, "Design and Code Inspections to Reduce Errors in Program Development", IBM Systems J., vol. 15, no. 3, pp. 182-211, 1976.
[81] L. C. Briand, K. E.Emam, B. G. Freimut, O. Laitenberger, "A Comprehensive Evaluation of Capture-Recapture Models for Estimating Software Defect Content", IEEE Transactions on Software Engineering, Vol. 26, No. 6, June 2000.
[82] M. Young and R.N. Taylor, "Rethinking the Taxonomy of Fault Detection Techniques", Proc. Int'l Conf. Software Eng., pp. 53-62, 1989.
[83] A. Porter and L. Votta, "An Experiment to Assess Different Defect Detection Methods for Software Requirements Inspections", Proc. Int'l Conf. Software Eng., pp. 103-112, 1994.
[84] Y. Kamei, S. Matsumoto, A. Monden, K. Matsumoto, B. Adams, and A. E. Hassan, "Revisiting common bug prediction findings using effort aware models", Proc. 26th IEEE Int'l Conference on Software Maintenance (ICSM2010), pp. 1-10, 2010.
[85] M Hamill and K Goseva-Popstojanova, "Common Trends in Software Fault and Failure Data", IEEE Transactions on Software Engineering, Vol. 35, No. 4, July/August 2009.
[86] W. Bush, J. Pincus, and D. Sielaff, "A Static Analyzer for Finding Dynamic Programming Errors," Software Practice and Experience, vol. 30, no. 7, pp. 775-802, June 2000.
[87] P.M. Johnson and D. Tjahjono, "Assessing Software Review Meetings: A Controlled Experimental Study Using CSRS", Technical Report 96-06, Dept. of Information and Computer Sciences, Univ. of Hawaii, csdl.ics.hawaii.edu, 1996.
[88] A Monden, T Hayashi, S Shinoda, K Shirai, J Yoshida, M Barker and K Matsumoto, "Assessing the Cost Effectiveness of Fault Prediction in Acceptance Testing", IEEE Transactions on Software Engineering, DOI-098-5589, 2013.
[89] N. Fenton and M. Neil, "A Critique of Software Defect Prediction Models", IEEE Trans. Software Eng., vol. 25, no. 5, pp. 675-689, Sept./Oct. 1999.
[90] V. Challagulla, F. Bastani, I. Yen, and R. Paul, "Empirical Assessment of Machine Learning Based Software Defect Prediction Techniques", Proc. IEEE 10th Int'l Workshop Object-Oriented Real-Time Dependable Systems, pp. 263-270, 2005.
[91] V. Suma and T. R. Gopalakrishnan Nair, "Effective Defect Prevention Approach in Software Process for Achieving Better Quality Levels", Proceedings of World Academy of Science, Engineering and Technology, Vol. 32, August 2008.
[92] C. P. Chang and C. P. Chu, "Defect prevention in software processes: An action based approach", The Journal of Systems and Software, vol. 80, pp. 559-570, 2007.
[93] M. Li, H. Xiaoyuan and A. Sontakke, "Defect Prevention: A General Framework and Its Application", Proceedings of the sixth International Conference on Quality Software (QSIC'06), IEEE Computer Society, 2006.
[94] V.R. Basili, S. Green, O. Laitenberger, F. Lanubile, F. Shull, S. Soerumgaard, and M. Zelkowitz, "The Empirical Investigation of Perspective-Based Reading", Empirical Software Eng. J., vol. 1, no. 2, pp. 133-164, 1996.
[95] D. Hovemeyer and W. Pugh, "Finding Bugs is Easy", Proc. Conf. Object Oriented Programming Systems Languages and Applications (OOPSLA) Companion, pp. 132-135, 2004.
[96] Mohamad Mahdi Askari and Vahid Khatibi Bardsiri, "Software Defect Prediction using a High Performance Neural Network", International Journal of Software Engineering and Its Applications, Vol. 8, No. 12, pp. 177-188, 2014.
[97] Issam H. Laradji, Mohammad Alshayeb and Lahouari Ghouti, "Software defect prediction using ensemble learning on selected features", Information and Software Technology, 2014.
[98] Hafsa Zafar, Zeeshan Rana, Shafay Shamail and Mian M Awais, "Finding Focused Itemsets from Software Defect Data", In Proc. of 15th International Multitopic Conference (INMIC), 2012.
[99] Wanjiang Han, Lixin Jiang, Tianbo Lu, Xiaoyan Zhang and Sun Yi, "Study on Residual Defect Prediction using Multiple Technologies", Journal of Advances In Information Technology, vol. 5, no. 3, 2014.
[100] Pritam H. Patil, Suvarna Thube, Bhakti Ratnaparkhi and K. Rajeswari, "Analysis of Different Data Mining Tools using Classification, Clustering and Association Rule Mining", International Journal of Computer Applications, Vol. 93, No. 8, pp. 0975–8887, 2014.
[101] Hui Wang, "Software Defects Classification Prediction Based On Mining Software Repository", Thesis, 2014.
[102] Lin Lin, Mei-Ling Shyu and Shu-Ching Chen, "Association rule mining with a correlation-based interestingness measure for video semantic concept detection", Int. J. Information and Decision Sciences, 2009.
[103] Gabriela Czibula, Zsuzsanna Marian and Istvan Gergely Czibula, "Software defect prediction using relational association rule mining", Information Sciences, Vol. 264, pp. 260–278, 2014.
[104] Naheed Azeem and Shazia Usmani, "Analysis of Data Mining Based Software Defect Prediction Techniques", Global Journal of Computer Science and Technology, Vol. 11, No. 16, 2011.
[105] S. Vijayarani and M. Sathiya Prabha, "Association Rule Hiding using Artificial Bee Colony Algorithm", International Journal of Computer Applications, Vol. 33, No. 2, 2011.
[106] Pooja Paramshetti and D. A. Phalke, "Survey on Software Defect Prediction Using Machine Learning Techniques", International Journal of Science and Research (IJSR), Vol. 3, No. 12, 2014.
[107] Remya Kartha K and Vikraman Nair R, "Data Mining for Causal Analysis of Software Defects", International Journal of Computer Science and Mobile Computing, pp. 1-7, 2013.
[108] Zsuzsanna Marian, "On The Software Metrics Influence In Relational Association Rule-Based Software Defect Prediction", Studia Univ. Babes Bolyai, Informatica, Vol. LVIII, No. 4, 2013.
[109] T. Karthikeyan and N. Ravikumar, "A Survey on Association Rule Mining", International Journal of Advanced Research in Computer and Communication Engineering, Vol. 3, No. 1, 2014.
[110] Safia Yasmeen, "Software Bug Detection Algorithm using Data mining Techniques", International Journal of Innovative Research in Advanced Engineering (IJIRAE), Vol. 1, No. 5, 2014.
[111] CH. Sekhar and S Reshma Anjum, "Cloud Data Mining based on Association Rule", International Journal of Computer Science and Information Technologies, Vol. 5, No. 2, pp. 2091-2094, 2014.
[112] Byoung-Jun Park, Sung-Kwun Oh and Witold Pedrycz, "The design of polynomial function-based neural network predictors for detection of software defects", Information Sciences, Vol. 229, pp. 40–57, 2013.
[113] Software Defect Dataset, PROMISE REPOSITORY, http://promise.site.uottawa.ca/SERepository/datasetspage.html, December 4, 2013.
[114] J. Wang, B. Shen and Y. Chen, "Compressed C4.5 Models for Software Defect Prediction", IEEE 12th International Conference on Quality Software (QSIC), pp. 13-16, August 2012.
[115] Q. Song, Z. Jia, M. Shepperd, Shi Ying, and Jin Liu, "A general software defect-proneness prediction framework", Software Engineering, IEEE Transactions on, 37(3):356-370, 2011.
[116] T. Galinac Grbac, P. Runeson, and D. Huljenic, "A second replicated quantitative analysis of fault distributions in complex software systems", IEEE Trans. Softw. Eng., 39(4):462-476, Apr. 2013.
[117] Haghighi, A. A. S., Dezfuli, M. A., and Fakhrahmad, S. M., "Applying mining schemes to software fault prediction: A proposed approach aimed at test cost reduction", In Proceedings of the World Congress on Engineering, pp.415-419, 2012.
[118] E. Arisholm, L. C. Briand, and E. B. Johannessen, "A systematic and comprehensive investigation of methods to build and evaluate fault prediction models", J. Syst. Softw., 83(1):2-17, Jan. 2010.
[119] A. Okutan and O. T. Yildiz, "Software defect prediction using Bayesian networks", Empirical Software Engineering, pp. 1-28, 2012.
[120] Z. Yan, X. Chen and P. Guo, "Software Defect Prediction Using Fuzzy Support Vector Regression", Springer-Verlag Berlin Heidelberg, 2010.
[121] R Geoff Dromey, "Software Control Quality - Prevention Versus Cure?", Software Quality Journal, Volume 11, Issue 3, pp. 197-210, July 2003.
[122] Kaur S, Kumar D, "Software fault prediction in object-oriented software systems using density based clustering approach", International Journal of Research in Engineering and Technology (IJRET), 1(2):111-7, Mar-2012.
[123] Shepperd, M., Song, Q., Sun, Z., and Mair, C., "Data Quality: Some Comments on the NASA Software Defect Data Sets", IEEE Transactions on Software Engineering, pp.1208-1215, 2013.
[124] H. Najadat and I. Alsmadi, "Enhance Rule-Based Detection for Software Fault-Prone Modules", International Journal of Software Engineering and Its Applications, Vol. 6, No. 1, January 2012.
[125] T. Hall, S. Beecham, D. Bowes, D. Gray, and S. Counsell, "A systematic literature review on fault prediction performance in software engineering", IEEE Trans. Softw. Eng., 38(6):1276-1304, Nov. 2012.
[126] H. Can, X. Jianchun, Z. R. L. Juelong, Y. Quiliang and X. Liqiang, "A new model for software defect prediction using particle swarm optimization and support vector machine", IEEE 25th Chinese Control and Decision Conference (CCDC), 2013.
[127] S. Kim, T. Zimmermann, E. J. Whitehead Jr., and A. Zeller, "Predicting faults from cached history", In Proceedings of the 29th international conference on Software Engineering, ICSE '07, pages 489-498, 2007.
[128] B. Turhan, T. Menzies, A. B. Bener, and J. Di Stefano, "On the relative value of cross-company and within-company data for defect prediction", Empirical Softw. Eng., 14:540-578, October 2009.
[129] R. Moser, W. Pedrycz, and G. Succi, "A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction", ACM/IEEE 30th International Conference on Software Engineering,ICSE'08, pages 181-190, 2008.
[130] H. Zhang, X. Zhang, and Ming Gu, "Predicting defective software components from code complexity measures", IEEE In Dependable Computing Pacific Rim International Symposium on, pages 93-96, 2007.
[131] S. Lessmann, B. Baesens, C. Mues, and S. Pietsch, "Benchmarking classification models for software defect prediction: A proposed framework and novel findings", IEEE Transactions on Software Engineering, 34(4):485-496, 2008.
[132] N. Ohlsson and H. Alberg, "Predicting fault-prone software modules in telephone switches", IEEE Trans. Softw. Eng., 22(12):886-894, Dec. 1996.
Frequently asked questions
What is the main topic of this text, and what is it about?
This document is a preview that includes the title, table of contents, objectives and key themes, chapter summaries, and keywords. It focuses on software fault detection, prevention, and prediction techniques, with an emphasis on improving software quality through various methodologies.
What is covered in Chapter 1: INTRODUCTION?
Chapter 1 gives an overview of the thesis. It explains the current state of fault occurrences in software development, discusses the research motivation, defines the problem, states the research objectives, and lists the research contributions. It concludes by presenting the organization of the thesis.
What are some key points discussed in Section 1.1 Overview?
This section emphasizes that all software contains faults and that a better understanding of the relationship between faults and failures is essential for effective detection and prevention. It defines failures and faults, noting that faults are problems developers see, while failures are problems users experience. Faults can be introduced at any phase of the software life cycle.
What are the research motivations behind this study as discussed in Section 1.2?
The research is motivated by three factors: the inevitable occurrence of defects in human-based software engineering activities, the need to enhance software quality and reliability, and the limitations of existing inspection approaches and defect detection techniques.
What problems are defined in Section 1.3?
This section outlines problems with the variation in effectiveness of inspections, the lack of empirical information available on the effort required for effective inspection techniques, and the unreliability of current techniques for estimating defect content after an inspection. Also highlighted is the weakness of current software defect prediction models, focusing on their inability to cope with the unknown relationship between defects and failures.
What are the objectives of the research mentioned in Section 1.4?
The main objective is to present enhanced approaches for inspection and defect detection to ensure every functional scenario defined in a specification is implemented correctly. It explores the adequacy, accuracy, scalability, and uncertainty of architecture-based software reliability models.
What are the key contributions of this research as outlined in Section 1.5?
The thesis contributes four novel mechanisms for fault reduction based on software defect classification and fault prediction. This includes an Adaptive PSO based Association Rule Mining Technique, a Fault Prediction Approach based on the Probabilistic Model for Improvising Software Inspection, Defect Classification using Relational Association Rule Mining Based on Fuzzy Classifier, and a Rule-based Prediction Method for Defect Detection.
What topics are addressed in Chapter 2: BACKGROUND STUDY?
Chapter 2 explores current practices for software fault detection and prevention mechanisms, their benefits, and limitations. It examines approaches to improve defect prediction models and related work in the field.
What are the main topics covered in Section 2.2 Software Fault Detection Mechanism?
This section discusses various fault detection activities performed during software development. It focuses on techniques such as Automated Static Analysis, Graph Mining, Classifiers, and Pattern Mining for failure detection.
What are the categories included in Fault Detection?
The categories are:
- Detection Using Automated Static Analysis
- Detection Using Graph mining
- Detection Using Classifiers
- Detection Using Pattern Mining
What does the section on Software Fault Prevention Mechanism cover?
This section emphasizes the importance of preventing faults during software development and describes fault prevention as a quality improvement process. It covers fault identification, classification, analysis, and prevention activities.
What are the benefits and limitations of Fault Prevention as described in Section 2.4?
Fault prevention can be cost-effective, especially if implemented early in the development cycle. However, it may be limited by a lack of specific domain knowledge, labor-intensive inspection processes, and the potential absence of well-developed quality measurements.
What is discussed in Section 2.5 regarding defect prediction?
This section addresses improving defect prediction models. It focuses on three issues that are commonly overlooked: the noise generated by issue report mislabeling, the parameter settings of classification techniques, and the choice of the most accurate and reliable model validation techniques.
What is discussed in Chapter 3: Adaptive PSO Based Association Rule Mining Technique for Software Defect Classification using ANN?
Chapter 3 proposes a system that categorizes defects with a classification approach based on association rule mining, which is applied to identify the actual defects. It notes that association rule mining algorithms at times produce useless rules.
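To make the idea of useless rules concrete, the following is a minimal sketch of how candidate rules of the form "attributes imply defect" can be scored by support and confidence and filtered by thresholds. The records, attribute names, thresholds, and the helper names `support`, `confidence`, and `mine_rules` are all hypothetical; this is not the thesis's algorithm or data.

```python
from itertools import combinations

# Hypothetical module records: each record is the set of attributes observed.
records = [
    {"high_complexity", "no_review", "defect"},
    {"high_complexity", "no_review", "defect"},
    {"high_complexity", "reviewed"},
    {"low_complexity", "no_review", "defect"},
    {"low_complexity", "reviewed"},
]

def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= r for r in records) / len(records)

def confidence(antecedent, consequent, records):
    """support(A and C) / support(A); 0.0 if the antecedent never occurs."""
    s_a = support(antecedent, records)
    if s_a == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), records) / s_a

def mine_rules(records, target, min_support, min_confidence):
    """Keep rules `antecedent -> target` (1 or 2 items) that pass both thresholds."""
    items = sorted(set().union(*records) - {target})
    rules = []
    for size in (1, 2):
        for antecedent in combinations(items, size):
            s = support(set(antecedent) | {target}, records)
            c = confidence(antecedent, {target}, records)
            if s >= min_support and c >= min_confidence:
                rules.append((antecedent, s, c))
    return rules

rules = mine_rules(records, "defect", min_support=0.4, min_confidence=0.8)
for antecedent, s, c in rules:
    print(antecedent, "-> defect", f"support={s:.2f}", f"confidence={c:.2f}")
```

On this toy data, a rule such as `high_complexity -> defect` is dropped because its confidence is only about 0.67. How the thresholds themselves should be chosen is the kind of question an optimizer can be applied to.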
What are the steps mentioned in Section 3.3 Proposed Methodology?
The proposed method uses frequent itemset mining, Adaptive Particle Swarm Optimization (APSO), and an Artificial Neural Network (ANN) for classification. The method has multiple parts:
- Frequent Item Set Mining
- Adaptive Particle Swarm Optimization (APSO)
- Classification Phase using Artificial Neural Network
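The optimization step in the list above can be sketched as a small particle swarm optimizer. This preview does not state the thesis's adaptation rule, so the linearly decreasing inertia weight used here is an assumption (one common adaptive variant). The objective is also a hypothetical stand-in that merely prefers a (minimum support, minimum confidence) pair near (0.3, 0.8). The function name `apso_minimize` and all parameter values are illustrative.

```python
import random

def apso_minimize(objective, bounds, n_particles=20, n_iters=200, seed=0):
    """Minimize `objective` over box `bounds` with PSO and a decaying inertia weight."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # best position per particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # best position of the swarm
    w_max, w_min, c1, c2 = 0.9, 0.4, 1.5, 1.5        # assumed, illustrative values
    for t in range(n_iters):
        # Assumed adaptation: inertia decreases linearly from w_max to w_min.
        w = w_max - (w_max - w_min) * t / max(1, n_iters - 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))  # clamp to bounds
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_objective(x):
    """Hypothetical fitness: squared distance from (0.3, 0.8)."""
    return (x[0] - 0.3) ** 2 + (x[1] - 0.8) ** 2

best, best_val = apso_minimize(toy_objective, bounds=[(0.0, 1.0), (0.0, 1.0)])
print(best, best_val)
```

In a real pipeline the objective would score the rule set mined at the candidate thresholds, and the selected rules would then feed the ANN classifier; both parts are omitted here.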
Cite this work: Dhana Laxmi (Author), 2019, Fault Prediction Approach, Munich, GRIN Verlag, https://www.hausarbeiten.de/document/464418