Market Basket Analysis Essay

Discovery of association rules in laboratory test data for Medall diagnostic centers using Market Basket Analysis

Abstract

Diagnostic services play an important role in medical decision-making. By some estimates, 70% of all decisions related to a patient’s diagnosis and treatment, including hospital admission and discharge, are based on laboratory test results. No diagnosis is made from a single laboratory test result; each result is interpreted and judged together with other (previous) test results. This project uses Market Basket Analysis (MBA) to understand which types of laboratory tests physicians refer based on their specialization, and which types of tests are referred together, for the Medall diagnostic center.

Introduction

Market basket analysis is a data mining technique used for knowledge discovery in databases (KDD), a process that includes selection of data, preprocessing, transformation, data mining and interpretation. MBA is most commonly used in the retail industry to understand the purchasing behavior of customers and to uncover which items are frequently bought together.

For example, consider a supermarket that records all the purchases made by each customer in its databases. This data is generally referred to as market-basket data. By analyzing it, the supermarket may be able to identify which sets of items are likely to be purchased together by a customer. With this insight, the supermarket could, for example, place those items on shelves next to each other so that shopping in the store becomes easier for the customer.

In this project, we use market basket analysis on Medall’s laboratory test data to produce knowledge that is relevant and actionable. The aim is to help the Medall diagnostic center identify association rules that reveal relationships among different types of tests, and to determine which test or combination of tests is likely to be referred by physicians of which specialization and from which region. The raw data consists of millions of laboratory test names referred by physicians from different regions. Each region may have physicians of different specializations, and each physician may have several line items or records, one for each laboratory test referred for processing at the Medall diagnostic center.

The outcome of the analysis will therefore assist Medall in designing its services: positioning sample collection centers in different regions, customizing offerings based on physician specialization and region, and creating focused marketing strategies that result in increased revenue. However, the physician’s judgement and experience remain important in diagnostic investigation, and this project does not ignore them.

This study is organized in three parts. In the first part, a comprehensive literature review is given, defining market basket analysis and the measures available for evaluating the association between items. In the second part, we discuss our approach for identifying association rules in Medall’s laboratory test data.
Finally, in the third part, we discuss the rules discovered by our approach, followed by conclusions and directions for future work.

Part 1: Market Basket Analysis (MBA) basic concepts

Market basket analysis, also known as association rule mining or affinity analysis, is a data mining technique that determines which products or items customers purchase together and identifies the strength of association between those products. The data is taken at the transaction level and analyzed to produce a list of items bought by a customer in a single purchase. The technique determines which items were purchased with which other item(s). Item association does not mean cause and effect; it is a measure of co-occurrence. The set of items a customer buys is referred to as an item set, and market basket analysis seeks to find relationships between purchases. These relationships are then used to build profiles containing If-Then rules about the items purchased. The rules are probabilistic in nature, as they are derived from the frequencies of co-occurrence observed in purchases.

The rules can be written as: If {Item A} Then {Item B}

The IF part of the rule ({Item A}) is the antecedent, which is the condition, and the THEN part of the rule ({Item B}) is the consequent, which is the result. Both A and B are item sets, and the rule states that a customer who bought item A is likely to buy item B as well, with a conditional probability of C%, where C is the confidence value of the rule.

Understanding the presence, nature and strength of association rules

The measures used to understand the presence, nature and strength of association between items are lift, support and confidence.
All three measures are used because they complement each other and provide non-redundant information.

Lift is calculated first, as it indicates whether an association exists between the products and whether that association is positive or negative. Once the existence of an association is established through the value of lift, the next step is to calculate support, which is the actual probability that one set of items co-occurs with another set of items in the dataset. Confidence is then calculated; it is the probability that a set of items occurs given that another set of items has already occurred.

Lift is defined as P(A ∩ B) / (P(A) × P(B)). The numerator is the probability that events A and B co-occur (i.e., item A and item B appear together). The denominator is the probability of co-occurrence that would be expected if event A and event B occurred independently of each other (i.e., item A and item B have no association). The following interpretations can be made based on the lift value.

Lift value = 1.0 indicates that the relationship between item A and item B can be explained by chance.
Lift value > 1.0 indicates that the presence of item A is associated with the presence of item B and that the relationship is positive in nature.
Lift value
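
The three measures described above can be computed directly from transaction data. The following is a minimal sketch in Python, not the method used in this project: the test names, the `transactions` list and the function names are all hypothetical and chosen only for illustration. Each transaction stands for the set of tests referred together on one order.

```python
# Hypothetical toy data: each set is one order (one "basket") of tests
# referred together. These names are illustrative, not Medall data.
transactions = [
    {"CBC", "Lipid Profile", "HbA1c"},
    {"CBC", "HbA1c"},
    {"Lipid Profile", "HbA1c"},
    {"CBC", "Thyroid Panel"},
    {"CBC", "Lipid Profile", "HbA1c"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """P(consequent | antecedent) = P(A and B) / P(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """P(A and B) / (P(A) * P(B)); 1.0 means A and B look independent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / (
        support(antecedent, transactions) * support(consequent, transactions)
    )


if __name__ == "__main__":
    a, b = {"Lipid Profile"}, {"HbA1c"}
    print(f"support    = {support(a | b, transactions):.2f}")
    print(f"confidence = {confidence(a, b, transactions):.2f}")
    print(f"lift       = {lift(a, b, transactions):.2f}")
```

In this toy data, the rule If {Lipid Profile} Then {HbA1c} has a lift above 1.0, which by the interpretation above suggests a positive association. With only five orders the numbers are illustrative and carry no statistical weight.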
