{"id":29351,"date":"2013-05-31T12:07:50","date_gmt":"2013-05-31T06:37:50","guid":{"rendered":"http:\/\/www.kopykitab.com\/blog\/?p=29351"},"modified":"2013-05-31T12:07:50","modified_gmt":"2013-05-31T06:37:50","slug":"data-mining-notes","status":"publish","type":"post","link":"https:\/\/www.kopykitab.com\/blog\/data-mining-notes\/","title":{"rendered":"Data Mining Notes"},"content":{"rendered":"<h1 style=\"text-align: center;\">Data Mining Notes<\/h1>\n<p>&nbsp;<\/p>\n<p><b>Introduction:<\/b><\/p>\n<p>\u2022Data mining is the task of discovering interesting patterns from large amounts of data, where the data can be stored in databases, data warehouses, or other information repositories.<\/p>\n<p>\u2022Data mining is often defined as finding hidden information in a database.<\/p>\n<p>\u2022Data mining involves an integration of techniques from multiple disciplines such as database and data warehouse technology, statistics, machine learning, high-performance computing, pattern recognition, neural networks, data visualization, information retrieval, image and signal processing, and spatial or temporal data analysis.<\/p>\n<p>&nbsp;<\/p>\n<p><b>DATA MINING MODEL &amp; TASK<\/b><\/p>\n<p><img class=\"alignnone size-medium wp-image-29353\" alt=\"3\" src=\"http:\/\/www.kopykitab.com\/blog\/wp-content\/uploads\/2013\/05\/322-300x69.jpg\" width=\"300\" height=\"69\" \/><\/p>\n<p><b>1. 
Classification:<\/b><\/p>\n<p>\u2022Classification maps data into predefined groups or classes.<\/p>\n<p>\u2022It is often referred to as supervised learning because the classes are determined before examining the data.<\/p>\n<p>\u2022Two examples of classification applications are determining whether to make a bank loan and identifying credit risks.<\/p>\n<p>\u2022Pattern recognition is a type of classification where an input pattern is classified into one of several classes based on its similarity to these predefined classes.<\/p>\n<p>&nbsp;<\/p>\n<p><b>2. Regression:<\/b><\/p>\n<p>\u2022Regression is used to map a data item to a real-valued prediction variable.<\/p>\n<p>\u2022Regression assumes that the target data fit into some known type of function (e.g., linear, logistic, etc.) and then determines the best function of this type that models the given data.<\/p>\n<p>\u2022For example, a college professor wishes to reach a certain level of savings before his retirement. Periodically, he predicts what his retirement savings will be based on their current value and several past values. He uses a simple linear regression formula to predict this value by fitting past behavior to a linear function and then using this function to predict the values at points in the future. Based on these values, he then alters his investment portfolio.<\/p>\n<p><b>\u00a0<\/b><\/p>\n<p><b>3. Time Series Analysis:<\/b><\/p>\n<p>\u2022With time series analysis, the value of an attribute is examined as it varies over time.<\/p>\n<p>\u2022A time series plot is used to visualize the time series. In this figure, you can easily see that the plots for Y and Z have similar behavior, while X appears to have less volatility.<\/p>\n<p>\u2022There are three basic functions performed in time series analysis. In one case, distance measures are used to determine the similarity between different time series. 
In the second case, the structure of the line is examined to determine (and perhaps classify) its behavior. A third application would be to use the historical time series plot to predict future values.<\/p>\n<p>&nbsp;<\/p>\n<p><b>4. Prediction:<\/b><\/p>\n<p>\u2022Many real-world data mining applications can be seen as predicting future data states based on past and current data. Prediction can be viewed as a type of classification.<\/p>\n<p>\u2022The difference is that prediction is predicting a future state rather than a current state. Here reference is made to a type of application rather than to a type of data mining approach. Prediction applications include flooding, speech recognition, machine learning and pattern recognition. Although future values may be predicted using time series analysis or regression techniques, other approaches may be used as well.<\/p>\n<p>&nbsp;<\/p>\n<p><b>5. Clustering:<\/b><\/p>\n<p>\u2022Clustering is similar to classification except that the groups are not predefined but are instead defined by the data alone.<\/p>\n<p>\u2022Clustering is alternatively referred to as unsupervised learning or segmentation.<\/p>\n<p>\u2022It can be thought of as partitioning or segmenting the data into groups that might or might not be disjoint. The clustering is usually accomplished by determining the similarity among the data on predefined attributes. The most similar data are grouped into clusters.<\/p>\n<p>\u2022A special type of clustering is called segmentation. With segmentation a database is partitioned into disjoint groupings of similar tuples called segments. Segmentation is often viewed as being identical to clustering. In other circles segmentation is viewed as a specific type of clustering applied to a database itself.<\/p>\n<p><b>\u00a0<\/b><\/p>\n<p><b>6. Summarization and Association Rules:<\/b><\/p>\n<p>\u2022Summarization maps data into subsets with associated simple descriptions. 
Summarization is also called characterization or generalization.<\/p>\n<p>\u2022It extracts or derives representative information about the database. This may be accomplished by actually retrieving portions of the data. Alternatively, summary-type information can be derived from the data. The summarization succinctly characterizes the contents of the database.<\/p>\n<p>\u2022Link analysis, alternatively referred to as affinity analysis or association, refers to the data mining task of uncovering relationships among data.<\/p>\n<p>\u2022The best example of this type of application is to determine association rules. An association rule is a model that identifies specific types of data associations. These associations are often used in the retail sales community to identify items that are frequently purchased together.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Data Mining Notes &nbsp; Introduction: \u2022Data mining is the task of discovering interesting patterns from large amounts of data, where the data can be stored in databases, data warehouses, or other information repositories. \u2022Data mining is often defined as finding hidden information in a database. 
\u2022Data mining involves an integration of techniques from multiple &#8230; <a title=\"Data Mining Notes\" class=\"read-more\" href=\"https:\/\/www.kopykitab.com\/blog\/data-mining-notes\/\" aria-label=\"More on Data Mining Notes\">Read more<\/a><\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"","fifu_image_alt":""},"categories":[4773],"tags":[2852],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.kopykitab.com\/blog\/wp-json\/wp\/v2\/posts\/29351"}],"collection":[{"href":"https:\/\/www.kopykitab.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kopykitab.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kopykitab.com\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kopykitab.com\/blog\/wp-json\/wp\/v2\/comments?post=29351"}],"version-history":[{"count":0,"href":"https:\/\/www.kopykitab.com\/blog\/wp-json\/wp\/v2\/posts\/29351\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.kopykitab.com\/blog\/wp-json\/wp\/v2\/media?parent=29351"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kopykitab.com\/blog\/wp-json\/wp\/v2\/categories?post=29351"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kopykitab.com\/blog\/wp-json\/wp\/v2\/tags?post=29351"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}