This unit entry is for students who completed this unit in 2012 only. For students planning to study the unit, please refer to the unit indexes in the current edition of the Handbook. If you have any queries, contact the managing faculty for your course or area of study.
6 points, SCA Band 2, 0.125 EFTSL
Refer to the specific census and withdrawal dates for the semester(s) in which this unit is offered, or view unit timetables.
Synopsis
Modern methods of discovering patterns in large-scale databases are introduced, including classification, clustering and association rules analysis. These are contrasted with more traditional methods of finding information from data, such as data queries. Data pre-processing methods for dealing with noisy and missing data and with dimensionality reduction are reviewed. Hands-on case studies in building data mining models are performed using a popular software package.
Outcomes
At the completion of this unit students will:
- be able to differentiate between supervised and unsupervised learning;
- know how to apply the main techniques for supervised and unsupervised learning;
- know how to use statistical methods for evaluating data mining models;
- be able to perform data pre-processing for data with outliers and for incomplete and noisy data;
- be able to extract and analyse patterns from data using a data mining tool;
- have an understanding of the difference between discovery of hidden patterns and simple query extractions in a dataset;
- have an understanding of the different methods available to facilitate discovery of hidden patterns in a dataset;
- have developed the ability to pre-process data in preparation for data mining experiments;
- have developed the ability to evaluate the quality of data mining models;
- be able to appreciate the need to have representative sample input data to enable learning of patterns embedded in population data;
- be able to appreciate the need to provide quality input data to produce useful data mining models;
- have acquired the skill to use the common features in data mining tools;
- have acquired the skill to use the visualisation features in a data mining tool to facilitate knowledge discovery from a data set;
- have acquired the skill to compare data mining models based on their results against a set of performance criteria;
- be able to work in a team to extract knowledge from a common data set using different data mining methods and techniques.
Assessment
Examination (3 hours): 60%; In-semester assessment: 40%
Chief examiner(s)
Dr Grace Rumantir
Contact hours
2 hrs lectures/wk, 2 hrs laboratories/wk
Prerequisites
Sound fundamental knowledge of mathematics and statistics. Basic knowledge of databases and computer programming.
Prohibitions
CSE5230, FIT5024
Additional information on this unit is available from the faculty at:
http://www.infotech.monash.edu.au/units/fit5045/