Postgraduate - Unit FIT5045 - Knowledge discovery and data mining

This unit entry is for students who completed this unit in 2012 only. For students planning to study the unit, please refer to the unit indexes in the current edition of the Handbook. If you have any queries, contact the managing faculty for your course or area of study.

6 points, SCA Band 2, 0.125 EFTSL

Refer to the specific census and withdrawal dates for the semester(s) in which this unit is offered, or view unit timetables.

Level	Postgraduate
Faculty	Faculty of Information Technology
Offered	Caulfield Second semester 2012 (Day); Gippsland Second semester 2012 (Off-campus)

Synopsis

Modern methods of discovering patterns in large-scale databases are introduced, including classification, clustering and association rules analysis. These are contrasted with more traditional methods of finding information from data, such as data queries. Data pre-processing methods for dealing with noisy and missing data and with dimensionality reduction are reviewed. Hands-on case studies in building data mining models are performed using a popular software package.

Outcomes

At the completion of this unit students will:

be able to differentiate between supervised and unsupervised learning;
know how to apply the main techniques for supervised and unsupervised learning;
know how to use statistical methods for evaluating data mining models;
be able to perform data pre-processing on data that contains outliers or is incomplete or noisy;
be able to extract and analyse patterns from data using a data mining tool;
have an understanding of the difference between discovery of hidden patterns and simple query extractions in a dataset;
have an understanding of the different methods available to facilitate discovery of hidden patterns in a dataset;
have developed the ability to preprocess data in preparation for data mining experiments;
have developed the ability to evaluate the quality of data mining models;
be able to appreciate the need to have representative sample input data to enable learning of patterns embedded in population data;
be able to appreciate the need to provide quality input data to produce useful data mining models;
have acquired the skill to use the common features in data mining tools;
have acquired the skill to use the visualisation features in a data mining tool to facilitate knowledge discovery from a data set;
have acquired the skill to compare data mining models based on the results on a set of performance criteria;
be able to work in a team to extract knowledge from a common data set using different data mining methods and techniques.

Assessment

Examination (3 hours): 60%; In-semester assessment: 40%

Chief examiner(s)

Dr Grace Rumantir

Contact hours

2 hrs lectures/wk, 2 hrs laboratories/wk

Prerequisites

Sound fundamental knowledge of maths and statistics. Basic knowledge of databases and computer programming.

Prohibitions

CSE5230, FIT5024

Additional information on this unit is available from the faculty at:

http://www.infotech.monash.edu.au/units/fit5045/