6 points, SCA Band 2, 0.125 EFTSL
Postgraduate - Unit
Refer to the specific census and withdrawal dates for the semester(s) in which this unit is offered.
- Second semester 2017 (Day)
Advanced methods of discovering patterns in large-scale multi-dimensional databases are discussed. Solving classification, clustering, association rule analysis and regression problems on different kinds of data is covered. Data pre-processing methods for dealing with noisy and missing data in the context of Big Data are reviewed. Evaluation and analysis of data mining models are emphasised. Hands-on case studies in building data mining models are performed using popular modern software packages.
On successful completion of this unit, students should be able to:
- explain the kinds of data from which knowledge can be mined, the way each data type can be presented to a data mining algorithm, and the kinds of patterns that can be mined from each data type;
- evaluate the quality of data mining models;
- perform pre-processing of large-scale multi-dimensional data sets in preparation for data mining experiments;
- perform pre-processing of data that contain outliers or are incomplete or noisy;
- compare the various learning algorithms and effectively apply suitable algorithms to mine frequent patterns and associations from data, and to perform data classification, data clustering and regression analysis;
- use modern data mining tools to solve non-trivial data mining problems;
- research the current trends in data mining applications;
- work in a team to extract knowledge from a common data set using various data mining methods and techniques.
Examination (2 hours plus 30 minutes reading and noting time): 60%; In-semester assessment: 40%
Minimum total expected workload equals 12 hours per week comprising:
- Contact hours for on-campus students:
  - Two hours of lectures
  - One 2-hour laboratory
- Additional requirements (all students):
  - A minimum of 8 hours of independent study per week for completing lab and project work, private study and revision.
See also Unit timetable information
FIT5047 or FIT5045 or equivalent
Sound fundamental knowledge in maths and statistics; database and computer programming knowledge.