6 points, SCA Band 2, 0.125 EFTSL
Postgraduate - Unit
Refer to the specific census and withdrawal dates for the semester(s) in which this unit is offered.
Not offered in 2018
Semi-structured data is one of the fastest-growing kinds of data in both the public and private sectors, for instance in health. Email collections with sender-recipient graphs, metadata and text content are one example. This unit will explore basic forms of semi-structured data: text, time-sequence data, graphs and multiple relations in a database. Basic machine learning algorithms for these kinds of data will be analysed and applied. Some characteristic industry problems for the application of semi-structured data, such as cohort analysis and market-basket analysis, will also be investigated.
At the completion of this unit, students should be able to:
- appraise what kinds of semi-structured data exist and the problems they present for analysis;
- analyse different kinds of algorithms for different kinds of semi-structured data;
- develop and modify some standard algorithms for semi-structured data;
- examine some characteristic industry problems involving semi-structured data, and analyse the suitability of different algorithms.
Examination (2 hours, plus 30 minutes reading and noting time): 50%; in-semester assessment: 50%
Minimum total expected workload equals 12 hours per week, comprising:
- Two hours/week lectures
- Two hours/week laboratories
- A minimum of 8 hours per week of personal study (22 hours per week for Monash Online students) for completing lab/tutorial activities, assignments, private study and revision and, for online students, participating in discussions.
See also Unit timetable information