Course Code | IERG 4300
Course Title | Web-scale Info Analytics (網絡規模資訊分析)
Instructor |
Credits |
Course Type |
Other Options for the Same Course |
Workload (non-paper homework) |
Very heavy |
Heavy | 1
Average |
Light | 1
Very light |
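Review of Course Content |
#1 The course content is not especially deep, but the assignments are extremely hard to do. Topics include MapReduce programming, association rule mining, clustering, dimensionality reduction, recommendation systems and other machine learning techniques.
There are four assignments in total: the first is a MapReduce program, the second a parallel Apriori algorithm, the third parallel k-means clustering, and the fourth a recommendation system and PCA. Each assignment takes two to three days, and a new one is released every two weeks. Be mentally prepared before taking this course; familiarity with Python or Java is a plus. The lecture PowerPoint slides are practically useless, so anyone who wants to take this course should expect to search Google heavily. If you put effort into the assignments you do learn something, but expect to teach yourself a lot of it.
#2 Difficulty: extremely hard. Both the concepts and the assignments are difficult.
Before registering: be mentally prepared, because this may be the toughest course of your four years of university life. A Java/Python background makes it a little easier. You should not mind the Linux command-line environment, since the course constantly uses AWS/Google Cloud Linux servers. Be ready to Google heavily; countless bugs will be waiting for you. The first assignment is due before the add-drop period ends, so decide whether to continue only after seeing if you can accept the workload of the first homework. An assignment takes 25+ hours on average.
Course content: it mainly revolves around parallel programming, that is, writing MapReduce programs to carry out tasks. It is a completely new idea and very different from other programming courses.
Some concepts are quite interesting, such as the heavy-hitter problem, but there are also traditional algorithms such as k-means, PCA and SVD.
Most of the time I found it rather boring. Lectures often go into very deep concepts that the assignments never use.
Assignment: a word of advice, do not be a deadline fighter. Each assignment needs at least 3 days. Every program run takes a long time; if the program is badly written, it may take an hour to produce a result, and if the result is wrong you have to debug and run it again. So it takes a very long time to finish...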
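Review of Instructor's Teaching |
#1 The professor explains things fairly clearly and is not boring. However, the tutor's quality is very low; the tutor's spoken English is completely incomprehensible.
#2 Explains clearly, but is a little boring.
If you run into problems, you can look for a tutor called handason, who is very nice.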
CUSIS Course Information |
Description:
The course discusses data-intensive analytics and automated processing of very large amounts of structured and unstructured information. We focus on leveraging the MapReduce paradigm to create parallel algorithms that can be scaled up to handle massive data sets such as those collected from the World Wide Web or other Internet systems and applications. We organize the course around a list of large-scale data analytic problems in practice. The required theories and methodologies for tackling each problem will be introduced. As such, the course only expects students to have solid knowledge in probability, statistics, linear algebra and computer programming skills. Topics to be covered include: the MapReduce computational model and its system architecture and realization in practice; Finding Frequent Item-sets and Association Rules; Finding Similar Items in high-dimensional data; Dimensionality Reduction techniques; Clustering; Recommendation systems; Analysis of Massive Graphs and its applications on the World Wide Web; Large-scale supervised machine learning; Processing and mining of Data Streams and their applications on large-scale network/online-activity monitoring.
Advisory: Basic hands-on operating system configuration and software installation skills covered in lab courses like IERG2602 and IERG3800 are required.
Learning Outcome:
Upon successful completion of the course, the students will have acquired the ability to:
1. Model and formulate a wide range of large-scale data analytic problems in practice.
2. Design and implement scalable software to tackle large-scale data analytic problems.
Other Information |
2018 Sem 1: quota 70 | enrolled 20 | remaining 50
2019 Sem 1: quota 65 | enrolled 27 | remaining 38
2020 Sem 1: quota 85 | enrolled 31 | remaining 54
Student Recommendation |
Highly recommended |
Recommended | 2
With reservations |
Strong reservations |
[Update] The 23-24 S1/S2/SS course list has been uploaded.
[Update] Course reviews for 23-24 S1/S2/SS have been updated. [2/7/2024]