Course Syllabus
Spring 2020
E599 High Performance Big Data Systems
Judy Qiu
Class Information (Course Syllabus and Term Projects)
Time: Tue 3:30PM – 6:00PM
Place: Myles Brand Hall (I) 232
Bloomington, IN 47408
Office Hours
Instructor: Prof. Judy Qiu
Friday 5:00 PM to 6:00 PM, Luddy Hall Room 4138
AI Office Hours
Selahattin Akkas
Thursday 11:00 AM to 12:00 PM, Luddy Hall 4147 (ISE AI Office Hours)
Prerequisites
General programming experience with Python or Java is required.
Objectives
In this class, you will see how to apply deep learning to time series data in real time. Big data analysis tools show you how to prepare, blend and analyze different data sources in minutes, not days, on next-generation computer systems (CPU/GPU/TPU). There's no better way to experience end-to-end analytics platforms in action. Students interested in real-world applications will work on hands-on projects with time series datasets, such as those from automotive vehicles and edge devices.
Scope and Topics
High-Performance Big Data Systems involve real-time data analytics on High-Performance Computing (HPC) clusters optimized for data analysis. This course introduces research and development in hardware, algorithms and software for AI and big data systems, ranging from commodity clouds and hybrid HPC-clouds to edge computing and supercomputers. High-performance systems are designed to scale and to fully exploit the specialized features (communication, memory, energy, I/O, accelerator) of each architecture. Students will study the literature on new hardware architectures (many-core and other emerging architectures) and run benchmarks on existing systems. Applications will focus on real-time Machine Learning/AI. A major student project will aim to demonstrate the capabilities of High Performance Big Data systems.
Course Materials
Reference Books
- Distributed Systems: Principles and Paradigms, Andrew S. Tanenbaum et al. (2nd Edition) Prentice Hall Publishers, NJ 07458, USA.
- Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville, MIT Press.
- Distributed Systems and Cloud Computing: From Parallel Processing to the Internet of Things, Kai Hwang et al., Morgan Kaufmann Publishers, an imprint of Elsevier, Inc., Burlington, MA 01803, USA.
- High Performance Computing, Kevin Dowd & Charles Severance, O'Reilly Publisher, CA 95472, USA.
Evaluation
- 20% Participation & Quizzes
- 10% Proposal
- 20% Midway Report
- 30% Final Report
- 20% Presentation & Demo
Academic Misconduct
Plagiarism and cheating undermine the academic environment. Students who cheat undermine their own education, the self-esteem that comes with true mastery, and the academic mission of the University. The regulations governing student academic conduct and the procedures that must be used in handling violations of those regulations are covered in the Code of Student Rights, Responsibilities, and Conduct. Part II.A. defines academic misconduct, and Part IV.B. explains the procedures for handling cases of academic misconduct; these two sections are reprinted each semester in the Registrar publication Enrollment and Student Academic Information, under the heading "Academic Misconduct Policy."
Policy for Late Assignments or Projects
Assignments and projects are due at 11:59 PM Sunday unless otherwise noted. You have one free late submission of up to 24 hours. For other late submissions, the grade will be reduced by 20% for each 24 hours late. No submissions will be accepted more than 48 hours past the due time.
Class Schedule (Tentative)
| Lectures | Topics | Literature | Assignments | Labs |
|---|---|---|---|---|
| Lecture 1 | Course Introduction | | Applying for FutureSystems account | Lab 1 |
| Lecture 2 | Big Data Learning Systems and Applications | Term project ideas | TensorFlow Introduction, K-Means Walkthrough | Lab 2 |
| Lecture 3 | Distributed Systems*; Architectures*; Deep Learning Frameworks | Chapter 1 (Tanenbaum); Chapter 2 (T.) | tf.data, Distributed TensorFlow | Lab 3 |
| Lecture 4 | Communication*; Parallelization* | Chapter 4 (T.) | Proposal Due / Data Exploration with pandas and matplotlib | Lab 4 |
| Lecture 5 | Synchronization*; Consistency and Replications*; Fault Tolerance | Chapter 6 (T.); Chapter 7 (T.); Chapter 8 (T.) | LSTM Anomaly Detection | Lab 5 |
| Lecture 6 | Real-Time Machine Learning | Streaming Data Analysis | Apache Storm | Lab 6 |
| Lecture 7 | Midterm Presentation | | Midterm Report Due | |
| Lecture 8 | High Performance Computing Architectures; Measuring Performance | CPU, GPU, AI chips | Edge TPU | Lab 7 |
| Lecture 9 | Memory Technology; Vectorization | Profiling Tool with Intel VTune | Vectorization | Lab 8 |
| Lecture 10 | Message Passing Interface (MPI) | Solving Bigger Problems by Scaling | MPI | Lab 9 |
| Lecture 11 | Benchmarking; User Benchmarks; Industry Benchmarks | Time Series | MLPerf | Lab 10 |
| Lecture 12 | Discussion of Parallel Thinking; Data Processing Pipeline (preparation, processing, management, analysis and visualization) | HPC and Big Data | End-to-end Analytics | Lab 11 |
| Lecture 13 | Big Data Frameworks | HPC and Big Data | Apache Hadoop and Spark | Lab 12 |
| Lecture 14 | Discussions on the Convergence of HPC, Big Data and Machine Learning | Digital Twin | | |
| Lecture 15 | Course Review | Project Review | Final Report Due | |
| Final Week | Final Presentation | | | |