Course Syllabus
Fall 2020
E599 High Performance Big Data Systems
Judy Qiu
Class Information (Course Syllabus and Term Projects)
Time: Mon 4:55 PM – 7:25 PM
Office Hours
Instructor: Prof. Judy Qiu
Monday 1:00 - 2:00 PM via Zoom
AI Office Hours
Selahattin Akkas
Tuesday 1:00 - 2:00 PM via Zoom
Prerequisites
General programming experience with Python or Java is required.
Objectives
In this class, you will learn how to apply deep learning to time series data in real time. Using big data analysis tools, you will prepare, blend, and analyze different data sources in minutes rather than days on next-generation computer systems (CPU/GPU/TPU), and see end-to-end analytics platforms in action. Students interested in real-world applications will work on hands-on projects with time series datasets, such as those from automotive vehicles and edge devices.
Scope and Topics
High-Performance Big Data Systems involve real-time data analytics on High-Performance Computing (HPC) clusters optimized for data analysis. This course introduces research and development in hardware, algorithms, and software for AI and big data systems, ranging from commodity clouds and hybrid HPC-clouds to edge computing and supercomputers. High-performance systems are designed to scale and to fully exploit the specialized features (communication, memory, energy, I/O, accelerators) of each architecture. Students will study the literature on new hardware architectures (many-core and other emerging architectures) and run benchmarks on existing systems. Applications will focus on real-time machine learning/AI. A major student project will demonstrate the capabilities of High-Performance Big Data systems.
Course Materials
Reference Books
- Distributed Systems: Principles and Paradigms (2nd Edition), Andrew S. Tanenbaum et al., Prentice Hall Publishers, NJ 07458, USA.
- Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville, MIT Press.
- Distributed and Cloud Computing: From Parallel Processing to the Internet of Things, Kai Hwang et al., Morgan Kaufmann Publishers, an imprint of Elsevier, Inc., Burlington, MA 01803, USA.
- High Performance Computing, Kevin Dowd & Charles Severance, O'Reilly, CA 95472, USA.
Evaluation
- 20% Participation & Quizzes
- 10% Proposal
- 20% Midterm Report
- 30% Final Report
- 20% Presentation & Demo
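As a quick illustration of how these weights combine into a course grade, here is a minimal Python sketch. The weights come from the list above; the function name `final_grade`, the component keys, the example scores, and the assumption that each component is scored on a 0-100 scale are hypothetical and not part of the syllabus.

```python
# Weights from the Evaluation list (percent of the final grade).
WEIGHTS = {
    "participation_quizzes": 20,
    "proposal": 10,
    "mid_report": 20,
    "final_report": 30,
    "presentation_demo": 20,
}


def final_grade(scores):
    """Return the weighted course grade on a 0-100 scale.

    `scores` maps each component name to a score in [0, 100].
    The 0-100 scale is an assumption; the syllabus does not specify one.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    # Sum weight * score for every component, then divide by 100.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {
    "participation_quizzes": 90,
    "proposal": 80,
    "mid_report": 85,
    "final_report": 88,
    "presentation_demo": 92,
}
print(final_grade(example))  # prints 87.8
```

The weights sum to 100, so a student who scores 100 on every component receives 100 overall.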
Academic Misconduct
Plagiarism and cheating undermine the academic environment. Students who cheat undermine their own education, the self-esteem that comes with true mastery, and the academic mission of the University. The regulations governing student academic conduct and the procedures that must be used in handling violations of those regulations are covered in the Code of Student Rights, Responsibilities, and Conduct. Part II.A. defines academic misconduct, and Part IV.B. explains the procedures for handling cases of academic misconduct; these two sections are reprinted each semester in the Registrar publication Enrollment and Student Academic Information, under the heading "Academic Misconduct Policy."
Policy for Late Assignments or Projects
Assignments and projects are due at 11:59 PM on Sunday unless otherwise noted. You have one free late submission of up to 24 hours. For all other late submissions, the grade is reduced by 20% for each 24 hours late. No submissions are accepted more than 48 hours past the due time.
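The arithmetic of this policy can be sketched in Python as follows. This is one possible reading, not an official grading rule: the function name `late_adjusted_score` and its parameters are hypothetical, and the sketch assumes that the penalty applies per started 24-hour period, that the 20% is taken from the earned score, and that the free late submission only covers work turned in within 24 hours of the deadline. Check with the instructor for the actual interpretation.

```python
import math


def late_adjusted_score(score, hours_late, use_free_late=False):
    """Return the score after the late penalty, or None if not accepted.

    Assumptions (not stated in the syllabus): the penalty is charged per
    started 24-hour period and is a percentage of the earned score.
    """
    if hours_late <= 0:
        return score  # submitted on time
    if hours_late > 48:
        return None  # past the 48-hour cutoff: submission not accepted
    if use_free_late and hours_late <= 24:
        return score  # the one free late submission covers up to 24 hours
    periods = math.ceil(hours_late / 24)  # 1 or 2 started 24-hour periods
    return score * (100 - 20 * periods) / 100


print(late_adjusted_score(90, 5))                      # 72.0
print(late_adjusted_score(90, 5, use_free_late=True))  # 90
print(late_adjusted_score(90, 30))                     # 54.0
print(late_adjusted_score(90, 50))                     # None
```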
Class Schedule (Tentative)
| Lectures | Topics | Literature | Assignments | Labs |
|---|---|---|---|---|
| Lecture 1 | Course Introduction | Applying for FutureSystems account | Lab 1 | |
| Lecture 2 | Big Data Learning Systems and Applications | Term project ideas | TensorFlow Introduction, K-Means Walkthrough | Lab 2 |
| Lecture 3 | Distributed Systems*; Architectures*; Deep Learning Frameworks | Chapter 1 (Tanenbaum); Chapter 2 (T.) | tf.data, Distributed TensorFlow | Lab 3 |
| Lecture 4 | Communication* | Chapter 4 (T.) | Proposal Due / Data Exploration with pandas and matplotlib | Lab 4 |
| Lecture 5 | Synchronization* | Chapter 6 (T.) | LSTM Anomaly Detection | Lab 5 |
| Lecture 6 | Data Processing Pipeline (preparation, processing, management analysis and visualization) | Streaming Data Analysis | Apache Storm | Lab 6 |
| Lecture 7 | High-Performance Computing Architectures; Measuring Performance | CPU, GPU, AI chips | End to End Lap Time Prediction on Apache Storm | Lab 7 |
| Lecture 8 | Midterm Presentation | | Midterm Report Due | - |
| Lecture 9 | Consistency and Replications* | Chapter 7 (T.) | End-to-end Analytics | Lab 8 |
| Lecture 10 | Fault Tolerance | Chapter 8 (T.) | - | |
| Lecture 11 | Benchmarking; User Benchmarks; Industry Benchmarks | Time Series | JupyterHub, Pub/Sub, End to End Anomaly Detection | Lab 9 |
| Lecture 12 | Discussion of Parallel Thinking; Parallelization*; Message Passing Interface | HPC and Big Data; Solving Bigger Problems by Scaling | MPI | Lab 10 |
| Lecture 13 | Big Data Frameworks | HPC and Big Data | Apache Spark | Lab 11 |
| Lecture 15 | Memory Technology | Profiling Tool with Intel VTune | Vectorization | Lab 12 |
| Lecture 16 | Course Review | Project Review | Final Report Due | |
| Final Week | Final Report | | | |