Course Syllabus

Spring 2020

E599 High Performance Big Data Systems

Judy Qiu

 

Class Information (Course Syllabus and Term Projects)

Time: Tue 3:30PM – 6:00PM
Place: Myles Brand Hall (I) 232
Bloomington, IN 47408

 

Office Hours

Instructor: Prof. Judy Qiu
Friday 5:00 PM to 6:00 PM Luddy Hall Room 4138

 

AI Office Hours

Selahattin Akkas

Thursday 11:00 AM to 12:00 PM Luddy Hall 4147 (ISE AI Office Hours)

Prerequisites

General programming experience with Python or Java is required.

 

 

Objectives

In this class, you will learn how to apply deep learning to time series data in real time. Big data analysis tools let you prepare, blend, and analyze different data sources in minutes rather than days on next-generation computer systems (CPU/GPU/TPU). The course offers direct experience with end-to-end analytics platforms in action. Students interested in real-world applications will work on hands-on projects with time series datasets, such as those from automotive vehicles and edge devices.

 

Scope and Topics

High-Performance Big Data Systems involve real-time data analytics on High-Performance Computing (HPC) clusters optimized for data analysis. This course introduces research and development in hardware, algorithms, and software for AI and big data systems, ranging from commodity clouds and hybrid HPC-clouds to edge computing and supercomputers. High-performance systems are designed to scale and to fully exploit the specialized features (communication, memory, energy, I/O, accelerators) of each architecture. Students will study the literature on new hardware architectures (many-core and other emerging architectures) and run benchmarks on existing systems. Applications will focus on real-time machine learning/AI. A major student project will aim to demonstrate the capabilities of High-Performance Big Data systems.

 

 

Course Materials

Reference Books

  • Distributed Systems: Principles and Paradigms (2nd Edition), Andrew S. Tanenbaum et al., Prentice Hall Publishers, NJ 07458, USA.
  • Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville, MIT Press.
  • Distributed and Cloud Computing: From Parallel Processing to the Internet of Things, Kai Hwang et al., Morgan Kaufmann Publishers, an imprint of Elsevier, Inc., Burlington, MA 01803, USA.
  • High Performance Computing, Kevin Dowd & Charles Severance, O'Reilly Publisher, CA 95472, USA.

 

Evaluation

  • 20% Participation & Quizzes
  • 10% Proposal
  • 20% Midway Report
  • 30% Final Report
  • 20% Presentation & Demo
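
As a rough illustration of how the weights above combine into one course score, here is a minimal Python sketch. The component keys, the `course_score` function, the 0-100 scale for each component, and the sample scores are assumptions made for illustration; they are not part of the official grading procedure.

```python
# Weights in percent, copied from the Evaluation list above.
# The dictionary keys are hypothetical names chosen for this sketch.
WEIGHTS = {
    "participation_quizzes": 20,
    "proposal": 10,
    "midway_report": 20,
    "final_report": 30,
    "presentation_demo": 20,
}


def course_score(scores):
    """Combine per-component scores (assumed 0-100 each) into a weighted total."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    # Integer percent weights keep the arithmetic exact until the final division.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Made-up scores for a hypothetical student.
example = {
    "participation_quizzes": 90,
    "proposal": 80,
    "midway_report": 85,
    "final_report": 88,
    "presentation_demo": 95,
}
print(course_score(example))
```

Because the final report carries the largest weight (30%), a change in that component moves the total more than the same change in any other component.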

 

 

Academic Misconduct

Plagiarism and cheating undermine the academic environment. Students who cheat undermine their own education, the self-esteem that comes with true mastery, and the academic mission of the University. The regulations governing student academic conduct and the procedures that must be used in handling violations of those regulations are covered in the Code of Student Rights, Responsibilities, and Conduct. Part II.A. defines academic misconduct, and Part IV.B. explains the procedures for handling cases of academic misconduct; these two sections are reprinted each semester in the Registrar publication Enrollment and Student Academic Information, under the heading "Academic Misconduct Policy."

 


Policy for Late Assignments or Projects

Assignments and projects are due at 11:59 PM on Sunday unless otherwise noted. You have one free late submission of up to 24 hours. For other late submissions, the grade will be reduced by 20% for each 24 hours late. No submissions will be accepted more than 48 hours past the due time.
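
The late policy can be read as a small calculation. The Python sketch below is one possible interpretation, not an official rule: it assumes that each started 24-hour period counts as a full period, that the 20% reduction is taken from the earned score, and that the free late submission is applied only when the student chooses to use it. The function name `late_adjusted_score` and its parameters are hypothetical.

```python
import math


def late_adjusted_score(score, hours_late, use_free_pass=False):
    """Return the score after the late policy, or None if the submission is not accepted.

    Assumptions (not stated in the syllabus): partial days round up to a
    full 24-hour period, and the penalty is 20% of the earned score per period.
    """
    if hours_late <= 0:
        return score  # submitted on time
    if hours_late > 48:
        return None  # more than 48 hours past the due time: not accepted
    if use_free_pass and hours_late <= 24:
        return score  # the one free late submission covers up to 24 hours
    periods = math.ceil(hours_late / 24)  # 1 or 2 started 24-hour periods
    return score * (1 - 0.20 * periods)


print(late_adjusted_score(100, 10))                      # one period late, no free pass
print(late_adjusted_score(100, 10, use_free_pass=True))  # free pass applied
print(late_adjusted_score(100, 30))                      # two periods late
print(late_adjusted_score(100, 49))                      # past the 48-hour cutoff
```

Under these assumptions, a submission 30 hours late loses 40% of its score, and the free pass does not help because it covers only the first 24 hours.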

 

 

 

Class Schedule (Tentative)

Lecture 1
  Topics: Course Introduction
  Assignments: Applying for FutureSystems account
  Labs: Lab 1

Lecture 2
  Topics: Big Data Learning Systems and Applications
  Literature: Term project ideas
  Assignments: TensorFlow Introduction, K-Means Walkthrough
  Labs: Lab 2

Lecture 3
  Topics: Distributed Systems*; Architectures*; Deep Learning Frameworks
  Literature: Chapters 1 and 2 (Tanenbaum)
  Assignments: tf.data, Distributed TensorFlow
  Labs: Lab 3

Lecture 4
  Topics: Communication*; Parallelization*
  Literature: Chapter 4 (Tanenbaum)
  Assignments: Proposal Due / Data Exploration with pandas and matplotlib
  Labs: Lab 4

Lecture 5
  Topics: Synchronization*; Consistency and Replication*; Fault Tolerance
  Literature: Chapters 6, 7, and 8 (Tanenbaum)
  Assignments: LSTM Anomaly Detection
  Labs: Lab 5

Lecture 6
  Topics: Real-Time Machine Learning
  Literature: Streaming Data Analysis
  Assignments: Apache Storm
  Labs: Lab 6

Lecture 7
  Topics: Midterm Presentation
  Assignments: Midterm Report Due

Lecture 8
  Topics: High Performance Computing Architectures; Measuring Performance
  Literature: CPU, GPU, AI chips
  Assignments: Edge TPU
  Labs: Lab 7

Lecture 9
  Topics: Memory Technology; Vectorization
  Literature: Profiling Tool with Intel VTune
  Assignments: Vectorization
  Labs: Lab 8

Lecture 10
  Topics: Message Passing Interface (MPI)
  Literature: Solving Bigger Problems by Scaling
  Assignments: MPI
  Labs: Lab 9

Lecture 11
  Topics: Benchmarking; User Benchmarks; Industry Benchmarks
  Literature: Time Series
  Assignments: MLPerf
  Labs: Lab 10

Lecture 12
  Topics: Discussion of Parallel Thinking; Data Processing Pipeline (preparation, processing, management, analysis, and visualization)
  Literature: HPC and Big Data
  Assignments: End-to-end Analytics
  Labs: Lab 11

Lecture 13
  Topics: Big Data Frameworks
  Literature: HPC and Big Data
  Assignments: Apache Hadoop and Spark
  Labs: Lab 12

Lecture 14
  Topics: Discussions on the Convergence of HPC, Big Data and Machine Learning
  Literature: Digital Twin

Lecture 15
  Topics: Course Review
  Literature: Project Review
  Assignments: Final Report Due

Final Week
  Topics: Final Presentation

 

 

 
