Instructor:

Volkan Vural (vvural@ucsd.edu)

Teaching Assistants:

TA 1 - Chetan Gandotra (cgandotr@ucsd.edu)

TA 2 - Tushar Bansal (tbansal@ucsd.edu)

Time:

Fridays: 9:00 am - 4:30 pm

TA Office Hours:

In-class TA session: 4:30 pm - 5:30 pm (immediately after class)

Overview:

The Machine Learning class is designed to give professionals in business enterprises and scientific communities the skills needed to design, build, verify, and test predictive data models. The class provides conceptual and hands-on training in the data analysis techniques that help discover patterns and relationships in historical data.

Data Mining or Machine Learning (the art and science of learning from data) covers a number of different procedures. This hands-on course emphasizes key learning techniques, including Decision Trees, Numeric Prediction, Clustering, Bayesian Learning, Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), and Deep Learning.

Course Objectives:

Data mining and predictive modeling approaches, such as neural networks, decision trees, regression trees, and clustering, are ideally suited for domains characterized by large amounts of noisy data, possibly many variables or dimensions, and the absence of general theories or hypotheses about the data. These approaches offer a means of effective classification and analysis, leading to the identification of functional relationships, important factors, trends, and patterns. Moreover, predictive modeling methods can automatically extract knowledge deeply hidden in the data, enabling the discovery of insights not otherwise attainable. This course covers basic data mining, data analysis, and pattern recognition concepts, along with predictive modeling algorithms, so that students can explore and implement analyses on their own data.

Successful participation in the hands-on component is essential. This approach fosters a deep understanding of the topics covered in class and accelerates students' readiness to apply the newly learned techniques to their own projects.

IPython notebooks, scikit-learn, NumPy, and pandas will be used in the hands-on class labs, homework assignments, and the final exam. Together they provide tools for data pre-processing, classification, regression, clustering, association rules, artificial neural networks, and visualization.

General Topics:

  • Overview: Data Mining and Machine Learning
  • Preprocessing the Data
  • Overview of Classification Methods - Generative and Discriminative
  • Representational Learning
  • Learning Algorithm Implementations
  • Combining Classifiers
  • Regression

Class Piazza and GitHub:

Piazza sign-up link. We will use Piazza for all further announcements and for sharing lecture slides.

The class GitHub repository contains all the IPython notebooks that we will use.