Skip to main content

Course Index

The MS in Business Analytics residential modules are spread out over two calendar years and a period of 12 months.

The MSBA program curriculum focuses on four core areas of Artificial Intelligence (AI) in Business: Descriptive Analytics, Predictive Modeling, Causal Inference and Prescriptive Modeling. It is built on a foundation of fulfilling three core goals for all professionals: data engineering, data analytics, and storytelling. MSBA exposes students to both open-source and commercial software to reflect the heterogeneity of software used in different companies. This includes, but is not limited to STATA, R, Python, and Tableau.

MSBA Curriculum details. Module 1: Introduction to Business Analytics, Foundations of Statistics Using R, Digital Marketing Analytics. Module 2: Databases for Business Analytics, Dealing with Data, Decision Models, Big Data. Module 3: Machine Learning, Data Mining in R, Operation Analytics, Data Driven Decision Making. Module 4: Network Analytics, Decision Under Risk, Data Visualization. Module 5: Revenue Management and Pricing, Data Privacy and Ethics, Strategy, Change, and Analytics, Modern AI

Below, MSBA Academic Director Anindya Ghose discusses the course curriculum and offers a high-level snapshot of the topics and skills covered in each module:

Use the below links to learn more about the program's unique modular structure:



Pre-Program runs from January until the start of Pre-Module 1 in March. During this time, the MSBA program will provide matriculated students access to the Admitted Students Website, where they will be able to get more familiar with the program administration, format, logistics, as well as access optional materials to brush up on any skills prior to the Pre-Module period.


Pre-Module 1 is the period from March through April before students arrive to the in-class residential period of the module. During this time, the MSBA program will distribute pre-readings, assignments, group work, and online sessions so that students can complete their Module 1 course work prior to the first in-class session. This period is also an opportunity for the cohort to meet each other virtually before the program convenes on campus for Module 1.

Module I

Introduction to Business Analytics and Ai
Previously taught by: Foster Provost; Alex Tuzhilin
Course description: This course addresses the data engineering foundation and predictive analytics pillar in the world of AI in Business. It will change the way you think about data and its role in business. Businesses, governments, and individuals create massive collections of data as a byproduct of their activity. Increasingly, decision-makers and systems rely on intelligent technology to analyze data systematically to improve decision-making. In many cases, automating analytical and decision-making processes is necessary because of the volume of data and the speed with which new data are generated. We will examine how data analysis technologies can be used to improve decision-making. We will study the fundamental principles and techniques of data mining, and we will examine real-world examples and cases to place data-mining techniques in context, to develop data-analytic thinking, and to illustrate that proper application is as much an art as it is a science. In addition, we will work “hands-on” with data mining software.

Topics covered: 

Data mining and data mining processes
Introduction to predictive modeling
Data fitting and over fitting
Model testing
Cross-validation and learning curves
Model performance analytics
Unsupervised learning and clustering
Bayesian reasoning and text classification


Foundations of Statistics Using R 
Previously co-taught by: Kristen Sosulski, Peter Lakner
Course description: This course addresses the data engineering foundation in the world of AI in Business. The purpose of this course is to ensure that students are prepared to use R as a statistical tool and understand the fundamental statistical concepts. This course is divided into two parts: 1) Getting Started with R and 2) Statistics and R.

Part 1: Getting started with R: The R portion of the course will equip students with the skills needed to work with data using the R statistical computing application. This begins with developing a basic understanding of the R working environment. Second, students will learn to use R while being introduced to the necessary arithmetic and logical operators, and salient functions for manipulating data. Next, students will be introduced to the common data structures, variables, and data types used in R. Students will learn how to develop their own R scripts and utilize the various packages available in R for visualization, manipulation, and statistical analysis. Students will learn how to import data sets and transform and manipulate those datasets for various analytical purposes such as dealing with missing data. Finally, students will learn how to create control structures, such as loops and conditional statements to traverse, sort, merge, and evaluate data.

Part 2: Statistics and R: In the second part of the class basic concepts of probability and statistics will be introduced. We shall study the concepts of population and sample, discuss the difference between population parameters and sample statistics, and draw inferences from known sample statistics to usually unknown population parameters. We shall study discrete distributions along with their means and standard deviations, paying particular attention to the binomial distribution. We shall also study continuous distributions and their probability density functions, paying special attention to the most central of the continuous distributions—the normal distribution. The Central Limit Theorem will be introduced, and confidence intervals and statistical tests will be discussed. We shall then study the simple and multiple linear regression and their applications to prediction and forecast.

Topics covered: 

Getting started with R (commands, arithmetic operators, logical operators, functions)
Data structures and types
Writing scripts
Descriptive statistics
Statistical graphs
Working and manipulating data sets in R
Digital Marketing Analytics
Previously taught by: Anindya Ghose
Course description: This course addresses the predictive analytics and causal inference pillars in the world of AI in Business. The emergence of the Internet has drastically changed business and society. Some traditional marketing strategies are now completely outdated, others have been deeply transformed, and new digital marketing strategies are continuously emerging based on the unprecedented access to vast amounts of information about products, firms, and consumer behavior. The Internet is now encroaching core business activities such as new product design, advertising, marketing and sales, creation of word-of-mouth, new start-up funding, and customer service. Our goal in this class is to discuss the new business models in the digital economy that have been enabled by Internet-based social media and advertising technologies, and to analyze the impact these technologies and business models have on industries, firms, and people. We will inform our discussions with insights from data and metrics that can guide us for measurement in causal inference.

Topics covered: 

Econometric regression modeling
Econometrics-based tools

Module II

Databases for Business Analytics
Previously taught by: Panos Ipeirotis
This course addresses the data engineering goal in the world of AI in Business. Databases are ubiquitous in all businesses and hold significant amount of information about the business. Every data analysis and report typically starts with an SQL query, as SQL is the lingua franca of all database systems. Therefore, SQL is a necessity for anyone who needs to analyze data as part of their job, and many tech companies consider knowledge of SQL a prerequisite for all their analysts and managers.

This database class is designed for absolute beginners and teaches students how databases are structured and how to write SQL queries that retrieve data from a database. The class is heavily hands-on, with a focus on developing the necessary skills for writing SQL queries. We will cover the following topics:

  • Basics of Entity-Relationship model, and the connection to databases
  • USE, DESCRIBE queries, to understand the structure of a database
  • Selection queries: *, column, column AS, DISTINCT, ORDER BY, LIMIT
  • Filtering data using “where”: Boolean conditions, IN, BETWEEN, LIKE
  • Join queries: Inner and Outer joins, self joins
  • Aggregation queries: GROUP BY, SUM, AVG, MAX, MIN, etc

At the completion of this course, students will be able to navigate relational databases, issue queries against databases in an organization, and generate data that can be used for analyses and reports.

Dealing with Data Using Python
Previously taught by: Panos Ipeirotis
This course addresses the data engineering goal in the world of AI in Business. All analytics projects rely on data. A crucial step in a business analytics process is creating the dataset that will be analyzed. Unfortunately, the vast majority of the stakeholders do not pay serious attention at this step; however, streamlining and understanding the data often takes 90% of the effort and time of a data analytics project. Furthermore, because most people do not know how the dataset was created, they miss important details and assumptions that were part of the data gathering and handling process, leading to serious problems down the road. This class is designed to teach students to handle data programmatically, without being software engineers. This course guides students through the whole data management process, from initial data acquisition to final data analysis. From a tools perspective, we cover the Python ecosystem: Python serves as a great general-purpose programming language for a wide variety of data management tasks, and is commonly used as the “glue” that brings together all the different aspects of the analytics process.

Decision Models
Previously taught by: Jiawei Zhang; Ilan Lobel
Course description: This course addresses the prescriptive analytics pillar in the world of AI in Business. It trains students to turn real-world problems into mathematical and spreadsheet models and to use such models to make better managerial decisions. This is a hands-on course that focuses on modeling business problems, turning them into Excel spreadsheet models and using tools like Solver and Crystal Ball to obtain solutions to these managerial problems. The course focuses on two classes of models: optimization and simulation. The application areas are diverse and they originate from problems in finance, marketing and operations. We cover problems such as how to optimize a supply chain, how to price products when faced with demand uncertainty, and how to price exotic financial options using Monte Carlo simulation.

Topics covered: 

Linear and linear integer programming
Nonlinear programming and evolutionary solver
Simulation and optimization
Multi-period linear programming
Monte Carlo simulation
Big Data 
Previously taught by: Norm White; Ramesh Shankar
Course description: This course addresses the data engineering foundation in the world of AI in Business. This course offers an in-depth hands-on exploration of various cutting-edge information technologies used for big data analytics. The course will cover background readings on the theoretical foundations of Hadoop and MapReduce, as well as business articles on how Hadoop and related technologies are used by companies. The course will also cover some basics of navigating Google Cloud Platform (GCP) for uploading and analyzing data using Google Cloud Storage, BigQuery, and PySpark on Dataproc (Hadoop cluster). Students have the opportunity to be hands-on with Hadoop – specifically, Linux, Hadoop distributed file system (HDFS), Apache Sqoop, Apache Pig and Apache Hive – for data management and extract-transform-load (ETL) operations. Students will also learn about cloud file storage using GCP, querying cloud data, visualizing cloud data, and using PySpark to run analytics on cloud data.

Topics covered: 

Big data
System architecture and ecosystems
Data management and extract-transform-load operations
Linux, Hadoop, MapReduce, Apache Sqoop, Apache Pig, Apache Hive

Module III

Machine Learning
Previously taught by: Luis Torgo; Ravi Bapna
Course description: This course addresses the predictive analytics and causal inference pillars in the world of AI in Business. This course is about gaining exposure to core machine learning techniques and their applications to business domains and functions. MIT research shows 9% higher top line and 26% higher net margins for companies with ‘Leading Digital’ capabilities.  While most firm have capabilities in summarizing the data they have, very few have the analytical abilities to gain true insights from such data to get business results. The course will expose you to the art-of-the-possible with respect to state-of-the-art methods and applications of supervised and unsupervised machine learning.  The majority of the course will focus on supervised machine learning for prediction; it will be driven practical uses cases and use a mixture of lecture, discussion of key issues, and an in-class group prediction contest over the three days. At the end of the course all students will become excellent at understanding the power of data mining to create business value. Students will learn how to identify opportunities of using supervised and unsupervised machine learning methods, setup the problems correctly, develop intuition of how the major classes of machine learning algorithms work, and how to use the appropriate metrics and approaches to judge performance. The course will also cover the important topic of algorithmic bias and examine ways to correct for it. Finally, the course will look at explainable AI and the interface between machine learning and causal inference to estimate heterogeneous treatment effects.

Topics covered: 

Data pre-processing (dealing with unknown values)
Defining the data mining task
Classification approaches
Performance estimation for time series models
Modeling and performance estimation
Model outcomes and model selection

Operations Analytics
Previously taught by: Harry Chernoff; Maxime Cohen
Course description: This course addresses the causal inference and prescriptive analytics pillar in the world of AI in Business. This course is an introduction to the principles and techniques of operations analytics. Operations and supply management is defined as the design, operation, and improvement of the systems that create and deliver the firm's primary products and services. In this course, students will learn operations models and techniques that work with large data sources. Operations management has dealt with applying analytics for many years. Recently, however, due to big data, many older models and software are incapable of running the analyses. This course will demonstrate the application of Operations models that are currently being used in industry incorporating big data.

Topics covered:

Process flow
Process design and analysis
Project management
Quality, value and cost

Data Driven Decision Making
Previously taught by: Vishal Singh; Rob Seamans
Course description: This course addresses the causal inference pillar in the world of AI in Business. In every aspect of our daily lives, from the way we work, shop, communicate, or socialize, we are both consuming and creating vast amounts of information. More often than not, these daily activities create a trail of digitized data that is being stored, mined, and analyzed by firms hoping to create valuable business intelligence. With technological advances and developments in customer databases, firms have access to vast amounts of high-quality data which allows them to understand customer behavior and customize business tactics to increasingly fine segments or even segments of one. However, much of the promise of such data-driven policies has failed to materialize because managers find it difficult to translate customer data into actionable policies. The general objective of this course is to fill this gap by providing students with tools and techniques that can be utilized for making business decisions. Note that this is not a statistics or mathematics course. The emphasis of the class will be on applications and interpretation of the results for making real life business decisions.

Topics covered:

Regression-based model development
Capturing non-linear effects: dummy variables & log transformations
Estimating & Interpreting log demand models
Using log-regressions to understand competitive marketplace

Module IV
Network Analytics
Previously taught by: Arun Sundararajan
Course description: This course addresses various applications in the world of AI in Business. Social media and mobile commerce create massive connected data sets that contain a wealth of business and social insights. This course will translate cutting-edge network science research into actionable analytics strategies for dealing with big data that is networked, text-intensive and unstructured, with applications from viral marketing, A/B testing and media planning.

Topics covered:

Network basics
Strength and trust in social networks
Measuring and interpreting network position
Community structure in networks
Identifying and measuring contagion in networks
Decision Under Risk
Previously taught by: Gustavo Vulcano; Ilan Lobel
Course description: This course addresses the prescriptive analytics pillar in the world of AI in Business. Analytics is “the scientific process of transforming data into insight for making better decisions.” For example, sales data can help us understand consumer purchase behaviors as well as demand patterns. These insights can be used to make sales forecasts, which in turn can inform assortment and production planning decisions. Optimization models have played a very important role in turning “insights” into “decisions” for companies in various industries: advertising, airlines, energy, investment and finance, marketing, manufacturing, retailing, etc. This course is aimed at enriching the student exposure to business analytics techniques. It has two main parts. The first part covers sensitivity analysis, which is a follow-up of the linear programming topic covered in the Decision Models course, and which relates to understanding the impact of changing the parameters of a model on the optimal solution. It is executed using Excel Solver. The second part, which spans most of the course, covers decision making under uncertainty. Students will learn how to build optimization models that incorporate random parameters (e.g., stochastic demand, price, etc.).

Topics covered: 

Sensitivity analysis for linear programming
Decision analysis
Two-stage stochastic optimization with recourse
Dynamic programming

Data Visualization
Previously taught by: Kristen Sosulski
Course description: This course addresses the descriptive analytics pillar in the world of AI in Business. This course is an introduction to the principles and techniques of data visualization. Visualizations are graphical depictions of data that can improve comprehension, communication, and decision making. In this course, students will learn visual representation methods and techniques that increase the understanding of complex data and models. Emphasis is placed on the identification of patterns, trends and differences from datasets across categories, space, and time. This is a hands-on course. Students will use several tools to refine their data and create visualizations. These include: R/RStudio, Python, Tableau, ThinkCell for PowerPoint, Geocodio, and Excel.

Topics covered: 

Design principles for charts and graphs
Creating data displays
Designing effective digital presentations
Visualizing categorical data
Time series data, multiple variables, and geospatial data
Dashboard design
Web-based visualizations

Module V

Strategy, Change and Analytics
Previously taught by: JP Eggers
Course description: This course addresses the descriptive analytics and causal inference pillars in the world of AI in Business. This course focuses on significant strategic decisions—such as the introduction of new products or the acquisition of another firm—and explores how data-driven and analytical approaches can be used to inform these decisions from a senior management perspective. A case-based approach allows us to discuss details of significant strategic decisions. We will cover some core aspects of business strategy, including external analysis, competitor analysis, and opportunity analysis. We will also look more deeply at different aspects of the decision-making process within organizations, both to understand the process and to think about implementation. The goal is to understand the role of analytics and analytical approaches in the broader organization.

Topics covered:

Value creation and capture
Firm positioning versus competitors
Ratio analysis in strategy
Flexibility and commitment in strategy
Leading organizational change; causality and interpretation of analytical results

Modern Artificial Intelligence
Previously taught by: Alex Tuzhilin; Xi Chen
Course description: This course addresses the predictive analytics pillar in the world of AI in Business. The purpose of this course is to provide the students with systematic introduction to the recent developments in AI through the coverage of fundamental AI concepts, practical business applications and the hands-on experiences with modern AI frameworks, such as Weka.

Topics covered: 

AI's fundamental concepts and methods
Deep Learning frameworks
Learning how to apply AI-based methos to solving practical business problems
Understand future of AI technologies over next few years

Revenue Management and Pricing
Previously taught by: Rene Caldentey; Gustavo Vulcano
Course description: This course addresses the prescriptive analytics pillar in the world of AI in Business. Revenue management and Pricing (RMP) focuses on how firms should manage their pricing and product availability policies across different selling channels in order to maximize performance and profitability. One of the best-known applications of PRM is yield management whereby airlines, hotels, and other companies seek to maximize operating contribution by dynamically managing capacity over time. Building on a combination of lectures and case studies the course develops a set of methodologies that students can use to identify and develop opportunities for revenue optimization in different business contexts, including the transportation and hospitality industries, retail, media and entertainment, financial services, health care and manufacturing, and others. The course places particular emphasis on discussing quantitative models needed to tackle a number of important business problems including capacity allocation, markdown management, dynamic pricing for e-commerce, customized pricing, and demand forecasts under market uncertainty, to name a few.

Topics covered:

Demand segmentation
Price differentiation
Constrained pricing
Marginal value of capacity
Network revenue management
Pricing policies in action
Demand forecasting and data analysis

Module VI

Strategic Capstone

The Capstone project, which students work on throughout the year, is presented at the culmination of the program. This integrative exercise gives students an opportunity to review and interpret data through statistical and operational analysis with the use of predictive models and the application of optimization techniques. The result is a unified and practical case presentation on a topic of the group's choosing. This is a team-based project with approximately 4-5 students per group. The integrative projects should not take the form of formal dissertations or narrative papers. Rather, they should take the form of “reports to management,” emphasizing substance over length and the forest over the trees. Where possible, they should be action-oriented and framed in terms of business policy and competitive strategy. Given this format, they should be easily convertible into PowerPoint presentations.

To review more information on past projects, visit our Capstone page.