The MS in Business Analytics modules are spread out over two fiscal years and a period of 12 months. Between modules, students complete approximately 20-25 hours of work per week on pre- and post-module tasks.
Module 1: New York City, USA
Foundations of Statistics Using R
Previously co-taught by: Kristen Sosulski, Peter Lakner
Course description: The purpose of this course is to ensure that students are prepared to use R as a statistical tool and understand the fundamental statistical concepts. This course is divided into two parts: 1) Getting Started with R and 2) Statistics and R.
Part 1: Getting started with R: The R portion of the course will equip students with the skills needed to work with data using the R statistical computing application. This begins with developing a basic understanding of the R working environment. Second, students will learn to use R while being introduced to the necessary arithmetic and logical operators and the salient functions for manipulating data. Next, students will be introduced to the common data structures, variables, and data types used in R. Students will learn how to develop their own R scripts and use the various packages available in R for visualization, manipulation, and statistical analysis. Students will learn how to import data sets and how to transform and manipulate them for various analytical purposes, such as dealing with missing data. Finally, students will learn how to create control structures, such as loops and conditional statements, to traverse, sort, merge, and evaluate data.
Part 2: Statistics and R: In the second part of the class, basic concepts of probability and statistics will be introduced. We shall study the concepts of population and sample, discuss the difference between population parameters and sample statistics, and draw inferences from known sample statistics to usually unknown population parameters. We shall study discrete distributions along with their means and standard deviations, paying particular attention to the binomial distribution. We shall also study continuous distributions and their probability density functions, paying special attention to the most central of the continuous distributions: the normal distribution. The Central Limit Theorem will be introduced, and confidence intervals and statistical tests will be discussed. We shall then study simple and multiple linear regression and their applications to prediction and forecasting.
- Getting started with R (commands, arithmetic operators, logical operators, functions)
- Data structures and types
- Writing scripts
- Descriptive statistics
- Statistical graphs
- Working with and manipulating data sets in R
Digital Marketing Analytics
Previously taught by: Anindya Ghose
Course description: The emergence of the Internet has drastically changed marketing. Some traditional marketing strategies are now completely outdated, others have been deeply transformed, and new digital marketing strategies are continuously emerging based on the unprecedented access to vast amounts of information about products, firms, and consumer behavior. The Internet is now encroaching on core business activities such as new product design, advertising, marketing and sales, creation of word-of-mouth, new start-up funding, and customer service. Our goal in this class is to discuss the new business models in electronic commerce that have been enabled by Internet-based social media and advertising technologies, and to analyze the impact these technologies and business models have on industries, firms, and people. We will inform our discussions with insights from data and metrics that can guide measurement. To recognize how businesses can successfully leverage these technologies, we will therefore go beyond the technology itself and investigate some key questions.
- Econometric regression modeling
- Selection problems
- Omitted variables problems
- Log transformations
- Econometrics-based tools
Data Science for Business Analytics
Previously taught by: Foster Provost; Alex Tuzhilin
Course description: This course will change the way you think about data and its role in business. Businesses, governments, and individuals create massive collections of data as a byproduct of their activity. Increasingly, decision-makers and systems rely on intelligent technology to analyze data systematically to improve decision-making. In many cases, automating analytical and decision-making processes is necessary because of the volume of data and the speed with which new data are generated. We will examine how data analysis technologies can be used to improve decision-making. We will study the fundamental principles and techniques of data mining, and we will examine real-world examples and cases to place data-mining techniques in context, to develop data-analytic thinking, and to illustrate that proper application is as much an art as it is a science. In addition, we will work “hands-on” with data mining software.
- Data mining and data mining processes
- Introduction to predictive modeling
- Data fitting and overfitting
- Model testing
- Cross-validation and learning curves
- Model performance analytics
- Unsupervised learning and clustering
- Bayesian reasoning and text classification
Dealing with Data
Previously taught by: Panos Ipeirotis
Course description: All analytics projects rely on data. A crucial step in a business analytics process is creating the dataset that will be analyzed. Unfortunately, the vast majority of stakeholders do not pay serious attention to this step; however, streamlining and understanding the data often takes 90% of the effort and time of a data analytics project. Furthermore, because most people do not know how the dataset was created, they miss important details and assumptions that were part of the data gathering and handling process, leading to serious problems down the road. This class is designed to teach students to handle data programmatically, without being software engineers. This course guides students through the whole data management process, from initial data acquisition to final data analysis. From a tools perspective, we cover Python and SQL. SQL is the lingua franca for all data analysts, and virtually all companies store their data in SQL-accessible repositories. Python serves as a great general-purpose programming language for a wide variety of data management tasks, and is commonly used as the “glue” that brings together all the different aspects of the analytics process.
- Data modeling and ER model
- Relational databases and SQL
- Accessing data sources: Web APIs
- Data manipulation using Python Pandas
- Regular expressions and Web Crawling
- Text Analytics
Decision Models
Previously taught by: Jiawei Zhang; Ilan Lobel
Course description: This course trains students to turn real-world problems into mathematical and spreadsheet models and to use such models to make better managerial decisions. This is a hands-on course that focuses on modeling business problems, turning them into Excel spreadsheet models and using tools like Solver and Crystal Ball to obtain solutions to these managerial problems. The course focuses on two classes of models: optimization and simulation. The application areas are diverse and they originate from problems in finance, marketing and operations. We cover problems such as how to optimize a supply chain, how to price products when faced with demand uncertainty, and how to price exotic financial options using Monte Carlo simulation.
- Linear and linear integer programming
- Nonlinear programming and evolutionary solver
- Simulation and optimization
- Multi-period linear programming
- Monte Carlo simulation
Please note all courses and topics are subject to change.
Module 2: London, UK
Previously taught by: Luis Torgo
Course description: The goal of this course is to provide hands-on experience with key data mining technologies using one particular tool: the R environment. R is a fast-growing technology that has gained widespread acceptance in both academia and industry. Recent surveys have even ranked it at the top in usage among professional data miners (Rexer Analytics survey, 2013). Many factors contribute to this acceptance, but they clearly include the price (free), being open source (trustworthy software that can be easily inspected/checked for flaws), the range of available methods (exponential growth of the set of available methods for different application areas), and the available support from the community (an extremely large community of knowledgeable experts providing top-notch support for free). This course illustrates the use of R for several key data mining processes. This illustration will be driven by concrete case studies that we will “solve” using R. The course can be regarded as a hands-on complement to the Data Science for Business Analytics course.
- Data pre-processing (dealing with unknown values)
- Defining the data mining task
- Classification approaches
- Performance estimation for time series models
- Modeling and performance estimation
- Model outcomes and model selection
Data Driven Decision Making
Previously taught by: Vishal Singh
Course description: "Every two days we now create as much information as we did from the dawn of civilization up until 2003." —Eric Schmidt (CEO Google)
"Data are widely available; what is scarce is the ability to extract wisdom from them." —Hal Varian (UC Berkeley and Chief Economist, Google)
The two quotes above summarize the main theme of this course. In every aspect of our daily lives, from the way we work and shop to the way we communicate and socialize, we are both consuming and creating vast amounts of information. More often than not, these daily activities create a trail of digitized data that is being stored, mined, and analyzed by firms hoping to create valuable business intelligence. With technological advances and developments in customer databases, firms have access to vast amounts of high-quality data that allow them to understand customer behavior and customize business tactics to increasingly fine segments, or even segments of one. However, much of the promise of such data-driven policies has failed to materialize because managers find it difficult to translate customer data into actionable policies. The general objective of this course is to fill this gap by providing students with tools and techniques that can be used for making business decisions. Note that this is not a statistics or mathematics course. The emphasis of the class will be on applications and interpretation of the results for making real-life business decisions.
- Regression-based model development
- Capturing non-linear effects: dummy variables & log transformations
- Estimating & interpreting log demand models
- Using log-regressions to understand the competitive marketplace
Previously taught by: Harry Chernoff
Course description: “Operations and supply management use analytical thinking to deal with real-world problems.” —F. Robert Jacobs.
This course is an introduction to the principles and techniques of operations analytics. Operations and supply management is defined as the design, operation, and improvement of the systems that create and deliver the firm's primary products and services.
“A critical success factor in gaining competitive advantage is the ability to apply the right analytics at the right time, to the right people, at the right place and under the right situation.” —Joseph Chan.
In this course, students will learn operations models and techniques that work with large data sources. Operations management has applied analytics for many years. Recently, however, due to big data, many older models and software packages have become incapable of running the analyses. This course will demonstrate the application of operations models that are currently used in industry and that incorporate big data.
- Process flow
- Process design and analysis
- Project management
- Quality, value and cost
Module 3: Shanghai, China
Previously taught by: Kristen Sosulski
Course description: “Visualization is a kind of narrative, providing a clear answer to a question without extraneous details.” -Ben Fry. This course is an introduction to the principles and techniques of data visualization. Visualizations are graphical depictions of data that can improve comprehension, communication, and decision making. In this course, students will learn visual representation methods and techniques that increase the understanding of complex data and models. Emphasis is placed on the identification of patterns, trends and differences from datasets across categories, space, and time. This is a hands-on course. Students will use several tools to refine their data and create visualizations. These include: R, Python, ManyEyes, HTML/CSS, D3.js, Google Charts, Adobe Illustrator, and Excel.
- Design principles for charts and graphs
- Creating data displays
- Designing effective digital presentations
- Visualizing categorical data
- Time series data, multiple variables, and geospatial data
- Dashboard design
- Web-based visualizations
Previously taught by: Arun Sundararajan
Course description: Social media and mobile commerce create massive connected data sets that contain a wealth of business and social insights. This course will translate cutting-edge network science research into actionable analytics strategies for dealing with big data that is networked, text-intensive, and unstructured, with applications in viral marketing, A/B testing, and media planning.
- Network basics
- Strength and trust in social networks
- Measuring and interpreting network position
- Community structure in networks
- Identifying and measuring contagion in networks
Decision Under Risk
Previously taught by: Gustavo Vulcano
Course description: Analytics is “the scientific process of transforming data into insight for making better decisions.” For example, sales data can help us understand consumer purchase behaviors as well as demand patterns. These insights can be used to make sales forecasts, which in turn can inform assortment and production planning decisions. Optimization models have played a very important role in turning “insights” into “decisions” for companies in various industries: advertising, airlines, energy, investment and finance, marketing, manufacturing, retailing, etc. This course is aimed at enriching students' exposure to business analytics techniques. It has two main parts. The first part covers sensitivity analysis, which is a follow-up to the linear programming topic covered in the Decision Models course, and which relates to understanding the impact of changing the parameters of a model on the optimal solution. This analysis is carried out using Excel Solver. The second part, which spans most of the course, covers decision making under uncertainty. Students will learn how to build optimization models that incorporate random parameters (e.g., stochastic demand, price, etc.).
- Sensitivity analysis for linear programming
- Decision analysis
- Two-stage stochastic optimization with recourse
- Dynamic programming
Module 4: New York City, USA
Revenue Management and Pricing
Previously taught by: Rene Caldentey; Gustavo Vulcano
Course description: Revenue Management and Pricing (RMP) focuses on how firms should manage their pricing and product availability policies across different selling channels in order to maximize performance and profitability. One of the best-known applications of RMP is yield management, whereby airlines, hotels, and other companies seek to maximize operating contribution by dynamically managing capacity over time. Building on a combination of lectures and case studies, the course develops a set of methodologies that students can use to identify and develop opportunities for revenue optimization in different business contexts, including the transportation and hospitality industries, retail, media and entertainment, financial services, health care, and manufacturing. The course places particular emphasis on the quantitative models needed to tackle important business problems, including capacity allocation, markdown management, dynamic pricing for e-commerce, customized pricing, and demand forecasting under market uncertainty.
- Demand segmentation
- Price differentiation
- Constrained pricing
- Marginal value of capacity
- Network revenue management
- Pricing policies in action
- Demand forecasting and data analysis
Data Privacy and Ethics
Previously taught by: Solon Barocas
Course description: There is a growing sense of urgency around the ethics of analytics. Harvard Business Review, for instance, declared that “oversight for algorithms” and “data privacy” would be among the top trends that business professionals could not ignore in 2016. Our class will tackle these topics head-on. Together, we will explore what it means to use analytics ethically and how to think about its ethical implications. We will approach these matters from the perspective of professionals who lead analytics-oriented teams or organizations, and whose success depends on their ability to recognize the ethical issues at stake and resolve these to the satisfaction of multiple stakeholders.
- Understanding sources of unfairness in analytics
- Unique challenges that analytics poses for privacy
- Policy responses and proposals
- Conducting experiments ethically
Strategy, Change and Analytics
Previously taught by: JP Eggers
Course description: This course focuses on significant strategic decisions—such as the introduction of new products or the acquisition of another firm—and explores how data-driven and analytical approaches can be used to inform these decisions from a senior management perspective. A case-based approach allows us to discuss details of significant strategic decisions. We will cover some core aspects of business strategy, including external analysis, competitor analysis, and opportunity analysis. We will also look more deeply at different aspects of the decision-making process within organizations, both to understand the process and to think about implementation. The goal is to understand the role of analytics and analytical approaches in the broader organization.
- Value creation and capture
- Firm positioning versus competitors
- Ratio analysis in strategy
- Flexibility and commitment in strategy
- Leading organizational change; causality and interpretation of analytical results
Module 5: New York City, USA
The Capstone project, which students work on throughout the year, is presented at the culmination of the program. This integrative exercise gives students an opportunity to review and interpret data through statistical and operational analysis with the use of predictive models and the application of optimization techniques. The result is a unified and practical case presentation on a topic of the group's choosing. This is a team-based project with approximately 4-6 students per group. The integrative projects should not take the form of formal dissertations or narrative papers. Rather, they should take the form of “reports to management,” emphasizing substance over length and the forest over the trees. Where possible, they should be action-oriented and framed in terms of business policy and competitive strategy. Given this format, they should be easily convertible into PowerPoint presentations.
To review more information on past projects, visit our Capstone page.