NYU Stern School of Business

Undergraduate College

INFO-UB.0046.001 (C20.0046): Dealing with Data

Spring 2013

Instructor Details

Wang, Jing

jwang5@stern.nyu.edu

212-998-0465

Thursdays, 11:00am-12:30pm by appointment

KMC 8-179

 

Course Meetings

TR, 9:30am to 10:45am

KMC 5-75


Final Exam:

Schedule exceptions
    Class will not meet on:
    Class will meet on:

 

Course Description and Learning Goals

We live in a world where a large volume of data is generated every day: user queries in search engines, discussions in social forums, call histories in mobile phones, credit card transactions in electronic commerce, voting records in election campaigns, etc. These data can potentially be used as the new "oil" to fuel business, science, government, and society as a whole. The key challenge is how to transform raw data into valuable information and, ultimately, knowledge that helps people make better decisions.


The objective of this course is to teach you some of the basic techniques for dealing with potentially massive amounts of data. This course guides you through the whole data management process, from the initial data acquisition to the final data analysis. We will study a broad set of topics, including data acquisition, data cleaning and formatting, common data formats, data representation and storage, data transformations, database management systems, data analysis, “big data” analytics, and data visualization.


Upon completion of this course, you will:

 

Course Pre-Requisites

The course does not assume any prior programming experience. However, some basic knowledge of programming would be helpful.

 

Course Outline

Week Topics
1 Introduction to data; how different types of data (binary, floating point, character, etc.) are represented in the computer.
2 Files, records, and fields; sequential processing and random access processing; sorting and merging data.
3 Converting unstructured data to common formats such as CSV, tab-delimited, fixed-format, and XML.
4 Unix command-line text processing tools: sed, grep, cut, awk, uniq, etc.; the concept of pipeline processing; regular expressions.
5 Relational model, SQL as data definition language.
6 Relational algebra, SQL as data manipulation language.
7 Entity-Relationship diagrams, translating ERD into relational model.
8 Normalization of relational tables.
9 Common business analytics tools: Excel, MATLAB, Stata, R.
10 "Big Data". Discussion of the Google File System, MapReduce, and Hadoop.
11 "Big Data" analytics. HBase, Hive, Pig, and other tools for handling massive databases.
12 Data visualization. How to visualize large amounts of data using graphical techniques.
13 Final project presentations

 

Required Course Materials

 

Assessment Components

Student grades will be based on homework assignments, the midterm exam, the final project, class participation, and team member ratings.

Component Percentage
Homework Assignments 20%
Midterm Exam 35%
Final Project 35%
Class Participation 5%
Team Member Rating 5%

 

Group Projects

Guidelines for Group Projects

Business activities involve group effort. Consequently, learning how to work effectively in a group is a critical part of your business education.

Every member is expected to carry an equal share of the group’s workload. As such, it is in your interest to be involved in all aspects of the project. Even if you divide the work rather than work on each piece together, you are still responsible for each part. The group project will be graded as a whole: its different components will not be graded separately. Your exams may contain questions that are based on aspects of your group projects.

It is recommended that each group establish ground rules early in the process to facilitate your joint work, including a problem-solving process for handling conflicts. In the infrequent case where you believe that a group member is not carrying out his or her fair share of work, you are urged not to permit problems to develop to a point where they become serious. If you cannot resolve conflicts internally after your best efforts, bring them to my attention and I will work with you to find a resolution.

You will be asked to complete a peer evaluation form to evaluate the contribution of each of your group members (including your own contribution) at the conclusion of each project. If there is consensus that a group member did not contribute a fair share of work to the project, I will consider this feedback during grading.

 

Grading

At NYU Stern we seek to teach challenging courses that allow students to demonstrate their mastery of the subject matter.  In general, students in undergraduate core courses can expect a grading distribution where: 

Note that while the School uses these ranges as a guide, the actual distribution for this course and your own grade will depend upon how well you actually perform in this course.

 

Re-Grading

The process of assigning grades is intended to be one of unbiased evaluation. Students are encouraged to respect the integrity and authority of the professor’s grading system and are discouraged from pursuing arbitrary challenges to it.

If you believe an inadvertent error has been made in the grading of an individual assignment or in assessing an overall course grade, a request to have the grade re-evaluated may be submitted. You must submit such requests in writing to me within 7 days of receiving the grade, including a brief written statement of why you believe that an error in grading has been made.

 

Professional Responsibilities For This Course

Attendance

 

Participation

In-class contribution is a significant part of your grade and an important part of our shared learning experience. Your active participation helps me to evaluate your overall performance.
You can excel in this area if you come to class on time and contribute to the course by:

 

Assignments

 

Classroom Norms

 

Stern Policies

General Behavior
The School expects that students will conduct themselves with respect and professionalism toward faculty, students, and others present in class and will follow the rules laid down by the instructor for classroom behavior.  Students who fail to do so may be asked to leave the classroom. 

 

Collaboration on Graded Assignments
Students may not work together on graded assignments unless the instructor gives express permission.

 

Course Evaluations
Course evaluations are important to us and to students who come after you.  Please complete them thoughtfully.

 

Academic Integrity

Integrity is critical to the learning process and to all that we do here at NYU Stern. As members of our community, all students agree to abide by the NYU Stern Student Code of Conduct, which includes a commitment to:

The entire Stern Student Code of Conduct applies to all students enrolled in Stern courses and can be found here:

Undergraduate College: http://www.stern.nyu.edu/uc/codeofconduct
Graduate Programs: http://w4.stern.nyu.edu/studentactivities/involved.cfm?doc_id=102505

To help ensure the integrity of our learning community, prose assignments you submit to Blackboard will be submitted to Turnitin.  Turnitin will compare your submission to a database of prior submissions to Turnitin, current and archived Web pages, periodicals, journals, and publications.  Additionally, your document will become part of the Turnitin database.

 

Recording of Classes

Your class may be recorded for educational purposes.

 

Students with Disabilities

If you have a qualified disability and will require academic accommodation of any kind during this course, you must notify me at the beginning of the course and provide a letter from the Moses Center for Students with Disabilities (CSD, 998-4980, www.nyu.edu/csd) verifying your registration and outlining the accommodations they recommend.  If you will need to take an exam at the CSD, you must submit a completed Exam Accommodations Form to them at least one week prior to the scheduled exam time to be guaranteed accommodation.

 
