Coding for Data Science and Data Management, DSE UniMI – 2019/2020

The course aims to provide technical skills in coding and scripting for data analysis, and in managing the persistent storage of the sources and results involved in an analysis. On the one hand, the Python programming language and the R framework are introduced, with the goal of covering the essential data structures and control structures of both. On the other hand, the course presents the core notions of relational databases, such as keys, integrity, and primary/foreign key constraints, as well as the SQL language for data definition, manipulation, and querying. Recent NoSQL solutions are also discussed, with a special focus on MongoDB, a document-oriented system.
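As a rough illustration of the database notions mentioned above (primary keys, foreign keys, and integrity constraints), here is a minimal sketch using Python's standard sqlite3 module. It is not taken from the course material: the table names (student, grade), the column names, and the sample values are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # Data definition: 'id' is the primary key of 'student'.
    conn.execute(
        "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    # 'student_id' is a foreign key referencing student(id);
    # (student_id, module) together form a composite primary key.
    conn.execute(
        """CREATE TABLE grade (
               student_id INTEGER NOT NULL REFERENCES student(id),
               module     TEXT    NOT NULL,
               score      INTEGER NOT NULL,
               PRIMARY KEY (student_id, module)
           )"""
    )
    return conn


def try_insert_grade(conn, student_id, module, score):
    """Insert a grade; return False if an integrity constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO grade VALUES (?, ?, ?)",
                (student_id, module, score),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_demo_db()
with conn:
    # Data manipulation: add one (made-up) student.
    conn.execute("INSERT INTO student (id, name) VALUES (1, 'Ada')")

ok = try_insert_grade(conn, 1, "R", 28)       # valid row
dup = try_insert_grade(conn, 1, "R", 30)      # violates the primary key
orphan = try_insert_grade(conn, 99, "R", 25)  # violates the foreign key

# Query: join the two tables on the key relationship.
rows = conn.execute(
    "SELECT s.name, g.module, g.score "
    "FROM student s JOIN grade g ON g.student_id = s.id"
).fetchall()
print(ok, dup, orphan, rows)
```

In this sketch the second insert is rejected because it repeats an existing primary key, and the third because it refers to a student that does not exist, so only the first grade is stored.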

Course Structure

  1. R
  2. Python
  3. Databases

Syllabus (R)

Coding for Data Science and Data Management (R), DSE UniMI – 2019/2020

Lectures (R)

  • Introduction to the R framework and RStudio (html)
  • Basic Data Types (html)
  • Basic Data Structures (html)
  • Basic Operations (html)
  • Time Series (html)
  • Control Structures (html)
  • User-Defined Functions (html)
  • Performance Optimization (html)
  • Data Acquisition (html)
  • Data visualization (ggplot2) (plotly)
  • Building interactive interfaces, documents and websites (shiny) (rmarkdown)
  • Building R packages (structure) (metadata) (data)

Midterm Exam (R)

Midterm exam: R package (pdf) (grades)

Important notice:

  • the grade obtained in the R midterm is valid until Feb 2021
  • if you passed the R midterm you don’t have to take the R module in the written exam
  • if you decide to take the R module in the written exam, the new result will replace the midterm grade, whatever it is (i.e. the midterm grade will no longer be valid)
  • if you wish to immediately reject the midterm grade you can contact me at [email protected]