Hackathon: Data Acquisition, Management, and Visualization

This hackathon is part of the laboratories organized by the MSc in Data Science for Economics, University of Milan, Italy. Academic year 2020/2021.

Description

In this 2-day, full-time lab, you will learn how to create, deploy, and maintain a real-world data-driven platform. Through the case study of the COVID-19 Data Hub, we will see how to design and develop an international platform that now serves thousands of users worldwide with more than 3 million downloads. You will then work on similarly challenging projects in the hackathon, which starts at the end of day 1, continues overnight, and ends with your pitch at the end of day 2.

On the technical side, you will gain hands-on experience with code management on GitHub, cloud computing with Google Cloud Platform, shell scripting in Linux, database management, and advanced R/Shiny and Python programming.

This is an advanced and intensive lab. Students who apply should be confident with at least 75% of the topics covered and should independently gain a basic understanding of the remaining 25% before the lab begins. If possible, this 2-day, full-time lab will be held on the weekend so that 1) you will have no other commitments and 2) only highly motivated students will be willing to sacrifice their weekend and join us. Depending on the situation and on the university constraints, this lab may take place:

  • online on the weekend
  • at the University during the workweek
  • at a co-working space on the weekend

In any case, you should be ready to work either remotely or on-site, and either on the weekend or during the workweek.

Duration

Two full days, around 10h per day (9:00-19:00, with a 2h break), plus overnight work between day 1 and day 2.

Calendar

The lab will be held towards the end of the 3rd semester to increase the probability that we can meet on site.

Contents

Saturday

I will illustrate in detail the workflow used to build the COVID-19 Data Hub from scratch. In particular:

  • workflow design
  • data acquisition
  • managing the code on GitHub
  • managing the data on the cloud
  • automating tasks via cron jobs

From this setup, we will see how to create an SQL database and an interactive visualization tool on top of it.
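As a rough illustration of the acquisition-to-database step, here is a minimal Python sketch that parses CSV data and loads it into an SQL database using only the standard library. It is not the COVID-19 Data Hub implementation: the table name, column names, and sample values are made up for illustration, and SQLite stands in for whatever database is used in the lab.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a downloaded CSV file. The column
# names and values are made up; this is not the COVID-19 Data Hub schema.
SAMPLE_CSV = """date,country,confirmed
2020-03-01,Italy,100
2020-03-02,Italy,150
2020-03-01,France,30
"""


def load_rows(csv_text):
    """Parse CSV text into a list of (date, country, confirmed) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(r["date"], r["country"], int(r["confirmed"])) for r in reader]


def build_database(rows, path=":memory:"):
    """Create a SQLite database with one table and insert the rows."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cases ("
        "date TEXT NOT NULL, country TEXT NOT NULL, confirmed INTEGER, "
        "PRIMARY KEY (date, country))"
    )
    # INSERT OR REPLACE keeps repeated runs from duplicating rows,
    # which matters if the script is scheduled to run periodically.
    conn.executemany("INSERT OR REPLACE INTO cases VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def latest_by_country(conn):
    """Return the most recent confirmed count for each country."""
    query = (
        "SELECT c.country, c.date, c.confirmed FROM cases c "
        "JOIN (SELECT country, MAX(date) AS d FROM cases GROUP BY country) m "
        "ON c.country = m.country AND c.date = m.d ORDER BY c.country"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_database(load_rows(SAMPLE_CSV))
    for row in latest_by_country(conn):
        print(row)
```

In a real setup, the sample string would be replaced by data fetched from the actual sources, and a script of this kind could be run on a schedule via a cron job; a visualization tool would then query the database rather than the raw files.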

At the end of the day, we will discuss and select the projects you will start to work on in the hackathon in groups of 2-3 people. I will propose some projects, but you are encouraged to bring your own ideas.

Overnight

Have fun with your projects! I’ll be available remotely.

Sunday

Complete your project and pitch it at the end of the day. I’ll be there (hopefully) in person to help out.

Eligible students

Max 10 students.

Requirements

This is an advanced and intensive lab. Students who apply should be confident with at least 75% of the following topics and should independently gain a basic understanding of the remaining 25% before the lab begins. The topics are:

  • GitHub
  • Shell scripting (in Linux)
  • SQL
  • R or Python

Moreover, by the beginning of the lab you should have created:

How to apply

Send an email to dse@unimi.it with your CV and a cover letter explaining why you want to join this lab and how you meet the requirements. Please include in your CV all relevant links to your GitHub account, previous projects, and your website or portfolio.

  • email: dse@unimi.it
  • subject: Hackathon: Data Acquisition, Management, and Visualization
  • attachments: CV and cover letter

Deadline: 31 January 2021, 23:59.