Hackathon: Data Acquisition, Management, and Visualization
This hackathon is part of the laboratories organized by the MSc in Data Science for Economics, University of Milan, Italy. Academic year 2020/2021.
In this 2-day, full-time lab, you will learn how to create, deploy, and maintain a real-world data-driven platform. Through the case study of the COVID-19 Data Hub, we will see how to design and develop an international platform that now serves thousands of users worldwide with more than 3 million downloads. You will then work on similarly challenging projects in the hackathon, which starts at the end of day 1, continues overnight, and finishes with your pitch at the end of day 2.
On the technical side, you will gain hands-on experience with code management on GitHub, cloud computing with Google Cloud Platform, shell scripting in Linux, database management, and advanced R/Shiny and Python programming.
This is an advanced and intensive lab. Students who apply should be confident with at least 75% of the topics covered and be able to gain a basic understanding of the remaining 25% on their own before the lab begins. If possible, this 2-day, full-time lab will be held on a weekend so that 1) you will have no other commitments and 2) only highly motivated students will be willing to sacrifice their weekend and join us. Depending on the situation and on university constraints, this lab may take place:
- online on the weekend
- at the University during the workweek
- at a co-working space on the weekend
In any case, you should be ready to work either remotely or on-site, whether on the weekend or during the workweek.
Two days, full time: around 10 hours per day (9:00-19:00, with a 2-hour break), plus overnight work between day 1 and day 2.
The lab will be held towards the end of the 3rd semester to increase the probability that we can meet on site.
I will illustrate in detail the workflow used to build the COVID-19 Data Hub from scratch. In particular:
- workflow design
- data acquisition
- managing the code on GitHub
- managing the data on the cloud
- automating tasks via cron jobs
From this setup, we will see how to create an SQL database and an interactive visualization tool on top of it.
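As a rough illustration of the database step (this is not the actual COVID-19 Data Hub code), here is a minimal Python sketch that loads CSV data into an SQL database and queries it. It assumes SQLite through Python's built-in `sqlite3` module; the table name `cases`, the column names, the helper functions, and the sample values are all hypothetical placeholders.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a downloaded CSV file; the values are
# made-up placeholders, not real figures.
SAMPLE_CSV = """date,country,confirmed
2020-03-01,AAA,10
2020-03-02,AAA,15
2020-03-01,BBB,3
2020-03-02,BBB,7
"""


def load_rows(csv_text):
    """Parse CSV text into (date, country, confirmed) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(r["date"], r["country"], int(r["confirmed"])) for r in reader]


def build_database(rows, path=":memory:"):
    """Create the table if needed and upsert the rows, so reruns are safe."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cases ("
        "date TEXT, country TEXT, confirmed INTEGER, "
        "PRIMARY KEY (date, country))"
    )
    # The primary key makes INSERT OR REPLACE overwrite existing rows
    # instead of duplicating them.
    conn.executemany("INSERT OR REPLACE INTO cases VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def latest_by_country(conn):
    """Return the most recent confirmed count for each country."""
    query = (
        "SELECT c.country, c.confirmed FROM cases AS c "
        "JOIN (SELECT country, MAX(date) AS d FROM cases GROUP BY country) AS m "
        "ON c.country = m.country AND c.date = m.d "
        "ORDER BY c.country"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_database(load_rows(SAMPLE_CSV))
    print(latest_by_country(conn))
```

Because rows are keyed on date and country, running the script again with the same or updated data does not create duplicates, which is convenient when the load is triggered repeatedly by a scheduled task such as a cron job. A visualization tool can then read from the database with queries like the one in `latest_by_country`.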
At the end of the day, we will discuss and select the projects you will start to work on in the hackathon in groups of 2-3 people. I will propose some projects, but you are encouraged to bring your own ideas.
Have fun with your projects! I’ll be available remotely.
Complete your project and pitch it at the end of the day. I’ll be there (hopefully) in person to help out.
Max 10 students.
This is an advanced and intensive lab. Students who apply should be confident with at least 75% of the following topics and autonomously gain a basic understanding of the remaining 25% before the beginning of the experience. The topics are:
- Shell scripting (in Linux)
- R or Python
Moreover, by the beginning of the lab you should have created:
- your account on GitHub (free)
- your account on Google Cloud Platform (free tier up to $300, definitely enough for our purposes)
How to apply
Send an email to firstname.lastname@example.org with your CV and a cover letter explaining why you want to join this lab and how you meet the requirements. Please include in your CV all relevant links to your GitHub account, previous projects, and your website or portfolio.
- email: email@example.com
- subject: Hackathon: Data Acquisition, Management, and Visualization
- attachments: CV and cover letter
Deadline: 31 January 2021, 23:59.