
Hadoop and NGS data processing hackathon

The hackathon is a technical meeting of the WG3 parallel computing task force.

Local organisers: Aleksi Kallio and Eija Korpelainen.

Background

The bioinformatics community is adopting novel cloud computing technologies to cope with challenges set forth by complex data integration tasks and NGS data masses. So far the focus has been on virtualization and technologies that allow rapid deployment of computational environments, such as Amazon EC2.

This hackathon-style workshop focuses on the next step: adopting cloud computing beyond virtualization. Technologies such as Hadoop make it possible to scale out massive computational environments, which will be central when tackling the future challenges of growing NGS datasets. Technically, Hadoop, map-reduce programming and so-called NoSQL databases are a greater migration step than the previous virtualization work. For that reason, collaboration and a pragmatic approach are needed. This workshop will bring together people working with NGS data processing and will focus on practical problem solving.

Date & location

The hackathon will be held from Monday to Wednesday 5.-7.3.2012, at CSC - IT Center for Science Ltd, Espoo (Helsinki), Finland.

Participation

The hackathon is open to everyone interested, but please inform the local organisers about your participation before Monday 13.2. Only one person per COST partner can be reimbursed for travel and hotel costs.

Tasks

The main content of the hackathon is working together on various tasks. See the Tasks page for more information and results.

Schedule

Monday 5.3.
12.00 Lunch (ground floor cafeteria)
13.00 Welcome and practical issues (7th floor meeting room Sessio)
13.10 Round of introductions (7th floor meeting room Sessio)
13.30 Introduction to CloudBioLinux / Roman Valls (7th floor meeting room Sessio)
14.00 Working on tasks for the rest of the day (7th floor meeting rooms Sessio and Selain)
15.00 Coffee (7th floor meeting room Sessio)
~18.00 Possibility for “unofficial social program”

Tuesday 6.3.
9.00 Wrap-up of Monday (7th floor meeting room Sessio)
9.20 A bit about Hadoop / Luca Pireddu (7th floor meeting room Sessio)
9.40 Working on tasks for the rest of the day (7th floor meeting rooms Sessio and Selain)
10.30 Coffee (7th floor meeting room Sessio)
12.00 Lunch (ground floor cafeteria)
15.00 Coffee (7th floor meeting room Sessio)
20.30 Dinner (restaurant Kuurna in central Helsinki)

Wednesday 7.3.
9.00 Wrap-up of Tuesday (7th floor meeting room Sessio)
9.20 Working on tasks for the rest of the day (7th floor meeting rooms Sessio and Selain)
10.30 Coffee with BioMedInfra / ELIXIR people (7th floor meeting room Sessio)
11.30 Future prospects of scalable cloud computing / Keijo Heljanko (7th floor meeting room Sessio)
12.00 Lunch (ground floor cafeteria)
13.20 Wrap-up of Wednesday (7th floor meeting room Sessio)
14.00 End of the hackathon; we can stay for the rest of the day (7th floor meeting rooms Sessio and Selain)

Accommodation

Accommodation is arranged at the Radisson Blu Espoo hotel (http://www.radissonblu.com/hotel-espoo). It is within walking distance of CSC.

You have to pay for the accommodation yourself and then include the costs in your COST reimbursement claim, if you are entitled to one.

Transportation

CSC is located at: Life Science Center, Keilaranta 14, Espoo. See Getting there.

From the Helsinki-Vantaa airport (code: HEL) you can take either a bus or a taxi. A taxi will cost €40-€45.

By bus, first take bus 540 from the airport to Leppävaara railway station. Get off on the bridge above the railway station (stop name: Leppävaara) and take bus 510 or 512A (towards Westendinasema). Get off at the stop called Keilaranta. The bus ride takes 40-50 minutes.

IT facilities

Each participant needs to bring his or her own laptop. Laptops can also be arranged locally, but please contact the local organisers in advance. There will be some displays, keyboards and mice to connect to your laptop. LAN and WLAN will be available.

During the hackathon we will have a dedicated Hadoop cluster consisting of 40 HPC nodes, provided by the BioMedInfra people at CSC. There will also be an NFS server that can be used in benchmarking.

Cluster nodes are Dell PowerEdge C6100. Some have 24 GB and some have 48 GB of memory. CPUs: 2x Intel Xeon X5650 2.66 GHz (6-core), HT enabled → 24 threads per node. Disks: 2x 500 GB SATA 7.2k 3.5”. Network: Onboard Intel 82576 Dual Port 1GbE + Intel 82599 Dual Port 10GbE Mezzanine Card, Gen2 PCIe x8.
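For planning benchmarks, the per-node figures above can be scaled to the whole cluster. The following is only a back-of-the-envelope sketch in Python: it assumes all 40 nodes are identical apart from memory, and since the split between 24 GB and 48 GB nodes is not stated here, it gives only lower and upper bounds for total memory. The function and constant names are made up for this illustration.

```python
# Rough aggregate capacity of the hackathon cluster, using only the
# node counts and per-node specifications listed on this page.
NODES = 40
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 6
THREADS_PER_CORE = 2          # Hyper-Threading enabled
DISKS_PER_NODE = 2
DISK_GB = 500
MEM_GB_OPTIONS = (24, 48)     # mix of node types is not stated (assumption: bounds only)


def threads_per_node():
    """Hardware threads on one node: sockets x cores x threads per core."""
    return SOCKETS_PER_NODE * CORES_PER_SOCKET * THREADS_PER_CORE


def cluster_summary(nodes=NODES):
    """Aggregate figures for the given number of identical nodes."""
    return {
        "threads_per_node": threads_per_node(),
        "total_threads": nodes * threads_per_node(),
        "raw_disk_tb": nodes * DISKS_PER_NODE * DISK_GB / 1000,
        "memory_gb_min": nodes * min(MEM_GB_OPTIONS),
        "memory_gb_max": nodes * max(MEM_GB_OPTIONS),
    }


if __name__ == "__main__":
    for name, value in cluster_summary().items():
        print(f"{name}: {value}")
```

Note that the raw disk figure is an upper limit: the space actually usable in HDFS will be smaller, because of block replication and the space taken by the operating system and temporary data.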

The NFS server has 4x Intel(R) Xeon(R) CPU L5410 @ 2.33GHz, 8 GB of memory, 512 GB Flash Cache, two 10G network interfaces and two FC connections to SAN. It uses 6 LUNs taken from 6 different RAID groups (NL SAS disks, 7200 rpm); the RAID group setup is RAID6 8D+2P.

IRC

We will use IRC during the hackathon and keep the channel running afterwards as well.

The channel is #csc_hackathon and the network is Freenode.

/join #csc_hackathon

 
Last modified: 2012/03/16 12:41 by Aleksi Kallio