Hadoop and NGS data processing hackathon IV

The hackathon is a technical meeting of the WG3 parallel computing task force.

Local organisers: Dimitar Vassilev, Ognyan Kulev.


The large volumes of data generated by modern sequencing technologies present significant challenges in their manipulation and analysis. In addition, the continuous technological advances are lowering sequencing costs, thus allowing larger data sets to be collected and new types of applications to be envisioned. The bioinformatics community is adopting novel computing technologies to cope with the challenges set forth by complex data integration, management, storage and processing tasks.

This hackathon will give experts in the field the chance to work together, exchange ideas, and implement or test prototype solutions to problems of mutual interest. The outcome of the event will be a report describing the results of any research, experimental, or design work done at the event, with references to any source code produced, which must be released to the public.

The hackathon is centered on problems in the high-throughput sequencing domain. Within this realm, the topics of interest include, but are not limited to:

  • cloud computing
  • data/sample tracking; recording operations performed on data
  • dataset and storage management
  • distributed bioinformatics applications
  • distributed cloud databases
  • distributed data processing frameworks, such as Hadoop
  • high-level distributed programming (e.g., Pig, Hive)
  • process automation
  • workflow management; high-level programming

The event is aimed at bioinformaticians and software developers who are using or planning to use Hadoop or similar cloud computing frameworks.

What is a hackathon?

A hackathon is an event in which computer programmers and others involved in software development […] collaborate intensively on software projects. (Wikipedia).

In short, the way SeqAhead hackathons are organized is as follows. Prior to the event, participants contribute ideas for “Tasks” to be tackled at the hackathon. Tasks are items of work defined such that:

  • something useful can be delivered in the time available (usually two days);
  • they are of interest to more people than just the author.

At the event, everyone first expresses interest in whichever tasks they wish to work on. Then we enter the following loop:

  1) Pick a task that interests you;
  2) Approach others who are also interested in the same task;
  3) Create a wiki page for the task (use it to document your work results);
  4) Work together on the task, produce something;
UNTIL true == false;

What to produce?

Whatever the task you decide to work on, produce something concrete.

  • Code: produce a patch for a project or commit new code to our own projects.
  • A report: if you're doing performance testing, for instance, or dealing with a problem that is too big to solve in a couple of days, produce a report that details your results and/or your longer-term plan or a design for a solution.
  • Documentation, howtos, recipes: some of the tasks are to produce instructions for performing a certain task.

Hackathon tasks

Date and Location

The hackathon will be held on 1-2 April 2014 (Tuesday and Wednesday) at AgroBioInstitute, Sofia, Bulgaria. The exact address is:

4th floor, Faculty of Biology, Sofia University
8, Dragan Tsankov Blvd
1164 Sofia

Google maps:


The event is open to everyone. Please contact the local organizers if you're interested.

Getting here

Sofia Airport is within the Sofia city boundaries. Terminal 1 is mostly for charter and internal flights, so you'll arrive at and depart from Terminal 2.

The currency in Bulgaria is BGN. €1 is very close to 2 BGN. It's recommended to carry BGN for shopping.

Public transport by bus is not recommended. You are expected to buy a ticket beforehand, and it has to be punched in a mechanical punch machine. You may try to buy a ticket from the bus driver, but he or she may not have any or may not understand English. The ticket price is equivalent to €0.50. Terminal 2 of Sofia Airport doesn't have a place where you can buy tickets (should be checked again).

The subway from the airport to the city is under construction and not available yet. Within the city, the subway is highly recommended. At metro stations, there are ticket machines that accept BGN coins and banknotes.

It's best to take a taxi from the airport to your accommodation. Use the official taxi company “OK Supertrans”. When you leave customs, go right and you'll see a line of yellow taxi cabs with the “OK Supertrans” logo. Do not accept offers from random taxi drivers walking around the airport. The price to the accommodation should be at most €10-15.



Tuesday (2014-04-01)

09:00 Illumina presentation
10:00 Coffee break
10:15 Hackathon
12:30 Lunch
13:30 Hackathon
16:30 Coffee break
16:45 Hackathon
19:00 Hackathon day end
19:30 Dinner

Wednesday (2014-04-02)

09:00 Hackathon
10:15 Coffee break
10:30 Hackathon
12:30 Lunch
13:30 Hackathon
15:30 Coffee break
15:45 Hackathon
17:30 Closing hackathon, reporting progress
18:00 Hackathon ends

COST reimbursement

If you qualify, have registered for the event through the local organizers, and attend all days of the hackathon, COST will reimburse some of your expenses. The invitation you receive should contain a link to a memo that explains everything in detail. In short, you should be covered for:

  • 3 nights x €120 for accommodations and breakfast
  • 3 meals x €20 (plus you will be invited to the event dinner)
  • flights
  • local transport, with some limitations

These figures are provided for informational purposes only. Please verify them by checking the COST system or contacting the chair of the COST Action BM1006, which funds this event.
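
As a quick check of the arithmetic, the fixed-rate part of the list above can be totalled with a few lines of Python. This is only a sketch: it uses the informational rates quoted here (€120 per night, €20 per meal), leaves out flights and local transport because they depend on actual costs, and the function name `fixed_rate_total` is made up for this example. The COST memo remains the authoritative source.

```python
# Rough upper bound on the fixed-rate part of a COST reimbursement.
# The rates are the informational figures quoted on this page, not
# official values; flights and local transport are not included.

NIGHT_RATE_EUR = 120  # accommodation and breakfast, per night
MEAL_RATE_EUR = 20    # per meal


def fixed_rate_total(nights: int = 3, meals: int = 3) -> int:
    """Return the flat-rate total in euros for the given nights and meals."""
    if nights < 0 or meals < 0:
        raise ValueError("nights and meals must be non-negative")
    return nights * NIGHT_RATE_EUR + meals * MEAL_RATE_EUR


if __name__ == "__main__":
    print(f"Fixed-rate part: EUR {fixed_rate_total()}")
```

With the defaults (3 nights, 3 meals) this gives €420; pass smaller counts if you stay fewer nights.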


Wi-Fi: SeqAhead or SeqAhead2
Password: hackathon2014

The JGC compute machines are listed below. You can connect with ssh or PuTTY. The username is “hackathon” and the password is “hackathon2014”. You can run “sudo -i” to get root access. The system is Debian 7 (Wheezy). For help with installations, talk to Ognyan or Milko. (29G RAM) (29G RAM) (60G RAM) (60G RAM)

Participants can apply for access to the GWDG German cloud before the hackathon:

It is currently used by INCF for development tasks and it works pretty well.

Unless the UPPMAX cloud is available…


Last modified: 2014/04/01 10:22 by Ognyan Kulev