Rosalind Project

The Rosalind Project started right after I finished my Bachelor's in Biology. After talking to some of my teachers during my last semester, I decided that I wanted to study more computer science and ultimately study
Bioinformatics (which I ended up doing the following year).

I ended up completing quite a large number of the Rosalind problems during this period and learned a lot of the basics of programming by doing them. During my master's courses I did not complete many more, as most of my time
was taken up by school. However, during my internship at Pharmacelera I learned a lot more about good coding practices and redesigned each piece of code for use in a library that could eventually be downloaded by anyone attempting
the tasks listed on Rosalind.

So now, I will explain how the project is broken down, what else needs to be done, and how to use the code found here!

Rosalind Website

Project Breakdown


The project is broken into two major parts at the topmost level: the input files and the source code. The input files folder is where you can store all the inputs that you are given and where the code will pull data from to complete tasks. If you
want it to be used in another way, feel free to message me, or you can attempt to modify it yourself; it’s not too complex an input system.

Source Code

The code is organized in a way similar to the deepchem library, the only library that I have spent a large amount of time becoming familiar with. It has a root function called runLib that can be called directly and
contains a full list of all of the individual problems that I have worked on. This file is only needed if you want to use the library on its own rather than include it in another, larger system.
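
As a rough illustration of the root-function idea, here is a minimal sketch of how a dispatcher like runLib could map Rosalind problem IDs to solver functions. The names `run_lib`, `PROBLEMS`, `count_nucleotides`, and `transcribe` are hypothetical stand-ins chosen for this example, not the library's actual code.

```python
from typing import Callable, Dict


def count_nucleotides(dna: str) -> str:
    """DNA: count the A, C, G and T symbols in a strand."""
    return " ".join(str(dna.count(base)) for base in "ACGT")


def transcribe(dna: str) -> str:
    """RNA: transcribe DNA into RNA by replacing T with U."""
    return dna.replace("T", "U")


# Hypothetical registry: one entry per solved problem, keyed by Rosalind ID.
PROBLEMS: Dict[str, Callable[[str], str]] = {
    "DNA": count_nucleotides,
    "RNA": transcribe,
}


def run_lib(problem_id: str, data: str) -> str:
    """Look up the solver for problem_id and run it on the input string."""
    try:
        solver = PROBLEMS[problem_id.upper()]
    except KeyError:
        raise ValueError(f"Unknown Rosalind problem: {problem_id}") from None
    return solver(data)
```

With a registry like this, adding a new problem would only mean writing its solver and adding one dictionary entry.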

Rosalind Library

This is the meat of the code and contains the problem code, the tool code, and the loader code. (There may be more sections later if I end up needing them.)

The problem codes are each written to solve a single problem found on Rosalind and are called through runLib. These in turn draw on their own code plus the tool and loader code.

The loader is primarily a single piece of code that takes the input data and passes it to the program in a usable format, specifically a list of lists. (I will eventually move this to an object, but it is not necessary at the moment.)
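
To make the list-of-lists format concrete, here is a small sketch of what such a loader might look like. It assumes each non-empty line of an input file becomes one inner list of whitespace-separated tokens; the function name `load_input` and that splitting rule are assumptions made for this example, not the library's actual behavior.

```python
from pathlib import Path
from typing import List


def load_input(path: str) -> List[List[str]]:
    """Read an input file and return one list of tokens per non-empty line."""
    rows: List[List[str]] = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line:  # skip blank lines
            rows.append(line.split())
    return rows
```

A problem that takes a single DNA string would then read `rows[0][0]`, while one that takes several integers on a line would convert the tokens in `rows[0]`.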

The tool code is primarily functions that I have used in more than one problem and that can easily be reused in new code. As the library progresses, this section will expand quite a bit.
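
For a sense of what might live in the tool code, here are two small helpers of the kind that several Rosalind problems can share. Both `reverse_complement` and `gc_content` are illustrative names assumed for this sketch, not necessarily functions in the library.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complement every base, then reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]


def gc_content(dna: str) -> float:
    """Return the percentage of bases in the strand that are G or C."""
    if not dna:
        return 0.0
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)
```

Helpers like these could back the REVC and GC problems directly and be reused by later problems that need the same operations.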

More Resources

Github link

Rosalind Problem Tutorials

  • DNA: Counting DNA Nucleotides
  • RNA: Transcribing DNA into RNA
  • REVC: Complementing a Strand of DNA
  • FIB: Rabbits and Recurrence Relations
  • GC: Computing GC Content
  • HAMM: Counting Point Mutations
  • IPRB: Mendel’s First Law
  • SUBS: Finding a Motif in DNA