When doing assignments, you discovered that it was challenging to collaborate.
Collaboration is key to success, in coding too!
You could:
All of which would be a recipe for inevitable disaster!
Of course there is a better way!
We can use… git!
Git is a distributed version control system
It tracks versions of files
“Git is a distributed version control system that tracks versions of files. It is often used to control source code by programmers who are developing software collaboratively.” (Wiki definition)
“Git is a distributed version control system that tracks versions of analyses. It is often used to control analysis code by bioinformaticians who are developing pipelines collaboratively.” (R4BDS definition)
Bonus info: The brainchild of… Linus Torvalds
Created to track changes in the Linux kernel
R versus RStudio, engine and interface
If you share code in emails, chat groups, or shared docs:
You have no way of tracking what happened, when, to what, and how
You may sit with some results and have no idea how they came about
This is the absolute opposite of doing reproducible bio data science
Git and GitHub:
help you track the EXACT changes made in your analysis project
enable multiple collaborators to work on the same project
are the industry standard in almost any company doing (Bio) Data Science
facilitate teamwork, increasing productivity
Again: Spend time initially to save time in the long run
Remember the talk from the first lab? Using git is absolutely key to doing reproducible research
…and perhaps nearer to you for now
Try it out for yourself
Break and then exercises
TODAY FOLLOWING THE EXERCISES POINT-BY-POINT IS SUPERCALIFRAGILISTICEXPIALIDOCIOUS!!!
R for Bio Data Science