Further learning
Events
- BioInfoCore meeting – Takes place every year during the ISMB conference.
- RITrainPlus – Join this community’s private Facebook Group to get updates on upcoming seminars, community events and training opportunities. This CoP also has a presence on LinkedIn.
Communities of practice
- Bioinfo-Core – The ISCB’s community of practice for core facilities managers
- RITrainPlus – Project focused on upskilling the technical staff of research infrastructures and core facilities in different fields
- UK Bioinformatics group – For discussion, announcements and meeting organisation
- Core Technologies for the Life Sciences – Europe-based professional body for core facility managers of all flavours
- Association of Biomolecular Resource Facilities – US-based society for core facility managers of all flavours
- Aurora – Advance HE’s leadership development initiative for women
- Network of European Bioimage Analysts
- biostars.org
- reddit.com/r/bioinformatics
- EMBL-EBI has set up a mailing list, cfpi@ebi.ac.uk, for participants (delegates and trainers) of our courses for core facility managers and principal investigators. The web page for users of the mailing list is: https://listserver.ebi.ac.uk/mailman/listinfo/cfpi
- There is also an email-based interface for users (not administrators) of the list; you can get information about using it by sending a message with just the word ‘help’ as the subject or in the body to cfpi-request@ebi.ac.uk
Recruiting, developing and training
Competency frameworks
Many of these are now navigable at the EMBL-EBI Competency Hub.
- The ISCB education committee’s competency framework for bioinformatics professionals
- A clinical bioinformatics competency framework to support Health Education England to prepare clinical practitioners for the application of genomics in the healthcare service
- The RItrain competency framework for managers and leaders of research infrastructure
- The CORBEL competency framework for technical operators of research infrastructures
- The BioExcel competency framework for scientists working on biomolecular modelling and simulation
Developing and training
- Datacamp: https://www.datacamp.com/
- EMMRI – Executive Master’s in Management of Research Infrastructures
- As well as the full master’s, there’s an open programme and a freely available webinar series
- Bioinformatics Training for Life Scientists: Guidelines for Best Practice
- ELIXIR’s Training e-Support System (TeSS)
- EMBL-EBI Training Programme
- University of Cambridge Bioinformatics Training Programme
- Whitehead Institute BaRC hot topics
Quality assurance for databases and software
- ELIXIR core resources paper: https://f1000research.com/articles/5-2422/v2
- ELIXIR software recommendations: https://f1000research.com/articles/5-2000/v1
- Bioschemas semantic markup: www.bioschemas.org
- Subscribe to the ELIXIR informed newsletter: https://www.elixir-europe.org/news
- Global Alliance for Genomics and Health: http://genomicsandhealth.org/
- Visualisation of statistical principles: http://students.brown.edu/seeing-theory/
Project management and time-tracking tools
Models/concepts
- Gantt charts
- Agile process – initially used for software development but readily adaptable to other projects. Work is organised into two-week sprints, each delivering a defined chunk of work.
- Kanban – better for general operations than projects; a process for shifting as many tasks from ‘doing’ to ‘done’ as efficiently as possible
- Risk register – tool for managing and mitigating risk in a project
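As a rough illustration of what a risk register holds, here is a minimal Python sketch. The 1 to 5 scales, the likelihood × impact score and the example entries are all assumptions made for illustration; many teams use other scoring schemes or simply a spreadsheet.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    description: str
    likelihood: int  # 1 (rare) to 5 (almost certain); this scale is an assumption
    impact: int      # 1 (negligible) to 5 (severe); this scale is an assumption
    mitigation: str

    @property
    def score(self) -> int:
        # Likelihood x impact is one common convention, not the only one.
        return self.likelihood * self.impact


def rank_risks(register: list[Risk]) -> list[Risk]:
    """Return the risks ordered from highest to lowest score."""
    return sorted(register, key=lambda r: r.score, reverse=True)


# Hypothetical entries, for illustration only.
register = [
    Risk("Key analyst leaves mid-project", 2, 5,
         "Document pipelines; cross-train staff"),
    Risk("Cluster storage fills up", 4, 3,
         "Set quotas; archive finished projects"),
    Risk("Sequencing data delivered late", 3, 2,
         "Agree delivery dates up front"),
]

for risk in rank_risks(register):
    print(f"{risk.score:>2}  {risk.description}  ->  {risk.mitigation}")
```

Reviewing the ranked list at regular project meetings keeps the highest-scoring risks, and their mitigations, in view.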
Tools
- Jira agile project management software – much loved by software developers; free licenses for academics; integration with other Atlassian tools (e.g. Confluence wiki); beware – there are lots of third-party plugins that do really cool things but they are expensive!
- Asana project management app – user friendly; good integration with visualisation tools such as Instagantt and diary crosslinking; free licensing only good for teams of 15 or fewer
- Excel – includes simple templates for, e.g. Gantt charts
- GitHub / GitLab (https://about.gitlab.com/) – single application covering the entire DevOps cycle.
- https://toggl.com/ – time tracking software; can be integrated with numerous other tools
- https://clickup.com/ – user-friendly alternative to Jira; built with generic project management in mind but includes lots of tools for managing software development projects.
- Google templates for project management
- Redmine
- Trello – simple, visual kanban-style board; not ideal for complex projects but OK if you just have to get through tons of tasks or tickets.
- Bugherd – neat combination of project management and collaborative bug tracking; great for development of web-based services
- Basecamp – project management and team communication; central storage of shared documents
Compute
- Docker cannot be run on HPC unless you give your users root access (which you don’t want to do!); Singularity was designed to enable containerised processes on HPC. Docker is moving in this direction but is not there yet.
- Resourcing the use of commercial cloud – you need to optimise use of commercial cloud services to ask for the right resources for the job; you need to calculate how your user is billed for the job that they’re doing (or that you’re doing for them). Platform as a service enables you to ‘get stuff done’ whilst retaining a lot of flexibility; containers as a service are a subtype of platform as a service. Kubernetes enables you to manage containers as a service.
- You also need a workflow language to string containerised services together, e.g. common workflow language. Better to push the task to the data than to pull the data to the task.
- Software as a service – EMBL-EBI tools as a classic example; user operates everything over the web.
- Hybrid cloud – can you set up an entire workflow and port it from your own compute to the public cloud if you need more compute than your system can provide? But how do you manage large data sets in hybrid-cloud mode? There are trade-offs to be made; better to use AWL’s standard protocol than NFS.
- If you know where your data are, or if all your data are in one place, Galaxy, Nextflow etc. are fit for purpose; if your data are distributed in many places and data discovery becomes part of your workflow, you need something more scalable that allows for the use of in-house or external clusters.
- nf-core – community effort to collect a curated set of analysis pipelines built using Nextflow
- Packaging up different versions of software for different types of user – Bioconda – https://bioconda.github.io/
- The EMBL-EBI ResOps cloud consultancy team is creating a standalone version of its ResOps course that can be watched on demand, but it will be some time before this is released. The material for self-study is at https://bit.ly/resops-2019, and if anyone wants to give it a go, Tony Wildish has very kindly offered to help with any questions: please send them to cloud-consultants@ebi.ac.uk.
- There are also courses from Coursera, Linux Academy and Kube Academy on getting cloud ready; it’s worth checking the course pre-requisites first.
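To make the point about calculating how a job is billed more concrete, here is a minimal Python sketch of a back-of-the-envelope cost estimate. The three cost components (compute, storage, data egress) and every price below are hypothetical placeholders, not any provider's real tariff; actual billing models vary and should be checked against the provider's price list.

```python
from dataclasses import dataclass


@dataclass
class CloudRates:
    # All prices are hypothetical placeholders, in an arbitrary currency.
    per_vcpu_hour: float
    per_gb_storage_month: float
    per_gb_egress: float


def estimate_job_cost(rates: CloudRates, vcpus: int, hours: float,
                      storage_gb: float, storage_months: float,
                      egress_gb: float) -> dict[str, float]:
    """Split a job's estimated cost into compute, storage and egress."""
    compute = vcpus * hours * rates.per_vcpu_hour
    storage = storage_gb * storage_months * rates.per_gb_storage_month
    egress = egress_gb * rates.per_gb_egress
    return {
        "compute": round(compute, 2),
        "storage": round(storage, 2),
        "egress": round(egress, 2),
        "total": round(compute + storage + egress, 2),
    }


# Hypothetical job: 16 vCPUs for 24 hours, 500 GB stored for one month,
# 100 GB of results downloaded afterwards.
rates = CloudRates(per_vcpu_hour=0.05, per_gb_storage_month=0.02,
                   per_gb_egress=0.10)
estimate = estimate_job_cost(rates, vcpus=16, hours=24,
                             storage_gb=500, storage_months=1,
                             egress_gb=100)
for item, cost in estimate.items():
    print(f"{item:>8}: {cost:.2f}")
```

Keeping egress as its own line item shows why it is usually better to push the task to the data than to pull the data to the task: moving large data sets out of the cloud can account for a noticeable share of the bill.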
Papers and other pearls of wisdom
Papers
- Establishing a Successful Bioinformatics Core Facility Team. 2009; PLoS Computational Biology.
- Metrics for Success: Strategies for Enabling Core Facility Performance and Assessing Outcomes. 2016; Journal of Biomolecular Techniques.
- A Framework for Managing Core Facilities within the Research Enterprise. 2009; Journal of Biomolecular Techniques.
- Institutional Management of Core Facilities during Challenging Financial Times. 2011; Journal of Biomolecular Techniques.
- Acknowledging and citing core facilities. 2022; EMBO Reports.
- Science Forum: A survey of research quality in core facilities. 2020; eLife (Cell Biology; Chromosomes and Gene Expression).
- Core facilities are central hubs of discovery. 2022; Nature Index.
Blogs
- Ewan Birney’s blog. See especially:
- Advice on big data Experiments and analysis, Part 1
- Managing and analysing data part II
- Publishing big data science
Reports, forums and other resources
The following resources were provided by Nicolas Descostes:
- Computing practices: https://lists.papersapp.com/Jp1ShEk_mgiD
- Open source: https://lists.papersapp.com/BENsIwOp3764
- Reproducibility: https://lists.papersapp.com/4gvwAqhxi84B
- Teaching Bioinformatics: https://lists.papersapp.com/e9OQtLVsyla7
- Facility experience: https://lists.papersapp.com/zUvYsDXwgV0G
- Facility evaluation: https://lists.papersapp.com/BD3lSHoFtHIS
- Bioinformatics communities: https://lists.papersapp.com/sFLcToWp3y3s
- Bioinformatics CVs: https://lists.papersapp.com/6tYXheLvvS52