My first months as an RSE


I shall warn you, naïve reader: this is a long read, so go get yourself a cuppa (and maybe some biscuits) before starting. You have been warned…

Just over 9 months ago I joined the Research Software Engineering team at the University of Sheffield. I was quite lucky to have jumped into this position as I was finishing my PhD at the University of Manchester. I am also quite lucky to be working with a bunch of amazing people.

I have shared some of the reasons that led me to pursue a career as an RSE. Basically, over the course of my PhD I realized that some things were broken (once again, this is just my point of view), and I wanted, somehow, to fix some of these issues. For starters, the bulk of my PhD involved developing software, running a huge number of simulations, and, as a consequence, generating digital objects.

I was initially provided with some poorly documented (or entirely undocumented) scripts, and I spent months trying to reproduce the results obtained with them… Obviously, after some time I gave up and had to write my own scripts from scratch. On another occasion, my supervisor and I came across a publication by a group working on a problem similar to mine. Since the published article included a basic description of the software they had written, I thought it would be possible to reach out and get access to it. I could not have been more wrong: when I reached out, I received a response from the PI stating that the only way they would share their code was if my supervisor and I signed a waiver in which we agreed to add them as significant collaborators and authors on any resulting publications. Again, I gave up and wrote my own scripts.

There is something else: a researcher's number of publications and citations is commonly used as a metric to determine how successful they are in their area of research, whether they are worthy of promotion, and whether they will make the most of the funding requested.

As a consequence, we, the individuals working in research, spend weeks (or months) writing articles. We make sure we thoroughly describe the research approach we used to obtain the results we are presenting: how we designed the experiment, how we selected the sample, how the data was analysed, what measures were taken to ensure the experiments and results are reproducible and accurate, and finally, the much sought-after results. And yes, many (if not the majority) of the published results rely on some sort of software for the data analysis and/or the generation of plots and figures. Yet not many papers request that the data is made available for reproducibility and replicability purposes. And even fewer papers require that the software/code/script used to get those results is made available.

We, as researchers, spend many hours observing the scientific method to ensure people can trust our results: the method is documented, the apparatus is tested, and the results are demonstrated to be accurate, reproducible, and reliable. So why should research software be any different? We should ensure that we develop software that adheres to best practices, with at least a minimum of understandable, frequently updated documentation and code testing. THIS realization is what led me to become an RSE and an advocate for openness, replicability, and reproducibility.

I acknowledge that at the very beginning this seemed a bit like a gamble, and some of my laboratory colleagues did not understand my decision. Why would I not jump into the traditional academic path: a decent postdoc (or way too many) and then a fight for a lectureship? I questioned my decision myself. Even after having a job offer from the University of Sheffield I could not help but wonder: is this the career path I want to follow?

Over the last 9 months I have been able to work with a bunch of amazing researchers who are willing to become more open about their research and who want to ensure that the code they develop is not only useful for their own purposes, but also public and useful to others.

I have come across cases in which the PI decided to make datasets open, as well as the scripts used to analyse that data. The result: incredible collaborations, joint publications, high public visibility, and loads of invitations to present at international conferences. (Don’t believe me? Have a look at Dr. Alasdair Rae’s website or follow him on Twitter.) Did I mention loads of citations? When you ensure that your data is not only properly collected but also properly curated (e.g. assigned relevant metadata, version recorded, archived) and your software is developed to high standards, these become valuable digital objects, which in turn people can cite, acknowledge, and use to advance science.

Some research groups are becoming more aware of this situation and are costing RSEs into their grant applications. Researchers are starting to understand that well-developed, well-maintained, and open software can mean the difference between reliable results and having to retract those results at a later stage. People understand that they need good digital skills in today’s research world. Hence, when we announced a Software Carpentry workshop covering the Unix shell, programming in Python, version control, and databases, the tickets sold out in under an hour (yes, under an hour!!).

Yet there is still much to be done. Many people within the university still don’t know we exist, and many researchers still see software as a second-class tool rather than as an integral element of their research pipeline. We still need to make the case for why an RSE team is needed and why good coding practices should be considered a core skill for any research student or member of staff involved in computational studies. That is why a great part of my job is devoted to dissemination and training.

I have also had the chance to work on other projects I am really passionate about, such as technical and digital inclusion and diversity. In addition, I have had the opportunity to be a member of the RSE conference committee as both talks and diversity co-chair, something I have enjoyed very much. I love contributing to community-building activities, and this has given me a great opportunity to do so. More importantly, it has allowed me to understand the RSE community better and to think of ideas to make this conference a better, more welcoming event for all of us attending.

Now the RSE panorama looks a bit brighter. More and more institutions are aware of the need for an RSE team, or at least for individuals performing RSE tasks. Do I still question my career choices? No, I do not. I get to work with loads of amazing and interesting people. I contribute to large projects in multiple disciplines. I get to work with code and data that are totally new to me (so I am constantly learning). I mentor PhD students who are writing their theses (mainly on how not to give up and how to break the work into manageable chunks). And I get a say on ways to strengthen the community and on the projects I want to pursue. If you ask me, I love what I do and the people I have met along this path. I still suffer from imposter syndrome though (but that will be covered in another post); I always have a lot of work and still feel I am not doing enough, but I am still relatively new at this. I am sure there are still numerous challenges we will have to face as RSEs. And I am sure I will share these here, along with our victories.

But for now, I have kept you here long enough x

