Every program has to allocate memory, for instance: it’s what happens when you create a variable in the program to hold data. In compiled languages such as C, coders must specifically allocate the memory they want, then ‘free’ it up when it is no longer needed; scripting languages usually hide those details. As a result, scripting languages are easier to learn but often less memory-efficient. “When you’re building data structures designed to barely fit into the working memory of a large server, a factor of two takes something from being completely infeasible to being something that you could run if you have dedicated access to the machine,” Patro explains. So, his team develops its algorithms in Rust, a compiled language that gives programmers close control over memory use.
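
To get a feel for the memory trade-off described above, here is a minimal Python sketch. It is an illustration only, not anything from Patro's team: the helper name `list_bytes` and the element count are made up for the example, and the byte counts are approximate and specific to the standard CPython interpreter. It compares an ordinary list of integers, in which every element is a separate object, with the standard library's `array` type, which packs raw 64-bit integers side by side, much as a compiled language would.

```python
import sys
from array import array


def list_bytes(values):
    """Approximate bytes used by a list: the list itself plus each element object.

    Hypothetical helper for illustration; sys.getsizeof reports shallow sizes,
    so this is an estimate rather than an exact measurement.
    """
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


n = 100_000                      # arbitrary element count chosen for the example
numbers = list(range(n))         # one Python int object per element, plus a pointer to each
packed = array("q", range(n))    # signed 64-bit integers stored contiguously

list_size = list_bytes(numbers)
packed_size = sys.getsizeof(packed)

print(f"list of ints : {list_size:,} bytes")
print(f"packed array : {packed_size:,} bytes")
print(f"ratio        : {list_size / packed_size:.1f}x")
```

The exact ratio depends on the interpreter version and platform, but the packed form should be noticeably smaller, which is the kind of difference that matters when a data structure only just fits in a server's memory.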

Titus Brown, a bioinformatician at the University of California, Davis, says newcomers to his group often start with R, because it’s easy to learn and has all the required bioinformatics tools. But many reach a stage at which they want to analyse thousands of genomes rather than a handful, which then produces hundreds of thousands of output files. “At that point, people will often be directed more towards Python, because Python has a broader array of abilities to deal with that scope of data,” says Brown.

Help is at hand

The user community is also a key consideration — Saibene says that’s one of the things she loves most about R. The language’s large, welcoming and engaged user community means that tools are developed and updated frequently; that local user groups exist around the world; and that tutorials and other resources are available in many spoken languages, including her native Spanish. The alternative — having resources available only in English — imposes a “cognitive load” on many users, she says. “You are going to remove a lot of barriers if those resources are in Spanish.”

Another consideration is economics. Although many languages are free to use, some require users to pay for a licence, which could put them out of reach of many researchers. Some scientists also prefer to use open-source software instead of proprietary systems. That said, choosing one language need not lock you out of the others: tool developers often create ‘bindings’ so that tools they have written in one language can be used directly in multiple others, and many programming languages include ways to execute code in others. Artificial-intelligence tools, such as GitHub Copilot, can also make it easier to translate code from one language to another or to generate code in an unfamiliar programming language.
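
As a small illustration of one language executing code written in another, the sketch below uses Python's built-in `ctypes` module to call `strlen`, a function from the standard C library. It is a sketch under stated assumptions rather than a recipe: it assumes a Linux or macOS system on which the C library can be located, and the wrapper names `load_c_library` and `c_strlen` are invented for the example.

```python
import ctypes
import ctypes.util


def load_c_library():
    """Load the system's standard C library (assumes a POSIX system such as Linux or macOS)."""
    path = ctypes.util.find_library("c")  # may be None if the library cannot be located
    # On POSIX systems, CDLL(None) opens the running process, which already includes the C library.
    return ctypes.CDLL(path)


libc = load_c_library()

# Declare the C signature so that ctypes converts arguments and results correctly:
#   size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Hypothetical wrapper: return the byte length of `text` as computed by C's strlen."""
    return libc.strlen(text.encode("utf-8"))


print(c_strlen("genome"))
```

Purpose-built bindings for scientific tools tend to be more elaborate than this, but the principle is the same: the heavy lifting happens in compiled code, and the scripting language provides a convenient front end.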

There’s no shortage of online help, whatever language you choose. Useful resources include The Carpentries, the Data Science Learning Community and Stack Overflow.