Success Stories in Systems Biology by UCD Earth Institute

SUCCESS STORIES IN SYSTEMS BIOLOGY

Project funded by the European Commission under the Seventh Framework Programme for Research and Technological Development

Contents

Welcome

Research

A model that gets to the heart of systems biology Denis Noble, Oxford University

Taking a systems-eye view of cancers in children Walter Kolch, Systems Biology Ireland

Systems biology study points to Turing model of finger formation James Sharpe, Centre for Genomic Regulation

Balancing flavour and texture in tomatoes - with systems biology Stuart Dunbar, Syngenta

SystemsX.ch: Swiss bank on systems biology Daniel Vonder Muehll, SystemsX, Gisou van der Goot, EPF Lausanne, Cris Kuhlemeier, University of Bern

Answering big questions with a small bug Luis Serrano, Centre for Genomic Regulation

Blueprints for life Bas Teusink, Netherlands Platform for Systems Biology

Merrimack: Following a systems path to drug discovery Peter Sorger, Massachusetts Institute of Technology, and Birgit Schoeberl, Merrimack Pharmaceuticals

Beating the conundrum of variability in cardiac simulations Blanca Rodriguez, Oxford University

Stats and modelling - a Belgian diagnostic tool for arthritis Thibault Helleputte, DNAlytics

Virtual Liver Network: a collaborative solution to hepatic diseases Adriano Henney, Virtual Liver Network

Tools and Resources

SBML: A lingua franca for systems biology Michael Hucka, California Institute of Technology

Model Support: JWS Online and BioModels Database Jacky Snoep, Stellenbosch University, and Henning Hermjakob, European Bioinformatics Institute

SYSMic: An interdisciplinary skills course for biologists Gerold Baier and Geraint Thomas, University College London

COPASI: An open source software package easing the path to modelling Pedro Mendes, University of Manchester

SEEK and ye shall find data Katy Wolstencroft, University of Leiden

ISBE - www.isbe.eu

Welcome I am very pleased to introduce this collection of success stories in systems biology, produced as part of the Infrastructure for Systems Biology Europe (ISBE) project and funded through the European Union Seventh Framework Programme. This is just a small sample of the cutting edge

BIOGRAPHY

research being carried out in systems biology in academic, clinical and industrial settings which is making an increasingly significant impact as we seek to tackle grand societal challenges in areas as diverse as health, agriculture and biotechnology. In the past fifteen years, systems biology research has become embedded across a range of biological and biomedical fields. This has been driven by the growing abundance and complexity of large biological data sets, the development of tools and techniques with which to perform comprehensive, genome-wide analyses, and the capacity to share these via high speed connections between disparate research groups and disciplines. Using a systems approach, life scientists are now able, for the first time, to study the complex and dynamic interplay between the components of a system (be it at the level of a cell, tissue, organ, organism or population). This can now go beyond understanding function, to enable intervention in the behaviour of the system in a predictive and rational manner. The stories told in this publication illustrate the breadth of work going on in this dynamic area. Challenges remain, however: how to increase access to modelling, how to develop workable standards that enable the effective reuse of data and models, and how to train new generations of scientists

RICHARD KITNEY IS PROFESSOR OF BIOMEDICAL SYSTEMS ENGINEERING, CHAIRMAN OF THE INSTITUTE OF SYSTEMS AND SYNTHETIC BIOLOGY AND CO-DIRECTOR OF THE EPSRC NATIONAL CENTRE FOR SYNTHETIC BIOLOGY AND INNOVATION AT IMPERIAL COLLEGE LONDON.

in this multidisciplinary approach. A number of the tools and resources featured in this collection have been developed to address individual bottlenecks but an integrated infrastructure for systems biology is needed if we are truly to exploit the potential of systems biology. From 2012-2015, the ISBE consortium of 23 partners from 11 EU member states has been working with the academic, clinical and industrial research communities to develop plans for a systems biology infrastructure that will address these challenges with services that support a range of users, from novices to experts, from SMEs to Big Pharma, from university labs to hospital clinics. Our proposals for the infrastructure’s development over the coming years have been published in a business plan (July 2015) and the roll out of preliminary services is due to commence in late 2015. I would like to thank my colleagues from academia and industry who gave up their time for this collection. They have given a fascinating insight into the iterative research process that is the basis of systems biology, the cycle of experimentation and computational modelling that harnesses the potential of ‘big data’ and turns it into tangible outcomes for society.

Kitney is a leading researcher in the field of synthetic biology and, with Professor Paul Freemont, has been responsible for developing the UK National Centre for Synthetic Biology - which is now recognised as one of the leading international centres in the field. In June 2001, Professor Kitney was awarded the Order of the British Empire (OBE) in the Queen's Birthday Honours List for services to Information Technology in Healthcare.

Yours,
Prof. Richard I Kitney, OBE FREng
Coordinator, Infrastructure for Systems Biology Europe
Imperial College London

BIOGRAPHY PROF. DENIS NOBLE DISCUSSES THE DEVELOPMENT OF HIS WORK IN CARDIAC CELL MODELLING FROM 1960 TO THE PRESENT

DENIS NOBLE IS A BRITISH BIOLOGIST WHO WAS THE FIRST TO MODEL CARDIAC CELLS, DETAILED IN TWO PAPERS IN NATURE IN 1960. HE WAS EDUCATED AT UNIVERSITY COLLEGE LONDON AND MOVED TO OXFORD IN 1963 AS FELLOW AND TUTOR IN PHYSIOLOGY AT BALLIOL COLLEGE. FROM 1984 TO 2004, HE HELD THE BURDON SANDERSON CHAIR OF CARDIOVASCULAR PHYSIOLOGY AT OXFORD UNIVERSITY. He is now Professor Emeritus and co-Director of Computational Physiology. His research focuses on using computer models of biological organs and organ systems to interpret function from the molecular level to the whole organism.

Your heart: without it, you wouldn’t survive very long. So medicine strives to keep it healthy and fix it if something goes wrong. Yet the heart’s central role in our bodies can also make it difficult to test out new clinical approaches in humans. One way to get around this is to build a mathematical model that predicts how the heart will behave, and today the ‘virtual heart’ approach is helping to make drug discovery and testing safer.

Middle out means that you start at one level - which might be in the middle, in our case it’s the cell. Then you reach down to individual molecules and you reach up to the organ.

The heart model has its origins in 1960, and its growth since then exemplifies the systems biology approach of using modelling and experimental data to enable new insights. It began when Denis Noble and his PhD supervisor Otto Hutter worked with heart tissue at University College London. They were interested in a type of electrical ‘gate’ in heart cells called the potassium channel, and Noble wanted to develop a mathematical model of the heart to explore its actions. He based his work on a 1952 mathematical model that described the characteristics of excitable cells, and to build up the model Noble managed to wrangle some time on the Ferranti Mercury Computer in London. He sat in on maths lectures to get up to speed with the

formulae and spent late nights punching in machine code in his allotted time between 2 and 4am. Soon his work paid off and the heart model began to work. “It didn’t take too long to get to the point where rhythm was coming out of the equations,” recalls Prof. Noble, who is today an Emeritus Professor at Oxford University and President of the International Union of Physiological Sciences. Papers in the prestigious journal Nature followed swiftly, and since then the heart model and experimental data have closely intertwined, building up our knowledge of how this key organ works. In some cases, the model has informed the experiments - Noble recalls how in the early 1960s his model put paid to a method of using double probes to stimulate heart tissue in the lab: the maths clearly showed that the experimental approach was disrupting heart cell function. In other cases, experimental findings enhanced the model. “By about 1967, the existence of calcium channels had been demonstrated, and that was the first point at which it was obvious that the model would have to be expanded,” says Prof. Noble. “That process of expanding and taking more and more into account has gone on ever since.” In the decades since he punched machine code into the Mercury, Prof. Noble has worked with collaborators around the world to build up the heart model and shed new light on how ion channels work. Meanwhile, computer technology grew too, enabling more sophisticated modelling and the development of a virtual organ. The growth of the heart

model exemplifies the ‘middle-out’ approach that Prof. Noble has long supported. “Middle out means that you start at one level - which might be in the middle, in our case it’s the cell,” he explains. “Then you reach down to individual molecules and you reach up to the organ.”

We used computation to show why ranolazine’s combination of actions would be expected to be synergistic and that provided data about the drug as it went through regulatory approval.

The mathematical approach can also offer a safe and ethical support to look for new medications and anticipate side-effects, says Prof. Noble, explaining how in one case the heart model showed in the late 1970s that blocking a newly discovered ion channel would have interesting clinical effects. “We were able to show that a blocker of this mechanism would not stop the heart, that it would slow it,” he says. “So a pharmaceutical company looked for and found a compound that did that, and it is now out there as an approved drug ivabradine.” In another case the heart model helped to explain the dual-action effects of a compound called ranolazine, explains Prof. Noble. “We used computation to show why its combination of actions would be expected to be synergistic,” he says. “And that provided data about the drug as it went through regulatory approval.”

He now sees potential for the virtual heart to continue informing drug discovery and regulation, thereby reducing risks in drug development. “Many side effects of drugs hit the heart and cause arrhythmia, that has in the past been the cause of withdrawal of drugs,” he says. “And many of the companies have got out of this kind of work, it’s too risky so we are looking to see if you can use the model to filter at an early stage synergistic actions of potential drugs. Getting it right in the early stages [of drug discovery and development] is a good idea, and this is where the model can help.”

Words: Claire O’Connell January 2014

CURRENT DEVELOPMENTS IN CARDIAC CELL MODELLING

Premature heartbeats explained as a change in the stability properties of the dynamical system of the heart cell during the course of an action potential (Tran et al., 2009).

Tissue electromechanics models show how infarctions can cause arrhythmia.

Multiscale models of electrophysiology illustrate how cellular action potentials give rise to the electrocardiogram (ECG) measured macroscopically on the surface of the body (Sundnes et al., 2006).

Multiscale, multiphysics models can account for the effects of genetic mutations at levels from ion-channel structure, function, and macroscopic current; cell, tissue and organ function

Further Information Denis Noble’s website www.musicoflife.co.uk

BIOGRAPHY WALTER KOLCH ON A EUROPEAN PROJECT THAT IS TACKLING PAEDIATRIC TUMOURS USING SYSTEMS BIOLOGY

ASSET IS A 14 PARTNER COLLABORATIVE EUROPEAN FP7-HEALTH-2010 PROJECT IN THE THEME ‘TACKLING HUMAN DISEASES THROUGH SYSTEMS BIOLOGY APPROACHES’. USING A COMBINATION OF STATE-OF-THE-ART GENOMICS, PROTEOMICS AND MATHEMATICAL MODELLING, ASSET’S MAJOR GOAL IS TO IDENTIFY MECHANISTICALLY UNDERSTOOD NETWORK VULNERABILITIES THAT CAN BE EXPLOITED FOR NEW APPROACHES TO THE DIAGNOSIS AND TREATMENT OF HIGHLY AGGRESSIVE AND DEVASTATING PAEDIATRIC TUMOURS. Prof. Walter Kolch is director of Systems Biology Ireland and the Conway Institute of Biomolecular and Biomedical Research in University College Dublin. In the last ten years, Prof. Kolch has built an international reputation across three areas: MAPK signalling, proteomics, and cancer research, especially in regard to using systems biology approaches.

Cancer is never welcome, but when it arises in children it seems all the more unfair. The EU-funded ASSET (Analysing and Striking the Sensitivities of Embryonal Tumours) project is taking a systems biology approach to figuring out the most useful treatments for a range of paediatric tumours (neuroblastoma, medulloblastoma and Ewing’s sarcoma), and they are discovering new ways of investigating the diseases. A major focus within the consortium is neuroblastoma, explains ASSET co-ordinator Professor Walter Kolch, who is Director of Systems Biology Ireland (SBI) at University College Dublin. Contrary to what the tumour’s name might suggest, “neuroblastoma does not arise in the brain but in the belly,” he explains. “It is a tumour of the peripheral sympathetic nervous system.” Accounting for about 15% of all childhood cancers, neuroblastoma strikes during infancy or toddler years and presents in a variety of forms - some tumours are aggressive and potentially rapidly lethal, while other tumours will eventually disappear even if they have spread throughout the body (metastasised) and even if there is no treatment. “It can be extremely aggressive and kill the child in a few months, or else it is the only tumour known which spontaneously regresses even though it has metastasised,” explains Professor Kolch. “So in this case a metastatic tumour can go away without any treatment.”

Aggressive treatment for cancer is not only harsh for the child, but it could potentially have longer term effects on their health. So if clinicians can tell from the outset whether a neuroblastoma is aggressive or likely to resolve on its own, the therapy could be targeted at the children who would benefit from it, while those whose tumours don’t need treatment would be spared the need to go through it. But how can you tell? Taking a systems biology approach, the 14-partner ASSET consortium has found that molecular signals in the tumour cells can yield important information. Their work centres on a biochemical signalling pathway called ‘JNK’, which is activated to differing levels in the tumour cells.

JNK can make the neuroblastoma cells die or live depending on the exact activation kinetics

“JNK can make the neuroblastoma cells die or live depending on the exact activation kinetics,” explains Professor Kolch. “If the pathway is activated very slowly then the cells survive and grow, but if it is activated steeply like a switch, the tumour cells die - and that is a good thing.” Using a mix of experimental measurements and computational modelling, the ASSET scientists have been able to assess the aggressiveness of the tumour cells. “By

developing a computational model of the activation network we could see which of the nodes control it,” says Professor Kolch. “Then in individual patients we can measure these three or four protein concentrations and we can adjust this model for each patient individually. Using this approach we have shown we can assess each patient’s tumour for likely aggressiveness.”

We have found better ways to stratify patients across the different types of cancer

The project is also using computational modelling to look for new treatments, he adds, by simulating the effects of drug combinations on the key biochemical pathways.

“At the moment neuroblastoma is treated mainly with drugs that damage DNA and thereby cause JNK to be activated,” explains Professor Kolch. “But we are looking at how using combinations of existing drugs could activate JNK more specifically, without causing the problematic DNA damage. We have made very good progress in identifying potential combinations, and some of these agents are already in clinical use so there is the hope that any new combinations we find could be translated into the clinic fairly quickly.”

The ASSET consortium is making progress too with a systems biology approach to the paediatric brain tumour medulloblastoma and to Ewing’s sarcoma, notes Professor Kolch. “We have found better ways to stratify patients across the different types of cancer,” he says. And now the obvious route for translating findings is through the clinical research groups in France, Germany and Switzerland who are involved in the project. “These clinical groups are doing clinical studies all the time and can conduct trials to see how to most effectively implement the discoveries that come through our systems biology approach,” notes Professor Kolch. “This is the best and most direct way to bring what we are finding into the clinic.”

Words: Claire O'Connell June 2015

Further Information ASSET www.ucd.ie/sbi/asset

How do tumours form?

From an academic standpoint childhood tumours offer a relatively ‘pure’ model to study how cancers arise, because the person has had less time to develop non-cancerous DNA mutations. By analysing molecular characteristics of a childhood cancer called neuroblastoma, Prof. Walter Kolch and colleagues in the ASSET project have found that there are many ways for such tumours to arise at the level of DNA, but that the effects on cell signalling are less varied. “It tells you if you look at the effects of these mutations they are rather similar so you need to target the effects rather than the cause,” he says.

Genetically modified zebrafish were used in the project to evaluate potential neuroblastoma drugs

Systems Biology Ireland www.ucd.ie/sbi

Researchers from University College Dublin analysing results

BIOGRAPHY YOU SEE THEM EVERY DAY, BUT HAVE YOU EVER STOPPED TO WONDER HOW YOUR FINGERS ‘KNEW’ TO BECOME FINGERS WHEN YOU WERE A DEVELOPING EMBRYO? It’s a puzzle that has had scientists scratching their heads for decades, but a recent study led by Dr James Sharpe at the Centre for Genomic Regulation in Barcelona has pointed out how a molecular system controls the process.

JAMES SHARPE IS AN ICREA (INSTITUCIÓ CATALANA DE RECERCA I ESTUDIS AVANÇATS) RESEARCH PROFESSOR AND ALSO THE LEADER OF THE SYSTEMS ANALYSIS OF DEVELOPMENT GROUP AT THE CENTRE FOR GENOMIC REGULATION (CRG), BARCELONA. The goal of his research group is to focus on the development of the vertebrate limb, both at the level of gene regulatory networks, and of the physical interactions between cells and tissues. To achieve this, the group includes embryologists, computer scientists, imaging specialists and engineers.

Dr Sharpe’s lab is interested in how cells and tissues interact to build organs and body parts, and the limb has a strong track record in research on how it organises itself. “The reality is that for any organ we probably know just the tip of the iceberg, but for the limb we have more of that tip of the iceberg than for the others,” he says. To dig deeper into the networks of cellular and molecular interactions that underpin limb development, Dr Sharpe uses a systems approach, linking computer models with experimental results. “The developing limb involves hundreds of thousands of cells communicating with each other, moving around and making decisions to become muscle cells or bone cells,” he says. “So we examine that using a back-and-forth, iterative approach, where we use computer modelling alongside experimental and molecular science.” One of the lab’s most recent successes, published in Science in 2014, has shed light on fingers. During embryonic development in animals with backbones and arms, fingers form when a flat plate of tissue grows out at the end of the miniature limb, and cells then die away in a pattern to form spaces and chisel out the digits.

“At that very early stage, when the hand is forming, the individual cells within that plate of tissue have not yet decided whether they are going to make a finger or an interdigit,” explains Dr Sharpe, who co-ordinates the EMBL-CRG Systems Biology Unit. His lab examined the process underlying that cellular decision using a mouse model where cells have been transgenically engineered to produce the fluorescently glowing protein GFP if they are choosing the digit fate (where the Sox9 gene is expressed) rather than the interdigital fate.

We examine the developing limb using a back-and-forth, iterative approach, where we use computer modelling alongside experimental and molecular science.

This approach allowed them to sort the cells into two groups during the six-to-ten hour window when the cells are making this critical cell-fate choice: those about to make fingers, and those about to become interdigital.

By examining the networks of genes that were active in these distinct populations of cells, the researchers identified two key signalling pathways - BMP and WNT – that were most strongly involved in the decision about ‘fingerness’ or ‘gapness’. “This analysis of gene expression was complemented by examination of the proposed pathways at the protein activity level, which also supported the conclusions,” says Dr Sharpe. Once enough molecular data had been gathered, the Barcelona lab began construction of a computer model to explore a system of development proposed by British computing pioneer Alan Turing, where chemicals react with each other and diffuse over space to create particular types of stripy or spotty patterns.

The most exciting thing was that we got the same result in the computer simulation and in the real experiments.

“The computer model was essential, because Turing systems are very non-intuitive,” says Dr Sharpe. “But our initial step of screening for the molecular components was also key: if the model had been abstract – not based on data about real molecules involved – it would be unable to make predictions that we could experimentally test in the lab.” The researchers turned the BMP and WNT signalling up and down – both in the computer model and in in-vitro experiments – and watched what happened. When they switched off the BMP pathway all the cells became gaps. If they repressed the WNT pathway instead, all of the cells became fingers. And repressing both the BMP and WNT pathways at the same time but to different degrees rearranged the pattern into fewer, fatter fingers.

“The most exciting thing was that we got the same result in the computer simulation and in the real experiments. This interplay between the modelling and experimentation is at the heart of the systems biology approach, and it is the strongest proof we have that these molecules are part of a Turing system,” says Dr Sharpe. “For decades this idea was actively resisted, but our results provide good evidence for it, and we think this Turing mechanism possibly goes back all the way into fish, even though the number of digit structures they have is not the same.”

Figuring out fingers is just one aspect of understanding limb development, and Dr Sharpe’s lab is also using systems biology to examine how a limb as a whole organises itself to form a humerus, ulna and radius, wrist bones and - finally - fingers. Understanding such aspects of limb development should also help to inform the wider field of regenerative biology and tissue engineering. “To be able to heal and maybe one day to even build multi-cellular tissues, we ought really to understand how multicellular tissues build themselves in the first place, and we still have a lot to learn about that,” says Dr Sharpe. “Our view is that a systems biology approach will ultimately be the only way to explain, understand and then engineer living multicellular tissues, either tissues in a dish that can then be put back into patients, or by stimulating patients’ own tissues to heal and regenerate.”

Patterns in biology – the Turing connection

In August 1952, British computing pioneer Alan Turing published a seminal paper entitled The Chemical Basis of Morphogenesis. It outlined how just two distinct molecules (‘morphogens’) could underpin the spontaneous development of spotted and stripy patterns by diffusing and interacting in specific ways to form repetitive motifs. Turing died two years after the paper was published in Philosophical Transactions B, but the reaction-diffusion model it described has since been proposed to underpin numerous repetitive patterns in nature, including the stripes on a zebra, and now the more subtle patterns of digit formation.
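For readers who would like to see the reaction-diffusion idea in action, the short Python sketch below checks whether a two-molecule system can form a pattern spontaneously. It is only an illustration: the interaction strengths in `jacobian` and the diffusion rates are made-up numbers chosen for this example, not values from Dr Sharpe's study or from Turing's paper. The test it applies is the standard textbook one: the mixture must be stable when well stirred (no diffusion), yet small ripples of some wavelength must grow once the two molecules diffuse at different speeds.

```python
import numpy as np


def growth_rate(jacobian, diffusion, k):
    """Largest real part of the eigenvalues of J - k^2 * D for wavenumber k."""
    matrix = jacobian - (k ** 2) * np.diag(diffusion)
    return float(np.max(np.linalg.eigvals(matrix).real))


def is_turing_unstable(jacobian, diffusion, wavenumbers):
    """True if the system is stable without diffusion but unstable with it."""
    stable_when_mixed = growth_rate(jacobian, diffusion, 0.0) < 0
    fastest_ripple = max(growth_rate(jacobian, diffusion, k) for k in wavenumbers)
    return bool(stable_when_mixed and fastest_ripple > 0)


# Hypothetical activator-inhibitor interactions, linearised around a steady state.
# Row 0: how the activator responds to (activator, inhibitor).
# Row 1: how the inhibitor responds to (activator, inhibitor).
jacobian = np.array([[1.0, -1.0],
                     [2.0, -1.5]])

wavenumbers = np.linspace(0.0, 3.0, 301)

# Inhibitor diffuses ten times faster than the activator: patterns can emerge.
print(is_turing_unstable(jacobian, [1.0, 10.0], wavenumbers))

# Both molecules diffuse at the same speed: the mixture stays uniform.
print(is_turing_unstable(jacobian, [1.0, 1.0], wavenumbers))
```

With these illustrative numbers, a pattern is possible only when the inhibiting molecule spreads much faster than the activating one, which is the counter-intuitive core of Turing's proposal.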

Words: Claire O’Connell May 2015

Further Information Centre for Genomic Regulation crg.es

BIOGRAPHY STUART JOHN DUNBAR ON A COLLABORATION BETWEEN SYNGENTA AND IMPERIAL COLLEGE LONDON THAT IS PROVING FRUITFUL When you choose tomatoes at the market, what do you look for? Firm, crisp texture? A ripe product that will tantalise the tastebuds with its strong flavour? Or what if you could have both firmness and that ripe taste?

SYNGENTA IS A WORLD-LEADING AGRI-BUSINESS COMMITTED TO SUSTAINABLE AGRICULTURE THROUGH INNOVATIVE RESEARCH AND TECHNOLOGY. THE COMPANY EMPLOYS 28,000 PEOPLE IN OVER 90 COUNTRIES WITH A PURPOSE OF BRINGING PLANT POTENTIAL TO LIFE. Prof. Stuart John Dunbar is Head of Bioscience at Syngenta and project leader of the University Innovation Centre on Systems Biology at Imperial College London. Stuart is also an Adjunct Professor of Cellular and Molecular Sciences at Imperial.

A project between Syngenta and Imperial College London is using systems biology to examine the interplay of factors that determine flavour during ripening of this major fruit crop, and its findings are informing breeding programmes to develop new varieties with a satisfying ripe flavour while the product is still crisp. Food producers and retailers want to sell as much of their fare as possible, but consumers have their preferences - and for many that means turning up their noses at more ‘squidgy’ tomatoes, explains Dr Stuart John Dunbar, Head of Bioscience at the global agribusiness company Syngenta.

It would have been impossible to analyse the datasets using conventional biochemical approaches, so we took a systems approach where we tried to integrate complexity and find the answers.

“If you buy a tomato from a supermarket, it tastes more tomatoey and nicer the riper it is and therefore the squidgier it is, but squidgy tomatoes are not good for supermarkets,” he says. “In northern Europe and in the United States in particular, shoppers want tasty tomatoes but they want a nice, crisp texture, they won’t tend to buy soft tomatoes.” But how do you match those requirements? A project at the Syngenta Innovation Centre on Systems Biology at Imperial College London (the University Innovation Centre, or UIC) has been building a predictive model of tomato ripening and fruit quality to get a better understanding of the biochemical and genetic ‘control points’ of key features such as texture and flavour. “We are interested in detecting molecular markers in the tomato ripening pathways that grow flavour, and ultimately could we bring the flavour components of ripening earlier into the ripening process,” explains Stuart, who is an adjunct professor of cellular and molecular biology at Imperial.

Stuart at a public outreach event at Imperial College London, 2013

The project focused on four isogenic lines from the Ailsa Craig variety of tomato where the lines have been bred to inhibit components of the ripening process. “We have genomic and biochemical information on these tomatoes, and the tomatoes fit a range of components on the ripening and flavour pathways - they are representative of different types of outcomes,” explains Stuart, citing the example of one, Never Ripe. “It does what it says on the tin - it never gets ripe.”

The project has successfully modelled and integrated data from the tomato lines about gene expression and metabolism

The work, led by Dr Charlie Baxter from Syngenta, used systems biology to model the biochemistry and the metabolic profile of these four isogenic lines, and the enormity of that task meant a systems biology approach was needed, explains Stuart. “It would have been impossible to analyse the datasets using conventional biochemical approaches, so we took a systems approach where we tried to integrate complexity and find the answers.”

“It has told us how you can cross data gaps and how big those data gaps can be, and it told us about the nature of the data that you need - the data gaps that we can reliably bridge are quite small, which was an interesting outcome, but the positive thing was that you need less data than we thought,” says Stuart. Molecular markers identified in the UIC project are now being brought forward and used in a breeding programme, but Stuart cautions there are no guarantees yet that retailers and consumers will ultimately get the holy grail of a firm and ripe-tasting tomato. “The project furthered our understanding of the complex pathways involved in fruit ripening and has allowed us to focus on specific aspects for further analysis and marker development,” explains Stuart. “It takes a long time to breed a new variety so we will see the fruits of our labour only after 6 years or so!”

Words: Claire O’Connell May 2015

How does crop growth affect biodiversity and food webs? Research at the Syngenta Innovation Centre on Systems Biology at Imperial College London is using a systems biology approach to model the potential impact of crops on biodiversity, and so far the project has been extremely successful, according to Dr Stuart John Dunbar, Head of Bioscience at Syngenta. “It has showed that we can predict food webs and the impacts of different types of cropping systems on biodiversity,” he says. “It is helping us with our whole biodiversity agenda and understanding the impacts on ecosystem services.”

The project used machine learning (where software carries out actions that have not been specifically pre-programmed) thanks to Professor Stephen Muggleton’s team at Imperial, and this offered an unbiased approach to tackling the questions, according to Stuart: “It is a way of addressing problems without having to predefine what the problem was, so it is an unbiased way of addressing the complexity.” The project successfully modelled and integrated data from the tomato lines about gene expression and metabolism, and several learnings emerged, including the value of systems biology and machine learning in this context.

A systems approach to ecosystems

Further Information
Syngenta www.syngenta.com
Syngenta Centre on Systems Biology at Imperial College London www3.imperial.ac.uk/syngenta-uic

BIOGRAPHY Daniel Vonder Mühll is Managing Director of SystemsX.ch and leads the Management Office responsible for the executive work of the systems biology initiative. Previously he was Head of Research Management at University of Basel and Head of Permafrost Monitoring Switzerland.

F. Gisou van der Goot is a professor at the Ecole Polytechnique Fédérale de Lausanne, Switzerland. From the study of the interaction of bacterial pathogens with target cells to her work on protein palmitoylation, van der Goot is interested in how the organisation of cellular membranes allows precise and efficient communication.

Cris Kuhlemeier is head of the Institute of Plant Sciences, University of Bern. The Institute carries out research in plant development, molecular plant physiology, plant nutrition, plant ecology, vegetation ecology and palaeoecology.

NATIONAL INITIATIVE IS PUTTING SWISS SYSTEMS BIOLOGY ON THE MAP

Switzerland is an independent state used to striking out on its own path. In this vein, the Swiss launched an initiative in 2008 to fund and promote systems biology and to fire Switzerland into the vanguard of an emerging field. “There's been a paradigm shift from reductionist molecular biology to a holistic systems biology approach, and Switzerland decided to support this new era by providing funding,” explains Daniel Vonder Mühll, Managing Director of the initiative, SystemsX.ch. Encouragement also came from big pharma firms like Roche and Novartis.

The initiative now brings together 15 equal institutional partners from academia and industry, which provide matching funds to the initiative, overseen by the Swiss National Science Foundation. Over 1,000 scientists are currently involved in SystemsX.ch projects. Equally remarkable is the multidisciplinary nature of the projects forged and the advances achieved. The aim of systems biology and, hence, of SystemsX.ch is to understand the processes and dynamics of biological systems, which relies on extensive quantitative data and modelling.

One of the earliest consortia, and at the same time the largest project within SystemsX.ch, is LipidX, which set out in 2008 to explore the world of lipids; though major building blocks of cells, lipids have been neglected in the ‘-omics’ era. They are difficult to study and do not conform well to a reductionist beat. “You can purify a protein. But lipids are much smaller molecules and they generally assemble into higher order structures such as membranes, rather than staying as individual entities. Therefore it makes little sense to study them in isolation,” says Gisou van der Goot of EPF Lausanne, lead investigator for the project, now in its second phase.

There's been a paradigm shift from a reductionist molecular biology to a holistic systems biology approach, and Switzerland decided to support this new era by providing funding

The consortium has just published the first database that provides knowledge on lipids. It also took initial steps towards getting to grips with how the nutrient state of cells and the environment affects lipid composition, and with the influence of lipids on cells. “Before this project none of us were actually doing lipidomics and thinking in a systems way about lipids. SystemsX made this possible and changed the way we tackle the problem,” van der Goot explains.

The entire initiative sped up the adoption of technologies and quantitative approaches in experimental labs. Questions in systems biology require a symphony of analyses in the lab and computer modelling, which itself demands input from biology, maths, physics, chemistry, computer science, informatics, engineering and medicine.

It is not plain sailing. “It’s a real challenge. You can talk but it doesn’t mean that other people listen,” observes van der Goot, speaking of the challenges of interdisciplinary communication. “It took time, but was worth the effort.” Plant scientist Cris Kuhlemeier at the University of Bern says the first four years saw the consortium he leads “mostly trying to get to understand each other.” The SystemsX.ch project PlantGrowth2 is now “really an integrated group of people,” he says proudly.

Crops in recent years have been scrutinised through the lens of genetics, but little is known about the mechanics of growth. “DNA is a linear code, but how do you go from that to a 3D shape? Maybe 20 years ago people were totally focused on chemical signalling, but now we are trying to explain morphogenesis also in terms of physics,” says Kuhlemeier, who has studied a mysterious, truly quantitative problem for two decades: why leaves spiral predictably around stems according to the Fibonacci sequence, with each new leaf displaced from the previous one by about 137.5 degrees, the so-called golden angle. “This is a quantitative problem. You cannot solve it only with genetics. There are no transcription factors that specify angles. We had to resort to modelling. I joined forces with a computational scientist and we produced the first mathematical model of a development problem in plant biology,” Kuhlemeier recalls.

But the PlantGrowth2 project is not solely concerned with theory; it also aims to tackle a real-world problem: hunger. The consortium investigates teff, the staple crop in the Horn of Africa, in an effort to improve yield and stop it falling over (lodging) in wind and rain, which causes losses of 35-50%. They have used quantitative methods and modelling to pinpoint weak points and develop better varieties.
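The divergence angle of roughly 137 degrees quoted above is the golden angle, which follows from the same golden ratio that the Fibonacci sequence converges to. The short sketch below only illustrates that arithmetic; it is not the developmental model Kuhlemeier describes, and the function names are hypothetical.

```python
import math

# Golden ratio: the limit of ratios of consecutive Fibonacci numbers.
PHI = (1 + math.sqrt(5)) / 2

# Golden angle: the smaller arc when a full turn is divided in the ratio PHI : 1.
GOLDEN_ANGLE_DEG = 360.0 * (1 - 1 / PHI)


def fibonacci_ratios(n):
    """Return the first n ratios of consecutive Fibonacci numbers (1/1, 2/1, 3/2, ...)."""
    a, b = 1, 1
    ratios = []
    for _ in range(n):
        ratios.append(b / a)
        a, b = b, a + b
    return ratios


def leaf_angles(count, divergence=GOLDEN_ANGLE_DEG):
    """Angular position in degrees (0 to 360) of each successive leaf around the stem."""
    return [(i * divergence) % 360.0 for i in range(count)]


if __name__ == "__main__":
    print(round(GOLDEN_ANGLE_DEG, 2))
    print([round(angle, 1) for angle in leaf_angles(5)])
```

Because the golden angle is an irrational fraction of a full turn, successive leaves in this idealised picture never line up exactly above one another.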

SystemsX changed the way we tackled the problem

Micro-machinery was devised to measure cell wall strength at different places. Kuhlemeier’s biology postdoc Sarah Robinson has begun taking on hard core physics courses, while a computational scientist colleague has impressed him with how much biology he picked up. A highpoint has been the creation of new imaging technology called MorphoGraphX, published in May 2015 in eLife, which makes it possible to segment cells and follow their deformation over time in 3D. Uses include shape extraction, growth analysis, signal quantification and protein localisation.

Vonder Mühll recalls the first years of the initiative, when projects were strong on data collection, but not accomplished in systems biology. “Scientists are human and most stay in their comfort zone if possible, so you need clear incentives,” he says. In the second phase, which began in 2013, SystemsX.ch placed more emphasis on theoretical work, simulations and modelling. The next big event is the ‘All SystemsX.ch Day’, to be held in Berne on September 15th 2015, a big meet-and-greet networking opportunity accompanied by talks and panel discussions. The Swiss initiative ends in 2018, with no follow-up: the Swiss are confident that they have opened a path that researchers will willingly follow, assisted by multidisciplinary alliances and greater comfort in striking out on a systems biology route. That’s one expected return on investment.

Words: Anthony King May 2015

SystemsX.ch FACT FILE

SystemsX.ch is the largest ever public research initiative in Switzerland. It focuses specifically on a broad topical area of basic research, systems biology. Initially, over 120 million Swiss francs were committed to the initiative, with its first stage running from 2008 to 2012. A further 100 million francs was committed for consolidation over the period 2013 to 2016. A slight readjustment in the second stage saw promotion of more theory, simulation and modelling projects. In addition, a special round invited applications for Medical Research and Development projects with medical and clinical components. Projects covering topics such as prions, HIV, metastasis, melanoma and inflammatory bowel disease won funding. Today, SystemsX.ch supports around 220 projects, more than 1,000 scientists and almost 400 research groups.


Further Information
SystemsX: www.systemsx.ch
LipidX: www.systemsx.ch/projects/research-technology-and-development-projects/lipidx/
PlantGrowth2: wiki.systemsx.ch/display/PGRTDproj/Plant+Growth+Home

LUIS SERRANO FROM THE CENTRE FOR GENOMIC REGULATION IN BARCELONA EXPLAINS HOW HIS RESEARCH ON A SMALL BACTERIUM CAN HELP US TO UNDERSTAND OTHER LIVING SYSTEMS AND HOW IT CAUSES DISEASE

BIOGRAPHY
THE CENTRE FOR GENOMIC REGULATION IN BARCELONA IS AN INTERNATIONAL BIOMEDICAL RESEARCH INSTITUTE OF EXCELLENCE WHOSE MISSION IS TO DISCOVER AND ADVANCE KNOWLEDGE FOR THE BENEFIT OF SOCIETY, PUBLIC HEALTH AND ECONOMIC PROSPERITY. Dr Luis Serrano is Director of CRG and leads the Design of Biological Systems research group. The group works toward a quantitative understanding of biological systems to an extent that one is able to predict systemic features, with the hope to rationally design and modify their behaviour.

Given enough technology and know-how, could we completely understand how an entire living system works? It’s an ambitious suggestion, but Dr Luis Serrano and colleagues at the Centre for Genomic Regulation in Barcelona are in the process of finding out.

Living organisms vary hugely in size and complexity, so the researchers in Barcelona have chosen their focus wisely: a small bacterium called Mycoplasma pneumoniae, a single cell organism that has a relatively simple metabolism.

If you have enough results and enough money and enough know-how, would you fully understand a living system?

The microbe is of clinical relevance because it can cause atypical pneumonia in humans, explains Dr Serrano, but the main reason for selecting it as a model organism is its manageable size. “It is one of the smallest bacteria you can grow in the lab,” he says. “And the whole idea of the project has been to ask if you have enough results and enough money and enough know-how would you fully understand a living system.”

To find out more about M. pneumoniae, Dr Serrano’s group and collaborators at the European Molecular Biology Laboratory have been analysing the main biochemical components of the bacterial cell, how they respond under different conditions and how the components fit together to form a functioning system.

Much like a car, a cell has various components that need to work both in their own right and together for the system - or car - to work. In the car, an engine, gears and wheels function individually and together to make the car go. In a cell, molecular systems involving DNA, RNA, proteins and sugars work in synchrony to run the living system, and Dr Serrano has been looking at these systems.

“We acquire the relevant data from the cell: we are looking at its metabolism, its transcriptome [RNA], the proteome [proteins in the cell],” he says. “But we are not looking at every protein individually, we try to get the whole picture: so we are not looking at every screw in the car, we are looking at the main components.”

By perturbing the cells and seeing how each system reacts, Dr Serrano and co-workers have been bringing into focus a larger picture of how the system as a whole responds to changes in its environment. “We explore how it responds to factors like exposure to drugs, changes in temperature or changes in nutrients,” he says. “Our approach is a little like if you wanted to analyse the nervous system of the human you could apply something very hot and if the person jumps then the nervous system has responded.”


One of the biggest findings to emerge is that M. pneumoniae has sophisticated mechanisms for controlling how its genes are expressed. The microbe has two systems of ‘methylation’ - a form of tagging on DNA that can determine whether genes are switched on or silenced. “We know the bacterium has very strong methylation and there are two systems, one is general, and there is another more specific system that we don’t know what it is doing,” says Dr Serrano.

Having such insights … may also offer routes to engineer the microbe as a drug-delivery platform to bring medications to specific sites in the human body.

In addition, the small bacterium contains a relatively large amount of ‘non-coding RNA’, an observation that has now also been made in more complex bacteria as well as in eukaryotic cells, which are the types of cells that make up plants and animals.

“It looks now like bacteria have as large a proportion of non-coding RNA as eukaryotes,” says Dr Serrano. “I think this is something that is characteristic of all branches of life.” The team also saw that the cell can read stretches of its DNA either classically or in a ‘staircase’ pattern. Once more this was a surprise to see in the tiny cell, notes Dr Serrano, and the phenomenon has since been observed in other species of bacteria too.

The big challenge now is to integrate the large volumes of data from the tiny Mycoplasma and to sift out the signal from the noise. “We have been looking at proteomics, transcriptomics, chemogenomics, everything,” says Dr Serrano. “And now we want to put it together in a way that makes sense. So we are trying to integrate everything into a big model but this is not easy. You might find 100 proteins acetylated or 60 proteins phosphorylated: how much of this is noise and how much is biologically significant? Which ones are really doing something and which ones are just passing by?”

In the longer term, having such insights into the bacterium could help to understand how it causes disease, and it may also offer routes to engineer the microbe as a drug-delivery platform to bring medications to specific sites in the human body. But for now Dr Serrano is driven by the ultimate goal of getting that complete picture of a living organism. “When I give talks everyone is excited and amazed by the amount of information and what we are doing,” he says. “And the impact will be that we come out with a model that will explain the whole cell in detail. Then we will say for the first time that we understand the whole thing.”

Words: Claire O’Connell October 2013

COMPLICATIONS ASSOCIATED WITH MYCOPLASMA PNEUMONIAE
Lobar consolidation
Abscess
Bronchiolitis obliterans
Necrotizing pneumonitis
Acute respiratory distress syndrome
Respiratory failure

Design of Biological Systems Group, 2013

Mycoplasma pneumoniae


Further Information
Centre for Genomic Regulation: crg.es

BAS TEUSINK FROM THE NETHERLANDS PLATFORM FOR SYSTEMS BIOLOGY DISCUSSES THE REMARKABLE DEVELOPMENTS IN HIS RESEARCH MADE POSSIBLE BY THE APPLICATION OF SYSTEMS BIOLOGY APPROACHES

BIOGRAPHY
THE NETHERLANDS PLATFORM FOR SYSTEMS BIOLOGY FOSTERS SYSTEMS BIOLOGY APPROACHES IN THE RED, GREEN, WHITE AND BLUE SECTORS OF THE LIFE SCIENCES, CREATING SYNERGIES BETWEEN SYSTEMS BIOLOGY RESEARCH INSTITUTES/GROUPS AND OTHER STAKEHOLDERS IN SYSTEMS AND SYNTHETIC BIOLOGY, BIOTECHNOLOGY AND MEDICINE. Prof. Dr. Bas Teusink developed the Kluyver Centre Systems Biology programme; he is Full Professor in Systems Bioinformatics at IBIVU, VU Amsterdam.

We must understand component parts to get to grips with a complicated machine. Once you build such a machine yourself, you can tweak and adapt it. Industry understands how to do this, but has not done so well in deconstructing the live machinery critical for the fermentations at the core of so many food and pharmaceutical processes: the microbial cells.

Bas Teusink at the Netherlands Platform for Systems Biology (SB@NL) is mapping out the design of microbial cellular networks by asking two straightforward but big questions: what makes the cell’s biochemical network tick and why did evolution choose that design? His group’s modelling of cells’ metabolic blueprints on a genome scale is yielding some dramatic successes relevant to industrial fermentation processes.

His group worked in conjunction with the Kluyver Centre for Genomics of Industrial Fermentation, now part of the BE-Basic Foundation, an international public-private partnership that develops industrial bio-based solutions. This collaboration is putting pep into the R&D of industries that rely on innovation in industrial fermentation, optimising what is a critical step in many food, beverage and pharma processes. The aim is to boost performance and robustness of industrial microbes by revealing how the genome and environment interact.

Recently Teusink’s group doubled output of a certain toxin, a vaccine component essential for a highly contagious but preventable disease that kills thousands each year. “We could deconstruct the metabolic network of the organism based on its genome,” Teusink explains. “In this case it was grown in a traditional production process where a historically defined medium was used.” A big pharma company is involved, but cannot be named. Improving the medium would have meant trial and error, but Teusink’s team instead modelled around 1,500 reactions underpinning the cell. They realized an ingredient in the growth medium impeded production. Teasing out the metabolic networks also showed them that the cells would be able to use alternative substrates to the ones that inhibited production. They did the heavy lifting in silico, alongside experimental tests, successfully predicting an improved formula.
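The kind of in silico medium question described here can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the five-reaction network, the ingredient names and the prices are invented, and the growth check is a simple reachability test rather than whatever method Teusink's group actually used on its genome-scale model (the source does not say).

```python
from itertools import combinations

# Hypothetical toy network: each reaction maps the metabolites it requires
# to the metabolites it produces.
REACTIONS = [
    ({"glucose"}, {"pyruvate", "atp"}),
    ({"lactose"}, {"glucose"}),
    ({"pyruvate", "ammonium"}, {"amino_acids"}),
    ({"peptone"}, {"amino_acids"}),
    ({"amino_acids", "atp"}, {"biomass"}),
]

# Hypothetical price per unit of each candidate medium ingredient.
PRICES = {"glucose": 2.0, "lactose": 1.0, "ammonium": 0.5, "peptone": 4.0}


def can_grow(medium, target="biomass"):
    """Expand the set of available metabolites until nothing new is produced."""
    available = set(medium)
    changed = True
    while changed:
        changed = False
        for substrates, products in REACTIONS:
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return target in available


def viable_media():
    """Enumerate every ingredient combination that supports growth."""
    ingredients = sorted(PRICES)
    for size in range(1, len(ingredients) + 1):
        for medium in combinations(ingredients, size):
            if can_grow(medium):
                yield medium


def cheapest_medium():
    """Return the growth-supporting medium with the lowest total price."""
    return min(viable_media(), key=lambda m: sum(PRICES[i] for i in m))


if __name__ == "__main__":
    print(cheapest_medium())
```

Brute-force enumeration only works because the toy has four ingredients; a network of around 1,500 reactions would need an optimisation-based formulation instead.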

…it’s only now, because of our model, that we can understand thirty years of research…

“You can design all sorts of hypotheses this way about the media. You can ask what are the minimal inputs I need to support growth or what are the cheapest materials,” Teusink explains. The end result: higher productivity at lower costs. But Teusink’s systems biology approach has also yielded a fundamental breakthrough, solving a three-decade-long mutant mystery, recently published in Science (van Heerden et al., 2014).


Researchers had struggled with a particular mutant in yeast for years, but couldn’t figure out its strange phenotype: it can’t grow on glucose, something yeasts normally prefer above all else. Glucose is degraded in a metabolic pathway called glycolysis (Greek for breaking down sugar). It turns out there are two solutions to the problem of degrading glucose in these cells; with the mutant form you have a 99.9% chance of not growing on glucose, but this means that there is still a tiny subpopulation of the mutant that can thrive on it. “This small subpopulation was 1 in 10,000, but we now realize that there are two states these yeast can be in,” says Teusink. When this “bistability” phenomenon was further investigated, it turned out that 7% of wild-type cells by chance do not grow on glucose either and normally just die off when fed it. Genetically the yeast cells in both groups are the same, but chance gives rise to heterogeneity in the system. “This now explains all these weird phenotypes in mutants that people have generated in this field. So it’s only now, because of our model, that we can understand thirty years of research.”
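Bistability of the kind described above can be explored with a generic one-variable toy system. The sketch below is not the glycolysis model from van Heerden et al.; the equation, the threshold value and the function names are assumptions chosen only to show how genetically identical cells can end up in one of two states depending on where they happen to start.

```python
def simulate(x0, threshold=0.3, dt=0.01, steps=5000):
    """Integrate dx/dt = -x (x - threshold) (x - 1) with explicit Euler steps.

    The system has stable states at x = 0 and x = 1, separated by an
    unstable point at x = threshold.
    """
    x = x0
    for _ in range(steps):
        x += dt * (-x * (x - threshold) * (x - 1.0))
    return x


def final_state(x0, threshold=0.3):
    """Classify the end point as 'growing' (x near 1) or 'stalled' (x near 0)."""
    return "growing" if simulate(x0, threshold) > 0.5 else "stalled"


def growing_fraction(initial_values, threshold=0.3):
    """Fraction of a population of cells that ends up in the growing state."""
    states = [final_state(x0, threshold) for x0 in initial_values]
    return states.count("growing") / len(states)


if __name__ == "__main__":
    # Four identical "cells" that differ only in their starting value.
    print(growing_fraction([0.1, 0.2, 0.4, 0.9]))
```

Raising or lowering the hypothetical `threshold` changes how many cells in a given starting population fall on each side, which is one way to picture a mutation shifting the split between subpopulations.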

So far so basic, except that glycolysis is a central pathway in life and these subpopulations are everywhere and are particularly important during transitions, such as at the start of a fermentation process when microbes meet a large batch of sugar. “In these transitions we often see that only part of the population starts to grow and the other part dies or does nothing. Suppose that you inoculate a million cells in your milk or your fermentation vat, but only half these cells start doing something. This will lead to a delay in your production [a lag phase],” Teusink explains. Once you understand this split in your population, it is possible to add something to the media as a pre-treatment to cut down on this delay.

All sorts of processes could benefit from a greater understanding of why only some cells start to grow. Teusink says his yeast work shows that sometimes the average response seen in a population of cells is no such thing: it is actually the sum of two completely different behaviours caused by bistability. Such noise in life is becoming clearer as technological advances improve single-cell measurements and the theory behind cellular network architecture advances; computers will need to run even faster to keep up with network models, Teusink predicts.

Teusink, from his base in Amsterdam, believes Europe must try harder when it comes to training biology students. Today glycolysis is taught as a pathway that goes from A to B, a static process; students are instructed that certain genes are involved, but what does this really mean? “The way we should actually teach this is to make a model of this pathway and let students play with it to see how it actually behaves. It’s not so trivial, and stability and steady state concepts today are not clear to students,” says Teusink. “Biology is complicated and you need the maths.”

Words: Anthony King March 2014

Biology is complicated and you need the maths

BE-BASIC FACT FILE
Public-private partnership between: 27 industrial partners, 7 research institutes, 13 universities
Since 2011: 447 peer-reviewed papers, 8 patents filed, 8 start-ups


Lactobacillus bacteria

Further Information
Netherlands Platform for Systems Biology (SB@NL): biosb.nl/sbnl
BE-Basic Foundation: www.be-basic.org

US PHARMACEUTICAL COMPANY BASES ITS BUSINESS ON A SYSTEMS APPROACH AND IS REAPING THE REWARDS

BIOGRAPHY
MERRIMACK PHARMACEUTICALS IS A BIOPHARMACEUTICAL COMPANY DISCOVERING, DEVELOPING AND PREPARING TO COMMERCIALIZE INNOVATIVE MEDICINES PAIRED WITH COMPANION DIAGNOSTICS FOR THE TREATMENT OF CANCER. MERRIMACK APPLIES A SYSTEMS BIOLOGY-BASED APPROACH TO BIOMEDICAL RESEARCH, THROUGHOUT THE RESEARCH AND DEVELOPMENT PROCESS.

Peter Sorger PhD is a Professor of Systems Biology at Harvard Medical School and holds a joint appointment in MIT’s Dept. of Biological Engineering and Center for Cancer Research. Sorger was co-founder of the MIT systems biology program CSBi, Merrimack Pharmaceuticals and Glencoe Software.

Birgit Schoeberl is Senior VP of Research with responsibility for discovery and clinical stage projects. She is an internationally recognised leader in Systems Biology. She has been with Merrimack since the very beginning and has been integral to developing the Systems Biology platform.

Merrimack Pharmaceuticals is a NASDAQ-listed biopharma company that has confidently placed its chips on a systems biology approach to cancer drug discovery. The power of a systems approach is to reveal not just individual components of a system, but how each part connects. Peter Sorger, Professor of Systems Biology at Harvard Medical School, helped found the company while at MIT in partnership with serial entrepreneur Anthony Sinskey, Professor of Microbiology, MIT. Cofounders Ulrik Nielsen and Gavin MacBeath are still with the company in senior positions.

The very beginning of Merrimack came partly out of dissatisfaction with the myriad explanations in the literature of the induction of apoptosis by anticancer drugs, Sorger recalls. They decided it was necessary to understand the key physiological pathways involved in drug response and that that would require a mix of computational modelling and dynamic modelling.

About 80% of the cost of drugs today is in yesterday’s failures, so the number one target for systems biology is to change that

Merrimack remains rooted in the principles of grafting quantitative biology, computational models and engineering to understand the signalling pathways that are involved in disease in a holistic way and then using these insights to identify drug targets, engineer novel therapeutics and identify biomarkers. Today, Merrimack has a market capitalization of over $900m, around 270 people on staff and 6 drugs in clinical development.

Sorger believes firmly that systems biology offers a new path to drug discovery that will be far more efficient. “About 80% of the cost of today’s drugs is in yesterday’s failures, so one target for systems biology is to change that: so to reduce the rate at which drugs fail and to incrementally improve the process by linking the science back to critical decision making in a company.”

Merrimack’s core values include drilling into the complex biology behind cancer. So far this has yielded the six molecules in clinical development. In November, the company reported news for Phase 2 studies in the treatment of women with ER/PR+, HER2-negative breast cancer with the inhibitor MM-121. A positive signal was shown in a subpopulation of patients that would potentially benefit from targeted therapy. The findings support ErbB3 signaling as an important pathway of resistance for breast, ovarian and lung cancers.

“MM-121 is a monoclonal antibody against ErbB3 but is not as active in HER2 overexpressing or amplified tumours. Therefore we designed a second molecule, MM-111, which targets ErbB3 in HER2 overexpressing tumours,” says Birgit Schoeberl, Merrimack SVP of Discovery. “Based on our preclinical research, we defined five different biomarkers that would be predictive of ErbB3 activity in tumour samples and designed our clinical trials to test this hypothesis.” She added that, with the retrospective analysis of the five biomarkers in Merrimack's clinical samples, they were able to identify a subgroup of patients with the same response biomarkers across NSCLC, ovarian and metastatic breast cancer who appear to benefit from the treatment with MM-121.

“This is the first time we've gone from an in silico preclinical biomarker hypothesis to the ultimate translation into the clinic,” says Schoeberl, which is a “big moment for Merrimack and I think for systems biology in general. The predictive biomarkers will help identify which patients may benefit from MM-121, which completed six Phase 2 clinical trials in collaboration with Sanofi.” Merrimack regained worldwide rights to develop and commercialise MM-121 in June 2014 from the cancer arm of French pharma giant Sanofi.

Merrimack pairs up an experimentalist and a modeller in a “discovery pod” and looks to understand the biology before setting off to develop certain drug candidates. “Early on we made some proof-of-concept antibodies targeting ErbB3 and showed that the insights derived from the model translated into the inhibition of cell proliferation, before starting an antibody campaign and selecting the lead molecule,” says Schoeberl.

The approach of going under the hood early on to get a good understanding of the biology is essential to Merrimack’s philosophy. It is about understanding how a drug targeting a specific disease gene will really work when it gets into a complex human patient, for example. The approach should allow for a better understanding of which patients will respond to which drugs. It could also kill drugs off earlier, says Schoeberl, reducing resource loss through expensive late failures. “Making a drug should ideally be much more like designing a car or an airplane where it is not a trial and error process. There is a lot of design and modelling and simulation that happens even before a car is built,” she says. “We aspire to design and engineer our drugs based on clearly defined design criteria.”

At the moment, the highest number of failures occurs, and the most money is spent, before Phase I trials (Tollman et al., 2012). “I believe that Systems Biology applied to target identification and preclinical drug development could increase success rates across the industry,” adds Schoeberl. “In the future a novel drug that comes out and costs US$180,000 per year per patient and is unknown if it will work in 50% of the people it is prescribed to, it is not going to be tolerable,” adds Sorger.

The systems biology route should mean a more quantitative and also more predictive approach; big pharma is unlikely to turn, however. It is not structured to do so and has no culture of systems biology. Chances are, agrees Sorger, new systems companies are likely to be spun out of universities and research institutes, as was the case with Merrimack.

MERRIMACK FACT FILE
2000: Founded by scientists from Harvard and MIT
2011: Announces $77M in private financing
2012: Launch on NASDAQ
270+ employees
$900M company value
6 cancer drugs in clinical development
Based on 2014 figures

This is the first time we've gone from an in silico preclinical biomarker hypothesis to the ultimate translation into the clinic

Schoeberl herself is a chemical engineer by training, having started her career initially in the oil industry. Her background exemplifies the cross-disciplinary nature and quantitative underpinnings of a systems approach. Schoeberl then did a PhD in systems biology in her native Germany because she was always fascinated by biotechnology. “The concept of systems understanding and systems dynamics is what you do in chemical engineering. It was a good background and the biology I basically learnt along the way.”

Words: Anthony King January 2014

Further Information
Merrimack Pharmaceuticals: www.merrimackpharma.com

BLANCA RODRIGUEZ ON USING COMPUTER MODELS TO UNDERSTAND THE DIFFERENCES BETWEEN INDIVIDUALS IN THE SUBTLE BUT SOMETIMES CLINICALLY IMPORTANT WAYS THAT THEIR HEARTS BEHAVE

BIOGRAPHY
BLANCA RODRIGUEZ HOLDS A WELLCOME TRUST SENIOR RESEARCH FELLOWSHIP AND IS PROFESSOR OF COMPUTATIONAL MEDICINE IN THE DEPARTMENT OF COMPUTER SCIENCE AT THE UNIVERSITY OF OXFORD. Blanca and her team investigate causes and modulators of variability in the electrophysiological response of human hearts to disease and therapies. Understanding variability is crucial to ultimately determine who, when and how patients may be at risk, and how to improve their diagnosis and treatment. The mechanisms are complex, multiscale and nonlinear, and Blanca’s team exploits the power of computational approaches combined with experimental and clinical research to unravel key mechanisms of cardiac arrhythmias.

What makes your heart miss a beat? How our hearts react to stresses such as disease, exercise and even medicines can vary from person to person, and understanding those differences is key for developing more effective diagnostics and safer therapies.

That’s why Professor Blanca Rodriguez at the University of Oxford is developing sophisticated computer models of how populations of heart cells work, and her group’s findings stand to have a wide impact on refining simulations of and experiments on living organs.

Abstract models of the heart are not new: more than 50 years ago Professor Denis Noble created the first mathematical model based on the behaviour of cardiac cells, and the field has grown since then, explains Professor Rodriguez, who is Professor of Computational Medicine and a Wellcome Trust Senior Research Fellow in Basic Biomedical Sciences at Oxford. “Cardiac modelling is a very mature area of computational medicine,” she says. “We now have multi-scale models of the human heart, so they represent the activity of the heart from the sub-cellular to the whole organ level.”

Individual experiments provide snapshots of particular aspects of the heart, then the model can integrate and reassemble the experimental data to build a more complete picture. This computational model can act as a testbed for simulations, and the responses can direct further experiments, which in turn fine-tune the model. “With the simulations we can identify key factors that determine ischemic risk or an adverse response to a drug for example, then we can test those predictions with additional experiments,” says Professor Rodriguez. “So the computation directs the next round of experiments, and the results of these experiments feed into the computational model.”

Such heart simulations offer the advantage of high resolution data both in space and in time, notes Professor Rodriguez: “That means we can look at any viable property of the tissue that we want to and we can make calculations that are very difficult to do with experiments or clinical methods.”

The computation directs the next round of experiments, and the results of these experiments feed into the computational model.

To improve the simulated heart model, she and her team have brought in an added real-life complication: the differences between individuals in the subtle but sometimes clinically important ways that their hearts behave. “We are looking at how we can use computer models to understand inter-subject variability better,” explains Professor Rodriguez. “The current cardiac models are based on a generic response of a particular cell, but that doesn’t allow us to investigate why certain people react badly to a medicine or why people die of a certain disease and not other people. So we have developed a methodology that allows us to simulate populations of cells rather than a single cell. This means we can consider a wide range of possible cells, or possible responses to the same disease or medicine.”
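The idea of simulating a population of cells and keeping only those consistent with experimental data can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Oxford group's methodology or software: the one-line "cell model", the two conductance scaling factors, the choice of action potential duration as the measured output and the acceptance range are all hypothetical.

```python
import random


def toy_apd(g_k, g_ca, baseline_ms=300.0):
    """Hypothetical stand-in for a cell model: action potential duration in ms.

    A larger potassium conductance factor shortens the value and a larger
    calcium conductance factor lengthens it. Illustrative formula only.
    """
    return baseline_ms * (1.0 + 0.5 * (g_ca - 1.0)) / (1.0 + 0.5 * (g_k - 1.0))


def sample_population(size, low=0.5, high=1.5, seed=0):
    """Draw conductance scaling factors uniformly for each candidate cell."""
    rng = random.Random(seed)
    return [
        {"g_k": rng.uniform(low, high), "g_ca": rng.uniform(low, high)}
        for _ in range(size)
    ]


def calibrate(candidates, apd_min, apd_max):
    """Keep only candidates whose output lies inside the experimental range."""
    return [
        cell
        for cell in candidates
        if apd_min <= toy_apd(cell["g_k"], cell["g_ca"]) <= apd_max
    ]


if __name__ == "__main__":
    candidates = sample_population(200)
    accepted = calibrate(candidates, apd_min=250.0, apd_max=350.0)
    print(len(accepted), "of", len(candidates), "candidate cells accepted")
```

Each accepted parameter set stands for one plausible cell, so a simulated drug or mutation can then be applied to the whole accepted set to see the spread of responses rather than a single generic one.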

We are looking at exploring pain research in neuroscience and developing models for diabetes using a similar approach

Calibrating populations of cells in the model with experimental data means a tighter coupling between the computer model and the ‘live’ results, thus building potentially more realistic simulations with which to test various conditions. Such simulations could ultimately help to reduce the levels of animal testing required to assess the cardiac side-effects of medications, and Dr Oliver Britton - who completed his PhD with Professor Rodriguez and Dr Alfonso Bueno-Orovio in Oxford - recently won the ‘3Rs Prize’ from The National Centre for the Replacement, Refinement and Reduction of Animals in Research (NC3Rs).

“We want this to have the widest impact possible and we have been talking to them to shape the research agenda in a way that can be broadly exploitable.” Another application of the model is to predict risk for patients with genetic susceptibilities for hereditary heart conditions such as Long QT syndrome, which can lead to sudden adult death.

“We are looking at exploring pain research in neuroscience and developing models for diabetes using a similar approach,” she says.

ISBE - www.isbe.eu

We all respond to medicines in a particular way. Knowing in advance how a patient is likely to react to a drug is important for safety, and scientists at Oxford are developing userfriendly software to research potential effects on populations of heart cells.

“These genetic variations can be characterised at the ion channel level, so we can plug that into our populations of models and determine whether it is a low or high risk mutation,” explains Professor Rodriguez. “We are also using clinical data from partners at The Oxford Heart Hospital - they have in vivo recordings of hearts that we are using to construct the populations and investigate the potential for simulation there too.” The Oxford researchers are also scaling up their populations-based model from cells to the whole heart to examine the impact of ionchannel behaviour on ECG readings, she adds. "My group is using the technology in different ways and we are quite keen in exploring how far it can take us both clinically, in industry and of course in the science we do.” Words: Claire O’Connell May 2015

“Bringing in our calibration allows you really to tie computational models with a certain type of experiment,” explains Professor Rodriguez. “And we have created user-friendly software (called Virtual Assay, co-developed with Oxford Computing Consultants) that allows nonexperts to run the simulations.” The researchers are now working with various companies in the life sciences sector to test out the simulations and software, and being able to generate experimentally-calibrated populations in simulations could have applications that go even beyond the heart, notes Professor Rodriguez.

virtual assay

Called Virtual Assay, it generates various models of responses and calibrates them with data from experiments on populations of cells. The now calibrated model can be tested with drugs of interest to simulate a response. The software has already flexed its computational muscles for in silico case studies of specific drugs and their effects on populations of cardiac cells.
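The idea of an experimentally calibrated population of models can be illustrated with a very small Python sketch. It is not the Virtual Assay algorithm: the one-line "cell model", the parameter names `g_k` and `g_ca`, the sampling ranges and the experimental window of 250 to 350 ms are all hypothetical stand-ins chosen only to show the generate, calibrate and test-a-drug steps described above.

```python
import random


def action_potential_duration(g_k, g_ca):
    """Toy stand-in for a cardiac cell model (hypothetical formula).

    Returns a duration in ms from two conductance scaling factors.
    """
    return 300.0 * g_ca / g_k


def build_population(n_candidates, exp_min, exp_max, seed=0):
    """Sample candidate cells and keep those consistent with experiments."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_candidates):
        params = {"g_k": rng.uniform(0.5, 2.0), "g_ca": rng.uniform(0.5, 2.0)}
        apd = action_potential_duration(**params)
        # Calibration step: discard candidates outside the measured range.
        if exp_min <= apd <= exp_max:
            accepted.append(params)
    return accepted


def apply_drug(params, k_block):
    """Model a drug as a fractional block of the (hypothetical) K+ conductance."""
    drugged = dict(params)
    drugged["g_k"] *= 1.0 - k_block
    return drugged


# Hypothetical experimental window for the duration, in ms.
population = build_population(1000, exp_min=250.0, exp_max=350.0)
baseline = [action_potential_duration(**p) for p in population]
with_drug = [action_potential_duration(**apply_drug(p, 0.3)) for p in population]

print(len(population), "calibrated cells")
print("largest drug-induced prolongation (ms):",
      round(max(d - b for d, b in zip(with_drug, baseline)), 1))
```

Because every accepted cell has its own parameter set, the same simulated drug produces a spread of responses rather than a single generic one, which is the point of working with a population.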

Further information
Department of Computer Science, University of Oxford: www.cs.ox.ac.uk
Virtual Assay: Drug safety and efficacy prediction software: www.cs.ox.ac.uk/ccs/tools

THIBAULT HELLEPUTTE FROM DNALYTICS EXPLAINS HOW SYSTEMS BIOLOGY IS HELPING TO DELIVER PERSONALISED MEDICINE PRODUCTS

BIOGRAPHY
DNALYTICS IS A BELGIAN COMPANY FOUNDED IN 2012 AS A UNIVERSITY OF LOUVAIN SPIN-OFF THAT BASES ITS ACTIVITIES ON A DATA MINING TECHNOLOGY PLATFORM. DNALYTICS COVERS THE DEVELOPMENT OF DATA-DRIVEN PERSONALISED MEDICINE SOLUTIONS, FROM R&D TO MARKET ACCESS. Thibault Helleputte is co-founder and CEO of DNAlytics. He holds an M.S. in Computing Science Engineering and a PhD in Engineering Sciences from the University of Louvain. His research work during his PhD centred on Machine Learning applied to the design of novel tools for automated prediction based on genomic technology.

A Belgian start-up has launched a unique test to determine the type of arthritis a patient suffers from. Patients may present with inflammatory arthritis, but in a quarter of cases a clinician will not be able to diagnose which type it is.

The traditional approach is to wait until the symptoms become clearer, but this can take one to three years. During this time the patient’s joint can suffer irreversible damage. Now DNAlytics in Belgium has developed Rheumakit, a biomarker-based tool to diagnose patients suffering from undifferentiated arthritis. It predicts whether a patient suffers from osteoarthritis – mechanical damage to the joint – or something more complex like rheumatoid arthritis, an autoimmune disease. Data analytics and predictive modelling, along with a systems biology mindset, are central to the firm’s approach. “The treatments are very different, so this is important,” says Thibault Helleputte, CEO and co-founder of DNAlytics. “Our solution combines some biological measurements in blood, some clinical observation of the patient and a gene expression signature, combined in a predictive way.” DNAlytics promises “data-driven personalized medicine from R&D to market access” in its tagline. It spun out of UC Louvain’s computational and engineering department and provides consultancy to pharma

companies, especially in situations where datasets are so expansive that specific approaches to modelling or data analysis are required. But it also develops personalised medicine products. With Rheumakit, Helleputte realised that traditional measures like inflammation markers in blood were not sufficient for diagnosis, but could be combined with other measures. The team started with 50,000 candidate gene expression markers, more than 10 clinical variables and about the same number of biological measurements from the patients. “We applied feature selection algorithms in order to automatically select the features (i.e. the variables) that are most relevant to differentiating the different pathologies,” says Helleputte.

The current diagnostic solution combines three clinical markers and about 100 gene expression measures, obtained by looking at RNA expression levels. DNA is fixed, but RNA is a snapshot of metabolism and can vary according to disease and medication.


The DNAlytics team will take whatever are the most relevant markers for making a predictive diagnosis, whether that is clinical information, imaging data, psychological data or a range of genomic, epigenetic or proteomic data. Asked what gives the firm an edge, Helleputte says they focus not on the performance of their model on already observed data, but rather its predictive performance for new samples, yet unseen.

“Because when you work with data that have more variables than observations, so more features than patients say, you are guaranteed to find a perfect model for your data, and this is really bad news. You will find an infinity of models that look perfect, but these will then fail to make good predictions on new patients, because they are in fact too specific to the data that you have already observed.” Consequently DNAlytics focuses on generalisation ability and on multivariate solutions, meaning that a marker useful in combination with others is preferred, as are multiple data sources. “We select markers that will be more robust, and we developed algorithms to measure marker (in)stability,” says Helleputte.
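The warning about having more features than patients can be made concrete with a short simulation. The sketch below is an illustration under stated assumptions, not DNAlytics' method: every "marker" is pure random noise, the cohort of 10 training patients and 1,000 new patients is hypothetical, and only the figure of 50,000 candidate markers is taken from the article. Each marker and the diagnosis are encoded as bit patterns, one bit per patient.

```python
import random


def count_ones(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


rng = random.Random(42)
n_train, n_test, n_features = 10, 1000, 50_000

# One bit per patient: 1 = disease A, 0 = disease B (random, hypothetical).
train_labels = rng.getrandbits(n_train)
test_labels = rng.getrandbits(n_test)

# Screen 50,000 noise markers for a perfect match on the training patients.
perfect = []
for j in range(n_features):
    train_values = rng.getrandbits(n_train)
    if train_values == train_labels:
        perfect.append(j)

# Evaluate each "perfect" marker on new patients. It is still pure noise,
# so its values there are drawn independently of the diagnosis.
held_out = []
for _ in perfect:
    test_values = rng.getrandbits(n_test)
    agreement = n_test - count_ones(test_values ^ test_labels)
    held_out.append(agreement / n_test)

mean_held_out = sum(held_out) / len(held_out) if held_out else float("nan")
print(len(perfect), "noise markers classify all 10 training patients correctly")
print("their mean accuracy on new patients:", round(mean_held_out, 3))
```

With 10 patients, any single noise marker matches the diagnosis by chance about once in 1,024 tries, so screening 50,000 of them is expected to yield dozens of apparently perfect markers, each of which performs at roughly coin-flip level on unseen patients. That gap is why performance has to be judged on new samples.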

Personalised medicine is increasingly entering oncology practice, with a patient’s tumour genotyped to see which drugs will work best against it. This is not the case in rheumatology, but DNAlytics is working with rheumatologists and clinicians to reverse this situation. There are around 10 treatments for rheumatoid arthritis, but they are only effective in 60% of patients (and each is effective on a different subpopulation). “Right now it is impossible to tell which treatment will be beneficial to which patient, so it's trial and error,” says Helleputte. He is working on genetic profiling and other measures to allow the most relevant treatment to be selected. Anti-TNFs, a widely used class of disease-modifying antirheumatic drugs (DMARDs), cost Belgium around €100 million every year, yet work in only 60% of rheumatoid arthritis patients. “That’s €40 million just wasted,” notes Helleputte. “If you could predict in advance which patients will respond to treatment, you could use the right drug.”

DNAlytics is steeped in a systems biology mindset. It measures the activity of several metabolic pathways known to be involved in the disease or as targets of existing treatments. “Really understanding all those pathways and the mechanics behind them, not just genes or proteins in isolation, but to see how those elements combine, so from DNA to RNA to proteins to products of degradation, in blood, in serum, in cells, that is key to the future of our business.” Helleputte is passionate about the projects he and DNAlytics are wading into. “These are not the kind of projects you can do in your office, or on your own. You need to meet clinicians, meet regulatory authorities, meet the patients and perform the data analysis. These are really complex projects and that's really stimulating for me,” he enthuses.

Words: Anthony King June 2015

Rheumakit

How does it work? Clinicians go online and order a kit. They take a biopsy from the knee of the patient and put it in the kit within vials containing an RNA-preserving solution. The box is shipped to the company’s lab in Belgium, which generates transcriptomic data from the sample and uploads the results onto the web application. The clinician answers some clinical and biological questions on the website, and a mathematical algorithm kicks into gear and delivers the diagnosis within a few seconds. “The process takes a matter of a few days, which is a tremendous gain,” says Helleputte.


Further information
DNAlytics: dnalytics.com
RheumaKit: www.rheumakit.com

PUBLICLY-FUNDED GERMAN FLAGSHIP INITIATIVE IS LEADING THE WAY IN LIVER RESEARCH

BIOGRAPHY
SINCE ITS BEGINNING IN APRIL 2010, THE VIRTUAL LIVER NETWORK HAS ENGAGED GROUNDBREAKING AREAS OF SYSTEMS BIOLOGY IN A COORDINATED AND FOCUSED ATTEMPT TO SHOW THAT MODELLING AND SIMULATION CAN HELP TACKLE THE CHALLENGES OF UNDERSTANDING THE DYNAMICS OF BIOLOGICAL COMPLEXITY. Dr Adriano Henney is Programme Director of the VLN. Dr Henney has a PhD in Medicine and many years of academic research experience in cardiovascular disease in laboratories in London, Cambridge and Oxford. He also worked with AstraZeneca, exploring strategic improvements to the company’s approaches to pharmaceutical target identification and the reduction of attrition in early development.

Systems biology has now reached a new stage of maturity. There is no better proof than the existence of an audacious research project called the Virtual Liver Network (VLN). It provides an excellent example of how systems biology is now yielding a level of detail and quantitative data in biology at a scale not previously attained. “We need to do research in biology at the scale of what astrophysics has done,” explains Adriano Henney, Programme Director of the VLN. The aim of the VLN is to design a dynamic mathematical model of the human liver. This model will represent, rather than fully replicate, the liver’s physiology and morphology. More importantly, it will also integrate the wealth of data we have acquired post-genome through multiple models. Its ultimate goal is to represent the multiple liver functions, including detoxification, the fight against inflammation and the production of biochemicals necessary for digestion.

This so-called multi-scale modelling is a challenge. “The ability to model across scales of time and space is not easily done in biology,” explains Dr Henney. What makes this project possible are the data-crunching capabilities of bioinformatics and the power of new computer modelling. This combined approach enables the integration of quantitative data from the sub-cellular levels to the whole organ. Ultimately, better treatments for the many liver-related diseases are expected to be produced.

This €50M flagship initiative is supported by the German Federal Ministry of Education and Research, BMBF. Research teams that were previously in competition are gathered under the VLN umbrella for five years, until 2015. “This is the first example of an investment in systems biology of this size in a single country that focuses on delivering solutions to clinicians, and aiming to do so using simple-to-use formats,” Adriano Henney points out. The VLN involves a distributed network of research teams spread over Germany, in 70 laboratories. This approach is unique in international research in the biosciences. No team in the USA, Japan or any other country has managed to perform such an intricate geographically distributed research collaboration. Nor has any other research effort integrated the most fundamental biological research directly through to clinical studies in patients.

An organ as seemingly anodyne as the liver harbours surprising complexity. Using modelling and simulation to tackle this complexity, VLN scientists have been able to show, according to Dr Henney, “that we can use it to highlight inaccuracies in our current knowledge of physiological processes within this vital organ”. Specifically, the results of the team led by Prof. Rolf Gebhardt, Deputy Director of the Institute of Biochemistry at the


University of Leipzig, point to inaccuracies in our knowledge of liver steatosis, or fatty liver disease. The team found that challenging liver cells, called hepatocytes, by external fatty acids, results in accumulation of triglycerides in fat droplets. However, it only results in minor changes in the central metabolism of the liver, against all expectations. The team also found that the influence of insulin on fatty acid biosynthesis in liver was previously strongly overestimated, while that on the conversion of carbohydrates into fatty acids was rather underestimated. These findings led to a patent likely to have a high impact on future therapies of steatosis and related diseases.


Crucially, the project aims to translate the basic research into clinically-relevant applications for doctors. A team working on a showcase of the inflammatory process in the liver has developed a user-friendly interface for doctors, available on a tablet. This team is led by Prof. Steven Dooley, a specialist in molecular hepatology at the Mannheim Medical Faculty, and Jens Timmer, an expert in dynamic process modelling at the University of Freiburg. These inflammation models are available to professionals without the need for extensive training and can be used to help patients understand their illness. Further concrete results of the VLN project have potential applications in medicine. They include two patents on potential biomarkers for steatosis, which are pending. These disease indicators could ultimately be used as a diagnostic test predicting the onset of fatty liver disease. The network’s research efforts also draw on expertise from industrial collaborators, including German pharmaceutical company Bayer Technology Services. Industry partners have studied some genetic variants connected to the way individuals metabolise drugs. This team is led by VLN leadership team member, Lars Küpfer. The team’s findings will help identify patients more likely to benefit from treatment. Previously, liver toxicity has been the reason for the failure of a significant proportion of novel medicines. Now, systems biology is opening new avenues for drug discovery. For now, the team hopes to extend the funding by another five years, to create the prototype of a true multi-scale model within a single organ and link it to human physiology.

To meet the challenges of 21st Century medicine to deliver more effective therapies, we need a deeper understanding of the complexity of common disease and the dynamic interplay of genes and environment that underpins it. Systems biology offers potential solutions, examples of which are being pioneered in the Virtual Liver Network.

Words: Sabine Louet January 2014

VLN fact file
• German government-funded initiative
• €50M investment over 5 years
• 70 research groups
• 41 institutions
• 250 scientists


Further information
Virtual Liver Network: virtual-liver.de

MICHAEL HUCKA FROM CALIFORNIA INSTITUTE OF TECHNOLOGY ON AN OPEN FORMAT FOR CREATING MODELS THAT HAS STIMULATED THE CREATION OF AN INTERNATIONAL COLLABORATIVE COMMUNITY OF SOFTWARE DEVELOPERS

BIOGRAPHY
SYSTEMS BIOLOGY MARKUP LANGUAGE (SBML) IS A FREE AND OPEN INTERCHANGE FORMAT FOR COMPUTER MODELS OF BIOLOGICAL PROCESSES. SBML IS USEFUL FOR MODELS OF METABOLISM, CELL SIGNALLING, AND MORE. IT CONTINUES TO BE EVOLVED AND EXPANDED BY AN INTERNATIONAL COMMUNITY. Michael Hucka is a member of the professional staff of the Department of Computing and Mathematical Sciences, California Institute of Technology. His work focuses on the development of software standards and infrastructure for scientific computing and he has been instrumental in the development of SBML.

The rise of web-based technologies has catalysed the emergence of systems biology, in which dynamic biological structures and processes can be recast as software code and computational models. This promises to provide a powerful new interpretative lens for molecular biologists struggling to cope with the increasingly large datasets generated by large-scale investments in modern ‘omics’ disciplines, such as genomics, transcriptomics, proteomics, kinomics and metabolomics. But if systems biology is to become a fully networked discipline, with all of the synergies that that implies, it requires an underlying information infrastructure akin to that which underpins information and communications technology. It needs agreed data standards and formats for the automated creation, publication and exchange of complex biological information.

The emergence during the past fifteen years of Systems Biology Markup Language (SBML) as a free and open format for creating models is an important part of this effort. SBML is an application of Extensible Markup Language (XML), the World Wide Web Consortium (W3C) standard for formatting documents that now underpins data exchange across the public internet and private networks. SBML provides a common format for describing the structure and components of biological models. The role of SBML as the lingua franca of systems biology is loosely analogous to that of hypertext mark-up language (HTML), says Mike Hucka, of the California Institute of Technology, a key figure in the creation and ongoing development of SBML. “Just as HTML lets a person or software express some content using text formatted in a certain way (with the formatting mark-up normally hidden from view by software), so too SBML lets a person or software express certain kinds of data using text formatted in a certain way,” he says. The “certain kinds of data” are usually but not necessarily mathematical models of some biological phenomena, and the “formatting” is actually descriptions of data fields and relationships between parts of a mathematical model, instead of items such as section headings and bold or italic text that you find in HTML.
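To give a feel for what “text formatted in a certain way” means in practice, the sketch below embeds a small SBML-style document in a Python string and reads it back with the standard library's XML parser. The model itself (`toy_conversion`, with species `S` and `P`) is hypothetical, the fragment is deliberately minimal (it has no kinetic law, for instance) and has not been run through an SBML validator, so it should be read as an illustration of the mark-up rather than as a reference example. Dedicated libraries such as libSBML are the usual way to work with real SBML files.

```python
import xml.etree.ElementTree as ET

# Namespace used by SBML Level 3 Version 1 core documents.
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"

# A hypothetical one-reaction model: substrate S is converted to product P.
doc = f"""
<sbml xmlns="{SBML_NS}" level="3" version="1">
  <model id="toy_conversion">
    <listOfCompartments>
      <compartment id="cell" size="1" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="S" compartment="cell" initialConcentration="10"
               hasOnlySubstanceUnits="false" boundaryCondition="false"
               constant="false"/>
      <species id="P" compartment="cell" initialConcentration="0"
               hasOnlySubstanceUnits="false" boundaryCondition="false"
               constant="false"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="conversion" reversible="false" fast="false">
        <listOfReactants>
          <speciesReference species="S" stoichiometry="1" constant="true"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="P" stoichiometry="1" constant="true"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

root = ET.fromstring(doc.strip())
ns = {"sbml": SBML_NS}

# The mark-up names the parts of the model and how they relate to each other.
species_ids = [s.get("id") for s in root.findall(".//sbml:species", ns)]
reactions = {}
for reaction in root.findall(".//sbml:reaction", ns):
    reactants = [r.get("species") for r in
                 reaction.findall("sbml:listOfReactants/sbml:speciesReference", ns)]
    products = [p.get("species") for p in
                reaction.findall("sbml:listOfProducts/sbml:speciesReference", ns)]
    reactions[reaction.get("id")] = (reactants, products)

print("species:", species_ids)
print("reactions:", reactions)
```

Any program that understands the same element names can recover the same species and reactions from this text, which is what makes the format useful for exchanging models between tools.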

The significance of SBML goes beyond its immediate role as an information tool. The emergence of SBML has itself acted as a positive feedback loop in terms of the development of systems biology. It has stimulated the creation of an international collaborative community of software developers, theoretical biologists and


computational modellers. The reproducibility of SBML-based models allows for their scrutiny and evaluation by other scientists. Previously, models were typically represented in the form of printed equations, which limited their utility. SBML has enabled the development of curated databases of biological models—such as the BioModels Database and JWS Online—which users can access and interrogate dynamically. It also underpinned the automated generation of 140,000 biological models from representations of biochemical pathways, in the Path2Models project, which will greatly accelerate the development of the entire field.

In setting out the importance of SBML, Hucka frequently invokes a passage written by Nicolas Le Novère, of the Babraham Institute in Cambridge, UK: “One of the biggest problems of ‘Theoretical Biology’ was the failure of two of Popper's criteria for science: reproducibility and falsification. I have reviewed papers in the field for quite a few years now, and there is one commonality. You can't really evaluate them. You have to completely trust what is written by the authors. SBML could change that. It could permit better evaluation of modelling, and raise the whole field to a new level of confidence and consideration by other scientists in life science.” Ten years on from that observation, it is clear that SBML, the de facto standard for biological modelling, has delivered on its promise, by providing a common format that many software systems and users could agree to use. SBML is itself not a static entity. The specification is subject to ongoing revision and development, to take account of changes in the external technological environment and changes driven by developments within the systems biology field. Over time, people also find new needs and desire additions or


changes to existing features in a standard. All of these activities (corrections, updates and evolutionary changes) need to be done in a fair and systematic fashion. SBML Level 3, the most recent version of SBML, has been structured in such a way as to enable new extensions to be added with ease. It has a modular structure, with a defined set of core features and additional packages that extend its functionality into specific topic areas.

[Figure: SBML in use. Total number of known SBML-compatible software packages each year.]

Notwithstanding its importance, SBML relies, in large part, on voluntary efforts from the developer community to support its ongoing maintenance and development. Direct funding is limited to Mike Hucka’s four-strong team, who support its underlying infrastructure. This includes the extensive SBML.org website; libSBML, a free, open-source programming library that enables users to read, write and manipulate SBML files; and JSBML, a Java-based alternative to libSBML. US National Institutes of Health funding for this work ends in mid-2016. At the very least, follow-on funding is needed to maintain the ongoing effort, but a wider level of support for SBML’s broader development is crucial for the further maturation of systems biology.

Words: Cormac Sheridan June 2015

Further information
SBML: sbml.org
California Institute of Technology: www.caltech.edu
European Systems Biology Community: community.isbe.eu

DEVELOPMENTS IN STANDARDS-BASED INFORMATION INFRASTRUCTURES ARE HELPING BIOLOGISTS TO INTERROGATE ‘BIG DATA’

BIOGRAPHY
JWS ONLINE IS A SYSTEMS BIOLOGY TOOL FOR SIMULATION OF KINETIC MODELS FROM A CURATED MODEL DATABASE. JACKY SNOEP IS PROFESSOR OF BIOCHEMISTRY AT STELLENBOSCH UNIVERSITY, SOUTH AFRICA.

[Photo: (L-R) Dawie van Niekerk, Johann Eicher, Danie Palm and Jacky Snoep (JWS Online team, University of Stellenbosch)]

One of the big challenges for biological research is its full maturation into a ‘big science’ discipline, capable of tackling big, complex questions in a coordinated fashion. The inherent complexity and diversity in biology makes the maturation harder for the life sciences than it was for physics, but a growing adoption of standardisation, data-sharing strategies and attempts to address big questions in truly collaborative efforts are speeding up the process.

The ongoing development of systems biology, which integrates computer-based mathematical modelling of living systems with experimental observation, represents an important strand of this process of maturation. An essential element of this is the development of robust, standards-based information infrastructures capable of managing large quantities of data and software code in readily accessible formats. JWS Online and BioModels represent two significant and complementary model management initiatives, which are contributing to the ‘digitisation’ of biology by enabling researchers to explore previously developed models of diverse biological processes. JWS Online, originally developed in 2000 at Stellenbosch University (SU; Stellenbosch, South Africa), is now co-developed at the University of Manchester (UM; Manchester, United Kingdom) and the Vrije Universiteit (VU; Amsterdam, Netherlands). It played a prominent role in pioneering the concept of providing researchers with online, centralised access to biological models. It includes a simulation environment that enables scientists

to run individual models remotely, eliminating the need for painstaking recoding work that would be otherwise necessary. “It’s a lot of work to code mathematical models from the literature, and it’s error-prone,” says Jacky Snoep, Professor of Biochemistry at Stellenbosch University. “Every researcher who wanted to use these models would have had to do the same work.”

JWS Online now contains some 200 curated models, which have been rendered into a standard format using Systems Biology Markup Language (SBML), the de facto standard for creating computational models of biological processes. The system is also employed by the FEBS Journal, to test models that are submitted for review along with papers. Reviewers have access to the JWS toolset, via a secure site, and can run the models, to ensure that the data contained in the paper can be reproduced by the model. JWS Online has been incorporated into the SEEK collaboration environment (PubMed: 21943917), originally developed for the SysMo project on the systems biology of microorganisms. The SEEK is a mature data and model management platform for large-scale systems biology projects. “The data and model management structure we set up for the SysMo project is currently the best system available, and its approach is likely to evolve into a standard,” says Snoep.
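For readers unfamiliar with kinetic models, the following Python sketch shows in miniature what “running a model” and checking it against an independent result can look like. It is not taken from JWS Online: the single Michaelis-Menten reaction, the parameter values and the simple fixed-step Euler integration are all assumptions made for illustration, and real simulators use more robust numerical solvers.

```python
import math


def simulate(s0, vmax, km, t_end, dt=0.001):
    """Integrate S -> P with Michaelis-Menten kinetics using Euler steps."""
    s, p = s0, 0.0
    trajectory = [(0.0, s, p)]
    steps = int(round(t_end / dt))
    for i in range(steps):
        rate = vmax * s / (km + s)  # reaction rate at the current state
        s -= rate * dt
        p += rate * dt
        trajectory.append(((i + 1) * dt, s, p))
    return trajectory


# Hypothetical parameter values in arbitrary concentration and time units.
s0, vmax, km, t_end = 10.0, 1.0, 2.0, 8.0
trajectory = simulate(s0, vmax, km, t_end)
t_final, s_final, p_final = trajectory[-1]

# Independent check: the integrated rate law for this reaction states that
# km * ln(s0 / s) + (s0 - s) = vmax * t, so the residual should be near zero.
residual = km * math.log(s0 / s_final) + (s0 - s_final) - vmax * t_final

print("substrate remaining:", round(s_final, 3))
print("residual against the integrated rate law:", round(residual, 5))
```

A reviewer re-running a submitted model is doing something similar in spirit: executing the encoded equations and confirming that the output agrees with what the paper reports.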


BioModels Database, developed at the European Bioinformatics Institute (EBI; Hinxton, UK) since 2005, was created in response to the needs of the community for a model repository. It reflects the growing number of models published in the literature and provides them in a computationally reusable form. These models originate from a plethora of domains representing work that spans decades of refinement. Notable examples include:
• Synthetic biology (BIOMD0000000012)
• Neurobiology (BIOMD0000000020)
• Oncology (BIOMD0000000234)
• Virology (BIOMD0000000463)
• Immunology (BIOMD0000000243)
• PK/PD – Systems Pharmacology (BIOMD0000000490)

BioModels Database is now by far the field’s largest repository of biological models, having amassed more than 1,000 manually curated biological models. Each of these models is described in a peer-reviewed publication, manually curated to verify that the model in the database is capable of reproducing the published results, and extensively annotated to specify the biological entities that are represented within the model. Additional annotations are also provided that link the model itself to further information such as mathematical concepts, ontological terms (including those that reference biological processes), and to other models (allowing hierarchical analysis on model lineages). Over 300 journals recommend deposition of models directly to the database in their submission guidance notes to authors. With the ever-growing means by which 'big data' is generated, there is an ever-evolving need to deal with it. The BioModels team has recently introduced a means to automatically manage models from large data sets. Besides the 1,000+ curated models, BioModels also now contains an additional 140,000 models which were generated automatically from representations of biochemical pathways taken from multiple sources. Collectively, these


models cover domains such as metabolism, signal transduction, electrophysiology, population and ecosystem dynamics, pharmacokinetics and pharmacodynamics, and mechanisms of disease. “We are fulfilling the classical library function in this domain—we have the record of the developed models,” says Henning Hermjakob, head of the EBI’s proteomics services team. BioModels is not just a passive repository—it is a true database, which can be interrogated dynamically. Models can be readily downloaded or can be run remotely using several different tools, including the JWS Online simulation environment. A ‘Model of the Month’ feature enables new users to learn about important individual models in a largely jargon-free way; BioModels provides a variety of teaching materials and resources, and can be regarded, says Hermjakob, as a portal to the world of modelling. Modelling biological systems continues to evolve from being an early-stage endeavour. The field has taken major steps in recent years, culminating in the publication of the first whole-cell computational model, which predicts a cell’s phenotype (or visible characteristics) from its genotype (or genetic makeup) (Karr et al., 2012). JWS Online and BioModels are both vital components of the information infrastructure supporting these efforts.


BioModels Database serves as a reliable repository of computational models of biological processes, and hosts models described in peer-reviewed scientific literature. Recently it has also begun to incorporate models that can be automatically generated from 'big data' pathway resources. Henning Hermjakob is team leader of Proteomics Services at the European Bioinformatics Institute, based in Hinxton, Cambridge.

A growing database

                    2005      2013
Models              -         1000+
Species             300       400,000+
Cross-references    1000      1,000,000+

Words: Cormac Sheridan January 2014

Further information
JWS Online: jjj.biochem.sun.ac.za
BioModels Database: www.ebi.ac.uk/biomodels
The SEEK: www.seek4science.org
SysMo: www.sysmo.net
Path2Models: code.google.com/p/path2models

UK ONLINE COURSE IS CREATING NEW GENERATION OF SYSTEMS BIOLOGISTS

BIOGRAPHY
SYSMIC IS A COMPREHENSIVE ONLINE COURSE IN SYSTEMS BIOLOGY AIMED AT RESEARCHERS IN THE BIOLOGICAL SCIENCES. THE COURSE PROVIDES INTRODUCTORY AND ADVANCED TRAINING IN MATHS AND COMPUTING BASED AROUND BIOLOGICAL EXAMPLES.

Geraint Thomas is the lead on the SysMIC project and is based in the Department of Cell and Developmental Biology at University College London. He is a core member of the UCL/Birkbeck Institute of Structural Molecular Biology and the admissions tutor and PhD liaison for the Understanding Biological Complexity PhD programme at CoMPLEX.

Gerold Baier leads the development of the SysMIC course and is based in the Department of Cell and Developmental Biology at University College London. With a background in biochemistry and nonlinear dynamics, his research work is on developing computational models of epileptic seizure dynamics.

A UK initiative is delivering a thriving and expanding online course in systems biology, Systems training in Maths, Informatics and Computational Biology (SysMIC). The course, which is now looking to spread beyond the UK, serves up introductory and advanced training in maths and computing, but always with biological examples. SysMIC sprang from a desire by a UK research council to ensure its workforce had the mathematical and computational horsepower to engage with systems biology and the interdisciplinary science agenda. Funded by the Biotechnology and Biological Sciences Research Council in the UK and run by a consortium of University College London, Birkbeck College, the University of Edinburgh and the Open University, SysMIC has immersed participants in programming in MATLAB to model and simulate biological systems and in using R statistical software. The core team at UCL comprises Geraint Thomas, Gerold Baier, web technologist Philip Lewis and administrator Hannah Lawrence, and the course has already served up skills and confidence to over one thousand biologists. Twelve of the 14 University Doctoral Training Partnerships funded by the UK’s Biotechnology and Biological Sciences Research Council have adopted the course as core training for their PhD students, but it caters too for established researchers wishing to brush up their skills. “It is for anyone interested in biosciences, from molecules to ecosystems, and for all levels of seniority,” explains Geraint Thomas, the cellular biologist at University College London, UK, who leads SysMIC.

Participants will emerge with skills suited to modelling biological systems, be they ecological systems, the growth of organisms or cell signalling pathways. Plus, they will no longer be baffled by other people’s simulations of the biological world. “The immediate aim is that they can read a paper, recognise the assumptions made, look at the code and check it if they want,” says Thomas. There are three modules, each lasting six months and requiring five hours of work per week, progressing from basic skills to advanced topics and then project work.

SysMIC is for anyone interested in biosciences, from molecules to ecosystems, and for all levels of seniority

Course organisers say undergraduate and postgraduate students too often see their maths skills atrophy as biology courses pack in core subjects and squeeze maths out. SysMIC will instil linear algebra, dynamical systems and differential equations, but turn maths teaching on its head. Traditionally, teaching starts with the maths and then seeks applications. “We start with a biological example and ask what kind of mathematics is needed,” says Gerold Baier at UCL. On the computing side, the MATLAB programming language and the R statistical package are taught, giving participants the basics to set up code, analyse data and write a script to run a model. Examples are drawn from real papers, but nothing too advanced for beginners.


Trainees see gains in being able to communicate with mathematicians, statisticians and computer scientists. “It is the communication skills that people value most. They learn how to think in a structured, quantitative way, and are able to talk to specialists in a way that allows them to gain much more profound advice,” Baier explains.

Now that things are working out, we want to expand into Europe

The course is designed to fit time-poor people who are engaged full time in biological research. There is strong demand from pharma. “The pharma industry have employed people with a bioscience or pharmacological degree, but often little mathematical or programming training, yet the nature of data today means that you must use quantitative and computational means to deal with them properly,” says Baier. Bespoke parts for bioscience industries are being created, which focus on experimental design, statistical analysis and how to optimise those in drug discovery programmes. The core material is akin to a solid trunk of taught modules, with branches now beginning to grow out to cater for individuals or particular groups.

That the future of biology is going to be more mathematical is hardly a revelation. This was clear 15 years ago, says Thomas, “but it has taken a while to turn around the training super tanker.” Thomas set out as a biochemist and tacked to a more multidisciplinary line before devoting six years to a maths degree at the Open University, UK. “I enjoyed it straight away. I got a buzz out of solving things,” says Thomas. “But at every stage of my degree I was thinking, I can apply this to my research, or my colleague could use this mathematical approach.” Those taking the course often say they spend so long dealing with complexity that it is satisfying to have clear answers to problems.


SysMIC adds new papers, examples and datasets continuously, along with multimedia resources. There are online tutors and an expanding pile of FAQs. Course quizzes serve to reassure people that they are making progress, while supervisors, line managers or a training team can determine whether a sufficiently high standard is being achieved. Course sponsors typically expect 70% of course questions to be completed by students, with the remainder devoted to a mini-project, but that can be adjusted.

The course never sits still. It is expanding to take on a parallel version in the Python programming language. As new languages establish themselves, organisers will build new versions of the course, which will stand as a perpetual resource. “If you do it in MATLAB and 3 years later you realise you need some Python training, then you can go back and see familiar material and take on all the examples in a completely different language,” Thomas says. Another of his major objectives is to move beyond the UK and embrace European scientists. “Now that things are working out, we want to expand into Europe.”

Anja Korencic at the University of Ljubljana began the course to help with the modelling in her research project on circadian clocks. “I had some MATLAB, but I was looking for a systematic introduction to modelling and to be able to communicate with modellers.” She is impressed by how the course begins at a basic level, taking you through step by step, and how it is really designed for biologists. “I thought I might skip over some of the early sessions, but even the first and second sessions had some really nice tricks or details that proved really useful for me.”

The SysMIC course comprises three modules, each taking approximately 6 months to complete. Modules 1 and 2 consist of a series of units based around biological examples, which are supported with mathematical background reading. The biological examples show how the maths techniques can be used to model biological systems, with code examples of computer programming. Students are taught using hands-on code examples in MATLAB (used for mathematical analysis and modelling) and the R package (used for data analysis). Module 3 consists of support for students undertaking an extended project to apply interdisciplinary skills to their own area of interest.

Words: Anthony King June 2015

Further information: SysMIC www.sysmic.ac.uk

PEDRO MENDES FROM THE UNIVERSITY OF MANCHESTER EXPLAINS HOW BIOLOGISTS ARE BEING GIVEN A HELPING HAND TO INTRODUCE MODELLING INTO THEIR RESEARCH

The best front-of-house greetings are friendly and do not require you to know what’s going on behind the scenes. The same trend is seen in technology, where intuitive digital user interfaces are developed for simplicity and ease of use, hiding the heavy lifting. There is now an analogous option for biologists.

COPASI IS A SOFTWARE APPLICATION FOR SIMULATION AND ANALYSIS OF BIOCHEMICAL NETWORKS AND THEIR DYNAMICS. COPASI IS A STAND-ALONE PROGRAM THAT SUPPORTS MODELS IN THE SYSTEMS BIOLOGY MARKUP LANGUAGE (SBML) STANDARD AND CAN SIMULATE THEIR BEHAVIOR USING ORDINARY DIFFERENTIAL EQUATIONS (ODES) OR GILLESPIE'S STOCHASTIC SIMULATION ALGORITHM.

BIOGRAPHY Pedro Mendes is Professor of Computational Systems Biology at the University of Manchester and Professor in Residence at the University of Connecticut Health Center. His research is in the area of computational systems biology, which aims to better understand biological systems through the use of computer models.

The software began its journey in 2000 as a collaboration between Mendes and Ursula Kummer, a biological modeller at the University of Heidelberg, Germany. The first version launched four years later, in 2004.

COPASI is an open source software package that offers biologists a guiding hand in modelling and simulation. This is a welcome service given how modelling and simulation are increasingly needed to aid the understanding of cellular behavior and to facilitate a quantitative reading of experiments.

Under the hood it has two major types of simulation. One simulates biochemical networks with differential equations, which is the most traditional approach and probably the most widely used. “Essentially users describe a network and the software builds differential equations for them, based on the network and some mathematical details they may need to add,” Mendes explains.
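As a rough illustration of what building differential equations from a network can mean, the sketch below takes a toy two-step network (A → B → C) with mass-action kinetics and derives each species' rate of change from the reaction list. The network, rate constants and function names are hypothetical and this is not COPASI's own code; a fixed-step fourth-order Runge-Kutta integrator stands in for the adaptive solvers a real package would normally use.

```python
# Toy network (hypothetical): A -> B with k = 0.5, B -> C with k = 0.2.
# Each reaction is (rate constant, reactants, products), with stoichiometries.
REACTIONS = [
    (0.5, {"A": 1}, {"B": 1}),
    (0.2, {"B": 1}, {"C": 1}),
]
SPECIES = ["A", "B", "C"]


def derivatives(conc):
    """Assemble d[species]/dt from the reaction list using mass-action rates."""
    dcdt = {s: 0.0 for s in SPECIES}
    for k, reactants, products in REACTIONS:
        rate = k
        for s, n in reactants.items():
            rate *= conc[s] ** n
        for s, n in reactants.items():
            dcdt[s] -= n * rate  # reactants are consumed
        for s, n in products.items():
            dcdt[s] += n * rate  # products are formed
    return dcdt


def rk4_step(conc, dt):
    """Advance all concentrations by one fixed Runge-Kutta (RK4) step."""
    def shifted(slope, h):
        return {s: conc[s] + h * slope[s] for s in SPECIES}

    k1 = derivatives(conc)
    k2 = derivatives(shifted(k1, dt / 2))
    k3 = derivatives(shifted(k2, dt / 2))
    k4 = derivatives(shifted(k3, dt))
    return {
        s: conc[s] + dt / 6 * (k1[s] + 2 * k2[s] + 2 * k3[s] + k4[s])
        for s in SPECIES
    }


def simulate(initial, t_end, dt=0.01):
    """Integrate from t = 0 to t_end and return the final concentrations."""
    conc = dict(initial)
    for _ in range(int(round(t_end / dt))):
        conc = rk4_step(conc, dt)
    return conc


final = simulate({"A": 10.0, "B": 0.0, "C": 0.0}, t_end=5.0)
print(final)
```

The user only edits the reaction list; the equations follow from it, which is the division of labour the quote describes.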

“The software was made to target those biologists who want to do modelling, but don’t necessarily have all the mathematical background. The software tries to hide some of the mathematics behind the user interface, so the user doesn’t need to know all the algorithms being used,” says Pedro Mendes, Professor of Computational Systems Biology at the University of Manchester, UK, and a leader in its development.

COPASI allows researchers to create models of biochemical networks and then use different algorithms to simulate them and analyse the results

“It allows researchers to create models of biochemical networks and then use different algorithms to simulate them and analyse the results.” Cells are composed of many organelles, so the software allows the reactions to be distributed across several compartments. It is used in systems biology to develop reaction kinetic models for biochemical networks, to simulate their behavior and to analyse their properties. Models can be based on ordinary differential equations or stochastic kinetics.

The other type – stochastic simulations – considers each molecule as a single entity, takes on more of the physical details of the whole network and offers more accurate, more mechanistic simulations. But COPASI makes it straightforward for researchers to switch between the two. “You can tell the software to do it this way or that, and it tries to do everything automatically,” says Mendes.
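To make the contrast concrete, here is a minimal sketch of a stochastic simulation in the style of Gillespie's direct method, in which molecules are counted as whole numbers and reactions fire one at a time at random moments. The reversible reaction A ⇌ B, its rate constants and all function names are illustrative assumptions, not taken from COPASI.

```python
import random


def gillespie(initial_state, reactions, t_end, seed=0):
    """Direct-method stochastic simulation.

    reactions is a list of (propensity_function, state_change) pairs.
    Returns a list of (time, state) snapshots, one per reaction event.
    """
    rng = random.Random(seed)
    t = 0.0
    state = dict(initial_state)
    trajectory = [(t, dict(state))]
    while True:
        propensities = [propensity(state) for propensity, _ in reactions]
        total = sum(propensities)
        if total == 0.0:
            break  # nothing can react any more
        # Waiting time to the next event is exponentially distributed.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick one reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        cumulative = 0.0
        for a, (_, change) in zip(propensities, reactions):
            cumulative += a
            if threshold < cumulative:
                for species, delta in change.items():
                    state[species] += delta
                break
        trajectory.append((t, dict(state)))
    return trajectory


# Hypothetical reversible isomerisation A <-> B, counted in molecules.
K_FORWARD, K_BACKWARD = 1.0, 0.5
REACTIONS = [
    (lambda s: K_FORWARD * s["A"], {"A": -1, "B": +1}),
    (lambda s: K_BACKWARD * s["B"], {"A": +1, "B": -1}),
]

trajectory = gillespie({"A": 100, "B": 0}, REACTIONS, t_end=10.0, seed=1)
print(len(trajectory), trajectory[-1])
```

Running it with different seeds gives different trajectories, which is the molecule-level variability that a differential-equation model averages away.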


COPASI is not standing still. Different algorithms are added all the time, with parameter estimation now an option: this essentially links experimental data with the model, building a bridge between the model and what is observed. Because it is open source, COPASI’s success can be difficult to quantify precisely. But with 10,000 downloads last year and a registered community of 2,000 users, it is clearly a popular computational tool. Right now, the Mendes and Kummer labs are responding to user demand by introducing delay differential equations, which are models that have explicit delays built in; these are important in fields like circadian biology but have uses in other areas too.
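Parameter estimation can be pictured as a search for the parameter values that bring the model's predictions closest to the measurements. The sketch below, which is only an illustration and not COPASI's algorithm, fits a single decay rate constant by minimising the sum of squared errors with a golden-section search. The "observations" are synthetic, generated noise-free from a known rate constant so that the recovered value can be checked; every name and number is an assumption made for the example.

```python
import math

# Synthetic, noise-free "measurements" of A(t) = A0 * exp(-k * t),
# generated from a known rate constant purely for illustration.
A0 = 10.0
K_TRUE = 0.3
TIMES = [0.0, 1.0, 2.0, 4.0, 6.0, 8.0]
OBSERVED = [A0 * math.exp(-K_TRUE * t) for t in TIMES]


def sum_squared_error(k):
    """Mismatch between the model with rate constant k and the observations."""
    return sum(
        (A0 * math.exp(-k * t) - y) ** 2 for t, y in zip(TIMES, OBSERVED)
    )


def golden_section_minimise(f, lo, hi, tol=1e-8):
    """Locate the minimum of a single-minimum function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2


k_fit = golden_section_minimise(sum_squared_error, 0.0, 2.0)
print(k_fit)
```

Real data are noisy and models usually have many parameters, so practical tools rely on multi-dimensional optimisers; the principle of minimising the model-data mismatch is the same.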

MATLAB requires researchers to learn some of the MATLAB programming language. “Learning our software should be easy and the language used is often specific to biochemistry,” Mendes explains. He compares it to the difference between Windows and DOS, where you had to type in and remember commands. With Windows, you use macros and icons and don’t need to remember everything. That is how COPASI works. It essentially has menus, dialogue boxes and icons, and people click on these to build the model up. Still, SBML means researchers can test models out on either or both packages.

It is open source, so others can contribute. “We have an API (an application programming interface) which is essentially a way of allowing other people to write programmes and use part of COPASI in their programmes,” says Mendes.

Learning our software should be easy

They have posted instructive clips on YouTube, introducing COPASI and giving tutorials on what you can do with it. The Mendes and Kummer labs often give workshops at conferences, and the lessons learnt from watching users help them to continually improve this vital software package.

COPASI is able to import and export in Systems Biology Markup Language (SBML), which is a free file format useful for exchanging models of metabolism and cell signaling and more.
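Because SBML is an XML format, a model file can be inspected with general-purpose tools. The sketch below reads species and reactions from a pared-down SBML Level 3 fragment using only Python's standard library. The model itself is made up for the example and is not guaranteed to pass full SBML validation; dedicated libraries would normally be used for real work.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 core.
NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Hypothetical toy model: one compartment, species A converted to B.
MODEL_XML = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cytosol" size="1" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cytosol" initialConcentration="10"
               hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false"/>
      <species id="B" compartment="cytosol" initialConcentration="0"
               hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="conversion" reversible="false" fast="false">
        <listOfReactants>
          <speciesReference species="A" stoichiometry="1" constant="true"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="B" stoichiometry="1" constant="true"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
""".strip()

root = ET.fromstring(MODEL_XML)

# Map each species id to its initial concentration.
species = {
    sp.get("id"): float(sp.get("initialConcentration"))
    for sp in root.findall(".//sbml:species", NS)
}

# Map each reaction id to its (reactants, products) species lists.
reactions = {}
for rx in root.findall(".//sbml:reaction", NS):
    reactants = [
        ref.get("species")
        for ref in rx.findall("sbml:listOfReactants/sbml:speciesReference", NS)
    ]
    products = [
        ref.get("species")
        for ref in rx.findall("sbml:listOfProducts/sbml:speciesReference", NS)
    ]
    reactions[rx.get("id")] = (reactants, products)

print(species)
print(reactions)
```

Any tool that reads the same file sees the same species and reactions, which is what makes the format useful for exchanging models between packages.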


Copasi is an international collaboration between:

Putting his perfectionist hat on, Mendes says they are striving to improve the software and learning tools for users.

“If a pathogen invades a person, the immune system takes a while to respond. Sometimes you don’t need to account for that delay,” says Mendes. “But sometimes researchers want to make a higher level model where they add a specific delay in the model. If the system responds five hours later, this must be represented in a specific algorithm.”
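A delay of this kind can be sketched with a toy delay differential equation, dx/dt = -k·x(t - τ), in which the rate of change at time t depends on the state τ time units earlier. The equation, parameter values and function name below are illustrative assumptions rather than anything taken from COPASI, and a simple fixed-step Euler scheme with a stored history is used for clarity rather than accuracy.

```python
def simulate_dde(k, tau, t_end, dt=0.01, history=1.0):
    """Integrate dx/dt = -k * x(t - tau) with fixed-step Euler.

    x is assumed to equal the constant `history` for all t <= 0.
    Returns a list xs where xs[i] approximates x at time i * dt.
    """
    delay_steps = int(round(tau / dt))
    n_steps = int(round(t_end / dt))
    xs = [history]
    for i in range(n_steps):
        j = i - delay_steps
        # Look up the state tau time units ago, or fall back on the history.
        delayed = xs[j] if j >= 0 else history
        xs.append(xs[i] + dt * (-k * delayed))
    return xs


xs = simulate_dde(k=0.5, tau=1.0, t_end=20.0)
print(xs[100], xs[-1])
```

During the first τ time units the system still "sees" only its history, so x falls in a straight line; after that the delayed feedback takes over. The solver must keep past values available, which is why delays call for dedicated algorithms.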

What other options are open to researchers? People can write their own specific software in languages like C or Python, says Mendes, which is how he started out when doing his PhD. This led him to develop GEPASI, the precursor of COPASI, in the early 1990s. “The number of people writing their own programmes is smaller now,” he says. “The majority of people who don’t use COPASI use MATLAB, a commercial package designed originally for engineers.”

COPASI FACT FILE

10,000 downloads in 2014
Community of 2,000 users

Words: Anthony King June 2015

Further information: COPASI www.copasi.org

KATY WOLSTENCROFT FROM THE UNIVERSITY OF LEIDEN DISCUSSES HOW THE WEB-BASED SEEK PLATFORM IS GIVING SCIENTISTS GREATER ACCESS TO AND MORE INTELLIGENT USE OF THE VAST REPOSITORIES OF BIOLOGICAL DATA BEING AMASSED

THE SEEK PLATFORM IS A WEB-BASED RESOURCE FOR SHARING HETEROGENEOUS SCIENTIFIC RESEARCH DATASETS, MODELS OR SIMULATIONS, PROCESSES AND RESEARCH OUTCOMES. IT PRESERVES ASSOCIATIONS BETWEEN THEM, ALONG WITH INFORMATION ABOUT THE PEOPLE AND ORGANISATIONS INVOLVED.

BIOGRAPHY Dr Katy Wolstencroft is an Assistant Professor at the Leiden Institute of Advanced Computer Science (LIACS), teaching courses in bioinformatics and computer science. Dr Wolstencroft was previously a Research Fellow in the School of Computer Science, University of Manchester, working on scientific workflows with the Taverna workbench, and on systems biology data and model management with the SEEK platform.

The increasing data intensity of biological research, which is closely linked to the increasing complexity of scientific collaboration, has created an urgent need for new tools to allow researchers to navigate the ever-expanding information universe. Systems biology, the emerging discipline that seeks to map precisely all of the dynamic processes within living cells and organisms, has created very particular data management requirements. At its core is a tight coupling between experimental data and data modelling, as predictions and hypotheses based on computer models are tested experimentally, which can lead to further refinements in the model or to revisions in certain parameters. The scale and complexity of the data that are generated require standards of data stewardship that represent significant challenges to biologists and data management specialists alike.

an ambitious attempt to capture the complexities of systems biology research … to maximise the use and reuse of the data that are generated

The SEEK platform is a commons interface which has grown out of a large-scale European project on the systems biology of microorganisms (SysMO). It represents an ambitious attempt to capture the complexities of systems biology research within a web-based data management and collaboration environment, in order to maximise the use and reuse of the data that are generated. The platform extends into the systems biology domain concepts and standards developed under the semantic web initiative of the World Wide Web Consortium (W3C), an ongoing effort to present disparate forms of information in machine-readable formats, to enable more sophisticated forms of data searching and analysis across distributed systems. SEEK was developed by researchers based at the University of Manchester (UK), the Heidelberg Institute for Theoretical Studies (Germany) and the University of Stellenbosch (South Africa) in response to a requirement on the part of SysMO’s funders that its grantees, who are distributed across more than 100 institutions located in six countries, share data and data models. Before SEEK there was no obvious way to do this in any kind of comprehensive or controlled fashion. Researchers shared data by exchanging very basic forms of documentation, such as spreadsheets, by setting up project-specific wikis or by using generic web-based or cloud-based collaboration environments, which are not adapted to the specific methodologies or information architectures of systems biology.

The SEEK system acts both as a repository, which allows users to publish and share data and models, and as a registry, which provides links to relevant data sources and models hosted elsewhere. Its main components include an assets catalogue, which holds data files, protocols, workflows, models and publications; a ‘yellow pages’ feature, which contains information on SysMO participants and their host institutions; and an access control feature, which enables users to control third party access to their data.

One of the main challenges inherent in its design was to create a system that was sufficiently powerful and robust to be useful, while not placing an excessive burden on its users. Biological information is inherently heterogeneous and complex. Systems biology generates multiple types of data, including various species of ‘omics data (genomics, transcriptomics, proteomics, metabolomics, etc.), imaging data and enzyme kinetics data. To enable all of this to be managed coherently in a web environment, data and accompanying models and experimental protocols need to be ‘annotated’ or described in a precisely defined manner, and the relationships between the various elements must also be specified.
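One way to picture annotated assets, the relationships between them and per-user access control is as a small catalogue data structure. The sketch below is purely hypothetical: the class names, fields and relation labels are invented for illustration and do not reflect SEEK's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A catalogued item with minimal descriptive metadata."""
    asset_id: str
    kind: str  # e.g. "data_file", "model", "protocol"
    title: str
    owner: str
    shared_with: set = field(default_factory=set)


@dataclass
class Catalogue:
    """Holds assets plus explicit, typed links between them."""
    assets: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (source, relation, target)

    def add(self, asset):
        self.assets[asset.asset_id] = asset

    def link(self, source_id, relation, target_id):
        if source_id not in self.assets or target_id not in self.assets:
            raise KeyError("both ends of a link must be in the catalogue")
        self.links.append((source_id, relation, target_id))

    def related(self, asset_id, relation):
        """Assets reachable from asset_id through the given relation."""
        return [
            self.assets[target]
            for source, rel, target in self.links
            if source == asset_id and rel == relation
        ]

    def visible_to(self, user):
        """Ids of assets that the user owns or has been granted access to."""
        return sorted(
            a.asset_id
            for a in self.assets.values()
            if user == a.owner or user in a.shared_with
        )


catalogue = Catalogue()
catalogue.add(Asset("d1", "data_file", "Enzyme kinetics measurements", owner="alice"))
catalogue.add(Asset("m1", "model", "Glycolysis model", owner="bob", shared_with={"alice"}))
catalogue.add(Asset("p1", "protocol", "Assay protocol", owner="alice"))
catalogue.link("m1", "fitted_to", "d1")
catalogue.link("d1", "produced_by", "p1")

print([a.title for a in catalogue.related("m1", "fitted_to")])
print(catalogue.visible_to("bob"))
```

Keeping the links explicit is what lets a model be traced back to the data it was fitted to and the protocol that produced those data.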

The SEEK … eliminates what would otherwise represent a significant overhead for users

The SEEK system can generate this ‘metadata’ (data about data) on the fly, as users deposit data held in commonly used file formats, such as spreadsheets, using predefined templates. This eliminates what would otherwise represent a significant overhead for users. “There are not that many incentives for people to spend time curating and annotating their data and their models,” says Katy Wolstencroft, a member of the SEEK development team at Manchester (now at the University of Leiden, in the Netherlands). The system also draws on the ISA framework (Investigation, Studies, Assays), an emerging software standard for managing biosciences data. SEEK can be readily adapted for any systems biology project. Its user base has, in fact, grown to more than a dozen other implementations since it became available via an open source licence in 2010. These include the European Virtual Institute of Malaria Research (EVIMalaR), a Network of Excellence established under the European Commission’s 7th Framework Programme (FP7), which has deployed SEEK to develop a comprehensive picture of research activity within the network. The Virtual Liver Network, which comprises 70 research groups distributed across Germany, has implemented SEEK to enable its members to find and share data, models and processes that relate to liver function at multiple levels of organisation, from the individual cell up to the complete organism. Other users include: Unicellsys, another FP7 project, which is developing a quantitative understanding of the control and coordination of cell growth in response to internal and external triggers; JenAge, a German research initiative on the systems biology of ageing; and ROSage, another German project, which is exploring the role of reactive oxygen species in the ageing process.

SEEK IN USE
German government-funded initiative
€50M investment over 5 years
70 research groups
41 institutions
250 scientists

SEEK is part of a wider ecosystem of standards-compliant, open-source systems that will, ultimately, facilitate greater access to and more intelligent use of the vast repositories of biological data that are being amassed globally.

Words: Cormac Sheridan December 2013


Further information: The SEEK platform www.seek4science.org; ISA framework (Investigation, Studies, Assays) www.isacommons.org; SysMO www.sysmo.eu

Acknowledgements
This publication was produced by Systems Biology Ireland, University College Dublin, as part of the Infrastructure for Systems Biology Europe (ISBE) programme, supported by an EU FP7 Infrastructure award (Grant agreement no: 312455). We would like to thank the interviewees for generously giving up their time to talk about their work, as well as their support teams for providing additional information.

Editor: Will Fitzmaurice, Systems Biology Ireland
Writers: Anthony King, Sabine Louet, Claire O’Connell, Cormac Sheridan
Design: Resonate Design

Image credits
Profile photographs courtesy of the researchers and their institutions unless otherwise stated.
Denis Noble: SBMC 2010 © Britt Schilling; ap_i/Shutterstock; CLIPAREA/Shutterstock; YanLev/Shutterstock; Nerthuz/iStockphoto; Stuart Dunbar at outreach event © Layton Thompson; Happetr/Shutterstock; bymandesigns/Shutterstock; Thibault Helleputte © Laetizia Bazzoni; Rheumakit © DNAlytics; zebrafish image courtesy of Melinda Halasz; University College Dublin researchers © UCD

Contact
For further information on Infrastructure for Systems Biology Europe, visit www.isbe.eu. For further information on this publication, please contact: Will Fitzmaurice, Systems Biology Ireland, University College Dublin, william.fitzmaurice@ucd.ie