Ten rules for integrating data science into drug discovery


Learn how to translate innovative science into life-changing medicines

In 2020, a team of data science leaders presented a blueprint on how to tap into the full potential of data science in the pharmaceutical industry and to synthesize more accurate results during drug development. A set of simple rules illustrates how organizations can foster a data-oriented culture and challenge processes that have been stalling progress for decades. Below, you will find ten easy-to-follow recommendations that can guide your company on a journey toward digital transformation.

1. Prioritize data science as a fundamental discipline in drug discovery

Traditionally, biology, chemistry, and medicine were at the center of drug development, and biopharma companies paid special attention to attracting talent in these areas of expertise. However, disruptive technological progress has created a need for specialists who can change drug discovery and development by handling large datasets, for example by pushing the industry forward with in silico analyses. In addition, a massive influx of clinical and medical data requires an alternative perspective on how drug development processes advance. As a consequence, data scientists have recently moved to the frontline of digital transformation in drug discovery.

The graph below demonstrates how specialists from technology companies can contribute to drug discovery.

Figure 1. How technology companies can accelerate drug discovery, according to the Deloitte Centre for Health Solutions.

Cooperation among technology, drug discovery, and biology companies minimizes the time for discovering and delivering new potential medicines.

2. Get in touch with data scientists before data generation

Low-quality data is a daunting challenge in clinical development. Unfortunately, biomedical researchers often contact data scientists only as an afterthought. Limited biomedical insights often stem from an inability to design experiments adequately and to fully realize the potential of data generation and analysis. Regular communication among experimental scientists, computational scientists, and data scientists can help your company avoid unnecessary errors in data generation and enable your expert teams to prepare data from different sources for reuse and integration in a universal data store.




3. Set the principles of data generation

Researchers have formulated the FAIR principles for generating data during drug discovery and development. FAIR stands for Findable, Accessible, Interoperable, and Reusable: data with these qualities enables accurate meta-analyses at the different stages of drug discovery. The term captures the essential qualities of data that can be used for creating new hypotheses and advancing existing programs. If your organization can find, access, interoperate with, and reuse its data, your teams have the necessary foundation to analyze available datasets and promptly discover scientific insights.
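As a rough illustration of how the four FAIR qualities might be checked in practice, the sketch below audits a dataset's metadata for gaps. The DatasetRecord class, its field names, and the mapping of fields to FAIR letters are assumptions made for this example only; they are not part of any FAIR specification or of the original blueprint.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical metadata record for one dataset; field names are illustrative."""
    identifier: str = ""                               # Findable: persistent, unique ID
    keywords: list[str] = field(default_factory=list)  # Findable: searchable description
    access_url: str = ""                               # Accessible: where to retrieve the data
    file_format: str = ""                              # Interoperable: open, documented format
    license: str = ""                                  # Reusable: terms of reuse
    provenance: str = ""                               # Reusable: how the data was generated


# Assumed mapping from each FAIR quality to the metadata fields that support it.
REQUIRED_FIELDS = {
    "Findable": ["identifier", "keywords"],
    "Accessible": ["access_url"],
    "Interoperable": ["file_format"],
    "Reusable": ["license", "provenance"],
}


def fair_gaps(record: DatasetRecord) -> dict[str, list[str]]:
    """Return, per FAIR quality, the metadata fields that are still empty."""
    gaps = {}
    for quality, names in REQUIRED_FIELDS.items():
        missing = [name for name in names if not getattr(record, name)]
        if missing:
            gaps[quality] = missing
    return gaps


if __name__ == "__main__":
    # Made-up record: it can be found and parsed, but not retrieved or reused.
    record = DatasetRecord(
        identifier="assay-0001",
        keywords=["kinase", "IC50"],
        file_format="csv",
    )
    print(fair_gaps(record))
```

A check of this kind could run whenever a dataset is registered, so that missing metadata is caught before the data reaches a shared store rather than at analysis time.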

4. Use an integrated data store to analyze and visualize data

Approaching data as an untapped asset implies developing resources and tools that enable data scientists, experimental scientists, and clinicians to accumulate accurate results and bring them into the decision-making process. The ideal model includes a search engine that identifies entities and the relationships among them. The entities a user can look up in such a data store may include:

  • Targets
  • Compounds
  • Indications
  • Biological pathways
  • Experiments
  • Studies
  • Portfolio projects

Additionally, an integrated data store requires your organization to build application programming interfaces (APIs) to accumulate results and interactive graphical user interfaces (GUIs) to visualize them in an easily accessible form.
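To make the idea of searchable entities and relationships more concrete, here is a minimal in-memory sketch. The EntityStore class, its method names, and the example entities are hypothetical and chosen only for illustration; a production data store would sit on a real database and expose these operations through an API.

```python
from collections import defaultdict


class EntityStore:
    """Toy in-memory store of typed entities and the links between them."""

    def __init__(self):
        self._entities = {}             # entity id -> (entity type, display name)
        self._links = defaultdict(set)  # entity id -> set of (relation, other id)

    def add_entity(self, entity_id, entity_type, name):
        self._entities[entity_id] = (entity_type, name)

    def link(self, source_id, relation, target_id):
        """Record a relationship in both directions so either side can be queried."""
        for entity_id in (source_id, target_id):
            if entity_id not in self._entities:
                raise KeyError(f"unknown entity: {entity_id}")
        self._links[source_id].add((relation, target_id))
        self._links[target_id].add((relation, source_id))

    def search(self, text, entity_type=None):
        """Case-insensitive substring search over names, optionally filtered by type."""
        needle = text.lower()
        return sorted(
            entity_id
            for entity_id, (etype, name) in self._entities.items()
            if needle in name.lower() and (entity_type is None or etype == entity_type)
        )

    def related(self, entity_id, entity_type=None):
        """Return ids of directly linked entities, optionally filtered by type."""
        return sorted({
            other_id
            for _, other_id in self._links.get(entity_id, ())
            if entity_type is None or self._entities[other_id][0] == entity_type
        })


if __name__ == "__main__":
    # Made-up entities covering three of the types listed above.
    store = EntityStore()
    store.add_entity("T1", "target", "Example kinase A")
    store.add_entity("C1", "compound", "Example compound 1")
    store.add_entity("I1", "indication", "Example indication X")
    store.link("C1", "inhibits", "T1")
    store.link("T1", "implicated_in", "I1")
    print(store.search("kinase"))
    print(store.related("T1", entity_type="compound"))
```

Storing each link in both directions keeps lookups simple: a user can start from a compound and reach its targets, or start from a target and reach its compounds and indications, using the same call.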

5. Establish cooperation between distributed teams of data scientists

Biopharma companies used to build centralized data science groups that assisted other departments. However, this traditional model is no longer advantageous, because centralized data scientists have limited access to project details. One of the greatest advancements came with embedding computational specialists in a company’s specific departments. This distributed model allows data scientists to acquire new knowledge in particular research areas and to form a community of specialists who share insights and cooperate across the organization.

6. Nurture a technology-oriented culture

Experts in data science cannot produce the desired results without knowledge of biological and experimental data. The same principle applies to all specialists working in drug discovery, as biopharma companies increasingly need high levels of digital proficiency. Training sessions allow employees to learn the basics of data science and find new ways to analyze internal and external datasets for specific goals in drug discovery. A technology-driven organizational culture also implies improved cooperation among different specialists.





7. Recognize the potential limitations of AI and big data technologies

AI has distinct advantages in different segments of drug discovery, from improved disease understanding to more efficient clinical trial design. Despite the recent popularity of AI, Machine Learning (ML) is not new: biostatisticians, computational biologists, and chemists have been acquainted with it for decades. AI and big data in the pharmaceutical industry still have a range of limitations, and the analysis of large, annotated datasets inevitably poses complex challenges. An inaccurate picture of AI’s strengths and weaknesses can create expectations of miraculous transformations. Therefore, new technologies shouldn’t be seen as a panacea, but rather as a powerful instrument for living up to the potential of data science in pharma.

8. Promote strategic partnerships

Contextualizing internal data with the wealth of public data is key to the development of strong partnerships with other pharmaceutical companies and academic institutions. Your organization can publish relevant data and provide access to available instruments, and in this way, contribute to the biopharma community and promote new data solutions. Precompetitive and collaborative projects can become game-changers for the industry’s future. Major biopharma companies consider AI to be an instrument for collaboration. 

The graph below illustrates AI-driven deals in the big pharma industry that happened in 2019.

Figure 2. Deloitte analysis of deals disclosed in the market.

The statistics show that major biopharma companies integrated AI into their discovery process by employing AI experts and data scientists, and building new opportunities for cooperation.

9. Provide data science teams with sufficient resources

No two projects in drug discovery are identical, so the implementation of data science for various purposes is bound to require specific expertise. The scope of a data scientist’s responsibilities may encompass engineering, curation, integration, analysis, and/or mining of data. Appropriate resourcing would mean that the company provides its professionals with datasets, software licenses, external collaborations, professional services, and the hardware necessary to achieve their specific objectives.




10. Attract talent and support your experts

The increasing demand for data scientists continues to transform the biopharma industry. Forward-looking companies search for professionals who can combine computational modeling with domain knowledge and use data analytics in the pharmaceutical industry for improved decision-making. In addition, as organizations create alternative drug discovery project teams, it is crucial to provide data scientists with complete exposure to the area of research and project specifics.

Closing remarks

Implementing data science in drug discovery means that companies can accelerate the production of life-changing medicines and promptly extract useful insights from data. This process requires biopharma companies to be at the frontline of innovation and see strategic partnerships as a game-changing tool for progress. Hopefully, this set of rules will help your company maximize the value of data and deliver the best treatments to your patients.

Interested in learning more? Discover additional ways to strategize the use of AI for the growth of your business in our portfolio.
